Authz dilemmas

Authorization. The subject of answering “what is the user allowed to do?”

This is a hard problem in software systems. I recently tried to solve authz issues in my own side project, and I’ve been thinking about this for work.

The problem is that authz is one of those extra layers that I think most people don’t care about. We want our software to be protected appropriately, but I don’t think we want to do the work. When I’m writing a web app, I want to solve a user’s problem. Nobody’s problem is that they have to worry about who can access the data on your site. That problem doesn’t exist until they start to use your service.

A user is not interested in authorization. The user is interested in solving a problem and your software happens to solve it.

The problem of authorization only matters to a user after they sign up. This means that they probably don’t really care about authorization. The likely assumption is that your authorization works.

For a new system, you’re better off delivering features that solves the user’s problem. Using a burdensome authorization scheme takes away valuable time that you could spend on fixing their problems.

So, what should you do?

If you’re trying to make your app survive, you do what works and is the minimal amount of effort possible. After all, if you’re not solving a user’s problem, then who cares if you have some amazing authz system? Nobody.

Time to solve the problem

But now you’ve survived the primordial stage of a web app. You’ve got users, but your authorization controls are a mess. You’ve homegrown some stuff to check that user A can’t mess with user B’s data.

With your success, you begin to worry about your liability. Maybe GDPR is a concern for your system and you have to comply because your site serves a European audience.

Whatever the reason, the need for good authz controls always seems to grow. In an instant, the trust in your system can evaporate if you have crappy controls in place that allow people to take advantage and steal data from others.

With any luck, a system reaches an amount of functionality where you can service an audience with the features that you have, and you can invest in the technology that will ensure that customer data is safe.

Don’t kid yourself into thinking that authz controls are a competitive moat.

Even if you’re successful, your users don’t care about authorization. They’ll just expect it to work.

It’s still desirable to invest as little effort as possible while gaining adequate authz.

Monolith FTW

If you’ve built a monolith (which you should have at an early stage), then your authz story is manageable.

From what I’ve learned, authz is all about relationships.

The real question that I should have started with is: what is the user allowed to do with this data?

That’s a relationship question. We’re looking at the user’s relationship to some kind of data in your system.

The glorious part about a monolith is simple: all the data is already available to you.

Relationships can be complex. Answering what a user can do with a piece of data might involve checking many other data models in your system.

  • A user might be able to access data because they are an owner of the data.
  • A user might be able to access data because they are on a team that owns the data.
  • A user might be able to access data because they were granted explicit permission for this individual piece of data.
  • A user might be able to access data because the group they are a part of inherits permissions from some other group (i.e., their group is a superset of another group that owns the data).

The combinations are essentially endless. Regardless of how complex the relationship is, a monolith still has access to all of it.

Once you’ve outgrown your in-house solution, you can reach for some kind of policy engine that can represent the relationships in your system.

Oso is a pretty good example that I’ve encountered recently that can capture your system’s relationships to answer the question of “what can the user do?”

SOA Woes

Did you build a service oriented architecture? Well, welcome to authz hard mode.

A policy engine works well when all the data is accessible, but in a SOA, all the data isn’t accessible. Authz in a distributed system has to deal with networks. That’s a whole extra layer of complexity for the authz problem.

If you can imagine an authz check involving five different data models, then a SOA system might have those five different models in five different services. This could mean that doing an authorization check requires five separate network calls. Yikes.

Assume that all of five of these services have four 9s of uptime (i.e., 99.99% uptime). That means that your authz check is now three 9s of reliability (99.99%^5 = 99.95%).

Is there another option? Sure, you could rely on a single authz service.

In this model, the system pushes all authz checks to a single service to create an inverted index. All the relationship definitions are centralized to a single place, and the services in the SOA are responsible for pushing the details to the central service (e.g., “user A can edit document Z”).

Running the authz service yourself is possible or you could use something like Authzed and let them do it for you.

Why shell out some cash to a company? Because the downside of a central authz service is baked into the name: the service is central. In other words, you’ve created a single point of failure that your entire SOA architecture will depend on. This service will have high availability requirements if you want to have any hope of maintaining reasonable uptime on your system.

Whether you let individual services answer the authz checks or push the authz problem to a central service, you’re going to deal with tradeoffs related to uptime and operability.

Conclusions?

I don’t have any brilliant conclusion here. This is the exactly the problem space that I’m wrestling with both for my personal SaaS and for my professional life.

After doing research and reading material like the Google Zanzibar white paper, I think a centralized service is the industrial strength solution. It’s high cost, but massively scalable.

But does that make any sense for my small project? No way!

One of the aspects that I did like from Zanzibar is the idea of treating like an inverted index. The inverted index approach pushes a lot of the authz computation to the writer. Storage is used to set what a user is allowed to access. This model makes it possible to answer everything that a user is allowed to access by doing a lookup on the precomputed permissions.

In a more policy oriented system, that permission check is calculated by the reader at runtime. That lowers the storage requirements at the cost of runtime computation.

Given how cheap storage is these days, I think that would bias me towards authz technology that uses this inverted index approach.