Distributed Authorization (osohq.com)
150 points by AnhTho_FR 9 months ago | 64 comments



There's an interesting section here about one of my favourite challenges in authorization: how to efficiently return a list of things that the current user has permission to access, without running a "can_access()" permission check on every single one of them (which is bad if you have thousands of items and you want to paginate them).

Their solution is to let you configure rules that get turned into SQL fragments that you can run against your own database: https://www.osohq.com/docs/guides/integrate/filter-lists#lis... - example Rails app here: https://github.com/osohq/rails_list_filtering_sample_app

A team I worked with in the past came to the same conclusion - turning authorization rules into WHERE clauses is a very efficient way to solve this problem, if you can figure out a clean way to do it.
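A minimal sketch of the rules-as-WHERE-clauses idea, using Python and an in-memory SQLite database. The `documents` table, its columns, and the `read_rule` helper are made up for illustration; this is not how Oso generates its fragments. The point is that one rule produces one SQL fragment, and that fragment serves both the paginated listing and the single-item check.

```python
import sqlite3

# Hypothetical rule: a user may read a document if they own it or it is public.
# The rule is expressed once, as a SQL fragment plus bound parameters.
def read_rule(user_id):
    return "(owner_id = ? OR is_public = 1)", [user_id]

def list_readable(conn, user_id, limit, offset):
    # Listing: the database filters and paginates; no per-row permission call.
    fragment, params = read_rule(user_id)
    sql = (
        f"SELECT id, title FROM documents WHERE {fragment} "
        "ORDER BY id LIMIT ? OFFSET ?"
    )
    return conn.execute(sql, params + [limit, offset]).fetchall()

def can_read(conn, user_id, doc_id):
    # Single check: the same fragment, narrowed to one row.
    fragment, params = read_rule(user_id)
    sql = f"SELECT 1 FROM documents WHERE id = ? AND {fragment}"
    return conn.execute(sql, [doc_id] + params).fetchone() is not None

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents "
    "(id INTEGER PRIMARY KEY, title TEXT, owner_id INTEGER, is_public INTEGER)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [(1, "alice private", 1, 0), (2, "bob private", 2, 0), (3, "bob public", 2, 1)],
)
```

Only the rule's fragment is interpolated into the SQL string; user-supplied values always go through bound parameters.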


Same conclusion we came to, and the basis of our in-house permission gem for RoR. The most efficient declaration of permissions is to express them as a WHERE statement, and then the implementation of can_whatever() is just inclusion in the collection returned by the WHERE.

Permissions have three moving parts, who wants to do it, what do they want to do, and on what object. Any good permission system has to be able to efficiently answer any permutation of those variables. Given this person and this object, what can they do? Given this object and this action, who can do it? Given this person and this action, which objects can they act upon?

We’ve found most permissioning systems end up with a pick-2 approach, and the most common one to be abused is given a person and an action, give me the collection. This leads to implementing permissions twice, once in code, and once as a query.
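To make the three permutations concrete, here is a small sketch in Python with SQLite. It assumes the simplest possible storage, a flat `grants` table of (actor, verb, resource) rows; real systems derive these from rules rather than storing them all, and the table and `query` helper are hypothetical. Fixing any two columns and selecting the third answers each of the three questions from the same data.

```python
import sqlite3

COLUMNS = {"actor", "verb", "resource"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grants (actor TEXT, verb TEXT, resource TEXT)")
conn.executemany(
    "INSERT INTO grants VALUES (?, ?, ?)",
    [
        ("alice", "read", "doc1"),
        ("alice", "edit", "doc1"),
        ("bob", "read", "doc1"),
        ("alice", "read", "doc2"),
    ],
)

def query(select_col, **fixed):
    # Fix two of actor/verb/resource and return the distinct values of the third.
    if select_col not in COLUMNS or not set(fixed) <= COLUMNS:
        raise ValueError("unknown column")
    where = " AND ".join(f"{col} = ?" for col in fixed)
    sql = (
        f"SELECT DISTINCT {select_col} FROM grants "
        f"WHERE {where} ORDER BY {select_col}"
    )
    return [row[0] for row in conn.execute(sql, list(fixed.values()))]

# Given this person and this object, what can they do?
actions = query("verb", actor="alice", resource="doc1")
# Given this object and this action, who can do it?
actors = query("actor", resource="doc1", verb="read")
# Given this person and this action, which objects can they act upon?
objects = query("resource", actor="alice", verb="read")
```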


Hi, wkirby! I'm the post author, I do DevRel at Oso.

> Permissions have three moving parts, who wants to do it, what do they want to do, and on what object. Any good permission system has to be able to efficiently answer any permutation of those variables. Given this person and this object, what can they do? Given this object and this action, who can do it? Given this person and this action, which objects can they act upon?

> We’ve found most permissioning systems end up with a pick-2 approach, and the most common one to be abused is given a person and an action, give me the collection. This leads to implementing permissions twice, once in code, and once as a query.

I love the way you put this! I'm always looking for good ways to talk about authorization without falling back on jargon and I've never come up with a way to talk about the difference between authorizing an action on a single resource and returning a list of authorized resources that I've been happy with. Would you mind if I adapted this in future writing?


By all means! I enjoyed your article here, and I will keep an eye on Oso in the future. Authorization has become a hobby horse of mine, and I always appreciate people who are thinking about the complexity required to meet real-world needs.


You might enjoy this one as well then: https://news.ycombinator.com/item?id=30878926


Love how you explained that. Quoted it on my blog here: https://simonwillison.net/2024/Apr/16/wkirby-on-hacker-news/


Hey cool, I appreciate it!


Yes! My team came to the same conclusion and are in the process of building just such a library for our platform.

- Actor - who is performing an action

- Policy - what types of possible actions are permitted on a resource for a type of actor

- Permission - actions an actor has been granted to perform.

With the intersection of these three objects you can determine if an action can be performed and actors can be granted granular access.


Hey simon! Oso CTO here.

Definitely one of my favourite problems too! Some additional context for those who don't think about this all the time: in many cases, the solution is as simple as "write some SQL where clauses to do the filtering I care about". e.g. I suspect the vast majority of people have logic like `where tenant_id = ?` or similar and they pass in a tenant ID on every query.

Where things get challenging is when you want to decouple the logic (e.g. have an abstraction in your code, or centralize logic in a service). Because then you're in the world of what's the decoupled API that allows me to filter my database.

The easiest way to do that is to just return a big list of IDs the user can see, and put `id in (... list of ids)` on the query. But that involves (a) syncing the data to the central service and (b) that list can get pretty long.

And so that's why you would even need to think about turning rules into WHERE clauses in the first place :)
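For comparison, the "big list of IDs" approach looks roughly like the sketch below. `AUTHZ_DATA` and `authorized_ids` are hypothetical stand-ins for a call to a central authorization service, and the `issues` table is made up. Every authorized ID becomes one bound parameter, which is why the list length matters.

```python
import sqlite3

# Hypothetical stand-in for a central service that already knows, per user,
# which issue IDs are visible (this is the data that has to be synced).
AUTHZ_DATA = {"alice": [1, 3], "bob": [2]}

def authorized_ids(user):
    return AUTHZ_DATA.get(user, [])

def list_issues(conn, user):
    ids = authorized_ids(user)
    if not ids:
        return []
    # One placeholder per ID: the query grows with the size of the list.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT id, title FROM issues WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, ids).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO issues VALUES (?, ?)",
    [(1, "crash on login"), (2, "typo in docs"), (3, "slow query")],
)
```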


> efficiently return a list of things that the current user has permission to access

I've done this with out-of-the-box Keycloak Authorization Services. There is an entire standards based framework for authorization piggy-backed on OAUTH called UMA2. Keycloak provides an implementation of this. It's young and the documentation is a bit thin, and the learning curve is cliff shaped, but it does what it says on the tin.

My case involved authorizing verbs on a set of resources. The backend generates a permission ticket with an arbitrary list of resources and scopes and obtains (from Keycloak) an otherwise ordinary OAUTH access token containing a UMA2 RPT (Requesting Party Token) claim. That is cycled through the "token introspection" endpoint of Keycloak which returns a clean, simple JSON response with the subset of resources and scopes that are authorized. Net result is two requests for any arbitrary subset of resources and scopes. Nothing is stored or managed by the backend system: all the authorization stuff is in Keycloak. You can submit an open ended request that just dumps everything the user is authorized for.

It's simple enough that, foregoing signature checks that are otherwise performed, I prototype and test this stuff using shell scripts, curl and jq. Since it's all piggy-backed on the existing OAUTH system there is no additional infrastructure.


Just wanted to make sure I understand correctly: upon authentication, you just bake everything a user has access to (all policies) into the claims part of the JWT?

What would that look like if, for example, a user has access to 10000 objects? Are all of them baked into the token as claims?


> A team I worked with in the past came to the same conclusion - turning authorization rules into WHERE clauses is a very efficient way to solve this problem, if you can figure out a clean way to do it.

For rails specifically, https://github.com/polleverywhere/moat was built with this in mind. It's heavily inspired by Pundit, but lets you write policies at the `ActiveRecord::Relation` level. So `policy_filter(Article).find_by!(id: params[:id])` would run something like `select * from articles where id = ? and id in (select id from articles where owner_id = ?);`.


Nice library


There’s a Django app called Bridgekeeper that goes a fair way toward doing this (of course, via the Django ORM). It’s got some pretty major design flaws of its own, and unfortunately hasn’t gotten much love in quite some time. However I still find the concept / intent to be quite telling. It certainly feels like it’s in the right ballpark.



Tailscale's ACLs (attach users to groups, attach objects to groups, cross link the two in policies) does a bit of indirection to reduce that volume.

e.g.: many objects <-> object tag <-> policy <-> user group attachments <-> user
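A rough sketch of that chain in Python. The names and data layout are invented for illustration and are not Tailscale's ACL format. The policy list only mentions groups and tags, so it stays small no matter how many users or objects are attached to them.

```python
# Hypothetical data: objects carry tags, users belong to groups,
# and policies only ever link a group to a tag.
OBJECT_TAGS = {"db-1": {"tag:prod"}, "db-2": {"tag:prod"}, "ci-1": {"tag:ci"}}
USER_GROUPS = {"alice": {"group:sre"}, "bob": {"group:dev"}}
POLICIES = [
    ("group:sre", "tag:prod"),
    ("group:sre", "tag:ci"),
    ("group:dev", "tag:ci"),
]

def can_reach(user, obj):
    # user -> groups -> policy -> tags -> object
    groups = USER_GROUPS.get(user, set())
    tags = OBJECT_TAGS.get(obj, set())
    return any(group in groups and tag in tags for group, tag in POLICIES)

def reachable(user):
    return sorted(obj for obj in OBJECT_TAGS if can_reach(user, obj))
```

Adding a thousand more `tag:prod` machines changes `OBJECT_TAGS` but leaves `POLICIES` untouched.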


I’m not entirely happy with the solution, but one big advantage of a JS authorization library called CASL is the same ability to turn rules into queries. I also like that I can create field-level permissions and even feed it a target object and get a reduced representation of that object which will include only fields the user is authorized to modify or read.

Unfortunately, there are a few issues not worth getting into here other than to say I feel that the industry has a very frustratingly uneven set of solutions for authorization, no doubt in part due to it being complicated.


(Oso CTO here). Out of curiosity what do you not like about CASL? It always seemed to have a similar goal in mind which I loved, but I suspect it hit similar challenges we had when relying on ORM integrations.


One big annoyance is including attributes beyond the target (which CASL calls the subject). There may be a plethora of environmental factors I want to evaluate in my rule. The two obvious options are:

1) build the rules programmatically, based on what you observe in the request context. This works fine until you want users to be able to create and assign custom policies and load them from a database.

2) put placeholders in the rule’s conditions, and swap them with the current contextual values when rehydrating the rules for a given request. Fine, except this obviates caching except for the raw rule from a database, and rehydrating dozens or hundreds of rules for every request starts to add up in terms of overhead.

I wish rule conditions could reference a “context” parameter (name not important) so I could create a condition like {userId: context.user.id} and at runtime I could pass the current context when I call can. That way I can rehydrate the rule once. I realize this creates all sorts of complications with serializing a rule to be stored in a database or sent over the wire, but that’s where some special placeholders could be understood natively by an Ability and hooked into a passed in context. However, that still creates an issue if I want to create a condition that is purely based on context ( e.g. {context.user.isTrialUser: false} )
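The wished-for behaviour can be sketched in a few lines of Python. This is not CASL's API; `Ctx`, `Field`, `Rule`, and `can` are hypothetical names. Each condition is a pair of terms, and a term can be a field of the target, a path into the request context, or a literal. Rules are built once and the context is supplied only at check time, so nothing needs rehydrating per request, and a condition that depends purely on context is just a pair with no `Field` in it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ctx:
    path: str  # dotted path into the per-request context, e.g. "user.id"

@dataclass(frozen=True)
class Field:
    name: str  # attribute of the target object

@dataclass(frozen=True)
class Rule:
    action: str
    subject_type: str
    conditions: tuple = ()  # pairs of terms that must compare equal

def resolve(term, subject, context):
    if isinstance(term, Field):
        return subject[term.name]
    if isinstance(term, Ctx):
        value = context
        for key in term.path.split("."):
            value = value[key]
        return value
    return term  # plain literal

def can(rules, action, subject_type, subject, context):
    for rule in rules:
        if (rule.action, rule.subject_type) != (action, subject_type):
            continue
        if all(
            resolve(left, subject, context) == resolve(right, subject, context)
            for left, right in rule.conditions
        ):
            return True
    return False

# Built once, e.g. after loading from a database; reused for every request.
RULES = (
    Rule(
        "update",
        "Article",
        (
            (Field("ownerId"), Ctx("user.id")),    # target vs. context
            (Ctx("user.isTrialUser"), False),      # context-only condition
        ),
    ),
)
```

Because the terms are plain data, they could be serialized (for example as tagged JSON objects) without evaluating them first.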

The other thing I have been struggling with is that can with a subject name only (rather than an instance) will match a rule with no conditions along with rules that have matching conditions. I understand the author’s rationalization, but it can potentially create unexpected results, particularly when you offer a system that lets users build their own policies.

The last issue is CASL is nodejs only and I may need to support multiple platforms. I’ve looked at Casbin because of its multiplatform support and customizable model, but I’ve found it extraordinarily hard to use beyond simple RBAC or claims-based authorization, and it still doesn’t offer solutions like conditions-to-query filters or field-level authorizations.


Thanks for the extensive comment. I had similar experiences with CASL. I implemented the placeholders you described with support for arbitrary context. It also supported joins. It used a mongodb-like syntax that could be used with SQL, mongo, dynamodb or generate an in-memory filter function. I'm glad to see others in this thread coming to similar conclusions. My design was similar to the OASIS model.


Ah got it, thanks for sharing! That's definitely context I'm missing from having never used it in an actual application.


If all the data is local, why do you even need a cloud authz solution?

You can run SQL locally using sqlite or something for authorization decisions and list filtering (give me everything I have access to).


Authorization can be relational, so JOIN to authz tables and add WHERE conjunctions and it's enough -- if you have a database application anyways.


I tend to build the "ownership" model whenever I can. It works extremely well and has a few simple rules:

1. a user can own an entity/row/unit/whatever. They have full control over this unit.

2. a user can share ownership with another user/role.

3. a user can share various "rights" over any units they own -- CRUD, for example -- for any user/role.

4. a user can only interact with any unit they have a right to.

This can be implemented through a simple db table (or inline in the data itself) and doesn't depend on much. Once you build the middleware, you don't even need to think about the authorization layer.
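The four rules can be sketched in memory as follows. This is one possible reading in Python, with hypothetical names (`Unit`, `share_rights`, and so on) and the assumption that "rights" means the four CRUD operations. A real version would persist `owners` and `shares` in a table or inline in the row.

```python
from dataclasses import dataclass, field

ALL_RIGHTS = frozenset({"create", "read", "update", "delete"})

@dataclass
class Unit:
    owners: set                                 # users or roles with full control
    shares: dict = field(default_factory=dict)  # principal -> set of granted rights

def principals(user, roles):
    # A request acts as the user plus any roles they hold.
    return {user, *roles}

def is_owner(unit, user, roles=()):
    return bool(unit.owners & principals(user, roles))

def share_ownership(unit, by, new_owner):
    # Rule 2: only an existing owner can add another owner (user or role).
    if not is_owner(unit, by):
        raise PermissionError("only an owner can share ownership")
    unit.owners.add(new_owner)

def share_rights(unit, by, principal, rights):
    # Rule 3: an owner hands out specific rights to a user or role.
    if not is_owner(unit, by):
        raise PermissionError("only an owner can share rights")
    if not set(rights) <= ALL_RIGHTS:
        raise ValueError("unknown right")
    unit.shares.setdefault(principal, set()).update(rights)

def allowed(unit, user, right, roles=()):
    # Rule 1: owners have full control. Rule 4: everyone else needs a grant.
    if is_owner(unit, user, roles):
        return True
    return any(right in unit.shares.get(p, ()) for p in principals(user, roles))
```

A middleware would call `allowed` once per request, which is what lets the rest of the code ignore authorization.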


This reminds me of Capability Based Security, which I really like

https://en.m.wikipedia.org/wiki/Capability-based_security

Sandstorm.org and WASI are both doing interesting things in this space to bring that model to running programs, so eg, the right to access the internet is a capability the OS has which can be given to programs who may then give it to subprocesses they run (or any lesser permission, maybe just the ability to access a single URL)

It's really clean and works great in practice from what I can tell


Cloudflare workers use this in an interesting way as well. The capability is basically a function, or an object with data and functions, that is returned from a call to a remote service. The functions are intercepted with a proxy to call other services instead of local code.

The callee basically decides what kind of capabilities it provides to the caller with these functions, anything you're given, you have the right to call without any further auth preconditions. Those capabilities can then be delegated by returning them, wrapped or not, to other callers.
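In plain Python, without any RPC layer, the same idea reduces to closures. This sketch is not the Workers API; `make_fetch_capability`, `attenuate_to_host`, and `fake_transport` are hypothetical. Holding the function is the permission, and a holder can pass on a weaker version by wrapping it before handing it over.

```python
from urllib.parse import urlparse

def make_fetch_capability(transport):
    # Whoever holds the returned function may fetch any URL.
    # There is no identity check inside: possession is the authorization.
    def fetch(url):
        return transport(url)
    return fetch

def attenuate_to_host(fetch, host):
    # Delegation with less power: the wrapper only forwards calls for one host.
    def restricted(url):
        if urlparse(url).hostname != host:
            raise PermissionError(f"this capability only covers {host}")
        return fetch(url)
    return restricted

def fake_transport(url):
    # Stand-in for real network access so the sketch runs offline.
    return f"GET {url}"

full = make_fetch_capability(fake_transport)
limited = attenuate_to_host(full, "example.com")
```

Code that is handed only `limited` has no reference to `full`, so it cannot regain the broader access.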


I believe you're referring to this:

https://blog.cloudflare.com/javascript-native-rpc

(Just dropping the link here for anyone else wondering.)

(I'm the author of that blog post, and also the tech lead for Workers and the former founder of Sandstorm.)


Yes, great post and really interesting work! Thank you, I really appreciate the sharing culture at Cloudflare.


Cloudflare workers are led by the creator of Sandstorm, and use CapnProto internally which has a really neat capabilities based RPC mechanism as well.


Great to know! I'm not deep in the space but was reading about their recent impl update and it seems really well done and many of the ideas are quite thought provoking.

I'm not sure I'm sold on nano services, but if their scheduling system ends up being good enough, a lot of the problems behind treating local and remote calls homogeneously could go away.


I'm not either tbh, haven't used cloudflare workers myself, though I like their style (ultra low weight, run at the edge) a lot more than AWS's where you have to worry about cold starts. But for anything at the scale I'm doing it, one box running everything itself is enough.

Mostly just excited about capability security :) & what it can hopefully do to make doing things the right way (least privilege) as painless as possible - especially across program or network boundaries.


What happens if authorization to do something is revoked while your code is holding such a function handle?


I'd assume it throws an exception, like it would for a network error
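One common way capability systems handle this, independent of whatever Workers actually does, is a revocable forwarder: the grantor hands out a wrapper and keeps a separate `revoke` function. The sketch below is a generic Python illustration with hypothetical names; calls through the wrapper start raising once access is revoked, even though the caller still holds the handle.

```python
class Revoked(Exception):
    """Raised when a call goes through a capability that has been revoked."""

def make_revocable(capability):
    # The wrapper forwards to the real capability until revoke() is called.
    state = {"target": capability}

    def forward(*args, **kwargs):
        target = state["target"]
        if target is None:
            raise Revoked("capability has been revoked")
        return target(*args, **kwargs)

    def revoke():
        state["target"] = None

    # Give `forward` to the other party; keep `revoke` for yourself.
    return forward, revoke

def read_report(name):
    return f"contents of {name}"

handle, revoke = make_revocable(read_report)
before = handle("q1")
revoke()
```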


Yeah, as I was reading the OSO piece, it was obvious that a great many of the problems they are solving exist because of RBAC (Role-Based Access Control). With RBAC, changes to access trigger changes to the principal. For example, when a new employee is hired, there is often a complex and time-consuming administrative process of getting their roles and permissions set up for their position, and when they change positions, teams, or leave the company, access control changes must be propagated based on these events.

With something like Attribute-Based Access Control (ABAC), the authorization system controls access to objects by evaluating rules against the attributes of both subject (the entity requesting access) and object (the entity to be accessed).

This can dynamically determine access based on situational aspects: i.e. in an emergency situation, a subject may be granted access when it would be denied under normal conditions. If you've ever been in a situation when there's a problem in production and the only person who can fix it is unavailable, or can't get access, ABAC can be programmed to allow, temporarily, a backup access path. See e.g. https://csrc.nist.gov/Projects/Attribute-Based-Access-Contro...
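A toy version of such a rule in Python. The attribute names (`team`, `clearance`, `on_call`, `incident_active`) are invented for the example and do not come from the NIST material. The decision looks only at attributes of the subject, the object, and the environment, so a change in the situation changes the answer without anyone editing roles.

```python
def can_access(subject, obj, environment):
    # Baseline check shared by both paths: clearance must cover sensitivity.
    cleared = subject["clearance"] >= obj["sensitivity"]

    # Normal path: the subject belongs to the team that owns the object.
    if cleared and subject["team"] == obj["owning_team"]:
        return True

    # Backup path: during an active incident, whoever is on call gets in,
    # and loses access again as soon as the incident flag is cleared.
    return (
        cleared
        and environment.get("incident_active", False)
        and subject.get("on_call", False)
    )

prod_db = {"owning_team": "payments", "sensitivity": 2}
owner = {"team": "payments", "clearance": 3}
on_call = {"team": "platform", "clearance": 2, "on_call": True}
```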


Hey cratermoon, Oso CTO here.

I'm probably too close to it, so I'm not following: "a great many of the problems they are solving exist because of RBAC"

Oso supports authorization using any combination of RBAC/ReBAC/ABAC you want.

If anything, I would say that sticking with RBAC is the "easy way" to do it, but you push the complexity of managing it onto your end users (the ones who need to administer it). Whereas building authorization that uses attributes like you describe requires more implementation work, but can make the experience easier for users.

Am I understanding you correctly?


All of the examples given mention roles and users. There's no discussion about the attributes of the subjects and objects as first-class entities.


This sounds really elegant, I love it. Have you seen this deployed in a service-oriented architecture or primarily integrated as part of a single app/db?


Both. Usually the service has a table of "shares" and the owner(s) attached to the actual row. Thus determining if a user has a right to do something looks like this:

    select 1 from kites k 
    join shares s on (s.model = 'kites' and :operation in s.rights)
    where :user in k.owner or s.user = :user
or something like that.
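A runnable variation on that query, using Python's sqlite3. The schema here is an assumption made for the sketch: owners live in a `kite_owners` table and each share is one row per (row, user, operation) instead of a list column, which makes the `in` tests expressible as plain equality. It also ties the share to a specific kite row, which the shorthand above leaves implicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE kite_owners (kite_id INTEGER, user_id TEXT);
    CREATE TABLE shares (model TEXT, row_id INTEGER, user_id TEXT, operation TEXT);
    INSERT INTO kite_owners VALUES (1, 'alice');
    INSERT INTO shares VALUES ('kites', 1, 'bob', 'read');
    """
)

# True if the user owns the kite, or holds a share for this operation on it.
CHECK = """
SELECT EXISTS (SELECT 1 FROM kite_owners o
               WHERE o.kite_id = :kite AND o.user_id = :user)
    OR EXISTS (SELECT 1 FROM shares s
               WHERE s.model = 'kites' AND s.row_id = :kite
                 AND s.user_id = :user AND s.operation = :operation)
"""

def has_right(user, operation, kite):
    params = {"user": user, "operation": operation, "kite": kite}
    return bool(conn.execute(CHECK, params).fetchone()[0])
```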


I would love to see a reference implementation.

I know some people inline these rights into data inside a crdt. I really would love to see a reference implementation of that!



That's a really clean implementation. And the shares are used to resolve authorization here [1], right?

Two things that we're solving for at Oso are: making it easier for multiple teams to collaborate on permissions (which we solve by having a declarative language), and solving the list filtering problem (as talked about in the post).

If you don't need either of those two things and are happy with a DIY approach, what you've shared would work great IMO. If you packaged that up as a standalone solution, I could see a lot of people getting value from it!

There are not enough people sharing authz implementations out there, a blog post on this shares approach would be super cool.

[1] https://github.com/bottledcode/durable-php/blob/3ad509fcdbb3...


> And the shares are used to resolve authorization here [1], right?

That's correct!

> making it easier for multiple teams to collaborate on permissions (which we solve by having a declarative language), and solving the list filtering problem (as talked about in the post).

Those are pretty hard problems, so it's really cool to see someone solving it in a reusable way! For me, authz is always a chore ... making it something easy to specify in a way that "just works" is worth quite a bit in my mind!

> If you packaged that up as a standalone solution, I could see a lot of people getting value from it!

I don't really have much desire to get into maintaining an auth library; there's just not enough time in the day!

> a blog post on this shares approach would be super cool.

It's pending publish, actually! I've got a devlog (more like a book at this point) for something I've been working on for years now, but no posts are going to be published until I hit a milestone. I'm almost there... not much further.


> I don't really have much desire to get into maintaining an auth library; there's just not enough time in the day!

Haha, well in some ways I'm glad to hear that. That's why we exist :)

> It's pending publish, actually! I've got a devlog (more like a book at this point) for something I've been working on for years now, but no posts are going to be published until I hit a milestone. I'm almost there... not much further.

Send it over if you want another pair of eyes, and lmk when publishing so we can share with our community too. I'm sam [at] osohq.com


Judging by the name of supported operations it looks like this is built around a type of CQRS pattern. Am I reading this right? Is it possible to grant granular operations like a specific type of read or write for an entity?


Indeed. It is CQRS with some ES in there (for sagas/workflows).

Right now, granularity is pretty granular but it could be more granular. For example, there is a concept of serialization scopes that I’m barely using. But it would be possible (and fairly trivial) to use this to prevent serialization of the entity snapshot for certain properties. This would give the illusion (especially in the graphql projection) that an entity was “missing” fields. From a consumer aspect, it might mean that you can’t call certain methods. Already, the granularity may allow you to call a method and not be able to see its result, for example. It’s pretty neat … but this is also a reminder of sooo much left to document.


Congrats on the launch!

[Disclosure: I'm one of the co-founders of Aserto, the creators of Topaz].

The problem of data filtering is indeed a huge part of building an effective authorization system. Partial evaluation is one way of doing it, although with systems like OPA [0] it requires a lot of heavy lifting (parsing the returned AST and converting it into a WHERE clause). Looking forward to seeing how turnkey that can be with Oso.

With that said, there are applications where you really want the data close to the authorization engine. With a ReBAC model, you can easily find the objects that a user has access to, or the users that have access to an object, by walking the relationship graph. That's the approach we've taken with Topaz [1].
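Walking a relationship graph in both directions can be sketched like this in Python. It is a deliberately simplified illustration and not Topaz's data model or API: relationships are (subject, relation, object) tuples, and every relation is treated as passing access along, whereas a real ReBAC model decides per relation type which edges grant what.

```python
from collections import deque

# Hypothetical relationship tuples: (subject, relation, object).
EDGES = [
    ("user:alice", "member", "team:eng"),
    ("team:eng", "viewer", "folder:specs"),
    ("folder:specs", "parent", "doc:design"),
    ("user:bob", "viewer", "doc:notes"),
]

def walk(start, reverse=False):
    # Breadth-first search over the tuples, forwards or backwards.
    graph = {}
    for src, _relation, dst in EDGES:
        if reverse:
            src, dst = dst, src
        graph.setdefault(src, []).append(dst)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def viewable_docs(user):
    # Which objects can this user reach?
    return sorted(n for n in walk(user) if n.startswith("doc:"))

def viewers_of(doc):
    # Which users can reach this object?
    return sorted(n for n in walk(doc, reverse=True) if n.startswith("user:"))
```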

Funny timing - a few days ago we published a blog post on that very topic! [2]

[0] https://openpolicyagent.org

[1] https://topaz.sh

[2] https://www.aserto.com/blog/how-rebac-helps-solve-data-filte...


Hey all!

I'm Sam, cofounder + CTO at Oso.

Thank you all for the great discussions going on here. If folks are interested in either learning more about the product or working on these kinds of problems, you can email me directly at sam [at] osohq.com.


Kinda sounds like OPA (Open Policy Agent) [1], but a different implementation.

[1] https://www.openpolicyagent.org/docs/latest/


> Suppose you decide to add teams to gitcloud, so different teams in an organization can have different permissions. Will you be able to find all the functions and queries you need to modify in order to make teams work properly? Do you want to have to do that?

> This is why we introduced Polar, our declarative DSL for authorization. Polar allows you to separate your authorization logic from your application logic and express it in a language that is purpose built for authorization. A simple Polar policy looks something like this:

First, I appreciate that it is the Polar "programming language" and not yet another kind of stupid YAML DSL.

However: you seem to target developers. Why do you force me to leave my IDE and use your "rules editor"? Can I not write all those things in my IDE, with all the support it brings, and integrate this into my CICD flow? (yes, there is the .polar file, but why force me to jump through hoops?)

Then, why did you create a new DSL and not merely a (de-)serializable datastructure (which will indeed look like a dsl)? One that is powerful enough to represent the capabilities you need. Then, I could in fact use any language (library) of my choice and create the rules from this language, which just has to create the datastructure.

Or, backwards: why do you think authorization is so special that it deserves a custom language? Is it more special than performance-testing, logging, auditing, debugging, metrics and so on?

Apart from that, I really like the `yes, if` idea! Would be nice to hear a bit more about that (unfortunately, the article pretty much ends there). Such as: how to deal with actions that change things and can (or must) potentially be run before the authorization is completed and such.


> However: you seem to target developers. Why do you force me to leave my IDE and use your "rules editor"? Can I not write all those things in my IDE, with all the support it brings, and integrate this into my CICD flow? (yes, there is the .polar file, but why force me to jump through hoops?)

Hey valenterry! Oso CTO here. You can absolutely write policies locally and integrate this with CI/CD. We have a VS Code extension for the former, and tools for running this in local dev environments, in CI, or wherever.

The UI is mostly nice for the getting-started development experience, e.g. it integrates directly with Oso Cloud without needing to configure credentials.

> Then, why did you create a new DSL and not merely a (de-)serializable datastructure (which will indeed look like a dsl)? One that is powerful enough to represent the capabilities you need. Then, I could in fact use any language (library) of my choice and create the rules from this language, which just has to create the datastructure.

We have a post on this coming soon! The short version is that Polar is a logic language based on Prolog/Datalog/miniKanren. And logic languages are a particularly good fit for representing the branching conditional logic you often see in authorization configurations.

And it made it easier for us to do custom work like add inline policy tests.

> Apart from that, I really like the `yes, if` idea! Would be nice to hear a bit more about that (unfortunately, the article pretty much ends there). Such as: how to deal with actions that change things and can (or must) potentially be run before the authorization is completed and such.

We typically recommend authorizing in two places: at the start of a request, and then when fetching data.

e.g. in our demo app, authorizing "can a user create an issue" involves authorizing a "create_issue" action against the repository itself: https://github.com/osohq/gitcloud/blob/sam/list-filtering/se...

Whereas anything listing issues calls the `list_local` method and does the `yes, if` style approach.


Hey, thx for your answer. I really like to see that you are building upon existing logic languages and not totally inventing your own. I think you should market it as that, because many people (like me) have been burned by companies that thought it would be a good idea to invent a completely new language (or even DSL) from scratch.

I think you should also focus on making integration with existing code as easy as possible. I know there even is one of the more research-class PLs that has first-level support for running small prolog-like scripts within other more imperative code. Essentially that is exactly what you guys do, just natively built into the language (which obviously you guys can't do).

Essentially, if the language is simple enough, I'd like to define and build both the facts as well as the logic from within my own programming language, rather than having to use your web-editor or other tooling to get support for syntax and things like that. Then, if I have a usecase where my own language is insufficient or annoying, I can still write "plain" code in your language if needed.


> We have a post on this coming soon! The short version is that Polar is a logic language based on Prolog/Datalog/miniKanren. And logic languages are a particularly good fit for representing the branching conditional logic you often see in authorization configurations.

Ha, I've been playing around with Biscuits (https://www.biscuitsec.org/) and was writing up a blog post on using them in a git forge. When I saw the Polar data units described as "facts" and read your end to end example (https://www.osohq.com/docs/tutorials/end-to-end-example) I thought "Oh this looks very similar". I will say - I do like how Polar seems to type stuff and provide some concepts that Biscuits force you to build out on your own, that's pretty neat.

What is the proof of identity in Polar? Is it something like a token in Biscuits? I'm curious if you can do things like add caveats to reduce what the token is capable of as it gets handed off to different systems. I consider that one of the "killer use cases" of biscuits.


Biscuits are really cool, one day I plan to try and convince Geoffroy to integrate Polar for policies :)

Biscuits + Polar are ideologically similar but have distinct use cases at the moment. Oso is a central service that your backend speaks to whereas biscuits are decentralized cryptographic tokens.

So Oso API calls are all done with service-specific API keys, and don't need a proof of identity beyond that.

My mental model for tokens like biscuits is that they work like a "decentralized cache". I.e. you can take an authentication/authorization decision, sign it, and provide it to a client. They can reuse that decision at services without the service needing to reach out to the authN/authZ provider again.

It would play _really_ nicely together with the Distributed Authorization concepts we're talking about here: a client could ask the authz service: "I want a token for managing issues". The authz service/Oso could respond with a partially evaluated policy, like: "this token grants the user access to issues that belong to repository 1, 2, 3, or issues that they are the creator of" (encoded as facts to check a la biscuits).

When receiving that token, the issues service would know what filtering to do locally, without having to reach out to Oso.

The information passed around between the services mostly stays the same, but rather than make an API call each time, the necessary authz information is encoded in the biscuit token.
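A rough Python sketch of that flow. Real Biscuit tokens use public-key signatures and a Datalog-based format; here a plain HMAC over a JSON payload stands in for them, with the assumption that the authz service and the issues service share `SECRET`. All names are hypothetical. The token carries the partially evaluated conditions, and the issues service verifies it and filters locally without calling back.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-shared-secret"  # assumption: both services hold this key

def issue_token(conditions):
    # Authz service side: sign the conditions the policy reduced to.
    payload = json.dumps(conditions, sort_keys=True).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )

def verify_token(token):
    # Issues service side: check the signature before trusting the payload.
    payload_b64, signature_b64 = token.split(".")
    payload = base64.urlsafe_b64decode(payload_b64)
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(payload)

def filter_issues(issues, token):
    # Apply the carried conditions to local data; no call to the authz service.
    cond = verify_token(token)
    return [
        issue
        for issue in issues
        if issue["repo_id"] in cond["repo_ids"] or issue["creator"] == cond["creator"]
    ]

token = issue_token({"repo_ids": [1, 2, 3], "creator": "alice"})
issues = [
    {"id": 10, "repo_id": 1, "creator": "bob"},
    {"id": 11, "repo_id": 9, "creator": "alice"},
    {"id": 12, "repo_id": 9, "creator": "bob"},
]
```

A production token would also need an expiry and a subject, since a cached decision is only as fresh as the moment it was signed.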

And then next level: biscuits can cross federation boundaries and be attenuated, etc. So it really starts unlocking novel ways to integrate application-level authz with infrastructure-level authz.


Biscuit maintainer here o/

There's definitely some cool use cases we could collaborate on. One thing I'm looking at more deeply right now is tokens carrying the necessary data as they go through the system, to make sure one call does not see some of its rights change dynamically when going from service to service


Quick addition: in practice everyone that I know uses Git + CI/CD for managing + deploying policy changes.


Any such IDE needs a way to communicate with the services that are going to enforce the permissions. The DSL is a good way to do this. It allows inspection, diffing, and access by other tools, instead of being locked away in some hidden IDE binary structures.

Edit: But no IDEs will yet have any graphical tool to assist in rule development and checking. So the fact that one is provided, is better than it not existing at all. But you can always edit the DSL manually, if need be. They explicitly state that their rules-editor can sometimes be limited:

"The Rules Workbench, a visual rules editor that you can use to model most of these patterns"


Sure, the IDE edits a file that gets deployed. Just like all the other source files.


Slightly tangential, but is there any hope of seeing Polar return as a (maintained) open source system?

I absolutely love the concept of using a logic language for authorization, and I think Polar's aesthetic qualities make it significantly more approachable for most people (over Prolog/Datalog).

But even without the authorization problem, Polar is just... really nice looking. It would be awesome to be able to use it as its own language outright.


Yes. This is happening. Stay tuned.


Quick note for the osohq team: The "Read the docs" button leads to 404


Doh, fixing now. Thank you!


Is it self hosted? I can't see any docker image


This is a product launch. Full title: "Authorization is still a nightmare for engineers: Launching Distributed Authorization"


And now it's just "Distributed Authorization"; further distancing itself from the product it's actually about.



