Problem Statement:
Enterprises storing data on Filecoin want fine-grained access control over retrieval of their data. We don’t currently offer this.
Key Advantages of this proposed solution:
- It is simple and performant to run in Boost. Boost does not talk to an ACL service. Instead, all authorization is done ahead of time, and for non-public data Boost only checks that the supplied authorization matches the scope of the retrieval request.
- Retrieval of public data is not slowed by authorization checks for private data. Performance for private data can scale, potentially even far enough to support Bitswap.
- People storing data manage their ACLs directly with an ACL service rather than through SPs.
- ACL functionality is universal and portable. An enterprise doesn’t have to pick an SP based on whether it will support a particular type of ACL service, and they can move their data without having to redo their ACLs. It can also work with an organization’s existing ACLs.
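The first two advantages come down to a decision the SP can make locally. The following is a minimal sketch of that decision, not Boost code: the `Deal` record, its `is_public` flag, and the idea that an already-validated token has been reduced to the set of payload CIDs it covers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Deal:
    # Hypothetical record; Boost's real deal metadata looks different.
    payload_cid: str
    is_public: bool


def should_serve(deal: Deal, token_scope: Optional[FrozenSet[str]]) -> bool:
    """Decide locally whether to serve a retrieval, with no ACL-service call."""
    if deal.is_public:
        # Fast path: public data never touches the authorization logic.
        return True
    if token_scope is None:
        # Non-public data requested without any authorization is refused.
        return False
    # Non-public data: the pre-issued authorization must cover the request.
    return deal.payload_cid in token_scope
```

Because the check for non-public data is a set lookup in this sketch, its cost does not depend on how complex the data owner's ACL rules are; that complexity stays with the ACL service.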
Prior art:
- The closest thing we have to ACLs currently is the custom retrieval filter, which allows an SP to run a script on every retrieval to decide whether to serve it. Services like CID Gravity have built ACL-like functionality by providing an executable for SPs to use as their custom retrieval script. This executable in turn calls out to a centralized service to determine whether a user is authorized to access some data based on a set of ACLs.
Definitions:
- Data Owner - An organization or entity that wants to store their data and make it retrievable only to authorized users.
- Authorized User - A person or entity who is authorized to access a given piece of non-public data stored by the data owner.
- Storage Provider - The person holding the data owner’s data, who ultimately decides whether to serve it for retrieval.
- ACL Service - A service which manages lists of rules that define access to a given data owner’s data. A large organization may want to set up fine-grained access for many different users.
Solution Overview (product level):
- Data Owners work independently with an ACL service to set up access rules.
- Data Owners make a storage deal with an SP, mark the deal as not public, and supply a FIL address that identifies them as an entity.
- Separate from a retrieval request, an authorized user obtains an authorization token from an ACL service that describes what data they are able to access from a given data owner. Depending on the scope of this token, it can be used for one or more requests, against one or more pieces of data.
- Retrieval protocols are augmented to support sending an authorization token. When deciding whether to serve non-public data, an SP verifies the authorization token is valid (signed directly or indirectly by the data owner) and has a scope that covers the requested retrieval.
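The SP-side verification in the last step can be sketched as follows. This is an illustration under stated assumptions, not a proposed token format: the `Token` and `Delegation` fields, the JSON encoding, the single-hop delegation, and the placeholder addresses `f1owner` and `f1acl` are all hypothetical. Signature checking is injected as a `verify` callable; the `toy_sign` and `toy_verify` helpers use HMAC only so the sketch runs, and a real deployment would verify an asymmetric signature against the signer's FIL address.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple

# verify(signer, message, signature) -> bool
Verifier = Callable[[str, bytes, bytes], bool]


@dataclass(frozen=True)
class Delegation:
    """The data owner authorizes `delegate` (e.g. an ACL service) to issue tokens."""
    owner: str
    delegate: str
    signature: bytes = b""

    def message(self) -> bytes:
        body = {"owner": self.owner, "delegate": self.delegate}
        return json.dumps(body, sort_keys=True).encode()


@dataclass(frozen=True)
class Token:
    issuer: str              # address that signed this token
    owner: str               # data owner whose data the token covers
    scope: Tuple[str, ...]   # payload CIDs the bearer may retrieve
    expires_at: int          # Unix seconds
    signature: bytes = b""
    delegation: Optional[Delegation] = None  # needed when issuer != owner

    def message(self) -> bytes:
        body = {"issuer": self.issuer, "owner": self.owner,
                "scope": list(self.scope), "expires_at": self.expires_at}
        return json.dumps(body, sort_keys=True).encode()


def authorize(token: Token, deal_owner: str, requested_cid: str,
              now: int, verify: Verifier) -> bool:
    """Return True if the token is valid and its scope covers the request."""
    if token.owner != deal_owner or now >= token.expires_at:
        return False
    if requested_cid not in token.scope:
        return False
    if not verify(token.issuer, token.message(), token.signature):
        return False
    if token.issuer == token.owner:
        return True  # signed directly by the data owner
    # Signed indirectly: the owner must have delegated to the issuer.
    d = token.delegation
    return (d is not None and d.owner == token.owner
            and d.delegate == token.issuer
            and verify(d.owner, d.message(), d.signature))


# Toy stand-in for real signatures: HMAC with a per-address key table.
TOY_KEYS = {"f1owner": b"owner-key", "f1acl": b"acl-key"}


def toy_sign(signer: str, message: bytes) -> bytes:
    return hmac.new(TOY_KEYS[signer], message, hashlib.sha256).digest()


def toy_verify(signer: str, message: bytes, signature: bytes) -> bool:
    if signer not in TOY_KEYS:
        return False
    return hmac.compare_digest(toy_sign(signer, message), signature)


def issue(issuer: str, owner: str, scope, expires_at: int,
          delegation: Optional[Delegation] = None) -> Token:
    """Build a token and sign it as `issuer` (toy signing only)."""
    unsigned = Token(issuer, owner, tuple(scope), expires_at, b"", delegation)
    return replace(unsigned, signature=toy_sign(issuer, unsigned.message()))
```

In this sketch a token issued by `f1acl` is accepted only when it carries a delegation signed by `f1owner`, and nothing in `authorize` requires a network call, which matches the goal of keeping the SP out of ACL management.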
The resulting flow looks like this: