Maybe the transport protocol of our dreams was right there all along: the IPFS gateway.
StarGate is a specification to extend the IPFS gateway to support trustless, multipeer data transfer of fairly complex queries.
application/vnd.ipld.car+stargate is the content type passed in the Accept header to indicate a Stargate request.
Marten Seemann already wrote an awesome answer: https://www.notion.so/Transferring-Content-Addressed-Data-over-HTTP-e0cb05500e8446519f58fdcc35b88b1b
I started designing an HTTP protocol based on certain assumptions:
As I started working through this, I realized “what fits in a URL” is an excellent constraint to achieve all of these, and then I realized people have been working on URL-ifying IPFS for years, in the form of gateways.
More importantly, gateways are the most widely deployed mechanism for transferring IPFS data (bigger than Bitswap?). Having a gateway implementation is often table stakes for a new IPFS language implementation. If the Gateway API can be extended to work over libp2p connections (already possible because of the libp2p + HTTP spec) AND be an effective mechanism for trustless data transfer, that feels like an ideal path to wide deployment and usage.
The core of Stargate is a new specification for a specialized CAR file, which is also a valid general CARv1 file.
GET /ipfs/{cid}[/{path}][?{params}]
A StarGate CAR should specify {cid} as the root CID in the CAR header.
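To make the request shape concrete, here is a minimal Python sketch that builds (but does not send) a Stargate request using only the standard library. The helper name build_stargate_request, the gateway host, the placeholder CID, and the depth query parameter are all hypothetical; this document does not define any specific {params}.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Content type named in this document for Stargate requests.
STARGATE_CONTENT_TYPE = "application/vnd.ipld.car+stargate"


def build_stargate_request(gateway, cid, path_segments=(), params=None):
    """Build GET /ipfs/{cid}[/{path}][?{params}] with the Stargate Accept header.

    Hypothetical helper: the name and argument layout are illustrative only.
    """
    url = f"{gateway.rstrip('/')}/ipfs/{cid}"
    for segment in path_segments:
        # Percent-encode each segment so it stays a single URL path element.
        url += "/" + quote(segment, safe="")
    if params:
        url += "?" + urlencode(params)
    return Request(url, headers={"Accept": STARGATE_CONTENT_TYPE}, method="GET")


# Example with a placeholder gateway, CID, and a made-up "depth" parameter.
request = build_stargate_request(
    "https://gateway.example", "bafyExampleCid", ["someFile"], {"depth": "1"}
)
print(request.get_method(), request.full_url)
```

The CID in the URL is the one a conforming response would be expected to list as the root in its CAR header.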
Following the CAR header, the first block is a StarGate message.
The stargate message format is as follows:
type StarGateMessage struct {
  Kind Kind (rename "knd")
  Path nullable Path (rename "pth")
  DAG nullable DAG (rename "dag")
} representation map

type Kind enum {
  # Path indicates a pathing sequence
  | Path ("p")
  # DAG indicates a DAG block
  | DAG ("d")
} representation string

type Path struct {
  # names of the URL path segments covered by this message
  Segments [String] (rename "seg")
  # CIDs required, in order, to verify this segment of the path
  Blocks BlockMetadata (rename "blks")
} representation map

type DAG struct {
  Ordering Ordering (rename "ord")
  Blocks BlockMetadata (rename "blks")
} representation map

type Ordering enum {
  # DepthFirst indicates blocks will be transmitted depth first
  | DepthFirst ("d")
  # BreadthFirst indicates blocks will be transmitted breadth first
  | BreadthFirst ("b")
} representation string

# Metadata for each "link" in the DAG being communicated, each block gets one of
# these and missing blocks also get one
type BlockMetadatum struct {
  Link Link
  Status BlockStatus
} representation tuple

type BlockMetadata [BlockMetadatum]

type BlockStatus enum {
  # Present means the linked block was present on this machine, and is included
  # in this message
  | Present ("p")
  # NotSent means the linked block was present on this machine, but not sent
  # - it needs to be fetched elsewhere
  | NotSent ("n")
  # Missing means I did not have the linked block, so I skipped over this part
  # of the traversal
  | Missing ("m")
  # Duplicate means the linked block was encountered, but we already have traversed it
  # so we're not traversing it again -- the block has likely already been transmitted
  | Duplicate ("d")
} representation string
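As an illustration of how the renamed keys and short enum strings in the schema map onto in-memory types, here is a rough Python sketch. It assumes the message block has already been decoded into plain dicts and lists by some IPLD codec (not shown), and it simplifies links to plain strings. The class layout, the parse_message name, and the rule that a message must carry the payload matching its Kind are assumptions, not part of the schema above.

```python
from dataclasses import dataclass
from typing import List, Optional

# String representations taken from the schema's enums.
KINDS = {"p": "Path", "d": "DAG"}
ORDERINGS = {"d": "DepthFirst", "b": "BreadthFirst"}
STATUSES = {"p": "Present", "n": "NotSent", "m": "Missing", "d": "Duplicate"}


@dataclass
class BlockMetadatum:
    link: str  # simplification: a real decoder would yield a CID object
    status: str


@dataclass
class Path:
    segments: List[str]
    blocks: List[BlockMetadatum]


@dataclass
class DAG:
    ordering: str
    blocks: List[BlockMetadatum]


@dataclass
class StarGateMessage:
    kind: str
    path: Optional[Path] = None
    dag: Optional[DAG] = None


def _parse_blocks(raw_blocks):
    # BlockMetadatum uses the tuple representation: [link, status].
    return [BlockMetadatum(link, STATUSES[status]) for link, status in raw_blocks]


def parse_message(raw: dict) -> StarGateMessage:
    """Convert an already-decoded map into a StarGateMessage."""
    kind = KINDS[raw["knd"]]
    path = dag = None
    if raw.get("pth") is not None:
        path = Path(list(raw["pth"]["seg"]), _parse_blocks(raw["pth"]["blks"]))
    if raw.get("dag") is not None:
        dag = DAG(ORDERINGS[raw["dag"]["ord"]], _parse_blocks(raw["dag"]["blks"]))
    # Assumption: a message carries the payload that matches its kind.
    if (kind == "Path" and path is None) or (kind == "DAG" and dag is None):
        raise ValueError(f"{kind} message is missing its payload")
    return StarGateMessage(kind, path, dag)
```

Unknown enum strings raise KeyError here, which is a deliberately strict choice for a sketch; a real implementation might prefer a more descriptive error.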
If there are URL path segments after the CID, the first message will be a Path StarGate message. Following the path message there will be a data block for every block in the BlockMetadata that has a status of Present.
If the Segments part of the Path message contained the last elements of the URL path, the following message will be a StarGate DAG message.
Otherwise, there will be additional Path segment sequences until the end of the URL path.
After the path, there will be one or more StarGate DAG messages.
A StarGate DAG message simply gives a traversal order and BlockMetadata, followed by all present blocks, similar to the path.
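The message-then-blocks layout described above can be sketched as a small Python reader. It takes the sequence of (cid, data) blocks that follow the CAR header plus a decode_message callable, both of which are stand-ins for a real CAR reader and IPLD codec. It also assumes that Present blocks arrive in the same order as they are listed in BlockMetadata, which this document implies but does not state outright. The function name is hypothetical.

```python
def read_stargate_sections(blocks, decode_message):
    """Yield (message, present_blocks) for each StarGate message in a CAR body.

    blocks: iterable of (cid, data) pairs following the CAR header.
    decode_message: callable turning a message block's data into a dict that
        uses the schema's renamed keys ("knd", "pth", "dag", "blks").
    """
    stream = iter(blocks)
    for _message_cid, message_data in stream:
        message = decode_message(message_data)
        section = message["pth"] if message["knd"] == "p" else message["dag"]
        # Only blocks marked Present ("p") are included after the message.
        expected = [link for link, status in section["blks"] if status == "p"]
        present = {}
        for link in expected:
            item = next(stream, None)
            if item is None:
                raise ValueError(f"stream ended before block {link}")
            cid, data = item
            # Assumption: blocks arrive in BlockMetadata order.
            if cid != link:
                raise ValueError(f"expected block {link}, got {cid}")
            present[cid] = data
        yield message, present
```

Blocks marked NotSent or Missing never appear in the stream, so the caller is left to fetch them elsewhere using the links in the metadata.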
The splitting of Paths and DAGs into one or more messages is defined by the application protocol (i.e. /ipfs, in the parlance of UnixFS).
Important note: CIDs in the CID chain are those required to VERIFY the path, but do not include the CID or block specified at that path itself. So for /ipfs/someCid/someFile, where someCid points to a UnixFS HAMT Directory, the path segment header for someFile would contain someCid as well as the CIDs for deeper levels of the HAMT, up to the leaf node that points to the CID for someFile. However, it would not contain the actual CID for someFile itself.
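For a concrete picture of the rule in the note above, the following Python sketch assembles the decoded map form of a Path message for a single segment. The helper name, the placeholder CID strings, and the choice to mark every verification block as Present are assumptions made for illustration.

```python
def path_message_for_segment(segment, traversed_cids, target_cid):
    """Build the map form of a Path message for one URL path segment.

    traversed_cids: CIDs of the blocks loaded, in order, to verify the segment
        (for example the HAMT root and the shards below it).
    target_cid: the CID the segment resolves to; it must NOT be listed.
    """
    if target_cid in traversed_cids:
        raise ValueError("the target CID is not part of the verification chain")
    return {
        "knd": "p",
        "pth": {
            "seg": [segment],
            # Assumption: every verification block is sent, hence status "p".
            "blks": [[cid, "p"] for cid in traversed_cids],
        },
        "dag": None,
    }


# /ipfs/someCid/someFile, where someCid is a HAMT directory with one shard level.
message = path_message_for_segment("someFile", ["someCid", "hamtShardCid"], "someFileCid")
print(message["pth"]["blks"])
```

The block for someFileCid would then be described by the DAG message that follows, not by this Path message.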