<aside> 💡 The Production Engineering team merged with ProbeLab on 23rd January 2023. This page and those under it are to be considered historical and will eventually be archived.

</aside>

The KPI for this project is the 7-day mean of the 95th percentile time to first byte (TTFB), as reported by all Gateway NGINX servers.

TTFB captures the time go-ipfs takes to resolve and read the requested block. That block may be the root of a file DAG whose remaining blocks must also be retrieved to fulfil the whole request, but this additional retrieval time is not necessarily part of the TTFB metric if go-ipfs is able to begin streaming to the client immediately.

Quick Links

🔒 indicates private/internal resource

Current Focus: Increase data locality

GitHub Epic: Increase data locality in Gateway

The overarching hypothesis is that TTFB can be improved by ensuring that more requests are served from a local blockstore instead of waiting on network block discovery. Data that is popular and expensive to fetch should be held locally for longer by gateways, while still respecting limits on disk usage.

Initially we are investigating better, cache-aware garbage collection strategies for go-ipfs (GitHub issue). We theorize that garbage collection deletes blocks from the blockstore only for them to be re-requested at a later date, at which point they must be fetched from the network again, which is much slower than a local read. Garbage collection is required to keep within disk space limits, but the current implementation is unaware of the usage patterns of unpinned blocks.
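To make the idea of usage-aware retention concrete, here is one possible shape it could take, sketched in Go. This is purely illustrative and is not the team's design or any go-ipfs API: the `BlockStats` type, the `score` heuristic (request count multiplied by fetch time), and `selectEvictions` are all hypothetical names and choices introduced for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// BlockStats is a hypothetical per-block usage record. go-ipfs does not
// expose such a type; it exists only for this sketch.
type BlockStats struct {
	ID           string
	Size         int64   // bytes on disk
	Hits         int     // requests served from this block
	FetchSeconds float64 // time it took to fetch the block from the network
	Pinned       bool    // pinned blocks are never evicted
}

// score is an assumed heuristic: blocks that are requested often and were
// slow to fetch are considered more valuable to keep.
func score(b BlockStats) float64 {
	return float64(b.Hits) * b.FetchSeconds
}

// selectEvictions returns the IDs of unpinned blocks to delete, lowest score
// first, until the total stored size is at or below limit. If pinned blocks
// alone exceed the limit, every unpinned block is returned.
func selectEvictions(blocks []BlockStats, limit int64) []string {
	var total int64
	var candidates []BlockStats
	for _, b := range blocks {
		total += b.Size
		if !b.Pinned {
			candidates = append(candidates, b)
		}
	}
	sort.Slice(candidates, func(i, j int) bool {
		si, sj := score(candidates[i]), score(candidates[j])
		if si != sj {
			return si < sj
		}
		return candidates[i].ID < candidates[j].ID // deterministic tie-break
	})
	var evict []string
	for _, b := range candidates {
		if total <= limit {
			break
		}
		evict = append(evict, b.ID)
		total -= b.Size
	}
	return evict
}

func main() {
	blocks := []BlockStats{
		{ID: "pinned", Size: 100, Pinned: true},
		{ID: "popular", Size: 100, Hits: 50, FetchSeconds: 2},
		{ID: "cold", Size: 100, Hits: 1, FetchSeconds: 0.1},
		{ID: "rare-but-slow", Size: 100, Hits: 2, FetchSeconds: 5},
	}
	fmt.Println("evict:", selectEvictions(blocks, 250))
}
```

Unlike a collector that removes every unpinned block, this keeps the highest-scoring unpinned blocks for as long as the disk limit allows.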

The default go-ipfs garbage collection strategy is mark and sweep:

(bestEffortRoots, by default, only includes the MFS root)
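For readers unfamiliar with the technique, the following is a simplified Go sketch of mark and sweep over a toy in-memory blockstore. It is not the go-ipfs implementation: the `Block`, `Blockstore`, `mark`, `sweep`, and `gc` names are hypothetical, and pinned roots and best-effort roots are treated identically here for simplicity.

```go
package main

import (
	"fmt"
	"sort"
)

// Block is a toy stand-in for an IPFS block: an ID plus the IDs of the
// child blocks it links to.
type Block struct {
	ID    string
	Links []string
}

// Blockstore is a toy in-memory blockstore keyed by block ID.
type Blockstore map[string]Block

// mark walks the DAG from every root and returns the set of reachable IDs.
// Links to blocks that are absent from the store are marked but not followed.
func mark(bs Blockstore, roots []string) map[string]bool {
	live := make(map[string]bool)
	stack := append([]string(nil), roots...)
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if live[id] {
			continue
		}
		live[id] = true
		if b, ok := bs[id]; ok {
			stack = append(stack, b.Links...)
		}
	}
	return live
}

// sweep deletes every stored block that was not marked and returns the
// removed IDs in sorted order.
func sweep(bs Blockstore, live map[string]bool) []string {
	var removed []string
	for id := range bs {
		if !live[id] {
			delete(bs, id)
			removed = append(removed, id)
		}
	}
	sort.Strings(removed)
	return removed
}

// gc marks from the pinned roots and the best-effort roots, then sweeps.
func gc(bs Blockstore, pinnedRoots, bestEffortRoots []string) []string {
	roots := append(append([]string(nil), pinnedRoots...), bestEffortRoots...)
	return sweep(bs, mark(bs, roots))
}

func main() {
	bs := Blockstore{
		"pinned-root":   {ID: "pinned-root", Links: []string{"pinned-leaf"}},
		"pinned-leaf":   {ID: "pinned-leaf"},
		"mfs-root":      {ID: "mfs-root", Links: []string{"mfs-file"}},
		"mfs-file":      {ID: "mfs-file"},
		"unpinned-root": {ID: "unpinned-root", Links: []string{"unpinned-leaf"}},
		"unpinned-leaf": {ID: "unpinned-leaf"},
	}
	removed := gc(bs, []string{"pinned-root"}, []string{"mfs-root"})
	fmt.Println("removed:", removed)
	fmt.Println("blocks remaining:", len(bs))
}
```

The sweep step removes every unreachable block regardless of how often or how recently it was requested, which is the usage-unaware behaviour described above.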

We think improving retention of valuable blocks is a tractable problem to solve for the current small ProdEng team (1.5 people). We plan to:

Outline of proposed approach: