Retrieval reliability
- metrics that we are shooting for:
- success rate
- time to first byte
- transfer speed
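The three metrics above can be computed from per-attempt records. A minimal sketch, assuming a hypothetical `RetrievalAttempt` record (all field names are illustrative, not from any real tooling):

```python
from dataclasses import dataclass
from statistics import median

@dataclass
class RetrievalAttempt:
    # Hypothetical record of one retrieval attempt; fields are illustrative.
    ok: bool              # did the retrieval succeed?
    ttfb_ms: float        # time to first byte, in milliseconds
    bytes_received: int   # payload size actually transferred
    duration_s: float     # total transfer duration, in seconds

def success_rate(attempts):
    # Fraction of attempts that succeeded.
    return sum(a.ok for a in attempts) / len(attempts)

def median_ttfb_ms(attempts):
    # Median time-to-first-byte over successful attempts only.
    return median(a.ttfb_ms for a in attempts if a.ok)

def mean_speed_mbps(attempts):
    # Mean transfer speed over successful attempts, in megabits per second.
    ok = [a for a in attempts if a.ok]
    return sum(a.bytes_received * 8 / 1e6 / a.duration_s for a in ok) / len(ok)
```

Reporting TTFB as a median (or p95) rather than a mean avoids a few stalled retrievals dominating the number.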
- 2 aspects
- technological
- reactive / address issues as they come
- reliability
    - getting everything lined up to make a retrieval (many diagnostic issues we are resolving)
        - miners need to know how to retrieve the data; this can break
- dialability (miners are not reachable)
- filtering for only retrieval from certain peers
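A dialability check can be sketched as a reachability probe. A real check would dial the miner's libp2p multiaddr and complete a handshake; the simplified version below (my illustration, not existing tooling) only tests that a TCP connection opens at all:

```python
import socket

def is_dialable(host: str, port: int, timeout_s: float = 5.0) -> bool:
    # Simplified reachability probe: real dialability checks go through
    # libp2p (multiaddr + handshake); here we only attempt a TCP connect.
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```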
- transfer
- transfer reliability not directly related to ttfb
    - transfer is slow for graphsync (likely because the libp2p library is not optimized); not as slow over HTTP, but there are bandwidth problems when serving multiple retrievals
        - fixing the libp2p library could take a few months; not much expertise exists around this
    - making Boost multi-process: scale to handle storage and retrievals simultaneously
- limiting factor right now is SPs don’t have bandwidth
        - can put things in different processes, but that doesn’t help if they still share the same network connection
- SPs will be “uploading” data in retrieval cases
- economic/getting miners in a good place
- leaving retrievals on
- state of deleted pieces, state of metadata, build tools to fix this
- providers delete unsealed copies
- need to be able to pay people while retrieving
- leads to technological problem with payment channels
- accelerate payments without going to the blockchain
- grant team working on payment channels (payment channel research team)
    - indexer could be a good broker for retrievals (between the party storing/retrieving and the provider)
- out of band payment story would work too
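The payment-channel idea above can be sketched conceptually: the client sends a signed voucher with a strictly increasing cumulative amount as data arrives, and only the final voucher needs to be settled on chain. In this toy version HMAC stands in for a real cryptographic signature, and all names are illustrative:

```python
import hashlib
import hmac

def sign_voucher(key: bytes, channel: str, amount: int) -> bytes:
    # Client signs a cumulative amount for a channel. HMAC over a shared key
    # is a stand-in for a real public-key signature scheme.
    msg = f"{channel}:{amount}".encode()
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify_voucher(key: bytes, channel: str, amount: int,
                   sig: bytes, prev_amount: int) -> bool:
    # Provider accepts a voucher only if the signature checks out and the
    # cumulative amount increased (rejects replays of older, smaller vouchers).
    expected = sign_voucher(key, channel, amount)
    return hmac.compare_digest(expected, sig) and amount > prev_amount
```

Because only the highest-amount voucher is redeemed, per-chunk payments never touch the blockchain, which is the acceleration the notes are after.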
- retrievals from multiple parties at once
- graphsync isn’t really built well for this
- current HTTP implementation is very simple (no payments, no auth); might want to make data available over HTTP, open to everyone, since participants are in Slingshot/Evergreen
- problem: providers could get overloaded
- can do the same thing with bitswap
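Multi-party retrieval can be sketched as range splitting: divide the piece into byte ranges and fetch them concurrently, as HTTP range requests (or bitswap want-lists) would allow, which is exactly what graphsync's session model makes hard. `fetch_range` is a hypothetical callable standing in for one provider:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_retrieve(fetch_range, size: int, n_parts: int) -> bytes:
    # Illustrative multi-source retrieval: split [0, size) into n_parts byte
    # ranges and fetch them concurrently. `fetch_range(start, end)` is a
    # hypothetical callable returning the bytes in [start, end).
    step = -(-size // n_parts)  # ceiling division
    ranges = [(i, min(i + step, size)) for i in range(0, size, step)]
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        parts = pool.map(lambda r: fetch_range(*r), ranges)  # keeps range order
    return b"".join(parts)
```

In a real deployment each range could go to a different provider, spreading load and mitigating the overload problem noted above.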