We need to set aside time in our roadmap to investigate the many reasons why retrievals are unreliable, and then tackle each one. Without this understanding, building out things like reputation systems may not actually move the needle on the core problems.
Motivation
Education
Storage provider pain points
- Unclear incentives and consequences - with the current design of the protocol, there are limited incentives for serving retrievals and limited consequences for failing to serve data that is stored.
- Retrieval deals are difficult to make profitable - storage providers have shared that it is difficult to set an appropriate retrieval price. There is limited information available for calculating what price would ensure a profitable deal.
- No consequence for failing to serve retrievals - while there is an expectation that verified deals guarantee free and fast retrieval, nothing enforces it. A handful of storage providers have brought up this point.
- Resourcing constraints - storage providers have intricate operations for storage deals. Introducing the option of retrieval deals means they need to either invest in additional resources or divert some of their existing resources to serve retrieval deals.
- consumes computation / GPU - there are still many unknowns here. There have been few retrieval deals that require unsealing, so good data is not yet available.
- consumes internet bandwidth
- consumes machines that could otherwise be doing work around storage deals (which are more profitable, and have consequences if not done properly)
- no load balancing on Lotus - operations can easily be overwhelmed
- Introducing more risk/unknowns - Currently, miners are careful to ensure their systems are operating as expected, and there is an aversion to making updates which can affect their operations.
- Slashing - If a storage provider is unable to properly manage retrieval deals, and something causes their miner to crash, there is a risk of missing a WindowPoSt. Getting slashed is a big consequence for them, and they'd rather not take the risk.
- Latency and Bandwidth - ensuring fast retrievals (for client perception) is hard, due to latency which is dependent on location and connection strength for each storage provider.
- An example from Hidde (one of the more sophisticated storage providers, based in Europe with strong internet bandwidth there): If retrievals stay within Europe, everything works great. If going to the US, there is more latency on the link, and the Graphsync protocol does not work well. If the retrieval is going to China or Korea, retrieval does not work reliably.
- Expensive to have multiple copies stored per storage deal - Many of the MinerX group store 3 copies of deal data (1 sealed, 1 backup, and 1 unsealed for retrieval). This is expensive for them, and smaller miners may not be able to afford using 3x the storage resources per deal.
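To make the cost of the three-copy setup described above concrete, here is a minimal sketch of the raw capacity it implies. The function name and the 32 GiB example deal size are illustrative assumptions, not figures reported by storage providers, and the sketch deliberately ignores sealing overhead and sector padding.

```python
def raw_footprint_gib(deal_size_gib, copies=3):
    """Raw capacity (GiB) used when `copies` full copies of a deal are kept.

    Assumption: each copy (sealed, backup, unsealed) occupies roughly the
    deal size; sealing overhead and sector padding are ignored.
    """
    if deal_size_gib < 0 or copies < 1:
        raise ValueError("deal size must be >= 0 and copies >= 1")
    return deal_size_gib * copies


# Hypothetical 32 GiB deal stored as 1 sealed + 1 backup + 1 unsealed copy.
print(raw_footprint_gib(32))            # 96
print(raw_footprint_gib(32, copies=1))  # 32
```

Under these assumptions, a provider keeping all three copies needs three times the raw capacity of one keeping only the sealed copy, which is the gap smaller miners may not be able to absorb.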
Client pain points
- Overall responsiveness is slow - as a client, there is an expectation that there should be some response or indication that a retrieval is in progress. This is currently lacking, making the overall retrieval process feel unresponsive.
- Time to first byte (TTFB) is very high, so retrievals do not seem responsive - for a user downloading, it takes a while (1 min or so) before data starts arriving.
- Slingshot clients mentioned that “fast retrieval” takes around 20 seconds, which is too slow for most websites and consumer use cases.
- Low concurrency limits - Miners are able to set a concurrency limit on the number of parallel retrievals. Most miners set it quite low, which could be causing bottlenecks for client retrievals.
- Most of the MinerX group has it set to below 10 (as of December 2021)
- Average SimultaneousTransfersForRetrieval = 10, Highest = 30, Mode = 1
- If a miner has a concurrency limit of 2 and already has 2 retrievals in progress, any additional retrieval requests must queue until the earlier retrievals finish processing.
- Your retrieval will block until the miner is ready.
- It’s not clear if there is a way for miners to deprioritize connections.
- No method to communicate with the miner - in some cases, clients may want to abandon the retrieval. Currently, go-data-transfer does not offer a way to tell the miner this, which can cause the miner to waste resources trying to serve a retrieval that the client is no longer waiting for.
- In this case, a miner needs to manually cancel the download if it is not progressing. For providers that may need to serve multiple retrievals, this can cause unnecessary slowdown for the active retrievals.
- We could consider adding a method for miners to automatically manage dropped connections
- Payment requirements unclear - clients who have not attempted retrieval before are confused about why they need to pay an additional amount to retrieve data that they already paid to store in the storage deal.
- Slingshot clients were generally OK with this, but stated that in traditional storage services the retrieval cost is included in the storage cost. Paying separately for retrieval is not a pattern that they are accustomed to.
- Having an option for SPs to specify retrieval cost at the time of storage dealmaking could be helpful in setting expectations.
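The queueing effect of a low concurrency limit, described in the concurrency bullets above, can be sketched with a small model. This is a simplified illustration, not how Lotus schedules transfers: it assumes all requests arrive at the same moment, are served in arrival order, and have known durations. The function name and the example durations are hypothetical.

```python
import heapq


def queue_waits(durations_s, limit):
    """Seconds each retrieval waits before starting, given a concurrency limit.

    Assumptions: all requests arrive at t=0, are served first come first
    served, and each occupies one slot for its full duration.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # Min-heap of the times at which each slot next becomes free.
    slots = [0.0] * limit
    heapq.heapify(slots)
    waits = []
    for duration in durations_s:
        start = heapq.heappop(slots)  # earliest free slot
        waits.append(start)
        heapq.heappush(slots, start + duration)
    return waits


# Limit of 2 with three 60-second retrievals: the third waits a full minute.
print(queue_waits([60, 60, 60], limit=2))   # [0.0, 0.0, 60.0]
# The same requests with a limit of 10 start immediately.
print(queue_waits([60, 60, 60], limit=10))  # [0.0, 0.0, 0.0]
```

Even in this idealized model, a client behind a full queue sees no data until an earlier retrieval finishes, which compounds the responsiveness problems listed above.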