The first step in the SPARK retrieval check workflow is the selection of the (cid, address) pair that the given checker should test.
We have the following requirements:
(cid, address) is chosen
checker will be assigned to test the given (cid, address) job.
cid from the same address in a single measurement epoch. The sampling algorithm must be able to account for this.
(cid, address, checker) triple is sampled.
Additional thoughts:
❌ A hard-coded list of (CID, address) to pick from. This list must be private to SPARK Orchestrator (SPs must not be able to access it).
A random walk of IPNI advertisements, using DRAND as a source of randomness.
What would our IPNI query look like?
Can the IPNI team build & ship this API in time for us?
There is no verification that SPs are submitting all CIDs to IPNI, and such verification won’t be in place by LabWeek.
→ Propose the new API - open a new GH issue in https://github.com/ipni/storetheindex/issues
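To make the "DRAND as a source of randomness" idea more concrete, here is a minimal sketch of how a job could be picked deterministically from a round's randomness. It is not the IPNI random walk itself, since the IPNI query is still an open question. It assumes the candidate (cid, address) pairs are already available as a list, and that `randomness_hex` is the hex-encoded randomness value of a drand round. The function names, the checker id format, and the sample values are all hypothetical.

```python
import hashlib


def pick_index(randomness_hex: str, checker_id: str, n: int) -> int:
    """Map (round randomness, checker id) to an index in [0, n).

    Assumption: hashing the randomness together with the checker id is
    enough to give different checkers different picks in the same round.
    """
    if n <= 0:
        raise ValueError("need at least one candidate")
    seed = f"{randomness_hex}:{checker_id}".encode("utf-8")
    digest = hashlib.sha256(seed).digest()
    # A 256-bit digest makes the modulo bias negligible for realistic n.
    return int.from_bytes(digest, "big") % n


def sample_job(candidates, randomness_hex, checker_id):
    """Pick one (cid, address) pair for the given checker."""
    return candidates[pick_index(randomness_hex, checker_id, len(candidates))]


# Hypothetical candidate list; real pairs would come from IPNI or deals.
CANDIDATES = [
    ("bafy-example-1", "/dns/sp1.example/tcp/1234"),
    ("bafy-example-2", "/dns/sp2.example/tcp/1234"),
    ("bafy-example-3", "/dns/sp3.example/tcp/1234"),
]

if __name__ == "__main__":
    # Placeholder randomness, not a real drand round value.
    print(sample_job(CANDIDATES, "00ff" * 16, "checker-A"))
```

Anyone who knows the round randomness and the checker id can recompute the pick, so the choice is verifiable after the fact. Whether this is unpredictable enough for SPs depends on when the randomness becomes public relative to the check, which this sketch does not address.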
A random walk of Filecoin storage deals
Algo:
StateMarketDeals method is over 3GB compressed, over 23GB decompressed. SPARK nodes cannot work with a dataset this large.
❌ For each SPARK deal, the party paying for the retrieval provides a public list of CIDs & addresses to check. (This will presumably be based on Filecoin storage deals made by the paying party.)
We don’t know that the advertisement is honest