Idea: every job has an identifier generated by hashing all of the code and input data required to run the job. Two jobs doing the same work over the same data have the same hash – even if the two jobs are running on different machines that have never communicated.
If execution of jobs is also deterministic, then each job hash also uniquely identifies a set of outputs as well – because running the same code over the same data deterministically should result in the same output. This means that it’s possible to see where data output by one job is used as input to another, and how all of the jobs link together in a call tree.
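As a rough illustration of the idea, here is a minimal sketch in Go. It assumes a simplified job structure with plain JSON + SHA-256 for canonicalisation and hashing; the actual spec would use a proper canonical IPLD encoding and CIDs, and the field names here are illustrative rather than taken from the spec PR.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Job is a simplified stand-in for an input job spec: only what is needed
// to unambiguously identify the work. Field names are illustrative, not
// taken from the spec PR.
type Job struct {
	Executor string            `json:"executor"` // e.g. "wasm" or "docker"
	Code     string            `json:"code"`     // content address of the code to run
	Inputs   map[string]string `json:"inputs"`   // named content addresses of input data
}

// JobID hashes a canonical encoding of the job. Two parties that construct
// the same Job value derive the same ID without ever communicating.
func JobID(j Job) (string, error) {
	// encoding/json emits map keys in sorted order, which is enough to make
	// this toy encoding stable; the real spec would use a canonical IPLD
	// encoding and a CID rather than a bare SHA-256 hex string.
	canonical, err := json.Marshal(j)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(canonical)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	j := Job{
		Executor: "wasm",
		Code:     "bafy...codecid",
		Inputs:   map[string]string{"dataset": "bafy...inputcid"},
	}
	id, _ := JobID(j)
	fmt.Println(id) // same on any machine that builds the same Job
}
```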
Tl;dr:
Job Spec by expede · Pull Request #8 · ipvm-wg/spec
Why
Having standard, hashable jobs is a prerequisite for several valuable properties (discussed during CoD Summit²):
- Link job executions to their outputs.
- Outputs from previous executions of the same job can be re-used (memoization). If there’s a mechanism for discovering previous job outputs, those results can be used instead of running the same work again (see the sketch after this list).
- Reputation. Makes all generated results transparent, which can be used as the basis for a reputation system.
- Economic incentivisation of useful data. Using the call tree of jobs and their outputs, it would be possible to see how many jobs ultimately refer to a piece of data (either directly or via the output of another job).
- Jobs are reproducible, even across networks. If someone runs the same job later to reproduce the result, they can be confident that they will get the same result as the first run, even if the original network has shut down.
- To achieve this we also need a standard way of executing jobs – i.e. how a job is run must not affect its output. Broadly, determinism. Strict determinism may not really be necessary if we aren’t bothered about insubstantial differences in the output.
- ❓Is it for users? Do we expect users to use this format to submit jobs?
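A hedged sketch of the re-use point above, building on the Job/JobID sketch at the top of these notes. ResultStore is a hypothetical discovery mechanism (a local cache, a DHT, another network’s index) and is not anything defined in the spec.

```go
// ResultStore is a hypothetical mechanism for discovering previous results
// by job ID; it could be a local cache, a DHT, or another network's index.
type ResultStore interface {
	Lookup(jobID string) (outputCID string, ok bool)
	Publish(jobID string, outputCID string)
}

// RunOrReuse memoizes on the job hash: if a prior execution of an identical
// job is discoverable, its output is re-used instead of re-running the work.
func RunOrReuse(store ResultStore, j Job, run func(Job) (string, error)) (string, error) {
	id, err := JobID(j)
	if err != nil {
		return "", err
	}
	if out, ok := store.Lookup(id); ok {
		return out, nil // previous result re-used; no work performed
	}
	out, err := run(j)
	if err != nil {
		return "", err
	}
	store.Publish(id, out)
	return out, nil
}
```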
Doing this in a standard, cross-network way means that all of these qualities work across networks as well. A network has a good incentive to buy into the standard because it gains access to the memoization and reputation-building (maybe) happening on other networks.
Principles
- Brooklyn: “work out the minimum viable standard, leave holes and extension points for the rest”.
- Implementers, please build parsers that preserve input but aren’t strict on it
- The spec needs to contain enough that we can trust that jobs which hash the same really do produce the same outputs, to within some substantive envelope
- The same job shouldn’t produce output with a different semantic meaning
- But it would be fine for it to produce output with incidental differences
- E.g. if ordering of the output doesn’t matter, then outputs in different orders are fine, even though they don’t hash the same (see the sketch after this list)
- So we don’t care about execution options, e.g. timeouts. No “how”, only “what”
- But the spec mustn’t include so much that it limits innovation, else people won’t sign up for it
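To make the “substantive envelope” point concrete, here is a small hedged sketch (continuing the Go examples; needs `import "sort"`). Two outputs count as the same result if they match after normalisation, even though their raw bytes – and therefore their hashes – differ. The only incidental difference tolerated here is line ordering; what the envelope actually covers would be left to the job or executor, and nothing here is from the spec.

```go
// SameWithinEnvelope treats two outputs as equivalent if they contain the
// same lines regardless of order. This is a stand-in for whatever notion of
// "semantically the same output" a job declares.
func SameWithinEnvelope(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	as := append([]string(nil), a...) // copy before sorting so callers' slices are untouched
	bs := append([]string(nil), b...)
	sort.Strings(as)
	sort.Strings(bs)
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}
```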
Features
- Coverage of both input and output
- Input: just what is needed to unambiguously run the job and achieve the same semantic output – used as a key for discovery of output data
- Output: loads more metadata about how it was run – used for assessing the results (e.g. for trustworthiness) and back-linking from output data to the job that made it
- Deterministic and non-deterministic jobs
- Bacalhau has a desire to run more deterministic workloads, but realistically non-deterministic workloads will be part of the system for a long time
- Other systems (e.g. Koi) are likely to be similar – these users should be included
- Executor agnostic
- WASM jobs are the focus for IPVM, Docker jobs are the focus for Bacalhau. Others (like Koi) support user-configurable tasks (Bacalhau might like to support such a thing too).
- So the actual types of execution need to be extensible.
- Input data that has special meaning to the runtime
- Some data has special meaning in the context of executing the job, e.g. the job code, or, in the WASM world, maybe even the import modules and memory block.
- The spec should provide a way for these volumes to be highlighted
- Transparent transformation of jobs
- So that we can take something the user has submitted and discover the end result, even if their job spec was “transformed” into something else.
- E.g. the user says “run this Rust”, the system says “run this WASM”; the user can still use their original spec to recover the result.
- Also means we can do versioning, e.g. “this spec is v2, v1 is here”
- Also means that networks can do some “canonicalisation” of the spec if they want, e.g. if the user asks for a Docker job with “ubuntu:latest”, the system could rewrite this to “ubuntu:22.04” to preserve the actual value used – valuable for reproducibility
- This shouldn’t be part of the input spec, but it is useful metadata to have in the output spec (see the sketch below)
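A hedged sketch of how the output side might look, continuing the Go examples above. The Receipt type and its field names are assumptions for illustration, not fields from the spec PR; it shows the input/output split and the back-link from a transformed (canonicalised) spec to the spec the user originally submitted.

```go
// Receipt is an illustrative output-side record. The input Job is the
// discovery key; the Receipt carries the extra metadata about how the job
// was actually run, plus back-links from the result to the job, and from a
// transformed/canonicalised spec back to the spec the user submitted.
type Receipt struct {
	JobID       string            `json:"job"`                // hash of the (possibly canonicalised) input spec
	OriginalJob string            `json:"original,omitempty"` // hash of the spec as submitted, if it was transformed
	Outputs     map[string]string `json:"outputs"`            // named content addresses of the results
	Executor    string            `json:"executor"`           // what actually ran it, e.g. "docker/ubuntu:22.04"
	Runtime     map[string]string `json:"runtime,omitempty"`  // free-form execution metadata: timings, machine, etc.
}
```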
Not-features