Related to: Content-addressed, hash-linked jobs/functions
Currently (Nov. 2022), the /submit API endpoint takes a data payload whose Job model is significantly more complex than what is required: a number of its fields don’t need to be in the spec at all. This document sits in the Pipeline folder because we need to simplify the way /submit works. A more comprehensive text about how the job spec will evolve is available at Content-addressed, hash-linked jobs/functions.
The submitRequest object is bloated:
Here’s the Job model:
// Job contains data about a job request in the bacalhau network.
type Job struct {
    // (1) WHAT
    // The specification of this job.
    Spec Spec
    // The unique global ID of this job in the bacalhau network.
    ID string
    APIVersion string
    ///// (2) HOW
    // The ID of the requester node that owns this job.
    RequesterNodeID string
    // The public key of the Requester node that created this job
    RequesterPublicKey PublicKey
    // how will this job be executed by nodes on the network
    ExecutionPlan JobExecutionPlan
    // The deal the client has made, such as which job bids they have accepted.
    Deal Deal
    //// (3) STATUS
    // The current state of the job
    State JobState
    // All events associated with the job
    Events []JobEvent
    // All local events associated with the job
    LocalEvents []JobLocalEvent
    //// METADATA
    // Time the job was submitted to the bacalhau network.
    CreatedAt time.Time
    // The ID of the client that created this job.
    ClientID string
}
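The comment groupings above hint at where the bloat comes from: out of all these fields, the minimal working example further down only ever sets APIVersion, Spec and Deal; every other field is either owned by the network side (RequesterNodeID, State, Events, CreatedAt, and so on) or simply absent from the request. Purely as an illustration, and not a proposal for the final shape, the client-facing part of Job could shrink to something like the hypothetical type below, reusing the Spec and Deal types:

// Hypothetical sketch only: the subset of Job that the client meaningfully
// provides when calling /submit today, judging by the minimal example below.
type ClientJobSubmission struct {
    APIVersion string // e.g. "V1beta1"
    Spec       Spec   // (1) WHAT: engine choice, docker config, inputs, outputs, ...
    Deal       Deal   // e.g. Concurrency, the only HOW field the example sets
}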
Here’s the current Spec (nested field in Job):
// Spec is a complete specification of a job that can be run on some execution provider
type Spec struct {
    // e.g. docker or language
    Engine Engine
    Verifier Verifier
    // there can be multiple publishers for the job
    Publisher Publisher
    // executor specific data
    Docker JobSpecDocker
    Language JobSpecLanguage
    Wasm JobSpecWasm
    // the compute (cpu, ram) resources this job requires
    Resources ResourceUsageConfig
    // How long a job can run in seconds before it is killed.
    // This includes the time required to run, verify and publish results
    Timeout float64
    // the data volumes we will read in the job for example "read this ipfs cid"
    Inputs []StorageSpec
    // Input volumes that will not be sharded
    // for example to upload code into a base image
    // every shard will get the full range of context volumes
    Contexts []StorageSpec
    // the data volumes we will write in the job
    // for example "write the results to ipfs"
    Outputs []StorageSpec
    // Annotations on the job - could be user or machine assigned
    Annotations []string
    // the sharding config for this job
    // describes how the job might be split up into parallel shards
    Sharding JobShardingConfig
    // Do not track specified by the client
    DoNotTrack bool
}
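For a plain Docker job, only a handful of these Spec fields need real values and the rest can stay at their zero values, which is exactly what the example request below relies on. The following sketch mirrors that request in Go; the pkg/model import path and the enum constant names (EngineDocker, VerifierNoop, PublisherEstuary, StorageSourceIPFS) are assumptions about the current source tree and may be spelled differently:

package main

import (
    "fmt"

    "github.com/filecoin-project/bacalhau/pkg/model" // assumed import path
)

func main() {
    // Only these fields get real values; Resources, Inputs, Contexts,
    // Annotations, DoNotTrack, etc. are left at their zero values.
    spec := model.Spec{
        Engine:    model.EngineDocker,     // assumed constant name
        Verifier:  model.VerifierNoop,     // assumed constant name
        Publisher: model.PublisherEstuary, // assumed constant name
        Docker: model.JobSpecDocker{
            Image:      "ubuntu",
            Entrypoint: []string{"date"},
        },
        Timeout: 1800,
        Outputs: []model.StorageSpec{
            {StorageSource: model.StorageSourceIPFS, Name: "outputs", Path: "/outputs"},
        },
        Sharding: model.JobShardingConfig{
            BatchSize:           1,
            GlobPatternBasePath: "/inputs",
        },
    }
    fmt.Printf("%+v\n", spec)
}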
Here’s an example request with a (roughly) minimal set of working fields:
{
    "data": {
        "ClientID": "...",
        "Job": {
            "APIVersion": "V1beta1",
            "Spec": {
                "Engine": "Docker",
                "Verifier": "Noop",
                "Publisher": "Estuary",
                "Docker": {
                    "Image": "ubuntu",
                    "Entrypoint": [
                        "date"
                    ]
                },
                "Timeout": 1800,
                "outputs": [
                    {
                        "StorageSource": "IPFS",
                        "Name": "outputs",
                        "path": "/outputs"
                    }
                ],
                "Sharding": {
                    "BatchSize": 1,
                    "GlobPatternBasePath": "/inputs"
                }
            },
            "Deal": {
                "Concurrency": 1
            }
        }
    },
    "signature": "...",
    "client_public_key": "..."
}
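The envelope around the Job (data plus signature and client_public_key) is what actually travels to /submit. Below is a standalone sketch of building and sending it; it uses local stand-in structs rather than the real model types, assumes the node’s API is reachable at localhost:1234 with /submit at the root path, and skips the client-side signing that a real submission needs:

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
)

// Local stand-in for the submit envelope; field names follow the JSON above,
// not the real bacalhau types.
type submitEnvelope struct {
    Data            json.RawMessage `json:"data"`
    Signature       string          `json:"signature"`
    ClientPublicKey string          `json:"client_public_key"`
}

func main() {
    // The "data" document: ClientID plus the (minimal) Job shown above.
    jobDoc := map[string]interface{}{
        "ClientID": "<client id>",
        "Job": map[string]interface{}{
            "APIVersion": "V1beta1",
            "Spec": map[string]interface{}{
                "Engine":    "Docker",
                "Verifier":  "Noop",
                "Publisher": "Estuary",
                "Docker": map[string]interface{}{
                    "Image":      "ubuntu",
                    "Entrypoint": []string{"date"},
                },
                "Timeout": 1800,
            },
            "Deal": map[string]interface{}{"Concurrency": 1},
        },
    }
    data, err := json.Marshal(jobDoc)
    if err != nil {
        panic(err)
    }
    body, err := json.Marshal(submitEnvelope{
        Data:            data,
        Signature:       "<signature over data, computed with the client key>",
        ClientPublicKey: "<client public key>",
    })
    if err != nil {
        panic(err)
    }
    // Endpoint address and path are assumptions; adjust to the node's API.
    resp, err := http.Post("http://localhost:1234/submit", "application/json", bytes.NewReader(body))
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    fmt.Println(resp.Status)
}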
We’re in a state where even a minimal request still has to spell out fields like Deal.Concurrency, Timeout, etc.

@Simon Worthington’s thoughts - Fields in the job can be split into 3 sections: