This doc was originally posted on HackMD at this url.

Enrico, 2022-10-10

Kudos to everyone who worked on this epic before me - I took the liberty of copy/pasting some of your thoughts here.

Context

Bacalhau (currently v0.3.x) accepts workloads composed of a single job. To execute a multi-step workload, users have to manually submit and track each job. This (1) places a significant burden on the user, (2) makes reproducing a pipeline difficult, and (3) means the origin of data can only be determined by manually logging each step and backtracking.

In an effort to offer broader support for modern multi-step workloads, this document aims to design a compelling pipelining feature that is user-friendly and allows for complex workloads.

In this context, a Pipeline is completely user-defined: the user writes a pipeline spec detailing how each step (i.e. a Bacalhau job) relates to the others.
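
To make the idea of a user-defined spec more concrete, here is a minimal sketch of how step relationships could be written down and turned into an execution order. Everything in it is an assumption made for illustration: the `pipeline_spec` layout, the `image` and `needs` field names, the step names and the `execution_order` helper are hypothetical and do not reflect an actual Bacalhau format or API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline spec: each key is a step (one Bacalhau job) and
# "needs" lists the steps whose outputs it consumes.
# Field names and image names are illustrative only.
pipeline_spec = {
    "extract": {"image": "example/extract", "needs": []},
    "transform": {"image": "example/transform", "needs": ["extract"]},
    "report": {"image": "example/report", "needs": ["transform"]},
}


def execution_order(spec):
    """Return step names in an order that respects every 'needs' relation."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(step["needs"]) for name, step in spec.items()}
    # static_order() raises graphlib.CycleError if the steps depend on each other in a loop.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline_spec)
print(order)
```

Declaring the relations in the spec, rather than submitting jobs by hand, is what would let a system derive the order of execution and trace which step produced which data.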

This document is based on prior work:

Personas

User stories