Overview

This document is a collection of notes and ideas for modifications to Bacalhau to allow the management of compute jobs on baremetal, including the running of a completely content-addressable operating system stack.

Whilst this document specifically targets baremetal servers, the proposed design/implementation also works for virtual machines (with any hypervisor), as well as web2 public cloud instances (for testing and early adoption).

Value Proposition

Were Bacalhau, or a complementary component, able to manage the whole software stack and compute jobs from baremetal up, we would see many benefits. If the operating system and core services (Docker for example) were all content-addressable, stateless images, the value would be even more profound. At a high level, these benefits might include:

  1. Heavily reduced operational cost: nodes simply boot and run jobs. If there is an issue, the node performs a fast reboot. For updates, upgrades or role changes, the CIDs are updated to point to the new components and the node is fast-rebooted.
  2. Heavily increased security: nodes do not run the traditional general-purpose operating system they might otherwise run, which gives a far smaller attack surface. Instead, all components of the software stack are content-addressed, ensuring that we are always running the exact bits we intended.
  3. Simplified execution: a user could potentially instruct Bacalhau to run a particular set of tasks across a farm of freshly racked servers (sans software) and have the results returned on completion.
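To make the "exact bits we intended" property concrete, the following is a minimal sketch of content-address verification. It is deliberately simplified: a plain hex SHA-256 digest stands in for a real IPFS CID (which is a multihash-encoded identifier), and the names digest_of and verify_component are hypothetical helpers introduced only for illustration, not part of Bacalhau or IPFS.

```python
import hashlib


def digest_of(data: bytes) -> str:
    """Return the hex SHA-256 digest, used here as a stand-in for a CID."""
    return hashlib.sha256(data).hexdigest()


def verify_component(data: bytes, expected_digest: str) -> bytes:
    """Return data unchanged if it matches the pinned digest, else raise."""
    actual = digest_of(data)
    if actual != expected_digest:
        raise ValueError(
            f"digest mismatch: expected {expected_digest}, got {actual}"
        )
    return data


# Hypothetical component image; in practice this might be a kernel or rootfs.
image = b"example root filesystem image"
pinned = digest_of(image)

# The untouched image matches its pinned address and is accepted.
verify_component(image, pinned)

# Any modification changes the digest, so the component is rejected.
try:
    verify_component(image + b" tampered", pinned)
except ValueError as err:
    print("rejected:", err)
```

Because the address is derived from the content, a node that only accepts components matching their pinned address cannot silently run modified bits.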

User Stories

To help illustrate the value, the following user stories show potential workflows.

  1. Docker job:
    1. User submits a job using Bacalhau to a farm of servers
    2. Bacalhau defines the required software stack by listing the CIDs of required packages
    3. Bacalhau instructs the nodes to power on and network boot
    4. The nodes boot, retrieving all packages and components over IPFS by CID
    5. The job executes and Bacalhau collects the results
  2. Python job:
    1. User submits a job to Bacalhau in the form of a Python script
    2. Bacalhau defines a similar software stack by CID, but includes the Python package instead of Docker
    3. Bacalhau instructs the nodes to fast-reboot
    4. 30 seconds later, the new stack has been retrieved over IPFS and the Python script begins to execute
    5. The job completes and Bacalhau collects the results

The above user stories show how straightforward it could be to execute a job while keeping the whole software stack content-addressable and ephemeral. They also show how easily a user could change the role of a node from a Docker compute node to a Python compute node.
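The two stories can be sketched as a pair of stack definitions that differ in a single CID. This is an illustrative model only: the Stack type, the component names, the components_to_fetch helper and the placeholder CID strings are all assumptions made for this sketch, as is the idea that a node could skip components whose CID has not changed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Stack:
    role: str
    components: Dict[str, str]  # component name -> CID (placeholders here)


# Shared base layers; the CID strings are placeholders, not real CIDs.
BASE = {"kernel": "<cid-kernel>", "initramfs": "<cid-initramfs>"}

DOCKER_STACK = Stack("docker", {**BASE, "runtime": "<cid-docker>"})
PYTHON_STACK = Stack("python", {**BASE, "runtime": "<cid-python>"})


def components_to_fetch(current: Optional[Stack], desired: Stack) -> Dict[str, str]:
    """Return the components whose CID differs from what the node has now."""
    have = current.components if current else {}
    return {
        name: cid
        for name, cid in desired.components.items()
        if have.get(name) != cid
    }


# Story 1: a freshly racked node has nothing, so every component is fetched.
first_boot = components_to_fetch(None, DOCKER_STACK)

# Story 2: switching from Docker to Python changes only the runtime CID.
role_change = components_to_fetch(DOCKER_STACK, PYTHON_STACK)

print(sorted(first_boot))  # ['initramfs', 'kernel', 'runtime']
print(role_change)         # {'runtime': '<cid-python>'}
```

Expressing a role as nothing more than a list of CIDs is what makes the role change in the second story a matter of updating one entry and rebooting.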

Current Approach

Bacalhau today is capable of taking self-contained userspace jobs, such as Docker containers, and distributing them across multiple nodes, where they are executed and their output is gathered.

Bacalhau does this by having a running agent on each participating node. A user can submit a job, which is then passed to the node to be executed. On completion, the output of the job is then retrieved.

This model works for many applications and use cases, including other container runtimes (OCI for example), WASM, standalone binaries or Python scripts.

This approach assumes that the nodes are already deployed and running an operating system, and that the upfront deployment and ongoing operational maintenance are handled elsewhere. In various forums, questions have been raised about whether Bacalhau could also manage the bootstrap and runtime of jobs on baremetal.

Solution Description