Problem statement

Currently, when users submit a job, the only information they can see about its progress is what the describe command returns, or the events describing its changes of state. There is no mechanism for users to see any output from the execution of their job until it is either complete or cancelled, at which point they can see what was written to stdout or stderr.

As a user, I would like to see the output from my job in real time, so that I can:

* Get an early view of the output of my job 
* Identify when the process is not running correctly
* Debug any issues I encounter with the running of my code  

Goals

It is expected that a user of the Bacalhau CLI will be able to view the contents of stdout/stderr for a job they have created. This should be possible both as part of the current job submission commands (via a --follow flag) and via a newly created logs command.

Non-goals

This task considers open access to logs a non-goal: only the job owner will be able to view the job's logs.

The filtering of user-generated logs is also a non-goal: we have no control over, or understanding of, what users' jobs write to stdout/stderr, as we do not expect that output to be structured.

Constraints

The current Bacalhau architecture, although it offers multiple configuration options, places multiple layers between the user's entry point to the system (a requester node) and where execution finally takes place (a potentially non-local executor managed by a compute node). A simplified view is shown below.

┌────────────────┐
│                │
│  Bacalhau CLI  │
│                │
└────────┬───────┘
         │
  ┌──────▼──────┐         ┌────────────┐
  │             │         │            │
  │  Requester  ├────────►│  Compute   │
  │             │         │            │
  └─────────────┘         └──┬───────┬─┘
                             │       │
                 ┌───────────▼┐     ┌▼───────────┐
                 │ Executor 1 │     │ Executor 2 │
                 └────────────┘     └────────────┘

Given that the sole interface for the Bacalhau CLI is the public API provided by a requester node, any proposed solution must work within the constraints this layered architecture imposes.

Proposed solution

The proposed solution takes a similar approach to other orchestration platforms: intermediaries set up the connection from the edge of the system through to the component running the code, and the user-facing component then proxies the data back to the client over the same transport on which it was requested.

In Bacalhau terms, this consists of the Requester adding a new endpoint to its public API which asks the appropriate Compute node for access to a job's output. The Compute node should return a URL with a token, from which the Requester node can read the output of the execution; the Requester then proxies that output across the original connection set up by the CLI.
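As a sketch of this handoff, the snippet below simulates the parties with local HTTP servers: a "compute node" serving an execution's output behind a token, and a "requester" that fetches from that tokened URL and proxies the bytes back over the client's original connection. Every name, path, and the token scheme here is an illustrative assumption, not Bacalhau's actual API.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical values standing in for what the compute node would mint.
EXEC_TOKEN = "s3cret"
JOB_OUTPUT = b"hello from the job\n"

class ComputeHandler(BaseHTTPRequestHandler):
    """Simulated compute node: serves an execution's output, guarded by a token."""
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        if query.get("token") != [EXEC_TOKEN]:
            self.send_error(403, "bad token")
            return
        self.send_response(200)
        self.end_headers()
        self.wfile.write(JOB_OUTPUT)

    def log_message(self, *args):  # silence request logging
        pass

def make_requester_handler(compute_port):
    class RequesterHandler(BaseHTTPRequestHandler):
        """Simulated requester: reads from the compute node's tokened URL and
        proxies the stream back over the client's connection."""
        def do_GET(self):
            # In the real design the URL + token would come back from a p2p
            # request to the compute node; here it is hard-coded.
            upstream = (f"http://127.0.0.1:{compute_port}"
                        f"/logs/exec-1?token={EXEC_TOKEN}")
            with urllib.request.urlopen(upstream) as resp:
                self.send_response(200)
                self.end_headers()
                self.wfile.write(resp.read())

        def log_message(self, *args):
            pass
    return RequesterHandler

def fetch_job_logs():
    """Client (CLI) side: talks only to the requester's public API."""
    compute = ThreadingHTTPServer(("127.0.0.1", 0), ComputeHandler)
    requester = ThreadingHTTPServer(
        ("127.0.0.1", 0), make_requester_handler(compute.server_port))
    for srv in (compute, requester):
        threading.Thread(target=srv.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{requester.server_port}/jobs/job-1/logs"
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode()
    compute.shutdown()
    requester.shutdown()
    return body

if __name__ == "__main__":
    print(fetch_job_logs(), end="")
```

Note that the CLI never contacts the compute node directly; the token only ever travels between the requester and the compute node, which matches the constraint that the requester's public API is the client's sole interface.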

Care will be taken to ensure that any already-captured output (on stdout/stderr) is returned once the initial connection is made, so that clients can catch up on earlier output before receiving the real-time stream. It is unclear whether this behaviour should be gated behind an option or enabled by default.
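The "catch up, then follow" semantics can be sketched as a small buffer that replays captured lines to a new subscriber before handing over live ones. This is a minimal illustration of the intended behaviour, not Bacalhau's implementation; all names are invented.

```python
import queue
import threading

class LogBuffer:
    """Replay already-captured lines to a new subscriber, then stream live
    lines, without duplicating any line between the two phases."""

    def __init__(self):
        self._lock = threading.Lock()
        self._history = []      # lines captured so far
        self._subscribers = []  # live reader queues

    def write(self, line):
        """Record a line and fan it out to any live subscribers."""
        with self._lock:
            self._history.append(line)
            for q in self._subscribers:
                q.put(line)

    def subscribe(self, follow=True):
        """Yield buffered lines first; if follow=True, keep yielding new ones.

        The history snapshot and queue registration happen under one lock, so
        a concurrent write lands in exactly one of the two phases.
        """
        q = queue.Queue()
        with self._lock:
            snapshot = list(self._history)
            if follow:
                self._subscribers.append(q)
        for line in snapshot:
            yield line
        while follow:
            line = q.get()
            if line is None:    # sentinel: stream closed
                break
            yield line

    def close(self):
        """Signal end-of-stream to all followers."""
        with self._lock:
            for q in self._subscribers:
                q.put(None)
```

With follow=False this behaves like a plain logs command (dump what exists and stop); with follow=True it models the --follow behaviour described above.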

The sequence diagram below shows the flow of communication between the CLI and the eventual executor, with solid lines showing HTTP(S) requests and dashed lines showing p2p streams or internal function calls.