Goal

Demonstrate an integration between Bacalhau’s insulated jobs and a federated learning workload.

Architecture

Federated learning frameworks such as Flower (which we’ll use here) have a client-server architecture.

Bacalhau's insulated jobs follow a batch-job architecture.

We don't want to enable direct networking between a federated client and the server, because it makes the moderation flow less obviously secure: if the client can talk to the server directly, we never get a chance to audit the data being returned.

So, we need to build an adapter between a client-server interaction and a batch job execution framework.

[Image: PXL_20230406_120548181.jpg — previous vs. proposed architecture]

In the image above, the top diagram shows the previous architecture. In the proposed architecture, we build a new Flower BacalhauClient: a wrapper around a Bacalhau client (ooh, this is a good use for our Python SDK!) that dispatches each request into a compute node selected by a label configured for that client. The request can be approved by a human in the dashboard local to that compute node (which represents a private data source), and the response is also moderated (eventually).
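To make the dispatch path concrete, here is a minimal sketch, not a working integration: the image name, env-var names, and job-spec fields are illustrative, and the submit/wait/download step is left as a placeholder for whatever the Bacalhau Python SDK actually exposes.

```python
# A minimal sketch only: the image name, env-var names, and job-spec fields
# below are assumptions, and the submit/wait/download step is a placeholder
# for the real Bacalhau Python SDK calls.

import base64


def make_job_spec(method: str, request: bytes, node_label: str) -> dict:
    """Describe one Flower client call as a single Docker batch job,
    pinned to the compute node(s) carrying the given label."""
    return {
        "engine": "docker",
        "image": "ghcr.io/example/flower-bacalhau-client:latest",  # assumed image name
        "env": {
            "FLWR_METHOD": method,  # e.g. "fit" or "evaluate"
            "FLWR_REQUEST_B64": base64.b64encode(request).decode(),
        },
        # The label is how we route the request to the private data source.
        "node_selector": {"flower-client": node_label},
    }


def dispatch(method: str, request: bytes, node_label: str) -> bytes:
    """Submit the job, wait for the (human-approved) execution to finish,
    and return the serialized response the container publishes."""
    spec = make_job_spec(method, request, node_label)
    # Placeholder: submit `spec` with the Bacalhau Python SDK, poll until the
    # execution completes, then download and return the published response bytes.
    raise NotImplementedError(f"submit {spec!r} via the Bacalhau Python SDK")
```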

The adapter between the Flower client and the Bacalhau node will need to implement Flower's Client interface:

[Screenshot 2023-04-06 at 13.33.35.png — the flwr.client.Client interface]

https://github.com/adap/flower/blob/main/src/py/flwr/client/client.py#L35

The key question to establish is whether the interface can be implemented statelessly (each call will be handled by a separate container instance running the client code), or whether additional work is needed to maintain state across those separate instantiations.
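As a starting point, a stateless implementation could simply serialize each *Ins object, run it as one batch job on the labelled node (via a dispatcher like the one sketched earlier), and deserialize the *Res that the container publishes. This is an untested sketch assuming Flower ~1.x type names, with pickle used purely for brevity:

```python
# Sketch of the adapter. Assumes Flower ~1.x, where these types live in
# flwr.common; `dispatch` is the hypothetical job-dispatch helper from the
# earlier sketch, and pickle stands in for a proper serialization format.

import pickle
from typing import Callable

from flwr.client import Client
from flwr.common import (
    EvaluateIns, EvaluateRes, FitIns, FitRes,
    GetParametersIns, GetParametersRes, GetPropertiesIns, GetPropertiesRes,
)


class BacalhauClient(Client):
    """A Flower Client whose every method call runs as one Bacalhau batch job."""

    def __init__(self, node_label: str, dispatch: Callable[[str, bytes, str], bytes]):
        self.node_label = node_label
        self.dispatch = dispatch

    def _call(self, method: str, ins):
        # Each call is a self-contained round trip: serialize the *Ins object,
        # run it as a batch job on the labelled node, deserialize the *Res.
        response = self.dispatch(method, pickle.dumps(ins), self.node_label)
        return pickle.loads(response)

    def get_properties(self, ins: GetPropertiesIns) -> GetPropertiesRes:
        return self._call("get_properties", ins)

    def get_parameters(self, ins: GetParametersIns) -> GetParametersRes:
        return self._call("get_parameters", ins)

    def fit(self, ins: FitIns) -> FitRes:
        return self._call("fit", ins)

    def evaluate(self, ins: EvaluateIns) -> EvaluateRes:
        return self._call("evaluate", ins)
```

One observation in favour of statelessness: FitIns and EvaluateIns already carry the current global parameters and config, and the training data never leaves the compute node, so there is no obvious state that must survive between calls. The cost is that every job has to re-load the local dataset and re-initialize its training environment.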

Another question is whether the number of calls to these methods would be problematic for a manual approval flow.
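For a rough sense of scale: if the strategy issues one fit and one evaluate instruction to each selected client per round, a 10-round run means on the order of 20 jobs per compute node, each needing a human approval, plus any initial get_parameters/get_properties calls. The approval burden grows linearly with the number of rounds, so it could get tedious quickly.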