Introduction

This is a performance test of the new libp2p bprotocol, which enables nodes to negotiate and communicate about jobs using point-to-point communication instead of gossipsub. The design doc is available here: Bacalhau Orchestration

Test Environment

Results Summary

Across the test scenarios, bprotocol showed consistent performance and outperformed gossipsub regardless of the number of nodes and the network size. In some cases, more nodes resulted in lower latencies, even though we are still using a single machine.

In addition, the gossipsub-based network started to fail most jobs and had only 4% availability at higher node counts and higher job submission rates.

Test Scenarios

Test 1: 60 Sequential Jobs

Configuration

export TOTAL_JOBS=60 
export BATCH_SIZE=10 
export CONCURRENCY=1 

Results

| | | | | |
| --- | --- | --- | --- | --- |
| Total Time | | | | |
| bprotocol | 80s | 80s | 80s | 80s |
| gossipsub | 80s | 80s | 600s | |
| Single Job Latency | | | | |
| bprotocol | 1.323 ± 0.004 s | 1.33 ± 0.004 s | 1.33 ± 0.008 s | 1.32 ± 0.002 s |
| gossipsub | 1.33 ± 0.008 s | 1.33 ± 0.006 s | 10.10 ± 0.298 s | |
| TPS | | | | |
| bprotocol | 0.75 | 0.75 | 0.75 | 0.75 |
| gossipsub | 0.75 | 0.75 | 0.1 | |
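The table does not state how TPS was derived. A minimal sketch, assuming TPS is simply total jobs divided by total time, and that a sequential run takes roughly jobs times single-job latency, reproduces the reported figures. The function names `tps` and `expected_total` are hypothetical helpers introduced here for illustration only.

```shell
#!/bin/sh
# Assumption: TPS = total jobs / total wall-clock time in seconds.
tps() {
  awk -v jobs="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", jobs / secs }'
}

# Assumption: with CONCURRENCY=1, total time is roughly jobs * single-job latency.
expected_total() {
  awk -v jobs="$1" -v latency="$2" 'BEGIN { printf "%.1f\n", jobs * latency }'
}

tps 60 80              # 60 jobs in 80s
tps 60 600             # 60 jobs in 600s
expected_total 60 1.33 # close to the 80s total reported in the table
expected_total 60 10.10 # close to the 600s total reported in the table
```

Under these assumptions, 60 jobs in 80s gives 0.75 TPS and 60 jobs in 600s gives 0.10 TPS, matching the table, and the total times are consistent with jobs running strictly one after another.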

Observations

  1. bprotocol showed no difference in this scenario regardless of the node count, even with 200 nodes.
  2. gossipsub performed well up to 10 nodes, but then started to show high latencies along with "VERY High message latency" warning logs, which our gossipsub handler emits when received messages are older than 500ms. I saw event latencies reach 4 seconds.
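The latency warning described above can be illustrated with a small sketch. This is not the actual gossipsub handler; it only shows the threshold check, assuming message age is computed as receive time minus send time in milliseconds. The function name `check_latency`, the variable `THRESHOLD_MS`, and the exact output format after the warning text are hypothetical.

```shell
#!/bin/sh
# Threshold from the text: messages older than 500ms trigger a warning.
THRESHOLD_MS=500

# Hypothetical helper: takes send and receive timestamps in milliseconds.
check_latency() {
  latency_ms=$(( $2 - $1 ))
  if [ "$latency_ms" -gt "$THRESHOLD_MS" ]; then
    # Warning text taken from the report; the appended value is illustrative.
    echo "VERY High message latency: ${latency_ms}ms"
  else
    echo "ok: ${latency_ms}ms"
  fi
}

check_latency 1000 1400  # 400ms old, below the threshold
check_latency 1000 5000  # 4000ms old, like the 4-second events observed
```

With a 500ms threshold, a 4-second event exceeds the limit by a factor of eight, which is consistent with the warnings appearing once gossipsub latencies climbed.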