Context

We want to decouple schema-management concerns from Lily and improve our users’ (including our own) ability to consume that same schema.

Problem Statements

  1. TODO: Add other related problem statements
  2. There is currently a lot of data duplication in Lily’s data model, which in turn leads to large tables.
    1. From @Forrest Weston: For example, there are ~1 billion miner sector event entries (currently in the prod mainnet database), and each entry contains a state root reference. A CID is ~64 bytes, so that table alone holds roughly 64 gigabytes of CIDs. Adding the rest of the tables that have state roots brings the total to at least 250 gigabytes of state root CIDs alone. The same applies to message CIDs and block CIDs.
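The arithmetic behind that estimate can be sketched as follows. This is a rough back-of-the-envelope calculation, not a measurement. The row count and CID size come from the example above. The 8-byte surrogate key, the lookup-table layout, and the number of distinct state roots are hypothetical values chosen only to show how the comparison would work if CIDs were stored once and referenced by key.

```python
CID_BYTES = 64  # approximate size of one CID, as quoted above
KEY_BYTES = 8   # assumption: a 64-bit integer surrogate key


def inline_bytes(rows: int) -> int:
    """Bytes spent on CIDs when every row stores the full CID."""
    return rows * CID_BYTES


def normalized_bytes(rows: int, distinct_cids: int) -> int:
    """Bytes spent when rows store a key and each CID is stored once.

    Assumes a hypothetical lookup table holding (key, CID) pairs.
    """
    per_row = rows * KEY_BYTES
    lookup = distinct_cids * (KEY_BYTES + CID_BYTES)
    return per_row + lookup


rows = 1_000_000_000          # ~1 billion miner sector event entries
distinct_cids = 3_000_000     # hypothetical count of distinct state roots

print(f"inline:     {inline_bytes(rows) / 1e9:.1f} GB")
print(f"normalized: {normalized_bytes(rows, distinct_cids) / 1e9:.1f} GB")
```

The result depends heavily on how many distinct CIDs a table actually references, so the real ratio would need to be checked against the production database.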

Proposed Solutions

Fully-relational, read-optimized schema

Schema-less Lily

Notes (just drop things here as they come to mind)

https://github.com/polarsignals/arcticdb