We aim to design a system for storing and querying Ethereum data via ETL techniques.
The system should be:
- containerized
- SQL-based
- idempotent
- portable, so data can move between hosts
Tools: Docker, docker-compose, PostgreSQL.
Docker can rebuild a raw db.
PostgreSQL data can be stored in a directory that is volume-mounted into the Docker container. Then the directory + code repo can move between hosts.
Will then need a second, sister container for running the scripts that load the db.
Applications will live in other containers elsewhere, possibly on other hosts.
Easiest to build the database as a linear sequence of steps, so we can rewind/rebuild as needed.
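A minimal sketch of the linear-steps idea, assuming each step has an apply and a revert action. The names `Step`, `Pipeline`, and `move_to` are hypothetical, and a plain dict stands in for the real database so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One build step; both actions should be safe to run more than once."""
    name: str
    apply: Callable[[dict], None]
    revert: Callable[[dict], None]


class Pipeline:
    """Ordered list of steps plus a cursor saying how many are applied."""

    def __init__(self, steps: List[Step]):
        self.steps = steps
        self.position = 0  # number of steps currently applied

    def move_to(self, target: int, db: dict) -> None:
        """Apply steps forward, or rewind them, until `target` are applied."""
        if not 0 <= target <= len(self.steps):
            raise ValueError(f"target must be in 0..{len(self.steps)}")
        while self.position < target:
            self.steps[self.position].apply(db)
            self.position += 1
        while self.position > target:
            self.position -= 1
            self.steps[self.position].revert(db)


# Hypothetical steps; the dict keys stand in for tables.
STEPS = [
    Step("create_blocks",
         lambda db: db.setdefault("blocks", []),
         lambda db: db.pop("blocks", None)),
    Step("create_transactions",
         lambda db: db.setdefault("transactions", []),
         lambda db: db.pop("transactions", None)),
]
```

Calling `Pipeline(STEPS).move_to(2, db)` builds everything, `move_to(1, db)` rewinds the last step, and `move_to(0, db)` returns to the raw state.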
Will need at least one main metadata table that holds only meta info about the state of the db load.
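One possible shape for such a metadata table, sketched with Python's built-in `sqlite3` as a stand-in for PostgreSQL so it runs without a server. The table name `load_state` and its columns are assumptions, not a settled schema. The `ON CONFLICT ... DO UPDATE` upsert keeps repeated writes idempotent; PostgreSQL supports the same clause.

```python
import sqlite3

# Hypothetical schema: one row per load step.
SCHEMA = """
CREATE TABLE IF NOT EXISTS load_state (
    step        TEXT PRIMARY KEY,
    status      TEXT NOT NULL,
    last_block  INTEGER,
    updated_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def init_metadata(conn: sqlite3.Connection) -> None:
    """Create the metadata table; safe to call repeatedly."""
    conn.execute(SCHEMA)
    conn.commit()


def record_progress(conn, step: str, status: str, last_block: int) -> None:
    """Insert or update the row for `step` (idempotent upsert)."""
    conn.execute(
        """
        INSERT INTO load_state (step, status, last_block)
        VALUES (?, ?, ?)
        ON CONFLICT(step) DO UPDATE SET
            status = excluded.status,
            last_block = excluded.last_block,
            updated_at = CURRENT_TIMESTAMP
        """,
        (step, status, last_block),
    )
    conn.commit()


def get_progress(conn, step: str):
    """Return (status, last_block) for `step`, or None if never recorded."""
    return conn.execute(
        "SELECT status, last_block FROM load_state WHERE step = ?",
        (step,),
    ).fetchone()
```

The loader would call `record_progress` after each unit of work, so a restart can read the table and resume instead of starting over.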
But we can only deal with a portion of Ethereum data at a time, due to its size.
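One way to work on a portion at a time is to split the chain into block-number ranges and load one range per pass. This is a sketch only; the function name `block_ranges` and the idea of chunking by block number are assumptions.

```python
from typing import Iterator, Tuple


def block_ranges(start: int, end: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first, last) block ranges covering start..end."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    first = start
    while first <= end:
        last = min(first + chunk_size - 1, end)
        yield first, last
        first = last + 1


# list(block_ranges(0, 9, 4)) -> [(0, 3), (4, 7), (8, 9)]
```

Because the ranges are inclusive and never overlap, the last block of a finished range is a natural value to store in the metadata table.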
Another application will be loading data live into the db as it tracks the Ethereum node.
This operation must work gracefully with the metadata table: wait for its turn, start at the right place, catch up, and stay caught up.
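The four phases (wait, start, catch up, stay caught up) could be sketched as a single loop. Everything here is hypothetical: the callables stand in for metadata-table queries and Ethereum node calls, and `load_block` is assumed to record its own progress in the metadata table. `max_polls` and `sleep` exist only so the loop can be bounded and tested.

```python
import time
from typing import Callable, Optional


def run_live_loader(
    bulk_load_done: Callable[[], bool],           # metadata: is it our turn?
    get_last_loaded: Callable[[], Optional[int]],  # metadata: last block in db
    get_node_head: Callable[[], int],              # node: latest block number
    load_block: Callable[[int], None],             # load one block into the db
    poll_seconds: float = 1.0,
    max_polls: Optional[int] = None,               # None means run forever
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    polls = 0

    # 1. Wait for our turn: do nothing until the bulk load is finished.
    while not bulk_load_done():
        if max_polls is not None and polls >= max_polls:
            return
        sleep(poll_seconds)
        polls += 1

    # 2. Start at the right place: one past the last block already loaded.
    last = get_last_loaded()
    next_block = 0 if last is None else last + 1

    # 3 and 4. Catch up to the node head, then keep polling to stay caught up.
    while True:
        head = get_node_head()
        while next_block <= head:
            load_block(next_block)
            next_block += 1
        if max_polls is not None and polls >= max_polls:
            return
        sleep(poll_seconds)
        polls += 1
```

Passing the dependencies in as callables keeps the loop independent of the database and node client, so the same logic can be exercised with fakes before it is wired to PostgreSQL.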