# Moonstream architecture

Moonstream consists of:

1. A slow (colder) data store, in which we store large amounts of transactional data and metadata
collected directly from various blockchains.
2. A fast (warmer) data store, in which we store data that streams in very quickly, for example from
the Ethereum transaction pool. Data in the fast data store is not stored permanently: each item is
removed after a data-specific time-to-live (TTL).
3. Crawlers, which collect data from different blockchain-related sources and insert it into either
the slow data store or the fast one.
4. The Moonstream API, which allows users to sign up to Moonstream and subscribe to different sources
of data in Moonstream, and which serves their requests for this data.
5. The Moonstream frontend ([live](https://moonstream.to)), through which users can interact with
Moonstream in their browsers.
6. Moonstream client libraries, through which users can interact with Moonstream from the programming
environment of their choice.

This document gives a brief explanation of the role of each of these components and points you to
more detailed information about the components you are particularly interested in.

It also tries to answer questions you may have about why certain decisions and trade-offs were made.

## Data storage

[Codebase: `../db`](../db/)

### Fast vs. slow

Blockchains like Ethereum and Solana implement smart contract functionality by recording the state
of accounts on the blockchain at every block. This record of state grows over time. Ethereum state
already takes hundreds of gigabytes of storage. Solana state is even larger, and Solana's historical
state is hosted centrally on a Google Bigtable instance.

Moonstream is an open source project, and we intend for people to host Moonstream themselves. We cannot
assume that someone hosting Moonstream can afford high-performance storage (e.g. latest-generation
SSDs). The most cost-effective way to store this large amount of state data (without relying
on cloud object storage) is on magnetic hard disks.

Although this makes storage cheaper, it also makes reading from and writing to the data store slower.
Since some of our crawlers collect volatile data, like the data in the Ethereum transaction pool,
we *also* need a fast storage layer that supports quicker writes and reads.

This is why we have two different classes of storage in Moonstream.

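
To make the "ephemeral" half of this split concrete, here is a minimal sketch of a store whose
entries expire after a per-entry TTL. It is an in-memory toy, not Moonstream code: the class name
`ExpiringStore`, its methods, and the injectable `clock` parameter are all assumptions made for
illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringStore:
    """Hypothetical in-memory store whose entries expire after a per-entry TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so that expiry can be exercised without sleeping.
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any, ttl_seconds: float) -> None:
        # Store the value together with the moment at which it stops being valid.
        self._items[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on first access after the deadline.
            del self._items[key]
            return None
        return value
```

A transaction pool crawler could, under these assumptions, call `put` with a short TTL for each
pending transaction, while anything that must survive indefinitely would go to the slow store
instead.
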
### Slow data store: Postgres

We use a Postgres database as the slow data store. The code in the [`db/`](../db/) directory defines
the schema for this Postgres database as well as migrations that you can use to set up a similar
database yourself.

The [`db/`](../db/) directory contains:
1. A Python package called `moonstreamdb`, which defines the database schema and can be used as a
Python library to interact with the data store.
2. [Alembic](https://alembic.sqlalchemy.org/en/latest/) migrations, which can be run against a Postgres
database server using the [`alembic.sh`](../db/alembic.sh) shell script.

The Ethereum blockchain crawler (accessed through the [`ethcrawler blocks` command](../crawlers/mooncrawl))
stores Ethereum state in the slow database.

Other crawlers (e.g. the CoinMarketCap crawler) store address and transaction
metadata in the slow database. This is because the slow database is permanent, whereas the fast database
is assumed to be ephemeral.

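
As a rough illustration of what "permanent, relational storage of blocks and transactions" can
look like, the sketch below uses Python's built-in `sqlite3` module as a stand-in for Postgres.
The table names, column names, and helper functions are hypothetical and heavily simplified; the
actual schema is the one defined in `moonstreamdb`.

```python
import sqlite3
from typing import Any, Dict, List

# Hypothetical, simplified schema. The real schema lives in the `moonstreamdb` package.
SCHEMA = """
CREATE TABLE blocks (
    block_number INTEGER PRIMARY KEY,
    hash TEXT NOT NULL UNIQUE,
    timestamp INTEGER NOT NULL
);
CREATE TABLE transactions (
    hash TEXT PRIMARY KEY,
    block_number INTEGER NOT NULL REFERENCES blocks (block_number),
    from_address TEXT NOT NULL,
    to_address TEXT,
    value TEXT NOT NULL
);
"""


def create_store() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_block(
    conn: sqlite3.Connection,
    block_number: int,
    block_hash: str,
    timestamp: int,
    transactions: List[Dict[str, Any]],
) -> None:
    """Insert a block and its transactions atomically."""
    with conn:  # commits on success, rolls back if an insert fails
        conn.execute(
            "INSERT INTO blocks VALUES (?, ?, ?)",
            (block_number, block_hash, timestamp),
        )
        conn.executemany(
            "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
            [
                # Values are stored as text because wei amounts can exceed 64-bit integers.
                (tx["hash"], block_number, tx["from"], tx.get("to"), str(tx["value"]))
                for tx in transactions
            ],
        )


def transactions_from(conn: sqlite3.Connection, address: str) -> List[str]:
    """Return the hashes of all transactions sent from the given address."""
    rows = conn.execute(
        "SELECT hash FROM transactions WHERE from_address = ? ORDER BY hash",
        (address,),
    )
    return [row[0] for row in rows]
```
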
### Fast data store: Bugout

Since different crawlers store data in the fast data store using different schemas, we use [Bugout](https://bugout.dev)
as our fast data store, because it makes no extra assumptions about schema.

Bugout is open source and can also be self-hosted from the following repositories:
1. [Brood](https://github.com/bugout-dev/brood) - Authentication
2. [Spire](https://github.com/bugout-dev/spire) - Data storage and access

Our Bugout instance also uses a Postgres database as the underlying data store. This Postgres server
is provisioned on high-throughput SSD.

The crawlers that use the fast data store write to a single Bugout journal using a write-only token.
Each crawler tags the data it writes with its type and any additional schema information.

The API reads from that journal using a read token. Queries are resolved using the tags that the crawlers created.

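
The tag-based write and read pattern described above can be sketched as follows. This is a toy
in-memory model, not the Bugout API: the `Journal` and `Entry` classes, the `write` and `search`
methods, and tag names such as `type:ethereum_txpool` are assumptions made for illustration, and
token handling is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Entry:
    """An opaque payload plus the free-form tags that describe it."""

    content: Dict[str, Any]
    tags: FrozenSet[str]


@dataclass
class Journal:
    """Hypothetical schema-free journal shared by all crawlers."""

    entries: List[Entry] = field(default_factory=list)

    def write(self, content: Dict[str, Any], tags: Iterable[str]) -> None:
        # Crawlers describe their payload through tags instead of a fixed schema.
        self.entries.append(Entry(content=content, tags=frozenset(tags)))

    def search(self, required_tags: Iterable[str]) -> List[Entry]:
        # An entry matches when it carries every requested tag.
        required = frozenset(required_tags)
        return [entry for entry in self.entries if required <= entry.tags]
```

With this model, a reader that only knows the tag conventions can retrieve one crawler's data
without knowing anything about the payloads written by the others.
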
## Crawlers

[Codebase: `../crawlers`](../crawlers/)

Many of the Moonstream crawlers are written in Python. These are all packaged together in a single Python
package called [`mooncrawl`](../crawlers/mooncrawl/).

Crawlers can be written in any programming language, and some languages may be better suited
to certain kinds of data. For example, we plan to write our Solana crawlers in Rust because library
support for "Solana programs" (Solana's version of smart contracts) is much better in Rust, Solana's
native language.

As another example, the [Ethereum transaction pool crawler](../crawlers/ethtxpool/) is written in Go.

## Moonstream API

[Codebase: `../backend`](../backend/)

The Moonstream API is written in Python and uses the [FastAPI framework](https://fastapi.tiangolo.com/).

API routes are defined in [`backend/moonstream/api.py`](../backend/moonstream/api.py), and that file
is a good entry point for understanding the API codebase.

The API uses [Bugout](https://bugout.dev) for authentication and to manage resources like user subscriptions
to different types of data.

It also defines [event providers](../backend/moonstream/providers/__init__.py), which are responsible for
retrieving data of each available type (e.g. `ethereum_blockchain`, `ethereum_txpool`) from the
fast and/or slow data stores and serving it to Moonstream users.

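
One way to picture the event provider idea is a registry that maps each event type to a function
that knows which store to query. The sketch below is an assumption-laden illustration, not the
code in `backend/moonstream/providers`: the `provider` decorator, the `PROVIDERS` registry,
`get_events`, and the shape of the returned events are all hypothetical, and the store lookups
are replaced by stand-in return values.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Provider = Callable[[Dict[str, Any]], List[Event]]

# Hypothetical registry: event type name -> function that retrieves events of that type.
PROVIDERS: Dict[str, Provider] = {}


def provider(event_type: str) -> Callable[[Provider], Provider]:
    """Register a function as the provider for one event type."""

    def register(func: Provider) -> Provider:
        PROVIDERS[event_type] = func
        return func

    return register


@provider("ethereum_blockchain")
def blockchain_events(query: Dict[str, Any]) -> List[Event]:
    # Stand-in for a query against the slow (permanent) data store.
    return [{"event_type": "ethereum_blockchain", "source": "slow", "address": query.get("address")}]


@provider("ethereum_txpool")
def txpool_events(query: Dict[str, Any]) -> List[Event]:
    # Stand-in for a tag-based search against the fast (ephemeral) data store.
    return [{"event_type": "ethereum_txpool", "source": "fast", "address": query.get("address")}]


def get_events(subscriptions: List[str], query: Dict[str, Any]) -> List[Event]:
    """Collect events for every subscribed type that has a registered provider."""
    events: List[Event] = []
    for event_type in subscriptions:
        handler = PROVIDERS.get(event_type)
        if handler is None:
            continue  # unknown event types are skipped rather than treated as errors
        events.extend(handler(query))
    return events
```

Keeping the store-specific logic behind one function per event type means a caller only needs a
user's subscription list, not any knowledge of where each kind of data lives.
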
## Frontend

The Moonstream frontend is a [React](https://reactjs.org/) application. It uses the [Chakra UI](https://chakra-ui.com/)
component library and [react-query](https://react-query.tanstack.com/) to manage data.

## Client libraries

These are still under development. If you would like to build a Moonstream client library for your
favorite language, [reach out to @zomglings on Discord](https://discord.gg/K56VNUQGvA).

These are the languages we currently have libraries for:

### Python

This is a work in progress. [Pull request](https://github.com/bugout-dev/moonstream/pull/266).