Index | About | Me | Jump to Menu Section

Material about building a database

Last modified: Wed May 11 2022 11:00:32 GMT+0000 (Coordinated Universal Time)

I may end up building something similar to a database at work. Besides that, I’m interested in writing more Rust and I will use toy databases and databases concepts as an excuse. This document is sorted by time and not by topic.


My reading notes on database-internals are on its own page, and not here.





influxdb benchmarks: They have a tool for that, and a hotsite.

arrow: Arrow seems to be on everybody’s mouth at the moment. Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.


Influx iox: Influx iox is the next gen storage engine for InfluxDB. They annouced it here.

Their stack is: (1) Rust; (2) apache arrow; (3) DataFusion; (4) Parquet;

Details of their arch:

It’s interesting that they say they now support Events data: “we’ve always had the vision that InfluxDB should be useful for event data (i.e. irregular time series) as well as metric data (i.e. regular time series)”. Their previous version had some issues with high-cardinality data (wich smells like events for me). So it seems like they fixed this with the new version! Pretty cool!!

Also, it seems like it took at least 2 and half years to get to a public usable state. So if I ever need to estimate some similar project, I would probably bet something around that 😛.

At their documentation they have some talks that may be interesting. I will watch it later.

More links on Influx IOX

Arrow-DataFusion: DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in Rust, using the Apache Arrow in-memory format. DataFusion offers SQL and Dataframe APIs, excellent performance, built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and a great community.

A lot of projects (besides Influx iox) also use DataFusion as a building block for distributed database. Since I’m interested in time-series at the moment:



Watched Andy Grove - Ballista: Distributed Compute with Rust and Apache Arrow

Data Fusion supports different type of sources, including in-memory only

The DataFusion flow is normally SQL -> DataFrame -> LogicalPlan

Extensions of DataFusion makes the ExecutionPlan distributed

Fligh protocol would allow clients to pull the results from different machines (not needing to fetch everything from leader node/scheduler)

Watched InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in Apache Arrow

QueryEngine is made of three main parts:

  1. Frontend (the query language parser)
  2. Intermediary Query Representation (The logical Plan)
  3. Concreat execution operators

For Data Fusion those are represented by the following Rust structs:

  1. SqlStatement
  2. LogicalPlan and Exp
  3. ExecutablePlan and RecordBatch (this last one is from Arrow)

DataFusion reads RecordBatches (arrow) and output RecordBatches.

DataFusion CLI

You can create an external table (pointing to the parquet file)

DataFusion supports explain. You read the LogicalPLan from bellow to up (data flows up from the leaves to the root of the tree)

DataFusion also supports explain verbose to see optimizations applied

LogicalPlan Creation:

  1. Declarative, you describe what you want; system figures out how
  2. Procedural: describe how directly (with LogicalPlanBuilder) or DataFrame

Type Coercion: Data Fusion adds typecast automatically when possible

DataFusion is Async, Vectorized, Eager Pull, Partionaed and Multi-Core

Watched Arrow and Substrait: Better Together

If Python can generate substrait plans -> pass to the query engine which could divide between different nodes -> and pass to the storage we would have a very good world 😃


Watched Query Processing in InfluxDB IOX

Query Processing in IOX has three front-ends:

  1. Storage gRPC frontend
  2. SQL FrontEnd (DataFusion)
  3. Reorg Frontend

All of them generate DataFusion LogicalPlan that can be optimized, phusical planned and excecuted the same way.

The output changes tho:

  1. StorageGrpc uses gRPC output with Arrow Record Batches (SeriesFrame)
  2. Arrow Flight IPC (for SQL frontend)
  3. ReadBuffer/ParquetWriter (for Reorg Frontend)

One of the query optimzation features is to not even look at irrelevant chunks if the user is querying by time since the chunks are partioned by time (and each of of them has the metadata regarding the range of dates they have)

IOx answeres queries by combining data from parquet files + in memory cache

Flux and InfluxQl talks to IOx via gRPC calls.


Good overview of a B+Tree: Why B-Tree: