Q1: What is Chapter 5 about? How does it fit into the bigger picture?
Previous chapters covered how state is organized (StateDB, stateObject) and how it’s authenticated (Merkle Patricia Trie). But neither answered: where do all these bytes actually end up on disk?
Chapter 5 traces the complete path from in-memory state changes to on-disk bytes. It covers the bottom two layers of a four-layer architecture:
```
Layer 4: StateDB              ← Chapter 4: in-memory state read/write
   │
   ▼
Layer 3: Trie + TrieDB        ← Chapter 3: MPT node management
   │
   ▼
Layer 2: rawdb accessor layer ← This chapter: key construction, RLP encoding
   │
   ▼
Layer 1: Pebble + Freezer     ← This chapter: actual disk storage
```

Layers 3–4 were covered previously. Chapter 5 focuses on Layer 2 (rawdb), how keys are designed and data is encoded, and Layer 1 (Pebble + Freezer), where data physically lives on disk.
Q2: How does geth store many different data types in a single database?
Geth stores everything — block headers, bodies, receipts, trie nodes, contract code, snapshots — in one flat key-value database. Different data types are distinguished by single-byte key prefixes defined in core/rawdb/schema.go:
| Prefix | Key format | Value |
|---|---|---|
"h" | h + blockNum(8) + hash(32) | Block header (RLP) |
"b" | b + blockNum(8) + hash(32) | Block body (RLP) |
"r" | r + blockNum(8) + hash(32) | Receipts (RLP) |
"c" | c + codeHash(32) | Contract bytecode |
"a" | a + accountHash(32) | Snapshot: account data |
"A" | A + hexPath | Trie node (account trie, path-based) |
"O" | O + accountHash(32) + hexPath | Trie node (storage trie, path-based) |
"l" | l + txHash(32) | Transaction lookup metadata |
Key construction functions are also in schema.go:
```go
func headerKey(number uint64, hash common.Hash) []byte {
	return append(append(headerPrefix, encodeBlockNumber(number)...), hash.Bytes()...)
}
```

Block numbers are encoded as 8-byte big-endian integers, so keys within each prefix are naturally sorted by block number, which makes range scans efficient.
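The big-endian conversion itself is a one-liner; a minimal sketch along the lines of rawdb's encodeBlockNumber (uses encoding/binary):

```go
// encodeBlockNumber turns a block number into an 8-byte big-endian slice,
// so that lexicographic key order equals numeric block order.
func encodeBlockNumber(number uint64) []byte {
	enc := make([]byte, 8)
	binary.BigEndian.PutUint64(enc, number)
	return enc
}
```

For example, block 1,000,000 encodes as 0x00000000000f4240 and sorts immediately after block 999,999 (0x00000000000f423f).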
The single-byte prefix guarantees no key collisions between different data types.
Q3: What is the rawdb accessor layer and why does it exist?
core/rawdb/ provides typed accessor functions that wrap key construction + RLP encoding/decoding. Upper layers never construct raw keys or call db.Get() directly:
```go
// Writing: upper layer just passes the struct
rawdb.WriteHeader(db, header)
// Internally: RLP encode → build key "h" + num + hash → db.Put(key, data)
```
```go
// Reading: upper layer just passes hash and block number
header := rawdb.ReadHeader(db, hash, number)
// Internally: build key → db.Get(key) → RLP decode → return *types.Header
```

Reading has an extra twist: it checks two locations:
```go
func ReadHeaderRLP(db ethdb.Reader, hash common.Hash, number uint64) rlp.RawValue {
	// 1. Try the Freezer first (old canonical blocks, O(1) lookup)
	data, _ := db.Ancient("headers", number)
	if len(data) > 0 && crypto.Keccak256Hash(data) == hash {
		return data
	}
	// 2. Fall back to the key-value store (recent blocks or non-canonical forks)
	data, _ = db.Get(headerKey(number, hash))
	return data
}
```

The rawdb layer serves as a type-safe abstraction over the raw key-value store: upper layers think in terms of headers, bodies, and receipts, never raw bytes and key prefixes.
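A quick usage sketch (rawdb.NewMemoryDatabase is the in-memory backend geth uses in tests; the header here is deliberately minimal):

```go
db := rawdb.NewMemoryDatabase()                 // ethdb.Database backed by a map
header := &types.Header{Number: big.NewInt(42)} // illustration only

rawdb.WriteHeader(db, header)                  // builds "h" + num + hash internally
got := rawdb.ReadHeader(db, header.Hash(), 42) // rebuilds the key, RLP-decodes
fmt.Println(got.Number)                        // 42
```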
Q4: How does the interface design make the storage engine swappable?
Geth defines all storage contracts as interfaces in ethdb/database.go:
```go
type KeyValueReader interface {
	Has(key []byte) (bool, error)
	Get(key []byte) ([]byte, error)
}
```
```go
type KeyValueWriter interface {
	Put(key []byte, value []byte) error
	Delete(key []byte) error
}
```
```go
type KeyValueStore interface {
	KeyValueReader
	KeyValueWriter
	Batcher   // NewBatch() for atomic multi-key writes
	Iteratee  // NewIterator() for ordered key scans
	Compacter // Compact() for LSM-tree compaction
	// ...
}
```

At the top, a single interface combines the key-value store with the ancient (Freezer) store:
```go
type Database interface {
	KeyValueStore
	AncientStore
}
```

Pebble, LevelDB, and MemoryDB all implement KeyValueStore. Upper layers only depend on the interface, so switching engines is transparent:
- Production → Pebble (default)
- Testing → MemoryDB (in-memory map)
- Legacy nodes → LevelDB
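For illustration (writeMarker is a hypothetical helper, not a geth function), code written against these interfaces runs unchanged on any backend:

```go
// writeMarker depends only on the ethdb interface, so the same code runs
// against Pebble in production and against MemoryDB in tests.
func writeMarker(db ethdb.KeyValueWriter, key, value []byte) error {
	return db.Put(key, value)
}

// In a test:      db := rawdb.NewMemoryDatabase(); writeMarker(db, k, v)
// In production:  the same call receives the Pebble-backed database.
```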
Q5: How does Pebble work and what are its key design choices in geth?
Pebble is an LSM-tree key-value engine from CockroachDB, and geth’s default storage backend (replacing LevelDB). Three design choices stand out:
Async writes
```go
writeOptions: pebble.NoSync,
```

Put() and Batch.Write() return once the write has reached the WAL's buffer, without waiting for fsync. This is fast, but a power failure could lose the most recent writes. Geth tolerates this: it can recover from unclean shutdowns. Periodic background fsyncs via WALBytesPerSync limit the risk window.
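In pebble terms this is the choice between the two write options; a minimal sketch against the cockroachdb/pebble API (not geth's wrapper code):

```go
// NoSync: acknowledged once the write reaches the WAL, no fsync on the hot path.
// pebble.Sync would instead block until the log record is durably on disk.
if err := db.Set(key, value, pebble.NoSync); err != nil {
	return err
}
```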
Batch writes
Individual Put() calls are expensive — each one goes through the WAL separately. When geth needs to write many keys atomically (e.g., a block’s trie nodes), it uses Batch:
```go
batch := db.NewBatch()
batch.Put(key1, value1) // buffered in memory
batch.Put(key2, value2) // buffered in memory
batch.Write()           // atomic: all writes succeed or all fail
```

Batch buffers Put/Delete operations in memory. Nothing touches the database until Write(), which applies the entire batch atomically. An IdealBatchSize constant (100 KiB) guides callers on when to flush.
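Callers typically interleave flushes with that buffering; a sketch of the usual pattern (the pending slice is illustrative):

```go
batch := db.NewBatch()
for _, kv := range pending {
	if err := batch.Put(kv.key, kv.value); err != nil {
		return err
	}
	// Flush periodically so a huge write set doesn't accumulate in memory.
	if batch.ValueSize() >= ethdb.IdealBatchSize {
		if err := batch.Write(); err != nil {
			return err
		}
		batch.Reset()
	}
}
return batch.Write() // flush the remainder
```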
Bloom filters
Geth configures a bloom filter (10 bits per key) for the SST files at every level. When looking up a key, the bloom filter answers "is this key possibly in this file?"; if the answer is no, the file is skipped entirely, avoiding wasted disk reads. This is especially valuable for Has() checks on non-existent keys.
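Roughly how that is configured, as a sketch assuming the cockroachdb/pebble options API (packages github.com/cockroachdb/pebble and its bloom subpackage; geth's actual values live in ethdb/pebble):

```go
opt := &pebble.Options{
	// One entry per LSM level; each level's SST files get a bloom filter with
	// 10 bits per key (roughly a 1% false-positive rate).
	Levels: []pebble.LevelOptions{
		{TargetFileSize: 2 * 1024 * 1024, FilterPolicy: bloom.FilterPolicy(10)},
		{TargetFileSize: 4 * 1024 * 1024, FilterPolicy: bloom.FilterPolicy(10)},
		// ... one LevelOptions per deeper level, target file sizes doubling
	},
}
```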
Q6: What is the Freezer and why does geth need it?
Pebble (like all LSM-tree engines) continuously does background compaction — merging and rewriting SST files across levels. For active data this is necessary, but for historical blocks (block #1 through millions of finalized blocks) that will never change again, repeatedly compacting them is pure waste.
The Freezer solves this by moving finalized blocks out of Pebble into simple append-only flat files:
```
Pebble (key-value store)           Freezer (flat files)
┌──────────────────┐              ┌──────────────────┐
│ Recent blocks    │   ── move →  │ headers file     │
│ Trie nodes       │              │ bodies file      │
│ Snapshots        │              │ receipts file    │
│ Contract code    │              │ hashes file      │
└──────────────────┘              └──────────────────┘
 Random read/write,                Append-only, O(1) read,
 compaction overhead               no compaction needed
```

Storage format
Each data type is a table with two files:
- Index file — fixed-size 6-byte entries (2-byte file number + 4-byte offset). To find entry N, read 6 bytes at position N×6.
- Data file — actual data blobs, optionally Snappy-compressed. Capped at 2 GB per file.
Reading is O(1): seek to index entry → read file number and offset → seek into data file → read blob. No tree traversal.
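To make the index layout concrete, a hedged sketch of decoding one entry (field names are illustrative; geth's freezer table code defines its own equivalent):

```go
// One 6-byte index entry: which data file a blob lives in, and its offset there.
type indexEntry struct {
	filenum uint16 // 2-byte data-file number
	offset  uint32 // 4-byte offset within that file
}

func decodeIndexEntry(b []byte) indexEntry {
	return indexEntry{
		filenum: binary.BigEndian.Uint16(b[0:2]),
		offset:  binary.BigEndian.Uint32(b[2:6]),
	}
}
```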
Background migration
A background goroutine in chainFreezer periodically runs:
1. Compute the freeze threshold (finalized block number)
2. Copy header/body/receipt/hash for blocks below the threshold into Freezer files
3. fsync the Freezer files (ensure data is on disk)
4. Batch-delete the frozen blocks from Pebble

Copy-then-delete with fsync in between ensures no data loss even if the process crashes mid-migration; see the sketch below.
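An illustrative sketch of that ordering, with made-up interfaces standing in for Pebble and the Freezer (the real logic lives in chainFreezer.freeze):

```go
// blockStore and ancientStore are hypothetical stand-ins for the Pebble-backed
// key-value store and the Freezer's append-only tables.
type blockStore interface {
	ReadBlockData(number uint64) []byte
	DeleteBlockRange(from, to uint64) error
}
type ancientStore interface {
	Append(number uint64, data []byte) error
	Sync() error // fsync the flat files
}

func freezeRange(kv blockStore, fr ancientStore, from, to uint64) error {
	// Steps 1–2: copy every block below the threshold into the append-only store.
	for n := from; n < to; n++ {
		if err := fr.Append(n, kv.ReadBlockData(n)); err != nil {
			return err
		}
	}
	// Step 3: fsync first, so the data provably exists in the Freezer.
	if err := fr.Sync(); err != nil {
		return err
	}
	// Step 4: only now delete from Pebble. A crash before this point leaves
	// blocks duplicated in both stores, never missing from both.
	return kv.DeleteBlockRange(from, to)
}
```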
Read order
When reading, Freezer is checked first (old canonical blocks are likely there with O(1) access), then Pebble is checked as fallback (for recent blocks or non-canonical fork blocks).
Q7: What is the complete write path for a new block, from execution to disk?
```
StateDB.Commit()
 ├─ Contract code → rawdb.WriteCode()  → Pebble: "c" + codeHash
 ├─ Trie nodes    → triedb batch write → Pebble: "A" + path
 └─ Snapshot      ────────────────────→ Pebble: "a" + accountHash

blockchain.writeBlock()
 ├─ Header   → rawdb.WriteHeader()   → Pebble: "h" + num + hash
 ├─ Body     → rawdb.WriteBody()     → Pebble: "b" + num + hash
 ├─ Receipts → rawdb.WriteReceipts() → Pebble: "r" + num + hash
 └─ Tx index → rawdb.WriteTxLookup() → Pebble: "l" + txHash

          ↓ (later, background goroutine)

chainFreezer.freeze()
 ├─ Copy header/body/receipt/hash to Freezer flat files
 ├─ fsync
 └─ Batch-delete frozen data from Pebble
```

All data initially goes to Pebble (fast async writes). Old finalized data is later migrated to the Freezer in the background, which saves space and removes compaction overhead. Reads check the Freezer first, then fall back to Pebble.