Mobile wallpaper 1Mobile wallpaper 2Mobile wallpaper 3
1825 words
9 minutes
Geth(3) QA

Q1: How do the four node types in geth’s MPT implementation work?#

Geth uses four concrete types to represent all trie nodes (trie/node.go):

fullNode struct { Children [17]node; flags nodeFlag }
shortNode struct { Key []byte; Val node; flags nodeFlag }
hashNode []byte
valueNode []byte

fullNode — Branch node#

The MPT branch node. A nibble has 16 possible values (0x0–0xf), so there are 16 child slots. The 17th slot (Children[16]) holds a value when a key terminates exactly at this branch point. Most slots are nil (no keys with that nibble).

shortNode — Extension node OR Leaf node#

A single struct represents both extension and leaf nodes. The distinction is determined by the type of Val:

  • Val is fullNode/shortNode/hashNodeextension node (Key is a shared prefix)
  • Val is valueNodeleaf node (Key is the remaining key suffix)

Example: two keys sharing prefix 0xca:

shortNode{Key: [c,a], Val: fullNode{...}} ← extension node, "ca" is common prefix
┌─────────────┼─────────────┐
Children[f] Children[3]
│ │
shortNode{Key:[e], Val:"alice"} shortNode{Key:[b], Val:"bob"}
↑ leaf node ↑ leaf node

hashNode — Lazy-loading placeholder#

A pure engineering concept with no MPT theory equivalent. The entire state trie has hundreds of millions of nodes — they can’t all be loaded into memory. When geth reads a fullNode from disk, its Children contain 32-byte hash references instead of real nodes.

Only when a Get() path passes through a hashNode does resolveAndTrack fetch the node from disk, decode it, and replace the hashNode in-place. This is lazy loading.

valueNode — Leaf value#

Simply a []byte holding the actual data stored in the trie. In the account trie, this is the RLP-encoded account (nonce, balance, storageRoot, codeHash).

nodeFlag — Cached metadata#

type nodeFlag struct {
hash hashNode // cached Keccak256 hash (reused until node is modified)
dirty bool // whether node has been modified since last commit
}

Every fullNode and shortNode carries a flags field. hash caches the computed hash to avoid recomputation. dirty tracks whether the node needs to be written to disk.

Summary mapping#

geth typeMPT conceptDataPurpose
fullNodeBranch node17 child slotsPath fork point, 16-way branching by nibble
shortNode + Val is nodeExtension nodeshared prefix + childCompress consecutive single-branch paths
shortNode + Val is valueNodeLeaf noderemaining key + valueStore final data
hashNode(no theory equivalent)32-byte hashLazy-loading placeholder, loaded from disk on demand
valueNode(value part of leaf)raw bytesThe actual stored data

Q2: What does the Trie struct mean and how does it work?#

type Trie struct {
root node // (1)
owner common.Hash // (2)
committed bool // (3)
unhashed int // (4)
uncommitted int // (5)
reader *Reader // (6)
}

Fields#

(1) root node — Entry point of the entire tree. All operations (Get, Update, Delete, Hash, Commit) start here. Can be fullNode, shortNode, hashNode, or nil (empty trie). Each Update replaces root with the new returned node.

(2) owner common.Hash — Identifies which trie this is. Ethereum has many tries:

  • Global account trie → owner = common.Hash{} (zero)
  • Each contract’s storage trie → owner = keccak256(contract address)

The owner is used by the pathdb backend to distinguish nodes: (owner, path) is the database key.

(3) committed bool — Set to true after Commit(). The Trie object cannot be used after commit because its internal node state is no longer complete. A new Trie must be created with the new root hash. This is a disposable design: create → read/write → Commit → discard → create new for next block.

(4) unhashed int — Performance counter. Incremented on every Update(). When >= 100, Hash() enables parallel hashing (multiple goroutines compute subtree hashes simultaneously). Small modifications don’t benefit from parallelism.

(5) uncommitted int — Same as unhashed but for Commit(). Enables parallel commit when > 100 modifications.

(6) reader *Reader — Disk reader. When recursion hits a hashNode, the reader fetches the real node from the database, decodes it from RLP, and returns it.

Lifecycle#

1. New(root_hash, db) → reader connects to DB, root loaded from disk
2. Get/Update/Delete → recurse from root, lazy-load hashNodes via reader
3. Hash() → bottom-up hash all dirty nodes, return state root
4. Commit() → collect dirty nodes into NodeSet, mark committed=true
5. Next block → New() with the new root hash, start fresh

Q3: How do the trie operations (Get, Update, Delete) work? Show with a concrete example.#

Building a trie: insert three key-value pairs#

Keys converted to nibbles (hex encoding):

"cat" (0x636174) → [6, 3, 6, 1, 7, 4, 16] (16 = terminator)
"car" (0x636172) → [6, 3, 6, 1, 7, 2, 16]
"dog" (0x646f67) → [6, 4, 6, f, 6, 7, 16]

Insert “cat” — empty trie, insert hits nil, creates a leaf:

root: shortNode{Key: [6,3,6,1,7,4,16], Val: "meow"}

Insert “car” — nibbles compared with “cat”:

cat: [6, 3, 6, 1, 7, 4, 16]
car: [6, 3, 6, 1, 7, 2, 16]
^ diverge at position 5 (matchlen = 5)

Creates a fullNode at the fork point, wraps with shared prefix:

root: shortNode{Key: [6,3,6,1,7], Val: fullNode(B)}
┌──────────────┼──────────────┐
Children[2] Children[4]
│ │
shortNode{[16],"vroom"} shortNode{[16],"meow"}

Insert “dog” — diverges from root’s key at position 1:

root.Key: [6, 3, 6, 1, 7]
dog: [6, 4, 6, f, 6, 7, 16]
^ matchlen = 1

Final trie:

root: shortNode{Key: [6], Val: fullNode(A)}
┌──────────────┼──────────────┐
Children[3] Children[4]
│ │
shortNode{[6,1,7]} shortNode{[6,f,6,7,16],"woof"}
Val: fullNode(B)
┌─────────────┼─────────────┐
Children[2] Children[4]
│ │
shortNode{[16],"vroom"} shortNode{[16],"meow"}

Get “car”#

Nibbles: [6, 3, 6, 1, 7, 2, 16], starting at root, pos=0:

Step 1: shortNode{Key:[6]} → key starts with [6]? yes → skip 1, pos=1
Step 2: fullNode(A) → key[1]=3 → Children[3], pos=2
Step 3: shortNode{Key:[6,1,7]} → key starts with [6,1,7]? yes → skip 3, pos=5
Step 4: fullNode(B) → key[5]=2 → Children[2], pos=6
Step 5: shortNode{Key:[16]} → key starts with [16]? yes → recurse to Val
Step 6: valueNode("vroom") → return "vroom" ✓

If any step encounters a hashNode instead of a real node: call resolveAndTrack → fetch from disk → decode → replace hashNode → continue.

Delete “cat”#

Follows the same path to find fullNode(B).Children[4], sets it to nil. Now fullNode(B) has only one remaining child (Children[2]). The reduction logic triggers:

  1. fullNode(B) collapses into shortNode{[2,16], "vroom"}
  2. Parent shortNode{[6,1,7]} now points to a shortNode — two shortNodes chain together, so they merge: shortNode{[6,1,7,2,16], "vroom"}

The trie always stays minimal — no fullNode is left with only one child.


Q4: What is the dirty flag and how does it work?#

What it is#

Every fullNode and shortNode has flags.dirty — a boolean meaning “this node has been modified but not yet written to disk”.

When it’s set#

Every insert or delete that creates or modifies a node calls t.newFlag():

func (t *Trie) newFlag() nodeFlag {
return nodeFlag{dirty: true}
}

Only nodes on the modified path become dirty. Untouched nodes remain clean with their cached hash intact.

How Hash() uses it#

func (h *hasher) hash(n node, force bool) []byte {
if hash, _ := n.cache(); hash != nil { // has cached hash and not dirty?
return hash // skip! return cached hash
}
// ... otherwise recompute: RLP encode + Keccak256
}
  • dirty node (dirty=true, hash=nil): must recompute RLP encoding + Keccak256
  • clean node (dirty=false, hash=0x456…): return cached hash immediately

How Commit() uses it#

hash, dirty := n.cache()
if hash != nil && !dirty {
return hash // not dirty → already on disk, skip
}
// dirty → collect into NodeSet for writing to disk

Only dirty nodes are collected into the NodeSet and written to disk.

Why it matters#

A single block might modify a few hundred accounts, but the trie has hundreds of millions of nodes. The dirty flag ensures Hash() and Commit() only do incremental work on the modified path, not a full tree traversal.

Update/Delete → nodes on path get dirty=true, hash=nil
Hash() → dirty nodes: recompute hash. clean nodes: return cached hash
Commit() → dirty nodes: collect to NodeSet → write disk. clean nodes: skip

Q5: How is trie data stored? What is at each layer?#

The storage has three layers. Using a concrete example with two accounts (alice and bob):

Layer 1: Trie in memory — Go struct pointer chains#

trie.root = &shortNode{
Key: [0x0a],
Val: &fullNode{
Children: [17]node{
[0xa]: &shortNode{Key: [..., 16], Val: valueNode{0xf8, 0x44, ...}}, // alice
[0xb]: &shortNode{Key: [..., 16], Val: valueNode{0xf8, 0x40, ...}}, // bob
},
},
}

These are ordinary Go objects living in the heap memory of the geth process. Shut down geth → gone.

Layer 2: triedb — in-memory byte cache#

After Trie.Commit(), dirty nodes are serialized to RLP bytes and handed to triedb. Still in memory.

hashdb backend — stores in a dirties map keyed by node hash:

dirties = map[hash]*cachedNode{
0x7f3e...: {node: [0xc4, 0x82, ...]}, // root
0x5d8a...: {node: [0xf8, 0x51, ...]}, // fullNode
0x1234...: {node: [0xc8, 0x82, ...]}, // alice leaf
0x5678...: {node: [0xc8, 0x82, ...]}, // bob leaf
}

pathdb backend — stores in diffLayer keyed by owner+path:

diffLayer{
block: 100,
nodes: {
owner(0x00..00) + path([]): [0xc4, 0x82, ...], // root
owner(0x00..00) + path([0x0a]): [0xf8, 0x51, ...], // fullNode
owner(0x00..00) + path([0x0a,0x0a,...]): [0xc8, 0x82, ...], // alice
owner(0x00..00) + path([0x0a,0x0b,...]): [0xc8, 0x82, ...], // bob
}
}

Shut down geth → gone (unless journaled for crash recovery).

Layer 3: ethdb — actual disk persistence#

When triedb.Commit() is called, data is finally written to LevelDB/Pebble on disk.

hashdb writes — key is the node’s Keccak256 hash:

LevelDB key → value:
0x7f3e... (32-byte hash) → [0xc4, 0x82, ...]
0x5d8a... (32-byte hash) → [0xf8, 0x51, ...]
0x1234... (32-byte hash) → [0xc8, 0x82, ...]
0x5678... (32-byte hash) → [0xc8, 0x82, ...]

pathdb writes — key is owner + trie path:

LevelDB key → value:
0x00..00 + [] → [0xc4, 0x82, ...]
0x00..00 + [0x0a] → [0xf8, 0x51, ...]
0x00..00 + [0x0a,0x0a,...] → [0xc8, 0x82, ...]
0x00..00 + [0x0a,0x0b,...] → [0xc8, 0x82, ...]

Shut down geth → still here. This is the only persistent layer.

Reading goes in reverse#

StateDB.GetBalance(alice)
→ trie.Get(keccak256(alice))
→ recurse from root, hit hashNode(0x5d8a...)
→ resolveAndTrack: ask triedb
→ hashdb: check dirties[0x5d8a...] in memory, else ethdb.Get(0x5d8a...)
→ pathdb: check diffLayer chain in memory, else diskLayer → ethdb
→ get RLP bytes → decodeNodeUnsafe → fullNode Go object
→ replace hashNode, continue recursing
→ reach valueNode → RLP decode → Account struct → return Balance

Summary diagram#

Memory (Go objects) Memory (bytes) Disk (bytes)
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Trie │ │ triedb │ │ ethdb │
│ │ ──→ │ │ ──→ │ │
│ shortNode{ │ │ dirties map │ │ LevelDB/Pebble │
│ fullNode{ │ │ or diffLayer │ │ │
│ ... │ │ │ │ SST files │
│ } │ │ still in memory! │ │ WAL files │
│ } │ │ │ │ │
└──────────────────┘ └──────────────────┘ └──────────────────┘
shutdown → gone shutdown → gone shutdown → survives

What else is on disk besides trie nodes#

The same chaindata LevelDB/Pebble database stores everything, distinguished by key prefixes:

chaindata/
├── Trie nodes ← account trie + storage trie nodes
├── Block headers ← key = "h" + blockNum + hash
├── Block bodies ← key = "b" + blockNum + hash (transaction lists)
├── Receipts ← key = "r" + blockNum + hash
├── Contract code ← key = codeHash, value = bytecode
├── Preimages ← key = hash, value = original address/key (optional)
└── ancient/ (freezer)
├── headers/ ← old block headers (append-only, compressed)
├── bodies/ ← old block bodies
├── receipts/ ← old receipts
└── hashes/ ← old block hashes

Hot data (recent blocks) lives in the KV store for random access. Cold data (old blocks) is moved to the freezer (ancient database), stored sequentially by block number to save space.

Geth(3) QA
https://kehaozheng.vercel.app/posts/chainethgeth/03_qa/
Author
Kehao Zheng
Published at
2026-04-12
License
CC BY-NC-SA 4.0

Some information may be outdated