Geth(3) Merkle Patricia Trie - Kehao Zheng's Website

Ethereum stores all account and storage data in a Merkle Patricia Trie (MPT) — a tree structure where every node is content-addressed by its Keccak256 hash. This means you can verify any piece of state by checking a short proof against the 32-byte state root stored in the block header.

You already know the concept. This chapter shows how geth implements it — from the node types and key encoding scheme, through trie operations and hashing, to the persistence layer that writes trie nodes to disk.

How the Trie Fits into Geth

Before looking at any code, here is how the trie is used during the lifecycle of a block:

 1. StateDB receives state changes (SetState, AddBalance, ...)
       │
       ▼
 2. Changes accumulate in dirty maps inside stateObject
       │
       ▼
 3. At commit time, dirty values are written into per-account tries
    and the account trie via Trie.Update()
       │
       ▼
 4. Trie.Hash() walks the tree bottom-up, RLP-encodes each dirty node,
    and either inlines it (< 32 bytes) or Keccak256-hashes it
       │
       ▼
 5. Trie.Commit() collects all dirty nodes into a NodeSet
       │
       ▼
 6. triedb.Database persists the NodeSet to disk
    (pathdb or hashdb backend)

Steps 1–2 are covered in Chapter 04 — Account and State. This chapter focuses on steps 3–6: the trie itself, its hashing, its commit process, and its persistence.

Key Encoding: Three Representations

Trie keys pass through three different encodings in geth. Understanding them is essential for reading any trie code.

The comment at the top of trie/encoding.go defines all three:

KEYBYTES — the raw key bytes as the caller provides them (e.g., a 32-byte Keccak256 hash). This is the input to public API functions like Trie.Get() and Trie.Update().
HEX (nibbles) — each byte is split into two nibbles, so a 32-byte key becomes 64 one-byte entries (each in the range 0x00–0x0f). A trailing terminator byte 0x10 is appended for leaf keys. This is what the in-memory trie uses, because branching on individual nibbles is how the trie navigates its 16-way branch nodes.
COMPACT (hex prefix encoding) — the Yellow Paper’s encoding for keys stored on disk. It packs nibbles back into bytes and uses a flag byte to encode two pieces of information: whether the key length is odd, and whether the node is a leaf.

The conversion functions in trie/encoding.go:

func keybytesToHex(str []byte) []byte {
    l := len(str)*2 + 1
    var nibbles = make([]byte, l)
    for i, b := range str {
        nibbles[i*2] = b / 16
        nibbles[i*2+1] = b % 16
    }
    nibbles[l-1] = 16
    return nibbles
}

func hexToCompact(hex []byte) []byte {
    terminator := byte(0)
    if hasTerm(hex) {
        terminator = 1
        hex = hex[:len(hex)-1]
    }
    buf := make([]byte, len(hex)/2+1)
    buf[0] = terminator << 5 // the flag byte
    if len(hex)&1 == 1 {
        buf[0] |= 1 << 4 // odd flag
        buf[0] |= hex[0] // first nibble is contained in the first byte
        hex = hex[1:]
    }
    decodeNibbles(hex, buf[1:])
    return buf
}

Walking through the encoding pipeline:

keybytesToHex splits each byte into two nibbles (b / 16, b % 16) and appends terminator 16 (0x10). A 32-byte key becomes a 65-byte nibble array.
hexToCompact does two things: it sets a flag byte (buf[0]) encoding the terminator status (bit 5) and odd-length flag (bit 4), then re-packs the remaining nibbles into bytes. If the nibble count is odd, the first nibble is squeezed into the lower half of the flag byte.

Here is a concrete example:

KEYBYTES:  [0xca, 0xfe]
HEX:       [0x0c, 0x0a, 0x0f, 0x0e, 0x10]   (nibbles + terminator)
COMPACT:   [0x20, 0xca, 0xfe]                 (flag 0x20 = leaf + even length)

The reverse function compactToHex converts disk-stored compact keys back to in-memory hex nibbles when loading nodes from the database.

Node Types

Geth represents all trie nodes using a single node interface and four concrete types. These are defined in trie/node.go:

type node interface {
    cache() (hashNode, bool)
    encode(w rlp.EncoderBuffer)
    fstring(string) string
}

type (
    fullNode struct {
        Children [17]node // Actual trie node data to encode/decode (needs custom encoder)
        flags    nodeFlag
    }
    shortNode struct {
        Key   []byte
        Val   node
        flags nodeFlag
    }
    hashNode  []byte
    valueNode []byte
)

type nodeFlag struct {
    hash  hashNode // cached hash of the node (may be nil)
    dirty bool     // whether the node has changes that must be written to the database
}

Each type maps to a role in the Merkle Patricia Trie:

Type	RLP form	Role
`fullNode`	17-element list	Branch node. `Children[0]`–`Children[15]` hold one child per nibble (0–f). `Children[16]` holds a value if a key terminates at this branch.
`shortNode`	2-element list	Extension or leaf node. If `Val` is another node (fullNode, shortNode, or hashNode), it’s an extension — `Key` is a shared prefix. If `Val` is a `valueNode`, it’s a leaf — `Key` is the remaining key suffix with terminator.
`hashNode`	32-byte string	Hash reference. A placeholder for a node that hasn’t been loaded from disk yet. Contains the Keccak256 hash of the node’s RLP encoding.
`valueNode`	byte string	Leaf value. The actual data stored in the trie (e.g., an RLP-encoded account).

The nodeFlag struct caches two pieces of metadata: the Keccak256 hash of the node (computed during hashing and reused until the node is modified) and a dirty flag that tracks whether the node has been modified since the last commit.

When nodes are decoded from RLP (loaded from disk), the decodeNodeUnsafe function in trie/node.go distinguishes between branch and short nodes by counting the list elements:

func decodeNodeUnsafe(hash, buf []byte) (node, error) {
    // ...
    elems, _, err := rlp.SplitList(buf)
    // ...
    switch c, _ := rlp.CountValues(elems); c {
    case 2:
        n, err := decodeShort(hash, elems)
        return n, wrapError(err, "short")
    case 17:
        n, err := decodeFull(hash, elems)
        return n, wrapError(err, "full")
    default:
        return nil, fmt.Errorf("invalid number of list elements: %v", c)
    }
}

Two elements means a short node (extension or leaf). Seventeen elements means a branch node. The function decodeShort further distinguishes extension from leaf by checking whether the decoded key has a terminator (using hasTerm).

The Trie Struct

The Trie struct is the main in-memory trie. It is defined in trie/trie.go:

type Trie struct {
    root  node
    owner common.Hash

    committed   bool
    unhashed    int
    uncommitted int

    reader         *Reader
    opTracer       *opTracer
    prevalueTracer *PrevalueTracer
}

Key fields:

root — the root node of the trie tree. All operations begin here.
owner — identifies which trie this is. For the account trie, this is the zero hash. For a storage trie, this is the account’s hashed address. The owner is used when reading/writing nodes to the database.
committed — once Commit() is called, the trie is marked as committed and can no longer be used. A new trie must be created from the new root.
unhashed / uncommitted — counters tracking modifications since the last hash/commit. Once they reach about 100 (hashRoot checks unhashed >= 100, Commit checks uncommitted > 100), the hasher and committer switch to parallel processing.
reader — the handler for loading nodes from the database when the trie encounters a hashNode.

Two constructors create tries:

func New(id *ID, db database.NodeDatabase) (*Trie, error) {
    reader, err := NewReader(id.StateRoot, id.Owner, db)
    // ...
    trie := &Trie{
        owner:          id.Owner,
        reader:         reader,
        opTracer:       newOpTracer(),
        prevalueTracer: NewPrevalueTracer(),
    }
    if id.Root != (common.Hash{}) && id.Root != types.EmptyRootHash {
        rootnode, err := trie.resolveAndTrack(id.Root[:], nil)
        // ...
        trie.root = rootnode
    }
    return trie, nil
}

func NewEmpty(db database.NodeDatabase) *Trie {
    tr, _ := New(TrieID(types.EmptyRootHash), db)
    return tr
}

New creates a trie rooted at a specific state. If the root isn’t the empty root hash, it loads the root node from disk via resolveAndTrack. NewEmpty is a shorthand for an empty trie used mainly in tests.

Trie Operations

Get

Get looks up a key by walking the trie from root to leaf, consuming one nibble at each branch node and matching key prefixes at short nodes:

func (t *Trie) Get(key []byte) ([]byte, error) {
    // ...
    value, newroot, didResolve, err := t.get(t.root, keybytesToHex(key), 0)
    if err == nil && didResolve {
        t.root = newroot
    }
    return value, err
}

func (t *Trie) get(origNode node, key []byte, pos int) (value []byte, newnode node, didResolve bool, err error) {
    switch n := (origNode).(type) {
    case nil:
        return nil, nil, false, nil
    case valueNode:
        return n, n, false, nil
    case *shortNode:
        if !bytes.HasPrefix(key[pos:], n.Key) {
            return nil, n, false, nil
        }
        value, newnode, didResolve, err = t.get(n.Val, key, pos+len(n.Key))
        // ...
    case *fullNode:
        value, newnode, didResolve, err = t.get(n.Children[key[pos]], key, pos+1)
        // ...
    case hashNode:
        child, err := t.resolveAndTrack(n, key[:pos])
        // ...
        value, newnode, _, err := t.get(child, key, pos)
        return value, newnode, true, err
    // ...
    }
}

Walking through the recursive descent:

nil — key not found, return nil.
valueNode — reached a leaf value, return it.
shortNode — check if the remaining key starts with n.Key. If yes, skip past the matched prefix and recurse into n.Val. If no, key not found.
fullNode — consume one nibble (key[pos]) to select the child, recurse.
hashNode — the node isn’t loaded yet. Call resolveAndTrack to fetch it from disk, then retry with the resolved node. The didResolve flag propagates upward so the resolved node replaces the hashNode in-place (lazy loading).
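The descent can be reproduced on a toy trie with no hashing and no database. The types below are simplified stand-ins for geth's node types (there is no hashNode, so nothing is ever resolved from disk), and buildExample is a hypothetical helper that hand-builds a trie holding two keys that share the prefix c, a, f.

```go
package main

import (
	"bytes"
	"fmt"
)

// Simplified stand-ins for geth's node types (no hashing, no database).
type node interface{}

type (
	fullNode  struct{ Children [17]node }
	shortNode struct {
		Key []byte // hex nibbles; leaf keys end with the terminator 16
		Val node
	}
	valueNode []byte
)

func keybytesToHex(key []byte) []byte {
	nibbles := make([]byte, 0, len(key)*2+1)
	for _, b := range key {
		nibbles = append(nibbles, b/16, b%16)
	}
	return append(nibbles, 16)
}

// get walks from n toward the leaf, consuming key nibbles starting at pos.
func get(n node, key []byte, pos int) []byte {
	switch n := n.(type) {
	case nil:
		return nil // key not found
	case valueNode:
		return n
	case *shortNode:
		if !bytes.HasPrefix(key[pos:], n.Key) {
			return nil // remaining key diverges from this node's key
		}
		return get(n.Val, key, pos+len(n.Key))
	case *fullNode:
		return get(n.Children[key[pos]], key, pos+1)
	}
	return nil
}

// buildExample hand-builds a trie holding 0xcafe -> "A" and 0xcaff -> "B".
func buildExample() node {
	branch := &fullNode{}
	branch.Children[0xe] = &shortNode{Key: []byte{16}, Val: valueNode("A")}
	branch.Children[0xf] = &shortNode{Key: []byte{16}, Val: valueNode("B")}
	return &shortNode{Key: []byte{0xc, 0xa, 0xf}, Val: branch}
}

func main() {
	root := buildExample()
	fmt.Printf("%s\n", get(root, keybytesToHex([]byte{0xca, 0xfe}), 0))
	fmt.Printf("%s\n", get(root, keybytesToHex([]byte{0xca, 0xff}), 0))
	fmt.Println(get(root, keybytesToHex([]byte{0xbe, 0xef}), 0) == nil)
}
```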

Update

Update inserts or deletes a key depending on the value length:

func (t *Trie) update(key, value []byte) error {
    t.unhashed++
    t.uncommitted++
    k := keybytesToHex(key)
    if len(value) != 0 {
        _, n, err := t.insert(t.root, nil, k, valueNode(value))
        // ...
        t.root = n
    } else {
        _, n, err := t.delete(t.root, nil, k)
        // ...
        t.root = n
    }
    return nil
}

A non-empty value triggers insert; an empty value triggers delete. Both return the new root node.

The insert method handles the most complex case — splitting a short node when two keys diverge:

// trie/trie.go (insert, shortened)

func (t *Trie) insert(n node, prefix, key []byte, value node) (bool, node, error) {
    if len(key) == 0 {
        if v, ok := n.(valueNode); ok {
            return !bytes.Equal(v, value.(valueNode)), value, nil
        }
        return true, value, nil
    }
    switch n := n.(type) {
    case *shortNode:
        matchlen := prefixLen(key, n.Key)
        if matchlen == len(n.Key) {
            dirty, nn, err := t.insert(n.Val, append(prefix, key[:matchlen]...), key[matchlen:], value)
            // ...
            return true, &shortNode{n.Key, nn, t.newFlag()}, nil
        }
        // Otherwise branch out at the index where they differ.
        branch := &fullNode{flags: t.newFlag()}
        _, branch.Children[n.Key[matchlen]], _ = t.insert(nil, ..., n.Key[matchlen+1:], n.Val)
        _, branch.Children[key[matchlen]], _ = t.insert(nil, ..., key[matchlen+1:], value)
        if matchlen == 0 {
            return true, branch, nil
        }
        return true, &shortNode{key[:matchlen], branch, t.newFlag()}, nil

    case *fullNode:
        dirty, nn, err := t.insert(n.Children[key[0]], append(prefix, key[0]), key[1:], value)
        // ...
        n.flags = t.newFlag()
        n.Children[key[0]] = nn
        return true, n, nil

    case nil:
        return true, &shortNode{key, value, t.newFlag()}, nil

    case hashNode:
        rn, err := t.resolveAndTrack(n, prefix)
        // ...
        return t.insert(rn, prefix, key, value)
    }
}

The key logic for a shortNode with a key mismatch:

Find the common prefix length (matchlen).
If n.Key matches in full (matchlen == len(n.Key)), recurse into the child with the rest of the key.
If keys diverge, create a new fullNode (branch) with two children: the existing value at n.Key[matchlen] and the new value at key[matchlen].
If there’s a common prefix (matchlen > 0), wrap the branch in a new shortNode for that prefix.

All newly created nodes are marked dirty via t.newFlag(), which returns nodeFlag{dirty: true}.
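The split can be watched in isolation with a reduced insert that drops dirty flags, path prefixes, error handling, and hashNode resolution. This is a sketch under those simplifications, not geth's insert; the node types are the same kind of stand-ins as geth's, without flags.

```go
package main

import "fmt"

type node interface{}

type (
	fullNode  struct{ Children [17]node }
	shortNode struct {
		Key []byte
		Val node
	}
	valueNode []byte
)

// prefixLen returns the length of the common prefix of a and b.
func prefixLen(a, b []byte) int {
	i := 0
	for i < len(a) && i < len(b) && a[i] == b[i] {
		i++
	}
	return i
}

// insert returns the subtree that results from storing value under key
// (hex nibbles with terminator) below n.
func insert(n node, key []byte, value node) node {
	if len(key) == 0 {
		return value
	}
	switch n := n.(type) {
	case nil:
		return &shortNode{Key: key, Val: value}
	case *shortNode:
		m := prefixLen(key, n.Key)
		if m == len(n.Key) {
			// The whole short-node key matches: descend into its child.
			return &shortNode{Key: n.Key, Val: insert(n.Val, key[m:], value)}
		}
		// Keys diverge at index m: branch on the differing nibbles.
		branch := &fullNode{}
		branch.Children[n.Key[m]] = insert(nil, n.Key[m+1:], n.Val)
		branch.Children[key[m]] = insert(nil, key[m+1:], value)
		if m == 0 {
			return branch
		}
		return &shortNode{Key: key[:m], Val: branch}
	case *fullNode:
		n.Children[key[0]] = insert(n.Children[key[0]], key[1:], value)
		return n
	}
	panic("unexpected node type")
}

func main() {
	var root node
	root = insert(root, []byte{0xc, 0xa, 0xf, 0xe, 16}, valueNode("A"))
	root = insert(root, []byte{0xc, 0xa, 0xf, 0xf, 16}, valueNode("B"))
	ext := root.(*shortNode)
	_, isBranch := ext.Val.(*fullNode)
	fmt.Println(ext.Key, isBranch)
}
```

After the second insert, the single leaf has turned into an extension for the shared prefix c, a, f that points at a branch with children at nibbles e and f.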

Delete

The delete method mirrors insert but also handles node reduction — after removing a child from a branch node, if only one child remains, the branch collapses back into a short node. This keeps the trie minimal:

// trie/trie.go (delete, reduction logic)

case *fullNode:
    dirty, nn, err := t.delete(n.Children[key[0]], append(prefix, key[0]), key[1:])
    // ...
    n.Children[key[0]] = nn

    if nn != nil {
        return true, n, nil
    }
    // Count remaining children
    pos := -1
    for i, cld := range &n.Children {
        if cld != nil {
            if pos == -1 {
                pos = i
            } else {
                pos = -2
                break
            }
        }
    }
    if pos >= 0 {
        // Only one child remains — collapse branch into shortNode
        // ...
    }

When a deletion leaves a branch with only one non-nil child, the branch is replaced by a short node. If that single remaining child is itself a short node, their keys are concatenated to avoid a shortNode{..., shortNode{...}} chain.
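The collapse step itself is small enough to sketch on its own. The collapse function below is a hypothetical helper on simplified node types; geth's version additionally resolves a hashNode child from disk before checking whether it is a short node, which this sketch leaves out.

```go
package main

import "fmt"

type node interface{}

type (
	fullNode  struct{ Children [17]node }
	shortNode struct {
		Key []byte
		Val node
	}
	valueNode []byte
)

// collapse returns n unchanged if it still has two or more children.
// With exactly one child left, it replaces the branch by a short node whose
// key starts with that child's nibble, merging keys if the child is itself
// a short node.
func collapse(n *fullNode) node {
	pos := -1
	for i, child := range n.Children {
		if child == nil {
			continue
		}
		if pos != -1 {
			return n // at least two children remain
		}
		pos = i
	}
	if pos == -1 {
		return nil // no children at all
	}
	child := n.Children[pos]
	if sn, ok := child.(*shortNode); ok {
		// Avoid shortNode{..., shortNode{...}} by concatenating the keys.
		key := append([]byte{byte(pos)}, sn.Key...)
		return &shortNode{Key: key, Val: sn.Val}
	}
	return &shortNode{Key: []byte{byte(pos)}, Val: child}
}

func main() {
	branch := &fullNode{}
	branch.Children[0xe] = &shortNode{Key: []byte{16}, Val: valueNode("A")}
	merged := collapse(branch).(*shortNode)
	fmt.Println(merged.Key)
}
```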

Hashing: From Tree to Root Hash

The Hash() method computes the trie’s root hash without modifying the database. It walks the tree bottom-up, RLP-encoding each node and either inlining it or replacing it with its Keccak256 hash.

The hashing logic lives in trie/hasher.go:

type hasher struct {
    sha      crypto.KeccakState
    tmp      []byte
    encbuf   rlp.EncoderBuffer
    parallel bool
}

func (h *hasher) hash(n node, force bool) []byte {
    if hash, _ := n.cache(); hash != nil {
        return hash
    }
    switch n := n.(type) {
    case *shortNode:
        enc := h.encodeShortNode(n)
        if len(enc) < 32 && !force {
            buf := make([]byte, len(enc))
            copy(buf, enc)
            return buf
        }
        hash := h.hashData(enc)
        n.flags.hash = hash
        return hash

    case *fullNode:
        enc := h.encodeFullNode(n)
        if len(enc) < 32 && !force {
            buf := make([]byte, len(enc))
            copy(buf, enc)
            return buf
        }
        hash := h.hashData(enc)
        n.flags.hash = hash
        return hash

    case hashNode:
        return n
    // ...
    }
}

The 32-byte inlining rule is the central decision in the hasher:

If the RLP-encoded node is less than 32 bytes AND force is false: the raw encoded bytes are returned. This node will be inlined (embedded directly in its parent’s encoding) rather than referenced by hash.
If the RLP-encoded node is 32 bytes or more, OR force is true: the node is Keccak256-hashed, and the 32-byte hash is cached in n.flags.hash for future reuse.
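Reduced to its core, the rule is a single branch on the encoded length. The sketch below uses a hypothetical nodeRef helper and, as a loudly labeled substitution, SHA-256 from the Go standard library in place of Keccak256 so that it runs without external dependencies; only the length behaviour carries over to geth.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// nodeRef returns what a parent stores for a child with the given encoding:
// a copy of the encoding when it is shorter than 32 bytes (inlined),
// otherwise a 32-byte digest. force mimics the root, which is always hashed.
// NOTE: SHA-256 is a stand-in here; geth uses Keccak256.
func nodeRef(enc []byte, force bool) []byte {
	if len(enc) < 32 && !force {
		return append([]byte(nil), enc...)
	}
	sum := sha256.Sum256(enc)
	return sum[:]
}

func main() {
	small := make([]byte, 10)
	large := make([]byte, 40)
	fmt.Println(len(nodeRef(small, false))) // inlined: length of the encoding
	fmt.Println(len(nodeRef(small, true)))  // forced: digest length
	fmt.Println(len(nodeRef(large, false))) // hashed: digest length
}
```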

The force parameter is only true for the root node (called from hashRoot):

func (t *Trie) hashRoot() []byte {
    if t.root == nil {
        return types.EmptyRootHash.Bytes()
    }
    h := newHasher(t.unhashed >= 100)
    defer func() {
        returnHasherToPool(h)
        t.unhashed = 0
    }()
    return h.hash(t.root, true)
}

The root is always hashed (never inlined), even if its encoding is small, because the root hash is the state root stored in the block header.

When hashing a short node, the hasher converts the hex-encoded key to compact encoding before RLP-encoding it. For extension nodes, it recursively hashes the child first:

func (h *hasher) encodeShortNode(n *shortNode) []byte {
    if hasTerm(n.Key) {
        // Leaf node: encode [compactKey, value]
        var ln leafNodeEncoder
        ln.Key = hexToCompact(n.Key)
        ln.Val = n.Val.(valueNode)
        ln.encode(h.encbuf)
        return h.encodedBytes()
    }
    // Extension node: encode [compactKey, hash(child)]
    var en extNodeEncoder
    en.Key = hexToCompact(n.Key)
    en.Val = h.hash(n.Val, false)
    en.encode(h.encbuf)
    return h.encodedBytes()
}

For full nodes with many children, when parallel is true (triggered when 100+ nodes are unhashed), each child is hashed in its own goroutine using a separate hasher from the pool. Hashers are pooled via sync.Pool to reduce allocation overhead.

Committing: Dirty Nodes to NodeSet

While Hash() only computes hashes (non-destructive), Commit() collects all dirty nodes into a trienode.NodeSet for writing to the database. After committing, the trie is unusable — a new one must be created from the new root.

The commit flow in trie/trie.go:

func (t *Trie) Commit(collectLeaf bool) (common.Hash, *trienode.NodeSet) {
    defer func() {
        t.committed = true
    }()
    if t.root == nil {
        // Handle empty trie...
    }
    rootHash := t.Hash()

    if hashedNode, dirty := t.root.cache(); !dirty {
        t.root = hashedNode
        return rootHash, nil
    }
    nodes := trienode.NewNodeSet(t.owner)
    for _, path := range t.deletedNodes() {
        nodes.AddNode(path, trienode.NewDeletedWithPrev(t.prevalueTracer.Get(path)))
    }
    t.root = newCommitter(nodes, t.prevalueTracer, collectLeaf).Commit(t.root, t.uncommitted > 100)
    t.uncommitted = 0
    return rootHash, nodes
}

The sequence:

Hash first — t.Hash() ensures all nodes have their hashes computed.
Quick check — if the root isn’t dirty, there’s nothing to commit; return the root hash together with a nil NodeSet.
Collect deleted nodes — nodes that were deleted or replaced are tracked so the database can remove stale entries.
Create a committer — the committer walks the tree and collects every dirty node into the NodeSet.

The committer in trie/committer.go recurses through the tree:

type committer struct {
    nodes       *trienode.NodeSet
    tracer      *PrevalueTracer
    collectLeaf bool
}

func (c *committer) commit(path []byte, n node, parallel bool) node {
    hash, dirty := n.cache()
    if hash != nil && !dirty {
        return hash
    }
    switch cn := n.(type) {
    case *shortNode:
        if _, ok := cn.Val.(*fullNode); ok {
            cn.Val = c.commit(append(path, cn.Key...), cn.Val, false)
        }
        cn.Key = hexToCompact(cn.Key)
        hashedNode := c.store(path, cn)
        // ...
    case *fullNode:
        c.commitChildren(path, cn, parallel)
        hashedNode := c.store(path, cn)
        // ...
    case hashNode:
        return cn
    }
}

For each dirty node, the committer:

Recursively commits children first (bottom-up).
Converts the short node’s key from hex to compact encoding (for disk storage).
Calls store(), which adds the node to the NodeSet with both its current encoding and its previous value (for the database to track changes).

The store method applies the same 32-byte threshold: if a node’s hash is nil (meaning it was too small to be hashed), it’s an embedded node that won’t be stored independently in the database. Larger nodes are added to the NodeSet keyed by their path.

When parallel is true (more than 100 uncommitted changes), commitChildren processes each child of a full node in its own goroutine, each with a separate NodeSet that’s merged back under a mutex.
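The collect-locally, merge-under-lock pattern looks roughly like this. The nodeSet type and this commitChildren are hypothetical simplifications: children are plain blobs and each goroutine contributes a single entry, whereas in geth each goroutine commits a whole subtree into its own NodeSet.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeSet is a hypothetical stand-in for trienode.NodeSet: path -> blob.
type nodeSet map[string][]byte

// commitChildren collects each child's nodes into a private set in its own
// goroutine, then merges that set into the shared one under a mutex.
func commitChildren(children [16][]byte) nodeSet {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		merged = make(nodeSet)
	)
	for i, blob := range children {
		if blob == nil {
			continue
		}
		wg.Add(1)
		go func(index int, blob []byte) {
			defer wg.Done()
			local := make(nodeSet) // private to this goroutine, no locking
			local[fmt.Sprintf("%x", index)] = blob

			mu.Lock()
			for path, b := range local {
				merged[path] = b
			}
			mu.Unlock()
		}(i, blob)
	}
	wg.Wait()
	return merged
}

func main() {
	var children [16][]byte
	children[0x3] = []byte("n3")
	children[0xa] = []byte("na")
	set := commitChildren(children)
	fmt.Println(len(set))
}
```

Filling a private set first keeps the critical section short: the lock is held only for the merge, not for the work of walking a subtree.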

Trie Database: The Persistence Layer

The trie itself is purely in-memory. The triedb.Database (in triedb/database.go) handles persistence. It wraps a disk key-value store and delegates to one of two backends:

type Database struct {
    disk      ethdb.Database
    config    *Config
    preimages *preimageStore
    backend   backend        // either *hashdb.Database or *pathdb.Database
}

func NewDatabase(diskdb ethdb.Database, config *Config) *Database {
    // ...
    if config.PathDB != nil {
        db.backend = pathdb.New(diskdb, config.PathDB, config.IsVerkle)
    } else {
        db.backend = hashdb.New(diskdb, config.HashDB)
    }
    return db
}

The backend interface defines the operations both backends must support:

type backend interface {
    NodeReader(root common.Hash) (database.NodeReader, error)
    StateReader(root common.Hash) (database.StateReader, error)
    Size() (common.StorageSize, common.StorageSize)
    Commit(root common.Hash, report bool) error
    Close() error
}

Hash-Based Scheme (hashdb)

The original backend. Nodes are indexed by their Keccak256 hash, the same hash used as the node reference inside the trie. This is conceptually simple: to look up a node, use the hash from its parent's reference directly as the database key.

The hashdb backend keeps dirty nodes in a memory cache with reference counting. When Commit() is called with a state root, it walks the tree of dirty nodes starting from that root and flushes them to disk. Cap() can be called to evict old cached nodes when memory grows too large.

Key operations: Reference() adds a parent→child reference, Dereference() removes one. When a root’s reference count drops to zero, its nodes can be garbage collected.
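The reference-counting idea can be illustrated with a toy cache. Everything below (refDB, reference, dereference, and the use of plain strings as node hashes) is hypothetical and only loosely modeled on the description above; it is not hashdb's API and ignores flushing to disk.

```go
package main

import "fmt"

// refDB is a toy reference-counted node cache.
type refDB struct {
	parents  map[string]int      // node hash -> number of references to it
	children map[string][]string // node hash -> child hashes it references
}

func newRefDB() *refDB {
	return &refDB{parents: map[string]int{}, children: map[string][]string{}}
}

// reference records that parent points at child.
func (db *refDB) reference(parent, child string) {
	db.children[parent] = append(db.children[parent], child)
	db.parents[child]++
}

// dereference drops one reference to node and, once nothing points at it
// any more, removes it and releases its children recursively.
func (db *refDB) dereference(node string) {
	if db.parents[node] > 0 {
		db.parents[node]--
	}
	if db.parents[node] > 0 {
		return // still referenced from elsewhere
	}
	for _, child := range db.children[node] {
		db.dereference(child)
	}
	delete(db.children, node)
	delete(db.parents, node)
}

func main() {
	db := newRefDB()
	db.reference("", "rootA") // "" plays the role of a meta root
	db.reference("", "rootB")
	db.reference("rootA", "shared")
	db.reference("rootA", "onlyA")
	db.reference("rootB", "shared")

	db.dereference("rootA")
	_, onlyAAlive := db.parents["onlyA"]
	fmt.Println(onlyAAlive, db.parents["shared"])
}
```

Dropping rootA frees the node only it referenced, while the node shared with rootB survives with one remaining reference.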

Path-Based Scheme (pathdb)

The newer backend. Nodes are indexed by their trie path (the sequence of nibbles from root to the node) rather than by hash. This enables a layer-based diff model:

A diskLayer holds the base state that has been flushed to disk.
diffLayers stack on top, each representing the state changes from one block. They form a tree (not a chain) to support reorgs.

When reading a node, pathdb searches from the newest diff layer downward. If no layer has the node, it falls back to disk. When enough diff layers accumulate, the oldest is merged (flattened) into the disk layer.

The path-based scheme is more efficient for block-by-block state management because it naturally groups changes by block and avoids the reference-counting complexity of hashdb. It also supports features like Journal() for crash recovery and Recover() for rollback to a historical state.
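The layered read path can be sketched with a linked list of maps. The layer type, its node method, and flatten are hypothetical simplifications: the sketch models a single chain rather than a tree of diff layers, and an in-memory map plays the role of the disk layer.

```go
package main

import "fmt"

// layer is a minimal stand-in for pathdb's layers: a map of path -> node
// blob plus a pointer to the layer underneath. The bottom layer
// (parent == nil) plays the role of the disk layer.
type layer struct {
	parent *layer
	nodes  map[string][]byte
}

// node searches from this layer downward and returns the first hit.
func (l *layer) node(path string) ([]byte, bool) {
	for cur := l; cur != nil; cur = cur.parent {
		if blob, ok := cur.nodes[path]; ok {
			return blob, true
		}
	}
	return nil, false
}

// flatten merges the diff layer sitting directly above base into base,
// overwriting older entries, and returns base.
func flatten(base, diff *layer) *layer {
	for path, blob := range diff.nodes {
		base.nodes[path] = blob
	}
	return base
}

func main() {
	disk := &layer{nodes: map[string][]byte{"0a": []byte("v0"), "0b": []byte("v0")}}
	block1 := &layer{parent: disk, nodes: map[string][]byte{"0a": []byte("v1")}}
	block2 := &layer{parent: block1, nodes: map[string][]byte{"0c": []byte("v2")}}

	for _, path := range []string{"0a", "0b", "0c", "0d"} {
		blob, ok := block2.node(path)
		fmt.Printf("%s -> %q (found=%v)\n", path, blob, ok)
	}

	// Merge the oldest diff layer into the disk layer and re-point its child.
	block2.parent = flatten(disk, block1)
	blob, _ := block2.node("0a")
	fmt.Printf("after flatten: 0a -> %q\n", blob)
}
```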

The scheme is selected at database creation time via the Config:

var HashDefaults = &Config{
    Preimages: false,
    IsVerkle:  false,
    HashDB:    hashdb.Defaults,
}

If config.PathDB is set, the path-based backend is used. Otherwise, the hash-based backend is the default.

StateTrie: Key Hashing for Security

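The raw Trie uses keys as-is. In Ethereum’s state trie, this would be dangerous: an attacker who can choose keys (addresses or storage slots) could craft ones that share long prefixes, creating deep, unbalanced paths that make lookups and updates on them far more expensive than in a balanced trie.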

The StateTrie (in trie/secure_trie.go) solves this by hashing every key with Keccak256 before passing it to the underlying trie:

type StateTrie struct {
    trie        Trie
    db          database.NodeDatabase
    preimages   preimageStore
    secKeyCache map[common.Hash][]byte
}

func (t *StateTrie) GetStorage(_ common.Address, key []byte) ([]byte, error) {
    enc, err := t.trie.Get(crypto.Keccak256(key))
    // ...
}

func (t *StateTrie) UpdateAccount(address common.Address, acc *types.StateAccount, _ int) error {
    hk := crypto.Keccak256(address.Bytes())
    data, err := rlp.EncodeToBytes(acc)
    // ...
    if err := t.trie.Update(hk, data); err != nil {
        return err
    }
    if t.preimages != nil {
        t.secKeyCache[common.Hash(hk)] = address.Bytes()
    }
    return nil
}

Every key goes through crypto.Keccak256() before reaching the trie. Since Keccak256 outputs are uniformly distributed, the resulting trie is balanced regardless of the input key pattern.

The secKeyCache stores the reverse mapping (hash → original key) so that the original key can be recovered when needed. These preimages are flushed to disk during Commit():

func (t *StateTrie) Commit(collectLeaf bool) (common.Hash, *trienode.NodeSet) {
    if len(t.secKeyCache) > 0 {
        if t.preimages != nil {
            t.preimages.InsertPreimage(t.secKeyCache)
        }
        clear(t.secKeyCache)
    }
    return t.trie.Commit(collectLeaf)
}

In Ethereum’s state model, StateTrie backs two kinds of tries:

The account trie (also called the world state trie) maps keccak256(address) → RLP-encoded account data. Its root is the Root field in the block header (see Chapter 02).
Each account’s storage trie maps keccak256(storageSlot) → RLP-encoded storage value. Its root is stored in the account object (the Root field of types.StateAccount, which the Yellow Paper calls storageRoot).

StackTrie: Write-Only Trie for Block Building

The regular Trie supports arbitrary reads, inserts, and deletes. But during block building and receipt hashing, geth builds a trie from scratch with keys inserted in sorted order. For this use case, StackTrie (in trie/stacktrie.go) is a specialized write-only variant that is much more memory-efficient.

type StackTrie struct {
    root       *stNode
    h          *hasher
    last       []byte
    onTrieNode OnTrieNode
    // ...
}

type stNode struct {
    typ      uint8       // node type (emptyNode, branchNode, extNode, leafNode, hashedNode)
    key      []byte
    val      []byte
    children [16]*stNode // no slot 16 — values are stored in val, not children
}

type OnTrieNode func(path []byte, hash common.Hash, blob []byte)

Key differences from Trie:

Sorted insertion only. Update enforces ascending key order and returns an error if keys arrive out of order. Deletions are not supported.
Eager hashing. When a new key is inserted, StackTrie checks if any earlier subtrees are now complete (no future key can land in them because keys are sorted). Completed subtrees are immediately hashed and freed, so only the “right frontier” of the trie is kept in memory.
Callback on commit. The OnTrieNode callback is invoked for each node as it’s hashed, allowing the caller to write the node to disk immediately. This avoids accumulating all nodes in memory.
Simpler node types. Uses stNode with a typ field instead of separate Go types. Branch nodes have only 16 children (no slot 16 for values — values go in val).

The Update method enforces sorted order:

func (t *StackTrie) Update(key, value []byte) error {
    if len(value) == 0 {
        return errors.New("trying to insert empty (deletion)")
    }
    // ...
    k := writeHexKey(t.kBuf, key)
    if bytes.Compare(t.last, k) >= 0 {
        return errors.New("non-ascending key order")
    }
    // ...
    t.insert(t.root, k, vBuf, t.pBuf[:0])
    return nil
}
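The ordering rule can be exercised on its own. The sortedWriter below is a hypothetical reduction of StackTrie to that one rule: it remembers only the last key, compares raw key bytes instead of hex nibbles, and builds no trie at all.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// sortedWriter only remembers the last accepted key and rejects any key
// that is not strictly greater.
type sortedWriter struct {
	last  []byte
	count int
}

func (w *sortedWriter) update(key, value []byte) error {
	if len(value) == 0 {
		return errors.New("trying to insert empty (deletion)")
	}
	if bytes.Compare(w.last, key) >= 0 {
		return errors.New("non-ascending key order")
	}
	w.last = append(w.last[:0], key...) // copy so the caller may reuse key
	w.count++
	return nil
}

func main() {
	w := &sortedWriter{}
	fmt.Println(w.update([]byte{0x01}, []byte("a"))) // accepted
	fmt.Println(w.update([]byte{0x02}, []byte("b"))) // accepted
	fmt.Println(w.update([]byte{0x02}, []byte("c"))) // duplicate key
	fmt.Println(w.update([]byte{0x01}, []byte("d"))) // out of order
	fmt.Println(w.update([]byte{0x03}, nil))         // empty value
}
```

Note that an equal key is rejected as well, so the same key cannot be written twice.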

The hashing logic in StackTrie.hash() applies the same 32-byte threshold as the regular hasher — nodes smaller than 32 bytes that aren’t the root are inlined. For larger nodes (or the root), the Keccak256 hash is computed, and the onTrieNode callback is invoked:

// trie/stacktrie.go (hash method, simplified)

func (t *StackTrie) hash(st *stNode, path []byte) {
    // ... encode node into blob based on st.typ ...

    st.typ = hashedNode
    st.key = st.key[:0]

    if len(blob) < 32 && len(path) > 0 {
        st.val = bPool.getWithSize(len(blob))
        copy(st.val, blob)
        return
    }
    st.val = bPool.getWithSize(32)
    t.h.hashDataTo(st.val, blob)

    if t.onTrieNode != nil {
        t.onTrieNode(path, common.BytesToHash(st.val), blob)
    }
}

StackTrie is used in geth for:

Computing the transactions root and receipts root in block headers, where transactions/receipts are keyed by their RLP-encoded index in sorted order.
Writing trie nodes during snap sync state download, where accounts arrive in sorted hash order.

What’s Next

With the trie implementation covered, we now have the data structure that backs all of Ethereum’s state. Chapter 04 — Account and State builds on this foundation to explain how StateDB and stateObject use the trie to read and write account balances, nonces, contract code, and storage slots.
