Q1: What is the complete lifecycle of a transaction from user submission to finalization?
Overall flow
```
User / DApp
      │
      │ eth_sendRawTransaction (JSON-RPC)
      ▼
┌──────────────┐
│  RPC Server  │ ← Ch.13: transport, dispatch, reflection
└──────┬───────┘
       │
       ▼
┌──────────────┐
│TransactionAPI│ ← Ch.13: decode raw bytes → types.Transaction
└──────┬───────┘
       │ txPool.Add()
       ▼
┌──────────────┐
│ Transaction  │ ← Ch.8: validation, pending/queue, blob pool
│     Pool     │
└──────┬───────┘
       │ NewTxsEvent
       ▼
┌──────────────┐
│   Handler    │ ← Ch.12: broadcast to peers
└──────┬───────┘   sqrt(N) direct + rest hash announcement
       │
       │ CL calls ForkchoiceUpdated (triggers block building)
       ▼
┌──────────────┐
│    Miner     │ ← Ch.9: fillTransactions, sort by tip
└──────┬───────┘
       │ ApplyTransaction for each tx
       ▼
┌──────────────┐
│    State     │ ← Ch.6: preCheck → buyGas → EVM → refund → fee distribution
│  Transition  │
└──────┬───────┘
       │ evm.Call() or evm.Create()
       ▼
┌──────────────┐
│     EVM      │ ← Ch.7: interpreter loop, opcode execution, gas metering
└──────┬───────┘
       │ SSTORE, CREATE etc. modify state
       ▼
┌──────────────┐
│   StateDB    │ ← Ch.4: stateObject dirty map, journal rollback
└──────┬───────┘
       │ FinalizeAndAssemble
       ▼
┌──────────────┐
│    Block     │ ← Ch.9: compute state root, assemble block
│   Assembly   │
└──────┬───────┘
       │ CL: getPayload → newPayload → forkchoiceUpdated
       ▼
┌──────────────┐
│  BlockChain  │ ← Ch.10: InsertChain, validate, writeBlockWithState
└──────┬───────┘
       │ ChainHeadEvent
       ▼
┌──────────────┐
│   Storage    │ ← Ch.3/4/5: trie commit → TrieDB → Pebble/LevelDB + Freezer
│    Stack     │
└──────┬───────┘
       │ broadcast to peers
       ▼
┌──────────────┐
│ P2P Network  │ ← Ch.11/12: block propagation, lagging nodes sync
└──────┬───────┘
       │
       ▼
  Finalized ← CL finalizes, never rolled back
```

Stage 1: RPC arrival (Chapter 13)
The user calls eth_sendRawTransaction via HTTP/WebSocket/IPC, submitting a signed raw transaction.
```
HTTP POST
 → RPC Server decodes JSON
 → split method name on "_": service="eth", method="sendRawTransaction"
 → reflection lookup → TransactionAPI.SendRawTransaction
 → tx.UnmarshalBinary(rawBytes) → types.Transaction
 → SubmitTransaction() checks fee cap, EIP-155 protection
 → txPool.Add()
```

From this point, the transaction leaves the external world and enters geth internals.
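For orientation, here is a minimal client-side sketch of this stage using go-ethereum's ethclient; the endpoint URL is a placeholder, and the signed transaction is assumed to already exist. SendTransaction encodes the tx and issues eth_sendRawTransaction under the hood:

```go
package txsubmit

import (
	"context"
	"log"

	"github.com/ethereum/go-ethereum/core/types"
	"github.com/ethereum/go-ethereum/ethclient"
)

// Submit pushes an already-signed transaction into a node's RPC server.
func Submit(signedTx *types.Transaction) {
	// Placeholder endpoint; WebSocket and IPC URLs work the same way.
	client, err := ethclient.Dial("http://localhost:8545")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// SendTransaction marshals the tx to raw bytes and calls
	// eth_sendRawTransaction over JSON-RPC.
	if err := client.SendTransaction(context.Background(), signedTx); err != nil {
		log.Fatal(err)
	}
}
```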
Stage 2: Transaction pool validation and storage (Chapter 8)
The TxPool coordinator routes to the appropriate sub-pool based on transaction type:
```
TxPool.Add()
 ├─ Normal transactions (type 0/1/2/4) → LegacyPool
 └─ Blob transactions (type 3)         → BlobPool

Sub-pool validation:
 ├─ Signature recovery (ECDSA recover)
 ├─ Nonce check (not too low, not too large a gap)
 ├─ Balance check (can cover gasLimit × gasFeeCap + value)
 ├─ Gas limit check (doesn't exceed block gas limit)
 └─ Pool-level limits (account slots, global slots, minimum price)
```
Passes validation → pending map (ready for block inclusion). The pool then emits NewTxsEvent via event.Feed.
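A simplified sketch of the core checks, with illustrative types and field names rather than geth's actual pool code:

```go
package txpool

import (
	"errors"
	"math/big"
)

// Tx and Account are illustrative stand-ins for the real types.
type Tx struct {
	Nonce     uint64
	GasLimit  uint64
	GasFeeCap *big.Int
	Value     *big.Int
}

type Account struct {
	Nonce   uint64
	Balance *big.Int
}

// validateBasics mirrors the shape of the sub-pool checks: nonce ordering,
// worst-case cost coverage, and the block gas limit ceiling.
func validateBasics(tx *Tx, sender *Account, blockGasLimit uint64) error {
	if tx.Nonce < sender.Nonce {
		return errors.New("nonce too low")
	}
	if tx.GasLimit > blockGasLimit {
		return errors.New("tx gas limit exceeds block gas limit")
	}
	// Worst case, the sender pays gasLimit × gasFeeCap plus the transferred value.
	cost := new(big.Int).Mul(new(big.Int).SetUint64(tx.GasLimit), tx.GasFeeCap)
	cost.Add(cost, tx.Value)
	if sender.Balance.Cmp(cost) < 0 {
		return errors.New("insufficient funds to cover gasLimit × gasFeeCap + value")
	}
	return nil
}
```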
Stage 3: P2P broadcast (Chapter 12)
The handler's txBroadcastLoop() receives the event and executes a dual-layer broadcast:
```
BroadcastTransactions()
 ├─ ~sqrt(N) peers: send full transaction (TransactionsMsg)
 └─ remaining peers: send only hash (NewPooledTransactionHashesMsg)
         │
         ▼ remote peers receiving hash
    TxFetcher 3-stage pipeline
     ├─ wait 500ms (direct broadcast may arrive)
     ├─ didn't arrive → queue for request
     └─ send GetPooledTransactionsMsg to fetch
```
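The square-root split itself is a few lines; a sketch with an illustrative Peer interface (method names invented for the example):

```go
package broadcast

import "math"

// Peer is an illustrative stand-in for an eth protocol peer.
type Peer interface {
	SendTransactions(hashes []string)     // full tx payloads in the real protocol
	AnnounceTransactions(hashes []string) // hash-only announcement
}

// broadcast sends full transactions to ~sqrt(N) peers and only announces
// hashes to the rest, mirroring the dual-layer strategy described above.
func broadcast(peers []Peer, hashes []string) {
	direct := int(math.Sqrt(float64(len(peers))))
	for i, p := range peers {
		if i < direct {
			p.SendTransactions(hashes)
		} else {
			p.AnnounceTransactions(hashes)
		}
	}
}
```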
Within seconds → virtually every node in the network has this transaction.

Stage 4: Block building trigger (Chapter 9)
The transaction waits in the pool. When the consensus layer decides “you are this slot’s proposer”:
```
CL calls ForkchoiceUpdated (with payloadAttributes)
 → Engine API notifies miner
 → BuildPayload() starts:
    ├─ Immediately build empty block (guarantee, never miss a slot)
    └─ Background goroutine repeatedly builds full blocks
        ├─ 0s: first full build
        ├─ 2s: rebuild (new txs may have arrived)
        ├─ 4s: rebuild again
        └─ 6s: CL calls GetPayload, return best version
```

Stage 5: Transaction execution (Chapter 6)
During each build, fillTransactions() pulls transactions from the pool, sorts them by effective tip, and executes them one by one:
```
For each transaction:
  core.ApplyTransaction() → stateTransition.execute()
   ├─ preCheck()         validate nonce, balance
   ├─ buyGas()           pre-deduct gasLimit × gasFeeCap
   ├─ EVM dispatch       evm.Call() or evm.Create()
   ├─ calcRefund()       refund cap = gasUsed / 5 (EIP-3529)
   ├─ returnGas()        return remaining gas to sender
   └─ fee distribution   tip → coinbase, baseFee → burned
```

If execution fails, state rolls back to the pre-transaction snapshot and the gas pool is restored; failed transactions leave no trace.
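As a worked example of the final fee-distribution step, here is an illustrative helper following the EIP-1559 rules above (not geth's actual accounting code):

```go
package fees

import "math/big"

// split computes the EIP-1559 fee distribution for one transaction:
// the effective tip goes to the coinbase, the base fee portion is burned.
func split(gasUsed uint64, baseFee, gasTipCap, gasFeeCap *big.Int) (toCoinbase, burned *big.Int) {
	// effectiveTip = min(gasTipCap, gasFeeCap - baseFee)
	effectiveTip := new(big.Int).Sub(gasFeeCap, baseFee)
	if gasTipCap.Cmp(effectiveTip) < 0 {
		effectiveTip.Set(gasTipCap)
	}
	used := new(big.Int).SetUint64(gasUsed)
	toCoinbase = new(big.Int).Mul(effectiveTip, used)
	burned = new(big.Int).Mul(baseFee, used)
	return toCoinbase, burned
}

// Example: gasUsed=21000, baseFee=10 gwei, tipCap=2 gwei, feeCap=20 gwei
// → coinbase receives 21000×2 gwei, and 21000×10 gwei is burned.
```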
Stage 6: EVM execution (Chapter 7)
Inside evm.Call(), the interpreter loop runs the contract bytecode:
```
for {
	① op = contract.GetOp(pc)         // fetch opcode
	② operation = jumpTable[op]       // look up gas cost and handler
	③ validate stack depth
	④ deduct constantGas
	⑤ compute and deduct dynamicGas   // e.g., SLOAD cold/warm
	⑥ expand memory and charge
	⑦ operation.execute()             // execute!
	pc++
}
```

State-modifying opcodes (SSTORE, CREATE, etc.) write to StateDB's dirty map, protected by the journal for rollback.
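To make the loop concrete, here is a runnable toy jump-table interpreter for a two-opcode stack machine; it is entirely illustrative and far simpler than the real loop in core/vm:

```go
package main

import "fmt"

const (
	opStop  = 0x00
	opAdd   = 0x01 // pop two, push sum
	opPush1 = 0x60 // push the next code byte
)

type vm struct {
	stack []uint64
	gas   uint64
}

type instruction struct {
	constantGas uint64
	execute     func(v *vm, code []byte, pc *int)
}

// jumpTable maps opcode → gas cost + handler, like core/vm's JumpTable.
var jumpTable = map[byte]instruction{
	opPush1: {constantGas: 3, execute: func(v *vm, code []byte, pc *int) {
		(*pc)++ // consume the immediate operand
		v.stack = append(v.stack, uint64(code[*pc]))
	}},
	opAdd: {constantGas: 3, execute: func(v *vm, code []byte, pc *int) {
		n := len(v.stack)
		v.stack = append(v.stack[:n-2], v.stack[n-2]+v.stack[n-1])
	}},
}

func main() {
	code := []byte{opPush1, 2, opPush1, 40, opAdd, opStop} // compute 2 + 40
	v := &vm{gas: 100}
	for pc := 0; ; pc++ {
		op := code[pc] // ① fetch opcode
		if op == opStop {
			break
		}
		inst := jumpTable[op]          // ② decode via jump table
		v.gas -= inst.constantGas      // ④ charge constant gas (no dynamic gas here)
		inst.execute(v, code, &pc)     // ⑦ execute
	}
	fmt.Println("result:", v.stack[0], "gas left:", v.gas) // result: 42 gas left: 91
}
```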
Stage 7: State commit (Chapters 3/4)
After all transactions are executed:
```
FinalizeAndAssemble()
 ├─ Process withdrawals (validator balances)
 ├─ System-level operations (beacon root, etc.)
 ├─ statedb.Commit()
 │   ├─ Each modified account → storage trie updated and hashed
 │   ├─ Account trie updated and hashed
 │   └─ Produces 32-byte Merkle state root
 └─ Assemble block (header + txs + receipts + withdrawals)
```

This block is the payload returned to the CL via GetPayload.
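The key idea is the two-level roll-up: each account's storage hashes down to a root, and those roots feed the account trie. A toy illustration that substitutes sha256 over sorted pairs for the real Merkle Patricia Trie:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// hashKVs deterministically hashes a key→value map; a stand-in for
// committing a Merkle Patricia Trie (what geth actually uses).
func hashKVs(kvs map[string]string) [32]byte {
	keys := make([]string, 0, len(kvs))
	for k := range kvs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte(kvs[k]))
	}
	var root [32]byte
	copy(root[:], h.Sum(nil))
	return root
}

func main() {
	// Per-account storage → storage roots (the storage-trie layer).
	storage := map[string]map[string]string{
		"0xaa": {"slot0": "1"},
		"0xbb": {"slot0": "7", "slot1": "9"},
	}
	accounts := map[string]string{}
	for addr, slots := range storage {
		root := hashKVs(slots)
		accounts[addr] = fmt.Sprintf("%x", root) // storage root stored inside the account
	}
	// Account entries → the single 32-byte state root in the block header.
	stateRoot := hashKVs(accounts)
	fmt.Printf("state root: %x\n", stateRoot)
}
```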
Stage 8: Block insertion (Chapter 10)
After the CL validates the block on the beacon chain, it sends it back to geth via NewPayload:
```
InsertBlockWithoutSetHead()
 ├─ Header validation (timestamp, gas limit, consensus constraints)
 ├─ Body validation (tx root, uncles hash, withdrawals hash)
 ├─ State processing: re-execute all transactions (identical to stages 5-6!)
 │    Compare resulting state root with header's StateRoot
 │    Mismatch → reject block (INVALID)
 └─ writeBlockWithState() persist to disk
```
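The INVALID check reduces to one comparison. A shape-only sketch with invented types; the real path re-runs the stage 5-6 machinery:

```go
package engine

import "bytes"

// Block is an illustrative stand-in: a real block carries a full header
// and typed transactions.
type Block struct {
	HeaderStateRoot []byte
	Txs             [][]byte
}

// process would re-execute every transaction and return the resulting
// state root; in geth this is the same code path the builder used.
func process(b *Block) []byte { /* stages 5-6 again */ return nil }

// validate mirrors the newPayload decision: the locally computed root
// must match the one the proposer put in the header.
func validate(b *Block) string {
	if !bytes.Equal(process(b), b.HeaderStateRoot) {
		return "INVALID"
	}
	return "VALID"
}
```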
Then the CL calls ForkchoiceUpdated to designate the new head:

```
→ writeHeadBlock() updates canonical chain pointers
→ emit ChainHeadEvent
```

Stage 9: Persistence and propagation (Chapters 5/11/12)
Data flows down the storage stack:
```
StateDB → Trie → TrieDB → ethdb (Pebble/LevelDB)
                             │
                             └─ Old blocks → Freezer (append-only ancient storage)
```

Meanwhile, the handler broadcasts the new block to peers. Lagging nodes catch up via snap sync or full sync.
Stage 10: Finalization
The CL eventually marks the block (or its ancestor) as finalized:
```
Geth updates the currentFinalBlock pointer
 → This transaction can never be rolled back
 → Permanently part of the canonical chain
```

From the user pressing send to finalization, a transaction passes through RPC → pool → P2P → miner → state transition → EVM → StateDB → trie → blockchain → storage → network → finality, spanning every major subsystem in geth.
Q2: What are the cross-cutting design patterns in the geth codebase?
Recognizing these patterns gives you an "I've seen this before" feeling when reading any subsystem.
Pattern 1: Lifecycle interface (start/stop contract)
```go
type Lifecycle interface {
	Start() error
	Stop() error
}
```

Almost all long-running components follow this contract:
```
Component    Start()                        Stop()
──────────────────────────────────────────────────────────────────────────────
Ethereum     setupDiscovery, handler.Start  handler.Stop, txPool.Close, blockchain.Stop
handler      txBroadcastLoop, txFetcher     stop sync and broadcast
P2P Server   TCP listen, discovery, dialer  disconnect all peers
```

Key rules:
- Registration must precede Start — registering at runtime panics
- Stop in reverse order — consumers first, producers next, storage last
- Naming may vary (Init/Close, New/Stop), but the pattern is the same
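A minimal sketch of that discipline (an assumption-level simplification, not geth's actual node code):

```go
package node

import "errors"

type Lifecycle interface {
	Start() error
	Stop() error
}

type Node struct {
	lifecycles []Lifecycle
	running    bool
}

// RegisterLifecycle must be called before Start; registering at runtime
// is a programming error, hence the panic.
func (n *Node) RegisterLifecycle(l Lifecycle) {
	if n.running {
		panic("lifecycle registered after node start")
	}
	n.lifecycles = append(n.lifecycles, l)
}

func (n *Node) Start() error {
	n.running = true
	for i, l := range n.lifecycles {
		if err := l.Start(); err != nil {
			// Roll back whatever already started, in reverse order.
			for j := i - 1; j >= 0; j-- {
				n.lifecycles[j].Stop()
			}
			return err
		}
	}
	return nil
}

// Stop shuts everything down in reverse registration order:
// consumers first, producers next, storage last.
func (n *Node) Stop() error {
	var errs []error
	for i := len(n.lifecycles) - 1; i >= 0; i-- {
		if err := n.lifecycles[i].Stop(); err != nil {
			errs = append(errs, err)
		}
	}
	n.running = false
	return errors.Join(errs...)
}
```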
Pattern 2: Event Feed (publish/subscribe decoupling)
```go
// Producer
type BlockChain struct {
	chainHeadFeed event.Feed
}

func (bc *BlockChain) insertChain(...) {
	bc.chainHeadFeed.Send(ChainHeadEvent{Block: block})
}
```
```go
// Consumer
headCh := make(chan core.ChainHeadEvent, 64)
sub := bc.SubscribeChainHeadEvent(headCh)
for {
	select {
	case head := <-headCh:
		_ = head // react to the new chain head here
	case <-sub.Err():
		return
	}
}
```

Key feeds in geth:
```
Event           Producer         Consumer                   Purpose
──────────────────────────────────────────────────────────────────────────────
ChainHeadEvent  BlockChain       miner, handler,            new block arrived
                                 filter system, txpool
NewTxsEvent     TxPool           handler                    new tx, needs broadcast
WalletEvent     account backend  startNode wallet listener  wallet plugged/unplugged
ChainSideEvent  BlockChain       filter system              side chain block (uncles)
```

Why Feed instead of direct calls? Decoupling. BlockChain doesn't need to import the miner package, and miner doesn't need to import the handler package. They communicate through events, unaware of each other's existence. Adding a new consumer requires no changes to producer code.
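The pattern is easy to try end to end; a self-contained example using geth's event package, with a made-up event type:

```go
package main

import (
	"fmt"

	"github.com/ethereum/go-ethereum/event"
)

type ChainHeadEvent struct{ Number uint64 }

func main() {
	var feed event.Feed // the zero value is ready to use

	// Consumer: subscribe a typed channel.
	ch := make(chan ChainHeadEvent, 16)
	sub := feed.Subscribe(ch)
	defer sub.Unsubscribe()

	// Producer: Send delivers to every subscribed channel and returns
	// the number of subscribers reached.
	n := feed.Send(ChainHeadEvent{Number: 1})

	fmt.Println("delivered to", n, "subscriber(s):", <-ch)
}
```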
Pattern 3: Backend interface (API vs. implementation separation)
```
JSON-RPC methods (ethapi/api.go)
      │
      │ calls
      ▼
Backend interface (ethapi/backend.go)
      │
      │ implemented by
      ├─ EthAPIBackend (full node)
      └─ LESAPIBackend (light client)
           │
           │ delegates to
           ▼
      BlockChain, TxPool, Miner, StateDB...
```

eth_getBalance code doesn't care whether it's running on a full node or a light client; it just calls backend.StateAndHeaderByNumberOrHash(). The concrete implementation decides where the data comes from (local database vs. remote request).
This pattern appears in many places:
- Backend interface → bridge between RPC API and core
- consensus.Engine interface → bridge between consensus logic and blockchain
- ethdb.KeyValueStore interface → bridge between storage logic and the concrete KV engine
- txpool.SubPool interface → bridge between the pool coordinator and pool implementations
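All four share the same shape. A stripped-down sketch with invented names:

```go
package backend

// Balancer is the narrow interface the API layer codes against.
type Balancer interface {
	BalanceAt(addr string) (uint64, error)
}

// FullNode answers from its local state database.
type FullNode struct{ state map[string]uint64 }

func (n *FullNode) BalanceAt(addr string) (uint64, error) {
	return n.state[addr], nil
}

// LightClient would instead fetch a Merkle proof from a remote peer
// and verify it against a trusted header (elided here).
type LightClient struct{ /* peer set, proof verifier, ... */ }

func (c *LightClient) BalanceAt(addr string) (uint64, error) {
	// ...request + verify proof...
	return 0, nil
}

// getBalance is the RPC-handler side: it never learns which
// implementation it was handed.
func getBalance(b Balancer, addr string) (uint64, error) {
	return b.BalanceAt(addr)
}
```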
Pattern 4: Config struct (one configuration aggregate per subsystem)
```
CLI flags
   │
   │ utils.SetNodeConfig()
   ▼
node.Config      → data dir, P2P settings, RPC endpoints
   │
   │ utils.SetEthConfig()
   ▼
ethconfig.Config → sync mode, cache size, gas price
   │
   ├─ p2p.Config    → max peers, listen addr, NAT, bootnodes
   ├─ ChainConfig   → fork activation times, chain ID, consensus rules
   └─ TxPool.Config → price limits, slot limits, journal settings
```

Each Config has hardcoded defaults, which a TOML file can override, which CLI flags can override again. The three layers stack, and the result is passed to the corresponding subsystem's constructor.
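The layering is just "later writers win"; a minimal sketch with illustrative fields and the TOML/flag decoding elided:

```go
package config

// Config is an illustrative subsystem configuration.
type Config struct {
	MaxPeers int
	CacheMB  int
	DataDir  string
}

// Layer 1: hardcoded defaults.
func DefaultConfig() Config {
	return Config{MaxPeers: 50, CacheMB: 1024, DataDir: "~/.ethereum"}
}

// Layer 2: TOML file overrides (file decoding elided).
func applyTOML(cfg *Config) { cfg.CacheMB = 4096 }

// Layer 3: CLI flags override everything before them.
func applyFlags(cfg *Config) { cfg.MaxPeers = 100 }

// Load stacks the three layers; the result goes to the constructor.
func Load() Config {
	cfg := DefaultConfig()
	applyTOML(&cfg)
	applyFlags(&cfg)
	return cfg // MaxPeers=100, CacheMB=4096, DataDir still the default
}
```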
Pattern 5: Four-layer storage model (read-through + write-flush)
```
Layer 4: StateDB   in-memory dirty maps + journal rollback
         │ read-through ↓       ↑ writes accumulate
Layer 3: Trie      Merkle Patricia Trie nodes
         │ read-through ↓       ↑ commit flushes
Layer 2: TrieDB    caching layer (path-based or hash-based)
         │ read-through ↓       ↑ flush to disk
Layer 1: ethdb     KV store (Pebble/LevelDB) + Freezer
```

Read direction: StateDB checks its dirty cache first → a miss penetrates to the trie → the trie penetrates to TrieDB → TrieDB penetrates to disk.
Write direction: Modifications accumulate in StateDB’s dirty map → commit flushes to trie → trie commits to TrieDB → TrieDB eventually flushes to disk.
This “accumulate at upper layers, periodically flush to lower layers” pattern makes per-transaction state modifications extremely cheap (memory only), paying the trie hashing and disk write cost only at block commit time.
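A two-layer sketch of the read-through/write-flush discipline, using plain maps in place of tries and databases:

```go
package storage

// disk stands in for Layer 1 (Pebble/LevelDB).
type disk struct{ kv map[string]string }

func (d *disk) Get(k string) (string, bool) { v, ok := d.kv[k]; return v, ok }
func (d *disk) Put(k, v string)             { d.kv[k] = v }

// cache stands in for an upper layer: writes accumulate in memory,
// reads penetrate downward on a miss.
type cache struct {
	dirty map[string]string
	below *disk
}

func newCache(below *disk) *cache {
	return &cache{dirty: make(map[string]string), below: below}
}

// Get checks the dirty map first; a miss reads through to the layer below.
func (c *cache) Get(k string) (string, bool) {
	if v, ok := c.dirty[k]; ok {
		return v, true // upper-layer hit: memory only
	}
	return c.below.Get(k)
}

// Set is memory-only, which keeps per-transaction writes cheap.
func (c *cache) Set(k, v string) { c.dirty[k] = v }

// Flush pays the batch cost once, e.g. at block commit time.
func (c *cache) Flush() {
	for k, v := range c.dirty {
		c.below.Put(k, v)
	}
	c.dirty = make(map[string]string)
}
```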
Why recognizing these patterns matters
When you encounter an unfamiliar subsystem:
1. Does it have Start/Stop? → Lifecycle pattern; find where it's registered
2. Does it Send or Subscribe to something? → Event Feed pattern; find producers and consumers
3. Does it call an interface or a concrete type? → Backend pattern; find the implementation
4. Does its constructor accept a Config? → Config pattern; find defaults and CLI mapping
5. How many layers does its data cross? → Four-layer storage; find the penetration and flush paths

These five questions can help you locate any subsystem's position in the geth architecture within minutes.