Spread Decomposition: Huang-Stoll (1997)

The previous articles developed the three theoretical reasons a spread exists:

Inventory risk — the maker bears directional exposure (ho-stoll-inventory-model)
Adverse selection — some counterparties are informed (glosten-milgrom-model, kyle-lambda)
Order processing — the operational cost of running a market-making business (technology, clearing, regulatory compliance)

Huang and Stoll (1997) unified these into an empirical framework that decomposes the observed spread into its constituent parts. This is where theory meets data.

The Decomposition Framework

Let $s$ be the observed spread. Huang-Stoll model the transaction price $P_{t}$ as:

$P_{t} = μ_{t} + \frac{s}{2} \cdot Q_{t}$

where $μ_{t}$ is the efficient (true) price and $Q_{t} \in \{-1, +1\}$ indicates the trade direction: $+1$ for a buyer-initiated trade, $-1$ for a seller-initiated trade.

The efficient price evolves as:

$μ_{t} = μ_{t - 1} + α \cdot \frac{s}{2} \cdot Q_{t - 1} + ϵ_{t}$

where $α$ is the adverse selection component — the fraction of the half-spread that represents permanent information. The innovation $ϵ_{t}$ captures public information arrivals.

The full model decomposes the half-spread into three components with weights summing to one:

$\frac{s}{2} = \underbrace{α \cdot \frac{s}{2}}_{\text{adverse selection}} + \underbrace{β \cdot \frac{s}{2}}_{\text{inventory}} + \underbrace{(1 - α - β) \cdot \frac{s}{2}}_{\text{order processing}}$

Component	Symbol	Economic Content	Observable Signature
Adverse selection	$α$	Permanent price impact of trades — information that does not revert	Trade direction predicts future price level
Inventory	$β$	Transient price impact — quote adjustment to manage position	Trade direction predicts next quote change, but effect reverts
Order processing	$1 - α - β$	Pure cost — no price impact, no quote adjustment	Bid-ask bounce with no predictive content
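
To make the weights concrete, the short sketch below splits a half-spread into the three components in price units. The function name `decompose_half_spread` and the input values (a 10-cent spread, $α = 0.4$, $β = 0.2$) are illustrative assumptions, not estimates from any data set.

```python
def decompose_half_spread(spread, alpha, beta):
    """Split the half-spread into the three components, in price units."""
    if alpha < 0.0 or beta < 0.0 or alpha + beta > 1.0:
        raise ValueError("need alpha >= 0, beta >= 0 and alpha + beta <= 1")
    half = spread / 2.0
    return {
        "adverse_selection": alpha * half,
        "inventory": beta * half,
        "order_processing": (1.0 - alpha - beta) * half,
    }


# Hypothetical inputs chosen only for illustration.
parts = decompose_half_spread(spread=0.10, alpha=0.4, beta=0.2)
for name, value in parts.items():
    print(f"{name:>18}: {value:.4f}")
```

With these inputs the 5-cent half-spread splits into roughly 2 cents of adverse selection, 1 cent of inventory cost, and 2 cents of order processing.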

Estimation

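The model generates a testable relationship between successive price changes $Δ P_{t} = P_{t} - P_{t - 1}$ and trade directions. Differencing the price equation and substituting the efficient-price dynamics, with quotes additionally shifting by $β \cdot \frac{s}{2} \cdot Q_{t - 1}$ in response to inventory, yields a regression of $Δ P_{t}$ on current and lagged trade direction.

In practice, estimation proceeds via GMM or OLS on the regression:

$Δ P_{t} = \frac{s}{2} Q_{t} - (1 - α - β) \frac{s}{2} Q_{t - 1} + ϵ_{t}$

The coefficient on $Q_{t}$ estimates the half-spread $\frac{s}{2}$. The coefficient on $Q_{t - 1}$, divided by $-\frac{s}{2}$, gives the order processing share $1 - α - β$, and therefore the sum $α + β$. This regression alone cannot separate $α$ from $β$, because both move quotes in the same direction after a trade. Huang and Stoll separate them with an extended model that exploits serial correlation in trade direction.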
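
A minimal simulation can check what a regression of this kind recovers. The sketch below assumes i.i.d. trade signs, Gaussian public news, and a quote midpoint that moves by $(α + β) \cdot \frac{s}{2} \cdot Q_{t - 1}$ after each trade (the efficient-price update plus an inventory shift of the same form). The function names `simulate_trades` and `estimate_two_way` and all parameter values are illustrative, not taken from Huang and Stoll or from the companion notebook. Under those assumptions, differencing $P_{t}$ gives $Δ P_{t} = \frac{s}{2} Q_{t} - (1 - α - β) \frac{s}{2} Q_{t - 1} + ϵ_{t}$, so OLS on $Q_{t}$ and $Q_{t - 1}$ should return the half-spread and the sum $α + β$.

```python
import random


def simulate_trades(n, spread, alpha, beta, sigma, seed=0):
    """Simulate trade prices and signs; trade signs are i.i.d. (an assumption)."""
    rng = random.Random(seed)
    half = spread / 2.0
    mid = 100.0      # arbitrary starting quote midpoint
    prev_q = 0.0     # no trade precedes the first observation
    prices, signs = [], []
    for _ in range(n):
        # Midpoint revision: information (alpha) and inventory (beta)
        # response to the previous trade, plus public news.
        mid += (alpha + beta) * half * prev_q + rng.gauss(0.0, sigma)
        q = rng.choice((-1.0, 1.0))
        prices.append(mid + half * q)   # trade at the ask (q=+1) or bid (q=-1)
        signs.append(q)
        prev_q = q
    return prices, signs


def estimate_two_way(prices, signs):
    """OLS of dP_t on Q_t and Q_{t-1}, no intercept, via 2x2 normal equations."""
    dp = [b - a for a, b in zip(prices, prices[1:])]
    x1, x2 = signs[1:], signs[:-1]      # Q_t and Q_{t-1}
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * y for a, y in zip(x1, dp))
    s2y = sum(b * y for b, y in zip(x2, dp))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det  # estimates s/2
    b2 = (s2y * s11 - s1y * s12) / det  # estimates -(1 - alpha - beta) * s/2
    return 2.0 * b1, 1.0 + b2 / b1      # (spread, alpha + beta)


prices, signs = simulate_trades(n=50_000, spread=0.10, alpha=0.4, beta=0.2, sigma=0.01)
spread_hat, lambda_hat = estimate_two_way(prices, signs)
print(f"estimated spread: {spread_hat:.4f}, estimated alpha + beta: {lambda_hat:.3f}")
```

The estimates should land close to the simulated spread of 0.10 and $α + β = 0.6$. Real tick data adds complications this sketch ignores, such as trade-sign misclassification, price discreteness, and serially correlated order flow, which is one reason GMM is often preferred.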

Typical empirical findings for US equities (pre-decimalization):

Adverse selection: 30-50% of the spread
Inventory: 10-30%
Order processing: 20-40%

Post-decimalization and with electronic trading, order processing costs collapsed. Adverse selection now dominates for most liquid names.

TradFi vs. DeFi: A Comparative Table

The three components manifest differently across market structures:

Component	TradFi Manifestation	DeFi Manifestation
Adverse selection ($α$)	Informed traders (hedge funds, prop desks) trade on material non-public information or superior models. Market makers detect via flow toxicity metrics (VPIN, order flow imbalance).	MEV bots, sandwich attackers, and token insiders exploit latency or privileged information. LPs cannot adjust quotes before informed trades execute. Particularly severe on AMMs with no ability to discriminate.
Inventory risk ($β$)	Market maker accumulates position, hedges with correlated instruments, adjusts quotes per Ho-Stoll. Risk bounded by position limits and hedging.	LP reserves shift along the bonding curve. No active hedging unless LP runs off-chain strategy. “Impermanent loss” is the realized cost of unmanaged inventory risk. Concentrated liquidity (Uniswap v3) introduces range-bound inventory management.
Order processing ($1 - α - β$)	Exchange fees, clearing costs, technology infrastructure, regulatory compliance (SEC, FINRA). Declining over time due to electronic trading.	Gas fees, protocol swap fees, bridging costs. Variable and sometimes dominant (Ethereum L1 gas during congestion). L2s and alt-L1s compress this component.
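
The impermanent loss mentioned in the inventory row can be quantified in the simplest case. The sketch below assumes a fee-less constant-product pool ($x \cdot y = k$, Uniswap v2 style), which is narrower than the general bonding-curve setting in the table; the function name `impermanent_loss` is illustrative. For a price ratio $r$ between exit and entry, the LP position is worth $2\sqrt{r} / (1 + r)$ of the value of simply holding the two assets.

```python
import math


def impermanent_loss(price_ratio):
    """LP value relative to holding, minus 1, for a fee-less constant-product pool."""
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return 2.0 * math.sqrt(price_ratio) / (1.0 + price_ratio) - 1.0


# Price ratios are hypothetical; fees earned by the LP are ignored.
for r in (0.5, 1.0, 1.25, 2.0, 4.0):
    print(f"price ratio {r:>4}: {impermanent_loss(r):+.2%}")
```

The loss is zero only when the price is unchanged and is symmetric in $r$ and $1/r$: the LP ends up holding more of whichever asset fell in relative value, which is the unhedged inventory exposure the table describes.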

Notable Asymmetries

Quote adjustment speed: A TradFi market maker updates quotes in microseconds after observing flow. An AMM LP must submit a separate transaction to reposition, competing for the same block space as the informed traders. This latency asymmetry massively inflates the adverse selection component on-chain.

Hedging: TradFi makers hedge continuously in correlated markets. AMM LPs have no native hedging mechanism — they must go off-chain or use separate DeFi protocols (perpetuals, options) to manage inventory risk.

Fee structure: TradFi spreads are endogenous (the maker chooses them). AMM fees are exogenous (set by governance or pool creation). This means AMM fees cannot adapt to changing adverse selection conditions within a pool’s lifetime — though Uniswap v4’s hook system begins to address this.

Practical Application

Spread decomposition is not just academic. It directly informs:

Execution quality measurement: regulators and institutional desks use adverse selection estimates to evaluate broker performance
Market maker strategy: knowing which component dominates tells you whether to improve hedging ($β$) or flow selection ($α$)
Protocol design: DeFi protocols designing fee structures should estimate the adverse selection share — a flat fee that covers order processing but not adverse selection will bleed LP capital
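
The protocol-design point can be illustrated with a back-of-the-envelope check. The sketch below is a rough accounting under stated assumptions, not a pricing model: it treats a reference half-spread (in basis points) and its estimated shares $α$ and $β$ as per-trade cost estimates for the LP, ignores inventory cost, and compares them with a flat fee. The function name `lp_net_margin_bps` and all numbers are hypothetical.

```python
def lp_net_margin_bps(fee_bps, half_spread_bps, alpha, beta):
    """Flat fee minus estimated adverse-selection and order-processing costs (bps)."""
    adverse_selection = alpha * half_spread_bps
    order_processing = (1.0 - alpha - beta) * half_spread_bps
    return fee_bps - adverse_selection - order_processing


# Hypothetical asset: 10 bps half-spread, 60% adverse selection, 10% inventory.
half_spread_bps, alpha, beta = 10.0, 0.6, 0.1

# A fee sized to cover order processing only.
processing_only_fee = (1.0 - alpha - beta) * half_spread_bps
margin_low_fee = lp_net_margin_bps(processing_only_fee, half_spread_bps, alpha, beta)

# A fee sized to cover order processing and adverse selection.
margin_full_fee = lp_net_margin_bps(9.0, half_spread_bps, alpha, beta)

print(f"processing-only fee: {margin_low_fee:+.2f} bps per unit traded")
print(f"fee covering both:   {margin_full_fee:+.2f} bps per unit traded")
```

With these inputs the processing-only fee leaves the LP about 6 bps short on every unit traded, which is the adverse selection cost it fails to cover.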

Connecting the Module

This article synthesizes the theoretical models into an empirical toolkit:

ho-stoll-inventory-model provides the theory behind the $β$ component
glosten-milgrom-model and kyle-lambda provide the theory behind the $α$ component
trading-fundamentals introduced the spread as the price of immediacy — now we know that “immediacy” bundles three distinct costs
order-books-and-venues showed where these costs manifest across different venue types

The decomposition closes the loop: we started with “why do spreads exist?” and now have both theoretical models and an empirical method to answer “how much of the spread is due to each cause?”

Companion notebook: notebooks/market-microstructure/06-spread-decomposition.py — estimate Huang-Stoll components from tick data, compare decompositions across asset classes, visualize the TradFi vs. DeFi comparison with Altair.

Questions to sit with:

If adverse selection dominates the spread for a given asset, what does that imply about the profitability of passive market making on that asset?
Gas fees on Ethereum L1 can exceed the swap fee on Uniswap. When order processing costs dominate, does the adverse selection component still matter for LP returns?
How would you design a dynamic fee mechanism for an AMM that adjusts fees based on real-time estimates of the adverse selection component?
