Bigger Feature Spaces Uncover Sparse Pricing Patterns in Asset Pricing

Key takeaways

Capacity sparsity and factor sparsity are different
Bigger feature spaces can reveal sparse pricing patterns
Basis pursuit keeps the price-relevant few
The gain comes from discovery, not factor hoarding

In asset pricing, bigger models are often treated like heavier suitcases: they seem harder to carry and easier to overpack. This paper argues the real payoff from complexity is not keeping more factors, but making room to find a sparse pattern in prices. The authors separate two ideas: capacity sparsity, meaning how large the candidate feature space is, and factor sparsity, meaning that only a few risks may actually matter. They revisit the benchmark design of Didisheim et al. (2025) and push it into higher-complexity settings. There, nonlinear feature expansions paired with basis pursuit, a sparse-selection method, produce portfolios whose out-of-sample performance beats ridgeless benchmarks beyond a critical complexity threshold. The message is sharp: the virtue of complexity in asset pricing comes from expanding the space where a sparse structure of priced risks can be discovered, not from simply retaining more factors.

Didisheim et al. (2025) built a benchmark around large factor sets. That setup invites a simple guess. More inputs should mean more noise. This study asks a different question. What if the wider menu helps the model find a smaller truth? That sounds odd at first. But the same pattern shows up in daily life. A wider search can make the right file easier to spot. In asset pricing, the prize is a sparse set of priced risks. Those are the few signals that still matter after the clutter. The bigger space may not add answers. It may help the right answers stand out.

Two kinds of sparsity

The core split is simple. Capacity sparsity is about the size of the candidate feature space. Factor sparsity is about the number of priced risks that stay in the final story. The study says these are partners, not rivals. As the candidate space grows, nonlinear feature expansions add more ways to search. These are new features that bend beyond straight lines. Basis pursuit, a sparse-selection method, then keeps only a few useful terms. In the higher-complexity setting, that mix produces portfolios that beat ridgeless benchmarks on new data once complexity passes a critical point. Ridgeless benchmarks are ridge regressions with the penalty taken to zero: they keep every candidate feature in play and set no weight exactly to zero. The big gain is not from keeping more factors. It comes from finding a sparse pattern inside a larger room.
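For readers who want the contrast in symbols, here is a hedged sketch of the two estimators. The notation is an assumption made for illustration and is not taken from the paper: S is a T-by-P matrix of candidate features with more features than observations (P > T) and full row rank, y is the target vector, and w is the vector of weights.

```latex
\begin{align*}
\hat{w}_{\mathrm{ridgeless}}
  &= \lim_{\lambda \to 0^{+}} \left(S^{\top} S + \lambda I\right)^{-1} S^{\top} y
   = \arg\min_{w \in \mathbb{R}^{P}} \|w\|_{2}
     \quad \text{subject to } S w = y, \\
\hat{w}_{\mathrm{BP}}
  &\in \arg\min_{w \in \mathbb{R}^{P}} \|w\|_{1}
     \quad \text{subject to } S w = y.
\end{align*}
```

Both estimators fit the data exactly. The difference is the norm being minimized: the 2-norm spreads weight across all P features, while the 1-norm tends to concentrate weight on a few of them, which is the sense in which basis pursuit is sparse.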

How the search works

Basis pursuit does the sorting. It starts with many candidate features. Then it looks for the fit with the smallest total absolute weight, which tends to leave only a few features switched on. The nonlinear feature expansions widen that candidate set first. They add curved links, not just straight ones. That matters because a sparse risk pattern can hide in plain sight. The comparison group is the ridgeless benchmark. That baseline keeps every feature in the model and sets no weight exactly to zero. The setup asks one clean question. Does better test performance come from keeping more factors alive? Or does it come from giving sparsity more room to work?

The Didisheim et al. (2025) design gives the benchmark starting point.
Nonlinear feature expansions widen the candidate menu.
Basis pursuit keeps the pricing story sparse.
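The ingredients above can be illustrated with a small synthetic sketch. It does not reproduce the Didisheim et al. (2025) design or the paper's data. Everything in it is an assumption made for illustration: the sine and cosine expansion, the sizes T, d and P, the helper names expand, ridgeless, basis_pursuit and active, and the three-feature "true" pattern. Basis pursuit is written here as a linear program and solved with SciPy's linprog.

```python
import numpy as np
from scipy.optimize import linprog


def expand(X, W):
    """Hypothetical nonlinear expansion: sines and cosines of random projections."""
    Z = X @ W
    return np.hstack([np.sin(Z), np.cos(Z)])


def ridgeless(S, y):
    """Minimum-l2-norm interpolator (ridge with the penalty taken to zero)."""
    return np.linalg.pinv(S) @ y


def basis_pursuit(S, y):
    """Minimum-l1-norm interpolator, solved as a linear program.

    Split w = u - v with u, v >= 0 and minimize sum(u) + sum(v)
    subject to S @ (u - v) = y.
    """
    P = S.shape[1]
    res = linprog(
        c=np.ones(2 * P),
        A_eq=np.hstack([S, -S]),
        b_eq=y,
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:P] - res.x[P:]


def active(w, tol=1e-6):
    """Count the weights that are not numerically zero."""
    return int(np.sum(np.abs(w) > tol))


rng = np.random.default_rng(0)
T, d, P = 40, 5, 120                 # observations, raw signals, expanded features
X = rng.normal(size=(T, d))          # synthetic raw signals (not real data)
W = rng.normal(size=(d, P // 2))     # random projection directions
S = expand(X, W)                     # wide candidate feature space, P > T

w_true = np.zeros(P)                 # assumed sparse pattern: three active features
w_true[[3, 47, 90]] = [1.0, -0.5, 0.8]
y = S @ w_true

w_rl = ridgeless(S, y)
w_bp = basis_pursuit(S, y)

print("active weights, ridgeless:    ", active(w_rl))
print("active weights, basis pursuit:", active(w_bp))
print("l1 norm, ridgeless:    ", np.abs(w_rl).sum())
print("l1 norm, basis pursuit:", np.abs(w_bp).sum())
```

Both estimators fit this in-sample data exactly, so the comparison is only about how many features carry weight. The sketch says nothing about out-of-sample portfolio performance, which is what the study actually measures.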

“The evidence shows that the gains from complexity arise not from retaining more factors, but from enlarging the space from which a sparse structure of priced risks can be identified.”

From the abstract

Why the room matters

That result changes the old story about complex models. Bigger feature spaces are not useful because they hold more signals. They are useful because they widen the hunt for a sparse rule. The introduction points to a tension between richer feature sets and lean models. This study turns that tension into teamwork. Sparse selection still matters. But it needs a rich enough space to search. That is why compressing many signals too early can miss useful low-variance directions. Those are the small-moving patterns that may still carry pricing clues. Complexity is not the end goal. It is the stage on which sparsity can act.

What this changes

For portfolio builders, the message is practical. Do not treat size and simplicity as a zero-sum trade. Build a wider menu first. Then let a sparse method choose the few risks that still price assets. That order matters. It says the value of complexity comes from discovery, not hoarding. It also gives a reason to keep using shrinkage, the habit of pulling weak weights toward zero. Shrinkage is not a rival to complexity here. It is the tool that turns a large search space into a lean result. In short, bigger models can still end up simple. They just need room to become simple in the right way.

What to test next

The next test is clear. Push the same benchmark design into still larger feature spaces. Then check whether basis pursuit keeps finding sparse priced risks. If it does, bigger models stop being clutter machines. They become searchlights. That would make one lesson hard to ignore. The win from complexity is not a long list of surviving factors. It is the chance to uncover a short list of real ones. This study's sharpest surprise holds up only if that pattern stays true as the search gets wider.

