Outliers Can Make Subsample Estimation Inconsistent in Dynamic Time Series

Key takeaways

Outlier removal can fail in dynamic time series
Residual filters can spread contamination forward
Pointwise subsampling clashes with that spread
Patch removal restores consistency without contamination models

If you remove the bad data points from a time series, you might expect the analysis to recover the truth. This paper shows that expectation can fail in dynamic time series models. Even when the contamination locations are known perfectly, deleting the contaminated observations does not restore the clean-data objective, because the contamination keeps spreading through the residual filter and warps the estimation criterion. The result is stark: subsample-based estimators are generically inconsistent for the parameter of the uncontaminated process. The authors describe this as a structural clash between pointwise subsampling and residual propagation. To fix it, they introduce a patch removal operator, a way of transforming index sets so the estimator respects how contamination moves through the model. Under general high-level conditions, this new transformation leaves the estimator asymptotically unchanged when the data are clean, while restoring consistency under contamination. The approach applies to a broad class of residual-based estimators and does not require modelling the contamination process.

A single bad day in a stock chart can stain the whole line. If you have ever deleted an outlier from a spreadsheet, the instinct feels safe. This work shows that safety can be an illusion. In a dynamic time series, old values feed into new errors. That means one contaminated point can keep changing the residuals that drive the fit. Even perfect knowledge of the bad dates may not save the clean answer. The surprise is simple. Removing the bad rows does not always restore the model you wanted. The result matters whenever the goal is the hidden process, not the messy record. That is the usual goal in economics and finance.

Why deleting bad rows fails

At the center of the failure is the residual filter. A residual is the part left over after the model explains the data. In many dynamic models, that filter uses past observations. So a contaminated value can leak into later residuals, even after the original row is gone. The estimator then works on a warped target. Even with oracle knowledge, meaning perfect knowledge of the bad dates, deleting contaminated observations does not restore the uncontaminated objective. Subsample-based estimators then usually miss the clean target. In plain terms, they keep aiming at the wrong number. The clash is between pointwise subsampling and residual propagation. The damage moves, but the deletion does not. That is why the failure is structural, not a tuning mistake. Changing the sample split does not fix the logic of the fit.
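A small simulation can make the leak concrete. The sketch below is a toy illustration, not the paper's setup: it assumes an AR(1) model, additive outliers of size 10 on roughly 5% of dates, and a least-squares fit. The function names (`simulate_ar1`, `ar1_least_squares`) and all parameter values are hypothetical choices made for the example. The residual at date t reads the value at date t-1, so dropping only the flagged dates still leaves contaminated values inside the residuals that remain.

```python
import numpy as np

def simulate_ar1(n, phi, rng):
    """Simulate a clean AR(1) series x[t] = phi * x[t-1] + eps[t]."""
    x = np.zeros(n)
    eps = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x

def ar1_least_squares(y, keep):
    """Least-squares AR(1) slope using only residual dates t with keep[t] True.

    The residual at date t is y[t] - phi * y[t-1], so it still reads y[t-1]
    even when date t-1 itself has been dropped from the sum.
    """
    t = np.arange(1, len(y))
    t = t[keep[1:]]
    return float(np.sum(y[t] * y[t - 1]) / np.sum(y[t - 1] ** 2))

rng = np.random.default_rng(0)
n, phi = 50_000, 0.5          # assumed toy values
x = simulate_ar1(n, phi, rng)

# Additive outliers at known (oracle) dates: about 5% of points shifted by 10.
bad = rng.random(n) < 0.05
bad[0] = False
y = x + 10.0 * bad

phi_clean = ar1_least_squares(x, np.ones(n, dtype=bool))  # fit on clean data
phi_pointwise = ar1_least_squares(y, ~bad)                # drop flagged dates only

print(f"clean fit:       {phi_clean:.3f}")
print(f"pointwise delete: {phi_pointwise:.3f}")
```

Under these assumptions the pointwise fit lands well below the true value of 0.5, even though every bad date was identified correctly. The flagged rows are gone, but their values still sit in the lagged term of the residuals that follow them.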

How the damage travels

The fix is a patch removal operator. A patch is the whole zone affected by one bad point, not just the point itself. The operator changes the index set, meaning the list of time points the estimator uses. It removes the bad point and the dates that still carry its effect through the residual filter. With clean data, the effect of this change fades away as the sample grows, so the estimator stays asymptotically unchanged. Asymptotically means in the long run, with large samples. Under contamination, the same rule lines up with the way damage spreads. That restores consistency, which means the estimate converges to the clean target as the sample grows.
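The index-set idea can be sketched in a few lines. This is an illustrative guess at the mechanics, not the authors' operator: `patch_removal` and its `reach` parameter are hypothetical names, and the toy setting is again an assumed AR(1) model with additive outliers. In that setting the residual reads exactly one lag, so a reach of one date is enough. Filters with longer memory would presumably need wider patches, and the summary does not say how the width should be chosen.

```python
import numpy as np

def patch_removal(keep, reach):
    """Return a keep-mask that drops each flagged date and the next `reach` dates."""
    keep = np.asarray(keep, dtype=bool)
    bad = ~keep
    patched = bad.copy()
    for lag in range(1, reach + 1):
        # A date is dropped if the date `lag` steps earlier was flagged.
        patched[lag:] |= bad[:-lag]
    return ~patched

def ar1_least_squares(y, keep):
    """Least-squares AR(1) slope using only residual dates t with keep[t] True."""
    t = np.arange(1, len(y))
    t = t[keep[1:]]
    return float(np.sum(y[t] * y[t - 1]) / np.sum(y[t - 1] ** 2))

rng = np.random.default_rng(0)
n, phi = 50_000, 0.5          # assumed toy values

# Clean AR(1) series, then additive outliers at known dates.
x = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]
bad = rng.random(n) < 0.05
bad[0] = False
y = x + 10.0 * bad

pointwise_keep = ~bad
patch_keep = patch_removal(pointwise_keep, reach=1)

phi_pointwise = ar1_least_squares(y, pointwise_keep)
phi_patch = ar1_least_squares(y, patch_keep)

print(f"pointwise delete: {phi_pointwise:.3f}")
print(f"patch removal:    {phi_patch:.3f}")
```

In this toy case the patched fit comes back close to 0.5 while the pointwise fit does not. The cost is a smaller sample, since each flagged date now removes its neighbour as well.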

What outliers can imitate

Outliers can do more than stand out. They can fake volatility clustering, which means bunches of big moves that are not really there. They can also fake structural breaks, regime switching, and persistence. Persistence means shocks seem to linger for a very long time. In the extreme, unit-root-like case, a shock never fades and the series does not drift back to its old level. Those are not small errors. They are false stories about how the data behave. The introduction treats that risk as a core reason to care about robust methods.

Outliers can fake volatility clustering by making calm data look jumpy.
Outliers can fake structural breaks by creating a false shift in level.
Outliers can fake regime switching by making one state look like many.
Outliers can fake persistence by making the series look close to a unit root.

“Even under oracle knowledge of contamination locations, removing contaminated observations does not restore the uncontaminated objective.”

From the abstract

Why this changes robust time-series work

This matters because many robust tools still think in single rows. The new result says that logic can miss the real target. The target is not just the flagged point. It is every point touched by its trail. The patch removal operator makes that trail visible in the index set. It also avoids the need to model the contamination process itself. That is a big deal when the dirty process is unknown. The method reaches a broad class of residual-based estimators, which are built from leftover errors. It gives them a way to stay faithful to the clean-data parameter.

What comes next

The surprise at the start still stands. In dynamic data, the damage is not just the bad point. It is the trail it leaves behind. That makes pointwise delete-and-fit workflows brittle. The patch idea gives a cleaner map. It asks the cleanup step to follow propagation, not just flags. The next test is whether the same rule stays practical across other residual-based estimators with different filters. If it does, many familiar outlier fixes will need a new default. That is the concrete consequence of the work. A bad point is no longer the whole story. The trail matters too.
