Validation & robustness

Tell skill from luck — overfitting, PBO, Monte Carlo, stress, and tail risk.

Overfitting & PBO

A strategy is overfit when it is tuned so tightly to past data that it fits noise rather than signal: it looks brilliant in the backtest and falls apart live. PBO (Probability of Backtest Overfitting) estimates the chance your best configuration is overfit. In its standard definition, that is the probability that the configuration ranked best in-sample performs below the median of all tested configurations out-of-sample. It ranges from 0 to 1, and lower is better. Use it to discount a backtest that looks too clean. Walk-forward testing is the front-line defence; see the Walk-forward page.
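
To make the idea concrete, here is a minimal sketch of one common PBO estimator, combinatorially symmetric cross-validation (CSCV). It is an illustration under stated assumptions, not necessarily how this product computes PBO: the function names, the use of a simple mean/standard-deviation Sharpe ratio as the score, and the default of 8 blocks are all choices made for the example.

```python
import itertools
import math
import random
import statistics


def sharpe(returns):
    """Mean over standard deviation; 0.0 when the series has no variation."""
    sd = statistics.pstdev(returns)
    return statistics.fmean(returns) / sd if sd > 0 else 0.0


def pbo_cscv(returns_by_config, n_blocks=8):
    """Estimate PBO from per-period returns, one equal-length series per configuration.

    n_blocks must be even. Periods that do not fill a whole block are dropped.
    """
    n_configs = len(returns_by_config)
    block_len = len(returns_by_config[0]) // n_blocks
    blocks = [range(b * block_len, (b + 1) * block_len) for b in range(n_blocks)]

    logits = []
    # Every way of using half the blocks as in-sample and the other half as out-of-sample.
    for in_sample in itertools.combinations(range(n_blocks), n_blocks // 2):
        is_idx = [i for b in in_sample for i in blocks[b]]
        oos_idx = [i for b in range(n_blocks) if b not in in_sample for i in blocks[b]]

        is_scores = [sharpe([r[i] for i in is_idx]) for r in returns_by_config]
        oos_scores = [sharpe([r[i] for i in oos_idx]) for r in returns_by_config]

        best = max(range(n_configs), key=is_scores.__getitem__)
        # Out-of-sample rank of the in-sample winner (1 = worst, n_configs = best).
        rank = sum(score <= oos_scores[best] for score in oos_scores)
        omega = rank / (n_configs + 1)
        logits.append(math.log(omega / (1 - omega)))

    # Share of splits where the in-sample winner lands at or below the out-of-sample median.
    return sum(value <= 0 for value in logits) / len(logits)


# Illustrative data only: nine pure-noise configurations plus one with a steady edge.
rng = random.Random(1)
noise = [[rng.gauss(0.0, 0.01) for _ in range(160)] for _ in range(9)]
steady = [0.01 + (0.001 if i % 2 else -0.001) for i in range(160)]

print(pbo_cscv(noise))             # candidates with no real edge
print(pbo_cscv(noise + [steady]))  # 0.0: the steady config wins both in and out of sample
```

The key design point is that the winner is picked on one half of the data and judged on the other half, across many different halves, so a configuration that only won by luck tends to fall back toward the middle of the pack.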

Monte Carlo

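Monte Carlo resamples and reshuffles your trade outcomes thousands of times to produce a distribution of results instead of one number. Reshuffling keeps the same trades but changes their order, which changes the drawdowns along the way; resampling draws trades with replacement, so the final result varies too. The output shows the range of equity curves and drawdowns you might realistically have seen, so you can judge a strategy by its spread of outcomes, not just the single lucky (or unlucky) path it actually took.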
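
A minimal sketch of the idea, assuming trade outcomes are available as a plain list of per-trade P&L values. The function names, the default of 5000 paths, and the nearest-rank percentile helper are hypothetical choices for the example, not the product's implementation.

```python
import random


def max_drawdown(pnl_sequence):
    """Largest peak-to-trough drop of the cumulative P&L curve, as a positive number."""
    equity = peak = worst = 0.0
    for pnl in pnl_sequence:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo(trade_pnls, n_paths=5000, resample=True, seed=0):
    """Return (final_pnls, drawdowns), each sorted ascending across simulated paths.

    resample=True draws trades with replacement (bootstrap), so totals vary.
    resample=False only reshuffles the order, so totals stay fixed and only the path changes.
    """
    rng = random.Random(seed)
    finals, drawdowns = [], []
    for _ in range(n_paths):
        if resample:
            path = rng.choices(trade_pnls, k=len(trade_pnls))
        else:
            path = rng.sample(trade_pnls, k=len(trade_pnls))
        finals.append(sum(path))
        drawdowns.append(max_drawdown(path))
    return sorted(finals), sorted(drawdowns)


def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, q in [0, 100]."""
    index = min(len(sorted_values) - 1, int(q / 100 * len(sorted_values)))
    return sorted_values[index]


# Made-up trade results, for illustration only.
trades = [100, -50, 30, -80, 60, 45, -20, 70, -35, 15]
finals, drawdowns = monte_carlo(trades)
print("5th percentile final P&L:", percentile(finals, 5))
print("95th percentile max drawdown:", percentile(drawdowns, 95))
```

Reading the 5th percentile of final P&L alongside the 95th percentile of drawdown gives a rough "bad but plausible" case to plan around, rather than the single path the backtest happened to take.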

Stress tests

Stress tests replay the strategy through historically hard regimes — crashes, chop, sharp rallies — and apply adversarial shocks like extra slippage and fill rejections. The question is not how it does on an average day, but whether it survives the worst ones.
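
The adversarial-shock half of this can be sketched in a few lines. The example assumes each trade is a hypothetical (pnl, notional) pair, charges extra slippage in basis points of notional, and models a fill rejection as a trade that never happens. These are simplifying assumptions for illustration; a real replay would apply the shocks order by order inside the backtest.

```python
import random


def stress_trades(trades, extra_slippage_bps=5.0, reject_prob=0.1, seed=0):
    """Apply adverse shocks to a list of (pnl, notional) trades and return stressed P&Ls.

    extra_slippage_bps: added cost per trade, in basis points of notional.
    reject_prob: chance that a trade's order is rejected and the trade is skipped.
    """
    rng = random.Random(seed)
    stressed = []
    for pnl, notional in trades:
        if rng.random() < reject_prob:
            continue  # fill rejected: the trade never happens
        cost = notional * extra_slippage_bps / 10_000
        stressed.append(pnl - cost)
    return stressed


# Made-up trades, for illustration only.
trades = [(100.0, 10_000.0), (-50.0, 10_000.0), (30.0, 5_000.0), (80.0, 20_000.0)]
baseline = sum(pnl for pnl, _ in trades)
shocked = sum(stress_trades(trades, extra_slippage_bps=10.0, reject_prob=0.2))
print("baseline P&L:", baseline)
print("stressed P&L:", shocked)
```

Running the same strategy across a grid of slippage and rejection settings shows how quickly the edge erodes, which is usually more informative than a single stressed number.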

VaR & CVaR

Value at Risk (VaR) is the loss threshold that a given share of periods stays within; for example, a 95% VaR of X means roughly 95% of periods lose less than X. It is an estimate rather than a cap: the remaining 5% of periods lose more. Conditional VaR (CVaR) is the average loss on those bad periods beyond the threshold, which measures the size of the tail. Together they describe not just how often you lose, but how bad the rare losses get.
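
A minimal sketch of the historical (empirical) version of both measures, assuming a plain list of per-period returns. The function name is hypothetical, and including the boundary observation in the tail is one convention among several; other estimators (parametric, interpolated quantiles) will give slightly different numbers.

```python
import math


def var_cvar(returns, confidence=0.95):
    """Historical VaR and CVaR, both reported as positive loss amounts."""
    losses = sorted(-r for r in returns)  # losses as positive numbers, ascending
    # Position of the loss that `confidence` of the periods stay at or below.
    cutoff = min(len(losses) - 1, max(0, math.ceil(confidence * len(losses)) - 1))
    var = losses[cutoff]
    tail = losses[cutoff:]  # assumption: the tail includes the VaR observation itself
    cvar = sum(tail) / len(tail)
    return var, cvar


# Made-up returns: losses of 1, 2, ..., 100 units, for illustration only.
returns = [-float(i) for i in range(1, 101)]
var, cvar = var_cvar(returns, confidence=0.95)
print(var, cvar)  # 95.0 97.5
```

CVaR is always at least as large as VaR, and the gap between them is a quick read on how heavy the tail is: a CVaR far above VaR means the rare bad periods are much worse than the threshold suggests.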