Backtesting Without Lying To Yourself
The pattern is universal. A trader spends weeks optimizing a strategy on historical data. The equity curve goes up and to the right. Win rate is 72%. They go live confidently. Six weeks later, the account is down 18% and the strategy has been "scrapped".
The strategy was never the problem. The backtest was.
This article walks through the seven specific biases that make backtests look profitable when they should not. Avoiding even three of these will save you years of false confidence.
1. Look-Ahead Bias
Using information in the backtest that you wouldn't have had in real time.
Common form: Computing today's signal using today's close — then assuming you would have entered at today's open.
Other forms:
- Using next-bar's data in current-bar's signal (off-by-one indexing error)
- Using volume data that wasn't published yet at the time of trade
- Indexing fundamental data by the fiscal period it covers instead of the date it was published (e.g., Q3 earnings published Oct 28 used in trades on Oct 15)
Fix: Always lag every data point by at least one bar. Always check timestamps.
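A minimal sketch of the one-bar lag, assuming a toy "close above its moving average" rule. The function names and the price list are made up for illustration. The point is only the final shift: the position held during bar t is decided from data available at the close of bar t-1.

```python
def sma(values, window):
    """Simple moving average; None until `window` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def lagged_positions(closes, window=3):
    """Return 0/1 positions aligned with `closes`.

    position[t] is the position held during bar t, decided only from
    closes up to and including bar t-1.
    """
    avg = sma(closes, window)
    # Raw signal at bar t uses the close of bar t, so it is only known
    # once bar t has finished.
    raw = [0 if a is None else int(c > a) for c, a in zip(closes, avg)]
    # Shift by one bar: act on the signal in the NEXT bar.
    return [0] + raw[:-1]


closes = [10, 11, 12, 11, 13, 14, 12]  # hypothetical prices
positions = lagged_positions(closes, window=3)
print(positions)
```

Dropping the final shift and trading `raw` directly is exactly the "today's close, today's open" mistake described above.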
2. Survivorship Bias
Backtesting on a current list of stocks/coins. Every stock that delisted, went bankrupt, or got removed from the index is invisible to your backtest.
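Real impact: A US small-cap "mean reversion" strategy backtested on the current Russell 2000 looks great. The index's actual membership over 20 years includes hundreds of companies that were later delisted, acquired, or went bankrupt, and none of them appear in today's constituent list. The famous collapses (Lehman Brothers, Enron, Wirecard) were large-cap or non-US names, but every index has its own lesser-known equivalents. Real returns: dramatically worse.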
Crypto example: Backtesting "buy the dip on top 100 coins" using today's top 100. Half of 2017's top 100 don't exist anymore.
Fix: Use a survivorship-bias-free historical universe that still contains the delisted names. Norgate Data, Quandl (now Nasdaq Data Link), and CRSP provide these (paid). For crypto, use CoinMarketCap historical snapshots.
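One way to picture a point-in-time universe is a membership table with entry and exit dates, queried by the backtest date. The sketch below uses invented symbols and dates. Real vendors ship this data in their own formats, so treat the structure as an assumption.

```python
from datetime import date

# Hypothetical membership records: (symbol, date added, date removed or None).
MEMBERSHIP = [
    ("AAA", date(2005, 1, 3), None),
    ("BBB", date(2005, 1, 3), date(2009, 3, 16)),  # later delisted
    ("CCC", date(2012, 6, 25), None),
]


def universe_as_of(membership, as_of):
    """Symbols that were members on `as_of`, including later delistings."""
    return sorted(
        sym
        for sym, added, removed in membership
        if added <= as_of and (removed is None or as_of < removed)
    )


then = universe_as_of(MEMBERSHIP, date(2008, 1, 2))
today = universe_as_of(MEMBERSHIP, date(2024, 1, 2))
print(then, today)
```

A backtest that iterates over `today` for a 2008 trade date silently skips BBB, which is the bias in miniature.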
3. Overfitting (Curve-Fitting)
Tuning parameters until the historical equity curve looks perfect. Each tweak makes the strategy more specific to past data, less robust to future data.
Symptom you'll recognize: "RSI 9 made me 23%, RSI 14 made me 18%, RSI 13 made me 27% — I'll use 13."
That 4-point edge of RSI 13 over RSI 9 is almost certainly noise. You've optimized to the specific path of past prices, not to a real edge. Out-of-sample, there is no reason to expect RSI 13 to beat RSI 9.
Fix: Hold out data the optimizer never sees. Optimize on the in-sample period, then test once on the out-of-sample period. Common ratio: 70% in-sample / 30% out-of-sample. Walk-forward analysis repeats this split over rolling windows. If out-of-sample diverges hard from in-sample, you've overfit.
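A rough sketch of how rolling in-sample / out-of-sample windows can be generated. The function name and the window sizes are illustrative assumptions. Each test window starts exactly where its training window ends, so no test bar is visible during optimization.

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield (train_range, test_range) index pairs that roll forward in time."""
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window


windows = list(walk_forward_windows(n_bars=1000, train_size=700, test_size=100))
for train, test in windows:
    print(f"optimize on bars {train.start}-{train.stop - 1}, "
          f"test on bars {test.start}-{test.stop - 1}")
```

With 1,000 bars this produces three folds. Stitching the three test segments together gives an equity curve built entirely from data the optimizer had not seen.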
4. Selection Bias (Cherry-Picking Period)
Backtesting only on a period when the strategy was naturally going to work.
Examples:
- Trend strategy backtested on 2020-2021 (a strongly trending period driven by QE): of course it crushed
- Mean reversion strategy backtested in Q2 2023 (clean range): of course it crushed
- Crypto strategy backtested 2017 only
Fix: Backtest across all market regimes in your dataset. At minimum: 1 bull trend, 1 bear trend, 1 range, 1 crash, 1 sideways grind. If your strategy only works in one regime, that's important to know.
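To see whether a strategy depends on one regime, it can help to break results down by regime label. The sketch below assumes you have already tagged each trade with a regime by hand. The labels and returns are invented, and nothing here detects regimes automatically.

```python
from collections import defaultdict


def return_by_regime(trades):
    """Count trades and sum returns (in %) per regime label.

    `trades` is a list of (regime_label, return_pct) pairs.
    """
    summary = defaultdict(lambda: {"trades": 0, "total_pct": 0.0})
    for regime, ret in trades:
        summary[regime]["trades"] += 1
        summary[regime]["total_pct"] += ret
    return dict(summary)


# Hypothetical, hand-labelled trades.
trades = [
    ("bull", 2.0), ("bull", 1.5), ("range", -1.0),
    ("bear", -2.5), ("range", -0.5), ("bull", 3.0),
]
report = return_by_regime(trades)
for regime, stats in report.items():
    print(regime, stats)
```

In this made-up sample all of the profit comes from the bull regime, which is the kind of dependence the fix above is meant to expose.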
5. Slippage and Commission Ignorance
Most backtests assume:
- You get filled at the exact price the signal triggered
- Zero commission
- Zero spread cost
- Zero overnight financing for swing trades
Real markets (typical retail forex figures):
- Slippage is ~1 pip on liquid pairs, ~5+ pips on exotics
- Commissions: $5-10 per round trip
- Spread: 0.5-3 pips on majors, more on news
- Swap: -$5 to -$20 per night per standard lot
Fix: Subtract realistic trading costs from every trade in your backtest. For a strategy with 10-pip average targets, 1 pip of slippage plus 0.5 pip of spread takes 15% off every winner and adds the same 1.5 pips to every loser.
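The arithmetic can be checked with a few lines. This is a sketch with assumed default costs (0.5 pip spread, 1 pip slippage, no commission). Substitute your broker's actual figures.

```python
def net_pips(gross_pips, spread_pips=0.5, slippage_pips=1.0, commission_pips=0.0):
    """Pips left after per-trade costs. All cost inputs are assumptions."""
    return gross_pips - spread_pips - slippage_pips - commission_pips


def cost_share(target_pips, **costs):
    """Fraction of a winning trade's target consumed by costs."""
    return (target_pips - net_pips(target_pips, **costs)) / target_pips


winner = net_pips(10.0)    # a 10-pip winner after costs
loser = net_pips(-10.0)    # a 10-pip loser after costs
share = cost_share(10.0)   # share of the target eaten by costs
print(winner, loser, share)
```

Costs are charged on losers too, so a strategy that looks like a symmetric 10-pip bet before costs becomes 8.5 pips won against 11.5 pips lost.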
6. Position Sizing Mistakes
Using a fixed position size in backtest, then scaling up in live trading.
A strategy that returns 23% on a $1,000 backtest may return only a fraction of that on a $100,000 account, especially in thin markets, because your orders are now too large to fill at the same prices. Slippage scales nonlinearly with size.
Fix: Backtest at the realistic size you'll trade. If your account is $50,000, simulate $50,000 fills, not $1,000.
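To get a feel for size-dependent slippage, one common stylized assumption is that per-unit slippage grows with the square root of the order's share of daily volume. The sketch below uses that assumption with a placeholder coefficient. It is not calibrated to any market and is not claimed to be the model the figures above came from.

```python
import math


def slippage_bps(order_size, avg_daily_volume, impact_coeff=10.0):
    """Illustrative square-root impact model, in basis points.

    `impact_coeff` is a made-up placeholder, not a calibrated value.
    """
    participation = order_size / avg_daily_volume
    return impact_coeff * math.sqrt(participation)


# Same hypothetical market, two order sizes (units of the traded asset).
small = slippage_bps(1_000, 1_000_000)
large = slippage_bps(100_000, 1_000_000)
print(round(small, 3), round(large, 3))
```

Under this assumption a 100x larger order pays about 10x the slippage per unit, so the total cost in currency terms grows about 1,000x. A backtest with a fixed per-trade slippage figure cannot capture that.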
7. Forward Bias In Stop / Target Selection
You design the strategy knowing what worked. "I'll use a 2× ATR stop because I noticed losses cluster around 1.8× ATR." You've now overfit your stop to historical data.
Fix: Set stop and target rules before optimizing. Then test. If the strategy works, it works. If you need to adjust, restart the test from scratch with the new rules — don't compound iterations on the same data.
A Sane Backtest Workflow
- Hypothesis first. Write the rule in 1 sentence: "Buy when 50 EMA crosses above 200 EMA, sell at 2× ATR target or 1× ATR stop."
- Define dataset. Pick instruments, date range, timeframe. Include at least one full bull/bear cycle.
- Split. 70% in-sample, 30% out-of-sample. Touch ONLY in-sample for optimization.
- Run baseline. Apply rule as-is to in-sample. Don't tweak yet.
- Apply costs. Subtract realistic commission, spread, slippage, swap.
- Check distribution. Look at not just total return, but distribution of trade returns. Are the wins concentrated in 3 monster trades? That's not robust.
- Walk forward. Apply the rule to out-of-sample. Out-of-sample performance should be 75-100% of in-sample. If it's 40%, you've overfit.
- Live demo. Run on a demo account for 30 trades. Did real fills match backtest assumptions?
- Small live size. If demo matches backtest, go live at 25% of intended size. Run another 30 trades.
- Scale. Only after 60+ forward trades (30 demo plus 30 small live) match expectations.
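Two of the checks in this workflow, profit concentration and the out-of-sample ratio, reduce to simple arithmetic. The sketch below uses invented trade results and helper names. Compare returns over periods of equal length (or annualize them) before taking the ratio.

```python
def top_n_profit_share(trade_pnls, n=3):
    """Share of total net profit contributed by the n largest trades."""
    total = sum(trade_pnls)
    if total <= 0:
        raise ValueError("total profit must be positive to compute a share")
    top = sorted(trade_pnls, reverse=True)[:n]
    return sum(top) / total


def oos_ratio(in_sample_return, out_of_sample_return):
    """Out-of-sample return as a fraction of in-sample return."""
    return out_of_sample_return / in_sample_return


# Hypothetical per-trade P&L in account currency.
pnls = [500, -100, 40, -60, 450, 30, -80, 400, 20, -50]
concentration = top_n_profit_share(pnls)
ratio = oos_ratio(in_sample_return=20.0, out_of_sample_return=8.0)
print(round(concentration, 2), ratio)
```

A concentration above 1.0, as in this sample, means the strategy loses money without its three best trades. A ratio of 0.4 is the overfit case described in the walk-forward step.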
The Hard Truth
A genuinely profitable strategy with proper risk management produces a 7-25% annualized return with 8-15% drawdowns. Anyone claiming consistent 80% win rates and 5%/month returns is showing you a backtest filled with the biases above.
The good news: a real 15% annualized strategy, with reinvestment, roughly quadruples an account over 10 years (about a 300% gain). That is wealth-building. You don't need fantasy returns.
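The compounding figure is easy to verify. A quick sketch, assuming a hypothetical $10,000 starting balance and a constant 15% return every year with full reinvestment:

```python
def compound(start, annual_return, years):
    """Balance after `years` of reinvested returns at a constant annual rate."""
    return start * (1 + annual_return) ** years


start = 10_000  # hypothetical starting balance
final = compound(start, 0.15, 10)
gain_pct = (final / start - 1) * 100
print(round(final, 2), round(gain_pct, 1))
```

The result is a little over four times the starting balance, a gain of just over 300%. Real returns vary year to year, so this is an upper-level illustration rather than a forecast.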
What Tools Help
- TradingView Strategy Tester: fast for visual ideas, but very easy to fool yourself
- Backtrader (Python): open source; lets you model costs and avoid look-ahead if used correctly, but you still have to supply survivorship-bias-free data yourself
- MT5 Strategy Tester: built-in, MQL5 required
- Amibroker: paid, professional-grade
- QuantConnect and similar hosted platforms (Quantopian itself shut down in 2020): provide survivorship-bias-free equity data and configurable slippage models
The tool matters less than the workflow. Bad workflow + great tool = false confidence.