Backtesting Guide — Principles & MarketBook Replay
This backtesting guide explains why backtesting matters, the core principles you must adopt, common dead-ends teams face, and a reproducible MarketBook replay approach for Betfair-style backtests. Follow the steps and use the included scripts to validate execution fidelity before you pilot live.
Why backtesting matters
Backtesting turns an idea into an experiment you can measure and reason about. Therefore, you must treat backtests as controlled experiments: state assumptions, run reproducible tests, measure sensitivity, and validate execution fidelity. When you follow that approach you reduce the chance that a promising paper result surprises you in production.
Core principles
Reproducibility
Record every run: data version, git SHA, parameter values and random seeds. Save this as a run manifest so you can reproduce results and audit differences later.
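A minimal sketch of writing such a manifest (the write_manifest helper and its field names are illustrative, not a fixed format):

# sketch: persist a run manifest (field names are illustrative)
import json, hashlib, subprocess
from datetime import datetime, timezone
from pathlib import Path

def write_manifest(path, data_version, params, seed, snapshot_files):
    manifest = {
        "created": datetime.now(timezone.utc).isoformat(),
        "git_sha": subprocess.check_output(
            ["git", "rev-parse", "HEAD"]).decode().strip(),
        "data_version": data_version,
        "params": params,
        "seed": seed,
        # checksums catch silently changed input data on later reruns
        "snapshot_checksums": {
            f: hashlib.sha256(Path(f).read_bytes()).hexdigest()
            for f in snapshot_files
        },
    }
    Path(path).write_text(json.dumps(manifest, indent=2))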
Match fidelity to decision
If you are only tuning indicators, minute or hourly bars often suffice. If your decisions concern execution tactics (limit vs market orders, queue position), use ladder-level MarketBook snapshots. Increase fidelity only where it impacts decisions.
Explicit execution model
Simulate fills, partial fills, latency, and fees in code, and write tests covering simple deterministic scenarios so you know your simulator behaves as expected.
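For example, a deterministic scenario test for the ladder walk shown later in this guide (a sketch; it assumes consume_ladder returns the fills plus any unfilled remainder, as in the execution-modelling section below):

# sketch: deterministic scenario test for the fill simulator
def test_market_order_walks_two_levels():
    ladder = [(2.0, 100.0), (2.02, 80.0)]  # best-first (price, size)
    fills, remaining = consume_ladder(ladder, 150.0)
    assert fills == [(2.0, 100.0), (2.02, 50.0)]  # 100 at best, 50 at next level
    assert remaining == 0.0  # fully filled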
Robustness, not just returns
Perturb slippage, latency and parameters, and build a robustness matrix that shows how metrics drift. Prefer strategies that remain stable under realistic perturbations.
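One way to build that matrix is a small grid over perturbations (a sketch; run_backtest and the metric names are stand-ins for your own entry point):

# sketch: robustness matrix over slippage/latency perturbations
import itertools

def robustness_matrix(run_backtest, base_params):
    rows = []
    for slip_ticks, latency_ms in itertools.product([0, 1, 2], [0, 100, 250]):
        metrics = run_backtest(base_params,
                               extra_slippage_ticks=slip_ticks,
                               latency_ms=latency_ms)
        rows.append((slip_ticks, latency_ms, metrics["pnl"], metrics["sharpe"]))
    return rows  # inspect how the metrics drift as perturbations grow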
Common dead-ends and fixes
Optimistic fills
Many guides assume fills at bar-close or midpoint. That assumption inflates returns. Instead, walk the ladder and consume volume at price levels to compute realistic fills.
Overfitting via mass parameter search
Large grid searches over the full dataset fit noise rather than signal. To avoid that, restrict hyperparameter search to training windows and validate on out-of-sample periods (nested validation).
Missing operational gates
A strategy that backtests well can still fail in production if it ships without reconciliation, alerts, and a kill-switch. Include these operational checks as deployment gates; a minimal gate is sketched below.
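A gate can be as simple as a script that refuses to promote a run until every operational check passes (a sketch; the check names are illustrative):

# sketch of a deployment gate: refuse to promote until all checks pass
def deployment_gate(checks):
    failures = [name for name, passed in checks.items() if not passed]
    if failures:
        raise RuntimeError(f"deployment blocked, failed checks: {failures}")

# example: deployment_gate({"reconciliation_tested": True,
#                           "alerts_configured": True,
#                           "kill_switch_tested": False})  # raises, blocking the pilot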
Data & storage
Collect MarketBook snapshots with publishTime (UTC), marketId, runner list, availableToBack and availableToLay arrays, and totalMatched. Store raw snapshots immutably (JSONL or versioned S3) and keep checksums and capture script versions.
{
  "marketId": "1.23456789",
  "publishTime": "2025-01-01T12:00:00.123Z",
  "runners": [
    {
      "selectionId": 12345,
      "availableToBack": [{"price": 2.0, "size": 100.0}],
      "availableToLay": [{"price": 2.02, "size": 80.0}],
      "totalMatched": 1234.5
    }
  ]
}
Validate each snapshot against a JSON Schema before replay. The included ingest script performs schema checks and writes normalized JSONL for deterministic replays.
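If you want to replicate the check outside the ingest script, the jsonschema package covers it (a sketch; marketbook_schema.json is the schema listed under resources below):

# sketch: validate snapshots line-by-line before replay
import json
from jsonschema import validate, ValidationError  # pip install jsonschema

with open("marketbook_schema.json") as fh:
    schema = json.load(fh)

def validate_jsonl(path):
    good, bad = [], []
    with open(path) as fh:
        for n, line in enumerate(fh, 1):
            snap = json.loads(line)
            try:
                validate(instance=snap, schema=schema)
                good.append(snap)
            except ValidationError as exc:
                bad.append((n, exc.message))
    return good, bad  # replay only validated snapshots; log the rest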
Execution modelling
Implement these behaviors in your simulator:
- Market orders: walk opposing ladder and consume volume until filled or exhausted.
- Limit orders: attempt an immediate match at the limit price, then rest any remaining stake in a virtual book (see the limit-order sketch after the code below).
- Partial fills: support remaining quantities and match them on subsequent snapshots.
- Latency: delay submissions by a configurable period to simulate processing and network delay.
- Fees & settlement: apply commission and handle SP/void rules where applicable.
# simplified consume-ladder walk for market orders
def consume_ladder(ladder, qty):
    """Walk a best-first (price, size) ladder, consuming size until qty fills."""
    fills, remaining = [], qty
    for price, avail in ladder:  # ladder sorted best price first
        if remaining <= 0:
            break
        fill = min(remaining, avail)
        fills.append((price, fill))
        remaining -= fill
    return fills, remaining  # non-zero remaining signals a partial fill
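Limit orders extend the same walk: match what you can immediately, then rest the remainder (a sketch for a back order, building on consume_ladder above; the resting_book shape is illustrative):

# sketch: limit back order, matching now and resting the remainder
def submit_limit_back(available_to_back, limit_price, qty, resting_book):
    # a backer can match at the limit price or better (higher odds)
    matchable = [(p, s) for p, s in available_to_back if p >= limit_price]
    fills, remaining = consume_ladder(matchable, qty)
    if remaining > 0:
        # unmatched stake rests as a virtual order, re-checked each snapshot
        resting_book.append({"price": limit_price, "size": remaining})
    return fills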
Validation & walk-forward
Use walk-forward cross-validation: train on past window → test on next window → roll forward. Wrap hyperparameter searches inside the train window and collect test-period metrics across folds. Prefer strategies that perform consistently across many windows and perturbations.
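A rolling split is only a few lines (a sketch; fit and evaluate stand in for your own training and scoring logic):

# sketch: walk-forward evaluation over rolling windows
def walk_forward(history, train_size, test_size, fit, evaluate):
    results, start = [], 0
    while start + train_size + test_size <= len(history):
        train = history[start:start + train_size]
        test = history[start + train_size:start + train_size + test_size]
        params = fit(train)        # hyperparameter search stays inside train
        results.append(evaluate(test, params))
        start += test_size         # roll the window forward
    return results                 # look for consistency across folds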
Practical MarketBook replay pattern
Run this deterministic sequence to replay and simulate fills (a minimal loop is sketched after the list):
- Ingest and validate snapshots (JSON Schema).
- Sort snapshots by publishTime (UTC).
- For each snapshot: match resting limit orders, then process strategy signals and submit orders at that timestamp.
- Record every fill with order_id, timestamp, price and size.
- At settlement apply commission and compute PnL; persist run manifest (git SHA, snapshot list, params).
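A minimal loop that implements the sequence above (a sketch; the strategy and simulator interfaces are illustrative placeholders):

# sketch: deterministic MarketBook replay loop
import json

def replay(snapshot_path, strategy, simulator):
    with open(snapshot_path) as fh:
        snapshots = [json.loads(line) for line in fh]
    # ISO-8601 UTC timestamps sort correctly as strings
    snapshots.sort(key=lambda s: s["publishTime"])
    for snap in snapshots:
        simulator.match_resting_orders(snap)        # resting limits first
        for order in strategy.on_snapshot(snap):    # then new signals
            simulator.submit(order, ts=snap["publishTime"])
    return simulator.fills  # each fill: order_id, timestamp, price, size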
Downloadable example scripts (replayer.py and ingest_marketbook.py) accompany this post and provide a starting implementation you can run locally.
TradingView — when to use it
Use TradingView to iterate on signals quickly and visualise trades, then export the signals and run them against MarketBook replays to evaluate execution. TradingView helps early-stage iteration; MarketBook replay tests execution realism.
Resources & downloads
Included resources (download from the resources page or request a ZIP):
- replayer.py — deterministic MarketBook replay simulator (example).
- ingest_marketbook.py — JSON Schema validator and normalizer.
- marketbook_schema.json — JSON Schema for snapshots.
- BACKTESTING-CHECKLIST-README.md — one-page checklist to gate deployments.
Checklist & deployment gates
- Archive raw snapshots and record checksums.
- Validate snapshots with JSON Schema.
- Version simulator and record git SHA in run manifest.
- Run walk-forward and nested validation.
- Stress-test slippage, latency and data gaps.
- Test kill-switch and reconciliation in staging before pilot.
