Lesson 6 — Backtesting & simulation (Jupyter)
Simulate strategies in Jupyter: load historical ticks/candles, replay data, simulate executions with slippage & partial fills, and compute performance metrics.
Prerequisites
- Lessons 1–5 completed (env, auth, market data, orders, webhooks)
- Jupyter/JupyterLab, pandas, numpy, matplotlib installed
- Historical market data: tick or high-frequency candle files (CSV/Parquet)
Overview
We’ll cover: notebook setup, data loading and normalization, a minimal replay engine, execution simulation (slippage & partial fills), logging trades, and computing metrics (P&L, win rate, drawdown).
1) Jupyter setup
Install or ensure Jupyter is available and create a new notebook for your backtest.
# Install essentials
pip install jupyterlab pandas numpy matplotlib pyarrow
Use Parquet for larger datasets (faster to load and more compact than CSV). Start JupyterLab by running jupyter lab in a terminal.
2) Load and normalize historical data
Example: load tick data (Parquet or CSV) with columns: timestamp, price, size, side, symbol.
import pandas as pd
# read tick-level CSV or parquet
df = pd.read_parquet('ticks.parquet') # or pd.read_csv('ticks.csv', parse_dates=['timestamp'])
df = df.sort_values('timestamp').reset_index(drop=True)
df.head()
Tip: Convert timestamps to pandas datetime64[ns] and ensure monotonic ordering for deterministic replays.
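A minimal sketch of that normalization step, assuming the column names from the example above. The helper name normalize_ticks and the sample rows are hypothetical, and dropping unparseable rows is one possible policy, not the only one.

```python
import pandas as pd

def normalize_ticks(df):
    """Return a copy with parsed timestamps in stable, ascending order."""
    out = df.copy()
    # Parse timestamp strings into pandas datetimes; unparseable values become NaT.
    out['timestamp'] = pd.to_datetime(out['timestamp'], errors='coerce')
    # Drop rows that cannot be replayed (no usable timestamp or price).
    out = out.dropna(subset=['timestamp', 'price'])
    # A stable sort keeps the file order of ticks that share a timestamp.
    out = out.sort_values('timestamp', kind='mergesort').reset_index(drop=True)
    assert out['timestamp'].is_monotonic_increasing
    return out

# Made-up sample rows, out of order and with one bad timestamp.
raw = pd.DataFrame({
    'timestamp': ['2024-01-01 00:00:02', '2024-01-01 00:00:01', 'not-a-time'],
    'price': [101.0, 100.0, 99.0],
    'size': [1.0, 2.0, 3.0],
    'side': ['BUY', 'SELL', 'BUY'],
    'symbol': ['BTC-USD'] * 3,
})
ticks = normalize_ticks(raw)
```

Running the same normalization before every replay keeps two runs over the same file comparable.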
3) Minimal replay engine
A simple row-by-row replay that calls strategy.on_tick on each tick and executes any orders it returns. Keep state small and avoid expensive allocations inside the loop.
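class ReplayEngine:
    def __init__(self, df, strategy, start_cash=1000.0):
        self.df = df
        self.strategy = strategy
        self.cash = start_cash
        self.positions = {}  # symbol -> {'size', 'avg_price'}
        self.trades = []     # executed trades

    def run(self):
        for idx, row in self.df.iterrows():
            tick = row.to_dict()
            orders = self.strategy.on_tick(tick) or []
            for order in orders:
                exec_report = self.execute(order, tick)
                if exec_report['filled'] > 0:
                    self.trades.append(exec_report)
                    self._apply_exec(exec_report)
        return self.trades

    def execute(self, order, tick):
        # naive execution: check tick price vs order price and fill accordingly
        price = tick['price']
        filled, status, avg_price = 0.0, 'NO_FILL', None
        if order.get('type', 'LIMIT').upper() == 'MARKET':
            # assume immediate fill at current tick price (no slippage modeled here)
            filled, status, avg_price = order['size'], 'FILLED', price
        elif order['side'] == 'BUY' and price <= order['price']:
            # limit buy: fills once the market trades at or below the limit
            filled, status, avg_price = order['size'], 'FILLED', order['price']
        elif order['side'] == 'SELL' and price >= order['price']:
            # limit sell: fills once the market trades at or above the limit
            filled, status, avg_price = order['size'], 'FILLED', order['price']
        return {'order': order, 'symbol': order.get('symbol', tick.get('symbol')),
                'filled': filled, 'avg_price': avg_price,
                'status': status, 'timestamp': tick['timestamp']}

    def _apply_exec(self, report):
        # minimal bookkeeping: update cash and the signed position for the symbol
        signed = report['filled'] if report['order']['side'] == 'BUY' else -report['filled']
        self.cash -= signed * report['avg_price']
        pos = self.positions.setdefault(report['symbol'], {'size': 0.0, 'avg_price': 0.0})
        new_size = pos['size'] + signed
        if pos['size'] * signed >= 0 and new_size != 0:
            # opening or adding: blend the average entry price
            pos['avg_price'] = (pos['avg_price'] * abs(pos['size'])
                                + report['avg_price'] * abs(signed)) / abs(new_size)
        elif pos['size'] * new_size < 0:
            # position flipped through zero: the remainder opens at the fill price
            pos['avg_price'] = report['avg_price']
        pos['size'] = new_size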
Replace naive logic with liquidity-aware fills using orderbook snapshots for higher fidelity.
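To make the strategy interface concrete, here is a small sketch of an object the engine could drive: it exposes on_tick(tick) and returns a list of order dicts. The class name SmaCrossStrategy, the window lengths and the sample prices are illustrative assumptions, not part of the lesson's API.

```python
from collections import deque

class SmaCrossStrategy:
    """Hypothetical strategy: MARKET order when the fast SMA crosses the slow SMA."""

    def __init__(self, fast=5, slow=20, size=1.0):
        self.fast, self.slow, self.size = fast, slow, size
        self.prices = deque(maxlen=slow)  # bounded buffer keeps per-tick state small
        self.prev_diff = None

    def on_tick(self, tick):
        self.prices.append(tick['price'])
        if len(self.prices) < self.slow:
            return []  # not enough history yet
        window = list(self.prices)
        diff = sum(window[-self.fast:]) / self.fast - sum(window) / self.slow
        orders = []
        if self.prev_diff is not None:
            if self.prev_diff <= 0 < diff:
                side = 'BUY'    # fast average crossed above the slow average
            elif self.prev_diff >= 0 > diff:
                side = 'SELL'   # fast average crossed below the slow average
            else:
                side = None
            if side:
                orders.append({'type': 'MARKET', 'side': side, 'size': self.size,
                               'symbol': tick.get('symbol')})
        self.prev_diff = diff
        return orders

# Made-up price path, replayed by hand instead of through the engine.
strategy = SmaCrossStrategy(fast=2, slow=4)
prices = [10, 10, 10, 10, 9, 8, 9, 11, 12]
signals = []
for i, p in enumerate(prices):
    for order in strategy.on_tick({'timestamp': i, 'price': p, 'symbol': 'BTC-USD'}):
        signals.append((i, order['side']))
# signals == [(4, 'SELL'), (7, 'BUY')]
```

The strategy only reads prices it has already seen, so it cannot peek ahead of the replay clock.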
4) Simulate slippage & partial fills
Use orderbook snapshots to simulate the depth you consume and compute average fill price and possible partial fills.
def simulate_fill_from_orderbook(orderbook, side, size):
    """
    orderbook: {'bids': [[price, size], ...], 'asks': [[price, size], ...]}
               (levels are expected best price first)
    side: 'BUY' or 'SELL'
    size: desired size (float)
    returns: (filled_size, avg_price)
    """
    remaining = size
    total_cost = 0.0
    # buys consume asks, sells consume bids
    levels = orderbook.get('asks' if side == 'BUY' else 'bids', [])
    for p, s in levels:
        take = min(remaining, float(s))
        total_cost += take * float(p)
        remaining -= take
        if remaining <= 0:
            break
    filled = size - remaining
    avg_price = (total_cost / filled) if filled > 0 else None
    return filled, avg_price
Note: If filled < requested size, report partial fills and remaining; your strategy must decide whether to cancel, reprice or wait.
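One way to report a fill is sketched below: it labels the outcome and measures slippage against a reference price such as the best quote at signal time. The helper names slippage_bps and fill_report, and the choice of reference price, are assumptions made for illustration.

```python
def slippage_bps(side, avg_price, reference_price):
    """Signed slippage in basis points; positive means worse than the reference."""
    if avg_price is None:
        return None  # nothing was filled
    direction = 1.0 if side == 'BUY' else -1.0
    return direction * (avg_price - reference_price) / reference_price * 1e4

def fill_report(side, requested, filled, avg_price, reference_price):
    """Summarize a simulated fill, including the unfilled remainder."""
    remaining = requested - filled
    if filled <= 0:
        status = 'NO_FILL'
    elif remaining > 0:
        status = 'PARTIAL'
    else:
        status = 'FILLED'
    return {'status': status, 'filled': filled, 'remaining': remaining,
            'avg_price': avg_price,
            'slippage_bps': slippage_bps(side, avg_price, reference_price)}

# Buying 3 units against asks [[100, 1], [101, 1]] fills 2 at an average of 100.5,
# which is roughly 50 bps worse than the best ask of 100.
report = fill_report('BUY', requested=3.0, filled=2.0,
                     avg_price=100.5, reference_price=100.0)
```

The remaining field is what the strategy would act on when deciding whether to cancel, reprice or wait.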
5) Fees and costs
Include maker/taker fees and any exchange-specific rebates. Apply fees at trade time to compute net P&L.
def apply_fees(amount, price, fee_rate):
    # fee_rate expressed as fraction (e.g., 0.001 = 0.1%)
    # returns the total cash paid for a buy; for a sell, net proceeds are cost - fee
    cost = amount * price
    fee = cost * fee_rate
    return cost + fee
Tip: Use different fee rates for maker vs taker fills and subtract them from P&L.
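A sketch of side-aware fee handling with separate maker and taker rates. The function name net_cash_flow and the default rates are placeholders; substitute the schedule of the exchange you are modeling.

```python
def net_cash_flow(side, amount, price, is_maker,
                  maker_rate=0.0005, taker_rate=0.001):
    """Signed cash flow of one fill after fees: negative for buys, positive for sells.

    The default rates are illustrative placeholders, not any exchange's schedule.
    """
    notional = amount * price
    fee = notional * (maker_rate if is_maker else taker_rate)
    if side == 'BUY':
        return -(notional + fee)  # pay the notional plus the fee
    return notional - fee         # receive the notional minus the fee

# Round trip: buy 2 units at 100 as taker, then sell 2 units at 101 as maker.
entry = net_cash_flow('BUY', 2.0, 100.0, is_maker=False)
exit_ = net_cash_flow('SELL', 2.0, 101.0, is_maker=True)
net_pnl = entry + exit_  # gross P&L of 2.0 minus 0.2 and 0.101 in fees
```

Summing signed cash flows this way gives net P&L directly, with fees already deducted on both legs.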
6) Compute P&L & performance metrics
After simulation, compute standard metrics: total P&L, trade count, win rate, average P&L, max drawdown, and simple Sharpe-esque ratio.
import pandas as pd
import numpy as np

def compute_metrics(trades):
    df = pd.DataFrame(trades)
    # Example: compute pnl per trade if entry and exit tracked
    pnl = df['pnl'].astype(float) if 'pnl' in df else pd.Series(np.zeros(len(df)))
    n = len(pnl)
    total_pnl = pnl.sum()
    win_rate = (pnl > 0).mean() if n > 0 else 0
    avg_pnl = pnl.mean() if n > 0 else 0
    # simple Sharpe-like ratio: mean over std of per-trade P&L (not annualized)
    std = pnl.std() if n > 1 else 0
    sharpe_like = avg_pnl / std if std > 0 else 0
    # equity curve and max drawdown, measured from a starting equity of 0
    equity = pnl.cumsum()
    peak = equity.cummax().clip(lower=0)
    drawdown = equity - peak
    max_dd = drawdown.min() if n > 0 else 0
    return {'total_pnl': float(total_pnl), 'trade_count': n,
            'win_rate': float(win_rate), 'avg_pnl': float(avg_pnl),
            'sharpe_like': float(sharpe_like), 'max_drawdown': float(max_dd)}
Tip: Keep trade-level logs with entry/exit timestamps, sizes, prices, fees and client_ref to reproduce results.
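The metrics function expects a pnl value per trade, which raw fills do not carry. Below is one possible way to derive it, using average-cost accounting on a signed position. The function name realized_pnl_per_fill is hypothetical, the result is gross of fees, and it assumes sizes that net out exactly to zero (real data may need a tolerance).

```python
def realized_pnl_per_fill(fills):
    """Annotate closing fills with realized 'pnl' (gross of fees).

    fills: iterable of dicts with 'side', 'filled' and 'avg_price'.
    """
    position = 0.0  # signed size: > 0 long, < 0 short
    avg_cost = 0.0
    closed = []
    for f in fills:
        qty = f['filled'] if f['side'] == 'BUY' else -f['filled']
        if position == 0 or (position > 0) == (qty > 0):
            # Opening or adding to a position: blend the average cost.
            new_pos = position + qty
            avg_cost = (avg_cost * abs(position) + f['avg_price'] * abs(qty)) / abs(new_pos)
            position = new_pos
            continue
        # Reducing, closing or flipping: realize P&L on the closed part.
        closing = min(abs(qty), abs(position))
        sign = 1.0 if position > 0 else -1.0
        pnl = sign * (f['avg_price'] - avg_cost) * closing
        closed.append({**f, 'closed_size': closing, 'pnl': pnl})
        position += qty
        if abs(qty) > closing:
            avg_cost = f['avg_price']  # the remainder opens a new position
        elif position == 0:
            avg_cost = 0.0
    return closed

# Made-up fills: a long built in two steps and closed, then a short round trip.
fills = [
    {'side': 'BUY', 'filled': 1.0, 'avg_price': 100.0},
    {'side': 'BUY', 'filled': 1.0, 'avg_price': 102.0},
    {'side': 'SELL', 'filled': 2.0, 'avg_price': 105.0},
    {'side': 'SELL', 'filled': 1.0, 'avg_price': 105.0},
    {'side': 'BUY', 'filled': 1.0, 'avg_price': 103.0},
]
closed_trades = realized_pnl_per_fill(fills)
pnls = [t['pnl'] for t in closed_trades]  # [8.0, 2.0]
```

The returned list of dicts can be passed to a metrics function that reads the pnl field.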
7) Replay fidelity & common pitfalls
- Tick vs candle data: candle-level backtests may hide intra-bar fills — use tick-level for execution-sensitive strategies.
- Lookahead bias: ensure signals only use past data (no peeking at future bars).
- Survivorship bias: use historical market lists relevant to the timeframe, not the current universe.
- Latency & slippage: include realistic delay between signal and execution and model slippage from orderbook depth.
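The latency point can be modeled with a small queue that holds each order until the replay clock has advanced past signal time plus a delay. This is a sketch under assumptions: the class name LatencyQueue is hypothetical, timestamps are numeric seconds, and the delay is constant.

```python
from collections import deque

class LatencyQueue:
    """Holds orders until signal time + latency has elapsed on the replay clock."""

    def __init__(self, latency_seconds):
        self.latency = latency_seconds
        self.pending = deque()  # (release_time, order), in submission order

    def submit(self, order, now):
        """Queue an order generated at replay time `now`."""
        self.pending.append((now + self.latency, order))

    def release(self, now):
        """Return the orders whose delay has elapsed by replay time `now`."""
        ready = []
        while self.pending and self.pending[0][0] <= now:
            ready.append(self.pending.popleft()[1])
        return ready

queue = LatencyQueue(latency_seconds=0.25)
queue.submit({'side': 'BUY', 'size': 1.0, 'type': 'MARKET'}, now=1.0)
early = queue.release(now=1.1)    # [] because the order is still in flight
on_time = queue.release(now=1.25)  # the order becomes executable here
```

In a replay loop, orders returned by the strategy would go through submit, and only the output of release would be executed against the current tick, so a fill never uses the same tick that produced the signal.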
8) Reproducibility & logging
Persist the following to reproduce tests: dataset version, seed, strategy code or parameters, start/end timestamps and full trade logs. Use git or dataset fingerprints to identify test runs.
# simple run metadata write
import json, time
meta = {'run_at': time.time(), 'dataset': 'ticks.parquet', 'strategy': 'sma-5-20'}
with open('last_run_meta.json', 'w') as f:
    json.dump(meta, f, indent=2)
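A fuller metadata record might also carry a dataset fingerprint, the seed and the replay window. The sketch below hashes the data file with SHA-256. The helper names, the parameter dict and the timestamps are illustrative placeholders, and a temporary file stands in for the real dataset.

```python
import hashlib
import json
import os
import random
import tempfile

def file_sha256(path, chunk_size=1 << 20):
    """Hex SHA-256 of a file, read in chunks so large datasets are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def build_run_meta(dataset_path, strategy, params, seed, start_ts, end_ts):
    """Collect what is needed to identify and repeat a backtest run."""
    return {
        'dataset': os.path.basename(dataset_path),
        'dataset_sha256': file_sha256(dataset_path),
        'strategy': strategy,
        'params': params,
        'seed': seed,
        'start': start_ts,
        'end': end_ts,
    }

# Demo on a throwaway file standing in for the real dataset.
with tempfile.NamedTemporaryFile(delete=False, suffix='.csv') as tmp:
    tmp.write(b'timestamp,price\n1,100.0\n')
    demo_path = tmp.name

seed = 42
random.seed(seed)  # seed any randomness used by the simulation before the run
meta = build_run_meta(demo_path, 'sma-5-20', {'fast': 5, 'slow': 20}, seed,
                      '2024-01-01T00:00:00Z', '2024-01-02T00:00:00Z')
meta_json = json.dumps(meta, indent=2, sort_keys=True)
os.remove(demo_path)
```

If the simulation also draws random numbers from numpy, that generator needs its own seed recorded alongside this one.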
What you’ll build next
Lesson 7 will cover risk management, hedging and deployment for crypto bots: circuit breakers, throttles, green-up strategies and packaging for deployment.
