How do I trigger this skill in a workflow?

Triggers include keywords like 'bybit order book', 'order book backtest', 'download bybit data', 'ob500', 'order book imbalance', and other listed phrases.

What formats are produced by the pipeline?

Processed data is saved as Parquet files in ./data/processed, and backtest results are written to ./reports as JSON and Markdown summaries.

Can I run only specific strategies?

Yes. Use the --strategies flag in scripts/backtest.py to select any subset of: imbalance, breakout, false_breakout, scalping, momentum, reversal, spoofing, optimal_execution, market_making, latency_arb.

bybit-order-book

@davidm413

npx machina-cli add skill @davidm413/bybit-order-book --openclaw

ByBit Order Book Backtester

End-to-end pipeline: download → process → backtest → report.

Dependencies

pip install undetected-chromedriver selenium pandas numpy pyarrow --break-system-packages

Chrome/Chromium must be installed for Selenium.

Workflow

The pipeline has 3 stages. Run them sequentially, or skip to later stages if data is already available.

Stage 1: Download Order Book Data

Prompt the user for:

Symbol (default: BTCUSDT)
Date range (default: last 30 days)

Run scripts/download_orderbook.py:

python scripts/download_orderbook.py \
  --symbol BTCUSDT \
  --start 2024-06-01 --end 2024-06-30 \
  --output ./data/raw

Key details:

Downloads from https://www.bybit.com/derivatives/en/history-data
Automatically chunks into 7-day windows (ByBit's limit)
Uses undetected-chromedriver for Cloudflare bypass
Outputs: ZIP files in ./data/raw/ named {date}_{symbol}_ob500.data.zip
For data format details: see references/bybit_data_format.md
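
The 7-day chunking described above can be sketched as follows. This is a minimal illustration, not the code in scripts/download_orderbook.py: the function name chunk_date_range is hypothetical, and it assumes that both the start and end dates are inclusive and that a window covers at most seven calendar days.

```python
from datetime import date, timedelta


def chunk_date_range(start, end, max_days=7):
    """Split the inclusive range [start, end] into windows of at most max_days days.

    Hypothetical helper: the real downloader may split the range differently.
    """
    if end < start:
        raise ValueError("end must not be before start")
    windows = []
    cursor = start
    while cursor <= end:
        # Each window covers max_days calendar days, clipped to the overall end date.
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


windows = chunk_date_range(date(2024, 6, 1), date(2024, 6, 30))
for first, last in windows:
    print(first.isoformat(), "->", last.isoformat())
```

Under these assumptions, the June 2024 range from the example command splits into five requests, the last of which covers only two days.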

If Selenium fails (Cloudflare blocks, UI changes): Instruct the user to manually download from the ByBit page and place ZIPs in ./data/raw/.

Stage 2: Process & Filter to Depth 50

Run scripts/process_orderbook.py:

python scripts/process_orderbook.py \
  --input ./data/raw \
  --output ./data/processed \
  --depth 50 \
  --sample-interval 1s

What it does:

Reads JSONL from ZIPs (each line = full 500-level L2 snapshot)
Filters to top 50 bid/ask levels
Computes derived features: mid_price, spread, volume_imbalance, microprice
Optionally downsamples (e.g., 1s, 5s, 1min) — recommended for faster backtests
Outputs: Parquet files in ./data/processed/
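
As a rough illustration of the depth filter and the derived features, the sketch below uses the common textbook definitions. These formulas are assumptions: the skill's own definitions live in scripts/process_orderbook.py and may differ (for example, in how many levels feed the imbalance). The function name derive_features and the (price, size) tuple layout are hypothetical.

```python
def derive_features(bids, asks, depth=50):
    """Compute per-snapshot features from one L2 snapshot.

    bids and asks are lists of (price, size) tuples, best level first.
    The formulas are standard definitions, assumed rather than taken
    from the skill's scripts.
    """
    bids = bids[:depth]  # keep only the top `depth` levels per side
    asks = asks[:depth]
    best_bid, bid_size = bids[0]
    best_ask, ask_size = asks[0]
    bid_volume = sum(size for _, size in bids)
    ask_volume = sum(size for _, size in asks)
    return {
        "mid_price": (best_bid + best_ask) / 2,
        "spread": best_ask - best_bid,
        # Ranges from -1 (all resting volume on the ask side) to +1 (all on the bid side).
        "volume_imbalance": (bid_volume - ask_volume) / (bid_volume + ask_volume),
        # Size-weighted mid: leans toward the ask when the best bid is larger.
        "microprice": (best_bid * ask_size + best_ask * bid_size) / (bid_size + ask_size),
    }


features = derive_features(
    bids=[(100.0, 3.0), (99.5, 1.0)],
    asks=[(101.0, 1.0), (101.5, 3.0)],
)
print(features)
```

In this toy snapshot the total volume is balanced across the two sides, so volume_imbalance is zero, while the microprice sits above the mid price because the best bid is three times the size of the best ask.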

Without downsampling: ~860K snapshots/day, ~300 MB Parquet per day per symbol. With 1s downsampling: ~86K snapshots/day, ~5 MB per day — much more practical.
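
A day has 86,400 seconds, which is where the ~86K figure for 1s sampling comes from. One plausible way to downsample is to keep the last snapshot in each interval, sketched below. This is an assumption about the approach, not a description of what --sample-interval does internally; the function name downsample_last and the (timestamp_ms, payload) layout are hypothetical.

```python
def downsample_last(snapshots, interval_ms=1000):
    """Keep the last snapshot observed in each interval_ms bucket.

    snapshots is an iterable of (timestamp_ms, payload) pairs sorted by time.
    Hypothetical helper; the real script may sample differently
    (for example, first-in-bucket or a time-weighted average).
    """
    kept = {}
    for ts_ms, payload in snapshots:
        # Later snapshots in the same bucket overwrite earlier ones.
        kept[ts_ms // interval_ms] = (ts_ms, payload)
    return [kept[bucket] for bucket in sorted(kept)]


# 25 synthetic snapshots spaced 100 ms apart, covering 2.5 seconds.
raw = [(i * 100, f"snap{i}") for i in range(25)]
sampled = downsample_last(raw)
print(sampled)
```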

Stage 3: Backtest Strategies

Run scripts/backtest.py:

# Run all 10 strategies
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --output ./reports

# Run specific strategies
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --strategies imbalance,breakout,market_making \
  --output ./reports

# Quick test with limited rows
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --max-rows 100000 \
  --output ./reports

Strategy keys: imbalance, breakout, false_breakout, scalping, momentum, reversal, spoofing, optimal_execution, market_making, latency_arb
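
The comma-separated --strategies value can be validated against the keys above before a long run starts. The sketch below is illustrative only: parse_strategies is a hypothetical helper, and treating a missing value as "run all strategies" is an assumption based on the first example command.

```python
STRATEGY_KEYS = (
    "imbalance", "breakout", "false_breakout", "scalping", "momentum",
    "reversal", "spoofing", "optimal_execution", "market_making", "latency_arb",
)


def parse_strategies(value):
    """Turn a comma-separated --strategies value into a validated list of keys.

    Hypothetical helper: an empty or missing value selects every strategy.
    """
    if not value:
        return list(STRATEGY_KEYS)
    requested = [key.strip() for key in value.split(",") if key.strip()]
    unknown = [key for key in requested if key not in STRATEGY_KEYS]
    if unknown:
        raise ValueError(f"unknown strategy keys: {', '.join(unknown)}")
    return requested


selected = parse_strategies("imbalance,breakout,market_making")
print(selected)
```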

Outputs in ./reports/:

{SYMBOL}_backtest_report.json — Full results with equity curves
{SYMBOL}_backtest_report.md — Comparison table and detailed metrics

Report metrics per strategy: total trades, winners/losers, win rate, cumulative PnL, Sharpe ratio, max drawdown (absolute and %), avg PnL per trade, avg hold time, profit factor, best/worst trade, equity curve.
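
To make a few of these metrics concrete, the sketch below computes them from a list of per-trade PnL values. The definitions are common conventions and are assumptions here: the report may, for instance, annualize the Sharpe ratio or measure drawdown on a different equity series. The name summarize_trades is hypothetical.

```python
import math


def summarize_trades(pnls):
    """Summarize per-trade PnL values using common (assumed) metric definitions."""
    if not pnls:
        raise ValueError("no trades to summarize")
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p < 0]

    # Equity curve and maximum peak-to-trough drawdown (absolute).
    equity, peak, max_drawdown = 0.0, 0.0, 0.0
    equity_curve = []
    for p in pnls:
        equity += p
        equity_curve.append(equity)
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)

    mean = sum(pnls) / len(pnls)
    variance = sum((p - mean) ** 2 for p in pnls) / (len(pnls) - 1) if len(pnls) > 1 else 0.0
    std = math.sqrt(variance)
    gross_profit, gross_loss = sum(wins), -sum(losses)
    return {
        "total_trades": len(pnls),
        "win_rate": len(wins) / len(pnls),
        "cumulative_pnl": equity,
        "avg_pnl_per_trade": mean,
        # Per-trade Sharpe ratio, not annualized.
        "sharpe": mean / std if std > 0 else 0.0,
        "max_drawdown": max_drawdown,
        "profit_factor": gross_profit / gross_loss if gross_loss > 0 else float("inf"),
        "equity_curve": equity_curve,
    }


summary = summarize_trades([10.0, -5.0, 20.0, -15.0, 5.0])
print(summary)
```

For this five-trade example the win rate is 0.6, the profit factor is 35 / 20 = 1.75, and the maximum drawdown is 15 (from an equity peak of 25 down to 10).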

For strategy logic and tunable parameters: see references/strategies.md

Customization

To modify strategy parameters, edit the __init__ method of any strategy class in scripts/backtest.py. Each strategy's self.params dict contains all tunables.

To add a new strategy:

Subclass Strategy in scripts/backtest.py
Implement on_snapshot(self, row, idx, df) with entry/exit logic
Register in STRATEGY_MAP
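
The three steps above might look like the sketch below. Only the on_snapshot(self, row, idx, df) signature, the self.params dict, and the STRATEGY_MAP name come from this document. The Strategy base class shown here is a stand-in so the example runs on its own, and the return convention ("long", "short", or None) is an assumption; check scripts/backtest.py for the real interface.

```python
class Strategy:
    """Stand-in for the base class in scripts/backtest.py (real interface may differ)."""

    def __init__(self):
        self.params = {}

    def on_snapshot(self, row, idx, df):
        raise NotImplementedError


class ImbalanceThreshold(Strategy):
    """Hypothetical example strategy: trade in the direction of strong book imbalance."""

    def __init__(self):
        super().__init__()
        self.params = {"entry_threshold": 0.3}  # tunable, as with the built-in strategies

    def on_snapshot(self, row, idx, df):
        imbalance = row["volume_imbalance"]
        threshold = self.params["entry_threshold"]
        if imbalance > threshold:
            return "long"   # assumed signal convention
        if imbalance < -threshold:
            return "short"
        return None


# Register under a new key so it can be selected by name.
STRATEGY_MAP = {"imbalance_threshold": ImbalanceThreshold}

strategy = STRATEGY_MAP["imbalance_threshold"]()
print(strategy.on_snapshot({"volume_imbalance": 0.5}, 0, None))
```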

Troubleshooting

Selenium can't load ByBit page: ByBit uses Cloudflare. Ensure undetected-chromedriver is up to date. Try --no-headless to debug visually. Fall back to manual download.

Out of memory on processing: Use --sample-interval 1s or larger. Process one day at a time.

No trades generated: Strategy thresholds may be too tight for the data period. Relax parameters (lower thresholds, shorter lookbacks) in the strategy's self.params dict in scripts/backtest.py; references/strategies.md describes each tunable.

Source

git clone https://clawhub.ai/davidm413/bybit-order-book

Overview

End-to-end pipeline to download ByBit derivatives order-book data, filter to depth 50, and backtest 10 order-book strategies. It generates comprehensive performance reports with PnL, Sharpe, win rate, and strategy comparisons.

How This Skill Works

The workflow runs in three stages: (1) download order-book ZIPs from ByBit's derivatives history-data page using Selenium with undetected-chromedriver, (2) unzip and process ob500 JSONL files to keep only the top 50 levels and compute features like mid_price, spread, volume_imbalance, and microprice, saving as Parquet, (3) backtest any of 10 strategies against the processed data and produce reports in the reports folder.

When to Use It

You want to download historical ByBit order-book snapshots from the derivatives history-data page.
You need to filter large ob500 data to a depth-50 view for tractable backtests.
You want to run any of 10 order-book-based strategies (e.g., imbalance, breakout, market_making) on the data.
You want to generate full backtest performance reports with PnL, Sharpe, max drawdown, and strategy comparisons.
You need automated, end-to-end backtesting workflows with reproducible outputs.

Quick Start

Step 1: Install dependencies and prerequisites, including undetected-chromedriver, Selenium, pandas, numpy, pyarrow.
Step 2: Download data (this example requests BTCUSDT for June 2024): python scripts/download_orderbook.py --symbol BTCUSDT --start 2024-06-01 --end 2024-06-30 --output ./data/raw
Step 3: Process to depth 50 and run backtests: python scripts/process_orderbook.py --input ./data/raw --output ./data/processed --depth 50 --sample-interval 1s && python scripts/backtest.py --input ./data/processed/BTCUSDT_ob50.parquet --output ./reports

Best Practices

Use depth 50 to balance detail and performance and to align with ob50 data used in examples.
Enable 1s downsampling for faster backtests while preserving key dynamics.
Test with a small --max-rows subset before running full datasets to verify setup.
If Selenium is blocked, switch to manual ZIP download and place in ./data/raw/.
Consult references/strategies.md for tunable parameters and strategy definitions.

Example Use Cases

Download BTCUSDT ob500 data for the last 30 days, filter it to depth 50, and backtest imbalance vs breakout.
Backtest market_making across multiple assets to compare PnL and drawdown.
Apply the spoofing strategy to identify unusual order-book patterns in historical data.
Run latency_arb on BTCUSDT ob50 and compare to momentum-based strategies.
Generate a comprehensive report suite (JSON/MD) summarizing strategy performance and equity curves.

Frequently Asked Questions
