QuantZoo is a production-grade, open-source Python framework for systematic trading strategy research, backtesting, walk-forward validation, real-time streaming, portfolio management, and risk analytics. It is designed for reproducibility, extensibility, and professional quant development.
Problem Statement
Quantitative trading research is often fragmented:
- Tutorials and blog posts use inconsistent data formats and ad-hoc backtesting logic.
- Results are not reproducible due to different assumptions, data sources, and lack of standardized infrastructure.
- Comparing strategies fairly is difficult without a common engine for execution, risk, and reporting.
QuantZoo solves these problems by providing a unified strategy API, a modular backtesting engine, standardized data ingestion and reporting, and comprehensive validation tools.
Approach
System Architecture
┌────────────────────────┐    ┌─────────────────────┐    ┌────────────────────┐
│   Data / Ingestion     │──▶ │   Backtesting /     │──▶ │  Portfolio Engine  │
│ (CSV, Parquet, APIs)   │    │   Execution Engine  │    │ (allocation, risk) │
└────────────────────────┘    └─────────────────────┘    └────────────────────┘
    │                          │                                  │
    │                          ▼                                  ▼
    │                   ┌─────────────────┐        ┌───────────────────┐
    │                   │   FastAPI /     │        │   Streamlit UI    │
    │                   │   Real-time     │        │   Dashboard /     │
    │                   │   Streaming     │        │   Visualization   │
    │                   └─────────────────┘        └───────────────────┘
    ▼
    CLI (qz) — runs backtests, exports leaderboards, and writes results to DuckDB
Flow:
- Ingest raw market and alternative data into standardized Parquet/DuckDB tables (sketched after this list).
- Run strategies in the event-driven backtester (vectorized/pandas-based) with slippage and commission models.
- Persist backtest artifacts and performance leaderboards to DuckDB for fast analysis and reporting.
- Expose real-time streams and replay controls via FastAPI and a Streamlit dashboard.
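As a sketch of the first step above, the following shows one way raw CSV bars could be standardized into Parquet and registered in DuckDB. The paths, column names, and table name are illustrative assumptions, not QuantZoo's actual schema.

```python
# Minimal sketch of the ingestion step: raw CSV -> standardized Parquet -> DuckDB table.
# Paths, column names, and the "bars" table name are illustrative, not QuantZoo's actual schema.
import duckdb
import pandas as pd

def ingest_csv(csv_path: str, parquet_path: str, db_path: str = "quantzoo.db") -> None:
    # Normalize raw bars into a standard OHLCV layout keyed by a UTC timestamp.
    bars = pd.read_csv(csv_path, parse_dates=["timestamp"])
    bars = bars.rename(columns=str.lower).sort_values("timestamp")
    ts = bars["timestamp"]
    bars["timestamp"] = ts.dt.tz_localize("UTC") if ts.dt.tz is None else ts.dt.tz_convert("UTC")
    bars = bars[["timestamp", "open", "high", "low", "close", "volume"]].drop_duplicates("timestamp")

    # Persist the standardized table as Parquet, then register it in DuckDB for fast SQL analytics.
    bars.to_parquet(parquet_path, index=False)
    con = duckdb.connect(db_path)
    con.execute(f"CREATE OR REPLACE TABLE bars AS SELECT * FROM read_parquet('{parquet_path}')")
    con.close()
```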
Architecture Principles
- Modular Design: Each component (strategy, data, execution, metrics, reporting) is a separate module and can be replaced or extended independently.
- Strategy as Pure Function: Strategies are implemented as pure signal generators that produce signals from OHLCV data with no hidden state or side effects (see the sketch after this list).
- Separation of Concerns: Strategy logic, execution simulation, analytics, and persistence are strictly decoupled.
- No Look-Ahead Bias: All indicators and metrics are computed using only historical information available at the time of the signal.
- Reproducibility: Deterministic results via seed control, pinned dependencies, and YAML experiment configurations.
- Extensibility: New strategies, indicators, and data sources can be added with minimal boilerplate.
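To make the pure-function and no-look-ahead principles concrete, here is a minimal sketch of a signal generator. The single-function shape, column names, and crossover rule are illustrative assumptions rather than QuantZoo's exact strategy interface.

```python
# Minimal sketch of a strategy as a pure signal generator: OHLCV in, target positions out.
# The function shape, column names, and crossover rule are illustrative assumptions.
import pandas as pd

def sma_crossover_signals(ohlcv: pd.DataFrame, fast: int = 20, slow: int = 50) -> pd.Series:
    """Return -1/0/+1 target positions using only information available at signal time."""
    close = ohlcv["close"]
    fast_ma = close.rolling(fast).mean()
    slow_ma = close.rolling(slow).mean()

    # Raw signal: long while the fast average is above the slow average, short when below.
    raw = (fast_ma > slow_ma).astype(int) - (fast_ma < slow_ma).astype(int)

    # Shift by one bar so a signal computed on bar t is only acted on at bar t+1 (no look-ahead).
    return raw.shift(1).fillna(0)
```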
Technology Stack
- Python 3.11+
- pandas, numpy — fast vectorized time-series operations
- scikit-learn — ML strategies and validation pipelines
- FastAPI — real-time/replay streaming backend
- Streamlit — interactive strategy/portfolio dashboard
- DuckDB & Parquet — scalable analytics and audit trail storage
- pytest — unit, integration, and regression tests
- Docker & GitHub Actions — reproducible CI/CD and deployments
Core Components
- Strategies: Implementations live in `strategies/` and inherit from a small `Strategy` interface. Included example strategies: MNQ_808, momentum, volatility breakout, and pairs trading.
- Backtesting Engine: Event-driven, vectorized engine that supports position sizing, slippage, commissions, and realistic execution simulation.
- Portfolio Engine: Capital allocation across strategies with equal-weight, risk parity, volatility targeting, and scheduled rebalancing.
- CLI & Persistence: A command-line interface (qz) runs backtests, generates reports, and exports leaderboards. Results are persisted in DuckDB for efficient analytics.
- Metrics & Reporting: Standardized metrics (Sharpe, max drawdown, win rate, profit factor, historical VaR, expected shortfall) and markdown/image reports generated per backtest (see the sketch after this list).
- Indicators: Strictly historical technical indicators (ATR, RSI, MACD, Bollinger Bands) implemented with no look-ahead.
- Data Layer: Pluggable loaders for CSV/Parquet, Yahoo Finance, Polygon, Alpha Vantage, and custom feeds, with data-quality cleaning (splits, dividends, and missing-data handling).
- Real-Time Infrastructure: FastAPI service for live or replayed streams and a Streamlit dashboard for visualization and strategy selection.
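The metrics listed above follow standard definitions; below is a minimal sketch of how they could be computed from a per-period return series. The function name, annualization factor, and tail level are assumptions, and win rate is measured over non-flat periods here rather than per trade.

```python
# Sketch of the standardized metrics computed from a series of per-period strategy returns.
# The function name, annualization factor, and 5% tail level are illustrative assumptions.
import numpy as np
import pandas as pd

def performance_metrics(returns: pd.Series, periods_per_year: int = 252, alpha: float = 0.05) -> dict:
    equity = (1 + returns).cumprod()
    drawdown = equity / equity.cummax() - 1          # running drawdown from the equity peak

    sharpe = np.sqrt(periods_per_year) * returns.mean() / returns.std()
    var = returns.quantile(alpha)                    # historical Value at Risk (left tail)
    es = returns[returns <= var].mean()              # expected shortfall: mean loss beyond VaR

    wins, losses = returns[returns > 0], returns[returns < 0]
    return {
        "sharpe": sharpe,
        "max_drawdown": drawdown.min(),
        "win_rate": len(wins) / max(len(wins) + len(losses), 1),
        "profit_factor": wins.sum() / abs(losses.sum()) if len(losses) else np.inf,
        "var": var,
        "expected_shortfall": es,
    }
```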
Validation & Testing
- Walk-Forward Analysis: Expanding and sliding window validation with support for purged K-fold splits to avoid leakage (sketched after this list).
- Unit Tests: Indicators, signal generation, and smaller modules tested on synthetic and controlled data.
- Integration Tests: End-to-end backtests with expected outputs to detect regressions.
- Regression Tests: Freeze outputs for canonical inputs and fail CI when results drift.
- CI/CD: GitHub Actions runs tests, linting, and type checks on every commit and publishes artifacts for review.
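A minimal sketch of expanding-window walk-forward splits with a purge gap between train and test; the generator shape and parameter names are assumptions for illustration.

```python
# Sketch of expanding-window walk-forward splits with a purge gap between train and test
# to avoid leakage from overlapping labels. Parameter names and sizes are illustrative.
from typing import Iterator, Tuple
import numpy as np

def walk_forward_splits(
    n_samples: int, test_size: int, min_train: int, purge: int = 0
) -> Iterator[Tuple[np.ndarray, np.ndarray]]:
    start = min_train
    while start + test_size <= n_samples:
        train_idx = np.arange(0, start - purge)         # expanding train window, purged at the edge
        test_idx = np.arange(start, start + test_size)  # out-of-sample test window
        yield train_idx, test_idx
        start += test_size                              # roll forward by one test window
```

A sliding-window variant would cap the training window at a fixed length instead of letting it expand.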
Example Results (2025)
MNQ_808 (15m, 2025 Backtest)
- Sharpe Ratio: 2.81
- Max Drawdown: 9.37%
- Win Rate: 51.6%
- Total Return: 29.1% (Jan–Oct 2025)
- Commission: $0.32/side, Slippage: 0.5 tick
*Dashboard screenshot*
These results illustrate the engine's ability to capture intraday signals while modeling realistic fees and slippage.
Key Features
- Event-driven backtesting engine with realistic fees and slippage, walk-forward validation, and a DuckDB persistence layer
- PineScript-compatible strategy API to reduce translation friction
- Walk-forward optimization and purged K-fold validation primitives
- Real-time FastAPI streaming backend and Streamlit dashboard for live and replay modes
- Comprehensive metric suite: Sharpe, drawdown, profit factor, win rate, VaR/ES
- Reproducible YAML experiment configs and deterministic seeds
- Portfolio allocation and risk management primitives
- Scalable storage and reporting via DuckDB and Parquet
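To illustrate the DuckDB persistence and leaderboard idea listed above, here is a sketch of appending a backtest summary and ranking strategies with SQL. The table name, columns, and values are placeholders, not the project's actual schema.

```python
# Sketch of persisting a backtest summary to DuckDB and querying a leaderboard with SQL.
# The table name, columns, and values are placeholders, not QuantZoo's actual schema.
import duckdb
import pandas as pd

con = duckdb.connect("quantzoo.db")
con.execute(
    """
    CREATE TABLE IF NOT EXISTS leaderboard (
        strategy VARCHAR, run_id VARCHAR, sharpe DOUBLE, max_drawdown DOUBLE, total_return DOUBLE
    )
    """
)

# Append one run's summary (placeholder numbers) by registering a DataFrame as a SQL view.
summary = pd.DataFrame([{
    "strategy": "sma_crossover", "run_id": "run-001",
    "sharpe": 1.4, "max_drawdown": -0.12, "total_return": 0.18,
}])
con.register("summary", summary)
con.execute("INSERT INTO leaderboard SELECT * FROM summary")

# Rank strategies by Sharpe ratio for the leaderboard report.
top = con.execute(
    "SELECT strategy, sharpe, max_drawdown FROM leaderboard ORDER BY sharpe DESC LIMIT 10"
).df()
print(top)
```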
Development Highlights
- Vectorized Execution: Backtests leverage pandas/numpy to maximize speed and reliability.
- Data Quality Pipeline: Automated cleaning for missing bars, splits, and dividends.
- Execution Realism: Flexible slippage/commission models and basic market-impact hooks (a cost-model sketch follows this list).
- Extensible API: Add strategies or indicators with minimal boilerplate — a single method to implement for signal generation.
- Documentation & Examples: API reference, YAML experiment examples, and reproducible notebooks for onboarding.
- Operational Stack: Monitoring with Prometheus and Grafana; CI/CD runs pytest, mypy type checks, flake8 linting, and Bandit security scans on every commit via GitHub Actions.
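As a sketch of the slippage/commission modeling noted above, the function below applies a fixed per-side commission and tick-based slippage to a single fill. The function shape is an assumption; the default values simply mirror the example backtest parameters.

```python
# Sketch of a per-fill cost model: a fixed commission per side plus slippage in ticks.
# The function shape is an assumption; the defaults mirror the example backtest parameters.
def fill_with_costs(
    price: float,
    side: int,                          # +1 for a buy, -1 for a sell
    tick_size: float,
    slippage_ticks: float = 0.5,
    commission_per_side: float = 0.32,
) -> tuple[float, float]:
    """Return (effective fill price, cash commission) for a single fill."""
    # Slippage moves the fill against the trader: buys fill higher, sells fill lower.
    effective_price = price + side * slippage_ticks * tick_size
    return effective_price, commission_per_side
```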
Impact
QuantZoo is used for research, education, and live trading prototypes. Its architecture influenced subsequent projects (e.g., QuantTerminal), and the project is open for contributions.
Roadmap
- Live broker integrations (Alpaca, IBKR)
- Advanced order types and market-impact-aware execution simulation
- Multi-asset portfolio optimization and allocation
- Machine learning strategy templates and automated feature pipelines
- Enhanced reporting, attribution, and automated notebooks for report generation
License
MIT License — see LICENSE
Contributing
- Fork and branch from main
- Add new strategies, indicators, or data sources
- Write tests and documentation
- Submit pull requests via GitHub
Built with Python 3.11+ • Powered by pandas, numpy, FastAPI, Streamlit, DuckDB, and the scientific Python stack