Most quant workflows degrade the same way. Regime logic lives in one notebook. Portfolio construction lives in another. Execution is wired together the week before going live. The backtest path and the live path diverge early and silently, and by the time you discover the divergence it has already cost you.
This post is a blueprint for not doing that. It covers the full stack: data pipeline, backtesting framework, execution infrastructure, signal generation, sim-to-live promotion, and how to extract value from trading flow once you are live. The architecture is designed so that the same engine runs research, simulation, paper trading, and live trading without any rewriting of integration points.
This is the Quant Stack built from zero to one.
What the stack actually is
The Quant Stack is a vertically integrated systematic trading operating system. Five pillars hold it together.
- **Data pipeline** — point-in-time-correct tick and bar data, ETL with look-ahead bias prevention, and a feature store that enforces parity between research and production.
- **Backtesting framework** — an event-driven simulation engine with realistic slippage, latency, and partial-fill modelling, followed by walk-forward validation and permutation testing.
- **Execution infrastructure** — smart order routing, algorithmic execution (VWAP, TWAP, POV), broker adapters, and the order management system.
- **Signal and data layer** — alpha generation from market microstructure, trader behaviour, regime detection, and factor models.
- **Monetisation layer** — value extraction from trading flow through internalisation logic, hedging, and adverse-selection detection across trader cohorts.
None of these components works in isolation. The value comes from wiring them together correctly.
System architecture
The system is organised into four horizontal layers. Intelligence agents at the top produce regime classifications and signals. An orchestrator in the middle chains them into a single pipeline: regime to portfolio to execution discipline to allocation to guardian. The core engine below that runs the same backtest and execution logic regardless of whether you are in simulation or live. Control and risk sit at the bottom, gating every order before it reaches the broker.
```
DATA SOURCES
  Exchange Tick Feeds | News/NLP Sentiment | Alt Data | Trader Behaviour
        │
DATA INGESTION SERVICE
  (Kafka → TimescaleDB / Feature Store)
        │
INTELLIGENCE LAYER
  Regime Agent (HMM) | Sentiment Agent (NLP) | Signal Generator
        │
ORCHESTRATOR
  regime → portfolio → discipline → allocation → guardian
  exposes: FastAPI + CLI + Scheduler
        │
CORE ENGINE
  Backtesting Engine               Execution Engine
  - Event loop                     - Paper / Sandbox
  - Order book simulation          - Live (Broker Adapter)
  - Slippage model                 - SOR / VWAP / TWAP
  - Walk-forward validation        - Position and P&L tracker
        │
CONTROL / RISK REVIEW (parallel)
  Guardian (pre-trade)             Trade Journal
  Risk Monitor (live VaR)          Portfolio Analyst
        │
BROKER / EXCHANGE GATEWAY
  (IBKR · Alpaca · Binance · FIX)
```
The key design rule is that the orchestrator and core engine share the same contracts. A strategy defined for backtesting uses the same BaseStrategy interface in paper and live mode. There is no translation layer, no porting step, no rewrite at the point of promotion.
A secondary rule: every agent is pluggable. The regime agent can be swapped from a Hidden Markov Model to a Gaussian Mixture Model without touching the orchestrator. The broker adapter can be swapped from Alpaca to Interactive Brokers without touching the execution engine.
Data flows and point-in-time correctness
Data moves through the system in two planes. The historical plane handles batch ETL for research and backtesting. The live plane handles streaming for real-time signal generation and execution. Both planes share the same schema contracts. If they do not, you will silently train on features computed differently from how they are computed in production, and your live Sharpe will be lower than your backtest Sharpe for reasons you cannot find in the code.
The historical plane looks like this: raw ticks come in from the exchange, pass through an ETL layer that normalises and enforces point-in-time timestamps, land in TimescaleDB as clean bars, get transformed into features in the feature store, feed the backtest engine, and produce metrics in the analytics database.
The live plane mirrors it: a WebSocket feed replaces the raw tick file, Kafka replaces the batch ETL queue, the same signal engine reads from the same feature definitions, the orchestrator produces orders instead of backtest decisions, and those orders flow through the OMS and EMS to the broker gateway.
The feedback loop closes when fills come back from the broker and update the analytics database with real execution data.
Point-in-time correctness is the most important property of the data pipeline and the most commonly violated one. Every data point must carry its availability timestamp alongside its event timestamp. If an earnings release happens on Tuesday but the vendor aggregates and delivers it on Thursday, Thursday is the availability timestamp. Using Tuesday as the query boundary in a backtest means you are trading on information you did not have. The resulting alpha is fictional.
```python
# Polars point-in-time query
import polars as pl

def pit_query(table: pl.LazyFrame, as_of: str) -> pl.LazyFrame:
    """Return only data available at `as_of` datetime."""
    return (
        table
        .filter(pl.col("available_at") <= pl.lit(as_of).str.to_datetime())
        .sort("available_at", descending=True)
        .unique(subset=["ticker", "metric"], keep="first")
    )
```
This function should be the only entry point for any feature query in the backtesting engine. Never query by event timestamp alone.
Backtesting framework
The backtesting engine is event-driven, not vectorised. Vectorised backtesting is faster to write and adequate for daily-bar strategies where order timing within a bar does not matter. For anything involving intrabar execution, partial fills, latency-aware signal generation, or market microstructure, event-driven simulation is the only valid approach.
The engine processes a stream of market events: bar events, tick events, fill events, and signal events. A strategy subscribes to the event types it needs via on_bar, on_tick, and on_fill hooks. Orders flow from the strategy to the simulated order book, which applies a slippage model and generates fill events. Position and P&L tracking happens in a portfolio object that the strategy can query at any time.
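As a sketch of that dispatch loop, here is a minimal version with hypothetical `BarEvent` and `FillEvent` types. The real engine's event classes, queueing, and order routing are richer; this only illustrates how a strategy subscribes via hooks:

```python
from dataclasses import dataclass

@dataclass
class BarEvent:
    ticker: str
    close: float

@dataclass
class FillEvent:
    ticker: str
    qty: float
    price: float

class BaseStrategy:
    """Minimal strategy interface: override only the hooks you need."""
    def on_bar(self, event): ...
    def on_tick(self, event): ...
    def on_fill(self, event): ...

def run_event_loop(events, strategy):
    """Dispatch each market event to the matching strategy hook, in order."""
    for ev in events:
        if isinstance(ev, BarEvent):
            strategy.on_bar(ev)
        elif isinstance(ev, FillEvent):
            strategy.on_fill(ev)
```

The important property is that the strategy never sees the future: events arrive strictly in timestamp order, and the strategy can only act on what has already been dispatched.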
The slippage model deserves attention. A linear model is not sufficient. Real market impact grows with the square root of order size relative to average daily volume. The formula is:
```
impact_bps = k * volatility * sqrt(order_size / average_daily_volume)
```
where k is a calibration constant typically between 0.3 and 1.0 depending on the asset class and market conditions. During stressed conditions, execution costs can increase 200 to 300 percent above normal levels. Any backtest that uses flat-rate slippage is overstating its own profitability.
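A direct implementation of the square-root formula above, assuming volatility is expressed in basis points (the units are a modelling choice, not dictated by the formula):

```python
import math

def impact_bps(order_size: float, adv: float, volatility: float, k: float = 0.5) -> float:
    """Square-root market impact in basis points.

    volatility is in bps (e.g. 100 bps of daily vol); k is the calibration
    constant from the text, typically between 0.3 and 1.0.
    """
    if adv <= 0:
        raise ValueError("average daily volume must be positive")
    return k * volatility * math.sqrt(order_size / adv)
```

The key behaviour is concavity: doubling the order size multiplies the impact by sqrt(2), not by 2, which is exactly what a flat-rate slippage model misses.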
Walk-forward validation is mandatory before a strategy is considered for paper trading. Split the historical period into expanding training windows and fixed-size out-of-sample test windows. A strategy should pass at least three sequential OOS windows with a Sharpe above a minimum threshold before progressing. The permutation test provides an additional check: shuffle the return series randomly and run the strategy. If it performs similarly on shuffled data as on real data, the edge is statistical noise, not signal.
The final 20 percent of the available history should be a locked holdout that is never used during any phase of development. This block exists solely to provide an honest estimate of live performance before any capital is committed.
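The expanding-train / fixed-test split described above can be generated mechanically. This is a sketch over integer indices; the real framework would work on timestamps:

```python
def walk_forward_windows(n: int, initial_train: int, test_size: int):
    """Yield (train_slice, test_slice) pairs: the training window expands,
    the out-of-sample test window is fixed-size and strictly after it."""
    start_test = initial_train
    while start_test + test_size <= n:
        yield slice(0, start_test), slice(start_test, start_test + test_size)
        start_test += test_size
```

Because each test window begins exactly where training ends, no test observation ever appears in its own training set, which is the property walk-forward validation depends on.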
Execution infrastructure
The execution layer has three components: the Order Management System (OMS), the Execution Management System (EMS) with algorithmic execution strategies, and the broker adapters.
The OMS is a state machine. An order transitions from pending to submitted to partial to filled or cancelled. Every state transition is logged with timestamps. The OMS is also where position limits and duplicate order detection live.
The EMS implements the execution algorithms. VWAP slices an order across the day proportionally to expected volume, minimising market impact by trading when the market is deep. TWAP slices uniformly across a time window. Participation-of-Volume (POV) targets a fixed fraction of market volume. Implementation Shortfall (IS) minimises the difference between the decision price and the average execution price, accepting more market impact early to reduce timing risk.
The choice of algorithm depends on the strategy. Market-making strategies need to post and cancel limit orders in milliseconds. Stat-arb strategies entering large positions should use VWAP or TWAP to avoid tipping their hand to the market. Signal-driven directional strategies with short prediction horizons may accept higher impact in exchange for faster execution via IS.
Broker adapters implement a common BrokerAdapter abstract base class. Every adapter exposes the same interface: submit_order, cancel_order, get_position, get_account. The orchestrator and OMS never call broker-specific APIs directly. This means swapping from Alpaca paper trading to Interactive Brokers live is a configuration change, not a code change.
```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Order:
    ticker: str
    side: str                  # 'buy' | 'sell' | 'short' | 'cover'
    qty: float
    order_type: str            # 'market' | 'limit' | 'twap' | 'vwap'
    limit_price: Optional[float] = None

class BrokerAdapter(ABC):
    @abstractmethod
    def submit_order(self, order: Order) -> str: ...   # returns order_id

    @abstractmethod
    def cancel_order(self, order_id: str) -> bool: ...

    @abstractmethod
    def get_position(self, ticker: str) -> float: ...

    @abstractmethod
    def get_account(self) -> dict: ...
```
Every order submitted through the OMS first passes through the Guardian risk check. Guardian evaluates position limits, concentration limits, daily loss limits, and VaR bounds before allowing an order to proceed. If any check fails, the order is rejected with a reason code and the rejection is logged. This is not optional and it is not configurable off in live mode.
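A minimal sketch of the Guardian gate. The limit set and names here are illustrative, not the real `guardian.py` checks; the point is the shape: every check returns a reason code, and the first failure rejects the order:

```python
from dataclasses import dataclass

@dataclass
class RiskLimits:
    max_position: float       # absolute size limit per ticker
    max_daily_loss: float     # dollars
    max_concentration: float  # fraction of gross exposure in one name

def guardian_check(order_qty, current_position, limits, daily_pnl,
                   order_notional, gross_exposure):
    """Return (allowed, reason_code). Fail-closed: the first failed
    check rejects the order, mirroring the behaviour described above."""
    if abs(current_position + order_qty) > limits.max_position:
        return False, "POSITION_LIMIT"
    if daily_pnl <= -limits.max_daily_loss:
        return False, "DAILY_LOSS_LIMIT"
    if gross_exposure > 0 and order_notional / gross_exposure > limits.max_concentration:
        return False, "CONCENTRATION_LIMIT"
    return True, "OK"
```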
Market making
Market making continuously quotes bid and ask prices, earning the spread while managing inventory risk. The canonical model is Avellaneda-Stoikov (2008), which derives optimal quotes as a function of current inventory, market volatility, and a risk aversion parameter.
The reservation price adjusts the midpoint estimate based on current inventory:
```
r = S - q * gamma * sigma^2 * T
```
where S is the mid price, q is inventory (positive for long, negative for short), gamma is risk aversion, sigma^2 is variance, and T is the time horizon. When inventory is long, r falls below mid, incentivising the model to post a lower ask and lean toward selling. When inventory is short, r rises above mid.
The optimal spread around the reservation price is:
```
delta* = gamma * sigma^2 * T + (2/gamma) * ln(1 + gamma/kappa)
```
where kappa is the order arrival rate. The model posts bid at r - delta*/2 and ask at r + delta*/2.
The practical risk that the Avellaneda-Stoikov model does not fully address is adverse selection. Informed traders know the true price is moving before the market maker updates their quotes. They systematically hit stale quotes for profit. The signal for this is order flow toxicity, measured by the Volume-Synchronised Probability of Informed Trading (VPIN) metric. When VPIN rises above approximately 0.7, the proportion of order flow that is informationally motivated is high enough that quoting at normal spreads becomes unprofitable. The model should widen spreads or withdraw quotes entirely until toxicity normalises.
Hard inventory limits are also non-negotiable. The model's spread-skewing mechanism works under normal conditions. Under a sustained one-sided flow, the model will accumulate inventory faster than it can unwind through skewing. A hard limit of, say, five percent of capital triggers a direct hedge via index futures, independent of what the spread model is doing.
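The reservation-price and spread formulas above translate directly into a quoting function. This is a sketch assuming per-period units for `sigma` and `horizon`, not the production `market_making.py` implementation:

```python
import math

def avellaneda_stoikov_quotes(mid, inventory, gamma, sigma, horizon, kappa):
    """Bid and ask from the Avellaneda-Stoikov model.

    r = mid - q * gamma * sigma^2 * T      (reservation price)
    delta* = gamma * sigma^2 * T + (2/gamma) * ln(1 + gamma/kappa)
    Quotes are posted at r +/- delta*/2.
    """
    r = mid - inventory * gamma * sigma**2 * horizon
    spread = gamma * sigma**2 * horizon + (2.0 / gamma) * math.log(1.0 + gamma / kappa)
    return r - spread / 2.0, r + spread / 2.0
```

With zero inventory the quotes straddle the mid symmetrically; with long inventory both quotes shift down, making the ask more attractive and leaning the book toward selling, exactly as described above.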
Statistical arbitrage
Statistical arbitrage exploits mean-reverting spread relationships between cointegrated assets. The spread is a linear combination of two or more asset prices that is stationary: it has a defined long-run mean and reverts to it after deviations.
The pipeline proceeds in stages.
Universe selection filters for liquid, co-sector candidates. Average daily volume above fifty million dollars, same GICS sector, stable rolling correlation. Pairs with insufficient liquidity are excluded because execution slippage will exceed any spread edge.
Cointegration testing applies the Engle-Granger two-step test for pairs or the Johansen test for multi-asset baskets. The Augmented Dickey-Fuller test on the residual spread must reject the unit root null at p below 0.05. The half-life of mean reversion, estimated from the Ornstein-Uhlenbeck parameter, should fall between approximately one and thirty trading days. Shorter than one day and you are competing with HFT latency. Longer than thirty days and capital is tied up for too long relative to the edge.
Signal generation computes the z-score of the spread:
```
z_t = (spread_t - rolling_mean) / rolling_std
```
Entry signals occur when |z| > 2.0. Exit signals occur when |z| < 0.5. Stop-loss occurs at |z| > 4.0, which indicates the cointegration relationship has broken down rather than an extreme but reversible deviation.
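The entry, exit, and stop thresholds above map to a small state rule. The action names here are illustrative labels, not the framework's order types:

```python
def spread_signal(z: float, position: int) -> str:
    """Map the spread z-score to an action under the thresholds above:
    enter at |z| > 2.0, exit at |z| < 0.5, stop out at |z| > 4.0."""
    if abs(z) > 4.0:
        return "stop"              # relationship likely broken, not reverting
    if position == 0:
        if z > 2.0:
            return "short_spread"  # spread rich: short leg A, long leg B
        if z < -2.0:
            return "long_spread"
        return "hold"
    if abs(z) < 0.5:
        return "exit"              # spread has reverted to its mean
    return "hold"
```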
Cointegration monitoring is ongoing. Relationships that were cointegrated over three years can break permanently during a sector rotation, a merger announcement, or a credit event. Rolling re-estimation of the cointegration vector on a 60-day window detects breakdown early. If the ADF p-value rises above 0.10 on two consecutive rolling windows, the pair is suspended pending review.
Market fragmentation matters here. Running stat-arb across 16 or more equity exchanges means that a price discrepancy detected on one venue may already be arbitraged away on another by the time your order routes. Smart order routing that sources liquidity from multiple venues simultaneously reduces this risk but requires more sophisticated infrastructure.
Signal and data layer
Signals come from two sources: market data and trader behaviour.
Market microstructure signals are derived from the order book itself. Order flow imbalance measures the signed difference between buyer-initiated and seller-initiated volume over a short window. When buyers dominate, short-term price pressure is upward. Trade imbalance diverges from quote imbalance when large orders are being worked algorithmically, which is itself a signal. Effective spread and realised spread decomposition separates the adverse selection component of spread cost from the inventory component, giving a direct measure of how informed the current order flow is.
Factor signals are the longer-horizon alpha drivers: momentum, value, carry, quality, and low-volatility. These are computed over daily bars and enter the portfolio construction layer rather than the execution layer. Factor PCA reduces the signal space and isolates the components with the most orthogonal explanatory power.
Trader cohort signals segment the counterparty flow by inferred behaviour. Retail flow that is directional and trend-following has different toxicity properties than institutional flow that is liquidity-seeking. Identifying which cohort is driving current flow lets the market-making strategy adjust quotes and the stat-arb strategy adjust position sizing.
The feature store is where all of these signals are materialised and versioned. Every feature is tagged with a computed_at timestamp. When a bug is found in a feature calculation and the calculation is corrected, the fix is deployed and historical features are recomputed from that tag forward. The backtest engine always queries features by both event time and computation version, so it is possible to replay exactly what the production system saw at any historical point.
Database schema
The primary store is TimescaleDB for time-series data and PostgreSQL for relational metadata.
```sql
-- TimescaleDB hypertable for ticks
CREATE TABLE ticks (
    ts           TIMESTAMPTZ NOT NULL,
    ticker       TEXT NOT NULL,
    price        NUMERIC(18,8) NOT NULL,
    size         NUMERIC(18,4),
    side         CHAR(1),               -- 'B' | 'A' | 'N'
    exchange     TEXT,
    available_at TIMESTAMPTZ NOT NULL   -- PIT correctness, not equal to ts
);
SELECT create_hypertable('ticks', 'ts');
CREATE INDEX ON ticks (ticker, ts DESC);

-- Continuous aggregate for OHLCV bars
CREATE MATERIALIZED VIEW bars_1m
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 minute', ts) AS ts,
    ticker,
    first(price, ts) AS open,
    max(price)       AS high,
    min(price)       AS low,
    last(price, ts)  AS close,
    sum(size)        AS volume
FROM ticks
GROUP BY 1, 2;

-- PostgreSQL orders table
CREATE TABLE orders (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    strategy_id  UUID NOT NULL REFERENCES strategies(id),
    ticker       TEXT NOT NULL,
    side         TEXT NOT NULL,
    qty          NUMERIC NOT NULL,
    order_type   TEXT NOT NULL,
    status       TEXT NOT NULL DEFAULT 'pending',
    submitted_at TIMESTAMPTZ,
    filled_at    TIMESTAMPTZ,
    fill_price   NUMERIC,
    slippage_bps NUMERIC                -- fill_price vs arrival_price in basis points
);
```
The available_at column on the ticks table is the most important column in the schema. It is what makes PIT queries possible. If your ingestion pipeline only records event time, you cannot correctly replay the information environment of any historical date.
The slippage_bps column on the orders table is the most important column for live monitoring. Comparing realised slippage in bps against the slippage assumption in the corresponding backtest tells you whether your backtest cost model was accurate. Systematic divergence here is an early warning that the strategy will underperform its backtest in live trading.
API routes
The orchestrator exposes a FastAPI interface on port 8000.
| Method | Route | Description |
|---|---|---|
| POST | /backtest | Run backtest with strategy config and date range |
| GET | /backtest/{id} | Fetch backtest result metrics |
| POST | /pipeline/run | Execute full regime-to-allocation pipeline |
| GET | /signals/latest | Latest signal snapshot per ticker |
| GET | /regime/current | Current market regime and confidence |
| POST | /orders | Submit order intent (passes through Guardian) |
| GET | /portfolio | Current positions and P&L |
| DELETE | /orders/{id} | Cancel pending order |
| WS | /stream/ticks | Real-time tick stream |
| WS | /stream/fills | Live fill notifications |
The data service runs on port 8001 and exposes historical bars, ticks, feature store snapshots, and the analytics report JSON used by the screener pages.
The POST /orders endpoint does not submit directly to the broker. Every order intent flows through Guardian first. A Guardian rejection returns a 403 with a JSON body explaining which risk check failed. There is no override mechanism.
External dependencies
Market data uses yfinance for equities (no API key, no rate limit at reasonable volume) and CoinAPI for crypto tick history. For production equity data at tick resolution, Polygon.io is the standard choice.
Live feeds connect via WebSocket: Binance WS for crypto, Alpaca Stream for US equities, IBKR TWS for institutional equities and futures. Redis pub/sub handles internal event distribution between services.
Broker adapters wrap Interactive Brokers (TWS API or IB Gateway), Alpaca (REST and WebSocket), and Binance. Each implements BrokerAdapter.
Infrastructure runs on TimescaleDB, PostgreSQL, Redis, and optionally ClickHouse for analytics aggregations. Kafka handles the event bus between the data ingestor and the downstream consumers. GitHub Actions runs CI on every push and the nightly data pipeline on schedule.
A failure in any of these services should fail safely. If the market data feed drops, the system must not assume the last price is still valid. If Guardian fails to compute, trading should pause rather than proceed with unchecked orders. The cost of missing a trade is always lower than the cost of executing a bad one with no risk check.
Key files
```
quant-stack/
  orchestrator/
    main.py                  # FastAPI app entry point
    pipeline.py              # regime → portfolio → discipline → allocation → guardian
    agents/
      regime.py              # HMM / GMM market regime detection
      sentiment.py           # NLP signal aggregation
      discipline.py          # execution timing filter
      guardian.py            # ⚠ pre-trade risk check, all orders pass here
  core/
    backtest/
      engine.py              # event-driven backtest loop
      slippage.py            # linear and square-root market impact models
      portfolio.py           # position tracking, P&L, drawdown
      metrics.py             # Sharpe, Sortino, Calmar, CAGR
    execution/
      oms.py                 # Order Management System state machine
      algos.py               # VWAP, TWAP, POV, Implementation Shortfall
    broker/
      base.py                # BrokerAdapter abstract base class
      ibkr.py                # Interactive Brokers TWS adapter
      alpaca.py              # Alpaca REST + WebSocket adapter
    strategies/
      base.py                # BaseStrategy with on_bar, on_tick, on_fill hooks
      market_making.py       # Avellaneda-Stoikov implementation
      stat_arb.py            # cointegration pairs trading
  data/
    ingestor.py              # tick and bar ingestion, Kafka producers
    feature_store.py         # feature computation and versioning
    pit.py                   # ⚠ point-in-time query helpers, do not bypass
    universe.py              # S&P 500 universe builder (Wikipedia + yfinance)
  signals/
    microstructure.py        # order flow imbalance, VPIN, spread decomposition
    factors.py               # momentum, value, carry, quality, PCA
    cohorts.py               # trader behaviour segmentation
  risk/
    var.py                   # Historical VaR and CVaR
    limits.py                # position limits, concentration checks
    monitor.py               # live risk metrics, WebSocket push
  analytics/
    us-shariah-screener.ipynb   # nightly Shariah filter, same as GA screener
    factor-report.ipynb
  infra/
    docker-compose.yml       # TimescaleDB, Redis, Kafka
  .github/workflows/
    ci.yml                   # test and lint on push
    nightly.yml              # data pipeline and notebook execution
```
guardian.py and pit.py are the two files where bugs are most expensive. Both have full unit test coverage. Do not modify either without running the test suite.
Common gotchas
Look-ahead bias in feature engineering. Rolling z-scores or normalisation factors computed on the full dataset before being used as backtest inputs silently incorporate future information. A Sharpe above three in a backtest is a red flag. Every transform must use an expanding window anchored to the PIT availability timestamp. The pit.py helpers enforce this if you use them.
Slippage model too optimistic. Default linear slippage models underestimate real market impact. During stress, execution costs can increase 200 to 300 percent above normal. A strategy that looks profitable at five basis points of slippage may lose money at twenty. Always stress-test at three times your base slippage assumption before paper trading.
Schema drift between research and production. Feature calculations in notebooks often diverge from their production implementations after bugfixes that are applied in one place and not the other. The feature store's computed_at versioning system exists to catch this. Whenever you change a feature calculation, increment the version and recompute history. Never edit a feature definition in production without doing the same in research.
Market making inventory accumulation. The Avellaneda-Stoikov spread-skewing mechanism manages inventory gracefully under balanced two-sided flow. Under sustained one-sided flow (all buyers, no sellers), inventory accumulates faster than the skew can unwind it. The hard inventory limit and the futures hedge are not optional parts of the model. They are the risk management layer that makes the rest of it viable.
yfinance split and dividend adjustments. yfinance backdates price adjustments for corporate actions. If you cache data and a split occurs, the cache contains unadjusted prices for new data alongside adjusted prices for historical data, creating artificial discontinuities in your return series. Always refetch on any detected corporate action. Store both raw and adjusted series separately.
Guardian disabled in paper mode. It is sometimes tempting to disable Guardian during paper trading to allow unrestricted signal testing. The danger is that this flag persists when code is promoted to live. The go_live() function should assert Guardian is active as its first line, failing loudly rather than silently.
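The assertion described above is a one-liner worth making concrete. The flat `config` dict and flag name here are hypothetical; the real system would read its own settings object:

```python
def go_live(config: dict) -> None:
    """Refuse to start live trading unless Guardian is active.
    The assertion is deliberately the first statement: a disabled-Guardian
    flag left over from paper trading must fail loudly, not silently."""
    assert config.get("guardian_enabled", False), \
        "Refusing to go live: Guardian pre-trade checks are disabled"
    # ... connect broker adapter, start orchestrator, begin order flow ...
```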
TimescaleDB chunk compression. TimescaleDB compresses old data chunks automatically. Queries that span compressed and uncompressed chunks have very different query plans. Without the right index configuration, cold-start backtest queries can be 10 to 100 times slower than expected. Set compress_segmentby = 'ticker' and maintain a composite index on (ticker, ts DESC).
Leaky walk-forward validation. Walk-forward analysis only prevents overfitting if the OOS windows are genuinely held out. Researchers who observe OOS results, tune parameters, and observe OOS results again have contaminated the out-of-sample period. After three or more such cycles, the held-out data is no longer held out in any meaningful sense. Lock the final 20 percent of history before any development begins and do not look at it until the strategy is finalised.
Simulation to live promotion protocol
The promotion from simulation to live capital is the most dangerous moment in systematic trading. A structured gate process prevents deploying strategies that look good in backtest but fail on real data.
Phase 1 (Research): Full historical backtest across the complete available history. Walk-forward validation across at least three sequential OOS windows. Permutation test: the strategy must not perform materially above random on shuffled returns. Slippage stress test at three times baseline. Maximum drawdown below twenty percent. Sharpe above one in each OOS window.
Phase 2 (Paper trading, two to four weeks): Live market data, simulated orders. Track implementation shortfall: the difference between backtest fill prices and paper fill prices. If average IS exceeds thirty basis points, the strategy is not viable at the current size in the current market conditions.
Phase 3 (Live micro, ten percent of target capital): Real orders, minimum position size. Compare live Sharpe against paper Sharpe daily. If live Sharpe falls below sixty percent of paper Sharpe within two weeks, halt and investigate before proceeding. Log every fill for audit.
Phase 4 (Scale, twenty-five to one hundred percent): Gradual capital allocation increase. Monitor whether performance degrades as order size increases: this indicates market impact is material relative to the edge. Check correlation with other live strategies to avoid unintended concentration.
Phase 5 (Continuous monitoring): Rolling 30-day Sharpe must remain above 0.5. A circuit breaker halts trading automatically at five percent daily NAV loss. Cointegration re-test every 60 days for stat-arb pairs. Quarterly full regime review. Monthly comparison of realised slippage in bps against the backtest slippage model.
The key metric that most practitioners overlook is implementation shortfall. It is the ground truth of execution quality, defined as the difference between the decision price and the average execution price in basis points. A strategy's live Sharpe will asymptotically approach its paper Sharpe only if IS is controlled. Track it per strategy, per broker, and per time of day.
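Computed from fills, implementation shortfall is just a volume-weighted comparison against the decision price. A minimal sketch, signed so that positive values always mean cost:

```python
def implementation_shortfall_bps(decision_price, fills, side="buy"):
    """IS in basis points: volume-weighted average fill price versus the
    decision price. `fills` is a list of (qty, price) tuples; positive
    output means execution cost the strategy money relative to decision."""
    total_qty = sum(q for q, _ in fills)
    vwap = sum(q * p for q, p in fills) / total_qty
    signed = vwap - decision_price if side == "buy" else decision_price - vwap
    return 1e4 * signed / decision_price
```

Tracked per strategy, per broker, and per time of day, this is the number that Phase 2 gates on (thirty bps) and that the monthly slippage-model comparison in Phase 5 consumes.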
Monetising trading flow
Once you are live and execution data is accumulating, the flow itself becomes a data asset.
Order flow has structure. Retail flow tends to be directional and trend-following. Institutional flow tends to be liquidity-seeking and mean-reverting. Toxic flow, which comes from traders with private information, tends to move prices adversely against whoever is on the other side. These cohorts coexist in any active market. Identifying which type of flow you are receiving at any given time lets you respond appropriately: widen quotes against toxic flow, tighten against uninformed flow, internalise what you can, and hedge what you cannot.
Internalisation logic matches buy and sell orders from separate flow cohorts against each other instead of routing both to the exchange. The benefit is that neither order crosses the spread on exchange. The internalised trades contribute to your P&L as spread capture while reducing market impact for both counterparties. The constraint is that internalisation requires sufficient bilateral flow to match, which is a function of scale.
Hedging logic converts net directional exposure accumulated through market making or internalisation into a delta-neutral position via index futures or ETFs. The hedge should be updated whenever net delta exceeds a threshold, not on a fixed schedule, because fixed-schedule hedging introduces predictable patterns that sophisticated counterparties can exploit.
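The threshold-triggered hedge can be sketched in a few lines. The parameter names are illustrative; `contract_delta` is the dollar delta of one futures contract:

```python
def hedge_order(net_delta: float, threshold: float, contract_delta: float) -> int:
    """Futures contracts to trade when |net_delta| exceeds the threshold;
    0 otherwise. Negative means sell contracts (hedge a long book)."""
    if abs(net_delta) <= threshold:
        return 0
    return -round(net_delta / contract_delta)
```

Because the trigger is the exposure level rather than the clock, the hedge fires at unpredictable times, which is exactly the property the paragraph above argues for.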
The value extraction layer runs alongside the trading strategy layer. It does not generate its own signals. It analyses the fills, identifies patterns in the flow, and feeds that analysis back into the signal and regime layers.
Common operations
Running a backtest:
```bash
# Via API
curl -X POST http://localhost:8000/backtest \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"strategy": "stat_arb", "start": "2022-01-01", "end": "2024-12-31"}'

# Check results
curl http://localhost:8000/backtest/{id} -H "Authorization: Bearer $TOKEN"
```
Starting the local stack:
```bash
docker-compose up -d timescaledb redis kafka
python orchestrator/main.py --env=dev
```
Running the test suite:
```bash
pytest tests/ -v --tb=short

# PIT correctness suite specifically
pytest tests/test_pit.py -v

# Guardian boundary tests
pytest tests/test_guardian.py -v
```
Triggering the nightly pipeline locally:
```bash
python data/ingestor.py --date=today
jupyter nbconvert --to notebook --execute analytics/us-shariah-screener.ipynb
```
Emergency halt:
```bash
# Cancel all pending orders
curl -X DELETE http://localhost:8000/orders/all -H "Authorization: Bearer $TOKEN"

# Check risk state
curl http://localhost:8000/risk/state -H "Authorization: Bearer $TOKEN"
```
After any emergency halt, do a post-mortem before restarting. Replay the last hour of ticks and fills through the backtest engine and compare the signal values against what the live system produced. If they diverge, find the divergence before the next session begins.
References and further reading
The Avellaneda-Stoikov market making model is the standard starting point for any inventory-aware quoting strategy: Avellaneda, M. and Stoikov, S. (2008), "High-frequency trading in a limit order book", Quantitative Finance, 8(3), 217-224.
For the theoretical foundations of cointegration testing in pairs trading: Engle, R. and Granger, C. (1987), "Co-integration and error correction: representation, estimation, and testing", Econometrica, 55(2), 251-276.
The VPIN metric for order flow toxicity is from: Easley, D., Lopez de Prado, M. and O'Hara, M. (2012), "Flow toxicity and liquidity in a high-frequency world", Review of Financial Studies, 25(5), 1457-1493.
For a practitioner treatment of implementation shortfall and execution quality measurement: Almgren, R. and Chriss, N. (2001), "Optimal execution of portfolio transactions", Journal of Risk, 3(2), 5-39.
On the architecture principles behind Quant 2.0 stacks and the cost of training-serving skew in production systems, the broader ML engineering literature is the best source, particularly work on feature stores (Feast, Tecton) and the MLOps practices now standard at systematic funds such as Man Group and Two Sigma. Their public talks and research notes are the most candid primary sources available on what production quant infrastructure actually looks like.