Judge Engine

Judge Engine defined: the conservative evidence gate that reviews quant strategy candidates using the Deflated Sharpe Ratio (DSR), the Probability of Backtest Overfitting (PBO), walk-forward validation, costs, data lineage, and human approval.

The Judge Engine is Corrai's conservative promotion layer. It reviews a strategy candidate after the research workflow has produced evidence. Its job is not to create more variants or improve the backtest. Its job is to decide whether the candidate survives the required validation gates.

Why a Judge exists

Research systems are naturally optimistic. Researchers want ideas to work. AI agents can generate plausible explanations. Backtests can reward repeated searching. A Judge exists to counter that pressure.

The Judge is intentionally skeptical. It asks whether the candidate's evidence is strong enough after accounting for the known ways backtests mislead:

  • selection bias and multiple testing
  • look-ahead leakage
  • missing costs and same-bar fills
  • regime concentration
  • fragile parameters
  • incomplete data lineage
  • missing human review

For the broader failure modes, see Why Backtests Lie.

What the Judge reads

A Judge verdict should be based on an evidence package, not a summary chart. The package includes:

  • hypothesis and registered run id
  • full trial history
  • data source and point-in-time lineage
  • workflow graph and parameter set
  • cost and execution model
  • walk-forward and purged validation results
  • DSR and PBO diagnostics
  • risk, turnover, drawdown, and regime checks
  • reviewer notes and signoff status

If one of these pieces is missing, the Judge can block promotion because the evidence is incomplete.
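The completeness rule above can be sketched in a few lines of Python. This is only an illustration, not Corrai's actual schema: the `EvidencePackage` class, its field names, and the `missing_pieces` and `can_be_reviewed` helpers are hypothetical names chosen to mirror the list above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class EvidencePackage:
    """Hypothetical container; one field per item in the evidence list."""
    hypothesis: Optional[str] = None
    run_id: Optional[str] = None
    trial_history: Optional[list] = None
    data_lineage: Optional[dict] = None
    workflow_graph: Optional[dict] = None
    parameters: Optional[dict] = None
    cost_model: Optional[dict] = None
    validation_results: Optional[dict] = None
    dsr: Optional[float] = None
    pbo: Optional[float] = None
    risk_checks: Optional[dict] = None
    reviewer_signoff: Optional[str] = None


def missing_pieces(pkg: EvidencePackage) -> list:
    """Return the names of fields that are absent or empty."""
    missing = []
    for f in fields(pkg):
        value = getattr(pkg, f.name)
        # None means "never supplied"; an empty container or string means
        # "supplied but with no content". Numeric zero counts as present.
        is_empty = hasattr(value, "__len__") and len(value) == 0
        if value is None or is_empty:
            missing.append(f.name)
    return missing


def can_be_reviewed(pkg: EvidencePackage) -> bool:
    """An incomplete package is blocked before any statistic is examined."""
    return not missing_pieces(pkg)
```

In this sketch a package that carries a hypothesis and a run id but nothing else would be blocked, and `missing_pieces` would name each gap so the researcher knows what to supply.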

Verdicts are explanations

A useful Judge does not return only pass or fail. It also explains each blocking reason. Examples:

  • blocked because DSR is below threshold after adjusting for the registered trial count
  • blocked because walk-forward performance is concentrated in one window
  • blocked because the data uses revised values without availability timestamps
  • blocked because the strategy survives gross but fails after declared costs
  • blocked because human review is missing

These verdicts help agents and researchers decide what to do next. Sometimes the next action is to repair a data issue. Sometimes it is to reduce the search scope. Often it is to abandon the idea.
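One way to picture a verdict that carries its own explanation is the sketch below. It is a minimal illustration under stated assumptions: the `Verdict` class, the `review` function, the candidate dictionary keys, and the default thresholds (`min_dsr`, `max_window_share`) are all hypothetical, and using Sharpe ratios for the gross-versus-net check is an assumption rather than something the Judge is documented to do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    promoted: bool
    blocking_reasons: tuple  # empty when the candidate passes every gate


def review(candidate: dict, min_dsr: float = 0.95,
           max_window_share: float = 0.5) -> Verdict:
    """Collect every blocking reason instead of stopping at the first."""
    reasons = []

    dsr = candidate["dsr"]
    trials = candidate["trial_count"]
    if dsr < min_dsr:
        reasons.append(
            f"DSR {dsr:.2f} is below threshold {min_dsr:.2f} "
            f"after {trials} registered trials"
        )

    # Share of total positive walk-forward PnL earned in the best window.
    window_pnl = candidate["window_pnl"]
    positive_total = sum(p for p in window_pnl if p > 0)
    if positive_total > 0 and max(window_pnl) / positive_total > max_window_share:
        reasons.append("walk-forward performance is concentrated in one window")

    if not candidate["has_availability_timestamps"]:
        reasons.append("data uses revised values without availability timestamps")

    if candidate["gross_sharpe"] > 0 and candidate["net_sharpe"] <= 0:
        reasons.append("strategy survives gross but fails after declared costs")

    if not candidate["human_signoff"]:
        reasons.append("human review is missing")

    return Verdict(promoted=not reasons, blocking_reasons=tuple(reasons))
```

Returning all reasons at once, rather than only the first failure, is a design choice in this sketch: it lets a researcher see in one pass whether the candidate needs a data repair, a narrower search, or should be abandoned.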

Judge versus optimizer

The Judge is not an optimizer. If the Judge adjusts parameters until the result passes, it becomes part of the search and destroys its own independence. Corrai keeps generation and judgment separate: agents and Alpha Canvas produce candidates; the Judge reviews them.

This separation is central to evidence-based alpha validation. It lets the system support fast AI research without turning validation into another overfitting loop.
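The separation between generation and judgment can be made concrete by handing the Judge a read-only candidate. The sketch below is an assumption about how such a boundary might look, not a description of Corrai's internals: `Candidate`, `judge`, and the `min_dsr` threshold are hypothetical names.

```python
from dataclasses import dataclass, FrozenInstanceError
from types import MappingProxyType


@dataclass(frozen=True)
class Candidate:
    """Immutable snapshot produced by the research side."""
    run_id: str
    parameters: MappingProxyType  # read-only view of the parameter set
    dsr: float


def judge(candidate: Candidate, min_dsr: float = 0.95) -> bool:
    """Read the evidence and decide; never modify the candidate.

    A failing candidate is returned to the research workflow as-is.
    The Judge has no code path that retunes parameters and re-runs.
    """
    return candidate.dsr >= min_dsr


def try_to_tune(candidate: Candidate) -> str:
    """Show that in-place tuning is rejected by the data structure itself."""
    try:
        candidate.parameters["lookback"] = 5
    except TypeError:
        return "parameters are read-only"
    return "parameters were modified"
```

Because both the dataclass and the parameter mapping are read-only, any attempt by the judging code to adjust a parameter raises an error instead of silently turning the review into another round of search.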