Live — The Wire Signal board
See today’s sharpest Wire Signal divergences across every active Polymarket contract on /signals/ — our live scoreboard of YES VALUE, NO VALUE, and leaning markets, updated on every publish.
Receipts — the scoreboard
Curious how Wire has actually done on resolved markets? /receipts/ scores every case-study market: what the Wire Signal called at close, and what resolved. No cherry-picking.
The Wire Score is PredictWire’s calibration-adjusted probability and confidence grade for any prediction market. It takes a market’s raw price — the number Polymarket or Kalshi shows you — and shifts it toward what the actual resolution history says markets at that price should be worth, now with a separate calibration curve for every major market category. Then it labels that estimate with a letter grade from A to D that tells you how much of the number to trust.
As of April 21, 2026, the Wire Score is v2. The live version uses per-category calibration; the v1 global-curve results remain below in version history for transparency.
Every market page on PredictWire ships with a Wire Score. Every Daily Brief is a ranking of Wire Score movement. The rest of this page explains exactly what the number means, how it is built, and how to read it.
What you see on a market page
| Field | Example | What it means |
|---|---|---|
| Point | 64.3% | The calibrated probability — our best guess of the true Yes chance given market price, horizon, liquidity, and category. |
| Band | 60.5 – 68.1% | The confidence interval. Wider band = less certainty about the point. |
| Grade | B | Letter grade derived from band width. A = trust it. D = noise. |
| Diff | −3.7 pp | How far our calibrated Point sits below (or above) the raw market price. |
| Category used | politics | Which calibration curve was applied. “global” if the market’s category is thin or unknown. |
How to read the Grade
| Grade | Band width | Brier (v2) | What it means in plain English |
|---|---|---|---|
| A | ≤ 3% | 0.017 | Tight band, high-volume, in the calibration sweet spot. Treat the Point as reliable. |
| B | ≤ 6% | 0.137 | Reasonable fit. The direction is probably right; the exact number may wiggle a few points. |
| C | ≤ 12% | 0.223 | Thin market or ugly horizon. Useful as a signal of disagreement, not a forecast. |
| D | > 12% | — | Effectively noise. A good “this market needs more volume” warning label. |
The formula (v2)
```text
category = categorize(question)   # or caller-provided
p_cal    = f_cal[category](p)     # per-category isotonic curve, falls back to global
width    = √( p_cal · (1 − p_cal) / n[category] ) · h(days) · v(volume)
grade    = A if width ≤ 3%
      else B if width ≤ 6%
      else C if width ≤ 12%
      else D
```
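A minimal Python sketch of the grading step, assuming the calibration has already been applied. The function name `wire_grade` is illustrative, and `h` and `v` are passed in as callables matching the multipliers described below; the thresholds mirror the grade table above.

```python
import math

# Sketch of the v2 grading step. f_cal is applied upstream, so this
# function takes the already-calibrated probability p_cal. h and v are
# the horizon and liquidity multipliers; n_cat is the per-category
# decile count used as the band-width denominator.

def wire_grade(p_cal: float, n_cat: int, days: float, volume: float, h, v):
    """Return (band width, letter grade) for one market snapshot."""
    width = math.sqrt(p_cal * (1 - p_cal) / n_cat) * h(days) * v(volume)
    if width <= 0.03:
        return width, "A"
    if width <= 0.06:
        return width, "B"
    if width <= 0.12:
        return width, "C"
    return width, "D"
```

With toy inputs (unit multipliers, `n_cat = 500`, `p_cal = 0.643`) this yields a band of roughly ±2.1 pp and a Grade A; shrink `n_cat` or raise the multipliers and the band widens toward C and D.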
f_cal[category]. Five separately-fit calibration curves, one each for politics (457 markets), sports (269), other (792), crypto (182), and geopolitics (143). Each is a monotone piecewise-linear function fit via weighted isotonic regression on its category’s (price, outcome) pairs, anchored at (0,0) and (1,1). Categories with fewer than 100 resolved markets (e.g. economy at 68) fall back to a global curve identical to v1’s.
n[category]. Market-level closing-price decile count, per category. The denominator for band width now reflects how many politics markets (or sports, etc.) have historically closed in each decile — a tighter match to the question "how often did markets like this one resolve Yes?"
h(days), v(volume). Unchanged from v1 — U-shaped horizon multiplier anchored at 1.0 in the 90–180-day sweet spot; liquidity multiplier from 1.0× at ≥$1M volume up to 2.5× below $100k.
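The two multipliers can be sketched as below. The anchor points (1.0× in the 90–180-day sweet spot; 1.0× at ≥$1M volume, 2.5× below $100k) come from the description above; the interpolation between the anchors is an assumption for illustration, not the published shape.

```python
def h(days: float) -> float:
    """U-shaped horizon multiplier, anchored at 1.0x in the 90-180-day sweet spot."""
    if 90 <= days <= 180:
        return 1.0
    # Widen the band as the horizon leaves the sweet spot (assumed slope, capped).
    return 1.0 + min(abs(days - 135) / 365, 0.5)

def v(volume: float) -> float:
    """Liquidity multiplier: 1.0x at >= $1M volume, up to 2.5x below $100k."""
    if volume >= 1_000_000:
        return 1.0
    if volume < 100_000:
        return 2.5
    # Linear interpolation between the two anchors (assumed).
    frac = (1_000_000 - volume) / 900_000
    return 1.0 + 1.5 * frac
```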
Backtest results (v2)
We ran v2 against the same 194,111-snapshot archive, alongside v1 and raw prices. v2 beats v1 overall and in every category that had its own curve fit. Gates: (1) v2 ≥ 1% better than v1, (2) v2 ≥ v1 per-category, (3) grade Brier monotone A→C, (4) no decile worse than raw. All four passed.
Overall
| Metric | Raw price | v1 | v2 | v2 vs raw | v2 vs v1 |
|---|---|---|---|---|---|
| Brier score | 0.0769 | 0.0753 | 0.0731 | +4.86% | +2.85% |
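For readers unfamiliar with the metric: the Brier score is the mean squared error between a probability forecast and the 0/1 outcome (lower is better), and the percentage columns are relative improvements. A toy sketch, with four snapshots standing in for the 194,111-snapshot archive:

```python
import numpy as np

def brier(p: np.ndarray, y: np.ndarray) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return float(np.mean((p - y) ** 2))

# Toy forecasts: raw market prices vs. (hypothetical) calibrated points.
p_raw = np.array([0.90, 0.20, 0.70, 0.40])
p_v2  = np.array([0.85, 0.15, 0.75, 0.35])
y     = np.array([1, 0, 1, 0])

# "v2 vs raw" in the table is this relative improvement (positive = v2 better).
improvement = (brier(p_raw, y) - brier(p_v2, y)) / brier(p_raw, y)
```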
Per category
| Category | Snapshots | Raw | v1 | v2 | Δv2 vs v1 | Fit |
|---|---|---|---|---|---|---|
| politics | 66,487 | 0.0839 | 0.0815 | 0.0809 | +0.79% | ✓ |
| other | 52,621 | 0.0830 | 0.0828 | 0.0815 | +1.59% | ✓ |
| sports | 43,736 | 0.0272 | 0.0270 | 0.0260 | +3.47% | ✓ |
| crypto | 12,637 | 0.0970 | 0.0932 | 0.0749 | +19.65% | ✓ |
| economy | 9,768 | 0.0871 | 0.0933 | 0.0933 | 0.00% | fallback |
| geopolitics | 7,028 | 0.1634 | 0.1597 | 0.1551 | +2.85% | ✓ |
Crypto is the standout — the per-category curve picks up a 19.6% Brier improvement over the global v1 curve. Crypto markets have a noticeably different calibration profile (base-rate yes ≈ 22%, distinct from politics at ~18% or sports at ~8%), so a dedicated curve pays off immediately.
Per grade (v2 grades)
| Grade | Snapshots | Raw | v1 | v2 |
|---|---|---|---|---|
| A | 131,831 | 0.0173 | 0.0174 | 0.0171 |
| B | 22,437 | 0.1426 | 0.1428 | 0.1364 |
| C | 39,843 | 0.2370 | 0.2288 | 0.2228 |
| D | 0 (none in archive) | — | — | — |
The 13× Brier gap between Grade A and Grade C is the whole point of the letter grade — it tells readers where the information is.
Per decile of raw price
| Predicted bucket | Snapshots | Raw | v1 | v2 | Δv2 vs raw |
|---|---|---|---|---|---|
| 0-10% | 134,100 | 0.0212 | 0.0211 | 0.0210 | +1.21% |
| 10-20% | 16,495 | 0.1467 | 0.1439 | 0.1401 | +4.50% |
| 20-30% | 9,868 | 0.2149 | 0.2111 | 0.2040 | +5.07% |
| 30-40% | 7,520 | 0.2319 | 0.2306 | 0.2260 | +2.56% |
| 40-50% | 7,034 | 0.2422 | 0.2379 | 0.2325 | +4.01% |
| 50-60% | 5,956 | 0.2540 | 0.2441 | 0.2376 | +6.43% |
| 60-70% | 4,259 | 0.2435 | 0.2407 | 0.2235 | +8.22% |
| 70-80% | 3,227 | 0.2631 | 0.2405 | 0.2205 | +16.18% |
| 80-90% | 3,015 | 0.1911 | 0.1824 | 0.1792 | +6.22% |
| 90-100% | 2,637 | 0.0420 | 0.0404 | 0.0400 | +4.69% |
Every decile improved vs raw, in v2 as in v1. The biggest v2 win is at 70–80% (+16.2% vs raw), where prediction markets were systematically under-confident and the per-category curves tighten the correction.
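The per-decile comparison above boils down to bucketing snapshots by raw price and scoring each bucket separately. A sketch with toy data (function name `decile_brier` is illustrative):

```python
import numpy as np

def decile_brier(p_raw, p_cal, y):
    """Per-decile (raw Brier, calibrated Brier) pairs, keyed by decile index."""
    buckets = np.clip((p_raw * 10).astype(int), 0, 9)  # 0 => 0-10%, ..., 9 => 90-100%
    out = {}
    for d in np.unique(buckets):
        m = buckets == d
        out[int(d)] = (float(np.mean((p_raw[m] - y[m]) ** 2)),   # raw Brier
                       float(np.mean((p_cal[m] - y[m]) ** 2)))   # calibrated Brier
    return out

# Toy snapshots: raw prices, calibrated points, resolved outcomes.
p_raw = np.array([0.05, 0.55, 0.78, 0.95])
p_cal = np.array([0.04, 0.50, 0.82, 0.96])
y     = np.array([0, 0, 1, 1])
by_decile = decile_brier(p_raw, p_cal, y)
```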
Honest framing
The Point is a modest edge. A 4.9% Brier improvement over raw prices on a liquid market is real but small. The Wire Score is a statistical shrinkage toward the historical fit of prices to outcomes within a category; it is not a trading signal.
The Grade is the real product. No other site tells you when to ignore a prediction market price. A Grade A on a high-volume political contract is a genuine endorsement; a Grade C on a low-volume sports novelty is a genuine warning. That is the service.
The category tag is new. Every Wire Score now reports which calibration curve it used. If you see category_used: global, that means either the market’s category couldn’t be identified or the category had too few resolved markets to justify its own curve. The raw-price → calibrated mapping is still strictly better than doing nothing, but it uses the all-categories-pooled shape rather than a specialist one.
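The fallback rule described above is simple enough to sketch directly. The 100-resolved-market threshold comes from the methodology section; the function and variable names here are illustrative.

```python
MIN_RESOLVED = 100  # threshold below which a category falls back to the global curve

def pick_curve(category, curves, resolved_counts):
    """Return the curve key to use: the category's own, or 'global'."""
    if category in curves and resolved_counts.get(category, 0) >= MIN_RESOLVED:
        return category
    return "global"

curves = {"politics": object(), "economy": object()}
resolved_counts = {"politics": 457, "economy": 68}
```

So `politics` (457 resolved markets) gets its specialist curve, while `economy` (68) and any unrecognized category report `category_used: global`.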
What the Wire Score does not do
- It is not a trading signal. Any edge you could extract from a ~5% Brier improvement is eaten by Polymarket’s fees and spreads many times over.
- It does not know about news. The current v2 is a function only of price, volume, horizon, and category. If a market mispriced breaking news, the Wire Score will not flag it — our Daily Brief catches that kind of divergence.
- It is still Polymarket-only. Cross-platform (Kalshi, PredictIt) support is on the v3 roadmap.
Reproducibility
Every number on this page is reproducible from the PredictWire Calibration Archive. The five per-category calibration curves, the backtest, and the formula itself are all open-source Python. Critique with a reproducible counter-example, not a press release.
Version history
- v2 — April 21, 2026: Per-category calibration. Five separate isotonic curves for politics, sports, other, crypto, geopolitics; global-curve fallback for thin categories. Backtest: Brier +4.86% vs raw, +2.85% vs v1. Crypto improved 19.6% over v1. All four gates passed.
- v1 — April 20, 2026: Initial release. Single global calibration curve fit on 194,111 archive snapshots. Brier +2.07% vs raw. Grade Brier monotone A→C. All deciles improved.
- v3 (roadmap): Cross-platform divergence (Polymarket vs Kalshi), order-flow imbalance, time-decay weighting. Planned with the next archive refresh.
See it in action
- The PredictWire Calibration Archive — the 1,914-market dataset the Wire Score is fit on.
- Best Prediction Markets — current platform rankings.
- Rankings Methodology — how we rate platforms themselves.
About this page: Written and reviewed by The PredictWire Research Team under our Editorial Standards. The Wire Score formula and per-category calibration fits are open and reproducible. Corrections and methodological critique: corrections@predictwire.io.