Sample size note: Most middle-range bins contain fewer than 50 markets (shown faded with *). Larger dots indicate more data. The calibration curve is most reliable at the extremes (0-5% and 95-100%).
Highest-Volume Resolved Markets
| Market | Final Prob. | Result | Correct? | Category | Volume ($M) |
Methodology
This tracker measures calibration: whether markets that trade at X% actually resolve YES X% of the time.
- Data source: Polymarket's Gamma API for resolved events + CLOB orderbook for historical price snapshots (best bid/ask midpoint).
- Markets analyzed: Binary (YES/NO) markets with clear resolution. Multi-outcome markets (e.g., "Who will win the election?") are included as individual binary contracts, which means the 0-5% bin is heavily populated with losing candidates.
- Price snapshots: The YES token best-bid/ask midpoint price at 24 hours, 7 days, and 30 days before market close.
- Binning: Markets grouped into probability buckets (0-5%, 5-15%, 15-25%, ..., 95-100%). Dot size on the chart is proportional to sample size. Faded bins with * have fewer than 50 markets.
- Brier score: Mean squared error between the forecast probability and the outcome (1 for YES, 0 for NO). 0 = perfect. 0.25 = coin-flip baseline (the score of always forecasting 50%). 0.02 = weather-forecast-grade accuracy. Lower is better.
- Perfect calibration = the dashed diagonal. Points above the line = the event resolved YES more often than the price implied (market underpriced the event). Below = overpriced.
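The binning and Brier score steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tracker's actual code: the function names (`midpoint`, `bin_index`, `brier_score`, `calibration_table`), the choice to put a price that sits exactly on a boundary into the higher bin, and the input format (parallel lists of forecast probabilities and 0/1 outcomes) are all assumptions made for the example.

```python
from collections import defaultdict

# Bin edges taken from the binning description: 0-5%, 5-15%, ..., 85-95%, 95-100%.
BIN_EDGES = [0.0, 0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95, 1.0]


def midpoint(best_bid, best_ask):
    """Price snapshot as the midpoint of the best bid and best ask."""
    return (best_bid + best_ask) / 2


def bin_index(p):
    """Index of the probability bucket for price p.

    Assumption: a price exactly on a boundary goes to the higher bin,
    and 1.0 goes to the last bin.
    """
    for i in range(len(BIN_EDGES) - 1):
        if p < BIN_EDGES[i + 1]:
            return i
    return len(BIN_EDGES) - 2


def brier_score(forecasts, outcomes):
    """Mean squared error between forecasts (0..1) and outcomes (1 = YES, 0 = NO)."""
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes, min_n=50):
    """Per-bin sample size, mean forecast, and observed YES rate.

    Bins with fewer than min_n markets are flagged as low_sample,
    mirroring the faded * bins on the chart.
    """
    groups = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        groups[bin_index(p)].append((p, y))
    rows = []
    for i in sorted(groups):
        pairs = groups[i]
        n = len(pairs)
        rows.append({
            "bin": (BIN_EDGES[i], BIN_EDGES[i + 1]),
            "n": n,
            "mean_forecast": sum(p for p, _ in pairs) / n,
            "observed_rate": sum(y for _, y in pairs) / n,
            "low_sample": n < min_n,
        })
    return rows
```

A bin is well calibrated when its `observed_rate` is close to its `mean_forecast`; plotting one against the other gives the points that are compared with the dashed diagonal.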
Important caveat: ~75% of price-tracked markets fall in the 0-5% bin (multi-outcome market losers). Middle bins (20-80%) contain fewer than 35 markets each, so the calibration curve in this range has wide confidence intervals. The overall Brier score is dominated by extreme-probability bins. Interpret the middle of the curve with caution.
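To give a sense of how wide those intervals are, the sketch below computes a 95% Wilson score interval for the observed YES rate in a single bin. The page does not say which interval method the tracker uses, so the Wilson interval is an assumption chosen for illustration, as are the function name `wilson_interval` and the example counts.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%).

    Illustrative only: the tracker's own interval method is not documented.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical middle bin: 35 markets, 17 resolved YES.
low, high = wilson_interval(17, 35)
print(f"observed rate {17 / 35:.2f}, 95% interval {low:.2f} to {high:.2f}")
```

With these hypothetical counts the interval is roughly 30 percentage points wide, so a middle-bin point can sit well off the diagonal without being evidence of miscalibration.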
Other limitations: Price history limited to top 2,000 markets by volume. Cancelled or ambiguously resolved markets excluded. Categories inferred from question text via keyword matching.