gpt-5.1-2025-11-13
27 runs · 27 questions
Aggregate
| Search | Scored / Total | Mean Cramér | Med \|bias\| | Output tokens |
|---|---|---|---|---|
| off | 27 / 27 | 0.2679 | 0.350 | 216,144 |
Cramér-log distribution
One bar per Cramér-log bucket. Buckets match the heatmap color scale: tight (<0.05), clean (<0.2), productive (<0.7), different-interpretation (<2), suspect (≥2).
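The bucket thresholds above can be read as a simple lookup. The following is a minimal sketch, not the dashboard's actual implementation: the names `BAND_EDGES`, `BAND_NAMES`, and `cramer_log_band` are hypothetical, and it assumes each threshold is an exclusive upper bound, as the `<` and `≥` signs in the description suggest.

```python
from bisect import bisect_right

# Exclusive upper bounds for the first four bands, copied from the
# bucket description above. Anything at or above the last edge is "suspect".
BAND_EDGES = [0.05, 0.2, 0.7, 2.0]
BAND_NAMES = [
    "tight",
    "clean",
    "productive",
    "different-interpretation",
    "suspect",
]


def cramer_log_band(value: float) -> str:
    """Return the band name for a Cramér-log value (hypothetical helper).

    bisect_right counts how many edges are <= value, so a value exactly
    equal to an edge falls into the next band up, matching the strict
    "<" comparisons in the description.
    """
    return BAND_NAMES[bisect_right(BAND_EDGES, value)]


if __name__ == "__main__":
    for v in (0.01, 0.05, 0.5, 1.2, 2.0):
        print(v, cramer_log_band(v))
```

Keeping the edges in a sorted list means the band definitions live in one place, so a change to the color scale only needs the two lists updated.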
Per-question Cramér-log
One row per question. Cells colored by Cramér-log band.