LitQA3
10 runs · 5 models · evaluated by HybridEvaluator.
| # | Model | Variant | Mode | Score | Avg. duration | Tokens | Date |
|---|---|---|---|---|---|---|---|
| 1 | gpt-5-2-pro | tools,high | inject | 0.851 | 3.4m | 5.4M | 2026-02-03 |
| 2 | gpt-5-2 | tools,high | inject | 0.815 | 1.6m | 5.4M | 2026-02-03 |
| 3 | claude-opus-4-6 | tools,high | inject | 0.756 | 1.1m | 31.2M | 2026-03-22 |
| 4 | gemini-3-pro-preview | tools,high | inject | 0.744 | 1.7m | 61.0k | 2026-02-03 |
| 5 | claude-opus-4-5 | tools,high | inject | 0.732 | 40.0s | 23.4M | 2026-03-22 |
| 6 | gpt-5-2 | — | inject | 0.196 | 4.6s | 25.3k | 2026-02-03 |
| 7 | claude-opus-4-6 | — | inject | 0.185 | 7.4s | 50.2k | 2026-03-20 |
| 8 | gemini-3-pro-preview | — | inject | 0.185 | 1.3m | 56.5k | 2026-02-03 |
| 9 | gpt-5-2-pro | — | inject | 0.179 | 1.4m | 180.2k | 2026-02-03 |
| 10 | claude-opus-4-5 | — | inject | 0.143 | 6.8s | 51.9k | 2026-03-20 |