TableQA2 (image)
tableqa2-img
10 runs · 5 models · evaluated by HybridEvaluator.
| # | Model | Variant | Mode | Score | Avg. duration | Tokens | Date |
|---|---|---|---|---|---|---|---|
| 1 | claude-opus-4-5 | tools,high | file | 0.950 | 10.0s | 888.2k | 2026-03-22 |
| 2 | gemini-3-pro-preview | tools,high | file | 0.950 | 22.2s | 124.7k | 2026-02-03 |
| 3 | gpt-5-2 | tools,high | file | 0.950 | 1.7m | 1.8M | 2026-02-03 |
| 4 | gpt-5-2-pro | — | file | 0.940 | 43.9s | 795.5k | 2026-02-03 |
| 5 | gpt-5-2-pro | tools,high | file | 0.940 | 1.0m | 1.4M | 2026-02-03 |
| 6 | claude-opus-4-6 | tools,high | file | 0.930 | 10.2s | 1.6M | 2026-03-23 |
| 7 | gemini-3-pro-preview | — | file | 0.930 | 21.4s | 121.5k | 2026-02-03 |
| 8 | gpt-5-2 | — | file | 0.930 | 5.7s | 801.5k | 2026-02-03 |
| 9 | claude-opus-4-5 | — | file | 0.920 | 5.6s | 375.1k | 2026-03-20 |
| 10 | claude-opus-4-6 | — | file | 0.910 | 6.1s | 382.0k | 2026-03-20 |