The Data Intelligence Index is a comprehensive evaluation of frontier AI models on data-centric intelligence. As models and agents become more powerful, we need a systematic way to measure their performance across diverse data challenges, from database querying and SQL debugging to data science and beyond.
We assess frontier models across various aspects of data-centric intelligence, including DB querying, BI analysis, application debugging, human-centric interaction, digital, data science, and more. This provides a single view of both performance and cost efficiency. For methodology details on this index, see the blog page.
- Curated human-verified suite: v0.3 uses higher-quality, more representative tasks verified by humans, reducing the previous 8k tasks to around 2k while keeping the core data-centric skill coverage.
- Base vs Agent: To evaluate raw model ability and agentic capability separately, we design two evaluation settings: Base measures direct single-step generation, while Agent measures looped CLI/tool use.
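The difference between the two settings can be illustrated with a minimal sketch. It is not the index's actual harness: the `model` and `tool` callables, the `FINAL:` stop marker, and the `max_steps` budget are all hypothetical names introduced here for illustration.

```python
from typing import Callable


def run_base(model: Callable[[str], str], task: str) -> str:
    """Base setting (sketch): one prompt in, one answer out, no tool access."""
    return model(task)


def run_agent(
    model: Callable[[str], str],
    tool: Callable[[str], str],
    task: str,
    max_steps: int = 5,  # hypothetical step budget
) -> str:
    """Agent setting (sketch): alternate model actions and tool observations."""
    transcript = task
    for _ in range(max_steps):
        action = model(transcript)
        # Hypothetical convention: the model marks its final answer with "FINAL:".
        if action.startswith("FINAL:"):
            return action[len("FINAL:"):].strip()
        observation = tool(action)
        # Feed the command and its output back to the model on the next step.
        transcript += f"\n$ {action}\n{observation}"
    return ""  # step budget exhausted without a final answer
```

In this sketch the Base path calls the model exactly once, while the Agent path lets the model inspect tool output before committing to an answer.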
- SQL Only: restrict the index to pure SQL benchmarks (Mini-Dev (multi-dialect), LiveSQLBench, BIRD-Critic, BIRD-Interact).
- Include Vision: add BIRD-Vision and show only models with vision results.
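The two view toggles can be read as filters over the result table. The sketch below is one possible interpretation, not the page's actual logic: the `select_view` function, the shape of `results`, and the `default_benchmarks` argument are assumptions, and combining both toggles is assumed to yield the SQL benchmarks plus BIRD-Vision.

```python
from typing import Dict, List, Optional, Tuple

# Benchmark names taken from the toggle descriptions above.
SQL_BENCHMARKS = ["Mini-Dev", "LiveSQLBench", "BIRD-Critic", "BIRD-Interact"]
VISION_BENCHMARK = "BIRD-Vision"

# Hypothetical layout: results[model][benchmark] = score, or None if missing.
Results = Dict[str, Dict[str, Optional[float]]]


def select_view(
    results: Results,
    default_benchmarks: List[str],
    sql_only: bool = False,
    include_vision: bool = False,
) -> Tuple[List[str], List[str]]:
    """Return the (benchmarks, models) displayed under the given toggles."""
    # Assumption: BIRD-Vision is hidden unless Include Vision is switched on.
    benchmarks = [b for b in default_benchmarks if b != VISION_BENCHMARK]
    if sql_only:
        benchmarks = [b for b in benchmarks if b in SQL_BENCHMARKS]
    models = list(results)
    if include_vision:
        benchmarks = benchmarks + [VISION_BENCHMARK]
        # Keep only models that actually have a vision result.
        models = [m for m in models if results[m].get(VISION_BENCHMARK) is not None]
    return benchmarks, models
```

Passing the default benchmark list in as an argument keeps the sketch from guessing which other benchmarks the full index contains.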
Data Intelligence Index v0.3
Evaluating frontier AI on data-centric intelligence across various aspects, including DB querying, BI analysis, application debugging, human-centric interaction, digital, data science, and more.
Model Profiles
Each axis is normalized to the top score in that benchmark. Missing results remain blank.
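The per-axis normalization can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `normalize_to_top` and the dictionary layout are hypothetical, and leaving an axis blank when its top score is zero is a choice made here to avoid division by zero, not something the page specifies.

```python
from typing import Dict, Optional

# Hypothetical layout: scores[model][benchmark] = raw score (%), or None if missing.
Scores = Dict[str, Dict[str, Optional[float]]]


def normalize_to_top(scores: Scores) -> Scores:
    """Scale each benchmark axis so that its top-scoring model maps to 1.0."""
    benchmarks = sorted({b for per_model in scores.values() for b in per_model})

    # Find the best available score on each axis, ignoring missing results.
    top: Dict[str, Optional[float]] = {}
    for b in benchmarks:
        present = [m[b] for m in scores.values() if m.get(b) is not None]
        top[b] = max(present) if present else None

    normalized: Scores = {}
    for model, per_model in scores.items():
        normalized[model] = {}
        for b in benchmarks:
            value = per_model.get(b)
            best = top[b]
            # Missing results stay blank (None); so does an axis whose top score is 0.
            normalized[model][b] = None if value is None or not best else value / best
    return normalized
```

Because every axis is scaled independently, the shape of a model's profile shows where it leads or trails relative to the best model on each benchmark, not its absolute score.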
Score by Aspect
Benchmark overall score (%) · sorted by score
Cost vs Performance
Average cost per task vs Data Intelligence Index
Benchmark Details
Key Findings
Key findings are currently under analysis. We are reviewing Base and Agent results, trajectory-level behavior, and benchmark-specific error patterns before publishing summarized conclusions.