The Data Intelligence Index is a comprehensive evaluation of frontier AI models on data-centric intelligence. As models and agents become more powerful, we need a systematic way to measure their performance across diverse data challenges, from database querying and SQL debugging to data science and beyond.
We assess frontier models across various aspects of data-centric intelligence, including DB querying, BI analysis, application debugging, human-centric interaction, digital, data science, and more. This provides a single view of both performance and cost efficiency. For methodology details on this index, see the blog page.
SQL Only: restrict the index to pure SQL benchmarks (BIRD-SQL, LiveSQLBench, BIRD-Critic, BIRD-Interact).
Include Vision: add BIRD-Vision and show only models with vision results.
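The two toggles can be read as a filter over benchmarks and models. The sketch below is only one plausible interpretation: the function name `apply_toggles`, the `results` layout, the placeholder benchmark "Other-Benchmark", and the placeholder models are all hypothetical, and how the page actually combines the two toggles is an assumption.

```python
SQL_BENCHMARKS = ["BIRD-SQL", "LiveSQLBench", "BIRD-Critic", "BIRD-Interact"]
VISION_BENCHMARK = "BIRD-Vision"


def apply_toggles(all_benchmarks, results, sql_only=False, include_vision=False):
    """Return (benchmarks, models) that remain after applying the toggles.

    results maps model name -> {benchmark name: score}. Assumed layout.
    """
    # Vision is excluded by default and only added back by its toggle.
    selected = [b for b in all_benchmarks if b != VISION_BENCHMARK]
    if sql_only:
        selected = [b for b in selected if b in SQL_BENCHMARKS]

    models = list(results)
    if include_vision:
        selected.append(VISION_BENCHMARK)
        # Keep only models that actually have a vision result.
        models = [m for m in models if VISION_BENCHMARK in results[m]]
    return selected, models


# Placeholder inputs: names and scores are illustrative, not real results.
all_benchmarks = SQL_BENCHMARKS + ["Other-Benchmark", VISION_BENCHMARK]
results = {
    "model-a": {"BIRD-SQL": 0.0, "BIRD-Vision": 0.0},
    "model-b": {"BIRD-SQL": 0.0},
}

sql_view = apply_toggles(all_benchmarks, results, sql_only=True)
vision_view = apply_toggles(all_benchmarks, results, include_vision=True)
```

With these placeholders, `sql_view` keeps the four SQL benchmarks and both models, while `vision_view` adds BIRD-Vision and keeps only the model that has a vision result.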
Data Intelligence Index
Evaluating frontier AI on data-centric intelligence across various aspects, including DB querying, BI analysis, application debugging, human-centric interaction, digital, data science, and more.
Model Profiles
Each axis is normalized to the top score in that dimension.
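As a rough illustration of that normalization, the sketch below divides each raw score by the best score on the same axis. The function name `normalize_axes` and the data layout are assumptions. The sample uses only scores quoted on this page, so the top score per axis here is the top within this small sample, not necessarily across every model in the chart.

```python
def normalize_axes(scores):
    """Scale each raw score by the top score on its axis.

    scores maps model name -> {aspect name: raw score in percent}.
    Models may be missing some aspects.
    """
    # Find the best raw score per aspect across all models.
    tops = {}
    for per_model in scores.values():
        for aspect, value in per_model.items():
            tops[aspect] = max(tops.get(aspect, 0.0), value)

    # Divide by the per-aspect top; skip aspects whose top is zero.
    return {
        model: {
            aspect: value / tops[aspect]
            for aspect, value in per_model.items()
            if tops[aspect] > 0
        }
        for model, per_model in scores.items()
    }


# Raw scores quoted in the Key Findings (partial sample).
raw = {
    "Kimi 2.5": {"Vision": 47.0},
    "Opus 4.6": {"Vision": 43.8, "Human-centric interaction": 28.1},
    "Qwen3 Coder": {"Human-centric interaction": 18.4},
}

normalized = normalize_axes(raw)
print(normalized["Kimi 2.5"]["Vision"])            # 1.0
print(round(normalized["Opus 4.6"]["Vision"], 3))  # 0.932
```

The axis leader always maps to 1.0, so the profile shapes show relative strengths rather than absolute accuracy.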
Score by Aspect
Representative score (%) per aspect · sorted by index
Cost vs Performance
Average cost per task vs Data Intelligence Index
Benchmark Details
Key Findings
Top model averages under 50%. SQL debugging peaks at 40.9%, human-centric interaction at just 28.8%.
Opus 4.6 wins overall, ranking 1st on DB querying, BI analysis, debugging, and code translation.
Kimi 2.5 leads Vision (47.0% vs 43.8% for Opus) at a fraction of the cost, giving it the best accuracy per dollar.
Human-centric interaction: even Opus 4.6 achieves just 28.1%, and Qwen3 Coder drops to 18.4%.
Opus dominates overall, but Kimi leads multi-modal, and Qwen3 matches Opus on DB querying at 1/17th the cost.
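The cost comparison in the last finding can be turned into a relative accuracy-per-dollar figure. The sketch below is a minimal illustration, not the page's own metric: the function name is hypothetical, "matches" is treated as exactly equal scores, and costs are expressed in relative units (1 for Qwen3, 17 for Opus) because absolute prices are not given here.

```python
def relative_accuracy_per_dollar(score_a, cost_a, score_b, cost_b):
    """Return how many times more accuracy per dollar model A delivers than model B."""
    if min(score_b, cost_a, cost_b) <= 0:
        raise ValueError("score_b and both costs must be positive")
    # (score_a / cost_a) / (score_b / cost_b), rearranged to avoid small divisors.
    return (score_a * cost_b) / (score_b * cost_a)


# Assumption: equal DB-querying scores, with Opus costing 17 units per 1 unit of Qwen3.
qwen3_vs_opus = relative_accuracy_per_dollar(
    score_a=1.0, cost_a=1.0,   # Qwen3 (relative units)
    score_b=1.0, cost_b=17.0,  # Opus (relative units)
)
print(qwen3_vs_opus)  # 17.0
```

Under these assumptions, equal accuracy at 1/17th the cost works out to 17 times the accuracy per dollar on DB querying.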