Capability
Reasoning, coding, multimodal support, tool use, and large-context behavior are weighted highest for premium model selection.
Trust layer
Scores are editorial decision indices, not official provider benchmarks. The goal is practical model selection: which model is most likely to work well for a specific job, budget, and deployment constraint.
Formula
Reasoning, coding, multimodal support, tool use, and large-context behavior are weighted highest for premium model selection.
Speed and value matter more for high-throughput products, routing layers, and user-facing chat interfaces.
Open weights, local deployability, and enterprise control improve the control score even when raw hosted-model intelligence is lower.
Provider docs, public leaderboard references, pricing pages, and practical engineering fit are combined with clear editorial labeling.