Claude Opus 4.6 (Anthropic) currently leads the intelligence ranking.
The Intelligence ranking orders the published roster by raw Humanity's Last Exam (HLE) scores. For the benchmark policy and supporting-evidence rules, see the Methodology page.
Buyer-facing table
| Rank | Model | Maker | Raw HLE (Humanity's Last Exam) | GPQA | MathArena | Arc-AGI-2 |
|---|---|---|---|---|---|---|
| 01 | Claude Opus 4.6 | Anthropic | 62.7% | n/a | 66.2% | 68.8% |
| 02 | Qwen 3.6 Plus Preview | Qwen | 46.3% | n/a | 58.1% | n/a |
| 03 | Gemini 3.1 Pro | Google | 44.4% | 94.3% | 73.4% | 77.1% |
| 04 | GPT-5.4 (replaces GPT-5.2) | OpenAI | 41.6% | 92.0% | 78.7% | 73.3% |
| 05 | Gemini 3 Flash | Google | 33.7% | 90.4% | 60.3% | 33.6% |
| 06 | Claude Sonnet 4.6 | Anthropic | 33.2% | 89.9% | n/a | 58.3% |
| 07 | Kimi K2.5 | Moonshot AI | 29.4% | n/a | 55.7% | 12.1% |
| 08 | MiniMax M2.7 | MiniMax | 28.1% | n/a | n/a | n/a |
| 09 | DeepSeek V3.2 (Thinking) | DeepSeek | 22.0% | 85.0% | 51.5% | n/a |
| 10 | Grok 4.1 Fast | xAI | 20.0% | n/a | 49.9% | 16.0% |

Editorial investigation

Intelligence is the least interpreted surface on the site. The page keeps the raw HLE result front and center, then leaves methodology and evidence depth to the model page and the methodology page.

Top intelligence model: Claude Opus 4.6 (Anthropic). "Best if your work involves genuinely hard problems (deep research, complex code, or legal and financial analysis) where accuracy matters more than speed."
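The ranking rule described above (sort the roster by raw HLE alone; the other benchmarks are reported but not ranked) can be sketched in a few lines. This is a minimal illustration, not site code: the model names and scores are copied from the buyer-facing table, and `None` stands in for an "n/a" cell.

```python
# Sketch of the Intelligence ranking rule: order the roster by raw HLE,
# descending. Models without a published HLE score would sort last.
from typing import List, Optional, Tuple

# (model, raw HLE %) pairs taken from the buyer-facing table above.
roster: List[Tuple[str, Optional[float]]] = [
    ("Claude Opus 4.6", 62.7),
    ("Qwen 3.6 Plus Preview", 46.3),
    ("Gemini 3.1 Pro", 44.4),
    ("GPT-5.4", 41.6),
    ("Gemini 3 Flash", 33.7),
    ("Claude Sonnet 4.6", 33.2),
    ("Kimi K2.5", 29.4),
    ("MiniMax M2.7", 28.1),
    ("DeepSeek V3.2 (Thinking)", 22.0),
    ("Grok 4.1 Fast", 20.0),
]

def rank_by_hle(models: List[Tuple[str, Optional[float]]]):
    """Rank by raw HLE, highest first; None (no published score) sorts last."""
    return sorted(models, key=lambda m: (m[1] is None, -(m[1] or 0.0)))

ranked = rank_by_hle(roster)
```

Because the sort key ignores every column except raw HLE, a model can hold a lower rank while beating the leader on GPQA, MathArena, or Arc-AGI-2, exactly as rows 03 and 04 do in the table.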