Best AI Models for Product Managers

Frontier models ranked by the PM Index — our product-management-weighted blend of Artificial Analysis benchmarks — next to price and context window. Use the slider to re-rank by quality or value for your budget.

Updated June 26, 2026

Which AI model should I use?

Pick your task and budget for a data-backed recommendation.

PM task · Budget

Recommended for writing PRDs

Claude Opus 4.8

PM Index 55.9 · $10.00/1M tok

See the full ranking for Writing PRDs →

Quality vs price

Higher = smarter (PM Index), left = cheaper. Ringed models are the best value: no other model beats them on both quality and price. Hover a point, or a lab below.

Anthropic
OpenAI
Z AI
Google
Alibaba
DeepSeek
MiniMax
Kimi
Xiaomi
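
The "ringed" rule above reads like a Pareto frontier over PM Index and price. The sketch below is one way to compute it, assuming a model is ringed when no other model is at least as good on both axes and strictly better on one. The `Model` type and the function names are illustrative, not the site's actual code; the sample rows are copied from the ranking table on this page.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    pm_index: float  # higher is better
    price: float     # blended $ per 1M tokens, lower is better


def dominates(a: Model, b: Model) -> bool:
    """True if `a` is at least as good as `b` on both axes and strictly better on one."""
    no_worse = a.pm_index >= b.pm_index and a.price <= b.price
    strictly_better = a.pm_index > b.pm_index or a.price < b.price
    return no_worse and strictly_better


def value_frontier(models: List[Model]) -> List[Model]:
    """Keep only the models that no other model dominates, cheapest first."""
    frontier = [m for m in models if not any(dominates(o, m) for o in models)]
    return sorted(frontier, key=lambda m: m.price)


# A subset of rows from the ranking table (PM Index, $/1M tok).
SAMPLE = [
    Model("Claude Fable 5", 60.1, 20.00),
    Model("Claude Opus 4.8", 55.9, 10.00),
    Model("GPT-5.5", 54.8, 11.25),
    Model("GLM-5.2", 51.3, 2.15),
    Model("GPT-5.4", 51.3, 5.63),
    Model("DeepSeek V4 Pro", 44.2, 0.54),
    Model("MiniMax-M3", 44.0, 0.52),
    Model("DeepSeek V4 Flash", 40.1, 0.18),
    Model("Grok 4", 33.3, 11.00),
]

frontier_names = [m.name for m in value_frontier(SAMPLE)]
print(frontier_names)
```

Within this subset, GPT-5.5, GPT-5.4, and Grok 4 drop out because another listed model is both cheaper and at least as strong. The chart's actual ringed set may differ, since it is computed over every model rather than this sample.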

Movers · last 7 days

▲ Climbing

Solar Open 100B ▲2 · #68
Mistral Medium 3.1 ▲2 · #69
Magistral Medium 1.2 ▲1 · #63
Nova 2.0 Pro Preview ▲1 · #64
Kimi K2 ▲1 · #65

▼ Falling

NVIDIA Nemotron 3 Nano 30B A3B ▼8 · #76
Nova 2.0 Lite ▼4 · #67
GLM-5-Turbo ▼1 · #29
Kimi K2.5 ▼1 · #30
Gemini 3 Flash Preview ▼1 · #31

Optimize ranking for:

Best quality · Best value

#	Model and lab	PM Index	AA Intelligence	$/1M tok (blended)	Value
1	Muse Spark Meta	41.8	43.1	—	10.0
2	DeepSeek V4 Pro DeepSeek	44.2	44.3	$0.54	9.9
3	MiniMax-M3 MiniMax	44.0	44.4	$0.52	9.9
4	DeepSeek V4 Flash DeepSeek	40.1	40.3	$0.18	9.8
5	MiMo-V2.5 Xiaomi	40.1	40.1	$0.18	9.8
6	MiMo-V2.5-Pro Xiaomi	41.4	42.2	$0.54	9.7
7	GLM-5-Turbo Z AI	38.1	38.1	—	9.7
8	GLM-5.2 Z AI	51.3	51.1	$2.15	9.6
9	MiniMax-M2.7 MiniMax	37.1	38.1	$0.52	9.3
10	Qwen3.7 Plus Alibaba	37.1	39.0	$0.59	9.3
11	Hy3-preview Tencent	33.6	33.6	$0.20	9.2
12	MiMo-V2-Flash Xiaomi	33.2	33.2	$0.15	9.2
13	Qwen3.6 Plus Alibaba	38.7	39.6	$1.13	9.1
14	DeepSeek V3.2 DeepSeek	33.4	33.4	$0.34	9.1
15	MiMo-V2-Pro Xiaomi	40.3	40.3	$1.50	9.0
16	Kimi K2.6 Kimi	41.6	42.8	$1.71	9.0
17	Grok Build 0.1 0616 xAI	38.6	39.8	$1.25	9.0
18	Gemini 3 Flash Preview Google	37.8	37.8	$1.13	9.0
19	Kimi K2.5 Kimi	38.1	38.1	$1.19	9.0
20	MiniMax-M2.5 MiniMax	33.7	33.7	$0.52	9.0
21	GLM-5 Z AI	39.5	39.5	$1.55	8.9
22	Nemotron 3 Ultra 550B A55B NVIDIA	36.9	37.8	$1.18	8.9
23	Command A+ Cohere	29.3	29.3	$0.00	8.9
24	GPT-5.4 mini OpenAI	39.7	40.0	$1.69	8.9
25	Qwen3.6 27B Alibaba	36.7	37.1	$1.35	8.8
26	MiniMax-M2.1 MiniMax	31.4	31.4	$0.52	8.8
27	Gemini 3.5 Flash Google	49.6	50.2	$3.38	8.8
28	GLM-4.7 Z AI	33.8	33.8	$1.00	8.7
29	Step 3.7 Flash StepFun	28.8	29.7	$0.44	8.6
30	GLM-5.1 Z AI	39.7	40.2	$2.15	8.6
31	Kimi K2 Thinking Kimi	32.7	32.7	$1.07	8.6
32	Grok 4.3 xAI	35.4	37.6	$1.56	8.5
33	Step 3.5 Flash 2603 StepFun	26.0	26.0	$0.15	8.5
34	MiniMax-M2 MiniMax	28.3	28.3	$0.52	8.5
35	Step 3.5 Flash StepFun	25.5	25.5	$0.15	8.5
36	K-EXAONE LG AI Research	24.7	24.7	—	8.4
37	DeepSeek V3.2 Exp DeepSeek	25.4	25.4	$0.31	8.4
38	EXAONE 4.5 33B LG AI Research	23.0	23.0	—	8.3
39	ERNIE 5.0 Thinking Preview Baidu	21.9	21.9	—	8.2
40	Qwen3.6 Max Preview Alibaba	40.0	40.0	$2.92	8.2
41	Qwen3.7 Max Alibaba	44.9	46.0	$3.75	8.1
42	Nemotron Cascade 2 30B A3B NVIDIA	21.3	21.3	—	8.1
43	NVIDIA Nemotron 3 Super 120B A12B NVIDIA	23.3	25.4	$0.41	8.1
44	Mistral Small 4 Mistral	20.8	20.8	$0.26	8.0
45	Grok 4.20 0309 v2 xAI	37.0	37.0	$3.00	7.8
46	Sonar Reasoning Pro Perplexity	17.8	17.8	—	7.8
47	Grok 4.20 0309 xAI	36.5	36.5	$3.00	7.8
48	GPT-5.1 OpenAI	38.9	38.9	$3.44	7.8
49	Kimi K2 0905 Kimi	23.5	23.5	$1.07	7.7
50	GPT-5.4 OpenAI	51.3	51.4	$5.63	7.6
51	Gemini 3.1 Pro Preview Google	43.7	46.5	$4.50	7.6
52	Solar Open 100B Upstage	15.1	15.1	—	7.6
53	DeepSeek V3.1 Terminus DeepSeek	26.3	26.3	$1.92	7.5
54	Kimi K2 Kimi	19.4	19.4	$1.03	7.4
55	Solar Pro 2 (Preview) Upstage	12.5	12.5	—	7.3
56	Nova 2.0 Lite Amazon	17.3	20.5	$0.85	7.3
57	Solar Pro 3 Upstage	12.0	14.1	—	7.3
58	GPT-5.2 OpenAI	42.2	42.2	$4.81	7.3
59	Sonar Reasoning Perplexity	11.7	11.7	—	7.2
60	Llama Nemotron Super 49B v1.5 NVIDIA	12.4	12.4	$0.18	7.2
61	NVIDIA Nemotron 3 Nano 30B A3B NVIDIA	11.8	14.2	$0.10	7.2
62	Gemini 3 Pro Preview Google	39.6	39.6	$4.50	7.2
63	EXAONE 4.0 32B LG AI Research	10.6	10.6	—	7.1
64	Mistral Medium 3.5 Mistral	29.4	29.9	$3.00	7.1
65	Mistral Medium 3.1 Mistral	14.8	14.8	$0.80	7.1
66	Mistral Large 3 Mistral	14.2	15.9	$0.75	7.1
67	Claude Sonnet 4.6 Anthropic	47.5	47.2	$6.00	7.0
68	Sonar Perplexity	9.5	9.5	—	7.0
69	Sonar Pro Perplexity	9.3	9.3	—	7.0
70	Llama 4 Maverick Meta	11.9	14.3	$0.47	7.0
71	Solar Pro 2 Upstage	9.0	9.0	—	7.0
72	Granite 4.1 30B IBM	7.5	8.9	—	6.9
73	Granite 4.1 8B IBM	7.1	6.7	$0.06	6.8
74	Nova Lite Amazon	6.9	6.9	$0.10	6.8
75	Llama 4 Scout Meta	8.0	10.0	$0.29	6.8
76	R1 1776 Perplexity	6.3	6.3	—	6.8
77	ERNIE 4.5 300B A47B Baidu	9.0	9.0	$0.49	6.8
78	Solar Mini Upstage	6.2	6.2	$0.15	6.7
79	Gemini 2.5 Pro Google	27.0	27.0	$3.44	6.7
80	Llama 3.3 Instruct 70B Meta	8.6	8.6	$0.61	6.6
81	Granite 4.0 H Small IBM	5.2	5.2	$0.11	6.6
82	Phi-3 Mini Instruct 3.8B Microsoft	4.6	4.6	—	6.6
83	Jamba Reasoning 3B AI21 Labs	4.1	4.1	—	6.6
84	Phi-4 Microsoft	4.9	4.9	$0.22	6.5
85	Phi-4 Mini Instruct Microsoft	3.0	3.0	$0.00	6.5
86	Granite 4.1 3B IBM	3.2	3.2	—	6.5
87	Magistral Medium 1.2 Mistral	20.1	20.1	$2.75	6.4
88	Exaone 4.0 1.2B LG AI Research	2.9	2.9	—	6.4
89	Granite 4.0 H 1B IBM	2.7	2.7	—	6.4
90	Jamba 1.5 Mini AI21 Labs	2.7	2.7	$0.25	6.3
91	Tiny Aya Global Cohere	1.0	1.0	$0.00	6.3
92	Nova Pro Amazon	7.7	7.7	$1.40	6.1
93	Nova 2.0 Pro Preview Amazon	20.1	21.8	$3.44	6.0
94	Claude Opus 4.8 Anthropic	55.9	55.7	$10.00	5.5
95	Claude Opus 4.7 Anthropic	53.7	53.5	$10.00	5.3
96	Llama 3.1 Instruct 405B Meta	8.5	8.5	$3.69	4.8
97	Jamba 1.7 Large AI21 Labs	5.3	5.3	$3.50	4.6
98	GPT-5.5 OpenAI	54.8	54.8	$11.25	4.6
99	Jamba 1.5 Large AI21 Labs	5.1	5.1	$3.50	4.6
100	Jamba 1.6 Large AI21 Labs	5.0	5.0	$3.50	4.6
101	Nova Premier Amazon	12.7	12.7	$5.00	4.4
102	Claude Opus 4.6 Anthropic	43.7	43.7	$10.00	4.4
103	Command-R+ Cohere	3.0	3.0	$6.00	3.0
104	Grok 4 xAI	33.3	33.3	$11.00	2.8
105	Claude Fable 5 Anthropic	60.1	59.9	$20.00	0.0

Best AI model by PM use case

The right model depends on the job. These rankings reweight the benchmarks for specific product-management tasks.

How the PM Index works

Generic AI leaderboards optimize for coding or competitive math. The PM Index reweights Artificial Analysis (AA) benchmarks for the work product managers actually do, emphasizing general reasoning and multi-step competence and de-emphasizing pure coding. It sits on the same 0–100 scale as the underlying indices, so you can compare the two directly. The price shown is the AA-convention blended rate per 1M tokens, which weights input and output prices 3:1.
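
As a rough illustration of the two calculations described above, the sketch below shows a 3:1 blended price and a normalized weighted blend that stays on the 0–100 scale. The function names, the benchmark labels (`reasoning`, `coding`), the weights, and the input/output prices are all hypothetical; the page does not publish the actual PM Index weights or per-direction prices.

```python
from typing import Dict


def blended_price(input_price: float, output_price: float) -> float:
    """Blend per-1M-token prices at 3 parts input to 1 part output."""
    return (3 * input_price + output_price) / 4


def weighted_index(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of benchmark scores.

    Dividing by the total weight keeps the result on the same scale
    as the inputs (0-100 here), whatever the weights sum to.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical prices: $2.00 per 1M input tokens, $10.00 per 1M output tokens.
example_price = blended_price(2.00, 10.00)

# Hypothetical scores and weights: reasoning counts three times as much as coding.
example_index = weighted_index(
    scores={"reasoning": 60.0, "coding": 40.0},
    weights={"reasoning": 3.0, "coding": 1.0},
)

print(example_price, example_index)
```

With these made-up inputs the blended price is $4.00 per 1M tokens and the index is 55.0. Raising the weight on `reasoning` pulls the index toward the reasoning score without changing the scale.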
