LLM Foundation Models

Compare performance, pricing, and features of leading Large Language Models

Model & Creator	Context Window (tokens)	Quality Index	Price (USD/1M tokens)	Tokens/s	Latency (s)
o1-preview OpenAI	128k	86	$27.56	147.5	22.89
o1-mini OpenAI	128k	84	$5.25	225.0	10.21
Gemini 2.0 Flash (exp) Google	2000k	82	$0.00	168.7	0.53
DeepSeek V3 DeepSeek	128k	80	$0.48	89.0	1.00
Gemini 1.5 Pro (Sep) Google	2000k	80	$2.19	60.1	0.82
Claude 3.5 Sonnet (Oct) Anthropic	200k	80	$6.00	67.2	1.02
GPT-4o (May '24) OpenAI	128k	78	$7.50	102.7	0.63
GPT-4o (Aug '24) OpenAI	128k	78	$4.38	90.5	0.63
Qwen2.5 72B Alibaba	131k	77	$0.40	67.0	0.58
Claude 3.5 Sonnet (June) Anthropic	200k	76	$6.00	58.0	0.93
Nova Pro Amazon	300k	75	$1.40	93.1	0.39
GPT-4 Turbo OpenAI	128k	75	$15.00	38.9	1.21
Mistral Large 2 (Jul '24) Mistral	128k	74	$3.00	33.4	0.50
Pixtral Large Mistral	128k	74	$3.00	37.8	0.39
Llama 3.1 405B Meta	128k	74	$3.50	29.8	0.73
Llama 3.3 70B Meta	128k	74	$0.67	72.8	0.48
GPT-4o (Nov '24) OpenAI	128k	73	$4.38	119.9	0.33
GPT-4o mini OpenAI	128k	73	$0.26	113.4	0.62
Gemini 1.5 Flash (Sep) Google	1000k	72	$0.13	186.6	0.41
Claude 3 Opus Anthropic	200k	70	$30.00	25.7	2.01
Llama 3.2 90B (Vision) Meta	128k	68	$0.81	47.6	0.34
Llama 3.1 70B Meta	128k	68	$0.72	72.7	0.46
Claude 3.5 Haiku Anthropic	200k	68	$1.60	64.5	0.72
Yi-Large 01.AI	32k	61	$3.00	66.5	0.44
Gemma 2 27B Google	8k	61	$0.26	48.1	0.75
Claude 3 Sonnet Anthropic	200k	57	$6.00	66.4	0.74
Command-R+ Cohere	128k	55	$5.19	50.2	0.48
Gemma 2 9B Google	8k	55	$0.12	170.0	0.41
Claude 3 Haiku Anthropic	200k	55	$0.50	123.1	0.55
Llama 3.1 8B Meta	128k	54	$0.10	182.9	0.35
Llama 3.2 11B (Vision) Meta	128k	54	$0.18	132.1	0.30
Llama 3.2 3B Meta	128k	49	$0.06	249.6	0.38
Llama 3 70B Meta	8k	47	$0.89	49.9	0.40
Llama 3 8B Meta	8k	45	$0.15	122.9	0.34
Llama 3.2 1B Meta	128k	26	$0.04	468.4	0.38
GPT-4 OpenAI	8k	-	$37.50	29.0	0.66
Llama 2 Chat 7B Meta	4k	-	$0.33	123.9	0.38
Gemini 1.0 Pro Google	33k	-	$0.75	102.8	1.29

Methodology

Higher-quality models are typically more expensive, but they do not all follow the same price-quality curve: models with the same Quality Index can differ in price by an order of magnitude.
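
One way to see the spread in the price-quality relationship is to compute Quality Index points per dollar for a few rows of the table. The sketch below copies five rows from the table above. The `quality_per_dollar` helper and the metric itself are illustrative assumptions, not part of the published methodology.

```python
# (model, Quality Index, blended price in USD per 1M tokens), copied from the table above.
rows = [
    ("DeepSeek V3", 80, 0.48),
    ("Gemini 1.5 Pro (Sep)", 80, 2.19),
    ("Claude 3.5 Sonnet (Oct)", 80, 6.00),
    ("GPT-4 Turbo", 75, 15.00),
    ("Claude 3 Opus", 70, 30.00),
]


def quality_per_dollar(quality, price):
    """Quality Index points per USD spent on 1M tokens (hypothetical metric)."""
    return quality / price


# Sort from most to least quality per dollar.
ranked = sorted(rows, key=lambda r: quality_per_dollar(r[1], r[2]), reverse=True)

for name, quality, price in ranked:
    ratio = quality_per_dollar(quality, price)
    print(f"{name:<26} quality={quality}  price=${price:.2f}  points/$={ratio:.1f}")
```

The three models with a Quality Index of 80 land at very different points-per-dollar values, which is what the statement about differing curves describes. A row priced at $0.00, such as Gemini 2.0 Flash (exp), would need separate handling to avoid division by zero.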

Quality Index

The average result across our evaluations, which cover different dimensions of model intelligence. The index currently includes MMLU, GPQA, Math, and HumanEval. Figures for OpenAI o1 models are preliminary.
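
As a rough sketch of how such an index could be computed, the example below takes an unweighted mean of four benchmark scores. The equal weighting, the 0-100 scale, the `quality_index` name, and the scores themselves are assumptions made for illustration; the source only says the index is an average.

```python
from statistics import mean


def quality_index(scores):
    """Unweighted mean of per-benchmark scores (equal weights are an assumption)."""
    return mean(scores.values())


# Hypothetical scores for an imaginary model; not taken from any real evaluation.
example_scores = {"MMLU": 80.0, "GPQA": 50.0, "Math": 70.0, "HumanEval": 88.0}

index = quality_index(example_scores)
print(f"Quality Index: {index:.0f}")
```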

Price

Price per token, expressed in USD per million tokens. The price is a blend of input and output token prices, weighted 3:1.
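
The blend can be sketched as a weighted average. The example assumes the 3:1 ratio means three parts input to one part output, following the order in which the two prices are named. The `blended_price` function and the input and output prices passed to it are illustrative and do not come from the table.

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average of input and output prices, in USD per 1M tokens.

    The default 3:1 input-to-output weighting is an assumed reading of the ratio.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Illustrative prices: $2.50 per 1M input tokens, $10.00 per 1M output tokens.
price = blended_price(2.50, 10.00)
print(f"Blended price: ${price:.3f} per 1M tokens")
```

Because input tokens carry three quarters of the weight, a model with cheap input and expensive output has a blended price much closer to its input price.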

Median across providers

Figures represent the median (P50) across all providers that support the model.
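
A minimal sketch of the aggregation, assuming each provider reports a single figure per model. The provider names and prices are hypothetical.

```python
from statistics import median

# Hypothetical blended prices (USD per 1M tokens) for one model on three providers.
provider_prices = {
    "provider_a": 0.60,
    "provider_b": 0.72,
    "provider_c": 0.90,
}

# The median (P50) is the middle value once the figures are sorted.
p50_price = median(provider_prices.values())
print(f"Median price: ${p50_price:.2f} per 1M tokens")
```

With an even number of providers, `statistics.median` returns the mean of the two middle values. The same approach would apply to the Tokens/s and Latency columns.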

Data source: artificialanalysis.ai/methodology