What is a context window?

A context window is the maximum number of tokens a language model can process in a single request. It includes both the input prompt and the generated output. Larger windows let you fit longer documents, more conversation history, or more examples.
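
Because the window covers both the prompt and the generated output, the room left for a response can be worked out by subtraction. The sketch below is illustrative only: the function name is hypothetical, and treating the table's "Output max" value as a separate cap on generated tokens is an assumption rather than something this page states.

```python
from typing import Optional


def max_response_tokens(
    context_window: int,
    prompt_tokens: int,
    output_max: Optional[int] = None,
) -> int:
    """Return how many tokens are left for the response.

    Assumption: prompt and response share one window, and output_max
    (if given) is an additional per-response cap.
    """
    # Tokens still free in the window after the prompt is counted.
    remaining = max(context_window - prompt_tokens, 0)
    # Apply the per-response cap only when one is supplied.
    if output_max is not None:
        remaining = min(remaining, output_max)
    return remaining


# A 150K-token prompt in a 200K window leaves 50K tokens,
# even if the model could otherwise generate up to 64K.
print(max_response_tokens(200_000, 150_000, 64_000))
```

When the prompt is small relative to the window, the output cap becomes the binding limit instead of the window itself.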

Which LLM has the largest context window?

Context window leaders change frequently. As of the 2026-06-13 update, Meta Llama 4 Scout lists the largest window in the table at 10M tokens, followed by xAI Grok 4.20 and two OpenRouter routers at 2M, while most mainstream models support 128K–1M tokens. Use the comparison table to see the current leaders.

Does a larger context window always mean better performance?

Not necessarily. Very long contexts can increase latency and cost, and some models lose accuracy at the extremes. The best model depends on your document length, budget, and whether you need vision, function calling, or JSON output.
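
To make the cost point concrete, here is a rough sketch of how input cost scales with prompt length. It assumes the table's "Input price" column is in US dollars per one million input tokens, which this page does not state explicitly; the function name is hypothetical, and output-token charges are ignored.

```python
def input_cost_usd(prompt_tokens: int, price_per_million: float) -> float:
    """Estimate the input-side cost of one request.

    Assumption: price_per_million is USD per 1,000,000 input tokens.
    """
    return prompt_tokens / 1_000_000 * price_per_million


# The same 800K-token prompt at two listed prices ($1.25 and $30.00).
for price in (1.25, 30.00):
    print(f"${price:.2f} per 1M tokens: ${input_cost_usd(800_000, price):.2f}")
```

Under that assumption, filling most of a 1M-token window costs about 24 times more at the higher listed price than at the lower one, which is why a large window alone does not settle the choice.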

How do I use the fit calculator?

Enter the size of your document in words or tokens. The calculator filters the table to show only models whose context window is large enough to fit it, along with an estimate of how much room remains for the response.
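
The filtering described above can be approximated in a few lines. This is a minimal sketch, not the calculator's actual implementation: the 0.75 words-per-token ratio is a common rule of thumb for English text and is assumed here, the function names are hypothetical, and the context sizes are the rounded values displayed in the table.

```python
import math

# Assumed rule of thumb for English prose; real tokenizers vary by model.
WORDS_PER_TOKEN = 0.75

# (model name, context window in tokens), rounded as shown in the table.
MODELS = [
    ("Meta: Llama 4 Scout", 10_000_000),
    ("OpenAI: GPT-5", 400_000),
    ("Anthropic: Claude Haiku 4.5", 200_000),
    ("Meta: Llama 3.3 70B Instruct", 131_000),
]


def estimate_tokens(words: int) -> int:
    """Convert a word count to an estimated token count, rounding up."""
    return math.ceil(words / WORDS_PER_TOKEN)


def fit_models(doc_tokens, models):
    """Keep models whose window holds the document.

    Returns (name, tokens left for the response) pairs in input order.
    """
    return [
        (name, context - doc_tokens)
        for name, context in models
        if context >= doc_tokens
    ]


doc_tokens = estimate_tokens(120_000)  # a 120,000-word document
for name, room in fit_models(doc_tokens, MODELS):
    print(f"{name}: {room:,} tokens left for the response")
```

A model whose window exactly equals the document size still passes this filter but leaves no room for a response, so in practice you would want some headroom.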

LLM Context Window Comparison

Compare context windows across 328 models. Filter by provider, modality, and minimum context size. Use the fit calculator to find models that handle your documents.

Last updated: 2026-06-13

Model	Provider	Context (tokens)	Output max (tokens)	Input price	Capabilities
Meta: Llama 4 Scout Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model de...	Meta	10.0M	16K	$0.10	Vision Stream
Auto Router Your prompt will be processed by a meta-model and routed to one of dozens of mod...	Openrouter	2.0M	0	Variable	Vision Stream
Pareto Code Router The Pareto Router maintains a tiered shortlist of strong coding models, ranked b...	Openrouter	2.0M	0	Variable	Stream
xAI: Grok 4.20 Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic ...	xAI	2.0M	0	$1.25	Vision Stream
xAI: Grok 4.20 Multi-Agent Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative...	xAI	2.0M	0	$2.00	Vision Stream
OpenAI: GPT-5.4 GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into...	OpenAI	1.1M	128K	$2.50	Vision Stream
OpenAI: GPT-5.4 Pro GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified archi...	OpenAI	1.1M	128K	$30.00	Vision Stream
OpenAI: GPT-5.5 GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, ...	OpenAI	1.1M	128K	$5.00	Vision Stream
OpenAI: GPT-5.5 Pro GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and a...	OpenAI	1.1M	128K	$30.00	Vision Stream
Google: Gemini 3.1 Pro Preview Custom Tools Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves...	Google	1.0M	66K	$2.00	Vision Stream
Owl Alpha Owl Alpha is a high-performance foundation model designed for agentic workloads....	Openrouter	1.0M	262K	Free	Stream
DeepSeek: DeepSeek V4 Flash DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepS...	DeepSeek	1.0M	0	$0.10	Stream
DeepSeek: DeepSeek V4 Pro DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6...	DeepSeek	1.0M	384K	$0.43	Stream
Google: Gemini 2.5 Flash Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically desi...	Google	1.0M	66K	$0.30	Vision Stream
Google: Gemini 2.5 Flash Lite Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family,...	Google	1.0M	66K	$0.10	Vision Stream
Google: Gemini 2.5 Flash Lite Preview 09-2025 Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family,...	Google	1.0M	66K	$0.10	Vision Stream
Google: Gemini 2.5 Pro Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso...	Google	1.0M	66K	$1.25	Vision Stream
Google: Gemini 2.5 Pro Preview 05-06 Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso...	Google	1.0M	66K	$1.25	Vision Stream
Google: Gemini 2.5 Pro Preview 06-05 Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso...	Google	1.0M	66K	$1.25	Vision Stream
Google: Gemini 3 Flash Preview Gemini 3 Flash Preview is a high speed, high value thinking model designed for a...	Google	1.0M	66K	$0.50	Vision Stream
Google: Gemini 3.1 Flash Lite Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized ...	Google	1.0M	66K	$0.25	Vision Stream
Google: Gemini 3.1 Flash Lite Preview Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for hi...	Google	1.0M	66K	$0.25	Vision Stream
Google: Gemini 3.1 Pro Preview Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced...	Google	1.0M	66K	$2.00	Vision Stream
Google: Gemini 3.5 Flash Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro...	Google	1.0M	66K	$1.50	Vision Stream
Google: Lyria 3 Clip Preview 30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's famil...	Google	1.0M	66K	Free	Vision Stream
Google: Lyria 3 Pro Preview Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of mu...	Google	1.0M	66K	Free	Vision Stream
Meta: Llama 4 Maverick Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language mode...	Meta	1.0M	16K	$0.15	Vision Stream
MiniMax: MiniMax M3 MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, imag...	Minimax	1.0M	512K	$0.30	Vision Stream
Qwen: Qwen3 Coder 480B A35B Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation mod...	Qwen	1.0M	66K	$0.22	Stream
Qwen: Qwen3 Coder 480B A35B (free) Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation mod...	Qwen	1.0M	262K	Free	Stream
Xiaomi: MiMo-V2.5 MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic p...	Xiaomi	1.0M	131K	$0.14	Vision Stream
Xiaomi: MiMo-V2.5-Pro MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in gener...	Xiaomi	1.0M	131K	$0.43	Stream
OpenAI: GPT-4.1 GPT-4.1 is a flagship large language model optimized for advanced instruction fo...	OpenAI	1.0M	0	$2.00	Vision Stream
OpenAI: GPT-4.1 Mini GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o...	OpenAI	1.0M	33K	$0.40	Vision Stream
OpenAI: GPT-4.1 Nano For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest mode...	OpenAI	1.0M	33K	$0.10	Vision Stream
Writer: Palmyra X5 Palmyra X5 is Writer's most advanced model, purpose-built for building and scali...	Writer	1.0M	8K	$0.60	Stream
MiniMax: MiniMax-01 MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 f...	Minimax	1.0M	1.0M	$0.20	Vision Stream
Amazon: Nova 2 Lite Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads tha...	Amazon	1.0M	66K	$0.30	Vision Stream
Amazon: Nova Premier 1.0 Amazon Nova Premier is the most capable of Amazon’s multimodal models for comple...	Amazon	1.0M	32K	$2.50	Vision Stream
Anthropic: Claude Fable 5 Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous know...	Anthropic	1.0M	128K	$10.00	Vision Stream
Anthropic: Claude Opus 4.6 Opus 4.6 is Anthropic’s strongest model for coding and long-running professional...	Anthropic	1.0M	128K	$5.00	Vision Stream
Anthropic: Claude Opus 4.6 (Fast) Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabili...	Anthropic	1.0M	128K	$30.00	Vision Stream
Anthropic: Claude Opus 4.7 Opus 4.7 is the next generation of Anthropic's Opus family, built for long-runni...	Anthropic	1.0M	128K	$5.00	Vision Stream
Anthropic: Claude Opus 4.7 (Fast) Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabili...	Anthropic	1.0M	128K	$30.00	Vision Stream
Anthropic: Claude Opus 4.8 Claude Opus 4.8 is Anthropic's most capable generally available model in the Opu...	Anthropic	1.0M	128K	$5.00	Vision Stream
Anthropic: Claude Opus 4.8 (Fast) Fast-mode variant of [Opus 4.8](/anthropic/claude-opus-4.8) - identical capabili...	Anthropic	1.0M	128K	$10.00	Vision Stream
Anthropic: Claude Sonnet 4 Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonn...	Anthropic	1.0M	64K	$3.00	Vision Stream
Anthropic: Claude Sonnet 4.5 Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized f...	Anthropic	1.0M	64K	$3.00	Vision Stream
Anthropic: Claude Sonnet 4.6 Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier per...	Anthropic	1.0M	128K	$3.00	Vision Stream
MiniMax: MiniMax M1 MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended c...	Minimax	1.0M	40K	$0.40	Stream
NVIDIA: Nemotron 3 Super NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating ju...	NVIDIA	1.0M	0	$0.09	Stream
NVIDIA: Nemotron 3 Super (free) NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating ju...	NVIDIA	1.0M	262K	Free	Stream
NVIDIA: Nemotron 3 Ultra NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model fr...	NVIDIA	1.0M	16K	$0.50	Stream
NVIDIA: Nemotron 3 Ultra (free) NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model fr...	NVIDIA	1.0M	66K	Free	Stream
Qwen: Qwen Plus 0728 Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybr...	Qwen	1.0M	33K	$0.26	Stream
Qwen: Qwen Plus 0728 (thinking) Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybr...	Qwen	1.0M	33K	$0.26	Stream
Qwen: Qwen-Plus Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a...	Qwen	1.0M	33K	$0.26	Stream
Qwen: Qwen3 Coder Flash Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their propriet...	Qwen	1.0M	66K	$0.20	Stream
Qwen: Qwen3 Coder Plus Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder...	Qwen	1.0M	66K	$0.65	Stream
Qwen: Qwen3.5 Plus 2026-02-15 The Qwen3.5 native vision-language series Plus models are built on a hybrid arch...	Qwen	1.0M	66K	$0.26	Vision Stream
Qwen: Qwen3.5 Plus 2026-04-20 Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibab...	Qwen	1.0M	66K	$0.30	Vision Stream
Qwen: Qwen3.5-Flash The Qwen3.5 native vision-language Flash models are built on a hybrid architectu...	Qwen	1.0M	66K	$0.07	Vision Stream
Qwen: Qwen3.6 Flash Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series...	Qwen	1.0M	66K	$0.19	Vision Stream
Qwen: Qwen3.6 Plus Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear att...	Qwen	1.0M	66K	$0.33	Vision Stream
Qwen: Qwen3.7 Max Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text ...	Qwen	1.0M	66K	$1.25	Stream
Qwen: Qwen3.7 Plus Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports ...	Qwen	1.0M	66K	$0.32	Vision Stream
xAI: Grok 4.3 Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with te...	xAI	1.0M	0	$1.25	Vision Stream
OpenAI: GPT Chat Latest GPT Chat Latest points to OpenAI's stable API alias `chat-latest` that always re...	OpenAI	400K	128K	$5.00	Vision Stream
OpenAI: GPT-5 GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning,...	OpenAI	400K	128K	$1.25	Vision Stream
OpenAI: GPT-5 Codex GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering...	OpenAI	400K	128K	$1.25	Vision Stream
OpenAI: GPT-5 Image [GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model ...	OpenAI	400K	128K	$10.00	Vision Stream
OpenAI: GPT-5 Image Mini GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [G...	OpenAI	400K	128K	$2.50	Vision Stream
OpenAI: GPT-5 Mini GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reas...	OpenAI	400K	128K	$0.25	Vision Stream
OpenAI: GPT-5 Nano GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized fo...	OpenAI	400K	0	$0.05	Vision Stream
OpenAI: GPT-5 Pro GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reason...	OpenAI	400K	128K	$15.00	Vision Stream
OpenAI: GPT-5.1 GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronge...	OpenAI	400K	128K	$1.25	Vision Stream
OpenAI: GPT-5.1-Codex GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software enginee...	OpenAI	400K	128K	$1.25	Vision Stream
OpenAI: GPT-5.1-Codex-Max GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-run...	OpenAI	400K	128K	$1.25	Vision Stream
OpenAI: GPT-5.1-Codex-Mini GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex...	OpenAI	400K	100K	$0.25	Vision Stream
OpenAI: GPT-5.2 GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronge...	OpenAI	400K	128K	$1.75	Vision Stream
OpenAI: GPT-5.2 Pro GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agen...	OpenAI	400K	128K	$21.00	Vision Stream
OpenAI: GPT-5.2-Codex GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software eng...	OpenAI	400K	128K	$1.75	Vision Stream
OpenAI: GPT-5.3-Codex GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the fron...	OpenAI	400K	128K	$1.75	Vision Stream
OpenAI: GPT-5.4 Mini GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient...	OpenAI	400K	128K	$0.75	Vision Stream
OpenAI: GPT-5.4 Nano GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 f...	OpenAI	400K	128K	$0.20	Vision Stream
Amazon: Nova Lite 1.0 Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focuse...	Amazon	300K	5K	$0.06	Vision Stream
Amazon: Nova Pro 1.0 Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providi...	Amazon	300K	5K	$0.80	Vision Stream
OpenAI: GPT-5.4 Image 2 [GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5....	OpenAI	272K	128K	$8.00	Vision Stream
Arcee AI: Trinity Large Thinking Trinity Large Thinking is a powerful open source reasoning model from the team a...	Arcee Ai	262K	262K	$0.22	Stream
ByteDance Seed: Seed 1.6 Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It inco...	Bytedance Seed	262K	33K	$0.25	Vision Stream
ByteDance Seed: Seed 1.6 Flash Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed...	Bytedance Seed	262K	33K	$0.07	Vision Stream
ByteDance Seed: Seed-2.0-Lite Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers ...	Bytedance Seed	262K	131K	$0.25	Vision Stream
ByteDance Seed: Seed-2.0-Mini Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive sc...	Bytedance Seed	262K	131K	$0.10	Vision Stream
Google: Gemma 4 26B A4B Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from G...	Google	262K	0	$0.06	Vision Stream
Google: Gemma 4 26B A4B (free) Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from G...	Google	262K	33K	Free	Vision Stream
Google: Gemma 4 31B Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supportin...	Google	262K	262K	$0.12	Vision Stream
Google: Gemma 4 31B (free) Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supportin...	Google	262K	33K	Free	Vision Stream
inclusionAI: Ling-2.6-1T Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s tr...	Inclusionai	262K	33K	$0.07	Stream
inclusionAI: Ling-2.6-flash Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total p...	Inclusionai	262K	33K	$0.01	Stream
inclusionAI: Ring-2.6-1T Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, b...	Inclusionai	262K	66K	$0.07	Stream
Mistral: Devstral 2 2512 Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in...	Mistral AI	262K	0	$0.40	Stream
Mistral: Ministral 3 14B 2512 The largest model in the Ministral 3 family, Ministral 3 14B offers frontier cap...	Mistral AI	262K	0	$0.20	Vision Stream
Mistral: Ministral 3 8B 2512 A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, effici...	Mistral AI	262K	0	$0.15	Vision Stream
Mistral: Mistral Large 3 2512 Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse...	Mistral AI	262K	0	$0.50	Vision Stream
Mistral: Mistral Medium 3.5 Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. ...	Mistral AI	262K	0	$1.50	Vision Stream
Mistral: Mistral Small 4 Mistral Small 4 is the next major release in the Mistral Small family, unifying ...	Mistral AI	262K	0	$0.15	Vision Stream
MoonshotAI: Kimi K2 0905 Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It i...	Moonshotai	262K	262K	$0.60	Stream
MoonshotAI: Kimi K2 Thinking Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, ex...	Moonshotai	262K	262K	$0.60	Stream
MoonshotAI: Kimi K2.5 Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art ...	Moonshotai	262K	0	$0.38	Vision Stream
MoonshotAI: Kimi K2.6 Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-h...	Moonshotai	262K	262K	$0.68	Vision Stream
MoonshotAI: Kimi K2.7 Code MoonshotAI: Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 fa...	Moonshotai	262K	0	$0.95	Vision Stream
Morph: Morph V3 Large Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with...	Morph	262K	131K	$0.90	Stream
Nex AGI: Nex-N2-Pro (free) Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active ...	Nex Agi	262K	262K	Free	Vision Stream
NVIDIA: Nemotron 3 Nano 30B A3B NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest comput...	NVIDIA	262K	228K	$0.05	Stream
Poolside: Laguna M.1 (free) Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.a...	Poolside	262K	33K	Free	Stream
Poolside: Laguna XS.2 (free) Laguna XS.2 is the second-generation model in the XS size class from [Poolside](...	Poolside	262K	33K	Free	Stream
Qwen: Qwen3 235B A22B Instruct 2507 Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-ex...	Qwen	262K	16K	$0.09	Stream
Qwen: Qwen3 235B A22B Thinking 2507 Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Expe...	Qwen	262K	262K	$0.10	Stream
Qwen: Qwen3 Coder Next Qwen3-Coder-Next is an open-weight causal language model optimized for coding ag...	Qwen	262K	262K	$0.11	Stream
Qwen: Qwen3 Max Qwen3-Max is an updated release built on the Qwen3 series, offering major improv...	Qwen	262K	33K	$0.78	Stream
Qwen: Qwen3 Max Thinking Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed...	Qwen	262K	33K	$0.78	Stream
Qwen: Qwen3 Next 80B A3B Instruct Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next...	Qwen	262K	16K	$0.09	Stream
Qwen: Qwen3 Next 80B A3B Instruct (free) Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next...	Qwen	262K	0	Free	Stream
Qwen: Qwen3 Next 80B A3B Thinking Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next li...	Qwen	262K	33K	$0.10	Stream
Qwen: Qwen3 VL 235B A22B Instruct Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies stro...	Qwen	262K	16K	$0.20	Vision Stream
Qwen: Qwen3 VL 30B A3B Instruct Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generat...	Qwen	262K	33K	$0.13	Vision Stream
Qwen: Qwen3 VL 32B Instruct Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed...	Qwen	262K	33K	$0.10	Vision Stream
Qwen: Qwen3.5 397B A17B The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid a...	Qwen	262K	66K	$0.39	Vision Stream
Qwen: Qwen3.5-122B-A10B The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architec...	Qwen	262K	262K	$0.26	Vision Stream
Qwen: Qwen3.5-27B The Qwen3.5 27B native vision-language Dense model incorporates a linear attenti...	Qwen	262K	66K	$0.20	Vision Stream
Qwen: Qwen3.5-35B-A3B The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hyb...	Qwen	262K	262K	$0.14	Vision Stream
Qwen: Qwen3.5-9B Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to...	Qwen	262K	262K	$0.10	Vision Stream
Qwen: Qwen3.6 27B Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at...	Qwen	262K	262K	$0.29	Vision Stream
Qwen: Qwen3.6 35B A3B Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 bi...	Qwen	262K	262K	$0.15	Vision Stream
Qwen: Qwen3.6 Max Preview Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on ...	Qwen	262K	66K	$1.04	Stream
StepFun: Step 3.5 Flash Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on ...	Stepfun	262K	16K	$0.09	Stream
Tencent: Hy3 preview Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed ...	Tencent	262K	0	$0.06	Stream
Xiaomi: MiMo-V2-Flash MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. I...	Xiaomi	262K	66K	$0.10	Stream
Z.ai: GLM 5 Turbo GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong perf...	Z Ai	262K	131K	$1.20	Stream
AI21: Jamba Large 1.7 Jamba Large 1.7 is the latest model in the Jamba open family, offering improveme...	AI21 Labs	256K	4K	$2.00	Stream
Cohere: Command A Command A is an open-weights 111B parameter model with a 256k context window foc...	Cohere	256K	8K	$2.50	Stream
Kwaipilot: KAT-Coder-Pro V2 KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder ser...	Kwaipilot	256K	80K	$0.30	Stream
Mistral: Codestral 2508 Mistral's cutting-edge language model for coding released end of July 2025. Code...	Mistral AI	256K	0	$0.30	Stream
NVIDIA: Nemotron 3 Nano 30B A3B (free) NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest comput...	NVIDIA	256K	0	Free	Stream
NVIDIA: Nemotron 3 Nano Omni (free) NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to func...	NVIDIA	256K	66K	Free	Vision Stream
Qwen: Qwen3 VL 8B Instruct Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL ser...	Qwen	256K	33K	$0.08	Vision Stream
Qwen: Qwen3 VL 8B Thinking Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multi...	Qwen	256K	33K	$0.12	Vision Stream
Relace: Relace Apply 3 Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits...	Relace	256K	128K	$0.85	Stream
Relace: Relace Search The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to ex...	Relace	256K	128K	$1.00	Stream
StepFun: Step 3.7 Flash Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts...	Stepfun	256K	256K	$0.20	Vision Stream
xAI: Grok Build 0.1 Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic softw...	xAI	256K	0	$1.00	Vision Stream
MiniMax: MiniMax M2 MiniMax-M2 is a compact, high-efficiency large language model optimized for end-...	Minimax	205K	197K	$0.26	Stream
MiniMax: MiniMax M2.1 MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized f...	Minimax	205K	197K	$0.29	Stream
MiniMax: MiniMax M2.5 MiniMax-M2.5 is a SOTA large language model designed for real-world productivity...	Minimax	205K	197K	$0.15	Stream
MiniMax: MiniMax M2.7 MiniMax-M2.7 is a next-generation large language model designed for autonomous, ...	Minimax	205K	131K	$0.25	Stream
Z.ai: GLM 4.6 Compared with GLM-4.5, this generation brings several key improvements: Longer c...	Z Ai	203K	131K	$0.43	Stream
Z.ai: GLM 4.7 GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: en...	Z Ai	203K	131K	$0.40	Stream
Z.ai: GLM 4.7 Flash As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances perfo...	Z Ai	203K	16K	$0.06	Stream
Z.ai: GLM 5 GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex sys...	Z Ai	203K	0	$0.60	Stream
Z.ai: GLM 5.1 GLM-5.1 delivers a major leap in coding capability, with particularly significan...	Z Ai	203K	0	$0.98	Stream
Anthropic: Claude 3 Haiku Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant re...	Anthropic	200K	4K	$0.25	Vision Stream
Anthropic: Claude 3.5 Haiku Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy...	Anthropic	200K	8K	$0.80	Vision Stream
Anthropic: Claude Haiku 4.5 Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering nea...	Anthropic	200K	64K	$1.00	Vision Stream
Anthropic: Claude Opus 4 Claude Opus 4 is benchmarked as the world’s best coding model, at time of releas...	Anthropic	200K	32K	$15.00	Vision Stream
Anthropic: Claude Opus 4.1 Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering im...	Anthropic	200K	32K	$15.00	Vision Stream
Anthropic: Claude Opus 4.5 Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex so...	Anthropic	200K	64K	$5.00	Vision Stream
OpenAI: o1 The latest and strongest model family from OpenAI, o1 is designed to spend more ...	OpenAI	200K	100K	$15.00	Vision Stream
OpenAI: o1-pro The o1 series of models are trained with reinforcement learning to think before ...	OpenAI	200K	100K	$150.00	Vision Stream
OpenAI: o3 o3 is a well-rounded and powerful model across domains. It sets a new standard f...	OpenAI	200K	100K	$2.00	Vision Stream
OpenAI: o3 Deep Research o3-deep-research is OpenAI's advanced model for deep research, designed to tackl...	OpenAI	200K	100K	$10.00	Vision Stream
OpenAI: o3 Mini OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning t...	OpenAI	200K	100K	$1.10	Stream
OpenAI: o3 Mini High OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoni...	OpenAI	200K	100K	$1.10	Stream
OpenAI: o3 Pro The o-series of models are trained with reinforcement learning to think before t...	OpenAI	200K	100K	$20.00	Vision Stream
OpenAI: o4 Mini OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast,...	OpenAI	200K	100K	$1.10	Vision Stream
OpenAI: o4 Mini Deep Research o4-mini-deep-research is OpenAI's faster, more affordable deep research model—id...	OpenAI	200K	100K	$2.00	Vision Stream
OpenAI: o4 Mini High OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoni...	OpenAI	200K	100K	$1.10	Vision Stream
Free Models Router The simplest way to get free inference. openrouter/free is a router that selects...	Openrouter	200K	0	Free	Vision Stream
Perplexity: Sonar Pro Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](h...	Perplexity	200K	8K	$3.00	Vision Stream
Perplexity: Sonar Pro Search Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is ...	Perplexity	200K	8K	$3.00	Vision Stream
DeepSeek: DeepSeek V3 0324 DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration...	DeepSeek	164K	16K	$0.20	Stream
DeepSeek: DeepSeek V3.1 DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) th...	DeepSeek	164K	33K	$0.21	Stream
DeepSeek: DeepSeek V3.1 Terminus DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v...	DeepSeek	164K	33K	$0.27	Stream
DeepSeek: DeepSeek V3.2 Exp DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek a...	DeepSeek	164K	66K	$0.27	Stream
DeepSeek: R1 DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-s...	DeepSeek	164K	16K	$0.70	Stream
DeepSeek: R1 0528 May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1) Performance...	DeepSeek	164K	33K	$0.50	Stream
Meta: Llama Guard 4 12B Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned...	Meta	164K	16K	$0.18	Vision Stream
Qwen: Qwen3 Coder 30B A3B Instruct Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model...	Qwen	160K	33K	$0.07	Stream
Qwen: Qwen3 14B Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series...	Qwen	132K	41K	$0.10	Stream
AionLabs: Aion-1.0 Aion-1.0 is a multi-model system designed for high performance across various ta...	Aion Labs	131K	33K	$4.00	Stream
AionLabs: Aion-1.0-Mini Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 mode...	Aion Labs	131K	33K	$0.70	Stream
AionLabs: Aion-2.0 Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and s...	Aion Labs	131K	33K	$0.80	Stream
Arcee AI: Trinity Mini Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language m...	Arcee Ai	131K	131K	$0.04	Stream
Arcee AI: Virtuoso Large Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned...	Arcee Ai	131K	64K	$0.75	Stream
Baidu: ERNIE 4.5 VL 424B A47B ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu...	Baidu	131K	16K	$0.42	Vision Stream
DeepSeek: DeepSeek V3 DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instru...	DeepSeek	131K	16K	$0.20	Stream
DeepSeek: DeepSeek V3.2 DeepSeek-V3.2 is a large language model designed to harmonize high computational...	DeepSeek	131K	64K	$0.23	Stream
Google: Gemma 3 12B Gemma 3 introduces multimodality, supporting vision-language input and text outp...	Google	131K	16K	$0.05	Vision Stream
Google: Gemma 3 27B Gemma 3 introduces multimodality, supporting vision-language input and text outp...	Google	131K	16K	$0.08	Vision Stream
Google: Gemma 3 4B Gemma 3 introduces multimodality, supporting vision-language input and text outp...	Google	131K	16K	$0.05	Vision Stream
Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview) Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state...	Google	131K	66K	$0.50	Vision Stream
IBM: Granite 4.1 8B Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from ...	Ibm Granite	131K	131K	$0.05	Stream
Meta: Llama 3.1 70B Instruct Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flav...	Meta	131K	16K	$0.40	Stream
Meta: Llama 3.1 8B Instruct Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flav...	Meta	131K	16K	$0.02	Stream
Meta: Llama 3.2 11B Vision Instruct Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed ...	Meta	131K	16K	$0.34	Vision Stream
Meta: Llama 3.2 1B Instruct Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently perf...	Meta	131K	60K	$0.03	Stream
Meta: Llama 3.2 3B Instruct Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimiz...	Meta	131K	80K	$0.05	Stream
Meta: Llama 3.2 3B Instruct (free) Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimiz...	Meta	131K	0	Free	Stream
Meta: Llama 3.3 70B Instruct The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and i...	Meta	131K	16K	$0.10	Stream
Meta: Llama 3.3 70B Instruct (free) The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and i...	Meta	131K	0	Free	Stream
Microsoft: Phi 4 Mini Instruct Phi-4-mini-instruct is a lightweight open model built upon synthetic data and fi...	Microsoft	131K	128K	$0.08	Stream
Mistral Large 2407 This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407...	Mistral AI	131K	0	$2.00	Stream
Mistral: Ministral 3 3B 2512 The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, effi...	Mistral AI	131K	0	$0.10	Vision Stream
Mistral: Mistral Medium 3 Mistral Medium 3 is a high-performance enterprise-grade language model designed ...	Mistral AI	131K	0	$0.40	Vision Stream
Mistral: Mistral Medium 3.1 Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-pe...	Mistral AI	131K	0	$0.40	Vision Stream
Mistral: Mistral Nemo A 12B parameter model with a 128k token context length built by Mistral in colla...	Mistral AI	131K	0	$0.02	Stream
MoonshotAI: Kimi K2 0711 Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model develo...	Moonshotai	131K	33K	$0.57	Stream
Nous: Hermes 3 405B Instruct Hermes 3 is a generalist language model with many improvements over Hermes 2, in...	Nousresearch	131K	16K	$1.00	Stream
Nous: Hermes 3 405B Instruct (free) Hermes 3 is a generalist language model with many improvements over Hermes 2, in...	Nousresearch	131K	0	Free	Stream
Nous: Hermes 3 70B Instruct Hermes 3 is a generalist language model with many improvements over [Hermes 2](/...	Nousresearch	131K	16K	$0.70	Stream
Nous: Hermes 4 405B Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and relea...	Nousresearch	131K	0	$1.00	Stream
Nous: Hermes 4 70B Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama...	Nousresearch	131K	0	$0.13	Stream
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/...	NVIDIA	131K	16K	$0.40	Stream
OpenAI: gpt-oss-120b gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language...	OpenAI	131K	0	$0.04	Stream
OpenAI: gpt-oss-120b (free) gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language...	OpenAI	131K	131K	Free	Stream
OpenAI: gpt-oss-20b gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the A...	OpenAI	131K	0	$0.03	Stream
OpenAI: gpt-oss-20b (free) gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the A...	OpenAI	131K	8K	Free	Stream
OpenAI: gpt-oss-safeguard-20b gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss...	OpenAI	131K	66K	$0.07	Stream
Prime Intellect: INTELLECT-3 INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-train...	Prime Intellect	131K	131K	$0.20	Stream
Qwen: Qwen2.5 7B Instruct Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings th...	Qwen	131K	33K	$0.04	Stream
Qwen: Qwen2.5 VL 72B Instruct Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, f...	Qwen	131K	128K	$0.80	Vision Stream
Qwen: Qwen3 235B A22B Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by ...	Qwen	131K	8K	$0.45	Stream
Qwen: Qwen3 30B A3B Qwen3, the latest generation in the Qwen large language model series, features b...	Qwen	131K	16K	$0.12	Stream
Qwen: Qwen3 30B A3B Instruct 2507 Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language mod...	Qwen	131K	32K	$0.05	Stream
Qwen: Qwen3 30B A3B Thinking 2507 Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning mode...	Qwen	131K	131K	$0.08	Stream
Qwen: Qwen3 32B Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series...	Qwen	131K	16K	$0.08	Stream
Qwen: Qwen3 8B Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, ...	Qwen	131K	8K	$0.05	Stream
Qwen: Qwen3 VL 235B A22B Thinking Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text gener...	Qwen	131K	33K	$0.26	Vision Stream
Qwen: Qwen3 VL 30B A3B Thinking Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generat...	Qwen	131K	33K	$0.13	Vision Stream
Qwen2.5 72B Instruct Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings t...	Qwen	131K	16K	$0.36	Stream
Sao10K: Llama 3.1 Euryale 70B v2.2 Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](http...	Sao10k	131K	16K	$0.85	Stream
Sao10K: Llama 3.3 Euryale 70B Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://k...	Sao10k	131K	16K	$0.65	Stream
Switchpoint Router Switchpoint AI's router instantly analyzes your request and directs it to the op...	Switchpoint	131K	0	$0.85	Stream
Tencent: Hunyuan A13B Instruct Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model d...	Tencent	131K	131K	$0.14	Stream
TheDrummer: Cydonia 24B V4.1 Uncensored and creative writing model based on Mistral Small 3.2 24B with good r...	Thedrummer	131K	131K	$0.30	Stream
Z.ai: GLM 4.5 GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based a...	Z Ai	131K	98K	$0.60	Stream
Z.ai: GLM 4.5 Air GLM-4.5-Air is the lightweight variant of our latest flagship model family, also...	Z Ai	131K	131K	$0.13	Stream
Z.ai: GLM 4.6V GLM-4.6V is a large multimodal model designed for high-fidelity visual understan...	Z Ai	131K	33K	$0.30	Vision Stream
IBM: Granite 4.0 Micro Granite-4.0-H-Micro is a 3B parameter model from the Granite 4 family of models. These...	Ibm Granite	131K	131K	$0.02	Stream
Amazon: Nova Micro 1.0 Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency resp...	Amazon	128K	5K	$0.04	Stream
ByteDance: UI-TARS 7B UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based enviro...	Bytedance	128K	2K	$0.10	Vision Stream
Cohere: Command R (08-2024) command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with...	Cohere	128K	4K	$0.15	Stream
Cohere: Command R+ (08-2024) command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r...	Cohere	128K	4K	$2.50	Stream
Cohere: Command R7B (12-2024) Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered...	Cohere	128K	4K	$0.04	Stream
Deep Cogito: Cogito v2.1 671B Cogito v2.1 671B MoE represents one of the strongest open models globally, match...	Deepcogito	128K	0	$1.25	Stream
DeepSeek: R1 Distill Llama 70B DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llam...	DeepSeek	128K	8K	$0.80	Stream
DeepSeek: R1 Distill Qwen 32B DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen ...	DeepSeek	128K	33K	$0.29	Stream
Inception: Mercury 2 Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion ...	Inception	128K	50K	$0.25	Stream
LiquidAI: LFM2-24B-A2B LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures des...	Liquid	128K	0	$0.03	Stream
Mistral Large This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-240...	Mistral AI	128K	0	$2.00	Stream
Mistral: Mistral Small 3.1 24B Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501),...	Mistral AI	128K	128K	$0.35	Vision Stream
Mistral: Mistral Small 3.2 24B Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistr...	Mistral AI	128K	16K	$0.07	Vision Stream
NVIDIA: Nemotron 3.5 Content Safety (free) NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrai...	NVIDIA	128K	8K	Free	Vision Stream
NVIDIA: Nemotron Nano 12B 2 VL (free) NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning mo...	NVIDIA	128K	128K	Free	Vision Stream
NVIDIA: Nemotron Nano 9B V2 (free) NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch ...	NVIDIA	128K	0	Free	Stream
OpenAI: GPT Audio The gpt-audio model is OpenAI's first generally available audio model. The new s...	OpenAI	128K	16K	$2.50	Stream
OpenAI: GPT Audio Mini A cost-efficient version of GPT Audio. The new snapshot features an upgraded dec...	OpenAI	128K	16K	$0.60	Stream
OpenAI: GPT-4 Turbo The latest GPT-4 Turbo model with vision capabilities. Vision requests can now u...	OpenAI	128K	4K	$10.00	Vision Stream
OpenAI: GPT-4 Turbo Preview The preview GPT-4 model with improved instruction following, JSON mode, reproduc...	OpenAI	128K	4K	$10.00	Stream
OpenAI: GPT-4o GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and im...	OpenAI	128K	16K	$2.50	Vision Stream
OpenAI: GPT-4o (2024-05-13) GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and im...	OpenAI	128K	4K	$5.00	Vision Stream
OpenAI: GPT-4o (2024-08-06) The 2024-08-06 version of GPT-4o offers improved performance in structured outpu...	OpenAI	128K	16K	$2.50	Vision Stream
OpenAI: GPT-4o (2024-11-20) The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability wi...	OpenAI	128K	16K	$2.50	Vision Stream
OpenAI: GPT-4o Search Preview GPT-4o Search Preview is a specialized model for web search in Chat Completions. ...	OpenAI	128K	16K	$2.50	Stream
OpenAI: GPT-4o-mini GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), ...	OpenAI	128K	16K	$0.15	Vision Stream
OpenAI: GPT-4o-mini (2024-07-18) GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), ...	OpenAI	128K	16K	$0.15	Vision Stream
OpenAI: GPT-4o-mini Search Preview GPT-4o mini Search Preview is a specialized model for web search in Chat Complet...	OpenAI	128K	16K	$0.15	Stream
OpenAI: GPT-5 Chat GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conv...	OpenAI	128K	16K	$1.25	Vision Stream
OpenAI: GPT-5.1 Chat GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, opt...	OpenAI	128K	32K	$1.25	Vision Stream
OpenAI: GPT-5.2 Chat GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, op...	OpenAI	128K	16K	$1.75	Vision Stream
OpenAI: GPT-5.3 Chat GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conve...	OpenAI	128K	16K	$1.75	Vision Stream
Body Builder (beta) Transform your natural language requests into structured OpenRouter API request ...	Openrouter	128K	0	Variable	Stream
OpenRouter: Fusion Fusion turns your prompt into a small multi-model deliberation. A panel of exper...	Openrouter	128K	0	Variable	Stream
Perplexity: Sonar Deep Research Sonar Deep Research is a research-focused model designed for multi-step retrieva...	Perplexity	128K	0	$2.00	Stream
Perplexity: Sonar Reasoning Pro Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](h...	Perplexity	128K	0	$2.00	Vision Stream
Qwen2.5 Coder 32B Instruct Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (...	Qwen	128K	33K	$0.66	Stream
Upstage: Solar Pro 3 Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With ...	Upstage	128K	0	$0.15	Stream
Perplexity: Sonar Sonar is lightweight, affordable, fast, and simple to use — now featuring citati...	Perplexity	127K	0	$1.00	Vision Stream
Morph: Morph V3 Fast Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy...	Morph	82K	38K	$0.80	Stream
AllenAI: Olmo 3 32B Think Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for ...	Allenai	66K	66K	$0.15	Stream
Google: Nano Banana Pro (Gemini 3 Pro Image Preview) Nano Banana Pro is Google’s most advanced image-generation and editing model, bu...	Google	66K	33K	$2.00	Vision Stream
WizardLM-2 8x22B WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates h...	Microsoft	66K	8K	$0.62	Stream
MiniMax: MiniMax M2-her MiniMax M2-her is a dialogue-first large language model built for immersive role...	Minimax	66K	2K	$0.30	Stream
Mistral: Mixtral 8x22B Instruct Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistra...	Mistral AI	66K	0	$2.00	Stream
Reka Flash 3 Reka Flash 3 is a general-purpose, instruction-tuned large language model with 2...	Rekaai	66K	66K	$0.10	Stream
Z.ai: GLM 4.5V GLM-4.5V is a vision-language foundation model for multimodal agent applications...	Z Ai	66K	16K	$0.60	Vision Stream
AionLabs: Aion-RP 1.0 (8B) Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of th...	Aion Labs	33K	33K	$0.80	Stream
Magnum v4 72B This is a series of models designed to replicate the prose quality of the Claude...	Anthracite Org	33K	2K	$3.00	Stream
Arcee AI: Coder Large Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been fur...	Arcee Ai	33K	0	$0.50	Stream
Venice: Uncensored (free) Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of ...	Cognitivecomputations	33K	0	Free	Stream
EssentialAI: Rnj 1 Instruct Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential...	Essentialai	33K	0	$0.15	Stream
Google: Gemma 3n 4B Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource ...	Google	33K	0	$0.06	Stream
Google: Nano Banana (Gemini 2.5 Flash Image) Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is ...	Google	33K	33K	$0.30	Vision Stream
LiquidAI: LFM2.5-1.2B-Instruct (free) LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model buil...	Liquid	33K	0	Free	Stream
LiquidAI: LFM2.5-1.2B-Thinking (free) LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agen...	Liquid	33K	0	Free	Stream
Mistral: Mistral Small 3 Mistral Small 3 is a 24B-parameter language model optimized for low-latency perf...	Mistral AI	33K	16K	$0.05	Stream
Mistral: Saba Mistral Saba is a 24B-parameter language model specifically designed for the Mid...	Mistral AI	33K	0	$0.20	Stream
Perceptron: Perceptron Mk1 Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model ...	Perceptron	33K	8K	$0.15	Vision Stream
TheDrummer: Rocinante 12B Rocinante 12B is designed for engaging storytelling and rich prose. Early tester...	Thedrummer	33K	33K	$0.17	Stream
TheDrummer: Skyfall 36B V2 Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine...	Thedrummer	33K	33K	$0.55	Stream
TheDrummer: UnslopNemo 12B UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed f...	Thedrummer	33K	33K	$0.40	Stream
Mistral: Voxtral Small 24B 2507 Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-a...	Mistral AI	32K	0	$0.10	Stream
OpenAI: GPT-3.5 Turbo GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural ...	OpenAI	16K	4K	$0.50	Stream
OpenAI: GPT-3.5 Turbo 16k This model offers four times the context length of gpt-3.5-turbo, allowing it to...	OpenAI	16K	4K	$3.00	Stream
Microsoft: Phi 4 [Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex re...	Microsoft	16K	16K	$0.07	Stream
Reka Edge Reka Edge is an extremely efficient 7B multimodal vision-language model that acc...	Rekaai	16K	16K	$0.10	Vision Stream
Sao10K: Llama 3.1 70B Hanami x1 This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-...	Sao10k	16K	0	$3.00	Stream
Google: Gemma 2 27B Gemma 2 27B by Google is an open model built from the same research and technolo...	Google	8K	2K	$0.65	Stream
Meta: Llama 3 70B Instruct Meta's latest class of model (Llama 3) launched with a variety of sizes & flavor...	Meta	8K	8K	$0.51	Stream
Meta: Llama 3 8B Instruct Meta's latest class of model (Llama 3) launched with a variety of sizes & flavor...	Meta	8K	0	$0.14	Stream
Sao10K: Llama 3 8B Lunaris Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It'...	Sao10k	8K	16K	$0.04	Stream
OpenAI: GPT-4 OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capabl...	OpenAI	8K	4K	$30.00	Stream
Inflection: Inflection 3 Pi Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backs...	Inflection	8K	1K	$2.50	Stream
Inflection: Inflection 3 Productivity Inflection 3 Productivity is optimized for following instructions. It is better ...	Inflection	8K	1K	$2.50	Stream
Mancer: Weaver (alpha) An attempt to recreate Claude-style verbosity, but don't expect the same level o...	Mancer	8K	2K	$0.75	Stream
ReMM SLERP 13B A recreation trial of the original MythoMax-L2-B13 but with updated models. #mer...	Undi95	6K	4K	$0.45	Stream
MythoMax 13B One of the highest performing and most popular fine-tunes of Llama 2 13B, with r...	Gryphe	4K	4K	$0.06	Stream
OpenAI: GPT-3.5 Turbo (older v0613) GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural ...	OpenAI	4K	4K	$1.00	Stream
OpenAI: GPT-3.5 Turbo Instruct This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omi...	OpenAI	4K	4K	$1.50	Stream

Understanding LLM context windows

What counts toward the limit?

The context window covers everything the model handles in a single request: system instructions, conversation history, retrieved documents, and the space reserved for the model's response. If your input plus expected output exceeds the limit, you need a model with a larger window or a chunking strategy.
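
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the page calculator's actual logic: the function names are hypothetical, and the ratio of about 4 tokens per 3 English words is an assumed rule of thumb. Real counts depend on each model's tokenizer.

```python
# Assumed heuristic: roughly 4 tokens for every 3 English words.
# Actual token counts vary by tokenizer and language.


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate from a word count, rounded up."""
    return (word_count * 4 + 2) // 3  # integer ceiling of word_count * 4 / 3


def remaining_for_output(context_window: int, input_tokens: int) -> int:
    """Tokens left for the response; a negative value means the input alone does not fit."""
    return context_window - input_tokens


def fits(context_window: int, input_tokens: int, expected_output: int) -> bool:
    """True when input plus expected output stays within the context window."""
    return input_tokens + expected_output <= context_window


# Example: a 90,000-word document against a 128,000-token window.
doc_tokens = estimate_tokens(90_000)               # 120,000 estimated tokens
print(fits(128_000, doc_tokens, 4_000))            # True: 124,000 <= 128,000
print(remaining_for_output(128_000, doc_tokens))   # 8000
```

Note that a model's "Output max" can cap the response below the remaining room, so both limits are worth checking.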

Common context window tiers

4K–8K: simple prompts and short chats.
32K–128K: long articles, code files, and medium conversations.
1M+: books, video transcripts, and large knowledge bases.

Context vs. cost trade-off

Models with very long context windows often charge more per token and can be slower. For many applications, splitting documents into smaller chunks and using retrieval is cheaper, and often more accurate, than sending everything to a mega-context model.
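
As a rough illustration of the chunking half of that approach, the sketch below splits a document into overlapping word-based chunks. The function name, the default sizes, and the use of words rather than tokens are all assumptions made for the example. The retrieval step (embedding and ranking chunks) is not shown.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to `chunk_size` words.

    Each chunk repeats the last `overlap` words of the previous one so that
    sentences cut at a boundary still appear whole in one of the chunks.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Example: 10 words, 4 per chunk, 1 word of overlap -> 3 chunks.
sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Only the chunks relevant to a question are then sent to the model, which keeps each request small regardless of the size of the full document.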

Filter by capability

Use the capability badges to find models that support vision, function calling, JSON mode, or streaming. Combine filters to narrow the list to models that fit your production workload.
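
Combining filters amounts to requiring every selected badge plus a minimum context size. The sketch below shows that logic on four rows transcribed from the table. The `Model` record and `filter_models` function are hypothetical names, and the "K" values are treated as multiples of 1,000 for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context: int             # context window in tokens
    capabilities: frozenset  # badge names as shown in the table


# Sample rows from the table above; context sizes rounded to thousands (assumption).
MODELS = [
    Model("OpenAI: GPT-4o", 128_000, frozenset({"Vision", "Stream"})),
    Model("Cohere: Command R (08-2024)", 128_000, frozenset({"Stream"})),
    Model("Z.ai: GLM 4.5V", 66_000, frozenset({"Vision", "Stream"})),
    Model("OpenAI: GPT-4", 8_000, frozenset({"Stream"})),
]


def filter_models(models, required, min_context=0):
    """Keep models that have every required capability and a large enough window."""
    required = frozenset(required)
    return [
        m for m in models
        if required <= m.capabilities and m.context >= min_context
    ]


# Example: vision-capable models with at least a 100,000-token window.
for model in filter_models(MODELS, {"Vision"}, min_context=100_000):
    print(model.name)  # OpenAI: GPT-4o
```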