LLM Context Window Comparison
Compare context windows across 328 models. Filter by provider, modality, and minimum context size. Use the fit calculator to find models that handle your documents.
Last updated: 2026-06-13
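The provider, modality, and minimum-context filters described above can be approximated offline. The following is a minimal sketch, not the page's actual filter logic: `parse_size`, `filter_rows`, and `ROWS` are hypothetical names, the rows are a small sample copied from the table, and "modality" is assumed to mean the presence of `Vision` in the Capabilities column. Size labels such as `262K` are read as rounded figures, so the parsed counts are approximate.

```python
def parse_size(label: str) -> int:
    """Convert a table label such as '262K' or '1.0M' to an approximate token count."""
    units = {"K": 1_000, "M": 1_000_000}
    label = label.strip()
    if label and label[-1] in units:
        return round(float(label[:-1]) * units[label[-1]])
    return int(label)


# (model, provider, context, capabilities) copied from a few table rows.
ROWS = [
    ("xAI: Grok 4.20", "xAI", "2.0M", "Vision Stream"),
    ("DeepSeek: DeepSeek V4 Pro", "DeepSeek", "1.0M", "Stream"),
    ("OpenAI: GPT-5 Mini", "OpenAI", "400K", "Vision Stream"),
    ("OpenAI: o3 Mini", "OpenAI", "200K", "Stream"),
]


def filter_rows(rows, provider=None, needs_vision=False, min_context="0"):
    """Return model names matching the provider, vision, and minimum-context filters."""
    floor = parse_size(min_context)
    return [
        name
        for name, prov, ctx, caps in rows
        if (provider is None or prov == provider)
        and (not needs_vision or "Vision" in caps.split())
        and parse_size(ctx) >= floor
    ]
```

For example, `filter_rows(ROWS, provider="OpenAI", min_context="300K")` keeps only the OpenAI rows whose context label parses to at least 300,000 tokens.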
Will your document fit?
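A rough fit check can be sketched as follows. This is an illustration under stated assumptions, not the page's calculator: token counts are estimated at about 4 characters per token (a common rule of thumb for English text, not a real tokenizer), the prompt and the reply are assumed to share the context window, and the context sizes are the rounded figures shown in the table. `Model`, `estimate_tokens`, and `models_that_fit` are hypothetical names.

```python
from dataclasses import dataclass

# Heuristic only: real tokenizers vary by model and by language.
CHARS_PER_TOKEN = 4


@dataclass
class Model:
    name: str
    context_tokens: int  # rounded context size from the table


# A few rows copied from the table (10.0M read as 10_000_000, 400K as 400_000, ...).
MODELS = [
    Model("Meta: Llama 4 Scout", 10_000_000),
    Model("OpenAI: GPT-5", 400_000),
    Model("Anthropic: Claude Haiku 4.5", 200_000),
    Model("Meta: Llama 3.3 70B Instruct", 131_000),
]


def estimate_tokens(text: str) -> int:
    """Estimate the token count of text using ceiling division by CHARS_PER_TOKEN."""
    return -(-len(text) // CHARS_PER_TOKEN)


def models_that_fit(doc_tokens: int, reply_tokens: int, models=MODELS):
    """Return names of models whose context holds the document plus the reply budget."""
    needed = doc_tokens + reply_tokens
    return [m.name for m in models if m.context_tokens >= needed]
```

A 1.2 million character document comes to roughly 300,000 estimated tokens; with a 4,000-token reply budget, only the sample models with at least 304,000 tokens of context remain. For a real decision, count tokens with the target model's own tokenizer and leave headroom for system prompts.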
| Model | Provider | Context ↓ | Output max | Input price (USD per 1M tokens) | Capabilities | Visual |
|---|---|---|---|---|---|---|
Meta: Llama 4 Scout Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model de... | Meta | 10.0M | 16K | $0.10 | Vision Stream | |
Auto Router Your prompt will be processed by a meta-model and routed to one of dozens of mod... | Openrouter | 2.0M | 0 | Variable | Vision Stream | |
Pareto Code Router The Pareto Router maintains a tiered shortlist of strong coding models, ranked b... | Openrouter | 2.0M | 0 | Variable | Stream | |
xAI: Grok 4.20 Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic ... | xAI | 2.0M | 0 | $1.25 | Vision Stream | |
xAI: Grok 4.20 Multi-Agent Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative... | xAI | 2.0M | 0 | $2.00 | Vision Stream | |
OpenAI: GPT-5.4 GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into... | OpenAI | 1.1M | 128K | $2.50 | Vision Stream | |
OpenAI: GPT-5.4 Pro GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified archi... | OpenAI | 1.1M | 128K | $30.00 | Vision Stream | |
OpenAI: GPT-5.5 GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, ... | OpenAI | 1.1M | 128K | $5.00 | Vision Stream | |
OpenAI: GPT-5.5 Pro GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and a... | OpenAI | 1.1M | 128K | $30.00 | Vision Stream | |
Google: Gemini 3.1 Pro Preview Custom Tools Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves... | Google | 1.0M | 66K | $2.00 | Vision Stream | |
Owl Alpha Owl Alpha is a high-performance foundation model designed for agentic workloads.... | Openrouter | 1.0M | 262K | Free | Stream | |
DeepSeek: DeepSeek V4 Flash DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepS... | DeepSeek | 1.0M | 0 | $0.10 | Stream | |
DeepSeek: DeepSeek V4 Pro DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6... | DeepSeek | 1.0M | 384K | $0.43 | Stream | |
Google: Gemini 2.5 Flash Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically desi... | Google | 1.0M | 66K | $0.30 | Vision Stream | |
Google: Gemini 2.5 Flash Lite Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family,... | Google | 1.0M | 66K | $0.10 | Vision Stream | |
Google: Gemini 2.5 Flash Lite Preview 09-2025 Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family,... | Google | 1.0M | 66K | $0.10 | Vision Stream | |
Google: Gemini 2.5 Pro Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso... | Google | 1.0M | 66K | $1.25 | Vision Stream | |
Google: Gemini 2.5 Pro Preview 05-06 Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso... | Google | 1.0M | 66K | $1.25 | Vision Stream | |
Google: Gemini 2.5 Pro Preview 06-05 Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso... | Google | 1.0M | 66K | $1.25 | Vision Stream | |
Google: Gemini 3 Flash Preview Gemini 3 Flash Preview is a high speed, high value thinking model designed for a... | Google | 1.0M | 66K | $0.50 | Vision Stream | |
Google: Gemini 3.1 Flash Lite Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized ... | Google | 1.0M | 66K | $0.25 | Vision Stream | |
Google: Gemini 3.1 Flash Lite Preview Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for hi... | Google | 1.0M | 66K | $0.25 | Vision Stream | |
Google: Gemini 3.1 Pro Preview Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced... | Google | 1.0M | 66K | $2.00 | Vision Stream | |
Google: Gemini 3.5 Flash Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro... | Google | 1.0M | 66K | $1.50 | Vision Stream | |
Google: Lyria 3 Clip Preview 30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's famil... | Google | 1.0M | 66K | Free | Vision Stream | |
Google: Lyria 3 Pro Preview Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of mu... | Google | 1.0M | 66K | Free | Vision Stream | |
Meta: Llama 4 Maverick Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language mode... | Meta | 1.0M | 16K | $0.15 | Vision Stream | |
MiniMax: MiniMax M3 MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, imag... | Minimax | 1.0M | 512K | $0.30 | Vision Stream | |
Qwen: Qwen3 Coder 480B A35B Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation mod... | Qwen | 1.0M | 66K | $0.22 | Stream | |
Qwen: Qwen3 Coder 480B A35B (free) Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation mod... | Qwen | 1.0M | 262K | Free | Stream | |
Xiaomi: MiMo-V2.5 MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic p... | Xiaomi | 1.0M | 131K | $0.14 | Vision Stream | |
Xiaomi: MiMo-V2.5-Pro MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in gener... | Xiaomi | 1.0M | 131K | $0.43 | Stream | |
OpenAI: GPT-4.1 GPT-4.1 is a flagship large language model optimized for advanced instruction fo... | OpenAI | 1.0M | 0 | $2.00 | Vision Stream | |
OpenAI: GPT-4.1 Mini GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o... | OpenAI | 1.0M | 33K | $0.40 | Vision Stream | |
OpenAI: GPT-4.1 Nano For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest mode... | OpenAI | 1.0M | 33K | $0.10 | Vision Stream | |
Writer: Palmyra X5 Palmyra X5 is Writer's most advanced model, purpose-built for building and scali... | Writer | 1.0M | 8K | $0.60 | Stream | |
MiniMax: MiniMax-01 MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 f... | Minimax | 1.0M | 1.0M | $0.20 | Vision Stream | |
Amazon: Nova 2 Lite Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads tha... | Amazon | 1.0M | 66K | $0.30 | Vision Stream | |
Amazon: Nova Premier 1.0 Amazon Nova Premier is the most capable of Amazon’s multimodal models for comple... | Amazon | 1.0M | 32K | $2.50 | Vision Stream | |
Anthropic: Claude Fable 5 Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous know... | Anthropic | 1.0M | 128K | $10.00 | Vision Stream | |
Anthropic: Claude Opus 4.6 Opus 4.6 is Anthropic’s strongest model for coding and long-running professional... | Anthropic | 1.0M | 128K | $5.00 | Vision Stream | |
Anthropic: Claude Opus 4.6 (Fast) Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabili... | Anthropic | 1.0M | 128K | $30.00 | Vision Stream | |
Anthropic: Claude Opus 4.7 Opus 4.7 is the next generation of Anthropic's Opus family, built for long-runni... | Anthropic | 1.0M | 128K | $5.00 | Vision Stream | |
Anthropic: Claude Opus 4.7 (Fast) Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabili... | Anthropic | 1.0M | 128K | $30.00 | Vision Stream | |
Anthropic: Claude Opus 4.8 Claude Opus 4.8 is Anthropic's most capable generally available model in the Opu... | Anthropic | 1.0M | 128K | $5.00 | Vision Stream | |
Anthropic: Claude Opus 4.8 (Fast) Fast-mode variant of [Opus 4.8](/anthropic/claude-opus-4.8) - identical capabili... | Anthropic | 1.0M | 128K | $10.00 | Vision Stream | |
Anthropic: Claude Sonnet 4 Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonn... | Anthropic | 1.0M | 64K | $3.00 | Vision Stream | |
Anthropic: Claude Sonnet 4.5 Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized f... | Anthropic | 1.0M | 64K | $3.00 | Vision Stream | |
Anthropic: Claude Sonnet 4.6 Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier per... | Anthropic | 1.0M | 128K | $3.00 | Vision Stream | |
MiniMax: MiniMax M1 MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended c... | Minimax | 1.0M | 40K | $0.40 | Stream | |
NVIDIA: Nemotron 3 Super NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating ju... | NVIDIA | 1.0M | 0 | $0.09 | Stream | |
NVIDIA: Nemotron 3 Super (free) NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating ju... | NVIDIA | 1.0M | 262K | Free | Stream | |
NVIDIA: Nemotron 3 Ultra NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model fr... | NVIDIA | 1.0M | 16K | $0.50 | Stream | |
NVIDIA: Nemotron 3 Ultra (free) NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model fr... | NVIDIA | 1.0M | 66K | Free | Stream | |
Qwen: Qwen Plus 0728 Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybr... | Qwen | 1.0M | 33K | $0.26 | Stream | |
Qwen: Qwen Plus 0728 (thinking) Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybr... | Qwen | 1.0M | 33K | $0.26 | Stream | |
Qwen: Qwen-Plus Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a... | Qwen | 1.0M | 33K | $0.26 | Stream | |
Qwen: Qwen3 Coder Flash Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their propriet... | Qwen | 1.0M | 66K | $0.20 | Stream | |
Qwen: Qwen3 Coder Plus Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder... | Qwen | 1.0M | 66K | $0.65 | Stream | |
Qwen: Qwen3.5 Plus 2026-02-15 The Qwen3.5 native vision-language series Plus models are built on a hybrid arch... | Qwen | 1.0M | 66K | $0.26 | Vision Stream | |
Qwen: Qwen3.5 Plus 2026-04-20 Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibab... | Qwen | 1.0M | 66K | $0.30 | Vision Stream | |
Qwen: Qwen3.5-Flash The Qwen3.5 native vision-language Flash models are built on a hybrid architectu... | Qwen | 1.0M | 66K | $0.07 | Vision Stream | |
Qwen: Qwen3.6 Flash Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series... | Qwen | 1.0M | 66K | $0.19 | Vision Stream | |
Qwen: Qwen3.6 Plus Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear att... | Qwen | 1.0M | 66K | $0.33 | Vision Stream | |
Qwen: Qwen3.7 Max Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text ... | Qwen | 1.0M | 66K | $1.25 | Stream | |
Qwen: Qwen3.7 Plus Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports ... | Qwen | 1.0M | 66K | $0.32 | Vision Stream | |
xAI: Grok 4.3 Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with te... | xAI | 1.0M | 0 | $1.25 | Vision Stream | |
OpenAI: GPT Chat Latest GPT Chat Latest points to OpenAI's stable API alias `chat-latest` that always re... | OpenAI | 400K | 128K | $5.00 | Vision Stream | |
OpenAI: GPT-5 GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning,... | OpenAI | 400K | 128K | $1.25 | Vision Stream | |
OpenAI: GPT-5 Codex GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering... | OpenAI | 400K | 128K | $1.25 | Vision Stream | |
OpenAI: GPT-5 Image [GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model ... | OpenAI | 400K | 128K | $10.00 | Vision Stream | |
OpenAI: GPT-5 Image Mini GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [G... | OpenAI | 400K | 128K | $2.50 | Vision Stream | |
OpenAI: GPT-5 Mini GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reas... | OpenAI | 400K | 128K | $0.25 | Vision Stream | |
OpenAI: GPT-5 Nano GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized fo... | OpenAI | 400K | 0 | $0.05 | Vision Stream | |
OpenAI: GPT-5 Pro GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reason... | OpenAI | 400K | 128K | $15.00 | Vision Stream | |
OpenAI: GPT-5.1 GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronge... | OpenAI | 400K | 128K | $1.25 | Vision Stream | |
OpenAI: GPT-5.1-Codex GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software enginee... | OpenAI | 400K | 128K | $1.25 | Vision Stream | |
OpenAI: GPT-5.1-Codex-Max GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-run... | OpenAI | 400K | 128K | $1.25 | Vision Stream | |
OpenAI: GPT-5.1-Codex-Mini GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex... | OpenAI | 400K | 100K | $0.25 | Vision Stream | |
OpenAI: GPT-5.2 GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronge... | OpenAI | 400K | 128K | $1.75 | Vision Stream | |
OpenAI: GPT-5.2 Pro GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agen... | OpenAI | 400K | 128K | $21.00 | Vision Stream | |
OpenAI: GPT-5.2-Codex GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software eng... | OpenAI | 400K | 128K | $1.75 | Vision Stream | |
OpenAI: GPT-5.3-Codex GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the fron... | OpenAI | 400K | 128K | $1.75 | Vision Stream | |
OpenAI: GPT-5.4 Mini GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient... | OpenAI | 400K | 128K | $0.75 | Vision Stream | |
OpenAI: GPT-5.4 Nano GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 f... | OpenAI | 400K | 128K | $0.20 | Vision Stream | |
Amazon: Nova Lite 1.0 Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focuse... | Amazon | 300K | 5K | $0.06 | Vision Stream | |
Amazon: Nova Pro 1.0 Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providi... | Amazon | 300K | 5K | $0.80 | Vision Stream | |
OpenAI: GPT-5.4 Image 2 [GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.... | OpenAI | 272K | 128K | $8.00 | Vision Stream | |
Arcee AI: Trinity Large Thinking Trinity Large Thinking is a powerful open source reasoning model from the team a... | Arcee Ai | 262K | 262K | $0.22 | Stream | |
ByteDance Seed: Seed 1.6 Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It inco... | Bytedance Seed | 262K | 33K | $0.25 | Vision Stream | |
ByteDance Seed: Seed 1.6 Flash Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed... | Bytedance Seed | 262K | 33K | $0.07 | Vision Stream | |
ByteDance Seed: Seed-2.0-Lite Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers ... | Bytedance Seed | 262K | 131K | $0.25 | Vision Stream | |
ByteDance Seed: Seed-2.0-Mini Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive sc... | Bytedance Seed | 262K | 131K | $0.10 | Vision Stream | |
Google: Gemma 4 26B A4B Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from G... | Google | 262K | 0 | $0.06 | Vision Stream | |
Google: Gemma 4 26B A4B (free) Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from G... | Google | 262K | 33K | Free | Vision Stream | |
Google: Gemma 4 31B Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supportin... | Google | 262K | 262K | $0.12 | Vision Stream | |
Google: Gemma 4 31B (free) Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supportin... | Google | 262K | 33K | Free | Vision Stream | |
inclusionAI: Ling-2.6-1T Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s tr... | Inclusionai | 262K | 33K | $0.07 | Stream | |
inclusionAI: Ling-2.6-flash Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total p... | Inclusionai | 262K | 33K | $0.01 | Stream | |
inclusionAI: Ring-2.6-1T Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, b... | Inclusionai | 262K | 66K | $0.07 | Stream | |
Mistral: Devstral 2 2512 Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in... | Mistral AI | 262K | 0 | $0.40 | Stream | |
Mistral: Ministral 3 14B 2512 The largest model in the Ministral 3 family, Ministral 3 14B offers frontier cap... | Mistral AI | 262K | 0 | $0.20 | Vision Stream | |
Mistral: Ministral 3 8B 2512 A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, effici... | Mistral AI | 262K | 0 | $0.15 | Vision Stream | |
Mistral: Mistral Large 3 2512 Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse... | Mistral AI | 262K | 0 | $0.50 | Vision Stream | |
Mistral: Mistral Medium 3.5 Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. ... | Mistral AI | 262K | 0 | $1.50 | Vision Stream | |
Mistral: Mistral Small 4 Mistral Small 4 is the next major release in the Mistral Small family, unifying ... | Mistral AI | 262K | 0 | $0.15 | Vision Stream | |
MoonshotAI: Kimi K2 0905 Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It i... | Moonshotai | 262K | 262K | $0.60 | Stream | |
MoonshotAI: Kimi K2 Thinking Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, ex... | Moonshotai | 262K | 262K | $0.60 | Stream | |
MoonshotAI: Kimi K2.5 Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art ... | Moonshotai | 262K | 0 | $0.38 | Vision Stream | |
MoonshotAI: Kimi K2.6 Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-h... | Moonshotai | 262K | 262K | $0.68 | Vision Stream | |
MoonshotAI: Kimi K2.7 Code MoonshotAI: Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 fa... | Moonshotai | 262K | 0 | $0.95 | Vision Stream | |
Morph: Morph V3 Large Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with... | Morph | 262K | 131K | $0.90 | Stream | |
Nex AGI: Nex-N2-Pro (free) Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active ... | Nex Agi | 262K | 262K | Free | Vision Stream | |
NVIDIA: Nemotron 3 Nano 30B A3B NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest comput... | NVIDIA | 262K | 228K | $0.05 | Stream | |
Poolside: Laguna M.1 (free) Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.a... | Poolside | 262K | 33K | Free | Stream | |
Poolside: Laguna XS.2 (free) Laguna XS.2 is the second-generation model in the XS size class from [Poolside](... | Poolside | 262K | 33K | Free | Stream | |
Qwen: Qwen3 235B A22B Instruct 2507 Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-ex... | Qwen | 262K | 16K | $0.09 | Stream | |
Qwen: Qwen3 235B A22B Thinking 2507 Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Expe... | Qwen | 262K | 262K | $0.10 | Stream | |
Qwen: Qwen3 Coder Next Qwen3-Coder-Next is an open-weight causal language model optimized for coding ag... | Qwen | 262K | 262K | $0.11 | Stream | |
Qwen: Qwen3 Max Qwen3-Max is an updated release built on the Qwen3 series, offering major improv... | Qwen | 262K | 33K | $0.78 | Stream | |
Qwen: Qwen3 Max Thinking Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed... | Qwen | 262K | 33K | $0.78 | Stream | |
Qwen: Qwen3 Next 80B A3B Instruct Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next... | Qwen | 262K | 16K | $0.09 | Stream | |
Qwen: Qwen3 Next 80B A3B Instruct (free) Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next... | Qwen | 262K | 0 | Free | Stream | |
Qwen: Qwen3 Next 80B A3B Thinking Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next li... | Qwen | 262K | 33K | $0.10 | Stream | |
Qwen: Qwen3 VL 235B A22B Instruct Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies stro... | Qwen | 262K | 16K | $0.20 | Vision Stream | |
Qwen: Qwen3 VL 30B A3B Instruct Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generat... | Qwen | 262K | 33K | $0.13 | Vision Stream | |
Qwen: Qwen3 VL 32B Instruct Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed... | Qwen | 262K | 33K | $0.10 | Vision Stream | |
Qwen: Qwen3.5 397B A17B The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid a... | Qwen | 262K | 66K | $0.39 | Vision Stream | |
Qwen: Qwen3.5-122B-A10B The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architec... | Qwen | 262K | 262K | $0.26 | Vision Stream | |
Qwen: Qwen3.5-27B The Qwen3.5 27B native vision-language Dense model incorporates a linear attenti... | Qwen | 262K | 66K | $0.20 | Vision Stream | |
Qwen: Qwen3.5-35B-A3B The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hyb... | Qwen | 262K | 262K | $0.14 | Vision Stream | |
Qwen: Qwen3.5-9B Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to... | Qwen | 262K | 262K | $0.10 | Vision Stream | |
Qwen: Qwen3.6 27B Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at... | Qwen | 262K | 262K | $0.29 | Vision Stream | |
Qwen: Qwen3.6 35B A3B Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 bi... | Qwen | 262K | 262K | $0.15 | Vision Stream | |
Qwen: Qwen3.6 Max Preview Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on ... | Qwen | 262K | 66K | $1.04 | Stream | |
StepFun: Step 3.5 Flash Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on ... | Stepfun | 262K | 16K | $0.09 | Stream | |
Tencent: Hy3 preview Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed ... | Tencent | 262K | 0 | $0.06 | Stream | |
Xiaomi: MiMo-V2-Flash MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. I... | Xiaomi | 262K | 66K | $0.10 | Stream | |
Z.ai: GLM 5 Turbo GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong perf... | Z Ai | 262K | 131K | $1.20 | Stream | |
AI21: Jamba Large 1.7 Jamba Large 1.7 is the latest model in the Jamba open family, offering improveme... | AI21 Labs | 256K | 4K | $2.00 | Stream | |
Cohere: Command A Command A is an open-weights 111B parameter model with a 256k context window foc... | Cohere | 256K | 8K | $2.50 | Stream | |
Kwaipilot: KAT-Coder-Pro V2 KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder ser... | Kwaipilot | 256K | 80K | $0.30 | Stream | |
Mistral: Codestral 2508 Mistral's cutting-edge language model for coding released end of July 2025. Code... | Mistral AI | 256K | 0 | $0.30 | Stream | |
NVIDIA: Nemotron 3 Nano 30B A3B (free) NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest comput... | NVIDIA | 256K | 0 | Free | Stream | |
NVIDIA: Nemotron 3 Nano Omni (free) NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to func... | NVIDIA | 256K | 66K | Free | Vision Stream | |
Qwen: Qwen3 VL 8B Instruct Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL ser... | Qwen | 256K | 33K | $0.08 | Vision Stream | |
Qwen: Qwen3 VL 8B Thinking Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multi... | Qwen | 256K | 33K | $0.12 | Vision Stream | |
Relace: Relace Apply 3 Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits... | Relace | 256K | 128K | $0.85 | Stream | |
Relace: Relace Search The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to ex... | Relace | 256K | 128K | $1.00 | Stream | |
StepFun: Step 3.7 Flash Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts... | Stepfun | 256K | 256K | $0.20 | Vision Stream | |
xAI: Grok Build 0.1 Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic softw... | xAI | 256K | 0 | $1.00 | Vision Stream | |
MiniMax: MiniMax M2 MiniMax-M2 is a compact, high-efficiency large language model optimized for end-... | Minimax | 205K | 197K | $0.26 | Stream | |
MiniMax: MiniMax M2.1 MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized f... | Minimax | 205K | 197K | $0.29 | Stream | |
MiniMax: MiniMax M2.5 MiniMax-M2.5 is a SOTA large language model designed for real-world productivity... | Minimax | 205K | 197K | $0.15 | Stream | |
MiniMax: MiniMax M2.7 MiniMax-M2.7 is a next-generation large language model designed for autonomous, ... | Minimax | 205K | 131K | $0.25 | Stream | |
Z.ai: GLM 4.6 Compared with GLM-4.5, this generation brings several key improvements: Longer c... | Z Ai | 203K | 131K | $0.43 | Stream | |
Z.ai: GLM 4.7 GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: en... | Z Ai | 203K | 131K | $0.40 | Stream | |
Z.ai: GLM 4.7 Flash As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances perfo... | Z Ai | 203K | 16K | $0.06 | Stream | |
Z.ai: GLM 5 GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex sys... | Z Ai | 203K | 0 | $0.60 | Stream | |
Z.ai: GLM 5.1 GLM-5.1 delivers a major leap in coding capability, with particularly significan... | Z Ai | 203K | 0 | $0.98 | Stream | |
Anthropic: Claude 3 Haiku Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant re... | Anthropic | 200K | 4K | $0.25 | Vision Stream | |
Anthropic: Claude 3.5 Haiku Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy... | Anthropic | 200K | 8K | $0.80 | Vision Stream | |
Anthropic: Claude Haiku 4.5 Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering nea... | Anthropic | 200K | 64K | $1.00 | Vision Stream | |
Anthropic: Claude Opus 4 Claude Opus 4 is benchmarked as the world’s best coding model, at time of releas... | Anthropic | 200K | 32K | $15.00 | Vision Stream | |
Anthropic: Claude Opus 4.1 Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering im... | Anthropic | 200K | 32K | $15.00 | Vision Stream | |
Anthropic: Claude Opus 4.5 Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex so... | Anthropic | 200K | 64K | $5.00 | Vision Stream | |
OpenAI: o1 The latest and strongest model family from OpenAI, o1 is designed to spend more ... | OpenAI | 200K | 100K | $15.00 | Vision Stream | |
OpenAI: o1-pro The o1 series of models are trained with reinforcement learning to think before ... | OpenAI | 200K | 100K | $150.00 | Vision Stream | |
OpenAI: o3 o3 is a well-rounded and powerful model across domains. It sets a new standard f... | OpenAI | 200K | 100K | $2.00 | Vision Stream | |
OpenAI: o3 Deep Research o3-deep-research is OpenAI's advanced model for deep research, designed to tackl... | OpenAI | 200K | 100K | $10.00 | Vision Stream | |
OpenAI: o3 Mini OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning t... | OpenAI | 200K | 100K | $1.10 | Stream | |
OpenAI: o3 Mini High OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoni... | OpenAI | 200K | 100K | $1.10 | Stream | |
OpenAI: o3 Pro The o-series of models are trained with reinforcement learning to think before t... | OpenAI | 200K | 100K | $20.00 | Vision Stream | |
OpenAI: o4 Mini OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast,... | OpenAI | 200K | 100K | $1.10 | Vision Stream | |
OpenAI: o4 Mini Deep Research o4-mini-deep-research is OpenAI's faster, more affordable deep research model—id... | OpenAI | 200K | 100K | $2.00 | Vision Stream | |
OpenAI: o4 Mini High OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoni... | OpenAI | 200K | 100K | $1.10 | Vision Stream | |
Free Models Router The simplest way to get free inference. openrouter/free is a router that selects... | Openrouter | 200K | 0 | Free | Vision Stream | |
Perplexity: Sonar Pro Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](h... | Perplexity | 200K | 8K | $3.00 | Vision Stream | |
Perplexity: Sonar Pro Search Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is ... | Perplexity | 200K | 8K | $3.00 | Vision Stream | |
DeepSeek: DeepSeek V3 0324 DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration... | DeepSeek | 164K | 16K | $0.20 | Stream | |
DeepSeek: DeepSeek V3.1 DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) th... | DeepSeek | 164K | 33K | $0.21 | Stream | |
DeepSeek: DeepSeek V3.1 Terminus DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v... | DeepSeek | 164K | 33K | $0.27 | Stream | |
DeepSeek: DeepSeek V3.2 Exp DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek a... | DeepSeek | 164K | 66K | $0.27 | Stream | |
DeepSeek: R1 DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-s... | DeepSeek | 164K | 16K | $0.70 | Stream | |
DeepSeek: R1 0528 May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1) Performance... | DeepSeek | 164K | 33K | $0.50 | Stream | |
Meta: Llama Guard 4 12B Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned... | Meta | 164K | 16K | $0.18 | Vision Stream | |
Qwen: Qwen3 Coder 30B A3B Instruct Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model... | Qwen | 160K | 33K | $0.07 | Stream | |
Qwen: Qwen3 14B Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series... | Qwen | 132K | 41K | $0.10 | Stream | |
AionLabs: Aion-1.0 Aion-1.0 is a multi-model system designed for high performance across various ta... | Aion Labs | 131K | 33K | $4.00 | Stream | |
AionLabs: Aion-1.0-Mini Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 mode... | Aion Labs | 131K | 33K | $0.70 | Stream | |
AionLabs: Aion-2.0 Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and s... | Aion Labs | 131K | 33K | $0.80 | Stream | |
Arcee AI: Trinity Mini Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language m... | Arcee Ai | 131K | 131K | $0.04 | Stream | |
Arcee AI: Virtuoso Large Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned... | Arcee Ai | 131K | 64K | $0.75 | Stream | |
Baidu: ERNIE 4.5 VL 424B A47B ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu... | Baidu | 131K | 16K | $0.42 | Vision Stream | |
DeepSeek: DeepSeek V3 DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instru... | DeepSeek | 131K | 16K | $0.20 | Stream | |
DeepSeek: DeepSeek V3.2 DeepSeek-V3.2 is a large language model designed to harmonize high computational... | DeepSeek | 131K | 64K | $0.23 | Stream | |
Google: Gemma 3 12B Gemma 3 introduces multimodality, supporting vision-language input and text outp... | Google | 131K | 16K | $0.05 | Vision Stream | |
Google: Gemma 3 27B Gemma 3 introduces multimodality, supporting vision-language input and text outp... | Google | 131K | 16K | $0.08 | Vision Stream | |
Google: Gemma 3 4B Gemma 3 introduces multimodality, supporting vision-language input and text outp... | Google | 131K | 16K | $0.05 | Vision Stream | |
Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview) Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state... | Google | 131K | 66K | $0.50 | Vision Stream | |
IBM: Granite 4.1 8B Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from ... | Ibm Granite | 131K | 131K | $0.05 | Stream | |
Meta: Llama 3.1 70B Instruct Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flav... | Meta | 131K | 16K | $0.40 | Stream | |
Meta: Llama 3.1 8B Instruct Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flav... | Meta | 131K | 16K | $0.02 | Stream | |
Meta: Llama 3.2 11B Vision Instruct Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed ... | Meta | 131K | 16K | $0.34 | Vision Stream | |
Meta: Llama 3.2 1B Instruct Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently perf... | Meta | 131K | 60K | $0.03 | Stream | |
Meta: Llama 3.2 3B Instruct Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimiz... | Meta | 131K | 80K | $0.05 | Stream | |
Meta: Llama 3.2 3B Instruct (free) Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimiz... | Meta | 131K | 0 | Free | Stream | |
Meta: Llama 3.3 70B Instruct The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and i... | Meta | 131K | 16K | $0.10 | Stream | |
Meta: Llama 3.3 70B Instruct (free) The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and i... | Meta | 131K | 0 | Free | Stream | |
Microsoft: Phi 4 Mini Instruct Phi-4-mini-instruct is a lightweight open model built upon synthetic data and fi... | Microsoft | 131K | 128K | $0.08 | Stream | |
Mistral Large 2407 This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407... | Mistral AI | 131K | 0 | $2.00 | Stream | |
Mistral: Ministral 3 3B 2512 The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, effi... | Mistral AI | 131K | 0 | $0.10 | Vision Stream | |
Mistral: Mistral Medium 3 Mistral Medium 3 is a high-performance enterprise-grade language model designed ... | Mistral AI | 131K | 0 | $0.40 | Vision Stream | |
Mistral: Mistral Medium 3.1 Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-pe... | Mistral AI | 131K | 0 | $0.40 | Vision Stream | |
Mistral: Mistral Nemo A 12B parameter model with a 128k token context length built by Mistral in colla... | Mistral AI | 131K | 0 | $0.02 | Stream | |
MoonshotAI: Kimi K2 0711 Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model develo... | Moonshotai | 131K | 33K | $0.57 | Stream | |
Nous: Hermes 3 405B Instruct Hermes 3 is a generalist language model with many improvements over Hermes 2, in... | Nousresearch | 131K | 16K | $1.00 | Stream | |
Nous: Hermes 3 405B Instruct (free) Hermes 3 is a generalist language model with many improvements over Hermes 2, in... | Nousresearch | 131K | 0 | Free | Stream | |
Nous: Hermes 3 70B Instruct Hermes 3 is a generalist language model with many improvements over [Hermes 2](/... | Nousresearch | 131K | 16K | $0.70 | Stream | |
Nous: Hermes 4 405B Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and relea... | Nousresearch | 131K | 0 | $1.00 | Stream | |
Nous: Hermes 4 70B Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama... | Nousresearch | 131K | 0 | $0.13 | Stream | |
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/... | NVIDIA | 131K | 16K | $0.40 | Stream | |
OpenAI: gpt-oss-120b gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language... | OpenAI | 131K | 0 | $0.04 | Stream | |
OpenAI: gpt-oss-120b (free) gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language... | OpenAI | 131K | 131K | Free | Stream | |
OpenAI: gpt-oss-20b gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the A... | OpenAI | 131K | 0 | $0.03 | Stream | |
OpenAI: gpt-oss-20b (free) gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the A... | OpenAI | 131K | 8K | Free | Stream | |
OpenAI: gpt-oss-safeguard-20b gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss... | OpenAI | 131K | 66K | $0.07 | Stream | |
Prime Intellect: INTELLECT-3 INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-train... | Prime Intellect | 131K | 131K | $0.20 | Stream | |
Qwen: Qwen2.5 7B Instruct Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings th... | Qwen | 131K | 33K | $0.04 | Stream | |
Qwen: Qwen2.5 VL 72B Instruct Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, f... | Qwen | 131K | 128K | $0.80 | Vision Stream | |
Qwen: Qwen3 235B A22B Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by ... | Qwen | 131K | 8K | $0.45 | Stream | |
Qwen: Qwen3 30B A3B Qwen3, the latest generation in the Qwen large language model series, features b... | Qwen | 131K | 16K | $0.12 | Stream | |
Qwen: Qwen3 30B A3B Instruct 2507 Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language mod... | Qwen | 131K | 32K | $0.05 | Stream | |
Qwen: Qwen3 30B A3B Thinking 2507 Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning mode... | Qwen | 131K | 131K | $0.08 | Stream | |
Qwen: Qwen3 32B Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series... | Qwen | 131K | 16K | $0.08 | Stream | |
Qwen: Qwen3 8B Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, ... | Qwen | 131K | 8K | $0.05 | Stream | |
Qwen: Qwen3 VL 235B A22B Thinking Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text gener... | Qwen | 131K | 33K | $0.26 | Vision Stream | |
Qwen: Qwen3 VL 30B A3B Thinking Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generat... | Qwen | 131K | 33K | $0.13 | Vision Stream | |
Qwen2.5 72B Instruct Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings t... | Qwen | 131K | 16K | $0.36 | Stream | |
Sao10K: Llama 3.1 Euryale 70B v2.2 Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](http... | Sao10k | 131K | 16K | $0.85 | Stream | |
Sao10K: Llama 3.3 Euryale 70B Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://k... | Sao10k | 131K | 16K | $0.65 | Stream | |
Switchpoint Router Switchpoint AI's router instantly analyzes your request and directs it to the op... | Switchpoint | 131K | 0 | $0.85 | Stream | |
Tencent: Hunyuan A13B Instruct Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model d... | Tencent | 131K | 131K | $0.14 | Stream | |
TheDrummer: Cydonia 24B V4.1 Uncensored and creative writing model based on Mistral Small 3.2 24B with good r... | Thedrummer | 131K | 131K | $0.30 | Stream | |
Z.ai: GLM 4.5 GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based a... | Z Ai | 131K | 98K | $0.60 | Stream | |
Z.ai: GLM 4.5 Air GLM-4.5-Air is the lightweight variant of our latest flagship model family, also... | Z Ai | 131K | 131K | $0.13 | Stream | |
Z.ai: GLM 4.6V GLM-4.6V is a large multimodal model designed for high-fidelity visual understan... | Z Ai | 131K | 33K | $0.30 | Vision Stream | |
IBM: Granite 4.0 Micro Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These... | Ibm Granite | 131K | 131K | $0.02 | Stream | |
Amazon: Nova Micro 1.0 Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency resp... | Amazon | 128K | 5K | $0.04 | Stream | |
ByteDance: UI-TARS 7B UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based enviro... | Bytedance | 128K | 2K | $0.10 | Vision Stream | |
Cohere: Command R (08-2024) command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with... | Cohere | 128K | 4K | $0.15 | Stream | |
Cohere: Command R+ (08-2024) command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r... | Cohere | 128K | 4K | $2.50 | Stream | |
Cohere: Command R7B (12-2024) Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered... | Cohere | 128K | 4K | $0.04 | Stream | |
Deep Cogito: Cogito v2.1 671B Cogito v2.1 671B MoE represents one of the strongest open models globally, match... | Deepcogito | 128K | 0 | $1.25 | Stream | |
DeepSeek: R1 Distill Llama 70B DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llam... | DeepSeek | 128K | 8K | $0.80 | Stream | |
DeepSeek: R1 Distill Qwen 32B DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen ... | DeepSeek | 128K | 33K | $0.29 | Stream | |
Inception: Mercury 2 Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion ... | Inception | 128K | 50K | $0.25 | Stream | |
LiquidAI: LFM2-24B-A2B LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures des... | Liquid | 128K | 0 | $0.03 | Stream | |
Mistral Large This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-240... | Mistral AI | 128K | 0 | $2.00 | Stream | |
Mistral: Mistral Small 3.1 24B Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501),... | Mistral AI | 128K | 128K | $0.35 | Vision Stream | |
Mistral: Mistral Small 3.2 24B Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistr... | Mistral AI | 128K | 16K | $0.07 | Vision Stream | |
NVIDIA: Nemotron 3.5 Content Safety (free) NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrai... | NVIDIA | 128K | 8K | Free | Vision Stream | |
NVIDIA: Nemotron Nano 12B 2 VL (free) NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning mo... | NVIDIA | 128K | 128K | Free | Vision Stream | |
NVIDIA: Nemotron Nano 9B V2 (free) NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch ... | NVIDIA | 128K | 0 | Free | Stream | |
OpenAI: GPT Audio The gpt-audio model is OpenAI's first generally available audio model. The new s... | OpenAI | 128K | 16K | $2.50 | Stream | |
OpenAI: GPT Audio Mini A cost-efficient version of GPT Audio. The new snapshot features an upgraded dec... | OpenAI | 128K | 16K | $0.60 | Stream | |
OpenAI: GPT-4 Turbo The latest GPT-4 Turbo model with vision capabilities. Vision requests can now u... | OpenAI | 128K | 4K | $10.00 | Vision Stream | |
OpenAI: GPT-4 Turbo Preview The preview GPT-4 model with improved instruction following, JSON mode, reproduc... | OpenAI | 128K | 4K | $10.00 | Stream | |
OpenAI: GPT-4o GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and im... | OpenAI | 128K | 16K | $2.50 | Vision Stream | |
OpenAI: GPT-4o (2024-05-13) GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and im... | OpenAI | 128K | 4K | $5.00 | Vision Stream | |
OpenAI: GPT-4o (2024-08-06) The 2024-08-06 version of GPT-4o offers improved performance in structured outpu... | OpenAI | 128K | 16K | $2.50 | Vision Stream | |
OpenAI: GPT-4o (2024-11-20) The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability wi... | OpenAI | 128K | 16K | $2.50 | Vision Stream | |
OpenAI: GPT-4o Search Preview GPT-4o Search Preview is a specialized model for web search in Chat Completions. ... | OpenAI | 128K | 16K | $2.50 | Stream | |
OpenAI: GPT-4o-mini GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), ... | OpenAI | 128K | 16K | $0.15 | Vision Stream | |
OpenAI: GPT-4o-mini (2024-07-18) GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), ... | OpenAI | 128K | 16K | $0.15 | Vision Stream | |
OpenAI: GPT-4o-mini Search Preview GPT-4o mini Search Preview is a specialized model for web search in Chat Complet... | OpenAI | 128K | 16K | $0.15 | Stream | |
OpenAI: GPT-5 Chat GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conv... | OpenAI | 128K | 16K | $1.25 | Vision Stream | |
OpenAI: GPT-5.1 Chat GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, opt... | OpenAI | 128K | 32K | $1.25 | Vision Stream | |
OpenAI: GPT-5.2 Chat GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, op... | OpenAI | 128K | 16K | $1.75 | Vision Stream | |
OpenAI: GPT-5.3 Chat GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conve... | OpenAI | 128K | 16K | $1.75 | Vision Stream | |
Body Builder (beta) Transform your natural language requests into structured OpenRouter API request ... | Openrouter | 128K | 0 | Variable | Stream | |
OpenRouter: Fusion Fusion turns your prompt into a small multi-model deliberation. A panel of exper... | Openrouter | 128K | 0 | Variable | Stream | |
Perplexity: Sonar Deep Research Sonar Deep Research is a research-focused model designed for multi-step retrieva... | Perplexity | 128K | 0 | $2.00 | Stream | |
Perplexity: Sonar Reasoning Pro Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](h... | Perplexity | 128K | 0 | $2.00 | Vision Stream | |
Qwen2.5 Coder 32B Instruct Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (... | Qwen | 128K | 33K | $0.66 | Stream | |
Upstage: Solar Pro 3 Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With ... | Upstage | 128K | 0 | $0.15 | Stream | |
Perplexity: Sonar Sonar is lightweight, affordable, fast, and simple to use — now featuring citati... | Perplexity | 127K | 0 | $1.00 | Vision Stream | |
Morph: Morph V3 Fast Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy... | Morph | 82K | 38K | $0.80 | Stream | |
AllenAI: Olmo 3 32B Think Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for ... | Allenai | 66K | 66K | $0.15 | Stream | |
Google: Nano Banana Pro (Gemini 3 Pro Image Preview) Nano Banana Pro is Google’s most advanced image-generation and editing model, bu... | Google | 66K | 33K | $2.00 | Vision Stream | |
WizardLM-2 8x22B WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates h... | Microsoft | 66K | 8K | $0.62 | Stream | |
MiniMax: MiniMax M2-her MiniMax M2-her is a dialogue-first large language model built for immersive role... | Minimax | 66K | 2K | $0.30 | Stream | |
Mistral: Mixtral 8x22B Instruct Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistra... | Mistral AI | 66K | 0 | $2.00 | Stream | |
Reka Flash 3 Reka Flash 3 is a general-purpose, instruction-tuned large language model with 2... | Rekaai | 66K | 66K | $0.10 | Stream | |
Z.ai: GLM 4.5V GLM-4.5V is a vision-language foundation model for multimodal agent applications... | Z Ai | 66K | 16K | $0.60 | Vision Stream | |
AionLabs: Aion-RP 1.0 (8B) Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of th... | Aion Labs | 33K | 33K | $0.80 | Stream | |
Magnum v4 72B This is a series of models designed to replicate the prose quality of the Claude... | Anthracite Org | 33K | 2K | $3.00 | Stream | |
Arcee AI: Coder Large Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been fur... | Arcee Ai | 33K | 0 | $0.50 | Stream | |
Venice: Uncensored (free) Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of ... | Cognitivecomputations | 33K | 0 | Free | Stream | |
EssentialAI: Rnj 1 Instruct Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential... | Essentialai | 33K | 0 | $0.15 | Stream | |
Google: Gemma 3n 4B Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource ... | Google | 33K | 0 | $0.06 | Stream | |
Google: Nano Banana (Gemini 2.5 Flash Image) Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is ... | Google | 33K | 33K | $0.30 | Vision Stream | |
LiquidAI: LFM2.5-1.2B-Instruct (free) LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model buil... | Liquid | 33K | 0 | Free | Stream | |
LiquidAI: LFM2.5-1.2B-Thinking (free) LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agen... | Liquid | 33K | 0 | Free | Stream | |
Mistral: Mistral Small 3 Mistral Small 3 is a 24B-parameter language model optimized for low-latency perf... | Mistral AI | 33K | 16K | $0.05 | Stream | |
Mistral: Saba Mistral Saba is a 24B-parameter language model specifically designed for the Mid... | Mistral AI | 33K | 0 | $0.20 | Stream | |
Perceptron: Perceptron Mk1 Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model ... | Perceptron | 33K | 8K | $0.15 | Vision Stream | |
TheDrummer: Rocinante 12B Rocinante 12B is designed for engaging storytelling and rich prose. Early tester... | Thedrummer | 33K | 33K | $0.17 | Stream | |
TheDrummer: Skyfall 36B V2 Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine... | Thedrummer | 33K | 33K | $0.55 | Stream | |
TheDrummer: UnslopNemo 12B UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed f... | Thedrummer | 33K | 33K | $0.40 | Stream | |
Mistral: Voxtral Small 24B 2507 Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-a... | Mistral AI | 32K | 0 | $0.10 | Stream | |
OpenAI: GPT-3.5 Turbo GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural ... | OpenAI | 16K | 4K | $0.50 | Stream | |
OpenAI: GPT-3.5 Turbo 16k This model offers four times the context length of gpt-3.5-turbo, allowing it to... | OpenAI | 16K | 4K | $3.00 | Stream | |
Microsoft: Phi 4 [Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex re... | Microsoft | 16K | 16K | $0.07 | Stream | |
Reka Edge Reka Edge is an extremely efficient 7B multimodal vision-language model that acc... | Rekaai | 16K | 16K | $0.10 | Vision Stream | |
Sao10K: Llama 3.1 70B Hanami x1 This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-... | Sao10k | 16K | 0 | $3.00 | Stream | |
Google: Gemma 2 27B Gemma 2 27B by Google is an open model built from the same research and technolo... | Google | 8K | 2K | $0.65 | Stream | |
Meta: Llama 3 70B Instruct Meta's latest class of model (Llama 3) launched with a variety of sizes & flavor... | Meta | 8K | 8K | $0.51 | Stream | |
Meta: Llama 3 8B Instruct Meta's latest class of model (Llama 3) launched with a variety of sizes & flavor... | Meta | 8K | 0 | $0.14 | Stream | |
Sao10K: Llama 3 8B Lunaris Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It'... | Sao10k | 8K | 16K | $0.04 | Stream | |
OpenAI: GPT-4 OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capabl... | OpenAI | 8K | 4K | $30.00 | Stream | |
Inflection: Inflection 3 Pi Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backs... | Inflection | 8K | 1K | $2.50 | Stream | |
Inflection: Inflection 3 Productivity Inflection 3 Productivity is optimized for following instructions. It is better ... | Inflection | 8K | 1K | $2.50 | Stream | |
Mancer: Weaver (alpha) An attempt to recreate Claude-style verbosity, but don't expect the same level o... | Mancer | 8K | 2K | $0.75 | Stream | |
ReMM SLERP 13B A recreation trial of the original MythoMax-L2-B13 but with updated models. #mer... | Undi95 | 6K | 4K | $0.45 | Stream | |
MythoMax 13B One of the highest performing and most popular fine-tunes of Llama 2 13B, with r... | Gryphe | 4K | 4K | $0.06 | Stream | |
OpenAI: GPT-3.5 Turbo (older v0613) GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural ... | OpenAI | 4K | 4K | $1.00 | Stream | |
OpenAI: GPT-3.5 Turbo Instruct This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omi... | OpenAI | 4K | 4K | $1.50 | Stream |
Understanding LLM context windows
What counts toward the limit?
The context window covers everything the model processes in a single request: system instructions, conversation history, retrieved documents, and the space reserved for the model's response. If your input plus expected output exceeds the limit, you need a model with a larger window or a chunking strategy.
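The rule above (input plus reserved output must stay within the window) can be sketched as a small check. This is a minimal illustration, not the page's own fit calculator: the function names are hypothetical, and the ratio of roughly 4 characters per token is a loose assumption for English text that real tokenizers will not match exactly.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The 4-characters-per-token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(input_tokens: int, reserved_output: int, context_window: int) -> bool:
    """True when the prompt plus the space reserved for the response fits."""
    return input_tokens + reserved_output <= context_window


# Hypothetical request: system prompt + history + retrieved documents.
prompt_tokens = estimate_tokens("x" * 480_000)  # 120,000 tokens under the heuristic

# 120,000 + 16,000 = 136,000 tokens, which exceeds a 131,000-token window.
print(fits_in_context(prompt_tokens, 16_000, 131_000))    # False
# The same request fits comfortably in a 1,000,000-token window.
print(fits_in_context(prompt_tokens, 16_000, 1_000_000))  # True
```

For production use, counting tokens with the target model's own tokenizer is safer than any character-based estimate.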
Common context window tiers
4K–8K: simple prompts and short chats.
32K–128K: long articles, code files, and medium conversations.
1M+: books, video transcripts, and large knowledge bases.
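As a rough illustration, the tiers above can be turned into a lookup that returns the smallest tier able to hold a request of a given size. The helper name and the choice of 1K = 1,000 tokens are assumptions. Anything larger than 128K is mapped to the 1M+ tier, even though many models in the table sit between those two sizes.

```python
# Upper bounds (in tokens) for the first two tiers, assuming 1K = 1,000 tokens.
TIERS = [
    (8_000, "4K–8K: simple prompts and short chats"),
    (128_000, "32K–128K: long articles, code files, and medium conversations"),
]
LARGEST_TIER = "1M+: books, video transcripts, and large knowledge bases"


def smallest_tier_for(total_tokens: int) -> str:
    """Return the smallest listed tier whose upper bound covers total_tokens.

    total_tokens should include the space reserved for the response.
    """
    for upper_bound, label in TIERS:
        if total_tokens <= upper_bound:
            return label
    # No upper bound is checked here: a multi-million-token request may still
    # exceed a specific model's window, so verify against the table.
    return LARGEST_TIER


for size in (6_000, 20_000, 200_000):
    print(size, "->", smallest_tier_for(size))
```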
Context vs. cost trade-off
Models with very long context windows often charge more per token and respond more slowly on long prompts. For many applications, splitting documents into smaller chunks and retrieving only the relevant ones is cheaper, and often more accurate, than sending everything to a very-long-context model.
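To make the trade-off concrete, here is a minimal sketch of overlapping chunking plus a back-of-the-envelope cost comparison. It assumes the table's input price is quoted in USD per 1M input tokens, which this page does not state explicitly. The function names, chunk sizes, and document size are hypothetical, and the retrieval step that picks the relevant chunks is left out.

```python
def chunk_words(words, chunk_size, overlap=0):
    """Split a list of words into chunks of chunk_size that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not words:
        return []
    step = chunk_size - overlap
    # Stop early so the final chunk always contributes at least one new word.
    last_start = max(len(words) - overlap, 1)
    return [words[i:i + chunk_size] for i in range(0, last_start, step)]


def input_cost(tokens, price_per_million):
    """Input cost in USD, assuming the price is quoted per 1M input tokens."""
    return tokens / 1_000_000 * price_per_million


print(chunk_words("a b c d e f g h i j".split(), chunk_size=4, overlap=1))

# Hypothetical 500,000-token document on a model priced at $2.00 per 1M tokens.
full_document = input_cost(500_000, 2.00)
# Retrieval instead sends only the 4 most relevant 2,000-token chunks.
retrieved_only = input_cost(4 * 2_000, 2.00)
print(f"full: ${full_document:.3f}  retrieval: ${retrieved_only:.3f}")
```

The comparison ignores the cost of embedding and indexing the chunks, which is typically paid once per document rather than once per query.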
Filter by capability
Use the capability badges to find models that support vision, function calling, JSON mode, or streaming. Combine filters to narrow the list to models that suit your production workload.
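Combining filters amounts to requiring every selected capability plus a minimum context size. The sketch below shows that logic on three rows from the table. The `Model` record, the lowercase capability names, and the rounded token counts (128K taken as 128,000) are assumptions for illustration, not the site's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int
    capabilities: frozenset


# Three rows from the table, with context sizes rounded as displayed.
MODELS = [
    Model("OpenAI: GPT-4o", 128_000, frozenset({"vision", "stream"})),
    Model("Meta: Llama 3.1 8B Instruct", 131_000, frozenset({"stream"})),
    Model("Reka Edge", 16_000, frozenset({"vision", "stream"})),
]


def filter_models(models, required, min_context=0):
    """Keep models that have every required capability and enough context."""
    required = frozenset(required)
    return [
        m for m in models
        if required <= m.capabilities and m.context_tokens >= min_context
    ]


matches = filter_models(MODELS, {"vision", "stream"}, min_context=100_000)
print([m.name for m in matches])  # ['OpenAI: GPT-4o']
```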