VePrompts

LLM Context Window Comparison

Compare context windows across 328 models. Filter by provider, modality, and minimum context size. Use the fit calculator to find models that handle your documents.
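The filters described above can be approximated in a few lines. This is a minimal sketch, not the site's implementation: the `Model` record, its field names, and `filter_models` are hypothetical, and the sample rows are copied by hand from the table below using the rounded figures shown there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    context: int   # context window in tokens (rounded, as listed)
    vision: bool   # True if the row lists "Vision" under capabilities


# Hand-copied sample of rows from the table; not the full list of 328 models.
MODELS = [
    Model("Meta: Llama 4 Scout", "Meta", 10_000_000, True),
    Model("OpenAI: GPT-5.4", "OpenAI", 1_100_000, True),
    Model("DeepSeek: DeepSeek V4 Pro", "DeepSeek", 1_000_000, False),
    Model("OpenAI: GPT-5", "OpenAI", 400_000, True),
    Model("Anthropic: Claude Haiku 4.5", "Anthropic", 200_000, True),
]


def filter_models(models, provider=None, vision=None, min_context=0):
    """Return the models that match every filter that is set."""
    return [
        m for m in models
        if (provider is None or m.provider == provider)
        and (vision is None or m.vision == vision)
        and m.context >= min_context
    ]


for m in filter_models(MODELS, provider="OpenAI", min_context=500_000):
    print(m.name)
```

Leaving a filter at its default (`None` or `0`) disables it, which mirrors how an unset filter on the page shows all rows.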

Last updated: 2026-06-13

Will your document fit?
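A fit check of this kind can be sketched as follows. The page does not describe how its calculator works, so everything here is an assumption: the 4-characters-per-token ratio is a rough rule of thumb for English text (real tokenizers vary by model), the reserved output budget is arbitrary, and the names `estimate_tokens` and `models_that_fit` are hypothetical. Context sizes are the rounded values from the table.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough average for English text

# (model name, context window in tokens), rounded values copied from the table
CONTEXT_WINDOWS = [
    ("Meta: Llama 4 Scout", 10_000_000),
    ("xAI: Grok 4.20", 2_000_000),
    ("Google: Gemini 2.5 Pro", 1_000_000),
    ("OpenAI: GPT-5", 400_000),
    ("Anthropic: Claude Haiku 4.5", 200_000),
    ("Meta: Llama 3.3 70B Instruct", 131_000),
]


def estimate_tokens(text: str) -> int:
    """Estimate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def models_that_fit(doc_tokens: int, reserved_output: int = 4_000) -> list[str]:
    """Return models whose window holds the document plus an output budget."""
    needed = doc_tokens + reserved_output
    return [name for name, ctx in CONTEXT_WINDOWS if ctx >= needed]


doc = "word " * 300_000  # stand-in for a 1.5-million-character document
tokens = estimate_tokens(doc)
print(tokens, models_that_fit(tokens))
```

Reserving part of the window for output matters because the prompt and the response generally share the same context budget; for an exact count, use the tokenizer of the specific model instead of a character ratio.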

Showing 328 of 328 models
Model | Provider | Context ↓ | Output max | Input price | Capabilities | Visual
Meta: Llama 4 Scout
Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model de...
Meta10.0M16K$0.10
Vision Stream
Auto Router
Your prompt will be processed by a meta-model and routed to one of dozens of mod...
Openrouter2.0M0Variable
Vision Stream
Pareto Code Router
The Pareto Router maintains a tiered shortlist of strong coding models, ranked b...
Openrouter2.0M0Variable
Stream
xAI: Grok 4.20
Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic ...
xAI2.0M0$1.25
Vision Stream
xAI: Grok 4.20 Multi-Agent
Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative...
xAI2.0M0$2.00
Vision Stream
OpenAI: GPT-5.4
GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into...
OpenAI1.1M128K$2.50
Vision Stream
OpenAI: GPT-5.4 Pro
GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified archi...
OpenAI1.1M128K$30.00
Vision Stream
OpenAI: GPT-5.5
GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, ...
OpenAI1.1M128K$5.00
Vision Stream
OpenAI: GPT-5.5 Pro
GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and a...
OpenAI1.1M128K$30.00
Vision Stream
Google: Gemini 3.1 Pro Preview Custom Tools
Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves...
Google1.0M66K$2.00
Vision Stream
Owl Alpha
Owl Alpha is a high-performance foundation model designed for agentic workloads....
Openrouter1.0M262KFree
Stream
DeepSeek: DeepSeek V4 Flash
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepS...
DeepSeek1.0M0$0.10
Stream
DeepSeek: DeepSeek V4 Pro
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6...
DeepSeek1.0M384K$0.43
Stream
Google: Gemini 2.5 Flash
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically desi...
Google1.0M66K$0.30
Vision Stream
Google: Gemini 2.5 Flash Lite
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family,...
Google1.0M66K$0.10
Vision Stream
Google: Gemini 2.5 Flash Lite Preview 09-2025
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family,...
Google1.0M66K$0.10
Vision Stream
Google: Gemini 2.5 Pro
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso...
Google1.0M66K$1.25
Vision Stream
Google: Gemini 2.5 Pro Preview 05-06
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso...
Google1.0M66K$1.25
Vision Stream
Google: Gemini 2.5 Pro Preview 06-05
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reaso...
Google1.0M66K$1.25
Vision Stream
Google: Gemini 3 Flash Preview
Gemini 3 Flash Preview is a high speed, high value thinking model designed for a...
Google1.0M66K$0.50
Vision Stream
Google: Gemini 3.1 Flash Lite
Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized ...
Google1.0M66K$0.25
Vision Stream
Google: Gemini 3.1 Flash Lite Preview
Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for hi...
Google1.0M66K$0.25
Vision Stream
Google: Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced...
Google1.0M66K$2.00
Vision Stream
Google: Gemini 3.5 Flash
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro...
Google1.0M66K$1.50
Vision Stream
Google: Lyria 3 Clip Preview
30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's famil...
Google1.0M66KFree
Vision Stream
Google: Lyria 3 Pro Preview
Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of mu...
Google1.0M66KFree
Vision Stream
Meta: Llama 4 Maverick
Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language mode...
Meta1.0M16K$0.15
Vision Stream
MiniMax: MiniMax M3
MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, imag...
Minimax1.0M512K$0.30
Vision Stream
Qwen: Qwen3 Coder 480B A35B
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation mod...
Qwen1.0M66K$0.22
Stream
Qwen: Qwen3 Coder 480B A35B (free)
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation mod...
Qwen1.0M262KFree
Stream
Xiaomi: MiMo-V2.5
MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic p...
Xiaomi1.0M131K$0.14
Vision Stream
Xiaomi: MiMo-V2.5-Pro
MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in gener...
Xiaomi1.0M131K$0.43
Stream
OpenAI: GPT-4.1
GPT-4.1 is a flagship large language model optimized for advanced instruction fo...
OpenAI1.0M0$2.00
Vision Stream
OpenAI: GPT-4.1 Mini
GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o...
OpenAI1.0M33K$0.40
Vision Stream
OpenAI: GPT-4.1 Nano
For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest mode...
OpenAI1.0M33K$0.10
Vision Stream
Writer: Palmyra X5
Palmyra X5 is Writer's most advanced model, purpose-built for building and scali...
Writer1.0M8K$0.60
Stream
MiniMax: MiniMax-01
MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 f...
Minimax1.0M1.0M$0.20
Vision Stream
Amazon: Nova 2 Lite
Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads tha...
Amazon1.0M66K$0.30
Vision Stream
Amazon: Nova Premier 1.0
Amazon Nova Premier is the most capable of Amazon’s multimodal models for comple...
Amazon1.0M32K$2.50
Vision Stream
Anthropic: Claude Fable 5
Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous know...
Anthropic1.0M128K$10.00
Vision Stream
Anthropic: Claude Opus 4.6
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional...
Anthropic1.0M128K$5.00
Vision Stream
Anthropic: Claude Opus 4.6 (Fast)
Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabili...
Anthropic1.0M128K$30.00
Vision Stream
Anthropic: Claude Opus 4.7
Opus 4.7 is the next generation of Anthropic's Opus family, built for long-runni...
Anthropic1.0M128K$5.00
Vision Stream
Anthropic: Claude Opus 4.7 (Fast)
Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabili...
Anthropic1.0M128K$30.00
Vision Stream
Anthropic: Claude Opus 4.8
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opu...
Anthropic1.0M128K$5.00
Vision Stream
Anthropic: Claude Opus 4.8 (Fast)
Fast-mode variant of [Opus 4.8](/anthropic/claude-opus-4.8) - identical capabili...
Anthropic1.0M128K$10.00
Vision Stream
Anthropic: Claude Sonnet 4
Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonn...
Anthropic1.0M64K$3.00
Vision Stream
Anthropic: Claude Sonnet 4.5
Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized f...
Anthropic1.0M64K$3.00
Vision Stream
Anthropic: Claude Sonnet 4.6
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier per...
Anthropic1.0M128K$3.00
Vision Stream
MiniMax: MiniMax M1
MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended c...
Minimax1.0M40K$0.40
Stream
NVIDIA: Nemotron 3 Super
NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating ju...
NVIDIA1.0M0$0.09
Stream
NVIDIA: Nemotron 3 Super (free)
NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating ju...
NVIDIA1.0M262KFree
Stream
NVIDIA: Nemotron 3 Ultra
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model fr...
NVIDIA1.0M16K$0.50
Stream
NVIDIA: Nemotron 3 Ultra (free)
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model fr...
NVIDIA1.0M66KFree
Stream
Qwen: Qwen Plus 0728
Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybr...
Qwen1.0M33K$0.26
Stream
Qwen: Qwen Plus 0728 (thinking)
Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybr...
Qwen1.0M33K$0.26
Stream
Qwen: Qwen-Plus
Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a...
Qwen1.0M33K$0.26
Stream
Qwen: Qwen3 Coder Flash
Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their propriet...
Qwen1.0M66K$0.20
Stream
Qwen: Qwen3 Coder Plus
Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder...
Qwen1.0M66K$0.65
Stream
Qwen: Qwen3.5 Plus 2026-02-15
The Qwen3.5 native vision-language series Plus models are built on a hybrid arch...
Qwen1.0M66K$0.26
Vision Stream
Qwen: Qwen3.5 Plus 2026-04-20
Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibab...
Qwen1.0M66K$0.30
Vision Stream
Qwen: Qwen3.5-Flash
The Qwen3.5 native vision-language Flash models are built on a hybrid architectu...
Qwen1.0M66K$0.07
Vision Stream
Qwen: Qwen3.6 Flash
Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series...
Qwen1.0M66K$0.19
Vision Stream
Qwen: Qwen3.6 Plus
Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear att...
Qwen1.0M66K$0.33
Vision Stream
Qwen: Qwen3.7 Max
Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text ...
Qwen1.0M66K$1.25
Stream
Qwen: Qwen3.7 Plus
Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports ...
Qwen1.0M66K$0.32
Vision Stream
xAI: Grok 4.3
Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with te...
xAI1.0M0$1.25
Vision Stream
OpenAI: GPT Chat Latest
GPT Chat Latest points to OpenAI's stable API alias `chat-latest` that always re...
OpenAI400K128K$5.00
Vision Stream
OpenAI: GPT-5
GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning,...
OpenAI400K128K$1.25
Vision Stream
OpenAI: GPT-5 Codex
GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering...
OpenAI400K128K$1.25
Vision Stream
OpenAI: GPT-5 Image
[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model ...
OpenAI400K128K$10.00
Vision Stream
OpenAI: GPT-5 Image Mini
GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [G...
OpenAI400K128K$2.50
Vision Stream
OpenAI: GPT-5 Mini
GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reas...
OpenAI400K128K$0.25
Vision Stream
OpenAI: GPT-5 Nano
GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized fo...
OpenAI400K0$0.05
Vision Stream
OpenAI: GPT-5 Pro
GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reason...
OpenAI400K128K$15.00
Vision Stream
OpenAI: GPT-5.1
GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronge...
OpenAI400K128K$1.25
Vision Stream
OpenAI: GPT-5.1-Codex
GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software enginee...
OpenAI400K128K$1.25
Vision Stream
OpenAI: GPT-5.1-Codex-Max
GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-run...
OpenAI400K128K$1.25
Vision Stream
OpenAI: GPT-5.1-Codex-Mini
GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex...
OpenAI400K100K$0.25
Vision Stream
OpenAI: GPT-5.2
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronge...
OpenAI400K128K$1.75
Vision Stream
OpenAI: GPT-5.2 Pro
GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agen...
OpenAI400K128K$21.00
Vision Stream
OpenAI: GPT-5.2-Codex
GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software eng...
OpenAI400K128K$1.75
Vision Stream
OpenAI: GPT-5.3-Codex
GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the fron...
OpenAI400K128K$1.75
Vision Stream
OpenAI: GPT-5.4 Mini
GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient...
OpenAI400K128K$0.75
Vision Stream
OpenAI: GPT-5.4 Nano
GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 f...
OpenAI400K128K$0.20
Vision Stream
Amazon: Nova Lite 1.0
Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focuse...
Amazon300K5K$0.06
Vision Stream
Amazon: Nova Pro 1.0
Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providi...
Amazon300K5K$0.80
Vision Stream
OpenAI: GPT-5.4 Image 2
[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5....
OpenAI272K128K$8.00
Vision Stream
Arcee AI: Trinity Large Thinking
Trinity Large Thinking is a powerful open source reasoning model from the team a...
Arcee Ai262K262K$0.22
Stream
ByteDance Seed: Seed 1.6
Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It inco...
Bytedance Seed262K33K$0.25
Vision Stream
ByteDance Seed: Seed 1.6 Flash
Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed...
Bytedance Seed262K33K$0.07
Vision Stream
ByteDance Seed: Seed-2.0-Lite
Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers ...
Bytedance Seed262K131K$0.25
Vision Stream
ByteDance Seed: Seed-2.0-Mini
Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive sc...
Bytedance Seed262K131K$0.10
Vision Stream
Google: Gemma 4 26B A4B
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from G...
Google262K0$0.06
Vision Stream
Google: Gemma 4 26B A4B (free)
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from G...
Google262K33KFree
Vision Stream
Google: Gemma 4 31B
Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supportin...
Google262K262K$0.12
Vision Stream
Google: Gemma 4 31B (free)
Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supportin...
Google262K33KFree
Vision Stream
inclusionAI: Ling-2.6-1T
Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s tr...
Inclusionai262K33K$0.07
Stream
inclusionAI: Ling-2.6-flash
Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total p...
Inclusionai262K33K$0.01
Stream
inclusionAI: Ring-2.6-1T
Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, b...
Inclusionai262K66K$0.07
Stream
Mistral: Devstral 2 2512
Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in...
Mistral AI262K0$0.40
Stream
Mistral: Ministral 3 14B 2512
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier cap...
Mistral AI262K0$0.20
Vision Stream
Mistral: Ministral 3 8B 2512
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, effici...
Mistral AI262K0$0.15
Vision Stream
Mistral: Mistral Large 3 2512
Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse...
Mistral AI262K0$0.50
Vision Stream
Mistral: Mistral Medium 3.5
Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. ...
Mistral AI262K0$1.50
Vision Stream
Mistral: Mistral Small 4
Mistral Small 4 is the next major release in the Mistral Small family, unifying ...
Mistral AI262K0$0.15
Vision Stream
MoonshotAI: Kimi K2 0905
Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It i...
Moonshotai262K262K$0.60
Stream
MoonshotAI: Kimi K2 Thinking
Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, ex...
Moonshotai262K262K$0.60
Stream
MoonshotAI: Kimi K2.5
Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art ...
Moonshotai262K0$0.38
Vision Stream
MoonshotAI: Kimi K2.6
Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-h...
Moonshotai262K262K$0.68
Vision Stream
MoonshotAI: Kimi K2.7 Code
MoonshotAI: Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 fa...
Moonshotai262K0$0.95
Vision Stream
Morph: Morph V3 Large
Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with...
Morph262K131K$0.90
Stream
Nex AGI: Nex-N2-Pro (free)
Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active ...
Nex Agi262K262KFree
Vision Stream
NVIDIA: Nemotron 3 Nano 30B A3B
NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest comput...
NVIDIA262K228K$0.05
Stream
Poolside: Laguna M.1 (free)
Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.a...
Poolside262K33KFree
Stream
Poolside: Laguna XS.2 (free)
Laguna XS.2 is the second-generation model in the XS size class from [Poolside](...
Poolside262K33KFree
Stream
Qwen: Qwen3 235B A22B Instruct 2507
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-ex...
Qwen262K16K$0.09
Stream
Qwen: Qwen3 235B A22B Thinking 2507
Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Expe...
Qwen262K262K$0.10
Stream
Qwen: Qwen3 Coder Next
Qwen3-Coder-Next is an open-weight causal language model optimized for coding ag...
Qwen262K262K$0.11
Stream
Qwen: Qwen3 Max
Qwen3-Max is an updated release built on the Qwen3 series, offering major improv...
Qwen262K33K$0.78
Stream
Qwen: Qwen3 Max Thinking
Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed...
Qwen262K33K$0.78
Stream
Qwen: Qwen3 Next 80B A3B Instruct
Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next...
Qwen262K16K$0.09
Stream
Qwen: Qwen3 Next 80B A3B Instruct (free)
Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next...
Qwen262K0Free
Stream
Qwen: Qwen3 Next 80B A3B Thinking
Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next li...
Qwen262K33K$0.10
Stream
Qwen: Qwen3 VL 235B A22B Instruct
Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies stro...
Qwen262K16K$0.20
Vision Stream
Qwen: Qwen3 VL 30B A3B Instruct
Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generat...
Qwen262K33K$0.13
Vision Stream
Qwen: Qwen3 VL 32B Instruct
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed...
Qwen262K33K$0.10
Vision Stream
Qwen: Qwen3.5 397B A17B
The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid a...
Qwen262K66K$0.39
Vision Stream
Qwen: Qwen3.5-122B-A10B
The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architec...
Qwen262K262K$0.26
Vision Stream
Qwen: Qwen3.5-27B
The Qwen3.5 27B native vision-language Dense model incorporates a linear attenti...
Qwen262K66K$0.20
Vision Stream
Qwen: Qwen3.5-35B-A3B
The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hyb...
Qwen262K262K$0.14
Vision Stream
Qwen: Qwen3.5-9B
Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to...
Qwen262K262K$0.10
Vision Stream
Qwen: Qwen3.6 27B
Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at...
Qwen262K262K$0.29
Vision Stream
Qwen: Qwen3.6 35B A3B
Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 bi...
Qwen262K262K$0.15
Vision Stream
Qwen: Qwen3.6 Max Preview
Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on ...
Qwen262K66K$1.04
Stream
StepFun: Step 3.5 Flash
Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on ...
Stepfun262K16K$0.09
Stream
Tencent: Hy3 preview
Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed ...
Tencent262K0$0.06
Stream
Xiaomi: MiMo-V2-Flash
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. I...
Xiaomi262K66K$0.10
Stream
Z.ai: GLM 5 Turbo
GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong perf...
Z Ai262K131K$1.20
Stream
AI21: Jamba Large 1.7
Jamba Large 1.7 is the latest model in the Jamba open family, offering improveme...
AI21 Labs256K4K$2.00
Stream
Cohere: Command A
Command A is an open-weights 111B parameter model with a 256k context window foc...
Cohere256K8K$2.50
Stream
Kwaipilot: KAT-Coder-Pro V2
KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder ser...
Kwaipilot256K80K$0.30
Stream
Mistral: Codestral 2508
Mistral's cutting-edge language model for coding released end of July 2025. Code...
Mistral AI256K0$0.30
Stream
NVIDIA: Nemotron 3 Nano 30B A3B (free)
NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest comput...
NVIDIA256K0Free
Stream
NVIDIA: Nemotron 3 Nano Omni (free)
NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to func...
NVIDIA256K66KFree
Vision Stream
Qwen: Qwen3 VL 8B Instruct
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL ser...
Qwen256K33K$0.08
Vision Stream
Qwen: Qwen3 VL 8B Thinking
Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multi...
Qwen256K33K$0.12
Vision Stream
Relace: Relace Apply 3
Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits...
Relace256K128K$0.85
Stream
Relace: Relace Search
The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to ex...
Relace256K128K$1.00
Stream
StepFun: Step 3.7 Flash
Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts...
Stepfun256K256K$0.20
Vision Stream
xAI: Grok Build 0.1
Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic softw...
xAI256K0$1.00
Vision Stream
MiniMax: MiniMax M2
MiniMax-M2 is a compact, high-efficiency large language model optimized for end-...
Minimax205K197K$0.26
Stream
MiniMax: MiniMax M2.1
MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized f...
Minimax205K197K$0.29
Stream
MiniMax: MiniMax M2.5
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity...
Minimax205K197K$0.15
Stream
MiniMax: MiniMax M2.7
MiniMax-M2.7 is a next-generation large language model designed for autonomous, ...
Minimax205K131K$0.25
Stream
Z.ai: GLM 4.6
Compared with GLM-4.5, this generation brings several key improvements: Longer c...
Z Ai203K131K$0.43
Stream
Z.ai: GLM 4.7
GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: en...
Z Ai203K131K$0.40
Stream
Z.ai: GLM 4.7 Flash
As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances perfo...
Z Ai203K16K$0.06
Stream
Z.ai: GLM 5
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex sys...
Z Ai203K0$0.60
Stream
Z.ai: GLM 5.1
GLM-5.1 delivers a major leap in coding capability, with particularly significan...
Z Ai203K0$0.98
Stream
Anthropic: Claude 3 Haiku
Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant re...
Anthropic200K4K$0.25
Vision Stream
Anthropic: Claude 3.5 Haiku
Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy...
Anthropic200K8K$0.80
Vision Stream
Anthropic: Claude Haiku 4.5
Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering nea...
Anthropic200K64K$1.00
Vision Stream
Anthropic: Claude Opus 4
Claude Opus 4 is benchmarked as the world’s best coding model, at time of releas...
Anthropic200K32K$15.00
Vision Stream
Anthropic: Claude Opus 4.1
Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering im...
Anthropic200K32K$15.00
Vision Stream
Anthropic: Claude Opus 4.5
Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex so...
Anthropic200K64K$5.00
Vision Stream
OpenAI: o1
The latest and strongest model family from OpenAI, o1 is designed to spend more ...
OpenAI200K100K$15.00
Vision Stream
OpenAI: o1-pro
The o1 series of models are trained with reinforcement learning to think before ...
OpenAI200K100K$150.00
Vision Stream
OpenAI: o3
o3 is a well-rounded and powerful model across domains. It sets a new standard f...
OpenAI200K100K$2.00
Vision Stream
OpenAI: o3 Deep Research
o3-deep-research is OpenAI's advanced model for deep research, designed to tackl...
OpenAI200K100K$10.00
Vision Stream
OpenAI: o3 Mini
OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning t...
OpenAI200K100K$1.10
Stream
OpenAI: o3 Mini High
OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoni...
OpenAI200K100K$1.10
Stream
OpenAI: o3 Pro
The o-series of models are trained with reinforcement learning to think before t...
OpenAI200K100K$20.00
Vision Stream
OpenAI: o4 Mini
OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast,...
OpenAI200K100K$1.10
Vision Stream
OpenAI: o4 Mini Deep Research
o4-mini-deep-research is OpenAI's faster, more affordable deep research model—id...
OpenAI200K100K$2.00
Vision Stream
OpenAI: o4 Mini High
OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoni...
OpenAI200K100K$1.10
Vision Stream
Free Models Router
The simplest way to get free inference. openrouter/free is a router that selects...
Openrouter200K0Free
Vision Stream
Perplexity: Sonar Pro
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](h...
Perplexity200K8K$3.00
Vision Stream
Perplexity: Sonar Pro Search
Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is ...
Perplexity200K8K$3.00
Vision Stream
DeepSeek: DeepSeek V3 0324
DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration...
DeepSeek164K16K$0.20
Stream
DeepSeek: DeepSeek V3.1
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) th...
DeepSeek164K33K$0.21
Stream
DeepSeek: DeepSeek V3.1 Terminus
DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v...
DeepSeek164K33K$0.27
Stream
DeepSeek: DeepSeek V3.2 Exp
DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek a...
DeepSeek164K66K$0.27
Stream
DeepSeek: R1
DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-s...
DeepSeek164K16K$0.70
Stream
DeepSeek: R1 0528
May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1) Performance...
DeepSeek164K33K$0.50
Stream
Meta: Llama Guard 4 12B
Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned...
Meta164K16K$0.18
Vision Stream
Qwen: Qwen3 Coder 30B A3B Instruct
Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model...
Qwen160K33K$0.07
Stream
Qwen: Qwen3 14B
Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series...
Qwen132K41K$0.10
Stream
AionLabs: Aion-1.0
Aion-1.0 is a multi-model system designed for high performance across various ta...
Aion Labs131K33K$4.00
Stream
AionLabs: Aion-1.0-Mini
Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 mode...
Aion Labs131K33K$0.70
Stream
AionLabs: Aion-2.0
Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and s...
Aion Labs131K33K$0.80
Stream
Arcee AI: Trinity Mini
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language m...
Arcee Ai131K131K$0.04
Stream
Arcee AI: Virtuoso Large
Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned...
Arcee Ai131K64K$0.75
Stream
Baidu: ERNIE 4.5 VL 424B A47B
ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu...
Baidu131K16K$0.42
Vision Stream
DeepSeek: DeepSeek V3
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instru...
DeepSeek131K16K$0.20
Stream
DeepSeek: DeepSeek V3.2
DeepSeek-V3.2 is a large language model designed to harmonize high computational...
DeepSeek131K64K$0.23
Stream
Google: Gemma 3 12B
Gemma 3 introduces multimodality, supporting vision-language input and text outp...
Google131K16K$0.05
Vision Stream
Google: Gemma 3 27B
Gemma 3 introduces multimodality, supporting vision-language input and text outp...
Google131K16K$0.08
Vision Stream
Google: Gemma 3 4B
Gemma 3 introduces multimodality, supporting vision-language input and text outp...
Google131K16K$0.05
Vision Stream
Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)
Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state...
Google131K66K$0.50
Vision Stream
IBM: Granite 4.1 8B
Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from ...
Ibm Granite131K131K$0.05
Stream
Meta: Llama 3.1 70B Instruct
Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flav...
Meta131K16K$0.40
Stream
Meta: Llama 3.1 8B Instruct
Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flav...
Meta131K16K$0.02
Stream
Meta: Llama 3.2 11B Vision Instruct
Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed ...
Meta131K16K$0.34
Vision Stream
Meta: Llama 3.2 1B Instruct
Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently perf...
Meta131K60K$0.03
Stream
Meta: Llama 3.2 3B Instruct
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimiz...
Meta131K80K$0.05
Stream
Meta: Llama 3.2 3B Instruct (free)
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimiz...
Meta131K0Free
Stream
Meta: Llama 3.3 70B Instruct
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and i...
Meta131K16K$0.10
Stream
Meta: Llama 3.3 70B Instruct (free)
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and i...
Meta131K0Free
Stream
Microsoft: Phi 4 Mini Instruct
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and fi...
Microsoft131K128K$0.08
Stream
Mistral Large 2407
This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407...
Mistral AI131K0$2.00
Stream
Mistral: Ministral 3 3B 2512
The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, effi...
Mistral AI131K0$0.10
Vision Stream
Mistral: Mistral Medium 3
Mistral Medium 3 is a high-performance enterprise-grade language model designed ...
Mistral AI131K0$0.40
Vision Stream
Mistral: Mistral Medium 3.1
Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-pe...
Mistral AI131K0$0.40
Vision Stream
Mistral: Mistral Nemo
A 12B parameter model with a 128k token context length built by Mistral in colla...
Mistral AI131K0$0.02
Stream
MoonshotAI: Kimi K2 0711
Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model develo...
Moonshotai131K33K$0.57
Stream
Nous: Hermes 3 405B Instruct
Hermes 3 is a generalist language model with many improvements over Hermes 2, in...
Nousresearch131K16K$1.00
Stream
Nous: Hermes 3 405B Instruct (free)
Hermes 3 is a generalist language model with many improvements over Hermes 2, in...
Nousresearch131K0Free
Stream
Nous: Hermes 3 70B Instruct
Hermes 3 is a generalist language model with many improvements over [Hermes 2](/...
Nousresearch131K16K$0.70
Stream
Nous: Hermes 4 405B
Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and relea...
Nousresearch131K0$1.00
Stream
Nous: Hermes 4 70B
Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama...
Nousresearch131K0$0.13
Stream
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5
Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/...
NVIDIA131K16K$0.40
Stream
OpenAI: gpt-oss-120b
gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language...
OpenAI131K0$0.04
Stream
OpenAI: gpt-oss-120b (free)
gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language...
OpenAI131K131KFree
Stream
OpenAI: gpt-oss-20b
gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the A...
OpenAI131K0$0.03
Stream
OpenAI: gpt-oss-20b (free)
gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the A...
OpenAI131K8KFree
Stream
OpenAI: gpt-oss-safeguard-20b
gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss...
OpenAI131K66K$0.07
Stream
Prime Intellect: INTELLECT-3
INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-train...
Prime Intellect131K131K$0.20
Stream
Qwen: Qwen2.5 7B Instruct
Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings th...
Qwen131K33K$0.04
Stream
Qwen: Qwen2.5 VL 72B Instruct
Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, f...
Qwen131K128K$0.80
Vision Stream
Qwen: Qwen3 235B A22B
Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by ...
Qwen131K8K$0.45
Stream
Qwen: Qwen3 30B A3B
Qwen3, the latest generation in the Qwen large language model series, features b...
Qwen131K16K$0.12
Stream
Qwen: Qwen3 30B A3B Instruct 2507
Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language mod...
Qwen131K32K$0.05
Stream
Qwen: Qwen3 30B A3B Thinking 2507
Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning mode...
Qwen131K131K$0.08
Stream
Qwen: Qwen3 32B
Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series...
Qwen131K16K$0.08
Stream
Qwen: Qwen3 8B
Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, ...
Qwen131K8K$0.05
Stream
Qwen: Qwen3 VL 235B A22B Thinking
Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text gener...
Qwen131K33K$0.26
Vision Stream
Qwen: Qwen3 VL 30B A3B Thinking
Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generat...
Qwen131K33K$0.13
Vision Stream
Qwen2.5 72B Instruct
Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings t...
Qwen131K16K$0.36
Stream
Sao10K: Llama 3.1 Euryale 70B v2.2
Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](http...
Sao10k131K16K$0.85
Stream
Sao10K: Llama 3.3 Euryale 70B
Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://k...
Sao10k131K16K$0.65
Stream
Switchpoint Router
Switchpoint AI's router instantly analyzes your request and directs it to the op...
Switchpoint131K0$0.85
Stream
Tencent: Hunyuan A13B Instruct
Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model d...
Tencent131K131K$0.14
Stream
TheDrummer: Cydonia 24B V4.1
Uncensored and creative writing model based on Mistral Small 3.2 24B with good r...
Thedrummer131K131K$0.30
Stream
Z.ai: GLM 4.5
GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based a...
Z Ai131K98K$0.60
Stream
Z.ai: GLM 4.5 Air
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also...
Z Ai131K131K$0.13
Stream
Z.ai: GLM 4.6V
GLM-4.6V is a large multimodal model designed for high-fidelity visual understan...
Z Ai131K33K$0.30
Vision Stream
IBM: Granite 4.0 Micro
Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These...
Ibm Granite131K131K$0.02
Stream
Amazon: Nova Micro 1.0
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency resp...
Amazon128K5K$0.04
Stream
ByteDance: UI-TARS 7B
UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based enviro...
Bytedance128K2K$0.10
Vision Stream
Cohere: Command R (08-2024)
command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with...
Cohere128K4K$0.15
Stream
Cohere: Command R+ (08-2024)
command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r...
Cohere128K4K$2.50
Stream
Cohere: Command R7B (12-2024)
Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered...
Cohere128K4K$0.04
Stream
Deep Cogito: Cogito v2.1 671B
Cogito v2.1 671B MoE represents one of the strongest open models globally, match...
Deepcogito128K0$1.25
Stream
DeepSeek: R1 Distill Llama 70B
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llam...
DeepSeek128K8K$0.80
Stream
DeepSeek: R1 Distill Qwen 32B
DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen ...
DeepSeek128K33K$0.29
Stream
Inception: Mercury 2
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion ...
Inception128K50K$0.25
Stream
LiquidAI: LFM2-24B-A2B
LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures des...
Liquid128K0$0.03
Stream
Mistral Large
This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-240...
Mistral AI128K0$2.00
Stream
Mistral: Mistral Small 3.1 24B
Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501),...
Mistral AI128K128K$0.35
Vision Stream
Mistral: Mistral Small 3.2 24B
Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistr...
Mistral AI128K16K$0.07
Vision Stream
NVIDIA: Nemotron 3.5 Content Safety (free)
NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrai...
NVIDIA128K8KFree
Vision Stream
NVIDIA: Nemotron Nano 12B 2 VL (free)
NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning mo...
NVIDIA128K128KFree
Vision Stream
NVIDIA: Nemotron Nano 9B V2 (free)
NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch ...
NVIDIA128K0Free
Stream
OpenAI: GPT Audio
The gpt-audio model is OpenAI's first generally available audio model. The new s...
OpenAI128K16K$2.50
Stream
OpenAI: GPT Audio Mini
A cost-efficient version of GPT Audio. The new snapshot features an upgraded dec...
OpenAI128K16K$0.60
Stream
OpenAI: GPT-4 Turbo
The latest GPT-4 Turbo model with vision capabilities. Vision requests can now u...
OpenAI128K4K$10.00
Vision Stream
OpenAI: GPT-4 Turbo Preview
The preview GPT-4 model with improved instruction following, JSON mode, reproduc...
OpenAI128K4K$10.00
Stream
OpenAI: GPT-4o
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and im...
OpenAI128K16K$2.50
Vision Stream
OpenAI: GPT-4o (2024-05-13)
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and im...
OpenAI128K4K$5.00
Vision Stream
OpenAI: GPT-4o (2024-08-06)
The 2024-08-06 version of GPT-4o offers improved performance in structured outpu...
OpenAI128K16K$2.50
Vision Stream
OpenAI: GPT-4o (2024-11-20)
The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability wi...
OpenAI128K16K$2.50
Vision Stream
OpenAI: GPT-4o Search Preview
GPT-4o Search Preview is a specialized model for web search in Chat Completions. ...
OpenAI128K16K$2.50
Stream
OpenAI: GPT-4o-mini
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), ...
OpenAI128K16K$0.15
Vision Stream
OpenAI: GPT-4o-mini (2024-07-18)
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), ...
OpenAI128K16K$0.15
Vision Stream
OpenAI: GPT-4o-mini Search Preview
GPT-4o mini Search Preview is a specialized model for web search in Chat Complet...
OpenAI128K16K$0.15
Stream
OpenAI: GPT-5 Chat
GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conv...
OpenAI128K16K$1.25
Vision Stream
OpenAI: GPT-5.1 Chat
GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, opt...
OpenAI128K32K$1.25
Vision Stream
OpenAI: GPT-5.2 Chat
GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, op...
OpenAI128K16K$1.75
Vision Stream
OpenAI: GPT-5.3 Chat
GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conve...
OpenAI128K16K$1.75
Vision Stream
Body Builder (beta)
Transform your natural language requests into structured OpenRouter API request ...
Openrouter128K0Variable
Stream
OpenRouter: Fusion
Fusion turns your prompt into a small multi-model deliberation. A panel of exper...
Openrouter128K0Variable
Stream
Perplexity: Sonar Deep Research
Sonar Deep Research is a research-focused model designed for multi-step retrieva...
Perplexity128K0$2.00
Stream
Perplexity: Sonar Reasoning Pro
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](h...
Perplexity128K0$2.00
Vision Stream
Qwen2.5 Coder 32B Instruct
Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (...
Qwen128K33K$0.66
Stream
Upstage: Solar Pro 3
Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With ...
Upstage128K0$0.15
Stream
Perplexity: Sonar
Sonar is lightweight, affordable, fast, and simple to use — now featuring citati...
Perplexity127K0$1.00
Vision Stream
Morph: Morph V3 Fast
Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy...
Morph82K38K$0.80
Stream
AllenAI: Olmo 3 32B Think
Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for ...
Allenai66K66K$0.15
Stream
Google: Nano Banana Pro (Gemini 3 Pro Image Preview)
Nano Banana Pro is Google’s most advanced image-generation and editing model, bu...
Google66K33K$2.00
Vision Stream
WizardLM-2 8x22B
WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates h...
Microsoft66K8K$0.62
Stream
MiniMax: MiniMax M2-her
MiniMax M2-her is a dialogue-first large language model built for immersive role...
Minimax66K2K$0.30
Stream
Mistral: Mixtral 8x22B Instruct
Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistra...
Mistral AI66K0$2.00
Stream
Reka Flash 3
Reka Flash 3 is a general-purpose, instruction-tuned large language model with 2...
Rekaai66K66K$0.10
Stream
Z.ai: GLM 4.5V
GLM-4.5V is a vision-language foundation model for multimodal agent applications...
Z Ai66K16K$0.60
Vision Stream
AionLabs: Aion-RP 1.0 (8B)
Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of th...
Aion Labs33K33K$0.80
Stream
Magnum v4 72B
This is a series of models designed to replicate the prose quality of the Claude...
Anthracite Org33K2K$3.00
Stream
Arcee AI: Coder Large
Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been fur...
Arcee Ai33K0$0.50
Stream
Venice: Uncensored (free)
Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of ...
Cognitivecomputations33K0Free
Stream
EssentialAI: Rnj 1 Instruct
Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential...
Essentialai33K0$0.15
Stream
Google: Gemma 3n 4B
Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource ...
Google33K0$0.06
Stream
Google: Nano Banana (Gemini 2.5 Flash Image)
Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is ...
Google33K33K$0.30
Vision Stream
LiquidAI: LFM2.5-1.2B-Instruct (free)
LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model buil...
Liquid33K0Free
Stream
LiquidAI: LFM2.5-1.2B-Thinking (free)
LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agen...
Liquid33K0Free
Stream
Mistral: Mistral Small 3
Mistral Small 3 is a 24B-parameter language model optimized for low-latency perf...
Mistral AI33K16K$0.05
Stream
Mistral: Saba
Mistral Saba is a 24B-parameter language model specifically designed for the Mid...
Mistral AI33K0$0.20
Stream
Perceptron: Perceptron Mk1
Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model ...
Perceptron33K8K$0.15
Vision Stream
TheDrummer: Rocinante 12B
Rocinante 12B is designed for engaging storytelling and rich prose. Early tester...
Thedrummer33K33K$0.17
Stream
TheDrummer: Skyfall 36B V2
Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine...
Thedrummer33K33K$0.55
Stream
TheDrummer: UnslopNemo 12B
UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed f...
Thedrummer33K33K$0.40
Stream
Mistral: Voxtral Small 24B 2507
Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-a...
Mistral AI32K0$0.10
Stream
OpenAI: GPT-3.5 Turbo
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural ...
OpenAI16K4K$0.50
Stream
OpenAI: GPT-3.5 Turbo 16k
This model offers four times the context length of gpt-3.5-turbo, allowing it to...
OpenAI16K4K$3.00
Stream
Microsoft: Phi 4
[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex re...
Microsoft16K16K$0.07
Stream
Reka Edge
Reka Edge is an extremely efficient 7B multimodal vision-language model that acc...
Rekaai16K16K$0.10
Vision Stream
Sao10K: Llama 3.1 70B Hanami x1
This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-...
Sao10k16K0$3.00
Stream
Google: Gemma 2 27B
Gemma 2 27B by Google is an open model built from the same research and technolo...
Google8K2K$0.65
Stream
Meta: Llama 3 70B Instruct
Meta's latest class of model (Llama 3) launched with a variety of sizes & flavor...
Meta8K8K$0.51
Stream
Meta: Llama 3 8B Instruct
Meta's latest class of model (Llama 3) launched with a variety of sizes & flavor...
Meta8K0$0.14
Stream
Sao10K: Llama 3 8B Lunaris
Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It'...
Sao10k8K16K$0.04
Stream
OpenAI: GPT-4
OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capabl...
OpenAI8K4K$30.00
Stream
Inflection: Inflection 3 Pi
Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backs...
Inflection8K1K$2.50
Stream
Inflection: Inflection 3 Productivity
Inflection 3 Productivity is optimized for following instructions. It is better ...
Inflection8K1K$2.50
Stream
Mancer: Weaver (alpha)
An attempt to recreate Claude-style verbosity, but don't expect the same level o...
Mancer8K2K$0.75
Stream
ReMM SLERP 13B
A recreation trial of the original MythoMax-L2-B13 but with updated models. #mer...
Undi95 6K 4K $0.45
Stream
MythoMax 13B
One of the highest performing and most popular fine-tunes of Llama 2 13B, with r...
Gryphe4K4K$0.06
Stream
OpenAI: GPT-3.5 Turbo (older v0613)
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural ...
OpenAI4K4K$1.00
Stream
OpenAI: GPT-3.5 Turbo Instruct
This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omi...
OpenAI4K4K$1.50
Stream

Understanding LLM context windows

What counts toward the limit?

The context window covers everything the model handles in a single request: system instructions, conversation history, retrieved documents, and the tokens reserved for the model's response. If your input plus expected output exceeds the limit, you need either a model with a larger window or a chunking strategy.
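
The check described above is simple arithmetic, and a minimal sketch may make it concrete. The function names, the 4-characters-per-token heuristic, and the example numbers are assumptions for illustration; they are not taken from this page's fit calculator.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The 4-characters-per-token ratio is an assumption;
    use the model's own tokenizer when an exact count matters."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(input_tokens: int, reserved_output_tokens: int,
                    context_window: int) -> bool:
    """True when the input plus the space reserved for the response fits."""
    return input_tokens + reserved_output_tokens <= context_window


# Hypothetical request: 100,000 input tokens, 16,000 reserved for the response,
# checked against a window of 131,000 tokens (the table's rounded "131K").
print(fits_in_context(100_000, 16_000, 131_000))  # True: 116,000 <= 131,000
print(fits_in_context(120_000, 16_000, 131_000))  # False: 136,000 > 131,000
```

Counting the reserved output alongside the input is the point of the sketch: a document that fits on its own can still overflow once room for the answer is set aside.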

Common context window tiers

4K–8K: simple prompts and short chats.
32K–128K: long articles, code files, and medium conversations.
1M+: books, video transcripts, and large knowledge bases.

Context vs. cost trade-off

Models with very long context windows often charge more per token and can respond more slowly. For many applications, splitting documents into smaller chunks and retrieving only the relevant ones is cheaper, and often more accurate, than sending everything to a very-long-context model.
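
As a rough illustration of the chunking half of that approach, the sketch below splits text into overlapping word-based chunks. The function name, the default sizes, and the use of words rather than tokens are assumptions; a production pipeline would more likely chunk by tokens or by document structure and add a retrieval step on top.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to chunk_size words, where each chunk
    repeats the last `overlap` words of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text, so the tail
        # is not emitted again as a smaller duplicate chunk.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
# ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of sending a few words twice.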

Filter by capability

Use the capability badges to find models that support vision, function calling, JSON mode, or streaming. Combine filters to narrow the list to the models that fit your production workload.
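
Combining filters amounts to keeping the models that meet every condition at once. The sketch below shows that logic on four rows copied from the table; the `Model` class, the `filter_models` function, the lowercase capability labels, and the token counts (the table's rounded values, such as 128K taken as 128,000) are assumptions for illustration, not this page's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int          # rounded value as displayed in the table
    capabilities: frozenset      # lowercase badge labels, e.g. "vision"


# A few rows from the table above, entered by hand.
MODELS = [
    Model("Mistral: Mistral Nemo", 131_000, frozenset({"stream"})),
    Model("OpenAI: GPT-4o", 128_000, frozenset({"vision", "stream"})),
    Model("Qwen: Qwen2.5 VL 72B Instruct", 131_000, frozenset({"vision", "stream"})),
    Model("OpenAI: GPT-4", 8_000, frozenset({"stream"})),
]


def filter_models(models, required=(), min_context=0):
    """Keep models that have every required capability and at least
    min_context tokens of context."""
    required = frozenset(required)
    return [
        m for m in models
        if m.context_tokens >= min_context and required <= m.capabilities
    ]


for model in filter_models(MODELS, required={"vision"}, min_context=100_000):
    print(model.name)
# OpenAI: GPT-4o
# Qwen: Qwen2.5 VL 72B Instruct
```

Each added requirement can only shrink the result, so it is usually easiest to start with the hard constraints (minimum context, required capabilities) and compare price among what remains.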