LLM Model Comparison
16 models · Updated 2025 · Prices per 1M tokens
Prices shown are standard API rates as of May 2025. Batch and cached-input rates may be lower. Open-source models are shown as "Free / Self-host"; their inference costs vary by hosting provider.
About
This tool provides a side-by-side comparison of 16 leading large language models including GPT-4o, GPT-4-Turbo, Claude 3.5 Sonnet, Claude 3 Haiku, Gemini 1.5 Pro, Gemini 1.5 Flash, Llama 3.1 405B, Llama 3.1 70B, Mistral Large 2, DeepSeek-V2, Qwen 2.5 72B, and more. Each model entry includes context window size, input/output pricing per million tokens, supported modalities (text/image/audio/video/code), knowledge cutoff date, open/closed source status, license, key strengths, and release date. Filter by provider, open-source status, or modality. Sort by any column. All data is embedded client-side — no API calls or server needed.
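As a rough illustration of the per-model entry described above, the fields could be modeled as a small record type. This is only a sketch: the class name, field names, and the placeholder values are assumptions made for illustration, not the tool's actual schema or data.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    """One table row. Field names are hypothetical, chosen to mirror the About text."""
    name: str
    provider: str
    context_window: int             # size in tokens
    input_price: Optional[float]    # USD per 1M tokens; None means "Free / Self-host"
    output_price: Optional[float]   # USD per 1M tokens; None means "Free / Self-host"
    modalities: frozenset           # subset of {"text", "image", "audio", "video", "code"}
    knowledge_cutoff: str
    open_source: bool
    license: str
    strengths: tuple
    release_date: str

    def price_label(self) -> str:
        """Return the text shown in the pricing column."""
        if self.input_price is None or self.output_price is None:
            return "Free / Self-host"
        return f"${self.input_price:.2f} / ${self.output_price:.2f}"


# Placeholder entries; none of these values describe a real model.
open_entry = ModelEntry(
    name="Example Open Model",
    provider="Example Provider",
    context_window=128_000,
    input_price=None,
    output_price=None,
    modalities=frozenset({"text", "code"}),
    knowledge_cutoff="(placeholder)",
    open_source=True,
    license="(placeholder)",
    strengths=("(placeholder)",),
    release_date="(placeholder)",
)

# Derive a priced, closed-source variant with illustrative prices.
priced_entry = replace(
    open_entry,
    name="Example Closed Model",
    open_source=False,
    input_price=3.0,
    output_price=15.0,
)

print(open_entry.price_label())
print(priced_entry.price_label())
```

Using `None` for the price fields is one way to represent "Free / Self-host" without pretending the cost is zero.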
How to use
1. Browse the table to compare all 16 models at a glance.
2. Type in the search box to filter by model name, provider, or keyword.
3. Click a column header to sort by that property (click again to reverse).
4. Use the filter chips to show only open-source models or filter by modality.
5. Click any model row to expand its full details including strengths and license.
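The search, sort, and filter steps above could be sketched roughly as follows. The row layout, the function names, and the three placeholder rows are assumptions for illustration; the tool's real client-side code and data may look quite different.

```python
from typing import Any, Optional

Row = dict[str, Any]

# Placeholder rows with made-up values, not the tool's data.
ROWS: list[Row] = [
    {"name": "Model A", "provider": "Provider X", "context_window": 128_000,
     "open_source": True, "modalities": {"text", "code"}, "strengths": ["coding"]},
    {"name": "Model B", "provider": "Provider Y", "context_window": 2_000_000,
     "open_source": False, "modalities": {"text", "image", "video"},
     "strengths": ["long context"]},
    {"name": "Model C", "provider": "Provider X", "context_window": 200_000,
     "open_source": False, "modalities": {"text", "image"}, "strengths": ["reasoning"]},
]


def matches_search(row: Row, query: str) -> bool:
    """Case-insensitive substring match on name, provider, and strengths."""
    q = query.strip().lower()
    if not q:
        return True
    haystack = " ".join([row["name"], row["provider"], *row["strengths"]]).lower()
    return q in haystack


def filter_rows(rows: list[Row], query: str = "", open_source_only: bool = False,
                modality: Optional[str] = None) -> list[Row]:
    """Apply the search box and the filter chips together."""
    return [
        r for r in rows
        if matches_search(r, query)
        and (not open_source_only or r["open_source"])
        and (modality is None or modality in r["modalities"])
    ]


def sort_rows(rows: list[Row], column: str, descending: bool = False) -> list[Row]:
    """Sort by one column; pass descending=True for the second header click."""
    return sorted(rows, key=lambda r: r[column], reverse=descending)


names = [r["name"] for r in sort_rows(filter_rows(ROWS, modality="image"),
                                      "context_window", descending=True)]
print(names)
```

Filtering before sorting keeps the sort working on the smaller list, which matters little for 16 rows but keeps the two concerns separate.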
- Which LLM has the largest context window?
- Among the models listed here, Gemini 1.5 Pro leads as of 2025 with a 2-million-token context window, followed by Claude 3.5 Sonnet at 200k tokens and GPT-4-Turbo at 128k. Llama 3.1 models also support up to 128k tokens.
- What does input vs output pricing mean?
- LLM APIs charge separately for prompt tokens (input) and generated tokens (output). Output typically costs 3-5x as much as input because output tokens are generated one at a time, while the prompt is processed in a single parallel pass. Prices are quoted per 1 million tokens.
- What is the difference between open-source and closed-source LLMs?
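- Open-source LLMs (Llama, Mistral, Qwen, DeepSeek) publish their weights, so you can run them locally or fine-tune them without per-token API fees, though you still pay for your own hardware or hosting. Licenses vary and some restrict commercial use, so check each model's license. Closed-source models (GPT, Claude, Gemini) do not publish weights and are accessed only through paid, hosted APIs.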
- What are modalities in the context of LLMs?
- Modalities are the types of input and output a model supports: text (all models), image input (GPT-4o, GPT-4-Turbo, Claude 3, Gemini), audio (GPT-4o, Gemini), video (Gemini 1.5), and code generation. Multimodal models handle more than one type.
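The pricing and context-window answers above can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical, the prices in the usage lines are illustrative values (a 5x output-to-input ratio) rather than any model's actual rate, and the fit check simplifies by assuming the prompt and the response must share one context window. The window sizes are the ones quoted in the FAQ.

```python
TOKENS_PER_MILLION = 1_000_000

# Context window sizes as quoted in the FAQ answer, in tokens.
CONTEXT_WINDOWS = {
    "Gemini 1.5 Pro": 2_000_000,
    "Claude 3.5 Sonnet": 200_000,
    "GPT-4-Turbo": 128_000,
    "Llama 3.1": 128_000,
}


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD of one request, with prices given in USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / TOKENS_PER_MILLION


def models_that_fit(total_tokens: int) -> list[str]:
    """Models whose window holds total_tokens, largest window first.

    Simplifying assumption: prompt and response share a single window.
    """
    fits = [(name, window) for name, window in CONTEXT_WINDOWS.items()
            if window >= total_tokens]
    return [name for name, _ in sorted(fits, key=lambda pair: pair[1], reverse=True)]


# Illustrative prices only: $3.00 input and $15.00 output per 1M tokens.
cost = request_cost(input_tokens=10_000, output_tokens=2_000,
                    input_price=3.00, output_price=15.00)
print(f"${cost:.2f}")            # 10k input and 2k output tokens cost $0.06 here
print(models_that_fit(150_000))  # only the 2M and 200k windows are large enough
```

In this example the 2,000 output tokens cost as much as the 10,000 input tokens, which is why output-heavy workloads are more sensitive to the output rate than to the input rate.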