Every AI model, in one place.
Pricing, benchmarks, provider latency, and how teams actually use each one.
- amazonAmazon: Nova Micro 1.0
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length...
Language128k ctx$0.04/M - amazonAmazon: Nova Pro 1.0
Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide range of tasks. As of December...
Language300k ctx$0.80/M - openaiOpenAI: GPT-4o (2024-11-20)
The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improve relevance & readability. It’s also better at working with uploaded...
Language128k ctx$2.50/M - mistralaiMistral Large 2407
This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....
Language131k ctx$2.00/M - qwenQwen2.5 Coder 32B Instruct
Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: - Significantly improvements in **code generation**, **code reasoning**...
Language33k ctx$0.66/M - TthedrummerTheDrummer: UnslopNemo 12B
UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.
Language33k ctx$0.40/M - anthropicAnthropic: Claude 3.5 Haiku
Claude 3.5 Haiku features offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic...
Language200k ctx$0.80/M - AOanthracite-orgMagnum v4 72B
This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet(https://openrouter.ai/anthropic/claude-3.5-sonnet) and Opus(https://openrouter.ai/anthropic/claude-3-opus). The model is fine-tuned on top of [Qwen2.5 72B](https://openrouter.ai/qwen/qwen-2.5-72b-instruct).
Language16k ctx$3.00/M - qwenQwen: Qwen2.5 7B Instruct
Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...
Language33k ctx$0.04/M - inflectionInflection: Inflection 3 Pi
Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, emotional intelligence, productivity, and safety. It has access to recent news, and excels in scenarios like customer support and roleplay. Pi...
Language8k ctx$2.50/M - inflectionInflection: Inflection 3 Productivity
Inflection 3 Productivity is optimized for following instructions. It is better for tasks requiring JSON output or precise adherence to provided guidelines. It has access to recent news. For emotional...
Language8k ctx$2.50/M - TthedrummerTheDrummer: Rocinante 12B
Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary with unique and expressive word choices - Enhanced creativity for vivid narratives -...
Language33k ctx$0.17/M - meta-llamaMeta: Llama 3.2 11B Vision Instruct
Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and...
Language131k ctx$0.34/M - meta-llamaMeta: Llama 3.2 1B Instruct
Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows it to operate...
Language60k ctx$0.03/M - meta-llamaMeta: Llama 3.2 3B Instruct
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...
Language80k ctx$0.05/M - meta-llamaMeta: Llama 3.2 3B Instruct (free)
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...
Language131k ctx$0.00/M - qwenQwen2.5 72B Instruct
Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...
Language33k ctx$0.36/M - cohereCohere: Command R (08-2024)
command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved performance for multilingual retrieval-augmented generation (RAG) and tool use. More broadly, it is better at math, code and reasoning and...
Language128k ctx$0.15/M - cohereCohere: Command R+ (08-2024)
command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) with roughly 50% higher throughput and 25% lower latencies as compared to the previous Command R+ version, while keeping the hardware footprint...
Language128k ctx$2.50/M - Ssao10kSao10K: Llama 3.1 Euryale 70B v2.2
Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.1](/models/sao10k/l3-euryale-70b).
Language131k ctx$0.85/M - NnousresearchNous: Hermes 3 70B Instruct
Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...
Language131k ctx$0.70/M - NnousresearchNous: Hermes 3 405B Instruct
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...
Language131k ctx$1.00/M - NnousresearchNous: Hermes 3 405B Instruct (free)
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...
Language131k ctx$0.00/M - Ssao10kSao10K: Llama 3 8B Lunaris
Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, designed to balance creativity with improved logic and general knowledge....
Language8k ctx$0.04/M - openaiOpenAI: GPT-4o (2024-08-06)
The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the ability to supply a JSON schema in the respone_format. Read more [here](https://openai.com/index/introducing-structured-outputs-in-the-api/). GPT-4o ("o" for "omni") is...
Language128k ctx$2.50/M - meta-llamaMeta: Llama 3.1 70B Instruct
Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high quality dialogue usecases. It has demonstrated strong...
Language131k ctx$0.40/M - meta-llamaMeta: Llama 3.1 8B Instruct
Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 8B instruct-tuned version is fast and efficient. It has demonstrated strong performance compared to...
Language131k ctx$0.02/M - mistralaiMistral: Mistral Nemo
A 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese,...
Language131k ctx$0.02/M - openaiOpenAI: GPT-4o-mini
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...
Language128k ctx$0.15/M - openaiOpenAI: GPT-4o-mini (2024-07-18)
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...
Language128k ctx$0.15/M - googleGoogle: Gemma 2 27B
Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of...
Language8k ctx$0.65/M - openaiOpenAI: GPT-4o
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as...
Language128k ctx$2.50/M - openaiOpenAI: GPT-4o (2024-05-13)
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as...
Language128k ctx$5.00/M - meta-llamaMeta: Llama 3 70B Instruct
Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 70B instruct-tuned version was optimized for high quality dialogue usecases. It has demonstrated strong...
Language8k ctx$0.51/M - meta-llamaMeta: Llama 3 8B Instruct
Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 8B instruct-tuned version was optimized for high quality dialogue usecases. It has demonstrated strong...
Language8k ctx$0.14/M - mistralaiMistral: Mixtral 8x22B Instruct
Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include: - strong math, coding,...
Language66k ctx$2.00/M - microsoftWizardLM-2 8x22B
WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing state-of-the-art opensource models. It is...
Language66k ctx$0.62/M - openaiOpenAI: GPT-4 Turbo
The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Training data: up to December 2023.
Language128k ctx$10.00/M - anthropicAnthropic: Claude 3 Haiku
Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results [here](https://www.anthropic.com/news/claude-3-haiku) #multimodal
Language200k ctx$0.25/M - mistralaiMistral Large
This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....
Language128k ctx$2.00/M - openaiOpenAI: GPT-3.5 Turbo (older v0613)
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data up to Sep 2021.
Language4k ctx$1.00/M - openaiOpenAI: GPT-4 Turbo Preview
The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023. **Note:** heavily rate limited by OpenAI while...
Language128k ctx$10.00/M - openrouterAuto Router
Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used,...
Image2000k ctxFree tier - openaiOpenAI: GPT-3.5 Turbo Instruct
This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021.
Language4k ctx$1.50/M - openaiOpenAI: GPT-3.5 Turbo 16k
This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up...
Language16k ctx$3.00/M - MmancerMancer: Weaver (alpha)
An attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory. Meant for use in roleplay/narrative situations.
Language8k ctx$0.75/M - Uundi95ReMM SLERP 13B
A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge
Language6k ctx$0.45/M - GgrypheMythoMax 13B
One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge
Language4k ctx$0.06/M