Every AI model, in one place.

Pricing, benchmarks, provider latency, and how teams actually use each one.

349 matches

qwen
Qwen: Qwen3 8B
Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math,...
Language · 41k ctx
$0.05/M
qwen
Qwen: Qwen3 14B
Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...
Language · 41k ctx
$0.10/M
qwen
Qwen: Qwen3 32B
Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...
Language · 41k ctx
$0.08/M
qwen
Qwen: Qwen3 235B A22B
Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex reasoning, math, and...
Language · 131k ctx
$0.46/M
openai
OpenAI: o4 Mini High
OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining...
Language · 200k ctx
$1.10/M
openai
OpenAI: o3
o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following....
Language · 200k ctx
$2.00/M
openai
OpenAI: o4 Mini
OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning...
Language · 200k ctx
$1.10/M
openai
OpenAI: GPT-4.1
GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and...
Language · 1048k ctx
$2.00/M
openai
OpenAI: GPT-4.1 Mini
GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard...
Language · 1048k ctx
$0.40/M
openai
OpenAI: GPT-4.1 Nano
For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million...
Language · 1048k ctx
$0.10/M
meta-llama
Meta: Llama 4 Maverick
Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...
Language · 1049k ctx
$0.15/M
meta-llama
Meta: Llama 4 Scout
Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...
Language · 328k ctx
$0.10/M
deepseek
DeepSeek: DeepSeek V3 0324
DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the [DeepSeek V3](/deepseek/deepseek-chat-v3) model and performs really well...
Language · 164k ctx
$0.20/M
openai
OpenAI: o1-pro
The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide...
Language · 200k ctx
$150.00/M
mistralai
Mistral: Mistral Small 3.1 24B
Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal capabilities. It provides state-of-the-art performance in text-based reasoning and...
Language · 128k ctx
$0.35/M
google
Google: Gemma 3 4B
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...
Language · 131k ctx
$0.05/M
google
Google: Gemma 3 12B
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...
Language · 131k ctx
$0.05/M
cohere
Cohere: Command A
Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. Compared to other leading proprietary...
Language · 256k ctx
$2.50/M
openai
OpenAI: GPT-4o-mini Search Preview
GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.
Language · 128k ctx
$0.15/M
openai
OpenAI: GPT-4o Search Preview
GPT-4o Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.
Language · 128k ctx
$2.50/M
rekaai
Reka Flash 3
Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a...
Language · 66k ctx
$0.10/M
google
Google: Gemma 3 27B
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...
Language · 131k ctx
$0.08/M
thedrummer
TheDrummer: Skyfall 36B V2
Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.
Language · 33k ctx
$0.55/M
perplexity
Perplexity: Sonar Reasoning Pro
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro). Sonar Reasoning Pro is a premier reasoning model powered by DeepSeek R1 with Chain of Thought (CoT). Designed for...
Language · 128k ctx
$2.00/M
perplexity
Perplexity: Sonar Pro
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro). For enterprises seeking more advanced capabilities, the Sonar Pro API can handle in-depth, multi-step queries with added extensibility, like...
Language · 200k ctx
$3.00/M
perplexity
Perplexity: Sonar Deep Research
Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It autonomously searches, reads, and evaluates sources, refining its approach as it gathers...
Language · 128k ctx
$2.00/M
mistralai
Mistral: Saba
Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses while maintaining efficient performance. Trained on curated regional...
Language · 33k ctx
$0.20/M
openai
OpenAI: o3 Mini High
OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high. o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and...
Language · 200k ctx
$1.10/M
aion-labs
AionLabs: Aion-1.0
Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It is built on DeepSeek-R1, augmented with additional models and techniques such as Tree...
Language · 131k ctx
$4.00/M
aion-labs
AionLabs: Aion-1.0-Mini
Aion-1.0-Mini is a 32B parameter model, a distilled version of the DeepSeek-R1 model, designed for strong performance in reasoning domains such as mathematics, coding, and logic. It is a modified variant...
Language · 131k ctx
$0.70/M
aion-labs
AionLabs: Aion-RP 1.0 (8B)
Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBench-Auto benchmark, a roleplaying-specific variant of Arena-Hard-Auto, where LLMs evaluate each other’s responses. It is a fine-tuned base model...
Language · 33k ctx
$0.80/M
qwen
Qwen: Qwen2.5 VL 72B Instruct
Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.
Language · 128k ctx
$0.80/M
qwen
Qwen: Qwen-Plus
Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced performance, speed, and cost combination.
Language · 1000k ctx
$0.26/M
openai
OpenAI: o3 Mini
OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. This model supports the `reasoning_effort` parameter, which can be set to...
Language · 200k ctx
$1.10/M
mistralai
Mistral: Mistral Small 3
Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed...
Language · 33k ctx
$0.05/M
perplexity
Perplexity: Sonar
Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources. It is designed for companies seeking to integrate lightweight question-and-answer features...
Language · 127k ctx
$1.00/M
deepseek
DeepSeek: R1 Distill Llama 70B
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across...
Language · 8k ctx
$0.80/M
deepseek
DeepSeek: R1
DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass....
Language · 64k ctx
$0.70/M
minimax
MiniMax: MiniMax-01
MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456 billion parameters, with 45.9 billion parameters activated per inference, and can handle a context...
Language · 1000k ctx
$0.20/M
microsoft
Microsoft: Phi 4
[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion...
Language · 16k ctx
$0.07/M
sao10k
Sao10K: Llama 3.1 70B Hanami x1
This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-70b).
Language · 16k ctx
$3.00/M
deepseek
DeepSeek: DeepSeek V3
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported evaluations...
Language · 128k ctx
$0.20/M
sao10k
Sao10K: Llama 3.3 Euryale 70B
Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.2](/models/sao10k/l3-euryale-70b).
Language · 131k ctx
$0.65/M
openai
OpenAI: o1
The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason...
Language · 200k ctx
$15.00/M
cohere
Cohere: Command R7B (12-2024)
Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, tool use, agents, and similar tasks requiring complex reasoning...
Language · 128k ctx
$0.04/M
meta-llama
Meta: Llama 3.3 70B Instruct
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model...
Language · 131k ctx
$0.10/M
meta-llama
Meta: Llama 3.3 70B Instruct (free)
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model...
Language · 66k ctx
$0.00/M
amazon
Amazon: Nova Lite 1.0
Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focuses on fast processing of image, video, and text inputs to generate text output. Amazon Nova Lite...
Language · 300k ctx
$0.06/M