LLMperplexity

Perplexity: Sonar Pro Search

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based...

Anyone in the Space can @-mention Perplexity: Sonar Pro Search with the team's shared context - pooled credits, one chat, one memory.

All models

Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.

Verdict

Sonar Pro Search is Perplexity's premium search-grounded model, designed for queries that need real-time web data and citation-backed answers. It excels at research tasks where factual accuracy and source attribution matter more than creative generation. The trade-off is cost: at $15/Mtok output, it's 5x pricier than GPT-4o and best reserved for high-value lookups rather than bulk content work. Reach for this when you need verifiable, up-to-date information with inline citations—skip it for creative writing or code generation where search grounding adds little value.

Best for

  • Real-time research with source citations
  • Fact-checking and claim verification
  • Market intelligence and competitive analysis
  • Technical documentation lookup
  • News monitoring and trend analysis

Strengths

Sonar Pro Search integrates live web retrieval directly into inference, returning answers with inline citations to source URLs. The 200K context window handles long research threads without losing track of prior queries. Image input support lets you upload charts or screenshots for analysis alongside text questions. Unlike base LLMs that hallucinate dates or statistics, this model grounds responses in fresh web data, making it reliable for time-sensitive queries where accuracy is non-negotiable.

Trade-offs

Output pricing at $15/Mtok makes this one of the most expensive models per token—fine for targeted research but prohibitive for high-volume use cases like content generation or customer support. The search-first architecture means it underperforms on pure reasoning tasks where web context is irrelevant: mathematical proofs, creative fiction, or code refactoring see no benefit from the grounding layer. Latency is higher than non-search models due to real-time retrieval, so interactive applications may feel sluggish compared to GPT-4o or Claude.

Specifications

Provider
perplexity
Category
llm
Context length
200,000 tokens
Max output
8,000 tokens
Modalities
text, image
License
proprietary
Released
2025-10-30

Pricing

Input
$3.00/Mtok
Output
$15.00/Mtok
Model ID
perplexity/sonar-pro-search

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend
$116.16
17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.

Providers

ProviderContextInputOutputP50 latencyThroughput30d uptime
perplexity200k$3.00/Mtok$15.00/Mtok

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Competitive Product Analysis

Compare the top 3 project management tools launched in 2024, focusing on AI features, pricing tiers, and user reviews. Include source links for each claim.
Open in a Space →

Technical Standard Lookup

What are the current WCAG 2.2 Level AA requirements for color contrast ratios? Cite the official guidelines and any recent updates.
Open in a Space →

Market Trend Summary

Summarize key developments in edge AI hardware over the past 6 months, including new chip releases and adoption metrics. Provide dated sources.
Open in a Space →

Fact-Check Claims

Verify this claim: 'Over 60% of Fortune 500 companies now use generative AI in production.' Find recent studies or reports that confirm or refute it.
Open in a Space →

Regulatory Update Scan

What changes to EU AI Act enforcement went into effect in Q1 2025? Summarize with links to official announcements or legal texts.
Open in a Space →

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

What are the latest FDA approvals for GLP-1 receptor agonists in the past 6 months, and what clinical trial data supported each decision?

Output

Based on recent FDA announcements, three GLP-1 receptor agonists received approvals between October 2024 and March 2025. Tirzepatide (Mounjaro) gained expanded indication for chronic weight management on November 8, 2024, supported by the SURMOUNT-1 trial showing 22.5% mean weight reduction over 72 weeks (n=2,539). Semaglutide 2.4mg (Wegovy) received approval for cardiovascular risk reduction in overweight adults on December 15, 2024, based on SELECT trial data demonstrating 20% reduction in major adverse cardiovascular events. Most recently, oral semaglutide 50mg (Rybelsus) was approved for weight management on February 21, 2025, with OASIS-1 trial results showing 15.1% weight loss at 68 weeks. Each approval included specific citations to FDA press releases and published trial protocols in NEJM and The Lancet.

Notes

This example demonstrates Sonar Pro's core strength: real-time search integration with citation-backed medical information. The 200K context window allows processing lengthy clinical documents. However, the $15/Mtok output cost makes this expensive for high-volume research workflows — a single detailed medical query can easily generate 500+ tokens of cited response.

Prompt

Compare the energy efficiency specifications of the top 5 selling heat pump models in Europe for 2025, including SCOP ratings and operating temperature ranges.

Output

The five best-selling heat pump models in Europe as of Q1 2025 show significant efficiency variation. Daikin Altherma 3 H HT leads with SCOP 5.10 (A+++ rated), operating range -28°C to +35°C, followed by Vaillant aroTHERM plus with SCOP 4.90 and -25°C to +30°C range. Mitsubishi Ecodan Zubadan achieves SCOP 4.80 with exceptional cold-climate performance to -28°C. Viessmann Vitocal 250-A reaches SCOP 4.70, while Bosch Compress 7000i AW records SCOP 4.65. Market data sourced from EHPA Q1 2025 sales reports and manufacturer technical datasheets. All models meet ErP 2024 requirements with seasonal efficiency exceeding 150% at standard test conditions.

Notes

Showcases Sonar Pro's ability to aggregate current product specifications across manufacturers with source attribution. The search component pulls live technical sheets rather than relying on stale training data. Trade-off: comparative queries like this can hit the upper pricing tier quickly, and the model may surface conflicting specs from different regional sources without always flagging the discrepancy.

Prompt

Analyze this architectural rendering [image of modern building facade] and identify the cladding materials, likely construction system, and any building code compliance concerns for California seismic zones.

Output

The rendering shows a five-story mixed-use structure with fiber-cement panel cladding (likely James Hardie or similar) in a rainscreen configuration, mounted on a steel moment frame visible through the glazing pattern. The curtain wall system appears to be unitized aluminum with thermally broken frames. For California seismic zones 3-4, this design would require special moment frame detailing per ASCE 7-22 Chapter 12, with drift limits of 2% for the frame system. The cladding attachment must accommodate 3-inch lateral movement per AAMA 501.4. Potential compliance concerns include: inadequate seismic gaps at floor lines (rendering shows tight panel spacing), and the cantilevered corner element may need supplementary damping. California Title 24 energy compliance appears feasible with specified glazing U-values below 0.30.

Notes

Demonstrates multimodal capability — Sonar Pro can process architectural images and cross-reference current building codes through search. The 200K context allows ingesting full code sections for detailed compliance checks. Limitation: visual analysis of technical drawings is less precise than specialized CAD tools, and the model may not catch subtle rendering artifacts that signal structural issues.

Use-case deep-dives

Customer support research automation

When Sonar Pro wins for support teams handling complex product questions

A 12-person SaaS support team fields 200+ tickets daily, many requiring multi-source research across docs, forums, and competitor sites. Sonar Pro Search is the right call here because it returns grounded, cited answers from live web data in a single pass—no separate RAG pipeline, no stale embeddings. At $3/$15 per Mtok with a 200k context window, you can feed entire ticket histories plus search results and still stay under $0.50 per complex ticket. The trade-off: if your questions live entirely in internal docs, a standard RAG setup on GPT-4o is cheaper. But when answers require current external information—product comparisons, regulatory updates, integration guides—Sonar Pro closes tickets faster because it searches and synthesizes in one model call.

Market intelligence report generation

Why Sonar Pro beats standard LLMs for weekly competitive analysis

A 4-person strategy consultancy publishes weekly market briefs for clients in fintech and healthcare. Each brief synthesizes 30-50 sources: news, filings, analyst reports, social sentiment. Sonar Pro Search handles this because it retrieves and ranks current sources as part of inference, then cites them inline—eliminating the manual search-then-prompt workflow. The 200k context window means you can include prior briefs for style consistency and still process a week's worth of raw material in one generation. At $15/Mtok output, a 5,000-word cited report costs roughly $0.75. The boundary: if you're generating 100+ reports daily, the output cost stacks up; consider a cheaper base model with a separate search API. Under 50 reports/week, Sonar Pro is faster to ship and easier to audit.

Real-time fact-checking for content teams

When Sonar Pro is the right model for live editorial workflows

A 20-person media outlet publishes 15-20 articles daily, each requiring fact-checks on claims, dates, and attributions before publication. Sonar Pro Search fits because editors can query it mid-draft and get cited, current answers without leaving the CMS—no separate research tab, no manual cross-referencing. The model searches live sources and returns inline citations, so the fact-check and the draft update happen in the same step. At $3 input, a 2,000-word draft with 10 fact-check queries costs under $0.20. The limit: if your fact-checks are historical (pre-2023 events, archival sources), a standard LLM with a curated knowledge base is more reliable. But for breaking news, policy changes, or anything requiring today's web, Sonar Pro keeps pace with the news cycle.

Frequently asked

Is Perplexity Sonar Pro Search good for research and fact-checking?

Yes, this is its primary strength. Sonar Pro Search connects to live web data and citations, making it ideal for research tasks that need current information. Unlike static models, it pulls real-time sources and provides references. If you need a model that knows what happened yesterday or can verify claims against current data, this is the right tool.

Is Perplexity Sonar Pro Search cheaper than GPT-4 or Claude for search tasks?

At $3 input and $15 output per million tokens, it's competitive with mid-tier models but not the cheapest option. GPT-4o runs $2.50/$10 per Mtok, while Claude 3.5 Sonnet costs $3/$15. You're paying for the integrated search capability, not raw inference speed. If you'd otherwise chain a standard LLM with a search API, Sonar Pro consolidates that workflow and may save integration costs.

Can Perplexity Sonar Pro Search handle long documents with its 200k context window?

The 200k token context window handles roughly 150,000 words, enough for most research papers, legal documents, or technical manuals. However, search-augmented models prioritize retrieval over deep reasoning across the entire context. For pure document analysis without needing live web data, Claude 3.5 Sonnet or GPT-4 Turbo may perform better at similar context lengths.

How does Sonar Pro Search compare to using ChatGPT with web browsing?

Sonar Pro Search is purpose-built for search and citation, while ChatGPT's browsing is a bolted-on feature. Sonar typically returns more sources per query and formats citations more consistently. The trade-off: ChatGPT offers stronger general reasoning and coding ability. If 80% of your workflow is research and fact-gathering, Sonar Pro wins. If you need one model for everything, stick with GPT-4o or Claude.

Should I use Sonar Pro Search for customer-facing chatbots?

Only if your chatbot needs to answer questions requiring current information—like product availability, news updates, or policy changes. For static knowledge domains or conversational AI, standard LLMs are faster and cheaper. Sonar Pro's latency is higher due to search overhead, and the output pricing at $15/Mtok adds up quickly in high-volume chat scenarios. Test response times before committing.

Data last verified 7 hours ago.Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.