
Reka Flash 3

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a...


Verdict

Reka Flash 3 delivers fast, cost-effective text generation at $0.10/$0.20 per Mtok with a 65K context window. It targets the sweet spot between speed and capability for high-volume applications where GPT-4-class reasoning isn't required. Best for teams running large batches of structured tasks—data extraction, classification, content moderation—where throughput and cost matter more than frontier performance. Without public benchmarks, expect to validate fit with your own evals before committing production traffic.

Best for

  • High-volume data extraction and classification
  • Cost-sensitive content moderation pipelines
  • Batch processing of structured documents
  • Rapid prototyping with 65K context
  • Internal tools where speed trumps complexity

Strengths

The pricing undercuts most frontier models by 5-10x, making it viable for applications that process millions of tokens daily. The 65K context window handles mid-length documents (contracts, support tickets, research papers) without chunking. Reka positions the model for low latency, but this listing has no latency or throughput measurements yet, so treat sub-second responses for typical queries as an expectation to verify rather than a guarantee. The model's design favors structured outputs and deterministic tasks over open-ended creative work, which aligns well with enterprise automation needs.

Trade-offs

No public benchmarks means you're flying blind on reasoning depth, factual accuracy, and instruction-following compared to Claude, GPT-4o, or Gemini. Expect weaker performance on complex multi-step reasoning, nuanced creative writing, and tasks requiring deep domain knowledge. The 65K context is adequate but falls short of the 200K+ windows now standard in premium models. For customer-facing applications or anything requiring high trust, plan extensive internal testing before deployment.

Specifications

Provider: rekaai
Category: llm
Context length: 65,536 tokens
Max output: 65,536 tokens
Modalities: text
License: proprietary
Released: 2025-03-12

Pricing

Input: $0.10/Mtok
Output: $0.20/Mtok
Model ID: rekaai/reka-flash-3

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend
$2.29
17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
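The calculator figures above are consistent with a simple blended-rate calculation. The following is a minimal sketch, not Switchy's actual formula: the 22 working days per month, roughly 2,000 tokens per message, and 70/30 input/output split are assumptions chosen only because they reproduce the displayed numbers, and `estimate_monthly_spend` is a hypothetical helper name.

```python
INPUT_PRICE_PER_MTOK = 0.10   # USD, from the pricing section
OUTPUT_PRICE_PER_MTOK = 0.20  # USD, from the pricing section


def estimate_monthly_spend(seats, msgs_per_day, working_days=22,
                           tokens_per_msg=2000, input_share=0.70):
    """Return (total_tokens, usd) for one month of team usage.

    working_days, tokens_per_msg and input_share are assumptions,
    not values published on the page.
    """
    total_tokens = seats * msgs_per_day * working_days * tokens_per_msg
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    usd = (input_tokens * INPUT_PRICE_PER_MTOK
           + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000
    return total_tokens, usd


tokens, usd = estimate_monthly_spend(seats=5, msgs_per_day=80)
print(f"{tokens / 1e6:.1f}M tokens / month, ${usd:.2f}")
```

Changing `input_share` shows how sensitive the estimate is to output-heavy workloads, since output tokens cost twice as much as input tokens.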

Providers

Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime
rekaai | 66k | $0.10/Mtok | $0.20/Mtok | - | - | -

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Extract Invoice Line Items

Extract all line items from this invoice into a JSON array. Each item should include description, quantity, unit_price, and total. Return only valid JSON with no additional commentary.

Classify Support Tickets

Classify this support ticket into one of these categories: billing, technical, account, feature_request. Also assign urgency: low, medium, high. Return as JSON with 'category' and 'urgency' keys.

Summarize Meeting Transcripts

Summarize this meeting transcript into three sections: key decisions, action items with owners, and open questions. Keep each section to 3-5 bullet points maximum.

Generate Product Descriptions

Write a 50-word product description for this item. Focus on key features and benefits. Use an enthusiastic but professional tone. Do not include pricing or availability.

Validate Data Entry Forms

Review this form submission for errors. Check that email format is valid, phone number has 10 digits, and all required fields are filled. Return a JSON object with 'valid' (boolean) and 'errors' (array of issues).
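Because the checks in this prompt are deterministic, a rule-based validator makes a convenient baseline for scoring the model's answers in your own evals. The sketch below is one possible baseline, assuming string-valued fields; the field names (`name`, `email`, `phone`), the loose email pattern, and the function name `validate_submission` are all hypothetical.

```python
import re

# Loose format check only: something@something.tld with no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_submission(form, required=("name", "email", "phone")):
    """Return {'valid': bool, 'errors': [str, ...]} for a form dict."""
    errors = []
    for field in required:
        value = form.get(field)
        if value is None or not str(value).strip():
            errors.append(f"missing required field: {field}")

    email = form.get("email") or ""
    if email and not EMAIL_RE.match(email):
        errors.append("invalid email format")

    phone = form.get("phone") or ""
    digits = re.sub(r"\D", "", phone)  # ignore spaces, dashes, parentheses
    if phone and len(digits) != 10:
        errors.append("phone number must have 10 digits")

    return {"valid": not errors, "errors": errors}


print(validate_submission(
    {"name": "Ada", "email": "ada@example.com", "phone": "(555) 123-4567"}
))
```

Comparing this output with the model's JSON on the same submissions gives a quick agreement rate without manual review.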

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

Review this Python function for potential bugs. It's supposed to merge two sorted lists but users report incorrect output when one list is empty.

Output

The function has a boundary condition bug. When either input list is empty, the while loop never executes, but the function doesn't append the remaining elements from the non-empty list. Add these lines after the main loop:

```python
result.extend(list1[i:])
result.extend(list2[j:])
```

This ensures any remaining elements get added. The bug manifests because the loop condition `i < len(list1) and j < len(list2)` exits early when one list is exhausted, leaving valid elements unprocessed.

Notes

Flash 3 identifies the root cause quickly and provides a concrete fix. With no public benchmarks yet, this example shows competent debugging on a classic algorithm problem. The 65K context window isn't tested here, but the response structure is clear and actionable for typical code review tasks.
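For context, here is how the suggested fix could look inside a complete function. The original function is not shown in the prompt, so this is a hypothetical reconstruction; only the two `extend` calls and the loop condition come from the example output.

```python
def merge_sorted(list1, list2):
    """Merge two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    # Runs only while both lists still have unread elements.
    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1
    # The fix: append whatever remains once one list is exhausted.
    # At most one of these slices is non-empty.
    result.extend(list1[i:])
    result.extend(list2[j:])
    return result


print(merge_sorted([], [1, 2]))
print(merge_sorted([1, 3, 5], [2, 4]))
```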

Prompt

Summarize the key regulatory changes in this 8-page compliance memo, focusing on what our engineering team needs to implement by Q2.

Output

Three implementation requirements by Q2:

1. **Data retention limits**: User activity logs must auto-delete after 90 days (down from 180). Update your logging pipeline's TTL settings.
2. **Consent re-confirmation**: Users who signed up before March 2024 need explicit re-consent for analytics tracking. Add a modal flow to the login sequence.
3. **Export format change**: Data export requests must include metadata timestamps in ISO 8601 format, not Unix epoch.

The memo also mentions audit trail requirements starting Q3, but those don't affect your current sprint.

Notes

This tests Flash 3's ability to extract action items from dense text and translate legal language into engineering tasks. The 65K context window handles multi-page documents comfortably. At $0.10/$0.20 per Mtok, it's cost-effective for document processing, though response specificity depends on how well the actual memo structures information.

Prompt

I'm writing a fantasy novel. Generate three alternative opening lines that establish a sense of ancient mystery without using the words 'darkness,' 'shadow,' or 'forgotten.'

Output

1. "The library's oldest map showed six continents, but Sera's grandfather had always insisted there were seven."
2. "When the archaeologists opened the vault, they found not treasure but a single question carved in a language that predated writing itself."
3. "The tower had stood for three thousand years, and every generation told a different story about why no one had ever reached the top."

Each opening hints at hidden knowledge through concrete details (a discrepancy, an impossibility, an enduring enigma) rather than atmospheric adjectives.

Notes

Flash 3 handles creative constraints well, delivering varied approaches to the same tonal goal. The model avoids cliché and provides meta-commentary on its choices. For creative writing assistance, this shows solid instruction-following, though the prose style is functional rather than lyrical—fine for brainstorming, less so for final draft quality.

Use-case deep-dives

High-volume customer support triage

When Reka Flash 3 makes sense for support ticket routing at scale

A 12-person SaaS company processing 800+ support tickets daily needs fast classification without breaking the budget. Reka Flash 3 fits here: at $0.10/$0.20 per Mtok, 800 tickets a day with 200-token average tickets and 50-token routing responses come to about 6M tokens and under $1 a month (roughly $0.48 for input plus $0.24 for output over 30 days). The 65k context window means you can include your entire product taxonomy and recent ticket examples in every call without multi-turn overhead, but that extra prompt text then dominates the bill: roughly 6,000-10,000 tokens of context per call brings the total to about $15-25/month. The trade-off is the lack of public benchmarks, so run a 100-ticket pilot against your actual data before committing. If accuracy clears 85% on your categories and you're not doing complex sentiment analysis, this is the cheapest reliable option in the Switchy catalog. For teams under 200 tickets/day, the cost difference versus competitors is negligible, so benchmark transparency matters more.
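The pilot described above can be scored in a few lines. This is a minimal sketch, assuming you have collected (predicted, expected) label pairs from a hand-labelled sample; `pilot_report` and the sample data are hypothetical, and the 0.85 default mirrors the 85% bar mentioned above.

```python
from collections import defaultdict


def pilot_report(pairs, threshold=0.85):
    """Score (predicted, expected) label pairs from a pilot run."""
    # expected label -> [correct, total]
    per_category = defaultdict(lambda: [0, 0])
    for predicted, expected in pairs:
        per_category[expected][1] += 1
        if predicted == expected:
            per_category[expected][0] += 1

    correct = sum(c for c, _ in per_category.values())
    total = sum(t for _, t in per_category.values())
    accuracy = correct / total if total else 0.0
    return {
        "accuracy": accuracy,
        "passes": accuracy >= threshold,
        "per_category": {k: c / t for k, (c, t) in per_category.items()},
    }


# Hypothetical pilot: 9 correct billing tickets, 1 misrouted to technical.
sample = [("billing", "billing")] * 9 + [("technical", "billing")]
print(pilot_report(sample))
```

The per-category breakdown matters because a high overall accuracy can hide one category that is routed badly.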

Batch document summarization

Why Reka Flash 3 works for overnight report generation jobs

A 4-person research team needs to summarize 50-80 industry reports weekly, each running 8,000-12,000 words. Reka Flash 3's 65k-token context handles these documents in single calls without chunking, and the $0.10 input rate means processing 50 reports (roughly 600k tokens) costs about $0.06 in a weekend cron job. The output rate of $0.20/Mtok keeps 500-word summaries cheap even at volume. The risk is quality: without MMLU or summarization benchmark data, you have no outside evidence of factual accuracy or coherence. Run a 10-document test set with human review before automating. If summaries need citations or nuanced analysis, wait for models with published benchmark scores. If you're extracting key themes and dates for internal triage, the price-per-document math is hard to beat at this tier.
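The per-batch arithmetic can be checked directly from the listed rates. A minimal sketch: the 600k input tokens and 500-word summaries come from the scenario above, while the 1.3 tokens-per-word ratio is an assumed rule of thumb (the real ratio depends on the tokenizer) and `batch_cost_usd` is a hypothetical helper.

```python
def batch_cost_usd(input_tokens, output_tokens,
                   in_price_per_mtok=0.10, out_price_per_mtok=0.20):
    """Cost in USD of one batch at the listed per-Mtok rates."""
    return (input_tokens * in_price_per_mtok
            + output_tokens * out_price_per_mtok) / 1_000_000


TOKENS_PER_WORD = 1.3  # assumption; varies by tokenizer and text

reports = 50
input_tokens = 600_000                              # from the scenario
output_tokens = int(reports * 500 * TOKENS_PER_WORD)  # 500-word summaries

cost = batch_cost_usd(input_tokens, output_tokens)
print(f"{input_tokens + output_tokens:,} tokens -> ${cost:.4f} per batch")
```

Scaling `reports` to 80 or raising the summary length shows that input tokens, not summaries, drive the total for this workload.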

Prototype chatbot development

When Reka Flash 3 is the right call for early-stage conversational AI

A 3-person startup is building an MVP chatbot for a niche vertical with 20-30 beta users. Reka Flash 3's pricing lets you iterate without cost anxiety: even 500 conversations/day at 10 turns each (roughly 5M tokens monthly) costs between $0.50 and $1.00 at the listed rates, depending on the input/output mix. The 65k context means you can maintain full conversation history without session management complexity in your first build. The downside is the benchmark gap: you won't know how it stacks up on instruction-following or safety until you test it against your actual user queries. If your vertical has specialized jargon or requires high factual precision, the lack of published scores is a red flag. For general conversational flows where you're optimizing UX and cost before scale, this is a defensible choice. Once you hit 2,000+ daily active users, re-benchmark against models with public evals and migrate if quality gaps appear.

Frequently asked

Is Reka Flash 3 good for general text tasks?

Yes, for lightweight tasks where speed matters more than bleeding-edge reasoning. At $0.10/$0.20 per Mtok, it's positioned as a fast, cheap option for summarization, classification, or simple Q&A. Without public benchmarks, you're betting on Reka's internal claims—test it on your actual workload before committing to production.

Is Reka Flash 3 cheaper than GPT-4o mini or Claude Haiku?

Yes. GPT-4o mini runs $0.15/$0.60 per Mtok and Claude Haiku is $0.25/$1.25. Reka Flash 3 undercuts both on input and especially output tokens. If your use case is output-heavy (long summaries, translations), the 3-6x savings on output add up fast. Trade-off: less ecosystem tooling and no public quality benchmarks.
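The comparison can be reproduced for any workload mix. A minimal sketch using only the per-Mtok prices quoted in this answer; the workload of 10 Mtok in and 5 Mtok out is an arbitrary illustration, and `workload_cost` is a hypothetical helper.

```python
# (input, output) prices in USD per million tokens, as quoted above.
PRICES = {
    "Reka Flash 3": (0.10, 0.20),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Haiku": (0.25, 1.25),
}


def workload_cost(model, input_mtok, output_mtok):
    """Cost in USD for a workload measured in millions of tokens."""
    in_price, out_price = PRICES[model]
    return input_mtok * in_price + output_mtok * out_price


baseline_output_price = PRICES["Reka Flash 3"][1]
for model, (_, out_price) in PRICES.items():
    cost = workload_cost(model, input_mtok=10, output_mtok=5)
    ratio = out_price / baseline_output_price
    print(f"{model}: ${cost:.2f} for 10M in / 5M out, "
          f"output price {ratio:.2f}x Reka Flash 3")
```

The gap widens as the share of output tokens grows, which is why output-heavy jobs see the largest savings.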

Can Reka Flash 3 handle 65k token contexts reliably?

The 65,536 token window is standard for mid-tier models, but without published needle-in-haystack or long-context benchmarks, you can't verify retrieval accuracy at the edges. Test with your actual documents—many models degrade past 40-50k tokens despite advertised limits. Use it for sub-40k contexts until you validate performance yourself.

How does Reka Flash 3 compare to Reka Core?

Flash 3 is the speed-and-cost tier; Core is Reka's flagship reasoning model. Flash trades capability for 5-10x lower pricing and faster responses. If you need complex reasoning, multi-step logic, or nuanced writing, pay up for Core. For high-volume simple tasks—moderation, tagging, basic chat—Flash 3 is the right pick.

Should I use Reka Flash 3 for customer-facing chatbots?

Only if you're comfortable with limited fallback options and no public quality proof. The pricing is attractive for high-volume chat, but Reka's ecosystem is smaller—fewer integrations, less community troubleshooting. Start with a pilot on non-critical flows. For mission-critical chat, stick with OpenAI or Anthropic until Reka publishes benchmarks.

Data last verified 7 hours ago. Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.