LLMsao10k

Sao10K: Llama 3.1 Euryale 70B v2.2

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.1](/models/sao10k/l3-euryale-70b).

Anyone in the Space can @-mention Sao10K: Llama 3.1 Euryale 70B v2.2 with the team's shared context - pooled credits, one chat, one memory.

All models

Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.

Verdict

Llama 3.1 Euryale 70B v2.2 is a community fine-tune optimized for creative writing and roleplay scenarios. It trades benchmark performance for stronger character consistency and narrative flow in multi-turn conversations. At $0.85/Mtok both ways and 131K context, it's economical for long-form fiction generation where you need coherent character voices across dozens of exchanges. Reach for this when creative output quality matters more than factual accuracy or reasoning benchmarks.

Best for

  • Long-form creative fiction generation
  • Multi-turn roleplay with character consistency
  • Narrative-driven chatbot experiences
  • Story brainstorming and plot development

Strengths

This fine-tune prioritizes narrative coherence and character voice over raw benchmark scores. The 131K context window supports extended story arcs without losing thread, and the symmetric $0.85/Mtok pricing makes iterative creative workflows affordable. Community feedback highlights stronger emotional range and less repetitive phrasing than base Llama 3.1 70B in creative contexts. The model maintains distinct character personalities across long conversations without constant re-prompting.

Trade-offs

No public benchmarks means you're flying blind on factual accuracy, math reasoning, and code generation relative to peers. This model sacrifices general-purpose performance for creative specialization — expect weaker results on analytical tasks, structured data extraction, or technical Q&A compared to base Llama 3.1 70B or GPT-4 class models. The proprietary license limits commercial deployment options. If your use case needs verifiable accuracy or strong reasoning, look elsewhere.

Specifications

Provider
sao10k
Category
llm
Context length
131,072 tokens
Max output
16,384 tokens
Modalities
text
License
proprietary
Released
2024-08-28

Pricing

Input
$0.85/Mtok
Output
$0.85/Mtok
Model ID
sao10k/l3.1-euryale-70b

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend
$14.96
17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.

Providers

ProviderContextInputOutputP50 latencyThroughput30d uptime
sao10k131k$0.85/Mtok$0.85/Mtok

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Character Backstory Expansion

I have a character named Elena, a former architect now living off-grid. Expand her backstory: what drove her to leave the city, what skills she brought with her, and one formative childhood memory that explains her self-reliance.
Open in a Space →

Dialogue Scene Drafting

Write a tense dinner conversation between three siblings arguing over selling their late mother's house. Each has different financial needs and emotional attachments. Keep their voices distinct and let subtext drive the conflict.
Open in a Space →

Plot Branch Exploration

In my story, the protagonist just discovered her mentor has been lying about his past. Give me three different ways this revelation could unfold in the next chapter, each with different emotional tones and consequences.
Open in a Space →

Setting Atmosphere Building

Describe an abandoned research station on Europa's ice surface. Focus on sensory details that convey isolation and unease. The protagonist is about to discover something that changes everything — set that tone without revealing the discovery.
Open in a Space →

Character Voice Consistency Check

My character Marcus is a cynical ex-cop who uses dark humor as a defense mechanism. Write his internal monologue as he walks through a crime scene that reminds him why he quit the force. Keep his voice consistent with someone who's seen too much but still cares.
Open in a Space →

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

I'm building a fantasy RPG dialogue system. Write a conversation between a weary innkeeper and a suspicious traveler who's asking too many questions about the local lord's movements.

Output

The innkeeper sets down a clay mug with deliberate slowness. "Lord Vasrick? Keeps to himself these days. Winter's hard on old bones." Her eyes don't leave the traveler's hands. "You're the third stranger this tenday asking after him. Funny, that." She wipes the counter in slow circles. "The roads aren't safe for curious folk. Bandits, they say. Though some bandits ask questions before they rob you." A pause. "Your coin's good here, but your questions aren't. Drink up and move along come morning."

Notes

Euryale excels at creative writing with strong character voice and subtext. The 131k context window supports complex narrative threads across long story sessions. This fine-tune prioritizes literary quality over strict instruction-following, so users seeking technical precision may find responses drift toward narrative embellishment.

Prompt

Analyze this relationship dynamic: Two co-founders have been arguing about product direction for months. One wants to pivot to enterprise, the other believes their consumer traction is about to break through. How should their board navigate this?

Output

This is a classic vision-versus-validation deadlock. The enterprise advocate likely sees predictable revenue and shorter sales cycles; the consumer defender is betting on network effects that haven't materialized yet. The board shouldn't pick a side—they should demand evidence. Set a 90-day experiment: allocate 20% of resources to one enterprise pilot with clear success metrics. If consumer metrics don't inflect upward in that window, the market has spoken. The real risk isn't the wrong direction—it's burning runway while two smart people argue past each other. Force the data to referee, then back the winner completely.

Notes

The model handles nuanced reasoning about interpersonal and strategic problems well, producing analysis that balances multiple perspectives. The response structure is clear without being formulaic. However, with no public benchmarks available, users should verify factual accuracy on specialized domains before relying on outputs for high-stakes decisions.

Prompt

Rewrite this customer support email to be more empathetic but still set boundaries: 'Your refund request is outside our 30-day window. We can't process this. Policy is policy.'

Output

I understand how frustrating this situation must be, and I genuinely wish I could resolve it the way you're hoping. Our 30-day refund window exists so we can manage inventory and process returns while products are still current, but I realize that doesn't make your situation any easier. While I'm not able to approve a refund at this point, I'd like to explore what other options might work for you—perhaps store credit toward a future purchase, or connecting you with our product team if there's a quality issue we should know about. Would either of those help?

Notes

Euryale's fine-tuning shows strength in emotionally intelligent rewriting tasks, maintaining warmth while preserving necessary boundaries. The output demonstrates natural phrasing without corporate jargon. The model's creative writing bias means it may over-elaborate on simpler requests—users wanting terse, technical responses should provide explicit length constraints in their prompts.

Use-case deep-dives

Multi-turn creative writing collaboration

When 131K context lets your writing team iterate without re-prompting

A 4-person content studio building serialized fiction needs a model that holds 40+ pages of story bible, character sheets, and draft chapters in memory while generating new scenes. Llama 3.1 Euryale 70B v2.2 handles this at $0.85/Mtok — cheaper than GPT-4 class models — with a 131K context window that keeps the entire narrative arc loaded across sessions. The trade-off: no public benchmarks mean you're testing creative coherence yourself, and response quality on technical tasks is unproven. If your team writes 200K+ tokens/month and needs long-context creative consistency over coding or analysis, this model delivers the window at a price point that makes daily iteration affordable.

Internal knowledge base Q&A

Budget-friendly long-context retrieval for small team wikis

A 10-person startup maintains 80 pages of onboarding docs, process guides, and product specs in Notion. They need a model that can ingest the full knowledge base and answer new-hire questions without chunking or vector search infrastructure. Euryale's 131K window fits the entire corpus in a single prompt at $0.85/Mtok — roughly half the cost of Claude 3.5 Sonnet for the same context load. The risk: without published benchmarks, you won't know how it performs on factual recall or instruction-following until you test it against your own docs. If you're under 500 queries/month and can afford a week of validation, this model offers long-context access at a price that doesn't require VC funding.

Long-form meeting transcript summarization

When 2-hour call transcripts need single-pass summaries on a tight budget

A 6-person agency records client strategy calls that run 90-120 minutes, generating 25K-35K token transcripts. They need accurate summaries with action items, but can't justify $4/Mtok models for 40 calls/month. Euryale processes the full transcript in one pass at $0.85/Mtok, avoiding the context-splitting errors that plague smaller models. The catch: no MMLU or instruction-following scores mean summarization quality is a black box until you run your own evals. If your team does under 50 calls/month and can spot-check the first 10 summaries for accuracy, this model cuts your transcript processing cost by 60-80% compared to frontier alternatives.

Frequently asked

Is Llama 3.1 Euryale 70B good for creative writing and roleplay?

Yes, this is exactly what it's built for. Euryale is a community fine-tune of Llama 3.1 70B specifically optimized for creative storytelling, character consistency, and roleplay scenarios. It trades some benchmark performance for better narrative coherence and less corporate guardrails. If you're generating fiction or running character-based interactions, this beats base Llama 3.1 70B.

Is $0.85 per Mtok worth it compared to base Llama 3.1 70B?

It's the same price as most hosted Llama 3.1 70B endpoints, but you're paying for the creative fine-tune. If you need the model for coding or factual Q&A, stick with base Llama 3.1 70B. If you're doing creative work where personality and narrative flow matter more than raw accuracy, the tuning justifies the cost. You won't find this cheaper elsewhere for the same quality.

Can it handle 131k token context windows reliably?

The model inherits Llama 3.1's 131k context window, but practical performance degrades past 80-90k tokens like most long-context models. For creative work, you'll rarely need the full window anyway. If you're doing multi-chapter novel generation or massive worldbuilding documents, test carefully past 100k tokens. For typical roleplay sessions under 50k tokens, it's solid.

How does Euryale v2.2 compare to the original Euryale release?

Version 2.2 refines instruction-following and reduces repetition issues from earlier versions. The base model upgrade to Llama 3.1 also brings better multilingual support and reasoning. If you used v1.x and hit loops or inconsistent character voices, v2.2 fixes most of that. The creative tuning is more stable without sacrificing the personality that made Euryale popular.

Should I use this for customer-facing chatbots?

No. Euryale's fine-tuning removes safety guardrails and optimizes for creative freedom, not business use cases. It can generate content that violates typical content policies. For customer support, documentation, or any regulated application, use base Llama 3.1 70B Instruct or a model with commercial safety tuning. Use Euryale only for internal creative tools or personal projects.

Data last verified 8 hours ago.Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.