LLMthedrummer

TheDrummer: Cydonia 24B V4.1

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.

Anyone in the Space can @-mention TheDrummer: Cydonia 24B V4.1 with the team's shared context - pooled credits, one chat, one memory.

All models

Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.

Verdict

Cydonia 24B V4.1 is a mid-sized model with a generous 131K context window at budget pricing ($0.30/$0.50 per Mtok). Without public benchmarks, it's harder to gauge performance against peers like Qwen 2.5 32B or Mistral Medium, but the token economics make it worth testing for high-volume tasks where GPT-4 class reasoning isn't required. Best suited for teams that need long-context processing on a tight budget and can validate quality through their own eval runs.

Best for

  • Long-document summarization under budget constraints
  • High-volume chatbot backends with extended context
  • Prototyping multi-turn workflows before scaling up
  • Cost-sensitive code review on large codebases

Strengths

The 131K context window handles full codebases, lengthy transcripts, or multi-document analysis without chunking. At $0.30 input and $0.50 output per million tokens, it undercuts most alternatives in the 20-30B parameter range by 40-60%. The V4.1 iteration suggests active development, and the model size hits a sweet spot for inference speed versus capability on consumer-grade hardware or modest cloud instances.

Trade-offs

No public benchmark data means you're flying blind relative to Qwen 2.5 32B Instruct (which posts strong MMLU and HumanEval scores) or Mistral Medium. Proprietary licensing limits transparency into training data and fine-tuning methods. Early-stage models from smaller vendors can exhibit inconsistent instruction-following or hallucination rates that only surface under production load. Plan to run your own evals before committing to high-stakes use cases.

Specifications

Provider
thedrummer
Category
llm
Context length
131,072 tokens
Max output
131,072 tokens
Modalities
text
License
proprietary
Released
2025-09-27

Pricing

Input
$0.30/Mtok
Output
$0.50/Mtok
Model ID
thedrummer/cydonia-24b-v4.1

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend
$6.34
17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.

Providers

ProviderContextInputOutputP50 latencyThroughput30d uptime
thedrummer131k$0.30/Mtok$0.50/Mtok

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Summarize Long Transcript

Below is a 30,000-word transcript of a product planning meeting. Extract the top 5 decisions made, the rationale for each, and all assigned action items with owners. Format as a bulleted list.
Open in a Space →

Multi-File Code Review

I'm pasting 15 Python files from a Flask API (total ~25K tokens). Review the authentication flow across routes, models, and middleware. Flag any security gaps or inconsistencies in how JWT tokens are validated.
Open in a Space →

Compare Contract Versions

Here are two versions of a SaaS contract (v2.1 and v3.0, each ~15K words). List every substantive change to liability clauses, termination rights, and data handling obligations. Ignore formatting tweaks.
Open in a Space →

Generate Test Cases from Spec

Below is a 20-page product requirements document for a new checkout flow. Generate 15 end-to-end test cases covering happy paths, edge cases, and error states. Include preconditions and expected outcomes for each.
Open in a Space →

Extract Insights from Research Papers

I'm providing three research papers on transformer attention mechanisms (total ~40K tokens). Summarize the core innovation in each, then identify overlapping findings and contradictions across all three.
Open in a Space →

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

Review this Python function for potential bugs. It's supposed to merge two sorted lists but users report incorrect output when one list is empty.

Output

The function has a classic off-by-one error in the final cleanup loops. When `i < len(list1)` is true but `j` has already exhausted `list2`, you're appending `list1[i]` correctly—but you increment `i` inside the loop without checking bounds on the next iteration. More critically, your base case returns `list2` when `list1` is empty, but that creates a reference issue if the caller mutates the result later. Return `list2.copy()` instead, or better yet, handle empty inputs before entering the merge logic. The algorithm itself is sound; these are defensive coding gaps that surface in edge cases.

Notes

This example shows Cydonia's strength in code review with contextual debugging—it identifies both the logical error and a subtle reference-handling issue. The 131K token context window supports pasting entire modules for review. However, without public benchmarks, we can't compare its bug-detection accuracy to models like Claude or GPT-4.

Prompt

Explain the CAP theorem to a junior backend engineer who's choosing between PostgreSQL and Cassandra for a new service.

Output

CAP theorem says you can't simultaneously guarantee Consistency, Availability, and Partition tolerance in a distributed system—you pick two. PostgreSQL is a CP system: it prioritizes consistency and can tolerate partitions, but sacrifices availability during network splits (writes may block until consensus is restored). Cassandra is AP: it stays available and handles partitions gracefully, but you get eventual consistency—reads might return stale data briefly. For your choice: if you need strict transactional guarantees (e.g., financial ledger, inventory counts), PostgreSQL's consistency is non-negotiable. If you need always-on writes across data centers and can tolerate brief staleness (e.g., user activity feeds, metrics), Cassandra's availability wins. The real question is whether your service can architect around eventual consistency.

Notes

Cydonia excels at technical explanations that balance theory with practical decision-making. The response stays grounded in the engineer's actual choice rather than abstract definitions. At $0.50/Mtok output, this kind of teaching-focused use case is cost-effective compared to frontier models, though we lack benchmark data on factual accuracy for systems concepts.

Prompt

Generate a professional email declining a vendor proposal. We're going with a competitor due to better API documentation, but want to keep the relationship warm for future opportunities.

Output

Subject: Re: Proposal for Analytics Integration Hi [Name], Thank you for the detailed proposal and the time your team invested in the demo last week. After evaluating several options, we've decided to move forward with another vendor whose API documentation aligns more closely with our current development workflow. That said, we were impressed by [specific feature from their demo], and I'd like to stay in touch as our needs evolve. Would you be open to a quarterly check-in? I'll also keep an eye on your product updates—particularly any improvements to developer resources. Thanks again for your partnership on this. Best, [Your name]

Notes

This demonstrates Cydonia's ability to handle professional writing with appropriate tone calibration—direct about the decision, specific about the reason, and genuinely warm without false promises. The 131K context window is overkill for emails but useful if you're drafting multiple vendor responses in one session. Pricing is competitive for business writing tasks, though models fine-tuned for enterprise communication may have stronger templates.

Use-case deep-dives

Long-context legal document review

When 131k context makes sense for contract analysis teams

A 4-person legal ops team processing vendor agreements needs to compare clauses across 20-30 page contracts without chunking. Cydonia 24B V4.1 handles the full 131k token window at $0.30/$0.50 per Mtok—roughly $0.40 to process a 100-page contract end-to-end. That's competitive if you're running 200+ contracts monthly and need the model to hold entire documents in memory for cross-reference work. The trade-off: no public benchmarks exist, so you're flying blind on accuracy versus Claude or GPT-4 on legal reasoning tasks. If your workflow tolerates a 2-week eval period and your volume justifies the window size, test it against a baseline on 50 real contracts. Otherwise, stick with a benchmarked model until Cydonia publishes MMLU or legal-specific scores.

High-frequency customer support triage

Why Cydonia falls short for real-time chat routing at scale

A 12-person SaaS support team routing 800 inbound tickets daily needs sub-second classification and can't afford hallucinations on urgency tagging. Cydonia's pricing is attractive—$0.30 input makes it cheaper than most frontier models—but the absence of public benchmarks means you have no signal on instruction-following accuracy or latency under load. For a use-case where a single misrouted P0 ticket costs customer trust, you need published MMLU, HumanEval, or domain-specific scores before committing production traffic. The 131k window is overkill here; most tickets are under 2k tokens. If you're processing fewer than 100 tickets daily and can manually audit outputs for a month, Cydonia might work as a cost experiment. Above that threshold, choose a model with public accuracy data and SLA guarantees.

Batch content moderation for forums

When Cydonia's context window beats chunking on forum threads

A 3-person community team moderating a 50k-member forum needs to flag toxic threads where context spans 40-60 replies. Cydonia's 131k window lets you pass an entire thread without splitting it into overlapping chunks, which reduces false positives from lost context. At $0.30 input, scanning 1,000 threads monthly costs roughly $30 if average threads run 10k tokens. The risk: without benchmarks, you don't know how Cydonia performs on nuanced toxicity detection versus models like Llama 3 or Mistral that publish safety evals. Run a 2-week shadow deployment where Cydonia flags threads in parallel with your current model, then compare precision and recall on 200 real examples. If it matches or beats your baseline, the price and window make it a strong fit for this workload.

Frequently asked

Is TheDrummer Cydonia 24B V4.1 good for general text tasks?

With 24 billion parameters and a 131k token context window, Cydonia handles most text generation, summarization, and analysis tasks competently. The lack of public benchmarks makes it hard to compare directly, but the parameter count puts it in the mid-tier range—capable for everyday work but not competing with frontier models. Best suited for teams needing a cost-effective workhorse rather than bleeding-edge performance.

Is Cydonia 24B cheaper than GPT-4 or Claude?

Yes, significantly. At $0.30 input and $0.50 output per million tokens, Cydonia costs roughly 10-20x less than GPT-4 Turbo or Claude Opus. For high-volume applications where you need decent quality without premium pricing, this model offers strong value. The trade-off is you're working with a smaller model that likely won't match frontier reasoning or nuance.

Can Cydonia 24B handle long documents with its 131k context?

The 131k token window is large enough for most business documents, research papers, or codebases. That's roughly 100,000 words of input. However, without published benchmarks on long-context retrieval accuracy, you'll want to test whether it maintains coherence across the full window for your specific use case. The window size is there; performance across it is unverified.

How does V4.1 compare to earlier Cydonia versions?

Without public benchmarks for V4.1 or its predecessors, we can't quantify improvements. Version increments typically mean fine-tuning adjustments, bug fixes, or training data updates rather than architectural overhauls. If you're already using an earlier Cydonia version, test V4.1 on your actual prompts before migrating—the differences may be subtle.

Should I use Cydonia 24B for production chatbots?

It depends on your quality bar and budget. The pricing makes it attractive for high-volume chat applications where you need acceptable responses without premium costs. The 24B parameter size means it won't hallucinate as much as tiny models but won't reason as deeply as 70B+ alternatives. Run A/B tests against your current solution to see if the cost savings justify any quality drop.

Data last verified 8 hours ago.Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.