LLM · undi95

ReMM SLERP 13B

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge

Anyone in the Space can @-mention ReMM SLERP 13B with the team's shared context - pooled credits, one chat, one memory.

Verdict

ReMM SLERP 13B is a community-built 13B-parameter model that merges multiple fine-tunes using spherical linear interpolation (SLERP). It targets creative writing and roleplay scenarios where personality consistency matters more than raw reasoning power. With a 6,144-token context window and no public benchmarks, you're trading proven performance for experimental creative output. Reach for this when you need character-driven dialogue on a budget and can tolerate occasional logical gaps.

Best for

  • Creative fiction and storytelling
  • Character-consistent roleplay dialogue
  • Budget-conscious creative projects
  • Experimental narrative generation

Strengths

SLERP interpolates between the source models' weights along an arc rather than a straight line, which is intended to preserve distinct personality traits from each source and produce more consistent character voices than plain weight averaging. At $0.45/$0.65 per Mtok (input/output), it undercuts frontier models by 95%+ while maintaining coherent narrative flow across multi-turn conversations. The 13B parameter count keeps inference fast enough for real-time chat applications.
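
As a rough illustration of the merge technique named above, the sketch below applies spherical linear interpolation to two toy vectors in plain Python. Real merges work on full model weight tensors, usually layer by layer. The function names, the toy vectors, and the `eps` fallback threshold are assumptions made for this example, not the recipe used to build this model.

```python
import math

def vector_norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between non-zero vectors v0 and v1 at fraction t in [0, 1]."""
    # Cosine of the angle between the vectors, clamped for numerical safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (vector_norm(v0) * vector_norm(v1))
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel (or opposite) vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Two hypothetical unit-length "weight" vectors pointing in different directions.
a = [1.0, 0.0]
b = [0.0, 1.0]

merged = slerp(0.5, a, b)                        # follows the arc between a and b
averaged = [(x + y) / 2 for x, y in zip(a, b)]   # plain weight averaging

print(vector_norm(merged), vector_norm(averaged))
```

In this toy case the SLERP result keeps the unit length of its inputs, while the plain average shrinks toward the origin. That shrinkage is the usual argument for preferring SLERP when merging weights.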

Trade-offs

The 6144-token context window caps out before most long-form documents, making it unsuitable for research or technical work. With no public benchmarks, you're flying blind on coding, math, and factual accuracy; expect this to trail GPT-4 class models significantly on structured reasoning tasks. The license is listed as proprietary, which limits commercial deployment options compared to openly licensed alternatives.

Specifications

Provider: undi95
Category: llm
Context length: 6,144 tokens
Max output: 4,096 tokens
Modalities: text
License: proprietary
Released: 2023-07-22

Pricing

Input: $0.45/Mtok
Output: $0.65/Mtok
Model ID: undi95/remm-slerp-l2-13b

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend
$8.98
17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
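
The calculator shows only seats and messages per day, so the sketch below is one way to reproduce its figures. It assumes 22 working days per month, 2,000 tokens per message, and a 70/30 input/output token split. These three values are guesses that happen to match the numbers shown, not documented Switchy settings, and the function names are hypothetical.

```python
INPUT_PRICE_PER_MTOK = 0.45   # USD, from the Pricing section
OUTPUT_PRICE_PER_MTOK = 0.65  # USD, from the Pricing section

def monthly_tokens(seats, msgs_per_day, working_days=22, tokens_per_msg=2000):
    """Total tokens per month; working_days and tokens_per_msg are assumed values."""
    return seats * msgs_per_day * working_days * tokens_per_msg

def monthly_cost(total_tokens, input_share=0.70):
    """Upstream cost in USD; the 70/30 input/output split is an assumption."""
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    return (input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000

tokens = monthly_tokens(seats=5, msgs_per_day=80)
cost = monthly_cost(tokens)
print(f"{tokens / 1e6:.1f}M tokens / month -> ${cost:.2f}")
# prints "17.6M tokens / month -> $8.98"
```

Changing `input_share` shows how sensitive the estimate is to the mix: output tokens cost about 44% more than input tokens at these prices.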

Providers

Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime
undi95 | 6k | $0.45/Mtok | $0.65/Mtok | - | - | -

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Character Dialogue Scene

Write a tense dialogue between a skeptical detective and a nervous witness. Keep their speech patterns distinct — the detective uses short, clipped questions while the witness rambles and contradicts themselves.

Fantasy World Building

Describe a marketplace in a desert city where water is currency. Include three vendor stalls, what they sell, and how they price goods in water rations.

Personality-Driven Letter

Write a letter from a retired ship captain to their estranged daughter. The captain is proud but regretful, formal in language but emotional underneath.

Branching Story Path

A traveler finds a locked chest in the woods. Write three different outcomes based on whether they pick the lock, smash it open, or leave it alone. Keep the tone mysterious.

Dialogue Completion

Continue this exchange: 'You said you'd never come back here.' / 'I said a lot of things.' Write the next four lines of dialogue, keeping the tension rising.

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

I have a Python function that's become a 200-line monster with nested conditionals. Walk me through how to refactor it into smaller, testable pieces without changing behavior.

Output

Start by identifying the distinct responsibilities. Extract each conditional branch into its own named function — for example, if you're validating input, create `validate_user_input()`. Then look for data transformations that can become pure functions. Move shared state into a context object or dataclass. Once you have 4-5 smaller functions, write unit tests for each in isolation. Finally, compose them in your original function's place. The key is to refactor incrementally: extract one piece, test it, then move to the next. This preserves behavior while making each part independently verifiable.

Notes

This example shows the model breaking a refactoring task into ordered steps. Treat it as illustrative only: the model targets creative writing, and no public benchmarks cover its coding ability. The 6144-token context window fits a single function or small module for review; larger codebases require chunking across multiple prompts.

Prompt

Write a professional email declining a meeting request because I'm overcommitted this quarter, but I want to stay connected for future collaboration.

Output

Subject: Re: Meeting Request - Let's Reconnect Next Quarter

Hi [Name],

Thank you for reaching out. I'm interested in exploring this, but my calendar is fully committed through Q1. I don't want to schedule something I can't give proper attention.

Would it work to reconnect in early April? That would give me bandwidth to engage meaningfully. In the meantime, feel free to share any materials - I'm happy to review asynchronously.

Looking forward to connecting soon.

Best,
[Your name]

Notes

This demonstrates the model's conversational tone calibration: professional without being stiff, direct without being curt. The SLERP merge technique blends multiple fine-tunes, which often produces balanced prose. However, at 13B parameters, nuanced voice matching (mimicking a specific person's style) will be less precise than with larger models.

Prompt

Explain the difference between async/await and Promises in JavaScript as if I'm a backend engineer who's never written frontend code.

Output

Promises are like futures in other languages — an object representing a value that will arrive later. You chain `.then()` calls to handle success and `.catch()` for errors. Async/await is syntactic sugar over Promises: `async` marks a function as returning a Promise, and `await` pauses execution until that Promise resolves, letting you write asynchronous code that reads like synchronous steps. Under the hood, it's still Promises — await just unwraps them. Think of it as the difference between callback chaining and writing sequential code that blocks on I/O, except JavaScript's event loop means 'blocking' doesn't actually block the thread.

Notes

The model adapts explanation style to the stated audience (backend engineer) and uses analogies from that domain. This showcases ReMM SLERP's instruction adherence. The explanation is accurate but somewhat surface-level — for deeper architectural discussions, the 13B parameter count may limit the model's ability to explore edge cases or advanced patterns.

Use-case deep-dives

Budget-constrained content drafting

When ReMM SLERP 13B makes sense for small-team blog workflows

A 4-person content studio churning out 60 blog drafts per month hits a wall with GPT-4 costs. ReMM SLERP 13B at $0.45/$0.65 per Mtok cuts spend by 70-80% versus frontier models while maintaining coherent long-form output. The 6144-token context handles most article outlines plus research notes in a single pass. The trade-off: you'll rewrite more sentences for tone and fact-check harder since the model lacks the grounding of newer systems. If your editors already spend 20+ minutes per draft on revisions, this model slots in without changing workflow. Below 30 drafts per month, the cost delta doesn't justify the switch.

Prototype chatbot development

ReMM SLERP 13B as a staging layer before production deployment

A 3-engineer team building a customer FAQ bot needs fast iteration without burning budget on inference during the build phase. ReMM SLERP 13B gives you sub-cent per-conversation costs while you tune prompts, test edge cases, and validate conversation flows. The 6144-token window covers typical support threads with 4-6 exchanges. Once the bot logic is stable and you have real traffic projections, you migrate to a faster or more accurate model for production. The threshold: if your prototype conversations already exceed 5000 tokens regularly or you need sub-200ms latency, start with a different model. For everything else, this saves $400-600 in the first month of development.
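
The "sub-cent per-conversation" figure can be sanity-checked with a short calculation. The sketch below assumes the whole chat history is resent as the prompt on every turn, which is how stateless chat APIs commonly bill, and uses hypothetical message sizes (150-token user messages, 200-token replies). The function name is also hypothetical.

```python
INPUT_PRICE_PER_MTOK = 0.45   # USD, from the Pricing section
OUTPUT_PRICE_PER_MTOK = 0.65  # USD, from the Pricing section

def conversation_cost(exchanges, user_tokens, reply_tokens):
    """Return (cost_usd, final_context_tokens) for a multi-turn chat.

    Assumes every turn resends the full history as input tokens.
    """
    history = 0
    input_billed = 0
    output_billed = 0
    for _ in range(exchanges):
        history += user_tokens
        input_billed += history       # full history is billed as input this turn
        output_billed += reply_tokens
        history += reply_tokens       # the reply joins the history for the next turn
    cost = (input_billed * INPUT_PRICE_PER_MTOK
            + output_billed * OUTPUT_PRICE_PER_MTOK) / 1_000_000
    return cost, history

cost, context = conversation_cost(exchanges=6, user_tokens=150, reply_tokens=200)
print(f"${cost:.4f} per conversation, {context} tokens of context used")
# prints "$0.0035 per conversation, 2100 tokens of context used"
```

Under these assumed sizes a six-exchange thread costs well under a cent and ends at 2,100 tokens of context, comfortably inside the 6144-token window.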

Lightweight data extraction tasks

When ReMM SLERP 13B handles structured output at scale

A 2-person ops team processes 200 vendor invoices per week, pulling line items into a spreadsheet. ReMM SLERP 13B at $0.45 input per Mtok makes batch extraction economically viable where GPT-4 would cost 6-8x more. The model handles simple JSON schema output and stays consistent across repetitive tasks. The 6144-token context fits most invoices plus the extraction prompt. The catch: the model is text-only, so invoices must be converted to text before extraction, and you should expect a 5-8% error rate on complex tables or handwritten notes, so you still need a human QA pass. If your invoices are standardized PDFs and you're already doing manual entry, this model pays for itself in week one. Above 500 invoices per week, consider a fine-tuned model or a vision-capable system.
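
One way to wire in the human QA pass is to validate each model response before it reaches the spreadsheet and flag anything that fails. The sketch below is a minimal example under assumed conditions: the three-field schema, the field names, and the function names are hypothetical, and the model is assumed to return a JSON array of line items as plain text.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {
    "description": str,
    "quantity": (int, float),
    "unit_price": (int, float),
}

def is_valid_item(entry):
    """True if entry is a dict with every required field of the right type."""
    if not isinstance(entry, dict):
        return False
    for field, expected in REQUIRED_FIELDS.items():
        value = entry.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return False
    return True

def parse_line_items(raw_output):
    """Return (valid_items, needs_review) for one invoice's model output."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return [], True
    if not isinstance(data, list):
        return [], True
    valid = [entry for entry in data if is_valid_item(entry)]
    # Flag the invoice if anything was dropped or nothing was extracted.
    needs_review = len(valid) != len(data) or not data
    return valid, needs_review

good = '[{"description": "Widget", "quantity": 3, "unit_price": 4.5}]'
bad = '[{"description": "Widget", "quantity": "three"}]'

print(parse_line_items(good))
print(parse_line_items(bad))
```

Schema checks like this only catch malformed output. A well-formed line item with the wrong amount still passes, so sampling valid invoices for manual review remains necessary.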

Frequently asked

Is ReMM SLERP 13B good for general text generation?

Yes, for creative writing and roleplay scenarios. ReMM SLERP 13B is a community merge optimized for conversational coherence and character consistency. It handles dialogue and narrative generation well within its 6144-token window. For factual tasks or code, use a general-purpose model such as Llama 2 13B Chat instead.

Is ReMM SLERP 13B cheaper than GPT-3.5 Turbo?

Yes, at the list prices quoted here. ReMM SLERP costs $0.45/$0.65 per Mtok against GPT-3.5 Turbo's $0.50/$1.50 per Mtok for input/output, so input is about 10% cheaper and output about 57% cheaper. The saving is modest, though, and you give up the published benchmarks and scale advantages of a commercial model. Choose it for its specific merge characteristics rather than for price alone.

Can ReMM SLERP 13B handle 6000-token conversations?

Barely. The 6144-token context window includes both input and output, so a 6000-token conversation leaves just 144 tokens for the model's response. For multi-turn chat, keep conversations under 4000 tokens total or use a model with an 8K+ context window.
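
The arithmetic in this answer generalizes to a small helper. The sketch below uses the context length and max output from the Specifications section; the function name is hypothetical, and token counts are assumed to come from whatever tokenizer your client uses.

```python
CONTEXT_WINDOW = 6144  # tokens, from Specifications
MAX_OUTPUT = 4096      # tokens, from Specifications

def reply_budget(prompt_tokens):
    """Tokens left for the reply once history plus the new message are counted."""
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The reply can never exceed the max output, and never go below zero.
    return max(0, min(remaining, MAX_OUTPUT))

print(reply_budget(6000))  # 144
print(reply_budget(4000))  # 2144
print(reply_budget(1000))  # 4096, capped by the max output
```

Checking this budget before each turn gives a clear point at which to trim or summarize older messages instead of letting the reply get truncated.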

How does ReMM SLERP 13B compare to base Llama 2 13B?

ReMM SLERP trades benchmark performance for subjective qualities like tone and creativity. It's a SLERP merge of fine-tuned variants, not a base model. Without public benchmarks, assume it underperforms Llama 2 13B on factual accuracy and reasoning but may feel more natural in creative contexts.

Should I use ReMM SLERP 13B for production chatbots?

No. The lack of public benchmarks and the small context window make this a risky choice for production, even at its low per-token price. Use GPT-3.5 Turbo, Claude Haiku, or Llama 2 13B Chat instead. ReMM SLERP is for hobbyists experimenting with merge techniques, not commercial deployments.

Data last verified 8 hours ago. Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.