LiquidAI: LFM2-24B-A2B

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...

Anyone in the Space can @-mention LiquidAI: LFM2-24B-A2B with the team's shared context - pooled credits, one chat, one memory.

Verdict

LiquidAI's LFM2-24B-A2B is a 24-billion-parameter model built on their Liquid Foundation Model architecture, designed for efficient inference with a 32K context window. At $0.03/$0.12 per Mtok, it sits in the budget tier but lacks public benchmark data to validate performance claims. This is the model to reach for when you need cost-effective text generation and are willing to test internally before committing, but expect to benchmark it yourself against established alternatives like Gemini Flash or Llama 3.1 8B before production use.

Best for

Budget-conscious text generation tasks
Prototyping with 32K context needs
Teams exploring Liquid architecture benefits
Cost-sensitive chatbot backends

Strengths

With 24B total parameters, the model has more capacity than 8B-class models while keeping aggressive pricing that undercuts mid-tier options. The 32K context window handles moderately long documents without chunking. LiquidAI describes LFM2 as a family of hybrid architectures built for efficient on-device deployment, which may translate into faster inference than traditional transformer models at this scale. The A2B suffix refers to the Mixture-of-Experts design: only 2B of the 24B parameters are active, so the compute cost of a request is closer to that of a small model.

Trade-offs

No public benchmarks means you're flying blind on actual performance versus Gemini Flash, GPT-4o Mini, or Llama alternatives at similar price points. The 32K context is well below what Gemini Flash, Claude, and GPT-4-class models offer. The proprietary license limits deployment flexibility. As a newer entrant from LiquidAI, ecosystem tooling and community knowledge are sparse compared to established models. You'll need to run your own evals before trusting it with production traffic.

Specifications

Provider: liquid
Category: llm
Context length: 32,768 tokens
Max output: —
Modalities: text
License: proprietary
Released: 2026-02-25

Pricing

Input: $0.03/Mtok
Output: $0.12/Mtok
Model ID: liquid/lfm-2-24b-a2b

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Seats: 5 people
Messages / seat / day: 80
Avg turn size: 2 ktok
Output share: 30 %

Estimated monthly spend

$1.00

17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
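
The calculator's figures can be reproduced with a few lines of arithmetic. The sketch below is an assumption about how the estimate is derived, not Switchy's actual formula: it reads "2 ktok" as 2,000 tokens per message and assumes 22 working days per month, the value that makes the totals match the 17.6M tokens and $1.00 shown above. The `TeamUsage` class and function names are hypothetical.

```python
from dataclasses import dataclass

# Upstream prices from the Pricing section (USD per million tokens).
INPUT_USD_PER_MTOK = 0.03
OUTPUT_USD_PER_MTOK = 0.12


@dataclass
class TeamUsage:
    seats: int
    messages_per_seat_per_day: int
    tokens_per_message: int            # "2 ktok" read as 2,000 tokens (assumption)
    output_share: float                # fraction of tokens billed at the output rate
    working_days_per_month: int = 22   # assumption, chosen to match the 17.6M total


def monthly_tokens(usage: TeamUsage) -> int:
    """Total tokens per month across all seats."""
    return (
        usage.seats
        * usage.messages_per_seat_per_day
        * usage.tokens_per_message
        * usage.working_days_per_month
    )


def monthly_spend_usd(usage: TeamUsage) -> float:
    """Split the monthly total into input and output tokens and price each."""
    total = monthly_tokens(usage)
    output_tokens = total * usage.output_share
    input_tokens = total - output_tokens
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


usage = TeamUsage(
    seats=5,
    messages_per_seat_per_day=80,
    tokens_per_message=2_000,
    output_share=0.30,
)
print(f"{monthly_tokens(usage) / 1e6:.1f}M tokens / month")  # 17.6M tokens / month
print(f"${monthly_spend_usd(usage):.2f}")                     # $1.00
```

Because output tokens cost four times as much as input tokens, the output share is the input that moves the estimate most.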

Providers

Provider	Context	Input	Output	P50 latency	Throughput	30d uptime
liquid	32k	$0.03/Mtok	$0.12/Mtok	—	—	—

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Document Summarization Test

Summarize the following document in 3-4 bullet points, focusing on key decisions and action items. Preserve technical terminology and highlight any unresolved questions.

Code Review Assistant

Review this code snippet for potential bugs, performance issues, and style improvements. Explain your reasoning for each suggestion and rank them by severity.

Customer Support Draft

Draft a professional response to this customer inquiry. Acknowledge their concern, provide a clear solution or next steps, and maintain a helpful tone without over-apologizing.

Structured Data Extraction

Extract all dates, names, and monetary amounts from the following text. Return results as a JSON object with arrays for each category.

Long-Context Reasoning

Given these meeting transcripts from the past month, identify recurring themes, track how decisions evolved, and flag any contradictions between earlier and later discussions.
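
When using the Structured Data Extraction prompt above in a pipeline, it helps to check the reply before trusting it. The following is a minimal sketch, assuming the model is asked to use the keys `dates`, `names`, and `amounts`; those key names and the `validate_extraction` helper are hypothetical, since the prompt only asks for "arrays for each category".

```python
import json

# Hypothetical key names; the starter prompt does not fix them.
EXPECTED_KEYS = ("dates", "names", "amounts")


def validate_extraction(raw_response: str) -> dict:
    """Parse a model reply and confirm it is a JSON object with one array per category.

    Raises ValueError if the reply is not valid JSON (json.JSONDecodeError is a
    ValueError subclass) or if a category is missing or is not an array.
    """
    data = json.loads(raw_response)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for key in EXPECTED_KEYS:
        if not isinstance(data.get(key), list):
            raise ValueError(f"missing or non-array field: {key}")
    return {key: data[key] for key in EXPECTED_KEYS}


sample_reply = '{"dates": ["2025-01-31"], "names": ["Ada"], "amounts": ["$45,000"]}'
print(validate_extraction(sample_reply))
```

Counting how often this check fails over a batch of real inputs gives a simple, model-agnostic reliability number for the extraction task.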

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

Refactor this Python function to handle edge cases and improve readability: def calc(x,y): return x/y if y!=0 else None

Output

Here's a refactored version with better error handling:

```python
def safe_divide(dividend: float, divisor: float) -> float | None:
    """Safely divide two numbers, returning None for invalid operations."""
    if divisor == 0:
        return None
    return dividend / divisor
```

Key improvements: descriptive function and parameter names, type hints for clarity, docstring explaining behavior, and the same zero-division logic preserved. Consider raising a ValueError instead of returning None if you want callers to handle the error explicitly.

Notes

This example illustrates clean, idiomatic code with modern Python conventions (the `float | None` annotation requires Python 3.10 or later). The 32K context window fits several source files in a single request, though at $0.12/Mtok output, extended refactoring sessions cost more than some alternatives in the same parameter class.

Prompt

Summarize the key regulatory changes in this 8-page compliance memo and flag any action items for our legal team by end of quarter.

Output

The memo outlines three primary changes: (1) Updated data retention requirements extending storage from 5 to 7 years for financial records, effective Q1 2025. (2) New consent language mandated for EU customers under amended GDPR guidelines—templates must be revised by December 15. (3) Expanded breach notification timelines, now requiring disclosure within 48 hours instead of 72. Action items: Legal must draft updated consent forms (due Dec 15), IT must audit retention policies (due Jan 31), and Compliance must file acknowledgment with regulators (due Nov 30).

Notes

Demonstrates the model's structured summarization and task extraction from dense documents. The 32K token window accommodates multi-page memos without truncation. However, without public benchmark data, users should verify accuracy on domain-specific jargon before relying on output for compliance decisions.

Prompt

Write a professional email declining a vendor proposal while keeping the door open for future collaboration. They quoted $45K; our budget is $20K.

Output

Subject: Re: Proposal for Q1 Marketing Services Hi [Vendor Name], Thank you for the detailed proposal. Your team's approach to audience segmentation and the case studies you shared are impressive. Unfortunately, the $45K investment exceeds our current budget allocation for this initiative. We're working within a $20K range for Q1, which doesn't align with the scope you've outlined. That said, we'd welcome the opportunity to reconnect in Q2 when we revisit our marketing spend. If you have a scaled-down option that fits our budget, I'm happy to discuss that as well. Best regards

Notes

Shows the model's tone calibration for business communication: polite, specific about constraints, and forward-looking. The output balances professionalism with directness. Whether the model matches larger alternatives on nuanced writing tasks is untested without public benchmarks; at $0.03/$0.12 per Mtok, its pricing sits in the budget tier.

Use-case deep-dives

Budget-conscious customer support routing

When LFM2-24B-A2B makes sense for high-volume ticket triage

A 12-person SaaS company processing 800 support tickets daily needs to classify urgency and route to the right specialist without burning through their AI budget. LFM2-24B-A2B at $0.03/$0.12 per Mtok sits in the sweet spot: cheap enough to run on every inbound message, capable enough to handle the classification logic that doesn't require frontier reasoning. The 32k context window covers full ticket threads plus knowledge base excerpts for routing decisions. If your accuracy threshold is above 95% or you need multi-step reasoning within tickets, you'll want benchmarks before committing—this model hasn't published public evals yet. For straightforward classification at scale where cost per call matters more than bleeding-edge performance, run a week-long pilot and measure false-positive rates against your current system.
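
The week-long pilot described above comes down to comparing the model's urgency labels with what a human reviewer decided. A minimal sketch of that scoring, assuming each ticket is recorded as a pair `(predicted_urgent, actually_urgent)`; the function names and the sample data are hypothetical.

```python
def false_positive_rate(pairs: list[tuple[bool, bool]]) -> float:
    """Share of non-urgent tickets that the model flagged as urgent."""
    false_positives = sum(1 for predicted, actual in pairs if predicted and not actual)
    true_negatives = sum(1 for predicted, actual in pairs if not predicted and not actual)
    negatives = false_positives + true_negatives
    return false_positives / negatives if negatives else 0.0


def accuracy(pairs: list[tuple[bool, bool]]) -> float:
    """Share of tickets where the model's label matched the reviewer's."""
    if not pairs:
        return 0.0
    return sum(1 for predicted, actual in pairs if predicted == actual) / len(pairs)


# Hypothetical pilot sample: (model said urgent, reviewer said urgent)
pilot = [
    (True, True),
    (True, False),   # false positive
    (False, False),
    (False, False),
    (False, True),   # missed urgent ticket
]
print(f"false-positive rate: {false_positive_rate(pilot):.2f}")
print(f"accuracy: {accuracy(pilot):.2f}")
```

Running the same scoring on the current routing system over the same tickets gives the baseline the pilot is meant to beat.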

Internal documentation Q&A

LFM2-24B-A2B as a cost-effective RAG backend for team wikis

A 40-person engineering org wants every developer to query their Confluence docs and runbooks without waiting on senior engineers. LFM2-24B-A2B handles the generation step of a retrieval-augmented pipeline at a fraction of the cost of GPT-4 class models, which matters when you're serving 200+ queries per day across the team. The 32K window fits most chunked documentation contexts, and the $0.03 input rate keeps context injection cheap (embedding lookups are billed by whichever embedding model you use, not at this rate). The trade-off: you're betting on a model without public benchmark validation, so complex technical reasoning or edge-case accuracy might lag proven alternatives. If your docs are well-structured and queries are mostly lookup-heavy rather than synthesis-heavy, this model keeps your RAG pipeline economical. Test it on your hardest 50 questions first, then scale if precision holds.
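
Fitting retrieved chunks into the 32,768-token window is a budgeting exercise. The sketch below is one way to do it, assuming chunks arrive ranked by relevance with their token counts already measured by whatever tokenizer your stack uses; `pack_chunks` and the example numbers are hypothetical.

```python
CONTEXT_WINDOW = 32_768  # context length from the Specifications section


def pack_chunks(
    chunk_token_counts: list[int],
    prompt_tokens: int,
    reserved_output_tokens: int,
) -> tuple[list[int], int]:
    """Greedily keep the highest-ranked chunks that fit in the window.

    Returns the indices of the selected chunks and the tokens they use.
    Stops at the first chunk that does not fit, so the selection is always
    a prefix of the ranked list.
    """
    budget = CONTEXT_WINDOW - prompt_tokens - reserved_output_tokens
    selected: list[int] = []
    used = 0
    for index, tokens in enumerate(chunk_token_counts):
        if used + tokens > budget:
            break
        selected.append(index)
        used += tokens
    return selected, used


# Hypothetical query: 500-token instructions, 1,024 tokens reserved for the answer.
selected, used = pack_chunks([12_000, 10_000, 9_000, 500], 500, 1_024)
print(selected, used)  # [0, 1, 2] 31000
```

Reserving output tokens up front matters because the window covers the prompt and the reply together.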

Draft generation for marketing content

When to use LFM2-24B-A2B for first-pass blog outlines and social copy

A 5-person content agency produces 60 blog posts and 300 social assets monthly, and they need a model that drafts outlines and short-form copy fast without eating their margin. LFM2-24B-A2B's output pricing at $0.12/Mtok makes it viable for generating 500-800 token drafts that human editors refine—substantially cheaper than frontier models for work that doesn't require deep research or brand-voice nuance out of the gate. The 32k context lets you feed style guides and competitor examples inline. The risk: without published benchmarks, you can't predict how often drafts will need heavy rewrites versus light edits, which changes the true cost-per-finished-asset. Run a two-week trial on 20% of your workload, track editor time-to-publish, and calculate whether the cheaper token rate actually saves labor hours. If your editors spend under 10 minutes per draft, the math works.
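
The "true cost-per-finished-asset" mentioned above combines token spend with editor time. A rough sketch, assuming a hypothetical 4,000-token prompt (style guide plus brief), an 800-token draft, and a hypothetical editor rate of $60 per hour; only the per-token prices come from this page.

```python
INPUT_USD_PER_MTOK = 0.03
OUTPUT_USD_PER_MTOK = 0.12


def token_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Model cost of one draft at the listed upstream prices."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


def cost_per_finished_asset(
    input_tokens: int,
    output_tokens: int,
    editor_minutes: float,
    editor_hourly_rate_usd: float,
) -> float:
    """Token cost plus the cost of the editor's time on the draft."""
    labor_usd = editor_minutes / 60 * editor_hourly_rate_usd
    return token_cost_usd(input_tokens, output_tokens) + labor_usd


# Hypothetical inputs: 4,000-token prompt, 800-token draft, 10 minutes at $60/hour.
draft_cost = token_cost_usd(4_000, 800)
finished_cost = cost_per_finished_asset(4_000, 800, 10, 60.0)
print(f"tokens: ${draft_cost:.6f}  finished asset: ${finished_cost:.2f}")
```

Under these assumptions the tokens cost a small fraction of a cent while editing costs about $10, so editor time dominates the result and is the figure the trial should track most closely.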

Frequently asked

Is LFM2-24B-A2B good for general text tasks?

Probably, for most standard text work. With a 32K context window it is suited to document analysis, summarization, and conversational tasks. Without public benchmarks we can't compare it directly to GPT-4 or Claude, and with only 2B of its 24B parameters active, the headline parameter count is not a reliable guide to quality. Treat it as a candidate for everyday business use cases like email drafting, content editing, and customer support responses, and confirm with your own tests.

Is LFM2-24B-A2B cheaper than GPT-4o?

Significantly cheaper. At $0.03 input and $0.12 output per million tokens, you're paying well over an order of magnitude less than GPT-4o's list prices, whatever the task mix. For high-volume applications where you don't need frontier-model reasoning, this pricing makes it viable for batch processing, internal tools, and scenarios where cost per request matters more than peak capability.

Can LFM2-24B-A2B handle 32K tokens reliably?

The 32K context window is standard for mid-tier models, but real-world performance depends on your use case. For structured documents and straightforward retrieval it should work fine. For complex reasoning across the full context or needle-in-haystack tasks, quality often degrades as the window fills, which is common for models in this class; no public long-context results exist for this model, so where that starts is unknown. Test with your actual data before committing to context-heavy workflows.
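
One way to test long-context reliability yourself is a needle-in-haystack probe: plant a known fact at different depths of a long filler document and check whether the model retrieves it. The sketch below only builds the prompts and scores replies; the model call is left out, and all names and sample text are hypothetical.

```python
def build_haystack_prompt(
    filler_sentences: list[str],
    needle: str,
    depth: float,
    question: str,
) -> str:
    """Insert `needle` at a relative depth (0.0 = start, 1.0 = end) of the filler text."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = int(depth * len(filler_sentences))
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    return " ".join(sentences) + "\n\nQuestion: " + question


def answered_correctly(reply: str, expected: str) -> bool:
    """Crude scoring: the expected string appears in the reply, ignoring case."""
    return expected.lower() in reply.lower()


filler = ["Alpha.", "Bravo.", "Charlie.", "Delta."]
needle = "The access code is 7431."
prompts = {
    depth: build_haystack_prompt(filler, needle, depth, "What is the access code?")
    for depth in (0.0, 0.5, 1.0)
}
print(prompts[0.5])
```

In a real run the filler would be your own documents, sized to different fractions of the 32K window, and each prompt's reply would be passed to `answered_correctly` to see at which length and depth retrieval starts to fail.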

How does LFM2-24B-A2B compare to Llama 3.1 70B?

Llama 3.1 70B will likely outperform it on reasoning and coding tasks: it is a dense model with roughly 3x the total parameters, while LFM2-24B-A2B activates only 2B of its 24B. LFM2-24B-A2B trades capability for speed and cost, so inference should be faster and cheaper, and for simpler tasks the quality gap may be negligible. Choose this if latency and price matter more than handling complex multi-step problems or advanced code generation.

Should I use LFM2-24B-A2B for production chatbots?

Yes, if your chatbot handles defined workflows rather than open-ended research. With only 2B active parameters, response times should be fast, and the pricing supports high message volumes. For customer service, FAQ systems, or guided conversations it's a practical choice. For complex troubleshooting or technical support where reasoning depth matters, you'll want a larger model like Claude or GPT-4.