Qwen: Qwen3.6 27B
Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid multimodal capabilities — accepting text, image, and video inputs...
Anyone in the Space can @-mention Qwen: Qwen3.6 27B with the team's shared context - pooled credits, one chat, one memory.
Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.
Verdict
Best for
- Long-context document analysis under budget
- Multimodal tasks combining text and images
- Video content understanding and summarization
- Cost-sensitive production deployments
- Extended conversations with deep context retention
Strengths
The 262K context window puts Qwen3.6 27B in the same league as models twice its size, making it viable for full-document analysis and multi-turn conversations that would exhaust smaller windows. Multimodal support across text, image, and video gives you flexibility without switching models mid-workflow. At $0.29 per million input tokens, it's 3-5× cheaper than comparable multimodal alternatives, making it practical for high-volume production use cases where cost per request matters.
Trade-offs
The 27B parameter count means you'll hit reasoning limits faster than with 70B+ models — complex multi-step logic, advanced mathematics, and nuanced code generation will show the gap. Output pricing at $3.17/Mtok is higher than some text-only peers, so long-form generation tasks can still rack up costs. Without public benchmark data yet, you're flying somewhat blind on head-to-head performance against established models like GPT-4o or Claude 3.5 Sonnet in specific domains.
Specifications
- Provider
- qwen
- Category
- llm
- Context length
- 262,140 tokens
- Max output
- 262,140 tokens
- Modalities
- text, image, video
- License
- proprietary
- Released
- 2026-04-27
Pricing
- Input
- $0.29/Mtok
- Output
- $3.17/Mtok
- Model ID
qwen/qwen3.6-27b
Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.
Team cost calculator
5 seats · 80 msgs/day
Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
Providers
| Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime |
|---|---|---|---|---|---|---|
| qwen | 262k | $0.29/Mtok | $3.17/Mtok | — | — | — |
Performance
Benchmarks
Works well with
Top MCPs
Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.
How Switchy teams use it
Starter prompts
Analyze Multi-Page Contract
Review this contract and identify all payment terms, termination clauses, and liability limitations. Flag any unusual provisions that deviate from standard commercial agreements.Open in a Space →
Summarize Video Content
Watch this video and provide a structured summary: main topics covered, key takeaways, and any action items or decisions mentioned. Include timestamps for important segments.Open in a Space →
Extract Data from Screenshots
Examine this screenshot and extract all visible data into a structured format. Include field labels, values, and any error messages or warnings displayed.Open in a Space →
Compare Product Images
Compare these two product images and list all visible differences: colors, dimensions, packaging, labels, and any quality issues. Be specific about what changed.Open in a Space →
Maintain Context Across Conversation
Based on everything we've discussed so far, synthesize the three most important decisions we've made and the open questions that still need resolution. Reference specific points from our conversation.Open in a Space →
Example outputs
Illustrative - representative of the model's voice and quality, not literal recordings.
Review this Python function for potential bugs and suggest improvements. Focus on edge cases and performance: ```python def merge_sorted_lists(list1, list2): result = [] i = j = 0 while i < len(list1) and j < len(list2): if list1[i] < list2[j]: result.append(list1[i]) i += 1 else: result.append(list2[j]) j += 1 return result ```
The function has a critical bug: it doesn't append remaining elements after one list is exhausted. Add `result.extend(list1[i:])` and `result.extend(list2[j:])` after the while loop. For performance, consider using `itertools.chain()` with `heapq.merge()` for larger lists—it's O(n) and memory-efficient. The current approach creates a new list with repeated appends; pre-allocating with `result = [None] * (len(list1) + len(list2))` and index assignment would reduce allocations. Edge case: function assumes both inputs are sorted and doesn't validate that assumption.
This example shows Qwen3.6 27B's code analysis depth—it catches the logic bug, suggests algorithmic alternatives, and addresses performance patterns. The 262K token context window means it could review entire codebases in one pass, though this example keeps scope narrow. Response is practical but assumes intermediate Python knowledge.
I'm designing a SaaS onboarding flow. Generate 5 microcopy variants for a button that lets users skip the tutorial. The app is a project management tool for remote teams. Tone should be encouraging but not pushy.
1. "I'll explore on my own" — Empowers user autonomy without implying the tutorial lacks value. 2. "Jump straight in" — Action-oriented, suggests confidence rather than avoidance. 3. "Skip for now" — Leaves door open for later, reduces commitment anxiety. 4. "Start with a blank canvas" — Frames skipping as a creative choice, fits PM context. 5. "I'm familiar with PM tools" — Acknowledges user expertise, respects their time. Recommendation: Test #2 and #4 in A/B—they reframe skipping as positive action rather than rejection. Avoid "Skip tutorial" (too blunt) or "Maybe later" (creates guilt).
Demonstrates multimodal reasoning—Qwen3.6 processes the product context (remote teams, PM tool) and tone constraints to generate contextually appropriate options. The strategic recommendation shows it's not just generating variants but evaluating them. At $3.20/Mtok output, this kind of creative iteration is cost-effective compared to text-heavy models.
Analyze this customer support transcript and identify: (1) the root cause of the issue, (2) whether the agent followed our escalation protocol, (3) sentiment trajectory. Transcript: [360-line conversation with timestamps, multiple handoffs, technical jargon, and emotional language spanning 48 minutes]
**Root cause:** Billing system charged customer twice due to webhook retry logic firing after initial timeout (lines 89-104). Not a duplicate order—same transaction ID processed twice. **Protocol adherence:** Partial. Agent correctly escalated to L2 at 12-minute mark (protocol: 10 min) but failed to offer proactive refund before customer requested it (line 201). Missed required empathy statement during second handoff. **Sentiment:** Started neutral (CSAT proxy: 6/10), dropped to frustrated at 18-min mark when asked to repeat information, recovered to cautiously positive after refund confirmation (est. 7/10). Key inflection at line 267 when agent acknowledged system fault without deflection.
This showcases the 262K token context window's practical value—ingesting a full support transcript without summarization preserves nuance that shorter-context models would lose. Qwen3.6 extracts structured insights from unstructured conversation while tracking sentiment shifts. Trade-off: no benchmark data means we can't compare its accuracy to specialized support-analysis models.
Use-case deep-dives
When Qwen3.6 27B handles image-heavy SKU tagging at scale
A 12-person e-commerce team processes 800 product photos daily, extracting attributes for search filters and auto-generating descriptions. Qwen3.6 27B's native image+text input means you skip the separate vision API call—feed the product shot and existing metadata in one request, get structured JSON back with color, material, style tags. At $0.32 input per million tokens, a 500-token image embedding plus 200-token prompt costs under a tenth of a cent per SKU. The 262k context window lets you batch 40-50 products in a single call for consistency across a collection. Output at $3.20/Mtok keeps 300-word descriptions economical when you're generating thousands weekly. If your catalog is under 200 items or accuracy demands exceed 95%, validate with a specialist vision model first. For mid-volume catalogs where speed and cost matter more than perfection, this is the call.
Why Qwen3.6 27B works for in-house counsel reviewing contracts
A 4-lawyer startup team reviews 15-20 SaaS vendor agreements monthly, each 30-80 pages. Qwen3.6 27B's 262k token context fits a full 60-page contract (roughly 90k tokens) plus your internal compliance checklist and 20 follow-up questions in a single session—no chunking, no retrieval layer, no context loss between clauses. You ask "Does Section 8 conflict with our data residency policy?" and get an answer grounded in both documents. At $0.32 input per Mtok, loading a 90k-token contract costs 3 cents; five rounds of Q&A add another 2 cents in output. The 27B parameter count won't match frontier models on nuanced legal reasoning, but for routine vendor paper where you're checking boxes, not litigating edge cases, it's fast and cheap enough to run on every agreement. If you're negotiating M&A or IP licensing, escalate to a 70B+ model.
When Qwen3.6 27B's video input cuts moderation latency for UGC platforms
A 20-person social app moderates 5,000 user-uploaded videos daily, flagging violence, hate symbols, and policy violations before publish. Qwen3.6 27B ingests video natively—you send 10-second clips as token sequences, ask "Does this contain prohibited content?", and get a yes/no plus reasoning in under 2 seconds. No separate transcription or frame-extraction pipeline. At $0.32 input per Mtok, a 10-second video (roughly 8k tokens) plus a 100-token prompt costs a fraction of a cent per clip. Output at $3.20/Mtok keeps the 150-token explanation affordable even at 5k daily reviews. The model won't catch every deepfake or context-dependent slur a human would, so route borderline cases (confidence under 80%) to human review. For high-volume, low-stakes moderation where you need sub-3-second decisions and can tolerate a 5-8% false-negative rate, this is the right trade-off.
Frequently asked
Is Qwen3.6 27B good for general text generation tasks?
Yes, Qwen3.6 27B handles most text generation well — summarization, drafting, Q&A, and light reasoning. At 27B parameters it sits between smaller fast models and heavyweight reasoning engines. The 262k context window means you can feed it entire codebases or long documents without chunking. It won't match 70B+ models on complex logic, but for everyday text work it's solid.
Is Qwen3.6 27B cheaper than GPT-4o or Claude Sonnet?
Much cheaper. At $0.32 input and $3.20 output per million tokens, Qwen3.6 costs roughly 10-15x less than GPT-4o or Claude Sonnet 3.5 for comparable workloads. If you're processing high volumes of text where top-tier reasoning isn't critical — customer support, content moderation, data extraction — the cost savings add up fast without sacrificing too much quality.
Can Qwen3.6 27B handle multimodal inputs like images and video?
Yes, it accepts text, image, and video inputs. You can feed it screenshots for UI analysis, diagrams for technical questions, or video frames for content understanding. The quality depends on your use case — it's not a specialist vision model, so don't expect GPT-4V-level image reasoning. For basic multimodal tasks like document parsing or video summarization, it works fine.
How does Qwen3.6 27B compare to the previous Qwen2.5 models?
Qwen3.6 extends the context window significantly — 262k tokens versus 128k in Qwen2.5 72B. The 27B parameter count makes it faster and cheaper to run than the 72B variant while maintaining competitive performance on most tasks. If you need the extra context for long documents or don't require maximum reasoning depth, Qwen3.6 27B is the better pick for cost-efficiency.
Should I use Qwen3.6 27B for production chatbots?
Yes, if cost and speed matter more than cutting-edge reasoning. The 262k context lets you maintain long conversation histories or inject large knowledge bases. Latency should be acceptable for chat — 27B models typically respond in 1-3 seconds depending on your infrastructure. For customer service, internal tools, or high-volume chat where you don't need GPT-4-class logic, it's a practical choice.