Z.ai: GLM 4.5 Air
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter...
Anyone in the Space can @-mention Z.ai: GLM 4.5 Air with the team's shared context - pooled credits, one chat, one memory.
Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.
Verdict
Best for
- Budget-sensitive long-context tasks
- Chinese-language document analysis
- High-volume text processing pipelines
- Prototyping before scaling to premium models
Strengths
The 128K context window matches GPT-4 Turbo at a fraction of the cost, making it viable for ingesting full codebases or lengthy contracts in a single call. Pricing sits roughly 80% below Anthropic and OpenAI equivalents, which matters for teams running thousands of requests daily. The model originates from Zhipu AI's GLM series, which has shown competitive performance on Chinese-language benchmarks in prior releases, suggesting strength in multilingual scenarios where English-only models stumble.
Trade-offs
Public benchmark coverage is nearly nonexistent, so you're flying blind compared to models with extensive MMLU, HumanEval, and reasoning evals. Early adopters report weaker performance on complex multi-step reasoning and creative writing versus Claude Sonnet or GPT-4o. Output quality can drift on edge-case prompts, and the model lacks the safety tuning depth of Western labs. If your task demands high reliability or passes through compliance review, the lack of transparency becomes a blocking issue.
Specifications
- Provider
- z-ai
- Category
- llm
- Context length
- 131,072 tokens
- Max output
- 98,304 tokens
- Modalities
- text
- License
- proprietary
- Released
- 2025-07-25
Pricing
- Input
- $0.13/Mtok
- Output
- $0.85/Mtok
- Model ID
z-ai/glm-4.5-air
Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.
Team cost calculator
5 seats · 80 msgs/day
Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
Providers
| Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime |
|---|---|---|---|---|---|---|
| z-ai | 131k | $0.13/Mtok | $0.85/Mtok | — | — | — |
Performance
Benchmarks
Works well with
Top MCPs
Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.
How Switchy teams use it
Starter prompts
Bulk Contract Extraction
Extract all payment terms, renewal clauses, and termination conditions from the attached lease. List each with page references and flag any ambiguous language.Open in a Space →
Chinese-English Code Comments
Translate all Chinese comments in this Python file to English. Preserve technical terms and keep the tone consistent with existing English comments.Open in a Space →
Multi-Document Summarization
Summarize the key findings and methodology from these five papers. Highlight where results conflict and note any shared limitations.Open in a Space →
Cost-Optimized Chatbot Backend
You are a support agent for an e-commerce platform. Answer the user's question about order status, refunds, or shipping. Be concise and friendly.Open in a Space →
Codebase Context Search
Given this full codebase, explain how the authentication flow works from login to token refresh. Include file names and function calls.Open in a Space →