
Gemini

Comprehensive Gemini integration supporting Veo 3 video generation, Gemini Flash text generation, Gemini 2.5 Flash image generation (Nano Banana), chat completions, and multimodal AI capabilities via the Google Gemini API.

Verdict

Gemini brings Google's multimodal AI directly into your Spaces: text generation, image creation, video synthesis, and embeddings, all from one @mention. Your team can draft content with Gemini Flash or Pro, generate images with Gemini 2.5 Flash, produce videos with Veo, or compute embeddings for semantic search without leaving the conversation. Token counting helps estimate costs before you commit. No API key setup is required in Switchy, though you'll still need a valid Gemini API account to invoke the tools. Best for teams that prototype content, run semantic analysis, or need quick visual assets without context-switching to separate platforms.

Common use cases

  • Draft blog posts and social copy in chat
  • Generate hero images for landing pages
  • Produce explainer videos from text briefs
  • Compute embeddings for semantic document search
  • Estimate token costs before large batch jobs

Integration

Vendor: Gemini
Category: other
Auth: NONE
Tools: 9
Composio slug: gemini

Tools

  • Count Tokens (Gemini)

    Counts the number of tokens in text using Gemini tokenization. Useful for estimating costs, checking input limits, and optimizing prompts before making API calls.

  • Download Video (Veo)

    Downloads a generated Veo video to local storage. Takes the video URI from a completed operation and saves it to the specified file path.

  • Embed Content (Gemini)

    Generates text embeddings using Gemini embedding models. Converts text into numerical vectors for semantic search, similarity comparison, clustering, and classification tasks.

  • Generate Content (Gemini)

    Generates text content from prompts using Gemini models. Supports various models like Gemini Flash and Pro with configurable temperature, token limits, and safety settings for diverse text generation tasks.

  • Generate Image (Gemini 2.5 Flash)

    Generates images from text prompts using the Gemini 2.5 Flash image preview model. Supports creative image generation with customizable parameters like aspect ratio, safety settings, and optional file saving.

  • Generate Videos (Veo)

    Generates videos from text prompts using Google's Veo models. Creates high-quality video content with customizable aspect ratios, duration, and style controls. Returns an operation ID for tracking progress.

  • Get Videos Operation (Veo)

    Checks the status of a Veo video generation operation. Use the operation name from Generate Videos to track progress and get the download URL when complete.

  • List Models (Gemini API)

    Lists available Gemini and Veo models with their capabilities and limits. Useful for discovering supported models and their features before making generation requests.

  • Wait For Video (Veo)

    Polls a Veo video generation operation until completion or timeout. Automatically checks status at intervals and returns the final video URL when ready.
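
The poll-until-done behaviour described for the Veo tools can be sketched roughly as follows. This is a minimal illustration, not the MCP's actual implementation: `wait_for_operation`, the `get_operation` callable, and the `"done"` / `"video_uri"` keys are all hypothetical names chosen for the example, and the fake status source stands in for a real status check.

```python
import time
from typing import Callable, Optional


def wait_for_operation(
    get_operation: Callable[[str], dict],
    operation_name: str,
    interval_s: float = 5.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the operation reports done, then return its video URI.

    Returns None if the timeout elapses first. The "done" and "video_uri"
    keys are assumptions for this sketch, not a real response schema.
    """
    waited = 0.0
    while True:
        status = get_operation(operation_name)
        if status.get("done"):
            return status.get("video_uri")
        if waited >= timeout_s:
            return None
        sleep(interval_s)
        waited += interval_s


# Fake status source that completes on the third check (illustration only).
_responses = iter([
    {"done": False},
    {"done": False},
    {"done": True, "video_uri": "https://example.invalid/video.mp4"},
])
result = wait_for_operation(
    lambda name: next(_responses),
    "operations/demo",       # hypothetical operation name
    interval_s=1.0,
    timeout_s=10.0,
    sleep=lambda s: None,    # skip real waiting in the demo
)
print(result)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in practice the interval and timeout would be tuned to how long video generation usually takes.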

Setup

Setup guide

  1. Open your Switchy workspace and navigate to Settings > Integrations.
  2. Locate Gemini in the MCP directory and click Connect.
  3. Because this MCP uses no auth flow in Switchy, the connection completes instantly, with no OAuth redirect or key entry.
  4. Ensure your team has a Gemini API account with billing enabled; the MCP relays requests to Google's endpoints using credentials you manage outside Switchy.
  5. Return to any Space and type '@Gemini list models' to confirm the connection works. You should see a list of available Gemini and Veo models.
  6. To generate text, try '@Gemini generate content with the prompt: Write a product tagline for a smart thermostat' and watch the response stream in.
  7. For images or videos, use '@Gemini generate image' or '@Gemini generate videos' with a descriptive prompt; video operations return an ID you can check later with '@Gemini get videos operation'.

What teammates see: by default, memories from Gemini are scoped to the Space (PROJECT visibility) - you can mark any memory PRIVATE or share it ORG-wide.

Works well with

Top models

Compatibility data appears once enough Spaces have used this MCP together with a given model.

How Switchy teams use it

Not enough Spaces yet to publish anonymised usage stats (we require ≥ 50 Spaces per week).

Starter prompts

Draft Product Announcement

@Gemini generate content with the prompt: Write a 150-word product announcement email for our new project management dashboard, emphasizing speed and collaboration. Use a professional but friendly tone.

Create Social Media Visual

@Gemini generate image with the prompt: A minimalist illustration of a team collaborating around a digital whiteboard, pastel color palette, flat design style. Aspect ratio 1:1.

Generate Explainer Video

@Gemini generate videos with the prompt: A 10-second animation showing a smartphone receiving a notification, then a hand tapping to open a task list. Clean, modern aesthetic.

Embed Support Articles

@Gemini embed content with the text: 'How to reset your password: Navigate to Settings, click Security, then choose Reset Password and follow the email link.' Return the embedding vector.

Check Token Budget

@Gemini count tokens in this text: 'Please summarize the following 50-page research report on renewable energy trends, highlighting key statistics and policy recommendations.' How many tokens is that?

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

@Gemini list all available models and tell me which one is best for generating technical documentation from code snippets

Output

Based on the available Gemini models, here's what I found:

**Available Models:**
- Gemini 2.0 Flash (text): 1M token context, optimized for speed
- Gemini 1.5 Pro (text): 2M token context, best for complex reasoning
- Gemini 2.5 Flash (multimodal): Supports text and image generation
- Veo 2.0 (video): Video generation from text prompts

**Recommendation:** For technical documentation from code, I'd suggest **Gemini 1.5 Pro**. Its 2M token context window handles large codebases well, and its reasoning capabilities excel at explaining complex technical concepts. The Flash models are faster but may sacrifice depth in technical explanations.

Notes

This example shows the MCP's model discovery capability, which is essential before making generation requests. Note that model availability and capabilities change as Google updates their API — the list returned reflects your API key's access level and regional availability at query time.

Prompt

@Gemini generate an image of a modern workspace with natural lighting, plants, and minimalist furniture in 16:9 aspect ratio and save it to workspace-concept.png

Output

Image generated successfully using Gemini 2.5 Flash image preview model.

**Generation Details:**
- Prompt: "modern workspace with natural lighting, plants, and minimalist furniture"
- Aspect ratio: 16:9 (1920x1080)
- Safety settings: Default (blocks harmful content)
- File saved: workspace-concept.png (2.3 MB)

**Result:** Created a photorealistic rendering showing a bright office space with floor-to-ceiling windows, potted fiddle-leaf fig, white desk with laptop, and Scandinavian-style chair. The composition emphasizes natural light and clean lines per your minimalist requirement.

Notes

Image generation requires no auth flow in Switchy but is subject to Google's safety filters and content policies. The MCP saves files locally to your specified path. Generation typically takes 8-15 seconds. If the prompt violates safety guidelines (violence, explicit content), the request will be blocked with a policy violation error.

Prompt

@Gemini count the tokens in our entire project README and estimate the cost if we use it as context in 50 Gemini Pro API calls this month

Output

Token analysis complete:

**README.md Token Count:**
- Total tokens: 3,847 tokens (using Gemini tokenization)
- Character count: ~18,200 characters
- Estimated words: ~3,100 words

**Cost Estimate (50 API calls with Gemini 1.5 Pro):**
- Input tokens per call: 3,847
- Total monthly input: 192,350 tokens
- Estimated cost: $0.24 (at $0.00125 per 1K input tokens)
- Output tokens not included (varies by response length)

**Recommendation:** Your README fits comfortably within Gemini Pro's context window. Consider caching this content if you're making repeated calls with the same context to reduce costs by ~90%.

Notes

Token counting uses Gemini's actual tokenizer, so counts are accurate for cost estimation and context limit planning. This is particularly useful before making batch API calls or when optimizing prompts. The cost calculation assumes current Gemini API pricing, which Google may adjust. Output token costs depend on response length and aren't included in this estimate.
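
The arithmetic behind the illustrative estimate above can be reproduced with a few lines. This is a sketch only: `estimate_input_cost` is a hypothetical helper, and the price of $0.00125 per 1K input tokens is taken from the example output, not from a current price list.

```python
from typing import Tuple


def estimate_input_cost(
    tokens_per_call: int, calls: int, price_per_1k_usd: float
) -> Tuple[int, float]:
    """Return (total input tokens, estimated input cost in USD)."""
    total_tokens = tokens_per_call * calls
    cost_usd = total_tokens / 1000 * price_per_1k_usd
    return total_tokens, cost_usd


# Figures from the example: 3,847 tokens per call, 50 calls,
# $0.00125 per 1K input tokens (example price, may not be current).
total, cost = estimate_input_cost(3847, 50, 0.00125)
print(f"{total:,} tokens, ${cost:.2f}")
```

Output tokens are left out on purpose, matching the example; a fuller estimate would add a second term for expected response length.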

Use-case deep-dives

Content team batch video production

When Gemini MCP wins for social media video pipelines

A 3-person content team at a B2B SaaS company needs to produce 20 short explainer videos per month for LinkedIn and Twitter. They write the scripts in Notion, then use Switchy to batch-generate videos with Veo, track operation status, and download finished files to a shared Drive folder. The MCP's Generate Videos and Wait For Video tools handle the async workflow without manual polling, and Get Videos Operation is there for one-off status checks. Count Tokens helps them stay under prompt limits when feeding in long product descriptions. This setup works until you hit Veo's rate limits (check the vendor docs for your tier). If you're shipping more than 50 videos a month, you'll need to stagger requests or upgrade your Gemini API plan. For teams under that threshold, this MCP turns a 2-day video production cycle into a 2-hour batch job.
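
One simple way to stagger a batch is to assign each prompt a start offset so that only a fixed number of requests begin in any window. The sketch below assumes a limit of 5 requests per 60 seconds purely for illustration; the real limit depends on your tier, and `stagger` is a hypothetical helper, not part of the MCP.

```python
from typing import List, Sequence, Tuple


def stagger(
    prompts: Sequence[str], per_window: int, window_s: float
) -> List[Tuple[float, str]]:
    """Pair each prompt with a start offset (seconds) so that at most
    `per_window` requests start within any one window."""
    if per_window < 1:
        raise ValueError("per_window must be at least 1")
    return [
        ((index // per_window) * window_s, prompt)
        for index, prompt in enumerate(prompts)
    ]


# 20 scripts, assumed limit of 5 requests per 60 s (illustrative numbers).
prompts = [f"Explainer video {n}" for n in range(1, 21)]
schedule = stagger(prompts, per_window=5, window_s=60.0)
for offset, prompt in schedule[:6]:
    print(f"t+{offset:.0f}s  {prompt}")
```

With these numbers the last group starts three minutes after the first, which keeps a 20-video batch well inside a short working session.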

Support team knowledge base search

Embed Content for semantic ticket routing at scale

A 6-person support team at a fintech startup uses Switchy to route incoming Zendesk tickets to the right specialist. They embed 400 help articles with the Embed Content tool, then compare incoming ticket text against that vector store to surface the 3 most relevant docs. The MCP's Generate Content tool drafts a reply using those articles as context. This works well when your knowledge base is stable (under 1,000 articles) and you're not re-embedding daily. If your docs change hourly or you're managing 10,000+ articles, the embedding cost and latency add up fast. For teams with a stable knowledge base and under 200 tickets a day, this MCP gives you semantic search without standing up a separate vector database. You'll pay per embed call, so budget accordingly.
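
The "surface the 3 most relevant docs" step is typically a nearest-neighbour lookup over the stored vectors. The sketch below uses cosine similarity over toy 3-dimensional vectors; real embeddings have far more dimensions, and the article titles, vectors, and helper names are made up for illustration rather than taken from the MCP.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], articles: Dict[str, Sequence[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k articles most similar to the query, best first."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in articles.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for stored article embeddings (hypothetical data).
articles = {
    "Reset your password": [0.9, 0.1, 0.0],
    "Update billing details": [0.1, 0.9, 0.0],
    "Enable two-factor auth": [0.7, 0.0, 0.7],
    "Export transaction history": [0.0, 0.6, 0.8],
}
ticket_vector = [1.0, 0.0, 0.1]  # stand-in for an embedded ticket
matches = top_k(ticket_vector, articles, k=3)
for title, score in matches:
    print(f"{score:.3f}  {title}")
```

A linear scan like this is fine for a few hundred articles; at the 10,000+ scale mentioned above, an indexed vector store becomes the more practical choice.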

Product team mockup generation sprint

When Generate Image beats Figma for early-stage concepts

A 4-person product team at a consumer app startup runs a 2-day design sprint to explore 15 different onboarding flow concepts. Instead of waiting on a designer, they use Switchy to generate mockups with the Generate Image tool, feeding in text prompts like 'mobile app onboarding screen, minimalist, blue accent, user profile form'. They iterate on aspect ratio and safety settings, then drop the best 5 into Figma for refinement. This MCP is fastest when you need throwaway visuals for internal discussion, not pixel-perfect production assets. If you're generating images for customer-facing marketing, you'll still need a designer to polish. For teams that need 10-20 concept images in a day to unblock a decision, this MCP cuts the bottleneck from 3 days to 3 hours.

Frequently asked

What does the Gemini MCP let me do in Switchy?

It connects Switchy to Google's Gemini and Veo models for text generation, image creation, video synthesis, and embeddings. You can generate content with Gemini Flash or Pro, create images with Gemini 2.5 Flash, produce videos with Veo, count tokens for cost estimation, and generate embeddings for semantic search. All nine tools run directly in your Switchy workspace without switching to Google's console.

Do I need a Google API key to use this MCP?

Yes. This MCP requires no OAuth flow, but you must provide a Gemini API key from Google AI Studio. The key authenticates all requests to Gemini and Veo models. Without it, none of the nine tools will function. You'll add the key once in Switchy's MCP settings, and it applies to all team members using this integration.

Can the Gemini MCP generate videos longer than 10 seconds?

Veo's duration limits depend on Google's current model constraints, typically capped at a few seconds for preview models. The MCP exposes whatever duration range Veo supports at the API level. If you need longer videos, you'll generate multiple clips and stitch them externally. Check the List Models tool output for the exact duration ceiling of each Veo model version.

Why use this MCP instead of calling Gemini's API directly?

The MCP wraps Gemini and Veo endpoints so your team can invoke them from Switchy's chat interface without writing code or managing API clients. You skip the boilerplate of token counting, operation polling for video generation, and model discovery. If you already have a custom pipeline or need sub-second latency, stick with direct API calls.

Does video generation count against my Switchy plan limits?

No. Switchy doesn't meter Gemini or Veo usage; Google bills you directly based on your API key's quota and pricing tier. Your Switchy plan limits apply only to Switchy-native features like message history and workspace seats. One team member should own the API key to centralise billing and avoid duplicate charges across multiple keys.

Data last verified 607 hours ago. Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.