
RunPod

The Cloud Built for AI - GPU cloud computing platform for AI and machine learning workloads

Verdict

RunPod gives your team programmatic access to GPU and CPU infrastructure for ML workloads. @mention it in a Space to spin up pods, check cluster status, manage templates, or query available hardware specs without leaving the conversation. Engineers running training jobs or inference endpoints get the most value — they can provision resources, monitor costs, and troubleshoot deployments directly from chat. Setup requires a RunPod API key with appropriate permissions; the MCP can't manage billing or modify account-level settings.

Common use cases

  • Provision GPU pods for training runs from chat
  • Check cluster status during incident response
  • Compare GPU pricing before deploying models
  • Store API keys as RunPod secrets securely
  • Delete unused templates to clean up workspace

Integration

Vendor: RunPod
Category: developer-tools
Auth: API_KEY
Tools: 13
Composio slug: runpod

Tools

  • Create RunPod Cluster

    Tool to create a new GPU cluster for multi-node distributed computing workloads on RunPod. Use when you need to deploy multiple pods with shared configuration for parallel processing, ML training, or HPC workloads.

  • Create Secret

    Tool to create a new secure secret in RunPod for credential management. Use when you need to store sensitive values like API keys, passwords, or tokens that will be accessible in pods and endpoints via environment variables (RUNPOD_SECRET_<

  • Delete Container Registry Authentication
    destructive

    Tool to delete container registry authentication from RunPod. Use when you need to remove stored registry credentials.

  • Delete Template
    destructive

    Tool to remove a RunPod template via GraphQL mutation. Use when you need to delete a template that is no longer needed. The template must not be in use by any pods or assigned to any serverless endpoints, otherwise the operation will fail.

  • Get authenticated user info

    Retrieve basic information about the authenticated user including ID, email, and security settings. Use this to get the current user's ID, email address, terms of service status, and MFA settings. Note: Access to financial fields (balance,

  • Get GPU Types

    Tool to retrieve available GPU types and their specifications, pricing, and availability from RunPod. Use when you need to find GPU options for deployment.

  • Get Pod Details

    Retrieve details of a specific RunPod pod by its unique pod ID. Returns pod configuration including GPU count, memory, cost, and status. Use when you need to check the current state or configuration of an existing pod.

  • List CPU Types

    Tool to retrieve available CPU types and their specifications from RunPod. Use when you need to view CPU options for provisioning pods or selecting hardware configurations.

  • Save Container Registry Authentication

    Tool to save container registry authentication credentials for accessing private Docker images in RunPod. Use when you need to store credentials for a private container registry.

  • Save Serverless Endpoint

    Tool to create or update a RunPod serverless endpoint with GPU configuration and scaling settings. Use when configuring new GPU-accelerated serverless endpoints or modifying existing endpoint parameters. Include 'id' parameter to update an

  • Save Template

    Tool to create a new RunPod template or update an existing one with container configuration. Use when you need to define reusable pod/serverless configurations with specific images, environment variables, and resource allocations. For serve

  • Update Registry Auth

    Tool to update existing container registry authentication credentials in RunPod. Use when you need to modify the username or password for an existing registry authentication.

  • Update User Settings

    Tool to update current user settings (e.g., SSH public key) in RunPod. Use when you need to configure SSH access to pods by setting the user's SSH public key.
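
The Create Secret description above says stored secrets reach pods and endpoints as environment variables whose names begin with RUNPOD_SECRET_ (the rest of the name is cut off in that description). As a minimal sketch, assuming only that prefix, code running inside a pod might collect them as shown here. The helper name collect_runpod_secrets is hypothetical and not part of RunPod or this integration.

```python
import os
from typing import Dict, Mapping, Optional

# Prefix taken from the Create Secret description; the suffix format is not shown there.
SECRET_PREFIX = "RUNPOD_SECRET_"


def collect_runpod_secrets(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return {suffix: value} for every variable whose name starts with SECRET_PREFIX.

    Hypothetical helper: pass a mapping for testing, or omit it to read os.environ.
    """
    source = os.environ if env is None else env
    return {
        key[len(SECRET_PREFIX):]: value
        for key, value in source.items()
        if key.startswith(SECRET_PREFIX)
    }


if __name__ == "__main__":
    # Print names only, never values, so secrets do not end up in logs.
    for name in sorted(collect_runpod_secrets()):
        print(name)
```

Passing an explicit mapping keeps the helper testable outside a pod; inside one, calling it with no arguments reads the real environment.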

Setup

Setup guide

  1. In Switchy, open your workspace settings and navigate to the MCP Integrations page.
  2. Click 'Add Integration' and select RunPod from the developer tools category.
  3. Log into your RunPod account, go to Settings > API Keys, and generate a new key with read and write permissions for pods, templates, and secrets.
  4. Paste the API key into Switchy's connection form and click 'Connect'.
  5. Return to any Space and type '@RunPod get authenticated user info' to confirm the connection works.
  6. If the MCP returns your user ID and email, you're ready to invoke other tools like creating pods or listing GPU types.
  7. To use the MCP in a conversation, @mention RunPod followed by your request, for example '@RunPod list available GPU types' or '@RunPod create a pod with 1x A100'.

What teammates see: by default, memories from RunPod are scoped to the Space (PROJECT visibility) - you can mark any memory PRIVATE or share it ORG-wide.

Works well with

Top models

Compatibility data appears once enough Spaces have used this MCP together with a given model.

How Switchy teams use it

Not enough Spaces yet to publish anonymised usage stats (we require ≥ 50 Spaces per week).

Starter prompts

List Available GPUs

@RunPod show me all available GPU types with their current pricing and availability status

Create Training Pod

@RunPod create a pod with 1x A100 80GB GPU, 32GB RAM, and 200GB disk using the PyTorch 2.0 template

Check Pod Status

@RunPod get details for pod ID abc123xyz including current status, GPU utilization, and hourly cost

Store Secret Credential

@RunPod create a new secret named 'huggingface_token' with the value from my clipboard for use in training pods

Deploy Multi-Node Cluster

@RunPod create a cluster with 4 nodes, each with 1x A6000 GPU, for distributed training of a large language model

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

@RunPod what GPU types are available right now with at least 24GB VRAM, and what's the hourly cost for each?

Output

Based on current RunPod availability, here are GPUs with 24GB+ VRAM:

  • RTX A5000 (24GB) - $0.34/hr community cloud, $0.76/hr secure cloud
  • RTX 4090 (24GB) - $0.69/hr community cloud
  • A40 (48GB) - $0.79/hr secure cloud
  • L40 (48GB) - $0.89/hr secure cloud

Community cloud pricing is lower but availability fluctuates. Secure cloud offers guaranteed uptime. All prices are on-demand; spot pricing can be 50-70% cheaper when available. This snapshot reflects real-time inventory; options may change within minutes during high-demand periods.

Notes

This example shows the MCP's read capability for infrastructure planning. GPU availability and pricing are volatile; the response reflects a point-in-time query, not a static catalog. Requires API key with read access to GPU types endpoint. Use this to compare options before provisioning, but verify availability again at deployment time.
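
The comparison step described in these notes can be sketched in a few lines. The figures below are copied from the illustrative output above, not from a live query, and the GpuOffer type and shortlist function are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GpuOffer:
    name: str
    vram_gb: int
    community_usd_hr: Optional[float]  # None when no community cloud price was listed
    secure_usd_hr: Optional[float]     # None when no secure cloud price was listed

    def cheapest(self) -> float:
        """Lowest listed on-demand price across the two clouds."""
        return min(p for p in (self.community_usd_hr, self.secure_usd_hr) if p is not None)


# Illustrative snapshot from the example output; real prices change frequently.
OFFERS = [
    GpuOffer("RTX A5000", 24, 0.34, 0.76),
    GpuOffer("RTX 4090", 24, 0.69, None),
    GpuOffer("A40", 48, None, 0.79),
    GpuOffer("L40", 48, None, 0.89),
]


def shortlist(offers: List[GpuOffer], min_vram_gb: int) -> List[GpuOffer]:
    """Keep offers with enough VRAM, ordered from cheapest to most expensive."""
    eligible = [o for o in offers if o.vram_gb >= min_vram_gb]
    return sorted(eligible, key=GpuOffer.cheapest)


if __name__ == "__main__":
    for offer in shortlist(OFFERS, min_vram_gb=24):
        print(f"{offer.name}: ${offer.cheapest():.2f}/hr")
```

Because the snapshot is point-in-time, a shortlist like this is only a starting point; availability should be checked again when the pod is actually created.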

Prompt

@RunPod create a new pod with an RTX 4090, 50GB disk, using the pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime image. Name it 'training-experiment-01'.

Output

Pod 'training-experiment-01' created successfully.

  • Pod ID: abc123xyz789
  • GPU: RTX 4090 (24GB VRAM)
  • Disk: 50GB
  • Container: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
  • Status: Starting (provisioning GPU resources)
  • Cost: ~$0.69/hr on-demand

The pod is initializing and will be accessible via SSH and Jupyter in 2-3 minutes. You'll be billed from the moment it enters 'running' state. Remember to stop or terminate the pod when idle to avoid charges; RunPod bills by the second while running.

Notes

This demonstrates the MCP's write capability for provisioning compute. Creating a pod is a billable action that starts the usage clock immediately. The MCP doesn't auto-stop pods, so users must manage lifecycle to control costs. Requires API key with pod creation permissions. If the requested GPU type is unavailable, the operation fails—consider checking availability first with the GPU types tool.

Prompt

@RunPod check the status of all my running pods and calculate my total hourly burn rate across them.

Output

You have 3 active pods:

  1. training-experiment-01 (RTX 4090) - Running 4h 23m - $0.69/hr
  2. inference-server (A40) - Running 12h 8m - $0.79/hr
  3. data-prep (CPU-only) - Running 1h 45m - $0.02/hr

Total hourly rate: $1.50/hr
Projected daily cost if left running: $36.00

Pod #2 has been running longest and accounts for 53% of your current spend. Consider whether 'data-prep' still needs to be active; it's been idle (0% GPU utilization) for the past hour based on runtime duration. Stopping idle pods immediately would reduce your rate to $1.48/hr.

Notes

This synthesis example pairs the MCP's pod listing capability with the AI's cost analysis reasoning. The MCP provides raw pod data (status, runtime, GPU type); the AI calculates burn rate and flags optimization opportunities. Useful for budget monitoring, but the MCP doesn't access real-time utilization metrics—the 'idle' inference here is illustrative. Requires read access to list all pods under your account.
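
The burn-rate arithmetic in this example is simple enough to check by hand. A minimal sketch follows, using the three hourly rates from the illustrative output; the function names are hypothetical, and Decimal is used so the cent values add up exactly.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict

# Hourly rates copied from the illustrative output above (USD per hour).
PODS: Dict[str, Decimal] = {
    "training-experiment-01": Decimal("0.69"),
    "inference-server": Decimal("0.79"),
    "data-prep": Decimal("0.02"),
}


def hourly_total(pods: Dict[str, Decimal]) -> Decimal:
    """Sum of hourly rates across all running pods."""
    return sum(pods.values(), Decimal("0"))


def daily_projection(pods: Dict[str, Decimal]) -> Decimal:
    """Cost of leaving every pod running for 24 hours."""
    return hourly_total(pods) * 24


def share_percent(pods: Dict[str, Decimal], name: str) -> int:
    """One pod's share of the hourly total, rounded to a whole percent."""
    pct = pods[name] / hourly_total(pods) * 100
    return int(pct.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    print(f"Total: ${hourly_total(PODS)}/hr")
    print(f"Daily: ${daily_projection(PODS)}")
    print(f"inference-server share: {share_percent(PODS, 'inference-server')}%")
    remaining = {k: v for k, v in PODS.items() if k != "data-prep"}
    print(f"Without data-prep: ${hourly_total(remaining)}/hr")
```

These reproduce the figures in the output: $1.50/hr in total, $36.00 per day, a 53% share for the A40 pod, and $1.48/hr once data-prep is stopped.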

Use-case deep-dives

ML training runs for 2-person research team

When RunPod MCP beats manual GPU provisioning for experiments

A two-person ML research team running daily training experiments needs to spin up H100s for 4-6 hours, then tear them down. The RunPod MCP is the right call here because the Create RunPod Cluster and Get GPU Types tools let you script the full lifecycle—check spot pricing, launch the cluster, monitor pod status, delete when done—without leaving your Switchy workspace. The API_KEY auth means one credential for both researchers. The trade-off: if you're running multi-week training jobs that need checkpoint recovery and complex orchestration, you'll hit the limits of the 13-tool scope and want Kubernetes or SageMaker instead. But for short-burst experiments where you're comparing architectures or tuning hyperparameters, this MCP turns GPU provisioning into a 30-second chat command instead of a 10-minute console workflow.

Customer demo environment provisioning

How RunPod MCP handles ephemeral inference endpoints for sales

A 5-person sales engineering team needs to spin up customer-specific inference demos: each prospect gets a dedicated endpoint running their model variant for a 2-week trial. The RunPod MCP wins here because the Create Secret tool lets you store per-customer API keys, and the pod lifecycle tools (Create, Get Pod Details, Delete Template) let your SE team provision and tear down environments from Switchy without touching the RunPod console. The 13-tool count is enough for this workflow: create the pod, check status, hand the endpoint to the prospect, delete when the trial ends. The boundary: if you're managing 50+ concurrent demos or need complex load balancing across regions, the MCP's tooling gets thin and you'll want Terraform or a custom orchestration layer. For teams running 5-15 demos at a time, this MCP turns a 20-minute provisioning task into a 2-minute Switchy command.

Batch video processing for content team

When RunPod MCP fits weekly render jobs at small scale

A 3-person video production team renders 10-20 client videos every Friday—each video needs 2-4 hours on an A6000 to apply effects and export at 4K. The RunPod MCP is a good fit because the Get GPU Types and Create RunPod Cluster tools let you check A6000 availability and spin up exactly the GPU count you need for that week's batch, then delete the cluster Saturday morning. The List CPU Types tool helps you right-size the instance if some videos are CPU-bound in the encode step. The constraint: if your render queue grows past 50 videos a week or you need real-time status updates across multiple concurrent jobs, the 13-tool MCP doesn't give you queue management or job orchestration—you'd want a dedicated render farm tool. For weekly batches under 25 videos, this MCP turns GPU rental into a Friday-morning Switchy ritual instead of a monthly AWS bill negotiation.
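
To size the weekly batch described here, it helps to turn the stated ranges (10-20 videos, 2-4 hours each) into GPU-hours. The sketch below does only that arithmetic. The function names are hypothetical, no A6000 price is assumed because none is given here, and the wall-clock figure is a lower bound that assumes the batch splits evenly across GPUs.

```python
from typing import Tuple


def gpu_hours_range(
    min_videos: int, max_videos: int, min_hours_each: int, max_hours_each: int
) -> Tuple[int, int]:
    """Best-case and worst-case total GPU-hours for one batch."""
    return (min_videos * min_hours_each, max_videos * max_hours_each)


def wall_clock_hours(total_gpu_hours: float, gpu_count: int) -> float:
    """Lower bound on elapsed time when work is spread evenly over gpu_count GPUs."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return total_gpu_hours / gpu_count


if __name__ == "__main__":
    low, high = gpu_hours_range(10, 20, 2, 4)
    print(f"Weekly batch: {low}-{high} GPU-hours")
    print(f"Worst case on 4 GPUs: at least {wall_clock_hours(high, 4):.0f} hours")
```

With the ranges above, a week's batch needs between 20 and 80 GPU-hours, so a four-GPU cluster would take at least 20 hours in the worst case. Multiplying by the hourly rate returned by the Get GPU Types tool gives the cost range.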

Frequently asked

What does the RunPod MCP let me do in Switchy?

It lets your team spin up GPU clusters, manage pods, and configure serverless endpoints without leaving the chat. You can check GPU availability, create templates, store secrets, and monitor pod status — basically the core RunPod operations you'd normally do in their dashboard. Useful when you're iterating on ML training runs or deploying inference endpoints and want to keep the workflow in one place.

Do I need a RunPod admin account to connect this MCP?

You need a RunPod API key with write permissions. RunPod doesn't have granular role-based access — any API key can create pods, delete templates, and manage secrets. If you're on a team plan, whoever connects the MCP can spend your RunPod credits, so connect it from an account you trust. The key goes into Switchy's credential store; team members don't see it.

Can the MCP start a training job on a specific GPU type?

Yes. The MCP can list available GPU types, check pricing and availability, then create a pod or cluster targeting that hardware. It can't upload your training code directly — you still reference a Docker image from a registry — but it handles the infrastructure provisioning. For multi-node jobs, use the cluster creation tool with your desired GPU count and shared config.

How is this different from using RunPod's web dashboard?

The dashboard is better for browsing GPU specs and monitoring long-running jobs visually. The MCP is faster when you're already in a Switchy chat and want to spin up a pod, check status, or delete a template without context-switching. It also lets you script repetitive tasks — like creating a cluster with the same config every Monday — inside a shared workspace where your team can see the history.

Who on the team should connect the RunPod MCP?

Whoever manages your RunPod billing or has the API key. Since the MCP can create expensive GPU clusters, connect it from an account with spending authority. Once connected, any Switchy team member can invoke the tools, so set workspace permissions accordingly. If you want tighter control, create a separate RunPod project with a budget cap and connect that key instead.

Data last verified 607 hours ago. Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.