ElevenLabs
Create natural AI voices instantly in any language - perfect for video creators, developers, and businesses.
Verdict
Common use cases
- Generate podcast intro voiceovers from scripts
- Clone team member voices for training videos
- Convert blog posts into audio versions
- Prototype narration for video projects
- Build pronunciation dictionaries for brand terms
Integration
- Vendor: ElevenLabs
- Category: other
- Auth: API_KEY
- Tools: 50
- Composio slug: elevenlabs
Tools
- Add a pronunciation dictionary from file
Adds a new pronunciation dictionary from a lexicon file to improve speech synthesis accuracy.
- Add a voice
Adds a custom voice, requiring a `name` and a `files` list with at least one audio sample, to initiate cloning; returns `voice id` but voice is not immediately usable for synthesis.
- Add new project with attributes
Use to create a new elevenlabs project for text-to-speech synthesis (e.g., audiobooks); a project `name` is required by the api for creation, and content can be initialized using `from url` or `from document`.
- Add rules to the pronunciation dictionary
Adds one or more custom pronunciation rules (alias or phoneme) to an existing pronunciation dictionary.
- Add sharing voice
Adds an existing, shareable voice to a specified user's elevenlabs account library under a new custom name, requiring the user's public id and the voice id.
- Convert a project
Converts an existing elevenlabs studio project, including all its chapters and using its configured settings and voices, into speech.
- Convert chapter to audio
Converts the textual content of a chapter, identified by `chapter id` within a `project id`, into audio format.
- Create an AudioNative enabled project
Creates an elevenlabs audionative project, generating an embeddable audio player from a provided content file using text-to-speech, allowing customization of player appearance, audio settings, and conversion options.
- Create a previously generated voice
Finalizes the creation of a voice using its `generated voice id` from a previous generation step by assigning a name, description, and optional labels.
- Delete a dubbing project (destructive)
Permanently deletes a dubbing project by its id; this action is irreversible and the project cannot be recovered.
- Delete chapter from project (destructive)
Irreversibly deletes a specific, existing chapter from an existing project, typically to remove unwanted or obsolete content.
- Delete history item (destructive)
Permanently deletes a specific history item (including its audio file and metadata) using its `history item id`; this operation is irreversible and should be used with caution.
- Delete project by id (destructive)
Use to irreversibly delete a specific project by its `project id`; the project must exist and be accessible, and this action cannot be undone.
- Delete voice by id (destructive)
Permanently and irreversibly deletes a specific custom voice using its `voice id`; the voice must exist and the authenticated user must have permission to delete it.
- Delete voice sample (destructive)
Permanently deletes a specific voice sample for a given voice id; this action is irreversible.
- Download history items
Downloads audio clips from history by id(s), returning a single file or a zip archive, with an optional output format (e.g., 'wav'); provides only audio content, no metadata.
- Dub a video or an audio file
Dub a video or audio file into a specified target language, requiring 'file' or 'source url', 'target lang', and 'csv file' if 'mode' is 'manual'.
- Dub a video or an audio file (deprecated)
Deprecated: use `dub a video or an audio file` instead; dubs a video/audio file, requiring 'file' or 'source url', 'target lang', and 'csv file' if 'mode' is 'manual'.
- Edit voice
Updates the name, audio files, description, or labels for an existing voice model specified by `voice id`.
- Edit voice settings
Edits key voice settings (e.g., stability, similarity enhancement, style exaggeration, speaker boost) for an existing voice, affecting all future audio generated with that voice id.
- Generate a random voice
Generates a unique, random elevenlabs text-to-speech voice based on input text and specified voice characteristics.
- Get audio from history item
Retrieves the audio content for a specific history item from elevenlabs, using a `history item id` that must correspond to a previously generated audio.
- Get chapter by ID
Fetches comprehensive details for a specific chapter within a given project, including its metadata (name, id), conversion status, progress, download availability, and content statistics.
- Get chapters by project id
Retrieves a list of all chapters, their details, and conversion status for a project, useful for managing content or tracking progress.
- Get chapter snapshots
Retrieves all saved version snapshots for a specific chapter within a given project, enabling review of its history or reversion to prior states.
- Get default voice settings
Retrieves the elevenlabs text-to-speech service's default voice settings (stability, similarity boost, style, speaker boost) that are applied when no voice-specific or request-specific settings are provided.
- Get dubbed audio for a language
Retrieves an existing dubbed audio file for a specific `dubbing id` and `language code`.
- Get dubbing project metadata
Retrieves metadata and status for a specific dubbing project by its id.
- Get dubbing transcript by language
Retrieves the textual transcript for a specified dubbing project and language, if one exists for that language in the project.
- Get generated items
Retrieves metadata for a list of generated audio items from history, supporting pagination and optional filtering by voice id.
- Get history item by id
Retrieves detailed information (excluding the audio file) for a specific audio generation history item from elevenlabs, using its unique id.
- Get models
Retrieves a detailed list of all available elevenlabs text-to-speech (tts) models and their capabilities.
- Get models (deprecated)
Deprecated: use the 'get models' action instead; formerly retrieved available elevenlabs text-to-speech (tts) models.
- Get project by ID
Use to retrieve all details for a specific project, including its chapters and their conversion statuses, by providing the project's unique id.
- Get projects
Fetches a list of all projects and their details associated with the user's elevenlabs account; this is a read-only operation.
- Get project snapshots
Retrieves all available snapshots (saved states or versions) for an existing project, enabling history tracking, version comparison, or accessing specific states for playback/processing, particularly in text-to-speech workflows.
- Get pronunciation dictionaries
Retrieves a paginated list of pronunciation dictionaries, used to customize how specific words or phrases are pronounced by the text-to-speech (tts) engine.
- Get pronunciation dictionary metadata
Retrieves metadata for a specific, existing pronunciation dictionary from elevenlabs using its id.
- Get pronunciation dictionary version
Downloads the pronunciation lexicon specification (pls) file for an existing version of a pronunciation dictionary from elevenlabs, used to customize tts pronunciation.
- Get sample audio
Retrieves the audio for a given `sample id` that must belong to the specified `voice id`.
- Get shared voices
Retrieves a paginated and filterable list of shared voices from the elevenlabs voice library.
- Get sso provider admin
Retrieves the sso provider configuration for a specified workspace, typically for review purposes, and will indicate if no configuration exists.
- Get user info
Retrieves detailed information about the authenticated elevenlabs user's account, including subscription, usage, api key, and status.
- Get user info (deprecated)
Deprecated: retrieves authenticated user's account details; use 'get user info' instead.
- Get user profile by handle
Retrieves the public profile information for an existing elevenlabs user based on their unique handle.
- Get user subscription info
Retrieves detailed subscription information for the currently authenticated elevenlabs user.
- Get voice
Retrieves comprehensive details for a specific, existing voice by its `voice id`, optionally including its settings.
- Get voices list
Retrieves a list of all available voices along with their detailed attributes and settings.
- Text to speech
Converts text to speech using a specified elevenlabs voice and model, returning a downloadable audio file.
- Text to speech stream
Converts text to a spoken audio stream, allowing latency optimization, specific output formats (some tier-dependent), and custom pronunciations; ensure the chosen model supports text-to-speech and text is preferably under 5000 characters.
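The 5000-character guideline for streaming can be handled by splitting longer text before it is sent. This is a minimal sketch, assuming the guideline is treated as a hard cap and that sentence endings are acceptable split points. `split_for_tts` and `MAX_CHARS` are illustrative names, not part of ElevenLabs or Switchy.

```python
import re

MAX_CHARS = 5000  # guideline from the tool description, treated here as a hard cap (assumption)


def split_for_tts(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the cap is cut at fixed offsets.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then go out as its own request. Splitting at sentence ends rather than fixed offsets avoids cutting a word or phrase in half, which would otherwise be audible where the clips join.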
Setup
Setup guide
1. In Switchy, navigate to Settings > Integrations and select ElevenLabs from the MCP directory.
2. You'll be prompted to enter your ElevenLabs API key; generate one from your ElevenLabs account dashboard under Profile > API Keys.
3. Paste the key into Switchy and click Connect; the integration tests the key by fetching your available voices.
4. Once connected, open any Space and type '@ElevenLabs' to confirm the MCP responds with available tools.
5. Test synthesis by asking '@ElevenLabs generate a 10-second voiceover saying hello in a neutral voice'; you'll receive an audio file link if the connection works.
6. Check your ElevenLabs account credits before heavy use; each synthesis consumes characters from your plan quota.
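To check a key yourself before pasting it in, you can make the same kind of call the connection test is described as making: list the voices the key can see. This is a rough sketch against the public ElevenLabs REST API, which takes the key in an `xi-api-key` header. How Switchy performs its own check internally is not documented here, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def build_voices_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request listing the voices visible to this key."""
    return urllib.request.Request(
        f"{API_BASE}/voices",
        headers={"xi-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


def key_is_valid(api_key: str, timeout: float = 10.0) -> bool:
    """Return True if the voices endpoint accepts the key. This performs a network call."""
    try:
        with urllib.request.urlopen(build_voices_request(api_key), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # Rejected keys come back as an HTTP error status; network failures still raise.
        return False
```

Calling `key_is_valid("your-key")` from a terminal gives a quick yes/no before you connect the integration.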
What teammates see: by default, memories from ElevenLabs are scoped to the Space (PROJECT visibility) - you can mark any memory PRIVATE or share it ORG-wide.
Works well with
Top models
Compatibility data appears once enough Spaces have used this MCP together with a given model.
How Switchy teams use it
Starter prompts
Generate Explainer Voiceover
@ElevenLabs generate a voiceover for this script using a professional male voice: [paste your script here]. Export as MP3.
Clone Custom Voice
@ElevenLabs add a new voice named 'Founder Voice' using these three audio samples I'll upload. Let me know when cloning completes.
Convert Article to Audio
@ElevenLabs create a new project called 'Blog Audio' and convert this article into speech with natural pacing: [paste article URL or text].
Add Brand Pronunciations
@ElevenLabs create a pronunciation dictionary and add these rules: 'Switchy' sounds like 'switch-ee', 'MCP' expands to 'Model Context Protocol'.
Test Voice Variations
@ElevenLabs synthesize this tagline in five different voices and send me all versions: 'AI that works the way your team does.'
Example outputs
Illustrative - representative of the model's voice and quality, not literal recordings.
@ElevenLabs convert this 3-paragraph product announcement into speech using the 'Rachel' voice, then give me the audio file URL.
I've converted your product announcement to speech using the Rachel voice. The synthesis took approximately 8 seconds for the 247-word text. The generated audio file is available at: https://api.elevenlabs.io/v1/history/abc123xyz/audio (expires in 30 days). The audio runs 1 minute 42 seconds at the default speaking rate. You can download it directly or embed it in your announcement page.
This example shows basic text-to-speech conversion, the MCP's core capability. You'll need an API key with sufficient character quota — each synthesis deducts from your monthly allowance. The voice name must match one in your ElevenLabs library (default voices or custom clones). Audio URLs expire, so download files you need to keep.
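For a sense of what sits behind a request like this, here is a hedged sketch of the two steps involved: resolving a voice name to its id, then building a text-to-speech request. It assumes the public ElevenLabs REST endpoints (`/v1/voices` and `/v1/text-to-speech/{voice_id}`). The model id, voice id, and key values are placeholders, and the MCP tool's internals may differ.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def resolve_voice_id(voices_payload: dict, name: str) -> str:
    """Find the voice_id for a voice name in a voices-list response body."""
    for voice in voices_payload.get("voices", []):
        if voice.get("name", "").lower() == name.lower():
            return voice["voice_id"]
    raise LookupError(f"no voice named {name!r} in this library")


def build_tts_request(api_key: str, voice_id: str, text: str,
                      model_id: str = "eleven_multilingual_v2") -> urllib.request.Request:
    """Build (but do not send) a text-to-speech POST; the response body would be audio bytes."""
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/text-to-speech/{voice_id}",
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder payload standing in for a real voices-list response.
sample_voices = {"voices": [{"voice_id": "voice-123", "name": "Rachel"}]}
request = build_tts_request("test-key", resolve_voice_id(sample_voices, "rachel"), "Hello.")
print(request.full_url)
```

The name lookup is why a mistyped or missing voice name fails early: the synthesis endpoint is addressed by voice id, not by display name.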
@ElevenLabs create a new audiobook project called 'Q4 Sales Training' and initialize it with chapters from this Google Doc URL, then tell me the project ID.
I've created the audiobook project 'Q4 Sales Training' in your ElevenLabs account. The project ID is prj_abc123def456. I initialized it from the Google Doc you provided, which was parsed into 7 chapters based on heading structure. The project is ready for voice assignment and conversion, but no audio has been generated yet — you'll need to configure voices per chapter and run conversion separately.
This demonstrates the project-based workflow for longer content like audiobooks or training materials. The MCP can create and structure projects, but voice assignment and actual audio generation are separate steps. Be aware that converting a full project consumes character quota proportional to total word count across all chapters.
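Before converting a whole project, a quick estimate helps you avoid running out of quota partway through. The sketch below assumes the quota is counted in characters of chapter text and that a plain `len()` is a close enough proxy. The `character_count` and `character_limit` arguments mirror the usage fields the subscription info is understood to expose, and the function names are illustrative.

```python
def project_character_cost(chapters: dict[str, str]) -> int:
    """Total characters across all chapter texts (assumed billing unit)."""
    return sum(len(text) for text in chapters.values())


def remaining_characters(character_count: int, character_limit: int) -> int:
    """Characters left in the current period, never below zero."""
    return max(character_limit - character_count, 0)


# Made-up chapter text and usage numbers, for illustration only.
chapters = {"Intro": "Welcome to Q4.", "Pricing": "New tiers launch in October."}
cost = project_character_cost(chapters)
left = remaining_characters(character_count=9_990, character_limit=10_000)
print(f"needs {cost} characters, {left} remaining, fits: {cost <= left}")
```

Re-converting after edits pays the same cost again for every chapter you regenerate, so it is worth estimating per chapter rather than per project when content is still changing.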
@ElevenLabs add a pronunciation dictionary rule so 'SQL' is always spoken as 'sequel' instead of letter-by-letter, then confirm it's active.
I've added a pronunciation rule to your dictionary (dict_xyz789) mapping 'SQL' to the phoneme sequence for 'sequel'. The rule is now active and will apply to all future speech synthesis jobs that reference this dictionary. To use it, you'll need to specify this dictionary ID when calling text-to-speech endpoints. Existing audio files won't be affected — only new conversions will pronounce 'SQL' as 'sequel'.
This showcases the pronunciation customization feature, useful for technical terms, brand names, or acronyms. The dictionary must be explicitly attached to synthesis requests to take effect; it's not applied globally by default. Rule matching is case-sensitive, so add a rule for each casing you expect to appear in the text. This is particularly valuable for teams producing training content or documentation audio with domain-specific vocabulary.
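Attaching a dictionary to a synthesis request looks roughly like the sketch below. The `pronunciation_dictionary_locators` field and its `pronunciation_dictionary_id` / `version_id` keys follow the public ElevenLabs text-to-speech API as best understood here, so verify them against the current API reference. The ids and model name are placeholders.

```python
import json


def tts_body_with_dictionary(text: str, model_id: str,
                             dictionary_id: str, version_id: str) -> dict:
    """Build a text-to-speech request body that references one pronunciation dictionary."""
    return {
        "text": text,
        "model_id": model_id,
        # Without this list the dictionary is ignored, even if it exists in the account.
        "pronunciation_dictionary_locators": [
            {"pronunciation_dictionary_id": dictionary_id, "version_id": version_id}
        ],
    }


# Placeholder ids; real values come from the dictionary's metadata.
body = tts_body_with_dictionary(
    "Our SQL workshop starts Monday.",
    model_id="eleven_multilingual_v2",
    dictionary_id="dict_xyz789",
    version_id="version-placeholder",
)
print(json.dumps(body, indent=2))
```

Because the locator pins a specific version, adding rules to a dictionary creates a new version id that later requests need to reference.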
Use-case deep-dives
When ElevenLabs makes sense for weekly content audio
A two-person newsletter team publishing 3,000-word essays twice a week wants to offer an audio version without hiring a voice actor. ElevenLabs is the right call here. The pronunciation dictionary tools let you lock in brand terms and names once, then the project conversion tools batch-process each issue into a hosted audio file. The voice cloning needs at least one good sample recording, but after that initial setup, the workflow is: paste text, run convert, embed player. This works until you're publishing daily or need real-time audio responses—at that frequency, the API call overhead and per-character pricing start to hurt. For twice-weekly long-form, it's a clean win.
ElevenLabs for self-published authors under 100k words
A solo author with a 60,000-word manuscript wants to publish an audiobook without studio costs. ElevenLabs handles this scenario well. The project tools let you structure chapters, apply consistent voice settings, and convert the full book in one pass. The pronunciation rules cover character names and invented terms. The voice library gives you options beyond your own cloned voice if you need multiple characters. The threshold is manuscript length and revision frequency: if your book is over 100k words or you're still editing heavily, the re-conversion costs add up fast. For a finished manuscript under that line, the tool set matches the job and the API key auth keeps it simple for one person to run.
When to skip ElevenLabs for tutorial voice work
A five-person SaaS team ships onboarding videos every sprint and wants to automate the voiceover. ElevenLabs is borderline here. The voice cloning and chapter conversion tools technically work, but the workflow doesn't fit the iteration speed. Each script tweak means a new API call, and the pronunciation dictionary doesn't help with product UI terms that change weekly. You'll spend more time managing voice projects than recording a human take in one pass. If your scripts are stable and you're producing ten videos a month, the math shifts—but for sprint-cadence tutorial content, a human voice or a lighter TTS tool is faster. Save ElevenLabs for long-form content that doesn't change.
Frequently asked
What does the ElevenLabs MCP do in Switchy?
It connects your team's AI workflows to ElevenLabs' text-to-speech API. You can generate audio from text, manage custom voices, create pronunciation dictionaries, and convert entire projects or chapters into speech. Useful if you're building voice-enabled features or producing audio content at scale without leaving your workspace.
Do I need an ElevenLabs API key to use this MCP?
Yes. You'll need to generate an API key from your ElevenLabs account dashboard and paste it into Switchy's connection settings. The key determines which voices, projects, and quotas your team can access. If you're on a free ElevenLabs plan, expect character limits to apply.
Can the MCP clone voices or only use pre-built ones?
It can clone voices. The 'Add a voice' tool lets you upload audio samples to create a custom voice, though ElevenLabs processes the clone asynchronously — it won't be usable for synthesis immediately. You can also add shared voices from other users' libraries if you have their public ID.
How is this different from just using ElevenLabs' web interface?
The MCP lets your AI agents generate speech programmatically as part of multi-step workflows. Instead of manually uploading text and downloading files, your team can trigger conversions, manage dictionaries, and batch-process chapters from inside Switchy. Faster if you're producing dozens of audio files weekly.
Who on the team should connect the ElevenLabs MCP?
Whoever owns your ElevenLabs subscription and can generate API keys. That person's quota and voice library will be shared across the team in Switchy. If you're on a paid ElevenLabs plan with multiple seats, coordinate with your account admin to avoid quota conflicts.