The ElevenLabs Text to Speech node converts written text into spoken audio using the ElevenLabs API. You select a voice and tune speech characteristics such as stability, speed, and style to shape the generated audio.

## Inputs

| Parameter | Description | Data Type | Required | Range |
| --- | --- | --- | --- | --- |
| `voice` | Voice to use for speech synthesis. Connect from Voice Selector or Instant Voice Clone. | CUSTOM | Yes | N/A |
| `text` | The text to convert to speech. | STRING | Yes | N/A |
| `stability` | Voice stability. Lower values give a broader emotional range; higher values produce more consistent but potentially monotonous speech (default: 0.5). | FLOAT | No | 0.0 - 1.0 |
| `apply_text_normalization` | Text normalization mode. 'auto' lets the system decide, 'on' always applies normalization, 'off' skips it. | COMBO | No | `"auto"`<br>`"on"`<br>`"off"` |
| `model` | Model to use for text-to-speech. Selecting a model reveals its specific parameters. | DYNAMICCOMBO | No | `"eleven_multilingual_v2"`<br>`"eleven_v3"` |
| `language_code` | ISO-639-1 or ISO-639-3 language code (e.g., 'en', 'es', 'fra'). Leave empty for automatic detection (default: ""). | STRING | No | N/A |
| `seed` | Seed for reproducibility; determinism is not guaranteed (default: 1). | INT | No | 0 - 2147483647 |
| `output_format` | Audio output format. | COMBO | No | `"mp3_44100_192"`<br>`"opus_48000_192"` |
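
The ranges and options in the table above can be checked before a workflow is queued. The following is a minimal sketch, not the node's own validation logic: `validate_inputs` and the dictionary layout are hypothetical names introduced for illustration, and only the limits and defaults stated in the table are used. The `voice` input is left out because it is a connected custom type rather than a plain value.

```python
# Allowed options copied from the Inputs table; the structure itself is an assumption.
COMBO_OPTIONS = {
    "apply_text_normalization": ("auto", "on", "off"),
    "model": ("eleven_multilingual_v2", "eleven_v3"),
    "output_format": ("mp3_44100_192", "opus_48000_192"),
}
SEED_MAX = 2147483647


def validate_inputs(values: dict) -> list[str]:
    """Return a list of problems found in `values` (empty if none).

    Hypothetical helper: it mirrors the documented ranges only.
    """
    errors = []

    # `text` is the only required plain-value input.
    if not isinstance(values.get("text"), str):
        errors.append("text: required STRING input is missing")

    # Optional numeric inputs fall back to their documented defaults.
    stability = values.get("stability", 0.5)
    if not 0.0 <= stability <= 1.0:
        errors.append(f"stability: {stability} is outside 0.0 - 1.0")

    seed = values.get("seed", 1)
    if not isinstance(seed, int) or not 0 <= seed <= SEED_MAX:
        errors.append(f"seed: {seed} is outside 0 - {SEED_MAX}")

    # Combo inputs are only checked when the caller supplies them.
    for name, allowed in COMBO_OPTIONS.items():
        if name in values and values[name] not in allowed:
            errors.append(f"{name}: {values[name]!r} is not one of {allowed}")

    return errors


if __name__ == "__main__":
    print(validate_inputs({"text": "Hello", "stability": 1.5, "model": "eleven_v3"}))
```

Returning a list of messages instead of raising on the first problem lets a caller report every out-of-range value at once.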

**Model-Specific Parameters:**

When the `model` parameter is set to `"eleven_multilingual_v2"`, the following additional parameters become available:

* `speed`: Speech speed. 1.0 is normal, <1.0 slower, >1.0 faster (default: 1.0, range: 0.7 - 1.3).
* `similarity_boost`: Similarity boost. Higher values make the voice more similar to the original (default: 0.75, range: 0.0 - 1.0).
* `use_speaker_boost`: Boost similarity to the original speaker voice (default: False).
* `style`: Style exaggeration. Higher values increase stylistic expression but may reduce stability (default: 0.0, range: 0.0 - 0.2).

When the `model` parameter is set to `"eleven_v3"`, the following additional parameters become available:

* `speed`: Speech speed. 1.0 is normal, <1.0 slower, >1.0 faster (default: 1.0, range: 0.7 - 1.3).
* `similarity_boost`: Similarity boost. Higher values make the voice more similar to the original (default: 0.75, range: 0.0 - 1.0).

## Outputs

| Output Name | Description | Data Type |
| --- | --- | --- |
| `audio` | The generated audio from the text-to-speech conversion. | AUDIO |
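
The encoding of the generated audio follows the `output_format` input. The option names appear to follow a `codec_sampleRate_bitrate` pattern, for example MP3 at 44100 Hz and 192 kbps; that reading is an assumption based on the names, not something this page states. Under that assumption, a small hypothetical helper can split an option into its parts:

```python
from typing import NamedTuple


class AudioFormat(NamedTuple):
    """Hypothetical container; field meanings are inferred from the option names."""
    codec: str
    sample_rate_hz: int
    bitrate_kbps: int


def parse_output_format(output_format: str) -> AudioFormat:
    """Split an option such as "mp3_44100_192" into its three assumed parts."""
    parts = output_format.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected output_format: {output_format!r}")
    codec, sample_rate, bitrate = parts
    return AudioFormat(codec, int(sample_rate), int(bitrate))


if __name__ == "__main__":
    for option in ("mp3_44100_192", "opus_48000_192"):
        print(option, "->", parse_output_format(option))
```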
