API : Voice : Text to speech
Introduction
This request generates audio from the supplied text, spoken in the chosen voice by the text-to-speech engine.
Request
| URL | https://api.telecomx.dk/voice/tts |
|---|---|
| Method | GET or POST |
| Access level | MANAGER or OWNER if the user belongs to the customer the audio shall belong to. RESELLER if the customer belongs to the reseller. RESELLER_ADMIN or ADMIN. |

Body/Query properties
| Property | Type | Description | Conditions |
|---|---|---|---|
| engine | String | [optional] Which speech engine to use: ELEVEN_ML, ELEVEN_FLASH, ELEVEN_TURBO, ELEVEN_V3. Usage limits apply. Defaults to ELEVEN_FLASH. | |
| mode | String | [optional] Generation mode: STREAM - the reply will be an audio stream. STORE - store the audio in documents and return the metadata for accessing it (default). PBX_AUDIO - store the audio in PBX audio and return the metadata for it. | |
| text | String | The text to speak, max. 3000 characters. | |
| voice | String | The voice to speak with - use the Voices API to retrieve a list of available voices. | |
| language | String | [optional] The language of the text in ISO 639-1 format. Only applies to the engines ELEVEN_FLASH and ELEVEN_TURBO. If not set, it will be inferred from the voice. | engine=ELEVEN_FLASH / ELEVEN_TURBO |
| format | String | [optional] Format of the audio when streaming, see the format list below. Defaults to MP3. | mode=STREAM |
| speed | Number | [optional] Speaking speed, 70 (slow) → 120 (fast). Defaults to 100 (normal). | |
| stability | Number | [optional] Randomness in generation, lower = more emotional, higher = more monotonous. 0 → 100, defaults to 50. | |
| similarity | Number | [optional] How closely the generated voice should match the original voice, 0 → 100, defaults to 75. | |
| style | Number | [optional] Amplifies the style of the voice, 0 → 100, defaults to 0. | |
| expire | Number | [optional] Number of seconds after which the stored audio file is automatically deleted. | mode=STORE |
| sensitive | Boolean | [optional] True if the audio is sensitive and shall not be listed in the customer's list of files. Requires mode STORE and expire to be set. | mode=STORE |
| customer | Id | [optional] Id of the customer the audio belongs to, if the audio shall only be playable by the customer's employees. Defaults to the user's customer. Audio without a customer can only be created by RESELLER_ADMIN or ADMIN users. | mode=STORE / PBX_AUDIO |
| employee | Id | [optional] Id of the employee the audio belongs to, if the audio shall only be playable by that employee. | mode=PBX_AUDIO |
| name | String | [optional] Name of the audio, defaults to the start of the text. | mode=STORE / PBX_AUDIO |
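
The limits listed for the properties can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the API: the helper name `validate_tts_params` and its list-of-messages return value are assumptions made for illustration, and the server remains the authority on what it accepts.

```python
from typing import Any

# Allowed values and ranges copied from the property descriptions.
ENGINES = {"ELEVEN_ML", "ELEVEN_FLASH", "ELEVEN_TURBO", "ELEVEN_V3"}
MODES = {"STREAM", "STORE", "PBX_AUDIO"}
RANGES = {
    "speed": (70, 120),
    "stability": (0, 100),
    "similarity": (0, 100),
    "style": (0, 100),
}
MAX_TEXT_LENGTH = 3000


def validate_tts_params(params: dict[str, Any]) -> list[str]:
    """Return a list of problems found in params (empty if none).

    Hypothetical helper: it mirrors the documented limits only and does
    not replace the validation done by the API itself.
    """
    problems = []

    text = params.get("text")
    if not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_LENGTH:
        problems.append(f"text exceeds {MAX_TEXT_LENGTH} characters")

    if not params.get("voice"):
        problems.append("voice is required")

    engine = params.get("engine", "ELEVEN_FLASH")  # documented default
    if engine not in ENGINES:
        problems.append(f"unknown engine: {engine}")

    mode = params.get("mode", "STORE")  # documented default
    if mode not in MODES:
        problems.append(f"unknown mode: {mode}")

    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")

    # "sensitive" requires mode STORE and expire to be set.
    if params.get("sensitive") and (mode != "STORE" or params.get("expire") is None):
        problems.append("sensitive requires mode STORE and expire")

    return problems
```

An empty list means no documented limit was violated; for example, a `speed` of 130 yields the message `speed must be between 70 and 120`.
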

| Engine | Description | Languages | Stream startup |
|---|---|---|---|
| ELEVEN_ML | Eleven Labs multilingual V2 | 29 | ~1000ms |
| ELEVEN_TURBO | Eleven Labs turbo V2.5 | 32 | ~500ms |
| ELEVEN_FLASH | Eleven Labs flash V2.5 | 32 | ~350ms |
| ELEVEN_V3 | Eleven Labs V3 | 70+ | ~2000ms - 9500ms |

| Audio Format | Description |
|---|---|
| OGG_OPUS | Opus 48kHz sample rate, 32kbps bitrate in an OGG container |
| WEBM_OPUS | Opus 48kHz sample rate, 32kbps bitrate in a WEBM container |
| MP4_OPUS | Opus 48kHz sample rate, 32kbps bitrate in an MP4 container |
| MP4 | AAC 16kHz sample rate, 32kbps bitrate in an MP4 container |
| PCM | Raw 16kHz sample rate, 256kbps bitrate in a PCM Wave container |
| MP3 (default) | MP3 44.1kHz sample rate, 128kbps bitrate |
| ALAW | A-law 8kHz sample rate, 64kbps bitrate |
| ULAW | μ-law 8kHz sample rate, 64kbps bitrate |

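
Since the properties can also be passed in the query string, a STREAM request can be expressed as a single GET URL. The sketch below assumes ordinary URL encoding of the query; `build_stream_url` is a hypothetical helper, and authentication, which this section does not describe, is left out.

```python
from urllib.parse import urlencode

TTS_URL = "https://api.telecomx.dk/voice/tts"

# Format names copied from the audio format table.
AUDIO_FORMATS = {
    "OGG_OPUS", "WEBM_OPUS", "MP4_OPUS", "MP4", "PCM", "MP3", "ALAW", "ULAW",
}


def build_stream_url(text: str, voice: str, audio_format: str = "MP3",
                     engine: str = "ELEVEN_FLASH") -> str:
    """Build a GET URL that asks for the audio as a stream.

    Hypothetical helper; the defaults repeat the documented defaults.
    """
    if audio_format not in AUDIO_FORMATS:
        raise ValueError(f"unknown audio format: {audio_format}")
    query = urlencode({
        "engine": engine,
        "mode": "STREAM",
        "text": text,
        "voice": voice,
        "format": audio_format,
    })
    return f"{TTS_URL}?{query}"


url = build_stream_url("Press 1 for sales", "ygiXC2Oa1BiHksD3WkJZ", "OGG_OPUS")
print(url)
# https://api.telecomx.dk/voice/tts?engine=ELEVEN_FLASH&mode=STREAM&text=Press+1+for+sales&voice=ygiXC2Oa1BiHksD3WkJZ&format=OGG_OPUS
```

ELEVEN_FLASH is used as the default here because it is both the documented default and the engine with the shortest listed stream startup.
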
Request body example
{
  "engine": "ELEVEN_ML",
  "text": "Welcome to our company. Press 1 for sales, press 2 for support, or press 3 for accounting",
  "voice": "ygiXC2Oa1BiHksD3WkJZ",
  "speed": 100,
  "customer": "123457890123457890AAAA",
  "name": "Default greeting message",
  "employee": null
}
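
A body like the one above could be sent as a JSON POST. The following sketch uses only the Python standard library; the `build_tts_request` helper, the JSON content type, and in particular the bearer-token `Authorization` header are assumptions, since this section does not say how requests are authenticated.

```python
import json
import urllib.request

TTS_URL = "https://api.telecomx.dk/voice/tts"


def build_tts_request(payload: dict, token: str) -> urllib.request.Request:
    """Wrap payload in a JSON POST request.

    Assumption: the bearer-token Authorization header is a placeholder;
    this section does not describe how requests are authenticated.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed scheme
        },
    )


payload = {
    "engine": "ELEVEN_ML",
    "text": "Welcome to our company. Press 1 for sales, press 2 for support, "
            "or press 3 for accounting",
    "voice": "ygiXC2Oa1BiHksD3WkJZ",
    "speed": 100,
    "customer": "123457890123457890AAAA",
    "name": "Default greeting message",
    "employee": None,
}
request = build_tts_request(payload, token="YOUR_TOKEN")

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     metadata = json.load(response)
```

No `mode` is given, so the documented default STORE applies and the reply would be the metadata of the stored audio rather than an audio stream.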
Response - when mode is STORE
| Property | Type | Description |
|---|---|---|
| _id | Id | Unique id of the stored audio file. |
| customer | Id | Id of customer the audio belongs to, if any. |
| name | String | Name/description of the audio. |
| length | Number | Length in seconds. |
| employee | Id | Optional id of employee the audio belongs to and can be played by. |
| expires | Date | If the audio expires, this is when. |
| url | String | URL for playing the stored audio file. May contain a token that can expire and then needs to be replaced. See playback of stored audio files. |
Note that properties holding no value may be omitted from the response.
Example
{
  "_id": "1234567890123457890ABCD",
  "customer": "12345678901234567890CCCC",
  "name": "Welcome greeting 08:00-12:00",
  "length": 15,
  "employee": null,
  "expires": "2025-01-01T00:00:00.000Z",
  "url": "https://audio.telecomx.dk/1234567890123457890ABCD.mp3?token=xjh2837fhv28edfcyhgb2uwdchbgwuvndc"
}
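
Because properties without a value may be omitted, client code should not assume that every key is present. The sketch below shows one way to read such a response in Python; the `StoredAudio` type and `parse_stored_audio` function are hypothetical, and treating `_id` and `url` as always present is an assumption rather than something the API guarantees here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class StoredAudio:
    """Metadata returned when mode is STORE (hypothetical client-side type)."""
    id: str
    url: str
    name: Optional[str] = None
    length: Optional[float] = None
    customer: Optional[str] = None
    employee: Optional[str] = None
    expires: Optional[datetime] = None


def parse_stored_audio(data: dict) -> StoredAudio:
    """Build a StoredAudio, tolerating omitted or null properties."""
    expires = data.get("expires")
    if expires is not None:
        # fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
        expires = datetime.fromisoformat(expires.replace("Z", "+00:00"))
    return StoredAudio(
        id=data["_id"],   # assumed to always be present
        url=data["url"],  # assumed to always be present
        name=data.get("name"),
        length=data.get("length"),
        customer=data.get("customer"),
        employee=data.get("employee"),
        expires=expires,
    )
```

Parsing `expires` into a timezone-aware `datetime` makes it straightforward to compare against the current time before reusing a stored URL.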
Errors
| Error code | Message | Description |
|---|---|---|
| 404 | customer | Customer not found |
| 404 | employee | Employee not found |
| 422 | text | Text is missing |
| 422 | voice | Voice not found |
| 422 | engine | Speech engine not found |
| 403 | access_denied | Insufficient access level |
| 403 | quota_exceeded | Quota limit has been reached |
| 500 | internal_error | <Unspecified> |
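
The error table can be carried into client code as a simple lookup. This is a sketch only: `describe_error` is a hypothetical helper, and how the message string is delivered in the error response is not specified in this section, so the function takes the status code and message as plain arguments.

```python
# (status code, message) pairs copied from the error table.
ERRORS = {
    (404, "customer"): "Customer not found",
    (404, "employee"): "Employee not found",
    (422, "text"): "Text is missing",
    (422, "voice"): "Voice not found",
    (422, "engine"): "Speech engine not found",
    (403, "access_denied"): "Insufficient access level",
    (403, "quota_exceeded"): "Quota limit has been reached",
    (500, "internal_error"): "Unspecified internal error",
}


def describe_error(status: int, message: str) -> str:
    """Return the documented description, or a generic fallback."""
    return ERRORS.get((status, message), f"Unexpected error {status}: {message}")


print(describe_error(403, "quota_exceeded"))  # Quota limit has been reached
```

Keying on both the status code and the message keeps the two 403 cases apart, which matters because an exhausted quota and a missing access level call for different reactions.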
api/voice/tts.txt · Last modified: 2025/09/15 13:29 by Per Møller