Document
The Text To Speech OpenAI (TTS) API allows you to convert files in various document formats into high-quality, natural-sounding speech. You can use this API to generate voiceovers for multimedia content, create narrations for e-books and documents, or turn subtitles into engaging audio experiences.
Document To Speech
POST https://api.ttsopenai.com/uapi/v1/document-to-speech
This endpoint converts a document into speech. You can customize the voice, speed, and model used for the conversion.
Example Request
curl -X POST https://api.ttsopenai.com/uapi/v1/document-to-speech \
-H "Content-Type: multipart/form-data" \
-H "x-api-key: <your api key>" \
--form "model=tts-1" \
--form "voice_id=OA001" \
--form "speed=1" \
--form "file=@/path/to/your/document.pdf" \
--form "file_password=your_password"
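For callers who prefer Python over curl, the same request can be sketched with only the standard library. This is a minimal sketch, not an official client: `encode_multipart` and `build_request` are hypothetical helper names, and the form field names and the `x-api-key` header are taken from the curl example above. Nothing is sent until the request object is passed to `urllib.request.urlopen`.

```python
import uuid
import urllib.request

API_URL = "https://api.ttsopenai.com/uapi/v1/document-to-speech"


def encode_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one 'file' part as multipart/form-data.

    Returns a (content_type_header_value, body_bytes) tuple.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    # The document itself goes in the "file" field, as in the curl example.
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


def build_request(api_key, file_name, file_bytes, model="tts-1",
                  voice_id="OA001", speed=1, file_password=None):
    """Build (but do not send) a POST request for the endpoint."""
    fields = {"model": model, "voice_id": voice_id, "speed": speed}
    if file_password is not None:
        fields["file_password"] = file_password
    content_type, body = encode_multipart(fields, file_name, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": content_type, "x-api-key": api_key},
        method="POST",
    )


# Usage (performs a real network call, so it is left commented out):
# with open("/path/to/your/document.pdf", "rb") as fh:
#     req = build_request("<your api key>", "document.pdf", fh.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it keeps the encoding step easy to inspect before any credits are spent.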
Request Attributes
model
string
The model used for the conversion. You can choose between tts-1 and tts-1-hd. The default value is tts-1.
voice_id
string
The voice used for the conversion. You can find the list of voice IDs in the Voice Library. The default value is OA001.
speed
float
The speed of the speech. The value should be between 1 and 4. The default value is 1.
file
string($binary)
The document file to be converted. Supported formats include .docx, .xlsx, .pptx, .pdf, .epub, .mobi, .txt, .html, .odt, .ods, .odp, .azw, and .azw3. The maximum file size is 100 MB, and a file may contain at most 500,000 rows of data.
file_password
string
The password for the document file, if it is password-protected.
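The constraints listed above can be checked on the client before uploading. The following is a rough sketch under stated assumptions: `validate_request` is a hypothetical helper, the speed bounds are treated as inclusive, 1 MB is taken as 1024 * 1024 bytes, and the extension list is treated as exhaustive even though the text only says the formats "include" these. The server remains the authority on what it accepts.

```python
from pathlib import Path

SUPPORTED_MODELS = {"tts-1", "tts-1-hd"}
SUPPORTED_EXTENSIONS = {
    ".docx", ".xlsx", ".pptx", ".pdf", ".epub", ".mobi", ".txt",
    ".html", ".odt", ".ods", ".odp", ".azw", ".azw3",
}
# Assumption: "100 MB" means 100 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024


def validate_request(file_name, file_size, model="tts-1", speed=1.0):
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    if model not in SUPPORTED_MODELS:
        problems.append(f"unsupported model: {model}")
    # Assumption: both ends of the documented 1 to 4 range are allowed.
    if not 1 <= speed <= 4:
        problems.append(f"speed out of range 1-4: {speed}")
    # Extensions are compared case-insensitively.
    if Path(file_name).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {file_name}")
    if file_size > MAX_FILE_SIZE_BYTES:
        problems.append(f"file larger than 100 MB: {file_size} bytes")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.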
Example Response
{
  "success": true,
  "result": {
    "uuid": "4a7693ee-aa35-11ef-bfda-7eba07618aa0",
    "voice_id": "OA001",
    "speed": 1,
    "model": "tts-1",
    "tts_input": "5101447014.pdf",
    "estimated_credit": 0,
    "used_credit": 0,
    "status": 1,
    "status_percentage": 1,
    "error_message": "",
    "speaker_name": "Alloy",
    "created_at": "2024-11-24T07:25:33",
    "updated_at": "2024-11-24T07:25:33",
    "file_size": 98842
  }
}
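A response of this shape can be unpacked with the standard `json` module. The sketch below is illustrative only: `read_response` is a hypothetical helper, the sample body is a trimmed copy of the example response, and treating a false `success` or a non-empty `error_message` as a failure is an assumption, since the shape of an error response is not shown here.

```python
import json
from datetime import datetime


def read_response(body):
    """Parse a response body and return its `result` object.

    Raises RuntimeError when `success` is false or `error_message` is set.
    """
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError("request was not successful")
    result = payload["result"]
    if result.get("error_message"):
        raise RuntimeError(result["error_message"])
    return result


# Trimmed copy of the example response.
example = """{"success": true, "result": {
  "uuid": "4a7693ee-aa35-11ef-bfda-7eba07618aa0",
  "status": 1, "status_percentage": 1, "error_message": "",
  "created_at": "2024-11-24T07:25:33"}}"""

result = read_response(example)
# The timestamp carries no timezone, so this yields a naive datetime.
created = datetime.fromisoformat(result["created_at"])
print(result["uuid"], created.isoformat())
```

The `uuid` is the value worth storing, since it is the only identifier the response provides for the conversion.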
Response Attributes
success
boolean
Indicates whether the request was successful.
result
object
The result of the document-to-speech conversion.
result.uuid
string
The unique identifier for the conversion.
result.voice_id
string
The voice used for the conversion.
result.speed
float
The speed of the speech.
result.model
string
The model used for the conversion.
result.tts_input
string
The document file that was converted into speech.
result.estimated_credit
integer
The estimated number of credits used for the conversion.
result.used_credit
integer
The actual number of credits used for the conversion.
result.status
integer
The status of the conversion. Possible values are:
1: Converting
2: Completed
3: Error
11: Reworking
12: Joining Audio
13: Merging Audio
14: Downloading Audio
result.status_percentage
integer
The percentage of the conversion that has been completed.
result.error_message
string
The error message, if any.
result.speaker_name
string
The name of the speaker.
result.created_at
string
The date and time when the conversion was created.
result.updated_at
string
The date and time when the conversion was last updated.
result.file_size
integer
The size of the document file in bytes.
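The numeric `result.status` codes are easier to work with when given names. The sketch below maps the documented codes onto an `IntEnum`; `ConversionStatus` and `describe` are hypothetical names, and treating only Completed and Error as final states is an assumption, as is reading `status_percentage` as a value out of 100.

```python
from enum import IntEnum


class ConversionStatus(IntEnum):
    """Status codes as listed under result.status."""
    CONVERTING = 1
    COMPLETED = 2
    ERROR = 3
    REWORKING = 11
    JOINING_AUDIO = 12
    MERGING_AUDIO = 13
    DOWNLOADING_AUDIO = 14


# Assumption: only Completed and Error are final; all other codes are in progress.
FINAL_STATUSES = {ConversionStatus.COMPLETED, ConversionStatus.ERROR}


def describe(result):
    """Return a short human-readable label for a `result` object."""
    # An undocumented code raises ValueError here.
    status = ConversionStatus(result["status"])
    label = status.name.replace("_", " ").title()
    if status in FINAL_STATUSES:
        return label
    return f"{label} ({result['status_percentage']}%)"


print(describe({"status": 1, "status_percentage": 1}))
```

Because an unknown code raises `ValueError`, a caller would notice if the service added a status that this mapping does not cover.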