What does the Text to Speech tool do?
The Text to Speech tool renders plain-language scripts into an audio file using Dynamic Duniya’s media job pipeline. You type or paste into a large textarea capped at five thousand Unicode characters, with a live character counter. A native select lists the bundled voices, labeled English, Hindi, French, Spanish, German, Italian, Portuguese, and Russian; each label maps to a short language code that is sent as the voice option. Two sliders set the speaking rate, from eighty to two hundred twenty words per minute (default one hundred fifty), and the pitch, from ten to ninety (default fifty). Because the upload API expects multipart form data, the client attaches a tiny placeholder text/plain file alongside JSON options containing your trimmed text, voice, speed, and pitch, so you never upload any audio yourself. After processing, the result page can play the returned asset in an HTML audio element whose source is resolved through the tool download helper, and it still offers the standard download and reset actions.
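To make the request shape concrete, here is a minimal TypeScript sketch of how a client could assemble that payload. It is an illustration under stated assumptions, not the tool's actual code: the form field names ("file", "options"), the placeholder file name, the concrete language codes, and counting the limit in Unicode code points are all guesses; only the limits, defaults, and option names come from the description above.

```typescript
// Hypothetical shape of the JSON options described above.
interface TtsOptions {
  text: string;
  voice: string;
  speed: number; // words per minute, 80 to 220
  pitch: number; // 10 to 90
}

// Assumed label-to-code mapping; the real codes may differ.
const VOICE_CODES: Record<string, string> = {
  English: "en", Hindi: "hi", French: "fr", Spanish: "es",
  German: "de", Italian: "it", Portuguese: "pt", Russian: "ru",
};

const MAX_CHARS = 5000;

// Round to an integer and keep the value inside the slider range.
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, Math.round(value)));
}

// Count Unicode code points by collapsing each surrogate pair to one unit.
// Assumption: the tool may instead count UTF-16 units or grapheme clusters.
function countChars(text: string): number {
  return text.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, "_").length;
}

function buildTtsOptions(
  rawText: string,
  voiceLabel: string,
  speed: number = 150,
  pitch: number = 50,
): TtsOptions {
  const text = rawText.trim();
  if (text.length === 0) {
    throw new Error("Text is empty");
  }
  if (countChars(text) > MAX_CHARS) {
    throw new Error("Text exceeds " + MAX_CHARS + " characters");
  }
  const voice = VOICE_CODES[voiceLabel];
  if (voice === undefined) {
    throw new Error("Unknown voice: " + voiceLabel);
  }
  return { text, voice, speed: clamp(speed, 80, 220), pitch: clamp(pitch, 10, 90) };
}

// Pair a tiny text/plain placeholder with the JSON options in one multipart body.
// The field names "file" and "options" are hypothetical.
function buildTtsFormData(options: TtsOptions): FormData {
  const form = new FormData();
  const placeholder = new Blob(["tts"], { type: "text/plain" });
  form.append("file", placeholder, "placeholder.txt");
  form.append("options", JSON.stringify(options));
  return form;
}
```

A caller would pass the result of buildTtsFormData(buildTtsOptions(text, "Hindi")) as the body of a POST request. Clamping here simply mirrors the slider bounds, so out-of-range values never reach the server.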
Voice quality expectations
An amber banner at the top states openly that the stack currently relies on espeak-ng for basic TTS and that more natural voices would require a cloud provider wired into the server. Expect compact, intelligible speech suited to accessibility prototypes or quick voice-over (VO) scratch tracks rather than polished marketing narration.
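For a sense of how the voice, speed, and pitch options could reach espeak-ng, the sketch below builds a command-line argument list. How the server really invokes espeak-ng is not documented here, so the SynthesisJob type and buildEspeakArgs function are hypothetical; the flags themselves (-v voice, -s words per minute, -p pitch from 0 to 99, -f input text file, -w output WAV file) are standard espeak-ng options.

```typescript
// Hypothetical description of one synthesis job on the server side.
interface SynthesisJob {
  voice: string;    // short language code, e.g. "hi"
  speed: number;    // words per minute
  pitch: number;    // espeak-ng accepts 0 to 99
  textPath: string; // file holding the submitted text
  wavPath: string;  // where the WAV output should be written
}

// Build the argument list for an espeak-ng process.
// Reading the text from a file (-f) avoids passing user text as an argument.
function buildEspeakArgs(job: SynthesisJob): string[] {
  return [
    "-v", job.voice,
    "-s", String(job.speed),
    "-p", String(job.pitch),
    "-f", job.textPath,
    "-w", job.wavPath,
  ];
}
```

With the defaults above and the Hindi voice, this would correspond to running espeak-ng -v hi -s 150 -p 50 -f input.txt -w output.wav.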
Privacy
Everything you submit in the textarea travels to Dynamic Duniya infrastructure for synthesis. Avoid passwords, API keys, private messages, regulated health or financial data, or copyrighted text you cannot lawfully process.