Neural Text to Speech and an AFB AI Survey
Sep. 13th, 2025 05:52 pm
The American Foundation for the Blind is researching AI:
( details on how to participate )
In addition to the environmental and ethical violations which LLMs/AIs depend on, the endless hype and inaccurate performance make me shudder and growl. Yet I admit I’ve used neural text-to-speech voices for casual audio reading. The neural voices require an internet connection and they lose intelligibility at speed. They’re best as substitutes for human readers.
Blind computer users set their on-device system text-to-speech (TTS) at high speeds. Three hundred to five hundred words per minute are often cited. For screen reader applications, a robotic voice is a feature, enabling bits to flow from device to brain with minimal interpretation.
Neural voices produce much higher-quality audio than system-level TTS. When fed appropriately coded input, they can laugh, whisper, and sound sarcastic, as well as "analyze" an essay to produce a "podcast" dialog between two synthetic discussants. Some samples here: https://www.naturalreaders.com/online/
But I know well the expertise that skilled human narrators bring to their work—whether it’s commercial audiobook production, volunteer alternative-format creation, or podfic elves making magic. I don’t want a world where those jobs are outsourced to computers.
On the gripping hand, I remember when skilled Linotype operators, many of them Deaf, were displaced by computerized systems where reporters keyed their own copy. I used the bridge technology of phototypesetting, as well as pioneering desktop publishing. Admin workers are now expected to create their own flyers, graphs, and charts.
Have you tried neural voices? Recognized them on YouTube or TikTok or your recent tech support call? Do you have thoughts for or against?