Let your computer speak to you
Modern deep-learning-powered speech synthesis. In short: text-to-speech!
I recently injured my eye during a hike. Don’t worry, it will be fine soon. But it made me think about text-to-speech (TTS) software again. You probably know the robotic voices that can read text aloud but are extremely hard to understand:
This article walks you through using a better text-to-speech system. You will find complete, working code that you can run locally, along with links to scientific publications for a deeper understanding. Audio examples illustrate how much the field has advanced in recent years. Let’s get started!
The old stuff: pyttsx3 and eSpeak
After installing the necessary requirements for pyttsx3, you can use the following Python snippet to generate the audio sample from above:
import sys
from pathlib import Path
import pyttsx3 as tts # pip install pyttsx3==2.90
# Get the data
with…