Let your computer speak to you

Modern deep-learning-powered speech synthesis. In short: Text-to-Speech!

5 min read · Feb 28, 2023

I recently injured my eye during a hike. Don’t worry, it will be fine soon, but it made me think about text-to-speech (TTS) software again. You probably all know the robotic voices that can read text but are extremely hard to understand:

This article guides you through the process of utilizing an improved text-to-speech system. Here, you will find complete, functional code that can be executed locally, as well as access to scientific publications for a deeper understanding. Audio examples will illustrate the notable advancements made in recent years. Without further ado, let’s delve into the topic at hand!

The old stuff: pyttsx3 and eSpeak

After installing the necessary requirements for pyttsx3, you can use the following Python snippet to generate the audio sample from above:

import sys
from pathlib import Path

import pyttsx3 as tts  # pip install pyttsx3==2.90

# Get the data
with…
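
For orientation, here is a minimal sketch of what such a pyttsx3 script could look like. It is not the original listing: the helper names `load_text` and `synthesize_to_file`, the command-line arguments, and the default rate of 150 are assumptions made for illustration. It only uses the basic pyttsx3 calls `init`, `setProperty`, `save_to_file`, and `runAndWait`.

```python
import sys
from pathlib import Path


def load_text(path: Path) -> str:
    """Read the text to be spoken and collapse runs of whitespace."""
    return " ".join(path.read_text(encoding="utf-8").split())


def synthesize_to_file(text: str, out_path: Path, rate: int = 150) -> None:
    """Render `text` to an audio file with the system's default TTS driver."""
    # Imported lazily so that load_text can be used without pyttsx3 installed.
    import pyttsx3  # pip install pyttsx3

    engine = pyttsx3.init()           # picks a platform driver, e.g. eSpeak on Linux
    engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.save_to_file(text, str(out_path))
    engine.runAndWait()               # blocks until the queued command is processed


if __name__ == "__main__":
    # Hypothetical usage: python speak.py input.txt output.wav
    source, target = Path(sys.argv[1]), Path(sys.argv[2])
    synthesize_to_file(load_text(source), target)
```

The audio format that `save_to_file` actually writes may depend on the driver pyttsx3 selects on your platform, so treat the `.wav` extension in the usage line as an assumption.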

Written by Martin Thoma
