Let your computer speak to you

Modern deep-learning-powered speech synthesis. In short: Text-to-Speech!

Martin Thoma
5 min read · Feb 28, 2023

I recently injured my eye during a hike. Don’t worry, it will be fine soon, but it made me think about text-to-speech (TTS) software again. You probably know the robotic voices that can read text aloud but are extremely hard to understand:

This article shows you how to use a better text-to-speech system. You will find complete, working code that runs locally, along with links to the scientific publications behind it. Audio examples illustrate how much the technology has improved in recent years. Let’s get started!

The Old Stuff: pyttsx3 and eSpeak

After installing the necessary requirements for pyttsx3, you can use the following Python snippet to generate the audio sample from above:

import sys
from pathlib import Path

import pyttsx3 as tts # pip install pyttsx3==2.90

# Get the data
with…
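
For a rough idea of what a complete pyttsx3 script can look like, here is a minimal sketch. It is not the listing above: the helper names `load_text` and `speak_to_file` and the default rate of 150 words per minute are assumptions made for illustration. The pyttsx3 import sits inside the function so the text helper works even on a machine without a speech engine such as eSpeak installed.

```python
from pathlib import Path


def load_text(path: Path) -> str:
    """Read a UTF-8 text file and collapse all whitespace runs into single spaces."""
    # Hypothetical helper: hands the engine one clean string.
    return " ".join(path.read_text(encoding="utf-8").split())


def speak_to_file(text: str, out_path: Path, rate: int = 150) -> None:
    """Synthesize `text` with pyttsx3 and write the audio to `out_path`."""
    # Imported lazily: pip install pyttsx3 (on Linux it relies on eSpeak).
    import pyttsx3

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # speech rate in words per minute (assumed default)
    engine.save_to_file(text, str(out_path))
    engine.runAndWait()  # blocks until the queued command has finished
```

To try it, you could call `speak_to_file(load_text(Path("input.txt")), Path("output.wav"))`, where both file names are placeholders. The audio format that ends up in the output file depends on the speech driver of your platform.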


Written by Martin Thoma

I’m a Software Engineer with over 10 years of Python experience (Backend/ML/AI). Support me via https://martinthoma.medium.com/membership