Text to Speech by using ttsvoice – Python

In this tutorial, we will convert the text written by a human into human-like speech.

Listening to information can be easier to follow than reading it, and it tends to keep the listener more engaged with the content. Python has many libraries for converting written text into a spoken voice, including pyttsx3, gTTS, and espeak.

Text to Speech (TTS) is the process of converting a text string into spoken audio, so that the words are read aloud, here in the English language.

ttsvoice Library in Python

ttsvoice is a Python package used for Text to Speech conversion. It contains multiple packages like gTTS and pyttsx3, which convert human text into voice.

Installing the ttsvoice library in Python

The ttsvoice library can be installed from the command prompt, or inside a Python environment, by running this command:

py -m pip install ttsvoice

Text to Speech Conversion using pyttsx3 API

Features of pyttsx3 API

  • It is a straightforward tool to use that converts text into speech.
  • It does not require any internet connection; it works offline.
  • It works with both Python 2 and Python 3.
  • It can convert text into speech in both female and male voices.

Installing pyttsx3 API in the system using a command prompt or any Python terminal

The pyttsx3 API can be installed from the command prompt, or inside a Python environment, by running this command:

py -m pip install pyttsx3

On Windows, this library depends on win32 and may produce an error while running the program if it is missing. To avoid this, install pypiwin32 in the Python environment by using the following:

py -m pip install pypiwin32

Functions in pyttsx3 Library

  1. pyttsx3.init(): This function returns a reference to an engine instance that uses the specified driver. If an engine instance for that driver already exists, it is returned; otherwise, a new engine is created.
  2. getProperty(): This function returns the current value of an engine property.
  3. setProperty(): It sets an engine property by queuing a command. The new value affects all utterances queued after this command.
  4. say(): This function queues the text given by the user to be spoken.
  5. runAndWait(): This function blocks while it processes all currently queued commands. It invokes callbacks for engine notifications and returns once every command queued before the call has been cleared from the queue.

Three TTS engines are supported by pyttsx3:

  • sapi5: used on Windows.
  • NSSpeechSynthesizer (driver name nsss): used on Mac OS.
  • espeak: can be used on any platform.
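
The driver can also be named explicitly when the engine is created. The following is a minimal sketch, assuming the driver names sapi5, nsss, and espeak that pyttsx3 accepts for the driverName argument of init(). The helper names pick_driver and make_engine are made up for this illustration and are not part of pyttsx3.

```python
import platform

# Driver names accepted by pyttsx3.init(); "nsss" selects NSSpeechSynthesizer.
DRIVERS = {"Windows": "sapi5", "Darwin": "nsss"}


def pick_driver(system_name=None):
    """Return a pyttsx3 driver name for an OS name (hypothetical helper)."""
    if system_name is None:
        system_name = platform.system()
    # espeak is used as the fallback on every other platform.
    return DRIVERS.get(system_name, "espeak")


def make_engine():
    """Create an engine with an explicitly chosen driver (hypothetical helper)."""
    import pyttsx3  # imported lazily so pick_driver() works without pyttsx3

    return pyttsx3.init(driverName=pick_driver())
```

Calling pyttsx3.init() with no arguments already picks the default driver for the platform, so naming the driver is only needed to override that choice.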

Let’s understand the use of pyttsx3 API using a simple example.

Code

import pyttsx3

obj = pyttsx3.init()  # create the text-to-speech engine
txt = "Hello, I am converting the Text into Speech."
print(txt)

obj.say(txt)  # queue the text to be spoken
obj.runAndWait()  # process the queue and play the speech

Output

As an output of this code, the written text will be heard in the human voice as:

“Hello, I am converting the Text into Speech.”

This code imports the pyttsx3 library and creates an object obj by initializing the engine. The text is stored in txt; obj.say() queues it to be spoken, and obj.runAndWait() processes the queue and returns once all the queued commands have finished.
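
Because say() only queues text and runAndWait() does the speaking, several phrases can be queued before a single runAndWait() call. The following is a small sketch of that pattern; speak_all is a hypothetical helper name, and engine is assumed to be any object with say() and runAndWait() methods, such as the one returned by pyttsx3.init().

```python
def speak_all(engine, phrases):
    """Queue every phrase, then block until the engine has spoken them all."""
    for phrase in phrases:
        engine.say(phrase)  # only adds the phrase to the engine's queue
    engine.runAndWait()  # processes the queue and returns when it is empty
    return len(phrases)
```

With a real engine, this could be called as speak_all(pyttsx3.init(), ["Hello.", "This is the second sentence."]).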

Implementing different functions and properties in pyttsx3

1. Speaking rate

    The speaking rate is the speed of speech, measured in words per minute. We can read the current rate using the getProperty function. It can be written as:

    rate = obj.getProperty("rate")

    print(rate)

    Output

    200

    We can change the rate by using the setProperty function. It can be used as:

    obj.setProperty("rate", 300)

    obj.say(txt)

    obj.runAndWait()

    Output

    It will speak the words faster than before. To slow the speech down, we can change the value to 100.
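
The two calls can be combined to change the rate relative to its current value instead of hard-coding a number. This is only a sketch: scale_rate is a hypothetical helper name, and engine is assumed to be an object with the getProperty() and setProperty() methods described above.

```python
def scale_rate(engine, factor):
    """Multiply the current speaking rate by factor and return the new rate."""
    current = engine.getProperty("rate")  # current rate in words per minute
    new_rate = int(current * factor)
    engine.setProperty("rate", new_rate)  # applies to text queued afterwards
    return new_rate
```

For example, scale_rate(obj, 0.5) would halve the rate, and scale_rate(obj, 1.5) would make the speech one and a half times faster.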

2. Voice Details

    The details of different voices available can be obtained by:

    voices = obj.getProperty("voices") 

    print(voices)

    Output

    [<pyttsx3.voice.Voice object at 0x000001D7BD8ED430>, <pyttsx3.voice.Voice object at 0x000001D7BD8ED3D0>]

    The output is a list of Voice objects, one for each voice installed on the system. Here, there is one male and one female voice.

3. Converting Voices

    We can hear the text in either a male or a female voice by passing the id of one of the abovementioned voices to setProperty.

    On the system used here, the male and female voices are at indexes 0 and 1 of the voices list, respectively. The order may differ on other systems.

    For generating the text in a male voice, we can use the following:

    obj.setProperty('voice', voices[0].id)

    obj.say(txt)

    obj.runAndWait()

    We have set the voice to the id of voices[0] using the setProperty() function. Here, index 0 is the male voice.

    For generating the text in the female voice, we can use the following:

    obj.setProperty('voice', voices[1].id)

    obj.say(txt)

    obj.runAndWait()

    In this, we have set the voice to the id of voices[1] using the setProperty() function. Here, index 1 is the female voice.

    We can switch voices by passing a voice id to the setProperty() function.

    To get the voice ids, we can run a for loop that prints the details of the voices present in our system:

    for voice in voices:
        # print the details of each voice installed on this PC
        print("Voice:")
        print("ID: %s" % voice.id)
        print("Name: %s" % voice.name)
        print("Age: %s" % voice.age)
        print("Gender: %s" % voice.gender)
        print("Languages Known: %s" % voice.languages)

    Output:

    Voice:

    ID: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_DAVID_11.0

    Name: Microsoft David Desktop - English (United States)

    Age: None

    Gender: None

    Languages Known: []

    Voice:

    ID: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_ZIRA_11.0

    Name: Microsoft Zira Desktop - English (United States)

    Age: None

    Gender: None

    Languages Known: []

    We have got two voices with different ids.

    For generating the text in a male voice, we can pass its id directly:

    voice_id = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_DAVID_11.0"

    obj.setProperty('voice', voice_id)

    obj.say(txt)

    obj.runAndWait()

    For generating the text in the female voice, we can use the following:

    voice_id = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_ZIRA_11.0"

    obj.setProperty('voice', voice_id)

    obj.say(txt)

    obj.runAndWait()
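
Since the index order and the registry-style ids depend on the machine, a voice can also be looked up by part of its name. The following is a minimal sketch under that assumption; find_voice_id is a hypothetical helper name, and voices is assumed to be the list returned by obj.getProperty("voices"), whose items have the id and name attributes printed above.

```python
def find_voice_id(voices, name_fragment):
    """Return the id of the first voice whose name contains name_fragment.

    The match is case-insensitive. Returns None when no voice matches.
    """
    wanted = name_fragment.lower()
    for voice in voices:
        # voice.name may be None on some systems, so fall back to "".
        if wanted in (voice.name or "").lower():
            return voice.id
    return None
```

On the system shown above, find_voice_id(voices, "zira") would return the id of the Microsoft Zira voice, which can then be passed to obj.setProperty('voice', ...) after checking that the result is not None.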

Text to Speech Conversion using gTTS API

This package converts text into speech in many languages, including Hindi, English, Tamil, French, and German. It sends the text to Google's text-to-speech service, so unlike pyttsx3 it needs an internet connection.

Installing gTTS API in the system using a command prompt or any Python terminal

The gTTS API can be installed from the command prompt, or inside a Python environment, by running this command:

py -m pip install gTTS

Along with this, the example below uses the playsound package to play the saved audio file. pyttsx3 is not needed by gTTS itself, but it can be installed the same way if it is not already present:

py -m pip install playsound

py -m pip install pyttsx3

This package can be used on any platform.

Let’s understand the use of gTTS API using a simple example.

Code

from gtts import gTTS
from playsound import playsound

txt = "Hello, I am converting the Text into Speech"
language = 'en'

obj = gTTS(text=txt, lang=language, slow=False)  # build the speech request
obj.save("audio.mp3")  # write the speech to an mp3 file
playsound("audio.mp3")  # play the saved file

Output

The audio file is saved as “audio.mp3”.

In this code, the gtts and playsound libraries are imported, and an object obj is created by calling gTTS with the language set to en. The speech is then saved as an mp3 file using the obj.save() function and played using the playsound() function.

We can change the language by passing a different language code, and we can slow the speech down by changing slow=False to slow=True.
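
As a rough sketch of changing the language, the helper below saves one mp3 file per language code. The names save_in_languages and make_tts, and the audio_<lang>.mp3 file naming, are assumptions made for this illustration. By default it uses gTTS, which needs an internet connection when save() is called; the make_tts parameter only exists so the helper can be tried with a stand-in class.

```python
def save_in_languages(text_by_lang, make_tts=None, slow=False):
    """Save one mp3 per language code and return the list of file names.

    text_by_lang maps a language code such as "en" or "fr" to the text
    that should be spoken in that language.
    """
    if make_tts is None:
        from gtts import gTTS  # imported lazily; needs the gTTS package

        make_tts = gTTS
    paths = []
    for lang, text in text_by_lang.items():
        path = f"audio_{lang}.mp3"  # hypothetical naming scheme
        make_tts(text=text, lang=lang, slow=slow).save(path)
        paths.append(path)
    return paths
```

For example, save_in_languages({"en": "Hello", "fr": "Bonjour"}) would write audio_en.mp3 and audio_fr.mp3, and passing slow=True would produce slower speech in both files.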