How to Use CasaTunes Text-To-Speech (TTS)

Overview

CasaTunes Text-To-Speech requires CasaTunes v5.00.190326 or later.

You can use CasaTunes' TTS support to send custom text messages to your CasaTunes system, have CasaTunes automatically convert your message into an audio file, and play the audio file in any combination of rooms in your home.

To use CasaTunes TTS, you need an application, typically a home automation program, that can invoke a CasaTunes REST API. While this may seem daunting, after reading this Tech Note you should be equipped to add this capability to your home.

To invoke a CasaTunes TTS command, your application must be able to make HTTP calls (that is, invoke REST methods). Since a command is just an HTTP call, you can test it by entering the command in your browser. For example, on an iOS device you can enter the command in the Safari browser and save the link to your home screen by pressing the "send" icon and selecting "Add to Home Screen". To invoke the command, just press the button (or link).

To invoke a CasaTunes TTS command on a desktop machine, you can use a command-line tool such as curl.

Using CasaDev to test your message

CasaTunes provides the CasaDev tool for discovering and experimenting with the CasaTunes REST APIs. The CasaDev tool is installed by default. To start the CasaDev tool and test your TTS commands:

Playing a message

To play a text message:

Playing a message using your Home Automation system

When you click "Try it out!" in the previous step, the CasaDev tool displays the Request URL that it used to run your command. To program your Home Automation system, copy this value and paste it into your Home Automation system as the command to invoke to play your message.
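If your Home Automation system can run scripts, the copied Request URL can also be invoked from code. The following is a minimal Python sketch, not an official CasaTunes client. It assumes the command is a plain HTTP GET (as the browser example earlier suggests). The URL shown is a placeholder, and the function names are made up for illustration; substitute the Request URL that CasaDev displays for your system.

```python
import urllib.request

# Placeholder only: paste the Request URL shown by CasaDev here.
REQUEST_URL = "http://casaserver.example/placeholder-request-url"


def build_request(request_url: str) -> urllib.request.Request:
    """Wrap the copied Request URL in an HTTP GET request object."""
    return urllib.request.Request(request_url, method="GET")


def play_message(request_url: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urllib.request.urlopen(build_request(request_url), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    print(play_message(REQUEST_URL))
```

Keeping the URL exactly as CasaDev produced it avoids having to guess at parameter names or encoding.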

Adding delays before and after your message

Depending on the audio hardware you are using, you may need to delay playing your message until all the amplifiers have turned on. Similarly, if the end of your message is being clipped, you may want to delay turning off your amplifiers after the message has played. To do this, enter a value for the preWait and postWait parameters, respectively. Each value is a whole number of seconds to wait (no decimal points).
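As an illustration, the sketch below adds preWait and postWait to an existing Request URL in Python. It assumes both parameters are passed as query-string values, which is how they appear when set through CasaDev. The host, the path, and the "text" parameter in the usage example are placeholders rather than the real CasaTunes endpoint, and rejecting negative values is an assumption of this sketch.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def with_delays(request_url: str, pre_wait: int, post_wait: int) -> str:
    """Return request_url with preWait and postWait set to whole seconds."""
    for name, value in (("preWait", pre_wait), ("postWait", post_wait)):
        # Whole seconds only; bool is excluded because it is a subclass of int.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative whole number of seconds")

    parts = urlsplit(request_url)
    # Keep every existing parameter except earlier preWait/postWait values.
    query = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("preWait", "postWait")
    ]
    query += [("preWait", str(pre_wait)), ("postWait", str(post_wait))]
    # quote_via=quote encodes spaces as %20 rather than "+".
    return urlunsplit(parts._replace(query=urlencode(query, quote_via=quote)))


if __name__ == "__main__":
    # Placeholder URL; use the Request URL from CasaDev instead.
    print(with_delays("http://casaserver.example/tts?text=Dinner%20is%20ready", 2, 1))
```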

Selecting a language and gender

You can play back your messages in any language supported by Google's TTS technology. A complete list of supported languages is available at: https://cloud.google.com/speech-to-text/docs/languages

To select a specific language, enter the associated language code as the value of the languageCode parameter.

Similarly, to select a specific gender, enter one of the following values as the gender: MALE, FEMALE, NEUTRAL.
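The short Python sketch below builds the language and gender portion of a query string and rejects any gender other than the three values listed above. The languageCode name comes from this Tech Note. The name "gender" for the second parameter is an assumption, so confirm the exact name in CasaDev before relying on it.

```python
from urllib.parse import quote, urlencode

ALLOWED_GENDERS = ("MALE", "FEMALE", "NEUTRAL")


def language_query(language_code: str, gender: str) -> str:
    """Build the languageCode/gender part of the query string."""
    gender = gender.upper()
    if gender not in ALLOWED_GENDERS:
        raise ValueError(f"gender must be one of {', '.join(ALLOWED_GENDERS)}")
    # "gender" as the parameter name is an assumption; check it in CasaDev.
    return urlencode({"languageCode": language_code, "gender": gender}, quote_via=quote)


if __name__ == "__main__":
    print(language_query("en-GB", "female"))
```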

Selecting a voice to use

Selecting a voice is optional, but a voice based on Google's WaveNet technology will generally give the most natural-sounding speech.

To list the available voices:

When selecting a voice, do not specify the gender, as the gender is already part of the voice.
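One way to enforce this rule in a script is to drop the gender whenever a voice has been chosen. The Python sketch below does so on a dictionary of parameters. The key name "voice" and the function name are hypothetical, since this Tech Note does not give the exact parameter name, and the voice name in the usage example is only an illustration; check both in CasaDev.

```python
def apply_voice_rule(params: dict, voice_key: str = "voice") -> dict:
    """Return a copy of params without "gender" when a voice is selected.

    voice_key is a hypothetical parameter name; confirm the real one in CasaDev.
    """
    cleaned = dict(params)
    if cleaned.get(voice_key):
        # The voice already implies a gender, so do not send one.
        cleaned.pop("gender", None)
    return cleaned


if __name__ == "__main__":
    # Illustrative values only.
    print(apply_voice_rule({"languageCode": "en-US", "gender": "MALE", "voice": "en-US-Wavenet-D"}))
```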

Using SSML

For even more control over your messages, you can write them in Speech Synthesis Markup Language (SSML). To learn about creating custom SSML messages, refer to the following document: https://cloud.google.com/text-to-speech/docs/ssml

When using SSML, you need to add the preWait and postWait delays to the SSML script itself, by inserting <break time="3s"/> statements before and after your message. You can specify any number of seconds (3 in this example).
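As a sketch of what such a script might look like, the Python helper below wraps a plain-text message in a speak element and places a break before and after it. The function name and its arguments are illustrative and are not part of CasaTunes.

```python
from xml.sax.saxutils import escape


def ssml_with_breaks(text: str, pre_wait: int = 0, post_wait: int = 0) -> str:
    """Wrap text in <speak> with optional leading and trailing pauses (seconds)."""
    parts = ["<speak>"]
    if pre_wait > 0:
        parts.append(f'<break time="{pre_wait}s"/>')
    # Escape &, < and > so the message cannot break the markup.
    parts.append(escape(text))
    if post_wait > 0:
        parts.append(f'<break time="{post_wait}s"/>')
    parts.append("</speak>")
    return "".join(parts)


if __name__ == "__main__":
    print(ssml_with_breaks("Dinner is ready", pre_wait=3, post_wait=2))
```

The resulting string still has to be URL-encoded when it is sent as part of a Request URL.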

Note