What are SSML tags in Text To Speech

What are SSML tags in Text To Speech (TTS), and how do I use them?

Speech Synthesis Markup Language (SSML) tags give you finer control over how a text-to-speech voice reads your text, with precise instructions for pronunciation, emphasis, timing, and more. A few good reasons for using SSML are:

  1. Making it easier for non-native speakers to read your copy
  2. Forcing specific word emphasis in English
  3. Controlling intonation and punctuation

You can use Text-to-Speech (TTS) to hear how your copy will actually sound. Here on Voiceley.com, you can use a few SSML tags to control voice inflection and readability.

How to use SSML tags when creating a voiceover

To create content with SSML tags, you will need to sign up for an account on Voiceley.com and start to use our online Text-to-Speech (TTS) voice editor.

The rest of this guide is written as if you’re using our TTS editor in ‘Text’ mode, where you write the text that the TTS engine will read out loud after processing it.

Step 1: Create an account on Voiceley.com and log in. On your dashboard, you will see the TTS editor.

Step 2: Select your language in the ‘Language’ drop-down menu, then choose a voice.

Step 3: Add some text into the editor and highlight it.

Step 4: Choose an SSML tag. Above the TTS editor there is a drop-down menu that lists the available SSML tags. Select the one you want to apply to the highlighted text.

Step 5: When you’re happy with the result, click ‘Synthesize’ and save the audio file.

Examples of how to use SSML tags on Voiceley.com

<break time="1s"/> – Insert a 1-second pause

<prosody rate="slow"> – Slow down the speaking pace

<prosody rate="fast"> – Speed up the speaking pace

<prosody volume="soft"> – Lower the volume

<prosody volume="medium"> – Set the volume to a medium level

<prosody volume="loud"> – Raise the volume

<emphasis> – Add emphasis to a word or phrase

<say-as interpret-as="ordinal"></say-as> – Read a number as an ordinal (for example, 5 as “fifth”)

<say-as interpret-as="time"></say-as> – Read the text as a time

And many more…
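If you prepare your text outside the editor, the tags above can be combined into one piece of markup. The following is a minimal Python sketch, not a Voiceley feature: the helper names `tag` and `plain` are made up for illustration, and wrapping everything in a `<speak>` root follows the SSML standard but is an assumption about what the editor expects.

```python
from xml.sax.saxutils import escape, quoteattr


def tag(name, text="", **attrs):
    """Return one SSML element as a string (hypothetical helper).

    Attribute values are quoted with straight quotes. Empty text
    produces a self-closing tag such as <break time="1s"/>.
    """
    attr_str = "".join(f" {key}={quoteattr(value)}" for key, value in attrs.items())
    if not text:
        return f"<{name}{attr_str}/>"
    return f"<{name}{attr_str}>{text}</{name}>"


def plain(text):
    """Escape &, < and > so ordinary copy cannot break the markup."""
    return escape(text)


# Assumption: the engine accepts a standard <speak> root element.
ssml = tag(
    "speak",
    plain("Welcome back.")
    + tag("break", time="1s")
    + tag("prosody", plain("Please listen carefully."), rate="slow")
    + " "
    + tag("emphasis", plain("This part matters.")),
)

print(ssml)
```

Building the markup from small helpers keeps every opening tag paired with its closing tag, which is the most common source of mistakes when typing SSML by hand.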

Why should I care about SSML tags in Text-to-Speech?

SSML tags help you make sure your text is read in the way that best fits what you want to say. If you are unsure how your content should be read out, try a few SSML tags to find the right style.

Consider the <say-as interpret-as="ordinal"> tag, which reads a number as its corresponding ordinal.

For example:

“It was the <say-as interpret-as="ordinal">5</say-as> time that she ran in the Boston Marathon.”

This would sound like “It was the fifth time that she ran in the Boston Marathon,” whereas without the tag it could be read as “It was the five time that she ran in the Boston Marathon.”
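If you generate scripts programmatically, the ordinal markup can be produced by a small function. This is only a sketch: `say_as_ordinal` is a hypothetical helper name, and how the number is finally spoken depends on the voice you select.

```python
def say_as_ordinal(number):
    """Wrap a whole number in a say-as tag so it is read as an ordinal.

    Hypothetical helper for illustration; int() rejects values that
    are not whole numbers before they reach the markup.
    """
    return f'<say-as interpret-as="ordinal">{int(number)}</say-as>'


sentence = (
    f"It was the {say_as_ordinal(5)} time that she ran in the Boston Marathon."
)

print(sentence)
```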

Not every voice on Voiceley supports every SSML tag, and the voice will not work if you use incorrect markup.

After you select a voice, you’ll see a list of the SSML tags that can be used with it.
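Because incorrect markup stops the voice from working, it can help to check your text before you synthesize it. The sketch below uses Python's standard XML parser to test whether a fragment is well formed. It is an assumption-laden illustration: `is_well_formed` is a hypothetical helper, the `<speak>` wrapper follows the SSML standard, and the check says nothing about whether a particular voice supports a tag.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML inside a <speak> root.

    Hypothetical helper: it checks syntax only, not whether the
    selected voice supports the tags used.
    """
    try:
        ET.fromstring(f"<speak>{fragment}</speak>")
    except ET.ParseError:
        return False
    return True


good = 'Wait for it.<break time="1s"/>Done.'
# Curly quotes pasted from a word processor are not valid attribute quotes.
curly = "Wait for it.<break time=”1s”/>Done."
# The opening prosody tag is never closed.
unclosed = '<prosody rate="slow">Hello'

for sample in (good, curly, unclosed):
    print(is_well_formed(sample), sample)
```

Two frequent culprits are typographic (curly) quotes around attribute values and tags that are opened but never closed. Both are rejected by this check.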
