Best Practices for Using HAQM Polly Voices
You can use HAQM Polly voices in your skill, as described in the SSML Reference. Follow these guidelines to help ensure a good experience for your customers.
The following locales are supported for Alexa: en-US, en-GB, en-IN, en-AU, en-CA, de-DE, es-ES, hi-IN, it-IT, ja-JP, fr-FR.
- Considerations for use of HAQM Polly voices
- Use the voice tag
- Use the lang tag
- Technical specifications
- Best practices
- HAQM Polly voices currently available to Alexa
- Certification
- Related topics
Considerations for use of HAQM Polly voices
- HAQM Polly voices are especially useful for multi-character story and gaming skills, where your skill can use a different voice for each character.
- Your skill can particularly benefit from HAQM Polly voices if its content is gender-specific, for example if you want to deliver a response in a male voice.
- Use HAQM Polly voices in any scenario in which multiple voices improve the interactivity and customer experience of your skill.
- Apply the same voice design principles as you do when constructing a typical Alexa response: be brief, speak and write naturally, prompt with guidance for the user, use conversation markers, and so forth. See Alexa Design Guide.
- Test how your responses sound in the Alexa Simulator on the developer console, just as you would with any other SSML tags.
Use the voice tag
Refer to Speech Synthesis Markup Language Reference With HAQM Polly Voices for documentation about how to add HAQM Polly voices to your skills.
- The voice tag supports all SSML tags supported by Alexa Skills Kit, including lang, say-as, break, and prosody. The exception is the speechcons tag, which is not supported with voice.
- Nest any other SSML tags that you use inside the voice tags, rather than the other way around. Note that voice tags can also be nested inside other voice tags.
In the following example, the Kendra voice speaks English and pronounces the foreign language names imperfectly.
<speak>
<voice name="Kendra">
I am going to spell out Hello as <say-as interpret-as="spell-out">hello</say-as>. Now and then, I speak <lang xml:lang="de-DE">Deutsch</lang> and <lang xml:lang="fr-FR">français</lang> and <lang xml:lang="es-ES">español</lang>.
</voice>
</speak>
- You can use the voice tag to construct your entire response with an HAQM Polly voice, or as an accompaniment to an SSML audio file.
- The name value of the voice tag is case-sensitive, so use standard name casing, such as "Matthew".
- Just as with standard SSML TTS, consider combining the voice tag with other SSML tags supported by Alexa to get special effects:
<speak>
<voice name="Matthew">Can you call me at <say-as interpret-as="digits">8675309</say-as>?</voice>
<voice name="Kendra">Okay, let's be mindful and take a deep breath. <break time="3s"/> Now don't we feel better? </voice>
</speak>
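The nesting rule in this section can be checked mechanically before a response is sent. The following is a minimal sketch in Python, assuming the response is well-formed XML. The function name misnested_voice_tags is hypothetical and is not part of any Alexa SDK. It reports every tag, other than speak or voice, that directly wraps a voice tag.

```python
import xml.etree.ElementTree as ET
from typing import List

# Tags that may contain a voice tag, following the guidance in this section:
# the speak root and other voice tags.
ALLOWED_PARENTS = {"speak", "voice"}


def misnested_voice_tags(ssml: str) -> List[str]:
    """Return the names of parent tags that wrongly wrap a voice tag.

    Hypothetical helper for illustration; assumes `ssml` is well-formed XML.
    """
    root = ET.fromstring(ssml)
    problems = []
    for parent in root.iter():
        for child in parent:
            if child.tag == "voice" and parent.tag not in ALLOWED_PARENTS:
                problems.append(parent.tag)
    return problems
```

An empty list means that every voice tag sits directly under speak or under another voice tag. The check does not cover SSML that uses the amazon: prefix, because the standard XML parser rejects an undeclared prefix.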
Use the lang tag
The lang tag can be used on its own or nested in the voice tag to control how HAQM Polly voices speak. Use the lang tag with a corresponding voice of the same language for the best results, as shown here. See lang tag.
<speak>
<voice name="Kendra">
I am going to spell out Hello as <say-as interpret-as="spell-out">hello</say-as>. Now and then, I speak <voice name="Hans"><lang xml:lang="de-DE">Deutsch</lang></voice> and <voice name="Celine"><lang xml:lang="fr-FR">français</lang></voice> and <voice name="Enrique"><lang xml:lang="es-ES">español</lang></voice>.
</voice>
</speak>
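If a skill builds its responses in code, the voice and lang pairing can be kept in one place. The following is a minimal Python sketch. The mapping contains only the three voice and locale pairs used in the example above, and the names VOICE_FOR_LOCALE and in_language are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

# Voice/locale pairs taken from the example in this section. Extend the
# mapping only with voices that the SSML reference lists as supported.
VOICE_FOR_LOCALE = {
    "de-DE": "Hans",
    "fr-FR": "Celine",
    "es-ES": "Enrique",
}


def in_language(locale: str, text: str) -> str:
    """Wrap `text` in a voice tag and a lang tag for the same locale."""
    voice = VOICE_FOR_LOCALE[locale]  # raises KeyError for unmapped locales
    return (
        f"<voice name={quoteattr(voice)}>"
        f"<lang xml:lang={quoteattr(locale)}>{escape(text)}</lang>"
        "</voice>"
    )
```

For example, in_language("de-DE", "Deutsch") produces the same Hans fragment that appears in the SSML above. The text is XML-escaped so that characters such as an ampersand do not break the response.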
Technical specifications
Refer to Speech Synthesis Markup Language Reference With HAQM Polly Voices for documentation about how to add HAQM Polly voices to your skills.
- Alexa skill developers have a limit of 10,000 characters for a TTS (text-to-speech) response in their skill. With 10,000 characters, you can generate up to approximately 10 minutes of continuous audio with HAQM Polly and Alexa voices. However, responses should generally be brief for the best customer experience. See the one-breath test in the Alexa Design Guide.
- Optionally, adjust for acoustic differences among Alexa and HAQM Polly voices. Because they are different voices, they may vary in pitch, rate, timbre, and volume. You can adjust for these differences with SSML tags, and should consider doing so to provide a customer experience consistent with the use cases in your skill. For example:
- Pitch:
<speak>I can speak in a <prosody pitch="high">higher pitched voice</prosody>, or I can speak <prosody pitch="low">in a lower pitched voice</prosody></speak>
- Rate:
<speak>I can speak <prosody rate="x-slow">really slowly</prosody>, or I can speak <prosody rate="x-fast">really fast</prosody></speak>
- Volume:
<speak>I can also speak <prosody volume="x-loud">very loudly</prosody>, or I can speak <prosody volume="x-soft">very quietly</prosody>. </speak>
- Whisper:
<speak>I have a secret to tell you, I will whisper it to you. <amazon:effect name="whispered"><prosody rate="x-slow" volume="loud">I am not human.</prosody></amazon:effect> Can you believe it?</speak>
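The 10,000-character limit described in this section can be guarded in skill code before a response is returned. The following is a minimal Python sketch. It assumes that every character in the response, markup included, counts toward the limit, which is a conservative guess and is not confirmed by this page. The names TTS_CHARACTER_LIMIT and check_tts_length are hypothetical.

```python
TTS_CHARACTER_LIMIT = 10_000  # limit stated in this section


def check_tts_length(ssml: str, limit: int = TTS_CHARACTER_LIMIT) -> int:
    """Return the number of characters left; raise if over the limit.

    Assumption: every character, markup included, counts toward the limit.
    """
    remaining = limit - len(ssml)
    if remaining < 0:
        raise ValueError(
            f"Response exceeds the limit by {-remaining} characters"
        )
    return remaining
```

A response that passes this check can still be far too long for a good customer experience, so treat the limit as an upper bound and not as a target.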
Best practices
- The initial introduction of your skill must use Alexa's default, in-country voice. This guideline helps ensure that your skill customers are clearly informed when they are interacting with a skill as opposed to Alexa directly.
- Remember that HAQM Polly and Alexa are separate, and not all HAQM Polly features are available within Alexa, particularly some HAQM Polly SSML features. Ensure that you only use supported features in your skill.
- Speechcons used in Alexa skills can only use Alexa's voice.
- Alexa skills that use HAQM Polly voices must adhere to all other content policies and the Alexa Skills Kit developer contract.
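The first guideline above can be built into the code that assembles a response. The following is a minimal Python sketch, assuming the skill composes its SSML as a string. The function name build_response is hypothetical. The introduction is placed outside any voice tag, so it is spoken in Alexa's default voice, and only the body is wrapped in an HAQM Polly voice.

```python
from xml.sax.saxutils import escape, quoteattr


def build_response(intro: str, voice_name: str, body: str) -> str:
    """Speak `intro` in Alexa's default voice, then `body` in `voice_name`."""
    return (
        "<speak>"
        f"{escape(intro)} "
        f"<voice name={quoteattr(voice_name)}>{escape(body)}</voice>"
        "</speak>"
    )
```

For example, build_response("Welcome to Story Time.", "Matthew", "Once upon a time.") keeps the welcome sentence in the default voice and hands only the story text to Matthew.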
HAQM Polly voices currently available to Alexa
You can use any of the supported HAQM Polly voices in your Alexa responses, for part or all of the response. Be mindful of the customer experience if you combine voices from different locales in your skill responses.
Certification
To comply with Alexa skill policies, ensure that your skill doesn't expose the HAQM-assigned name of Polly voice(s) to users.
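One way to catch accidental exposure before submission is to scan each response for voice names that also occur in the spoken text. The following is a rough Python sketch of such a check, not an official certification test. It assumes well-formed XML, inspects only the spoken text (not cards or display text), and will also flag a character who happens to share a name with a voice. The function name exposed_voice_names is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from typing import Set


def exposed_voice_names(ssml: str) -> Set[str]:
    """Return voice names that also appear in the spoken text.

    Hypothetical lint for illustration; assumes well-formed XML.
    """
    root = ET.fromstring(ssml)
    # Collect the name attribute of every voice tag in the response.
    names = {v.get("name") for v in root.iter("voice") if v.get("name")}
    # Join all text nodes, which is what the customer actually hears.
    spoken = " ".join(root.itertext())
    return {
        name
        for name in names
        if re.search(rf"\b{re.escape(name)}\b", spoken, flags=re.IGNORECASE)
    }
```

An empty set means that none of the voice names used in the response is spoken aloud. Any hit should be reviewed by hand.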
Related topics
Last updated: Apr 30, 2024