SSML

Control the avatar's speech, animations, and more

Overview

SSML (Speech Synthesis Markup Language) is an XML-like language that you can use to configure your avatar on the fly. It covers everything from adjusting speech to controlling your avatar's animations and even changing their outfit.

Our SSML commands usually begin with a trl- prefix and can be issued via chat messages sent by either the user or the avatar. For example, you could adjust your LLM's prompt so that it sends particular SSML messages in response to certain natural-language input, or you could hook up a button in your application to dispatch an SSML command using the sendMessageToAvatar() function in our SDK.
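
As a rough sketch of the button approach, the snippet below prefixes some speech with an SSML command and dispatches it through sendMessageToAvatar(). Only that function name comes from our SDK; the AvatarClient interface, the sendSsml helper and the recording client are assumptions made so the example is self-contained, so check the SDK reference for the real object and signature.

```typescript
// Hypothetical minimal shape of the SDK object; only the method name
// sendMessageToAvatar is taken from the documentation.
interface AvatarClient {
  sendMessageToAvatar(message: string): void;
}

// Sends an SSML command, optionally followed by text for the avatar to speak.
function sendSsml(client: AvatarClient, ssml: string, speech: string = ""): string {
  const message = speech.length > 0 ? `${ssml} ${speech}` : ssml;
  client.sendMessageToAvatar(message);
  return message;
}

// Stand-in client that records messages instead of talking to an avatar.
const sentMessages: string[] = [];
const fakeClient: AvatarClient = {
  sendMessageToAvatar: (message: string): void => {
    sentMessages.push(message);
  },
};

// In an application, this call would sit inside a button's click handler.
sendSsml(fakeClient, "<trl-config speechSpeed='20' />", "You asked me to speak quicker!");
```

Keeping the dispatch behind a small helper like this makes it easy to swap the recording client for the real SDK object later.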

You can interleave SSML tags with regular speech to affect how a sentence is spoken, e.g.

Let me think <trl-break duration="1.0" /> Ah yes, I remember now - <trl-config speechSpeed='20' /> you asked me to speak quicker!

When SSML tags are sent, we automatically filter them out where appropriate, e.g. in our client's chat component, or in setups where we have full control over the LLM-to-TTS pipeline (to prevent the avatar from trying to 'speak' the tag).
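
If you render chat text in your own UI, or run a pipeline we do not control, you may need to do similar filtering yourself. The sketch below is one possible approach, not the filter we use internally: stripTrlTags and its regular expression are illustrative names, and it assumes the tags are self-closing and that attribute values never contain a '>' character.

```typescript
// Matches self-closing trl-* tags such as <trl-break duration="1.0" />.
// Assumption: attribute values never contain the '>' character.
const TRL_TAG = /<trl-[a-z-]+\b[^>]*\/>/gi;

// Removes trl-* tags and collapses the whitespace they leave behind.
function stripTrlTags(text: string): string {
  return text.replace(TRL_TAG, " ").replace(/\s+/g, " ").trim();
}

const displayText = stripTrlTags(
  "Let me think <trl-break duration=\"1.0\" /> Ah yes, I remember now"
);
// displayText is "Let me think Ah yes, I remember now"
```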

Under Construction: We're still working on updating this section of the documentation.

Avatar SSML

This section documents SSML tags that relate to your avatar's configuration.

Index

The available avatar SSML commands are listed below, with compatibility indicators for each avatar type (IBA, 3D, VBA). Each command is documented in its own section further down.

SSML Tag	IBA	3D	VBA
<trl-anim />
<trl-break />
<trl-config />
<trl-state />
<trl-content />
<trl-morph />
<trl-speak />
<trl-play-background-audio />
<trl-stop-background-audio />
<trl-repeat />
<trl-start-repeat />

<trl-anim />

Allows you to combine animations to animate the avatar's head, face, mouth, etc.

Auxiliary and Core animations

Animations can be played in two different modes: aux and core. Animations played with the core type influence the entire body, and only one core animation can play at a time. You may find it helpful to think of a core animation as a 'base' animation that defines what the avatar's body is doing overall. Issuing two separate commands with the core animation type queues the animations, blending from the end of one into the start of the other.

Aux animations are additive and are not queued; instead, they are layered on top of the animations that are already playing. Core animations usually affect the entire body, whereas aux animations are meant to target a few specific bones to achieve a localised animation. For example, a core animation can make an avatar stand idle (moving or swaying lightly) while an aux animation that targets just the neck and head is layered on top to make them nod.

Animation IDs

Every animation is identified by an Animation ID, which you pass as the tag's id parameter.

Example usage

Head movements (small, large, shoulder translations)

<trl-anim type='aux' id='ANIM_NAME' duration='1.8' blendStart='0.8' blendEnd='1.0'/>

Core Body

<trl-anim type='core' id='coreNoddingSmall' duration='4' />

Arm & hand movements

<trl-anim type='aux' id='FingersSqueeze' blendStart='1' blendEnd='2'/>

Head & upper body movements

<trl-anim type='aux' id='noddingSmall' duration='3' blendStart='0.5' blendEnd='0.5'/>

Facial Expressions

<trl-anim type='aux' id='mouthOpen' duration='3.5' blendStart='0.5' blendEnd='0.5'/>

Viseme mouth shapes

<trl-anim type='aux' id='viseme' duration='4.0' blendStart='0.5' blendEnd='0.5' />

Parameter	Description
`type`	Animation type - 'aux' or 'core'
`id`	Animation name
`duration`	Animation duration in seconds
`blendStart`	Blend-in time in seconds
`blendEnd`	Blend-out time in seconds
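
If you generate animation commands in application code, a small builder can keep the parameters consistent. The following is a minimal sketch that uses only the parameters from the table above; animTag and AnimOptions are illustrative names rather than part of our SDK, and sending a core and an aux tag together in one message is assumed to work in the same way as the combined examples elsewhere on this page.

```typescript
type AnimType = "aux" | "core";

interface AnimOptions {
  type: AnimType;
  id: string;          // Animation ID
  duration?: number;   // seconds
  blendStart?: number; // blend-in time, seconds
  blendEnd?: number;   // blend-out time, seconds
}

// Builds a <trl-anim /> tag, skipping parameters that were not supplied.
function animTag(options: AnimOptions): string {
  const attrs: string[] = [`type='${options.type}'`, `id='${options.id}'`];
  if (options.duration !== undefined) attrs.push(`duration='${options.duration}'`);
  if (options.blendStart !== undefined) attrs.push(`blendStart='${options.blendStart}'`);
  if (options.blendEnd !== undefined) attrs.push(`blendEnd='${options.blendEnd}'`);
  return `<trl-anim ${attrs.join(" ")} />`;
}

// A core animation with an aux head nod layered on top of it.
const coreWithNod =
  animTag({ type: "core", id: "coreNoddingSmall", duration: 4 }) +
  animTag({ type: "aux", id: "noddingSmall", duration: 3, blendStart: 0.5, blendEnd: 0.5 });
```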

<trl-break />

Introduces a delay in speech.

Example usage

Break duration example

<trl-break duration="1.0" />

Parameter	Description
`duration`	Length of the pause in seconds
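
When speech is assembled programmatically, pauses can be inserted between fragments with a helper along these lines. This is only a sketch: breakTag and joinWithPauses are hypothetical names, and rejecting negative durations is our assumption about sensible input rather than documented behaviour.

```typescript
// Builds a <trl-break /> tag; seconds is the pause length.
function breakTag(seconds: number): string {
  if (!isFinite(seconds) || seconds < 0) {
    throw new RangeError("duration must be a non-negative number of seconds");
  }
  return `<trl-break duration="${seconds}" />`;
}

// Joins sentence fragments, inserting the same pause between each pair.
function joinWithPauses(parts: string[], seconds: number): string {
  return parts.join(` ${breakTag(seconds)} `);
}

const pausedLine = joinWithPauses(["Let me think", "Ah yes, I remember now"], 1.5);
```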

<trl-config />

Example Usage

Set the speed at which speech is spoken.

<trl-config speechSpeed='20' />

Toggle subtitles on/off

<trl-config subtitlesEnabled="true" />

Interrupt avatar talking with new avatar speech

<trl-config interruptAvatarTalkingWithNewAvatarSpeech="true" />

Ignore user speech while the avatar is talking

<trl-config ignoreUserSpeechWhileAvatarTalking="true" />Whilst the avatar is saying this sentence, the user should be ignored.<trl-config ignoreUserSpeechWhileAvatarTalking="false" />
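
The enable-then-disable pattern above is easy to get wrong by hand, for instance by forgetting the closing tag. A tiny wrapper such as the one below (ignoreUserDuring is a hypothetical helper name) keeps the pair together:

```typescript
// Wraps a sentence so user speech is ignored only while it is being spoken.
function ignoreUserDuring(sentence: string): string {
  const on = '<trl-config ignoreUserSpeechWhileAvatarTalking="true" />';
  const off = '<trl-config ignoreUserSpeechWhileAvatarTalking="false" />';
  return `${on}${sentence}${off}`;
}

const protectedLine = ignoreUserDuring("Please listen to this safety notice in full.");
```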

Change the TTS voice

<trl-config TTSVoice='Joanna' />
<trl-config TTSService="polly" TTSVoice="MatthewNTTS" />
<trl-config TTSService="azure" TTSVoice="ur-PK-UzmaNeural" AzureLanguage="ur-PK" />
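
To switch voices from application code, the attributes shown above can be assembled with a helper like this one. It is a sketch under assumptions: voiceConfigTag and the TtsService type are illustrative names, the type lists only the two services that appear in these examples, and AzureLanguage is emitted only for the azure service because that is the only place the examples use it.

```typescript
// Services that appear in the examples; others may be available.
type TtsService = "polly" | "azure";

// Builds a voice-switching <trl-config /> tag.
// azureLanguage is only emitted when the service is "azure".
function voiceConfigTag(service: TtsService, voice: string, azureLanguage?: string): string {
  const attrs: string[] = [`TTSService="${service}"`, `TTSVoice="${voice}"`];
  if (service === "azure" && azureLanguage !== undefined) {
    attrs.push(`AzureLanguage="${azureLanguage}"`);
  }
  return `<trl-config ${attrs.join(" ")} />`;
}

const urduVoice = voiceConfigTag("azure", "ur-PK-UzmaNeural", "ur-PK");
const pollyVoice = voiceConfigTag("polly", "MatthewNTTS");
```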

Make the avatar look at a point and animate

<trl-config lookat='5,0,0' enable='true' speed='4' wait='2.5' returnSpeed='3.5' absolute='false' /><trl-anim type='aux' id='translateRotateLeftSmall' duration='2.5' />

Reload the avatar config

<trl-config reloadConfig="true" />

Assign key to app or character

<trl-config type="App" key="example-key" value="example-value" />
<trl-config type="Character" key="example-key" value="example-value" />

Enable VNS animations

<trl-config enableVNSAnims="true" />

Enable auto animations

<trl-config enableAutoAnims="true" />

Reset session

<trl-config resetSession="true" />

Fullscreen mode

<trl-config FullscreenMode="true" />

<trl-state />

Example Usage

Enable listening state

<trl-state enableListening="true" />

<trl-content />

Example Usage

Queue command example

<trl-content queue="true" /> Could you repeat that?

Show Background Content

<trl-content ShowBGScreen="true" />

Background Videos

<trl-content BgScreenUrl="URL" />
<trl-content BgScreenUrl="https://upload.wikimedia.org/wikipedia/commons/thumb/c/c1/Pacifica_Pier_CA.jpg/1920px-Pacifica_Pier_CA.jpg" />

Background Images

<trl-content BgScreenUrl="https://upload.wikimedia.org/wikipedia/commons/thumb/c/c1/Pacifica_Pier_CA.jpg/1920px-Pacifica_Pier_CA.jpg" />

Foreground Videos

<trl-content position="ScreenAngledSmallLeft" fade="false" speed="1.0" fadetime="0.2" showMovingScreen="true" /><trl-content screen="https://storage.googleapis.com/gtv-videos-bucket/sample/ForBiggerMeltdowns.mp4" />

Foreground Images

<trl-content position="ScreenAngledSmallLeft" fade="false" speed="1.0" fadetime="0.2" showMovingScreen="true" /><trl-content screen="https://jooinn.com/images/brighton-pier-3.jpg" />

Background Color

<trl-content r="0" g="0" b="0" a="0" />

Transitions

<trl-content position="DefaultCenter" fade="false" speed="1.0" fadetime="0.2" showMovingScreen="true" /><trl-content position="ScreenFlatLargeAvatarSmallLeft" fade="false" speed="1.0" fadetime="0.2" showMovingScreen="true" />

Avatar Location

<trl-content position="ANY" location="0.1,-0.1,-2" rotation="0,20,0" speed="1" />

Foreground YouTube

<trl-content screen="https://www.youtube.com/embed/CYXKi2XF5pM?loop=1&autoplay=1&playlist=CYXKi2XF5pM" />

Lighting

<trl-content lightingId="2" />

Lighting Tint

<trl-content lightingId="1" TintColor="1.0,1.0,1.0,1.0" />

Lighting With Blend

<trl-content lightingId="7" lightblending="true" transitionSpeed="10" />

Lighting With Fade

<trl-content lightingId="1" fade="true" fadeInTime="1" darkTime="2" fadeOutTime="1" />

Skylight Intensity

<trl-content lightingId="16" skylightIntensity="0.2" lightblending="true" blendingSpeed="1" />

Screen Intensity

<trl-content screenIntensity="0.4" />

Change Avatar

<trl-content character="amanda" sex="female" />

Subtitles

<trl-content subtitlesEnabled="true" />

<trl-morph />

Manipulates the avatar's morph targets (blend shapes).

Example Usage

Blend shape manipulation

<trl-morph id="tongueOut" min="0" max="1.0" speed="1" wait="2" />

<trl-start-routine />

Example Usage

This tag instructs the avatar to start a routine, e.g. a dance.

<trl-start-routine id='danceOne' fadeIn='3' weight='0.5' />

Parameter	Description
`id`	The name of the animation
`fadeIn`	Fade-in time for the animation in seconds
`loop`	Whether the animation should be looped (default: true)
`weight`	Animation weight (default: 1.0)

<trl-stop-routine />

Example Usage

This tag will stop a currently playing routine animation.

<trl-stop-routine />

<trl-speak />

Example Usage

This tag makes the avatar speak some text.

<trl-speak text='Hello. I am an avatar' />

<trl-play-background-audio />

Example Usage

This tag allows the addition of a background audio file (played on loop).

<trl-play-background-audio audio='https://trulience.com/docs/image/withoutasong.mp3' volume='0.5' />

<trl-stop-background-audio />

Example Usage

This tag stops any background audio file that is currently playing.

<trl-stop-background-audio />

<trl-repeat />

Repeat the previously spoken text.

Under Construction: We're still working on updating this section of the documentation.

<trl-start-repeat />

Restricts the repeat to only part of the text.

Under Construction: We're still working on updating this section of the documentation.

Provider specific SSML tags

You can use TTS provider-specific SSML tags in the input to gain more control over the generated speech. To use them, the input must be SSML wrapped in a "speak" tag; without the "speak" tag, provider-specific SSML tags are not handled correctly. For example, to slow down part of the text, you would send:

AWS Polly

<speak>For dramatic purposes, you might wish to <prosody rate="30%">slow up the speaking rate of your text.</prosody></speak>

Sending only the following as input will not work as expected:

For dramatic purposes, you might wish to <prosody rate="30%">slow up the speaking rate of your text.</prosody>

Microsoft Azure

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><voice name="en-US-JennyNeural"><prosody rate="+30.00%">Enjoy using text to speech.</prosody></voice></speak>

Sending only the following as input will not work as expected:

<prosody rate="+30.00%">Enjoy using text to speech.</prosody>

Speak tag

If the speak tag is not present, we assume the input does not use any service-specific SSML tags and wrap the input in our own tags. If you use service-specific SSML tags without the speak tag, the input is treated as incorrect, and the result may be an error or unexpected output. For example, the following input is incorrect and will not give the expected output:

<prosody pitch="high">This will make my voice high pitched</prosody>

The trl-break tag cannot be used between the opening and closing speak tags. For example, the following input is incorrect and will not produce any output:

<speak>For dramatic purposes, you might wish to <trl-break duration="1.0" /> <prosody rate="30%">slow up the speaking rate of your text.</prosody></speak>
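
The two rules in this section can be checked before input is sent. The sketch below is a simple regex-based check, not how our pipeline validates input: checkSpeechInput is a hypothetical helper, and it only recognises the provider tags used in the examples on this page (prosody and voice), so extend the list for the tags your provider supports.

```typescript
// Provider tags used in the examples on this page; extend as needed.
const PROVIDER_TAG = /<\/?(prosody|voice)\b/i;
const SPEAK_BLOCK = /<speak\b[^>]*>[\s\S]*?<\/speak>/gi;

// Returns a list of problems; an empty list means none were detected.
function checkSpeechInput(input: string): string[] {
  const problems: string[] = [];
  const speakBlocks: string[] = input.match(SPEAK_BLOCK) ?? [];

  // Rule 1: provider tags are only handled inside <speak>...</speak>.
  const outsideSpeak = input.replace(SPEAK_BLOCK, "");
  if (PROVIDER_TAG.test(outsideSpeak)) {
    problems.push("provider tag used outside <speak>");
  }

  // Rule 2: trl-break must not appear between <speak> and </speak>.
  if (speakBlocks.some((block: string) => /<trl-break\b/i.test(block))) {
    problems.push("trl-break used inside <speak>");
  }
  return problems;
}

const validInput =
  '<speak>For dramatic purposes, you might wish to <prosody rate="30%">slow up the speaking rate of your text.</prosody></speak>';
const validProblems = checkSpeechInput(validInput); // empty list
```

A check like this catches the two mistakes described above early, but it does not confirm that the markup is otherwise valid for your TTS provider.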