Get Started
Interactive Avatar Creator Dashboard
A video guide to creating and configuring interactive avatars
Introduction
Watch this video for a guided tour of the Avatar Creator Dashboard:
Appearance Tab
Customise your avatar’s appearance.
Explore our library of avatars or get in touch if you’re looking to create your own unique face.
Avatar Types
Inside the Appearance tab you can pick between four different types of avatar:
- IBA (Image Based Avatars): Realistic avatars whose appearance is derived from an image (contact us if you’re looking to create your own avatar using an image).
- 3D: Three dimensional avatars powered by WebGL.
  - Upper body 3D: Avatars that display just the upper half of their body (head, shoulders).
  - Full body 3D: Can be placed in a 3D scene and manipulated to perform actions through animations.
- VBA (Video Based Avatars): Ultra realistic avatars at an affordable price — coming soon.
Environment Tab
Customise the avatar’s environment and lighting.
Background
Choose from the following background options:
- Static image background
- Looping video background
- Solid colour background
- Fully transparent background
It’s possible to upload your own static image or looping video backgrounds.
For Upper Body 3D avatars it is also possible to select a lighting preset.
3D Environment
3D Environments are only available when your avatar’s appearance is configured to use the ‘Full Body 3D’ avatar type.
In 3D environments the lighting is defined by the scene file (.glb format). It is possible to upload your own custom scene designed in 3D modelling applications like Blender.
Brain Tab
Configure the fundamental building blocks of your avatar:
- Service provider / framework: responsible for generating text responses based on user input. For most use cases the LLM (Large Language Model) option is ideal.
- STT (Speech to Text): responsible for converting the user’s speech into text; providers vary in cost and available languages.
- TTS (Text to Speech): used to convert the framework’s text response into a spoken response. Providers vary in cost and available voices; some providers, like ElevenLabs, allow you to create your own voice using an audio clip.
These settings influence the content of your avatar’s responses, how they understand the user’s speech (e.g. accent/dialect) and how the avatar’s speech sounds (their voice).
The brain tab also lets you assign uploaded knowledge to your avatar or configure it to perform programmatic actions with function calling.
Avatar Modes
Inside the Foundations sub-tab you can choose from the following modes:
- Simple - Lets you get set up quickly: no API keys are required for your LLM/STT/TTS; instead, the avatar uses a more restricted set of providers that we pay for.
- Composite - The fully unlocked version of Simple mode; offers the most flexibility over LLM/STT/TTS with your own API keys.
- Speech to speech - Use specialised models that operate directly on speech instead of text to reduce latency and make the AI voice more expressive. This is a complete end-to-end solution that eliminates the need for separate STT/TTS configuration.
- 3rd Party AI - Integrations with third party providers such as Agora or ElevenLabs. Note that not all of our features are supported in 3rd party integrations (e.g. knowledge base, function calling).
Avatar Engine
The ‘Service Provider or framework’ dropdown allows you to choose between:
- Large Language Model (LLM): Natural language responses generated by AI / transformer neural networks in response to user input. This option allows you to set a prompt and integrates directly with our other features like knowledge and function calling. This is the most commonly selected option.
- Rest Endpoint: use your own code to provide responses to user messages; check out the REST API docs for more information, and see the sketch after this list.
- Echo back test: a testing mode that repeats what the user says.
- Google Dialogflow CX / ES Service: Construct custom dialog flows with Google’s platform.
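As a rough sketch of what the Rest Endpoint option involves, the example below stands up a minimal Express server in TypeScript that answers user messages. The route path and the field names (userMessage, reply) are placeholders, not the platform’s actual contract; the REST API docs define the real request and response shapes.

```typescript
// Minimal sketch of a custom response endpoint, assuming an Express server.
// The route path and field names (userMessage, reply) are placeholders;
// consult the REST API docs for the exact contract expected by the platform.
import express from "express";

const app = express();
app.use(express.json());

app.post("/avatar-response", (req, res) => {
  const userMessage: string = req.body.userMessage ?? "";

  // Replace this with your own logic (database lookup, rules engine, etc.).
  const reply = `You said: ${userMessage}`;

  res.json({ reply });
});

app.listen(3000, () => console.log("Avatar response endpoint listening on :3000"));
```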
Advanced Tab
Settings for more advanced use cases, like configuring authentication, wake messages or interruptions.
Advanced Settings
Encryption - assign an encryption key (default, or your own custom key) to require JWT token authentication when loading your avatar. Check out the authentication docs to find out more; a token-issuing sketch follows these settings.
Wake up message - a message that gets sent to the avatar upon connecting but isn’t shown in chat. This message is sent to the avatar as if it were spoken by the user. Useful for making it appear as though the avatar initiated the conversation; a typical value would be ‘hello’ or ‘hi’.
Enable Realtime Message Streaming - Controls whether message content is streamed in real time with partial updates, allowing live transcription and dynamic message corrections during speech.
Daily Limit Per Session - Limits the total duration of active sessions with your avatar over a 24-hour period; once the limit is reached, the avatar switches to echo-back mode.
Daily Limit Per IP - Limits the duration of active sessions originating from the same IP address over a 24-hour period; once the limit is reached, the avatar switches to echo-back mode.
Demo duration - This will disconnect the end-user after the specified time without switching to echo-back mode.
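As a hypothetical illustration of what token-based loading typically involves when encryption is enabled, the sketch below signs a short-lived JWT with the jsonwebtoken npm package. The claim name (avatarId), the expiry, and the way the key is supplied are assumptions; the authentication docs define the exact claims and key the platform expects.

```typescript
// Hypothetical sketch: issue a short-lived JWT for loading an avatar.
// The claim name (avatarId) and the secret handling are placeholders;
// see the authentication docs for the exact claims and key required.
import jwt from "jsonwebtoken";

const ENCRYPTION_KEY = process.env.AVATAR_ENCRYPTION_KEY ?? "replace-me";

function createAvatarToken(avatarId: string): string {
  return jwt.sign(
    { avatarId },          // placeholder claim
    ENCRYPTION_KEY,        // your configured encryption key
    { expiresIn: "15m" }   // short-lived token
  );
}

console.log(createAvatarToken("my-avatar-id"));
```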
Interruptions
Allow Interruptions - Whether to allow the end-user to interrupt the avatar. The default value is True.
Interrupt Speech Duration - Minimum duration of speech to consider for interruption. The default value is 0.5s.
Interrupt Min Words - Minimum number of words to consider for interruption. Defaults to 0, since a higher minimum may increase latency depending on the STT provider.
Voice Activity Detection (VAD)
Min Endpointing Delay (seconds) - Sets how long a pause needs to be before the system begins to consider whether you’ve finished speaking. This helps avoid cutting you off if you pause briefly. The default value is 0.5 seconds.
Max Endpointing Delay (seconds) - Decides how long the system will wait if it thinks you might continue speaking. This is useful in scenarios where the system is trying to predict whether you’ve finished.
Min Speech Duration (seconds) - Determines how long a sound must be to start being treated as speech. This prevents accidental triggers by short noises. The default value is 0.05 seconds.
Min Silence Duration (seconds) - Sets how long a silence must last before determining you’ve stopped speaking. This avoids breaking speech into multiple parts if you pause mid-sentence. The default value is 0.55 seconds.
Prefix Padding Duration (seconds) - Ensures that a little bit of audio is added at the beginning of your speech to make sure nothing important is cut off. The default value is 0.5 seconds.
Max Buffered Speech - Controls the maximum amount of speech stored before it’s processed. The default value is 60 seconds.
Activation Threshold - Adjusts how sensitive the system is to picking up speech. Lower values make it catch even quiet speech, while higher values mean it listens only for clearer, louder speech. The default value is 0.5. Ranges from 0.0 to 1.0. If you find that your avatar is interrupting itself with its own speech over speaker, you may wish to increase this value.
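To see the defaults described above side by side, here is a sketch collecting them into a single object. The camelCase key names are illustrative only and are not the dashboard’s internal field names.

```typescript
// Illustrative only: the documented VAD defaults gathered in one object.
// These key names are placeholders, not the dashboard's actual field names.
const vadDefaults = {
  minEndpointingDelaySeconds: 0.5,   // pause before end-of-speech is considered
  minSpeechDurationSeconds: 0.05,    // shortest sound treated as speech
  minSilenceDurationSeconds: 0.55,   // silence required to end an utterance
  prefixPaddingDurationSeconds: 0.5, // audio kept before detected speech
  maxBufferedSpeechSeconds: 60,      // maximum speech stored before processing
  activationThreshold: 0.5,          // 0.0 (sensitive) to 1.0 (strict)
};
```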
Code Tab
Ready-to-use embed code and avatar/client JSON for more intricate parameter tuning.
Embed
The Embed tab shows you a ready-to-paste HTML iframe snippet that contains your avatar ID and a JWT token (if default encryption is enabled). For more flexible implementations with custom code, take a look at our SDK and React docs.
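If you prefer to build the embed programmatically rather than pasting the snippet, the hypothetical sketch below creates an equivalent iframe from script. The embed URL and query parameter names are placeholders; copy the real values from the snippet shown in the Embed tab.

```typescript
// Hypothetical sketch of injecting the embed iframe programmatically.
// The URL and query parameter names below are placeholders; copy the real
// values from the snippet shown in the Embed tab.
function embedAvatar(container: HTMLElement, avatarId: string, token?: string): void {
  const iframe = document.createElement("iframe");
  const src = new URL("https://example.com/avatar-embed"); // placeholder host/path
  src.searchParams.set("avatarId", avatarId);              // placeholder param name
  if (token) src.searchParams.set("token", token);         // JWT from your backend

  iframe.src = src.toString();
  iframe.allow = "microphone; autoplay";                   // typical media permissions
  iframe.style.border = "0";
  iframe.width = "400";
  iframe.height = "600";

  container.appendChild(iframe);
}
```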
Avatar / Environment JSON
Used by enterprise customers to control the bespoke functionality of their avatar. Contact us to find out more.
Client JSON
You can specify the following JSON keys/values in the Client JSON tab to configure the appearance of your avatar’s client.
| Key | Example Value |
| --- | --- |
| dialPageBackground | "#FFFFFF" |
| dialButtonTextColor | "#000000" |
| dialButtonBackground | "#CCCCCC" |
| chatScreenBGColor | "#F0F0F0" |
| userChatBubbleBGColor | "#DCF8C6" |
| avatarChatBubbleBGColor | "#FFFFFF" |
| userChatBubbleBorderColor | "#34B7F1" |
| avatarChatBubbleBorderColor | "#ECECEC" |
| userChatBubbleTextColor | "#000000" |
| avatarChatBubbleTextColor | "#000000" |
| inputBoxBGColor | "#FFFFFF" |
| inputBoxBorderColor | "#CCCCCC" |
| inputBoxTextColor | "#000000" |
| sendButtonBGColor | "#25D366" |
| sendButtonArrowColor | "#FFFFFF" |
| sendButtonBorderColor | "#34B7F1" |
| borderColorBetweenInputAndScreen | "#EEEEEE" |
| overlayButtonColor | "#FFFFFF" |
| loadingBarColor | "#0A74DA" |
| avatarLaunch | "AutoLoad" |
| noCacheConfig | false |
| token | null |
| hideToast | false |
| controlButtonPosition | "right" |
| controlButtonLayout | "row" |
| dialButtonText | "LET'S CHAT" |
| msgOnConnect | null |
| hideMicButton | false |
| hideSpeakerButton | false |
| hideHangUpButton | false |
| hideLetsChatBtn | false |
| showSubtitlesButton | true |
| hideChatInput | false |
| hideChatHistory | false |
| fullscreen | false |
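For example, a client configuration combining a few of the keys above could look like the sketch below; serialising it produces the JSON to paste into the Client JSON tab. The values shown are illustrative.

```typescript
// Illustrative client configuration using a subset of the keys listed above.
// JSON.stringify(clientConfig) produces the JSON for the Client JSON tab.
const clientConfig = {
  dialButtonText: "LET'S CHAT",
  dialButtonBackground: "#CCCCCC",
  dialButtonTextColor: "#000000",
  chatScreenBGColor: "#F0F0F0",
  userChatBubbleBGColor: "#DCF8C6",
  avatarChatBubbleBGColor: "#FFFFFF",
  sendButtonBGColor: "#25D366",
  avatarLaunch: "AutoLoad",
  controlButtonPosition: "right",
  hideChatHistory: false,
  fullscreen: false,
};

console.log(JSON.stringify(clientConfig, null, 2));
```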