External Platforms

Integration Guide

Learn how to integrate Trulience avatars with any external voice AI platform or custom audio pipeline

Introduction

This guide explains how to integrate Trulience avatars with external voice platforms or custom audio pipelines. While Trulience provides a complete conversational AI solution through the Trulience Platform, you may want to use an external platform that handles speech-to-text, conversation logic, and text-to-speech while Trulience provides the visual avatar experience.

When to Use External Platform Integration

Consider this approach if you:

  • Already use a voice AI platform (VAPI, ElevenLabs, etc.) and want to add avatar visuals
  • Need specialized conversation capabilities not available in Trulience Platform
  • Want to build a custom audio pipeline with full control
  • Are integrating Trulience into an existing voice application

Platform-Specific Guides

Before diving into the technical details, check whether we have a dedicated guide for your specific platform.

If your platform isn’t listed, this guide will teach you the general patterns needed to build your own integration.

Working examples: We maintain complete integration examples in our integration examples repository.

Core Concept: Audio Routing

At the heart of every external platform integration is audio routing - connecting audio from your voice platform to Trulience for lip-sync visualization.

The setMediaStream Method

Trulience avatars lip-sync to audio through the setMediaStream() method:

// After initializing Trulience
const audioStream = /* your MediaStream */;
trulience.setMediaStream(audioStream);

Your integration needs to:

  1. Obtain audio from your voice platform
  2. Convert it to a MediaStream object
  3. Pass it to setMediaStream()
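
A minimal sketch of those three steps, assuming a hypothetical platform.getRemoteAudioTrack() helper (replace it with however your platform exposes its remote audio):

// 1. Obtain audio from your voice platform (hypothetical helper)
const remoteTrack = platform.getRemoteAudioTrack();

// 2. Wrap the track in a MediaStream
const audioStream = new MediaStream([remoteTrack]);

// 3. Route it to the avatar for lip-sync
trulience.setMediaStream(audioStream);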

Dashboard Configuration

Before starting your integration, configure your avatar for external platform use:

  1. Open your avatar in the Trulience dashboard
  2. Navigate to the BRAIN tab
  3. Select ‘3rd Party AI’ mode
  4. Set ‘Service provider or framework’ to ‘External Voice Platforms’

This configuration disables Trulience’s built-in STT, LLM, and TTS, allowing your external platform to handle these components.

Integration Patterns

Different voice platforms deliver audio in different ways. Here are the common patterns:

Pattern 1: WebRTC Platforms

Platforms like VAPI, OpenAI Realtime API, and others using WebRTC provide audio through RTCPeerConnection.

How it works:

  • Platform establishes WebRTC connection
  • Audio arrives via remote media tracks
  • You capture the track and create a MediaStream

Example:

// Listen for remote audio track
pc.ontrack = (event) => {
  const stream = event.streams[0];

  // Route to Trulience
  trulienceRef.current.setMediaStream(stream);
  trulienceRef.current.getTrulienceObject().setSpeakerEnabled(true);

  // Mute platform's audio element (prevent double audio)
  audioElement.muted = true;
};

Key considerations:

  • Wait for the track or track-started event before routing
  • Platform may auto-create audio elements - mute them to prevent double audio
  • Ensure the track kind is 'audio' and from the remote participant
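
As a sketch of that filtering, using the standard RTCPeerConnection ontrack event (adapt to your platform's SDK):

// Only route remote audio tracks to the avatar
pc.ontrack = (event) => {
  if (event.track.kind !== 'audio') return;

  // Some platforms provide event.streams[0]; otherwise wrap the track yourself
  const stream = event.streams[0] ?? new MediaStream([event.track]);
  trulienceRef.current.setMediaStream(stream);
  trulienceRef.current.getTrulienceObject().setSpeakerEnabled(true);
};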

Pattern 2: Web Audio API Platforms

Platforms like ElevenLabs WebSocket mode use the Web Audio API for audio playback.

How it works:

  • Platform processes audio through Web Audio API nodes
  • Audio typically flows through a GainNode before speakers
  • You intercept the gain node and route to a MediaStreamDestination

Example:

// Access the platform's Web Audio components
const audioContext = platform.audioContext;
const gainNode = platform.gainNode;

// Create MediaStream destination
const destination = audioContext.createMediaStreamDestination();

// Reroute audio
gainNode.disconnect(); // Disconnect from speakers
gainNode.connect(destination); // Connect to MediaStream

// Route to Trulience
trulienceRef.current.setMediaStream(destination.stream);
trulienceRef.current.getTrulienceObject().setSpeakerEnabled(true);

Key considerations:

  • Find where the platform connects to audioContext.destination (speakers)
  • Insert your MediaStreamAudioDestinationNode before that connection
  • Disconnect from speakers to prevent double audio

Pattern 3: Custom Audio Sources

If you’re generating audio yourself (custom TTS, audio files, etc.), you need to create a MediaStream.

Creating MediaStream from Audio Element:

const audioElement = document.createElement('audio');
audioElement.src = 'your-audio-source.mp3';
audioElement.play(); // the captured stream only carries audio while the element is playing

// Note: some browsers expose this as a prefixed method (e.g. mozCaptureStream in Firefox)
const audioStream = audioElement.captureStream();
trulience.setMediaStream(audioStream);

Creating MediaStream from Web Audio API:

const audioContext = new AudioContext();
const source = audioContext.createBufferSource();
// ... configure your source ...

const destination = audioContext.createMediaStreamDestination();
source.connect(destination);

trulience.setMediaStream(destination.stream);

Preventing Double Audio

A common issue in integrations is hearing both the platform’s audio and the avatar’s audio simultaneously.

Why This Happens

Voice platforms typically play audio through speakers by default. When you route that same audio to Trulience, both play unless you prevent it.

Solutions

For WebRTC platforms:

Simple case - platform creates one audio element:

// Mute the platform's auto-created audio element
const platformAudio = document.querySelector('audio');
if (platformAudio) platformAudio.muted = true;

Advanced case - platform creates elements with specific attributes:

// VAPI creates elements tagged with participant IDs
setTimeout(() => {
  const vapiAudio = document.querySelector(
    `audio[data-participant-id="${participantId}"]`
  );
  if (vapiAudio) vapiAudio.muted = true;
}, 100); // Delay needed as element is created asynchronously

Multiple audio elements:

// Mute all audio elements created by the platform
const allAudioElements = document.querySelectorAll('audio');
allAudioElements.forEach(el => el.muted = true);

For Web Audio API platforms:

// Disconnect from speakers before routing to Trulience
gainNode.disconnect(audioContext.destination);
gainNode.connect(mediaStreamDestination);

Always enable Trulience’s speaker:

trulienceRef.current.getTrulienceObject().setSpeakerEnabled(true);

Timing and Connection States

Critical Timing Requirements

Audio routing must happen after both:

  1. The external platform connection is established (audio is available)
  2. Trulience WebSocket is connected
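
One way to satisfy both conditions is to gate the routing on two readiness flags. A minimal sketch, assuming hypothetical readiness callbacks on each side:

let pendingStream = null;
let trulienceConnected = false;

function routeIfReady() {
  // Only attach the stream once both sides are ready
  if (pendingStream && trulienceConnected) {
    trulienceRef.current.setMediaStream(pendingStream);
    trulienceRef.current.getTrulienceObject().setSpeakerEnabled(true);
  }
}

// Hypothetical platform event that delivers the remote audio stream
platform.on('connected', (stream) => {
  pendingStream = stream;
  routeIfReady();
});

// Hypothetical Trulience callback fired when its WebSocket connects
onTrulienceWebsocketConnected(() => {
  trulienceConnected = true;
  routeIfReady();
});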

Timing Strategies

Different platforms make audio available at different times. Here are the common patterns:

Strategy 1: Event-Based (Preferred)

Use when the platform provides connection events:

// OpenAI Realtime API - synchronous WebRTC track
pc.ontrack = (event) => {
  const stream = event.streams[0];
  trulience.setMediaStream(stream);
  trulience.getTrulienceObject().setSpeakerEnabled(true);
};

Strategy 2: Polling for Async Objects

Use when platform objects are created asynchronously:

// VAPI - getDailyCallObject() returns null initially
const waitForDaily = new Promise((resolve) => {
  const checkDaily = () => {
    const dailyCall = vapi.getDailyCallObject();
    if (dailyCall) {
      resolve(dailyCall);
    } else {
      setTimeout(checkDaily, 100); // Poll every 100ms
    }
  };
  checkDaily();
});

const dailyCall = await waitForDaily;
// Now safe to use dailyCall
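
From there, one possible way to reach the audio is through daily-js track events. A sketch (verify the event and field names against the daily-js version bundled with your VAPI SDK):

// Route the assistant's remote audio track once it starts
dailyCall.on('track-started', (event) => {
  if (event.track.kind === 'audio' && !event.participant?.local) {
    const stream = new MediaStream([event.track]);
    trulienceRef.current.setMediaStream(stream);
    trulienceRef.current.getTrulienceObject().setSpeakerEnabled(true);
  }
});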

Strategy 3: Polling for Internal Properties

Use when internal properties populate over time:

// ElevenLabs - output.mediaStream is set after initialization
for (let i = 0; i < 10; i++) {
  if (conversation.output?.mediaStream) {
    const stream = conversation.output.mediaStream;
    trulience.setMediaStream(stream);
    break;
  }
  await new Promise(resolve => setTimeout(resolve, 200));
}

Note: Polling is a workaround for platforms that don’t expose proper lifecycle events. Prefer event-based approaches when available.

Converting WebSocket Audio to MediaStream

If your platform sends audio over WebSocket, you’ll need to convert it to a MediaStream.

General Process

  1. Receive audio chunks via WebSocket
  2. Decode the audio (may be compressed - μ-law, Opus, etc.)
  3. Feed into Web Audio API using an Audio Worklet or ScriptProcessor
  4. Create MediaStream from the audio graph
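
As an illustration of steps 1 and 2, here is a hedged sketch that decodes base64-encoded 16-bit PCM arriving over a WebSocket (the message format and field names are assumptions; check your platform's protocol):

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  // Decode base64 into raw bytes, then view them as 16-bit PCM samples
  const bytes = Uint8Array.from(atob(message.audio), (c) => c.charCodeAt(0));
  const pcm16 = new Int16Array(bytes.buffer);

  // Convert to Float32 in the -1..1 range expected by the Web Audio API
  const samples = new Float32Array(pcm16.length);
  for (let i = 0; i < pcm16.length; i++) {
    samples[i] = pcm16[i] / 32768;
  }

  // Steps 3 and 4: feed the samples into the audio graph (see the worklet example below)
  workletNode.port.postMessage({ chunk: samples });
};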

Reference Implementation

The ElevenLabs SDK provides an Audio Worklet Processor that demonstrates this pattern. Key steps:

  1. Create an AudioWorklet to process chunks
  2. Connect the worklet to a GainNode
  3. Create a MediaStreamAudioDestinationNode
  4. Connect the gain to the destination

// Simplified example (processor.js, loaded via audioWorklet.addModule below)
class AudioChunkProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.queue = [];
    // Receive decoded Float32Array chunks posted from the main thread
    this.port.onmessage = (event) => this.queue.push(event.data.chunk);
  }

  process(inputs, outputs) {
    // Fill the 128-frame output quantum from the queued chunks (mono, simplified)
    const output = outputs[0][0];
    let offset = 0;
    while (offset < output.length && this.queue.length > 0) {
      const chunk = this.queue[0];
      const n = Math.min(chunk.length, output.length - offset);
      output.set(chunk.subarray(0, n), offset);
      offset += n;
      if (n === chunk.length) this.queue.shift();
      else this.queue[0] = chunk.subarray(n);
    }
    return true; // keep the processor alive even when the queue is empty
  }
}

// The name registered here must match the one passed to AudioWorkletNode below
registerProcessor('audio-chunk-processor', AudioChunkProcessor);

// In main thread
const context = new AudioContext();
await context.audioWorklet.addModule('processor.js');

const workletNode = new AudioWorkletNode(context, 'audio-chunk-processor');
const gainNode = context.createGain();
const destination = context.createMediaStreamDestination();

workletNode.connect(gainNode);
gainNode.connect(destination);

// Feed chunks to worklet via MessagePort
workletNode.port.postMessage({ chunk: audioData });

// Route to Trulience
trulience.setMediaStream(destination.stream);

Debugging Your Integration

Common Issues and Solutions

Problem: Avatar mouth not moving

  • Stream not attached → Check that setMediaStream() was called after both the platform connection and the Trulience WebSocket connection were established
  • Stream has no audio tracks → Verify stream.getAudioTracks().length > 0
  • Avatar not configured correctly → Ensure the avatar is set to ‘External Voice Platforms’ in the dashboard
  • Speaker disabled → Call getTrulienceObject().setSpeakerEnabled(true)

Problem: Double audio (hearing platform + avatar)

  • Platform audio not muted → Mute the platform’s audio elements: audioElement.muted = true
  • Web Audio node connected to speakers → Disconnect from audioContext.destination
  • Speaker muted instead of audio element → Ensure you mute the <audio> element, not the MediaStream

Problem: No audio at all

  • Speaker disabled → Call getTrulienceObject().setSpeakerEnabled(true)
  • Audio context suspended → Resume the context: audioContext.resume() (requires a user gesture)
  • MediaStream has no tracks → Check stream.getAudioTracks()[0]?.enabled === true
  • Timing issue → Ensure the stream is attached after the Trulience WebSocket connects

Problem: Platform object not available

  • Accessed too early → Use the polling pattern to wait for async initialization
  • Incorrect property path → Inspect the SDK object in the console to find the correct path

Debugging Tools

Use these snippets to diagnose integration issues:

// Check MediaStream validity
console.log('Audio tracks:', stream.getAudioTracks());
console.log('Track enabled:', stream.getAudioTracks()[0]?.enabled);
console.log('Track muted:', stream.getAudioTracks()[0]?.muted);
console.log('Track readyState:', stream.getAudioTracks()[0]?.readyState);

// Monitor Trulience state
const trulienceObj = trulienceRef.current.getTrulienceObject();
console.log('Speaker enabled:', trulienceObj.getSpeakerStatus());

// Check Web Audio API state
console.log('AudioContext state:', audioContext.state);

// Find all audio elements on page
const audioElements = document.querySelectorAll('audio');
console.log('Audio elements:', audioElements);
audioElements.forEach((el, i) => {
  console.log(`Audio ${i}:`, {
    muted: el.muted,
    paused: el.paused,
    src: el.src,
    srcObject: el.srcObject
  });
});

Example Integrations

Our integration examples repository contains complete working implementations:

  • OpenAI Realtime API - Simple WebRTC voice-to-voice integration
  • VAPI - WebRTC platform with Daily.co integration
  • ElevenLabs - Custom WebSocket and WebRTC integration

Each example includes full source code, configuration, and setup instructions.

Next Steps