ElevenLabs Conversational AI

How to connect your avatar to ElevenLabs' Conversational AI service

Overview

ElevenLabs offers Conversational AI agents that handle speech recognition, conversation logic, and voice synthesis. Trulience provides two ways to integrate with ElevenLabs agents:

  1. Built-in Integration - Quick setup through the Trulience dashboard (recommended for most users)
  2. Custom SDK Integration - Advanced integration using the ElevenLabs SDK for full control (supports both WebSocket and WebRTC modes)

Working example: See our ElevenLabs integration example for complete working code.

Built-in Integration

The simplest way to add Trulience avatars to your ElevenLabs agents.

Dashboard Configuration

  1. Open your avatar’s settings in the Trulience dashboard
  2. Navigate to the BRAIN tab
  3. Select ‘3rd Party AI - Conversational AI Frameworks’
  4. Choose ‘ElevenLabs Conversational AI’ from the service provider dropdown
  5. Enter your Agent ID from the ElevenLabs dashboard
  6. (Optional) If encryption is enabled for your ElevenLabs agent, paste the API key

Video Tutorial

[Video: walkthrough of the complete setup process]


Custom SDK Integration (Advanced)

Developers who want fine-grained control, or who need to customize the integration beyond what the dashboard offers, can use the ElevenLabs SDK directly with Trulience.

Why Use Custom Integration?

  • Full control over the conversation lifecycle
  • Ability to intercept and modify audio streams
  • Custom UI/UX beyond the standard Trulience client
  • Integration with your own application state management

Prerequisites

  • A Trulience avatar configured for External Voice Platforms
  • An ElevenLabs API key
  • Node.js and npm installed

Installation

npm install @trulience/react-sdk @elevenlabs/client@^0.7.0
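
The React example below reads its configuration from environment variables. If you keep the same names, define them before running; the values here are placeholders:

# .env.local -- variable names match the React example below
NEXT_PUBLIC_ELEVENLABS_AGENT_ID=<your-elevenlabs-agent-id>
NEXT_PUBLIC_TRULIENCE_SDK_URL=<your-trulience-sdk-url>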

Dashboard Configuration

  1. Open your avatar’s settings in the Trulience dashboard
  2. Navigate to the BRAIN tab
  3. Select ‘3rd Party AI’ mode
  4. Set ‘Service provider or framework’ to ‘External Voice Platforms’

This configuration disables Trulience’s built-in conversation handling, allowing the ElevenLabs SDK to manage it.

Integration Pattern (WebSocket Mode)

ElevenLabs uses the Web Audio API for audio playback in WebSocket mode. We monkey-patch the SDK's startSession so we can re-route its audio output as soon as the session is created, before anything reaches the speakers:

import { Conversation } from '@elevenlabs/client';

// Helper function to route Web Audio to MediaStream
const setupWebSocketAudio = (conversation) => {
  const outputInstance = conversation.output;
  const context = outputInstance.context;
  const mediaStreamDestination = context.createMediaStreamDestination();

  // Disconnect from speakers and route to MediaStream
  outputInstance.gain.disconnect();
  outputInstance.gain.connect(mediaStreamDestination);
  outputInstance.mediaStream = mediaStreamDestination.stream;

  return outputInstance.mediaStream;
};

// Monkey-patch the SDK BEFORE starting the conversation
const originalStartSession = Conversation.startSession;
Conversation.startSession = async (options) => {
  const conversation = await originalStartSession.call(Conversation, options);
  setupWebSocketAudio(conversation);
  return conversation;
};

// Now start the conversation (will use patched version)
const conversation = await Conversation.startSession({
  agentId: 'your-agent-id',
  connectionType: 'websocket'
});

// Wait for the stream to be available, then hand it to the avatar
// (trulienceRef is the ref attached to <TrulienceAvatar>; see the full example below)
for (let i = 0; i < 10; i++) {
  const outputInstance = conversation.output;
  if (outputInstance?.mediaStream) {
    // Route to Trulience avatar
    trulienceRef.current.setMediaStream(outputInstance.mediaStream);
    trulienceRef.current.getTrulienceObject().setSpeakerEnabled(true);
    break;
  }
  await new Promise(resolve => setTimeout(resolve, 200));
}

Why monkey-patch? Patching startSession lets us re-route the audio graph immediately after the SDK builds it, before any audio plays. Disconnecting the gain node from the default output and reconnecting it to a MediaStreamAudioDestinationNode ensures no audio plays through the platform's default output.

Complete React Example

import React, { useRef, useState } from 'react';
import { TrulienceAvatar } from '@trulience/react-sdk';
import { Conversation } from '@elevenlabs/client';

function ElevenLabsTrulienceIntegration() {
  const trulienceRef = useRef(null);
  const [conversation, setConversation] = useState(null);
  const [connected, setConnected] = useState(false);
  const [remoteStream, setRemoteStream] = useState(null);

  // Helper function to set up WebSocket audio routing
  const setupWebSocketAudio = (conversation) => {
    const outputInstance = conversation.output;
    const context = outputInstance.context;
    const mediaStreamDestination = context.createMediaStreamDestination();

    outputInstance.gain.disconnect();
    outputInstance.gain.connect(mediaStreamDestination);
    outputInstance.mediaStream = mediaStreamDestination.stream;

    return outputInstance.mediaStream;
  };

  const startConversation = async () => {
    try {
      // Monkey-patch the SDK before starting
      const originalStartSession = Conversation.startSession;
      Conversation.startSession = async (options) => {
        const conversation = await originalStartSession.call(Conversation, options);
        setupWebSocketAudio(conversation);
        return conversation;
      };

      // Start conversation with WebSocket mode
      const conversationInstance = await Conversation.startSession({
        agentId: process.env.NEXT_PUBLIC_ELEVENLABS_AGENT_ID,
        connectionType: 'websocket',
        onConnect: () => {
          console.log('ElevenLabs conversation connected');
          setConnected(true);
        },
        onDisconnect: () => {
          console.log('ElevenLabs conversation disconnected');
          setConnected(false);
        },
        onError: (error) => {
          console.error('ElevenLabs error:', error);
        }
      });

      // Restore the original so repeated connects don't stack patches
      Conversation.startSession = originalStartSession;

      setConversation(conversationInstance);

      // Wait for MediaStream to be available
      for (let i = 0; i < 10; i++) {
        const outputInstance = conversationInstance.output;
        if (outputInstance?.mediaStream) {
          const mediaStream = outputInstance.mediaStream;
          setRemoteStream(mediaStream);

          // Route to Trulience avatar (guard in case the avatar hasn't mounted yet)
          trulienceRef.current?.setMediaStream(mediaStream);
          const trulienceObj = trulienceRef.current?.getTrulienceObject();
          if (trulienceObj) {
            trulienceObj.setSpeakerEnabled(true);
            console.log('Speaker enabled on Trulience');
          }
          break;
        }
        await new Promise(resolve => setTimeout(resolve, 200));
      }
    } catch (error) {
      console.error('Failed to start ElevenLabs conversation:', error);
      setConnected(false);
    }
  };

  const endConversation = async () => {
    if (conversation) {
      await conversation.endSession();
      setConversation(null);
      setConnected(false);
      setRemoteStream(null);
    }
  };

  return (
    <div className="relative min-h-screen">
      <div className="absolute inset-0">
        <TrulienceAvatar
          ref={trulienceRef}
          url={process.env.NEXT_PUBLIC_TRULIENCE_SDK_URL}
          avatarId="your-avatar-id"
          token="your-trulience-token"
          width="100%"
          height="100%"
        />
      </div>

      <button
        onClick={connected ? endConversation : startConversation}
        className={`absolute bottom-6 left-1/2 -translate-x-1/2 px-6 py-3 rounded-lg text-white font-semibold ${
          connected ? 'bg-red-600 hover:bg-red-700' : 'bg-blue-600 hover:bg-blue-700'
        }`}
      >
        {connected ? 'Disconnect' : 'Connect'}
      </button>
    </div>
  );
}

export default ElevenLabsTrulienceIntegration;

Understanding the Audio Pipeline

ElevenLabs sends μ-law compressed 8-bit audio chunks over WebSocket. Their SDK handles:

  1. Audio Decoding: An Audio Worklet Processor decodes incoming chunks into 16-bit PCM audio
  2. Stream Processing: The Output class manages the audio worklet and connects it to a gain node
  3. MediaStream Creation: We create a MediaStreamAudioDestinationNode to convert the Web Audio API output into a MediaStream
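
Step 3 is plain Web Audio API and can be tried in isolation. In the sketch below, an oscillator stands in for the SDK's decoded output; the routing is identical to what setupWebSocketAudio() does:

// Standalone illustration of step 3: capture Web Audio output as a MediaStream.
// An OscillatorNode substitutes for the SDK's decoded PCM output.
const context = new AudioContext();
const source = context.createOscillator();
const gain = context.createGain();
source.connect(gain);

// Route into a MediaStreamAudioDestinationNode instead of context.destination,
// so nothing reaches the speakers directly.
const destination = context.createMediaStreamDestination();
gain.connect(destination);
source.start();

// destination.stream is an ordinary MediaStream -- the same kind of object
// the integration hands to trulienceRef.current.setMediaStream(...).
console.log(destination.stream.getAudioTracks().length); // 1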

Integration Pattern (WebRTC Mode)

ElevenLabs also supports WebRTC mode using LiveKit internally. This requires extracting audio from LiveKit’s remote participants:

// Helper function to set up WebRTC audio routing
const setupWebRTCAudio = async (conversation) => {
  const outputInstance = conversation.output;

  // Mute ElevenLabs audio element
  if (outputInstance.audioElement) {
    outputInstance.audioElement.muted = true;
    outputInstance.audioElement.volume = 0;
  }

  // Extract MediaStream from LiveKit
  const connection = conversation.connection;

  for (let i = 0; i < 50; i++) {
    const room = connection?.room;
    if (room?.remoteParticipants) {
      const participants = Array.from(room.remoteParticipants.values());

      for (const participant of participants) {
        const audioTracks = participant.audioTrackPublications;
        if (audioTracks?.size > 0) {
          const trackPublication = Array.from(audioTracks.values())[0];
          const liveKitTrack = trackPublication.track;

          if (liveKitTrack?.mediaStream) {
            // Mute original playback elements
            liveKitTrack.attachedElements?.forEach((element) => {
              if (element?.muted !== undefined) {
                element.muted = true;
                element.volume = 0;
              }
            });

            return liveKitTrack.mediaStream;
          }
        }
      }
    }
    await new Promise((resolve) => setTimeout(resolve, 100));
  }
  return null;
};

// Start conversation with WebRTC mode
const conversation = await Conversation.startSession({
  agentId: 'your-agent-id',
  connectionType: 'webrtc'
});

// Get MediaStream from LiveKit
const mediaStream = await setupWebRTCAudio(conversation);

if (mediaStream) {
  trulienceRef.current.setMediaStream(mediaStream);
  trulienceRef.current.getTrulienceObject().setSpeakerEnabled(true);
}

WebSocket vs WebRTC:

  • WebSocket - Simpler integration with fewer moving parts; recommended for most use cases
  • WebRTC - Uses LiveKit internally; may offer lower latency and better stability on lossy networks

Key Integration Points

Preventing Double Audio (WebSocket)

  • Disconnect the gain node from context.destination (speakers)
  • Connect only to the MediaStreamAudioDestinationNode
  • Route the resulting MediaStream to Trulience

Preventing Double Audio (WebRTC)

  • Mute the ElevenLabs audioElement
  • Mute LiveKit’s attachedElements array
  • Route the LiveKit MediaStream to Trulience

Timing

  • WebSocket: Set up the monkey-patch before calling startSession()
  • WebRTC: Poll for LiveKit remote participants after connection
  • Ensure the Trulience WebSocket is connected before calling setMediaStream() (see the sketch below)
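
One way to satisfy the last point is to buffer the stream until the avatar signals readiness. This sketch lives inside the component from the full example above; it assumes the React SDK accepts an eventCallbacks prop and emits a 'websocket-connect' event, so verify both against your SDK version:

// Inside ElevenLabsTrulienceIntegration: hold the ElevenLabs stream until
// the avatar reports its WebSocket is up.
const pendingStream = useRef(null);
const trulienceReady = useRef(false);

// 'websocket-connect' is an assumed event name -- check the event list for
// your @trulience/react-sdk version.
const eventCallbacks = {
  'websocket-connect': () => {
    trulienceReady.current = true;
    if (pendingStream.current) {
      trulienceRef.current?.setMediaStream(pendingStream.current);
      pendingStream.current = null;
    }
  },
};

// Call this instead of setMediaStream() directly.
const routeStream = (mediaStream) => {
  if (trulienceReady.current) {
    trulienceRef.current?.setMediaStream(mediaStream);
  } else {
    pendingStream.current = mediaStream; // delivered once connected
  }
};

// Pass the callbacks to the avatar:
// <TrulienceAvatar ref={trulienceRef} eventCallbacks={eventCallbacks} ... />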

Troubleshooting

Issue: No audio from avatar

  • Verify output.gain.disconnect() was called before connecting to the MediaStreamAudioDestinationNode
  • Check that setSpeakerEnabled(true) was called on the Trulience object
  • Confirm the MediaStream has active audio tracks (see the check below)
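
For the last check, inspect the stream from the browser console; mediaStream here is whatever you passed to setMediaStream():

// Each track should be live and unmuted while the agent is speaking.
mediaStream.getAudioTracks().forEach((track) => {
  console.log({
    readyState: track.readyState, // expect 'live'
    enabled: track.enabled,       // expect true
    muted: track.muted,           // expect false during playback
  });
});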

Issue: Hearing double audio

  • Ensure the gain node is disconnected from context.destination
  • Check that no other audio elements are playing the ElevenLabs output

Issue: Choppy or distorted audio

  • ElevenLabs audio quality depends on your API plan
  • Check network stability (WebSocket connection quality)
  • Verify the browser AudioContext is not suspended or throttled (happens in background tabs); see the snippet below
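
For that last point, the SDK's AudioContext is reachable via conversation.output.context (as in the snippets above). Note that browsers generally require resume() to run from a user gesture:

// A suspended AudioContext produces silence; resume it when the tab regains focus.
const context = conversation.output.context;
if (context.state === 'suspended') {
  await context.resume();
}
console.log('AudioContext state:', context.state); // expect 'running'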

Choosing the Right Approach

Feature           Built-in Integration            Custom SDK Integration
Setup Complexity  Simple (dashboard only)         Advanced (code required)
Customization     Limited                         Full control
Audio Control     Automatic                       Manual routing required
Use Case          Quick deployment, standard UI   Custom applications, advanced features

Example Code Repository

See our ElevenLabs integration example for complete working code demonstrating both WebSocket and WebRTC modes.

Next Steps