
Voice DevTools

This UI provides a debug console for real-time AI voice interactions. It works with multiple realtime models (see Supported Models below). Features include:

  1. Cost Tracking: Know how much you’ve spent per voice interaction
  2. Model Support: Supports open-source (MiniCPM-o) and closed-source S2S models like OpenAI’s GPT-4o Realtime (adding more soon!)
  3. Voice and Chat UI
  4. Session history and recording

Inspired by openai-realtime-console and openai-realtime-agents.

Quick Start

  1. Get your API keys: an OpenAI API key and an Outspeed API key (both go into .env in the next step).

  2. Set up environment:

    cp .env.example .env
    # Add your API keys to .env:
    # OPENAI_API_KEY="<your-openai-key>"
    # OUTSPEED_API_KEY="<your-outspeed-key>"
    
  3. Install and run:

    npm install
    npm run dev
    

Visit http://localhost:3000 to access the console.

Usage

To modify the agent prompt and tools, edit src/agent-config.ts.
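As a rough illustration, an agent entry pairs a system prompt with the tools the model may call. The sketch below is hypothetical: the field names (name, instructions, tools) and the tool schema are assumptions for illustration, not the repo's actual types — see src/agent-config.ts for the real shape.

```typescript
// Hypothetical agent-config sketch. Field names are illustrative
// assumptions, not the repo's actual schema.
type Tool = {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the tool's arguments
};

type AgentConfig = {
  name: string;
  instructions: string; // system prompt sent to the realtime model
  tools: Tool[];
};

const dentalAgent: AgentConfig = {
  name: "Dental Agent",
  instructions:
    "You answer callers' questions about the working hours of a dental clinic. " +
    "Hours: Mon-Fri 9am-5pm. Be brief and friendly.",
  tools: [
    {
      name: "get_working_hours",
      description: "Return the clinic's working hours for a given weekday.",
      parameters: {
        type: "object",
        properties: { day: { type: "string" } },
        required: ["day"],
      },
    },
  ],
};
```

Adding a new agent would then amount to appending another object of this shape and rebuilding.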

To modify model parameters (voice, version, etc.), edit model-config.js.
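For model parameters, the config is a small object of generation settings. Again a hedged sketch: the keys below (provider, model, voice, temperature) are plausible assumptions, not necessarily the file's actual fields — check model-config.js for what the console really reads.

```typescript
// Hypothetical model-config sketch; keys are illustrative assumptions.
const modelConfig = {
  provider: "openai",               // or "outspeed" for MiniCPM-o
  model: "gpt-4o-realtime-preview", // realtime speech-to-speech model
  voice: "alloy",                   // voice preset for audio output
  temperature: 0.8,                 // sampling temperature
};
```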

Agents

Three example voice agents are included. You can choose one in the Session Config UI on the right before starting a session.

  1. Dental Agent: Answers callers’ questions about working hours of a dental clinic
  2. Message Agent: Takes callers’ messages for a person
  3. Recruiter Agent: Talks to a candidate and asks questions about their background and availability

You can see the prompts in the ./src/agent-config.ts file.

Supported Models

  • MiniCPM-o (hosted by Outspeed)
  • OpenAI Realtime models
  • Moshi (Coming Soon)
  • Gemini Multimodal Live (Coming Soon)

Deployment

You can deploy your agent to Cloudflare by following the steps at demo.outspeed.com/deploy, or run this locally and visit the deploy route.

License

MIT