Open-source evaluation platform for Voice AI agents. Import real conversations, score calls with LLMs, replay audio and transcripts, and track quality over time. Built for ElevenLabs, Vapi, and production voice agents.

Topics: elevenlabs, evaluation, evaluation-framework, vapi, voice, voice-ai
6 open issues need help. Last updated: Jul 3, 2026

Open Issues Need Help

good first issue Frontend

Python
good first issue


good first issue Frontend

enhancement help wanted

enhancement good first issue
