Voice AI models are having a moment, and being able to simply talk, as you would with a friend, has opened up a higher-bandwidth way of communicating with software. It fits especially well with the rise of prompt-based and conversational workflows.
But to add a voice agent to your website today, you have to build your own stack: wrangling the async hell of parallel websockets and streaming model calls, voice activity detection, background noise suppression, speaker diarization, TTS, STT, language model integration, and streaming function calls, all while somehow keeping latency to ~400-700 ms.
Similar services have made notable advances, but their stacks remain heavily optimized for telephony.
Web app-specific needs, such as updating UI state asynchronously, waiting for user action, controlling responses based on function calls, and handling interruptions (the user takes an action midway while the AI is talking), all while maintaining a fluid, human-like conversational flow, remain a challenge.
With Ponder’s React SDK, you can add a voice agent to the bottom right corner of the screen. All you have to do is wrap your _app.js component in PonderProvider.
With just this, the Ponder widget renders on your website, and users can talk to it.
After that, you can dynamically control the context and the actions the agent can take on each page by calling the setActions and setInstructions hooks anywhere in your app.
Actions can be JavaScript functions that are already part of your code base. Simply pass them to setActions along with a description for the agent. (Check out the docs for more details.)
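To make the idea of describing existing functions to an agent concrete, here is a minimal, self-contained sketch of an action registry. It does not use Ponder's SDK, and every name in it (`AgentAction`, `ActionRegistry`, `dispatch`, the `addToCart` example) is hypothetical; the real signatures of setActions and setInstructions are in the docs.

```typescript
// Hypothetical shapes for illustration only; not Ponder's actual SDK types.
type ActionHandler = (args: Record<string, unknown>) => unknown;

interface AgentAction {
  name: string;
  description: string; // tells the agent when the action is appropriate
  handler: ActionHandler; // an existing function from your code base
}

class ActionRegistry {
  private actions = new Map<string, AgentAction>();

  // Replace the whole set of actions, as a page might do when it mounts.
  setActions(actions: AgentAction[]): void {
    this.actions = new Map<string, AgentAction>();
    for (const action of actions) {
      this.actions.set(action.name, action);
    }
  }

  // What a model would be shown: names and descriptions, never the code.
  describe(): { name: string; description: string }[] {
    return Array.from(this.actions.values()).map((a) => ({
      name: a.name,
      description: a.description,
    }));
  }

  // Run the handler for a function call requested by the agent.
  dispatch(name: string, args: Record<string, unknown>): unknown {
    const action = this.actions.get(name);
    if (!action) {
      throw new Error("Unknown action: " + name);
    }
    return action.handler(args);
  }
}

// Existing app code, reused unchanged.
const cart: string[] = [];
function addToCart(item: string): number {
  cart.push(item);
  return cart.length;
}

const registry = new ActionRegistry();
registry.setActions([
  {
    name: "addToCart",
    description: "Add a product to the shopping cart by name.",
    handler: (args) => addToCart(String(args["item"])),
  },
]);
```

Calling `registry.dispatch("addToCart", { item: "lamp" })` runs the existing `addToCart` function and returns the new cart length. Keeping the description separate from the handler means the agent only ever sees text, while the code it triggers stays in your app.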
You can configure the agent on Ponder’s dashboard: choose a voice, an LLM, and the system prompt. If you want to attach external docs to the context, simply add curly-brace variables to the system prompt and pick a data source for each variable (Confluence and Google Docs are currently supported).
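The variable substitution described above can be pictured with a short sketch. This is an illustration only, not how Ponder implements it: the single-brace `{name}` syntax, the `renderSystemPrompt` function, and the in-memory `sources` map (standing in for text fetched from a data source) are all assumptions.

```typescript
// Hypothetical helper: fills {variable} placeholders in a system prompt.
// `sources` maps a variable name to text already fetched from its data source.
function renderSystemPrompt(template: string, sources: Map<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) => {
    const value = sources.get(name);
    // Leave unknown variables untouched so a missing source is easy to spot.
    return value !== undefined ? value : match;
  });
}

const sources = new Map<string, string>([
  ["refund_policy", "Refunds are accepted within 30 days."],
]);

const prompt = renderSystemPrompt(
  "You are a support agent. Policy: {refund_policy} Tone guide: {tone_guide}",
  sources
);
```

Here `{refund_policy}` is replaced with the attached text, while `{tone_guide}` has no source and is left as written.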
Ponder supports both voice and text modes. Messages are populated in the Ponder widget as the conversation unfolds.
Companies that offer prompt-based conversational apps - e.g., creativity tools, text-to-anything, and agentic workflows, where the user needs to craft a prompt. Making it conversational lets the user brainstorm the prompt with the agent: “I want it to look more like a…”
Companies that have a data entry part to their apps - For example, we are working with an inspections software provider to let construction inspection officials enter large amounts of data into the software simply by speaking, which makes the process much faster. For instance: “The inspection ID is 1234 and the beam is 30 by 30 by 52 inches.”
Companies that are struggling with onboarding - especially for applications whose users are not the most tech-savvy and who, despite your best efforts at building an intuitive UI, struggle to operate it. With Ponder, users can simply talk with your application.
In-app assistance, customer support, scheduling, and other usual suspects.