Veritone Voice is a cloud-based voice synthesis platform designed for organizations that need high-quality, scalable AI-generated speech. It allows businesses to produce audio from text or speech, cre...
MetaVoice is a conversational voice AI platform designed to make digital speech interactions feel natural and human-like. Its duplex speech-to-speech technology allows fluid back-and-forth conversatio...
The voices carry emotion in a way that doesn’t sound over‑acted, and being able to tweak pitch, spee...
Voicemod AI is a real-time voice modification platform that allows users to change, enhance, and customize their voice for streaming, gaming, and online communication. Leveraging artificial intelligen...
BeFreed delivers personalized audio from top knowledge sources based on user intent. It transforms books, podcasts, research papers, and lectures into narrated content that fits chosen styles and dept...
Typeless turns spoken words into polished text for messages, emails, and documents in real time. It processes natural speech by removing filler words like 'um' and 'uh,' eliminating unnecessary repeti...
AssemblyAI provides speech-to-text and speech understanding models for transcribing and extracting insights from voice data. The platform handles real-time streaming transcription via a WebSocket API wi...
Soniox transcribes speech into text and delivers translations in real time across 60+ languages. It streams results as speech occurs, without waiting for sentence boundaries or pauses. Core capabiliti...
Kits AI is a browser-based platform for AI voice generation, vocal isolation, and music production. It enables users to clone voices by uploading audio datasets, blend voices from a library of 75+ roy...
WellSaid provides a studio for generating voiceovers with word-level editing, real-time pitch and pace adjustments, and audio output at sample rates up to 96 kHz for natural-sounding speech. It focuses on workflows involving precise...
ElevenLabs provides an AI voice generator and voice agents platform that creates lifelike speech from text. It supports controllable, expressive speech layered with emotion, audio events, and immersiv...