What is a Neural Voice?
A synthetic voice generated by deep neural networks, as opposed to older concatenative TTS, which stitches together pre-recorded speech fragments. Neural voices sound significantly more natural, with realistic intonation, breathing, and emotional range. ElevenLabs and OpenAI are leading providers.
Related Terms
Talking Head Video
A video format featuring a person (or AI-generated character) speaking directly to the camera. Commonly used in education, marketing, and social media content. Puppetry turns any photo into a talking head video using AI lip-sync technology.
Lip Sync / Lip Syncing
The process of matching mouth movements to spoken audio. In AI video, lip sync algorithms analyze the audio waveform and generate a realistic mouth shape for each video frame. Puppetry uses LivePortrait + Wav2Lip for production-quality lip sync across 29 languages.
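To illustrate what "one mouth shape per frame" means, here is a toy sketch in Python. It is not how LivePortrait or Wav2Lip work (those are learned neural models); it only shows how audio can be sliced into frame-sized windows and each window reduced to a single value. The function name `mouth_openness_per_frame` and the loudness-to-openness mapping are assumptions made for this example.

```python
import math


def mouth_openness_per_frame(samples, sample_rate, fps):
    """Return one mouth-openness value in [0, 1] per video frame.

    Toy heuristic (an assumption for illustration): louder audio
    within a frame's time window means a more open mouth.
    """
    samples_per_frame = sample_rate // fps
    if samples_per_frame <= 0:
        raise ValueError("sample_rate must be at least fps")

    # Root-mean-square loudness of the audio window behind each frame.
    rms_values = []
    for start in range(0, len(samples), samples_per_frame):
        window = samples[start:start + samples_per_frame]
        rms = math.sqrt(sum(s * s for s in window) / len(window))
        rms_values.append(rms)

    # Normalize so the loudest frame maps to a fully open mouth.
    peak = max(rms_values, default=0.0)
    if peak == 0.0:
        return [0.0] * len(rms_values)
    return [rms / peak for rms in rms_values]
```

For example, at a 100 Hz sample rate and 25 frames per second, each frame covers 4 samples, so 12 samples produce 3 openness values. A real lip sync model predicts full mouth imagery rather than a single number, but the audio-to-frame alignment step is the same in spirit.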
AI Avatar / AI Presenter
A digital character animated by artificial intelligence. Unlike deepfakes, which impersonate real people without permission, AI avatars are created from photos with the consent of the person pictured, for legitimate use cases such as education, marketing, and accessibility.
Text-to-Speech (TTS)
Technology that converts written text into spoken audio. Modern TTS systems such as ElevenLabs and Kokoro produce natural-sounding voices with control over emotion, pacing, and accent. Puppetry offers 500+ AI voices across 29 languages.