Meta open-sources Hermes: a speech-to-animation model for lively NPCs
Gaming · 5 min read
Hermes maps prosody, phonemes, and emotional intent to animation parameters like visemes, eyebrow movement, and body posture, producing a time-aligned animation stream compatible with popular engines. Meta released runtime bindings for Unity and Unreal to simplify integration.
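To make "time-aligned animation stream" concrete, here is a minimal Python sketch of how an engine-side consumer might sample such a stream between keyframes. It is an illustration under stated assumptions, not Hermes's actual output format: the `AnimationFrame` class, the `sample` function, and parameter names such as `viseme_AA` and `brow_raise` are all hypothetical and are not taken from the release or its Unity and Unreal bindings.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class AnimationFrame:
    """One keyframe in a hypothetical time-aligned animation stream."""
    time_s: float             # timestamp relative to the start of the audio
    params: dict[str, float]  # e.g. viseme weights, brow raise, posture lean


def sample(frames: list[AnimationFrame], t: float) -> dict[str, float]:
    """Linearly interpolate animation parameters at time t (in seconds).

    Assumes `frames` is non-empty and sorted by `time_s`.
    Times outside the stream clamp to the first or last keyframe.
    """
    if t <= frames[0].time_s:
        return dict(frames[0].params)
    if t >= frames[-1].time_s:
        return dict(frames[-1].params)

    times = [f.time_s for f in frames]
    i = bisect_right(times, t)  # frames[i-1].time_s <= t < frames[i].time_s
    a, b = frames[i - 1], frames[i]
    alpha = (t - a.time_s) / (b.time_s - a.time_s)

    # A parameter missing from one keyframe is treated as 0.0 (an assumption).
    keys = a.params.keys() | b.params.keys()
    return {
        k: (1 - alpha) * a.params.get(k, 0.0) + alpha * b.params.get(k, 0.0)
        for k in keys
    }


# Two keyframes 100 ms apart: the mouth opens while the brows lift slightly.
stream = [
    AnimationFrame(0.0, {"viseme_AA": 0.0, "brow_raise": 0.2}),
    AnimationFrame(0.1, {"viseme_AA": 1.0, "brow_raise": 0.4}),
]
midpoint = sample(stream, 0.05)
```

Keying every parameter to the audio clock is what would let a game sample the stream at its own frame rate, whatever rate the model emits keyframes at.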
The project includes sample pipelines for lip-sync, gaze behavior, and interruptible conversational states, so designers can author branching dialogue while keeping character animation consistent. Meta emphasized the model's flexibility: studios can fine-tune it for stylized or realistic animation.
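An "interruptible conversational state" can be pictured as a small state machine in which the player is allowed to cut in while the NPC is mid-line. The Python sketch below is only an illustration of that idea. The states, events, and transition table are assumptions made for this example and do not describe the sample pipelines shipped with Hermes.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LISTENING = auto()
    SPEAKING = auto()


class Event(Enum):
    PLAYER_STARTS_TALKING = auto()
    PLAYER_STOPS_TALKING = auto()
    NPC_LINE_FINISHED = auto()


# Hypothetical transition table for a single NPC conversation.
TRANSITIONS = {
    (State.IDLE, Event.PLAYER_STARTS_TALKING): State.LISTENING,
    (State.LISTENING, Event.PLAYER_STOPS_TALKING): State.SPEAKING,
    (State.SPEAKING, Event.NPC_LINE_FINISHED): State.IDLE,
    # The interruption: the player cuts in while the NPC is mid-line,
    # so the NPC drops its line and goes back to listening.
    (State.SPEAKING, Event.PLAYER_STARTS_TALKING): State.LISTENING,
}


def step(state: State, event: Event) -> State:
    """Return the next state; events with no listed transition are ignored."""
    return TRANSITIONS.get((state, event), state)


# The player speaks, the NPC starts answering, and the player interrupts.
state = State.IDLE
for event in (
    Event.PLAYER_STARTS_TALKING,
    Event.PLAYER_STOPS_TALKING,
    Event.PLAYER_STARTS_TALKING,
):
    state = step(state, event)
```

In a real integration, each transition would also trigger an animation blend, for example easing out of the speaking pose rather than snapping to the listening one.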
While open-sourcing lowers the barrier to entry, Meta warns developers to test for culturally sensitive gestures and to curate training data to avoid inappropriate animations. The release also ships with a license that restricts misuse and with community guidelines for responsible use.