SyncNet based on Meta's Perception Encoder Audio-Visual (PE-AV)
computer-vision deep-learning pytorch video-processing representation-learning lip-sync speech-processing lipsync time-synchronization multimodal-learning audio-visual lip deepfake-detection active-speaker-detection time-delay-estimation syncnet lip-sync-detection audio-visual-sync
Updated Jan 12, 2026 - Python