A Survey of Audio Synthesis and Lip-syncing for Synthetic Video Generation

Kadam, Anup and Rane, Sagar and Mishra, Arpit Kumar and Sahu, Shailesh Kumar and Singh, Shubham and Pathak, Shivam Kumar (2021) A Survey of Audio Synthesis and Lip-syncing for Synthetic Video Generation. EAI Endorsed Transactions on Creative Technologies. e2. ISSN 2409-9708

Available under License Creative Commons Attribution No Derivatives.


Fields such as media, education, and the corporate sector have started focusing on content creation. This has led to a large demand for synthetic media generation using less data. To synthesize a high-quality artificial video, the lip movements must be synchronized with the audio. Here we compare various methods for voice cloning and lip synchronization. The voice-cloning procedures include state-of-the-art methods such as WaveNet and other text-to-speech approaches. The lip-synchronization methods cover both constrained and unconstrained approaches. Recent research such as LipGAN and Wav2Lip is discussed. The methods are compared and the best method is suggested. In addition to studying and comparing the various methods, we cover their drawbacks, future scope, and applications. Social and ethical issues are also discussed.

Item Type: Article
Uncontrolled Keywords: Video Synthesis, Voice Cloning, Lip Synchronization, Video Generation Application
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > T Technology (General)
Depositing User: EAI Editor IV
Date Deposited: 15 Jul 2021 11:37
Last Modified: 15 Jul 2021 11:37
URI: https://eprints.eudl.eu/id/eprint/4742
