Multimodel deepfake detection
Updated Jul 21, 2025 - Jupyter Notebook
This code uses well-known datasets, such as the Toronto dataset available on Kaggle, to build a sentiment analysis model from human voice. The model is based on HuBERT, a BERT-style self-supervised speech model.
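A minimal sketch of what such a HuBERT-based voice classifier can look like with Hugging Face transformers. The checkpoint name and the seven-class label count are illustrative assumptions, not details taken from this repository:

```python
# Illustrative sketch, not this repository's code. The checkpoint and the
# label count (7 emotion classes) are assumptions; adjust to the dataset.
import torch
from transformers import HubertForSequenceClassification, Wav2Vec2FeatureExtractor

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/hubert-base-ls960")
model = HubertForSequenceClassification.from_pretrained(
    "facebook/hubert-base-ls960",
    num_labels=7,  # classification head is freshly initialized and needs fine-tuning
)

waveform = torch.randn(16000)  # stand-in for 1 second of 16 kHz speech
inputs = feature_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(input_values=inputs.input_values).logits
predicted_emotion = logits.argmax(dim=-1).item()
```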
Tone.me helps users improve their Mandarin pronunciation.
A multimodal model that takes text, audio, and video to predict turn-taking; that is, whether the speaker in a discussion is about to change.
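One common way to combine three modalities for a prediction like this is late fusion of per-modality embeddings. The sketch below is a generic illustration under assumed embedding sizes, not this repository's architecture:

```python
# Generic late-fusion sketch, assuming precomputed per-modality embeddings;
# all dimensions and names are illustrative assumptions.
import torch
import torch.nn as nn

class TurnTakingFusion(nn.Module):
    def __init__(self, text_dim=768, audio_dim=768, video_dim=512, hidden=256):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + audio_dim + video_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # one logit: will the speaker change?
        )

    def forward(self, text_emb, audio_emb, video_emb):
        fused = torch.cat([text_emb, audio_emb, video_emb], dim=-1)
        return self.classifier(fused)  # raw logit; apply sigmoid for a probability

model = TurnTakingFusion()
logit = model(torch.randn(1, 768), torch.randn(1, 768), torch.randn(1, 512))
prob_turn_change = torch.sigmoid(logit)
```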
Generates section-wise topics and transcriptions for lecture videos, and controls lecture-video playback based on the generated topic-wise timestamps.
PyTorch implementation of MixCap, a multimodal video captioning model fusing BLIP-2 (visual) and Wav2Vec2 (audio). Features a novel Dual-Target MixUp strategy for low-resource training.
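For readers unfamiliar with MixUp: in its standard classification form, two inputs are interpolated and the loss is computed against both of their targets. The sketch below shows only that generic formulation; how MixCap's Dual-Target variant applies it to caption sequences is not shown, and the function and names here are assumptions:

```python
# Generic MixUp with a dual-target loss (the standard formulation; assumed to
# be related to, but not necessarily identical to, MixCap's strategy).
import torch
import torch.nn.functional as F
from torch.distributions import Beta

def mixup_step(model, x_a, x_b, y_a, y_b, alpha=0.4):
    lam = Beta(alpha, alpha).sample().item()  # mixing coefficient in (0, 1)
    x_mixed = lam * x_a + (1 - lam) * x_b     # interpolate the two inputs
    logits = model(x_mixed)
    # Weighted loss against both original targets
    return lam * F.cross_entropy(logits, y_a) + (1 - lam) * F.cross_entropy(logits, y_b)

model = torch.nn.Linear(16, 4)  # stand-in classifier for the demo
x_a, x_b = torch.randn(8, 16), torch.randn(8, 16)
y_a, y_b = torch.randint(0, 4, (8,)), torch.randint(0, 4, (8,))
loss = mixup_step(model, x_a, x_b, y_a, y_b)
loss.backward()
```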
Low-resource multimodal hate speech detection leveraging acoustic and textual representations for robust moderation in Telugu.
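A typical first step in such acoustic-plus-textual pipelines is extracting utterance-level embeddings with a pretrained Wav2Vec2 encoder. The checkpoint and the mean-pooling choice below are assumptions for illustration, not details from the repository:

```python
# Illustrative acoustic-embedding extraction with Wav2Vec2; checkpoint and
# pooling are assumptions, not taken from the repository.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

waveform = torch.randn(16000)  # stand-in for 1 second of 16 kHz speech
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(input_values=inputs.input_values).last_hidden_state  # (1, T, 768)
acoustic_emb = hidden.mean(dim=1)  # mean-pool over time -> one utterance vector
```

The resulting utterance vector could then be concatenated with a text embedding for classification, along the lines of the fusion sketch above.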