¹International Digital Economy Academy (IDEA), Shenzhen, China   ²Vistring Inc., Hangzhou, China
—ICCV 2023—
MODA is a unified system for multi-person, diverse, and high-fidelity talking-portrait generation.
News

- 2023/08/31: Training code has been released.
- 2023/08/31: Pretrained models have been released.
- 2023/08/13: Inference code has been released.
- 2023/08/13: Data preprocessing scripts have been released.
After cloning the repository, install the environment by running the install.sh script. It will prepare MODA for use.
```bash
git clone https://github.com/DreamtaleCore/MODA.git
cd MODA
bash ./install.sh
```

Quick run
```bash
python inference.py
```

Then, a few minutes later ☕, the results will be generated in results/.
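To run on your own audio, you can pass the arguments listed below. A minimal sketch, assuming hypothetical input paths (the flags themselves come from the help text that follows; the --n_sample value is an assumption):

```bash
# Hypothetical paths; --n_sample is assumed to set how many samples are generated
python inference.py \
    --audio_fp_or_dir path/to/your_audio.wav \
    --person_config path/to/person_config.yaml \
    --output_dir results/ \
    --n_sample 1
```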
Parameters:

```
usage: Inference entrance for MODA. [-h] [--audio_fp_or_dir AUDIO_FP_OR_DIR] [--person_config PERSON_CONFIG]
                                    [--output_dir OUTPUT_DIR] [--n_sample N_SAMPLE]

optional arguments:
  -h, --help            show this help message and exit
  --audio_fp_or_dir AUDIO_FP_OR_DIR
  --person_config PERSON_CONFIG
  --output_dir OUTPUT_DIR
  --n_sample N_SAMPLE
```

Data preprocessing

```bash
cd data_prepare
python process.py -i your/video/dir -o your/output/dir
```

For more information, please refer to here.
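For example, a run over one subject's raw clips might look like this (the directory names below are hypothetical placeholders):

```bash
# Hypothetical directories: -i points at the raw video folder, -o at the processed output folder
python process.py -i data/raw/Cathy -o data/processed/Cathy
```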
Training

Train the MODA network:

```bash
python train.py --config configs/train/moda.yaml
```

Train the FaCo model:

```bash
python train.py --config configs/train/faco.yaml
```

Train the renderer (here for the subject Cathy):

```bash
python train_renderer.py --config configs/train/renderer/Cathy.yaml
```

After training, link the best checkpoints into assets/ckpts/:

```bash
ln -s your_absolute_dir/TrainMODAVel/Audio2FeatureVertices/best_MODA.pkl assets/ckpts/MODA.pkl
ln -s your_absolute_dir/TrainFaCoModel/Audio2FeatureVertices/best_FaCo_G.pkl assets/ckpts/FaCo.pkl
ln -s your_absolute_dir/Render/TrainRenderCathy/Render/best_Render_G.pkl assets/ckpts/renderer/Cathy.pth
```

Then update the checkpoint file paths in your config files.
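As a quick sanity check (using the same paths as the `ln -s` commands above), you can confirm the links resolve to real files before running inference:

```bash
# List the linked checkpoints, following symlinks; a "No such file" error means a link is broken
ls -lL assets/ckpts/MODA.pkl assets/ckpts/FaCo.pkl assets/ckpts/renderer/Cathy.pth
```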
TODO

- [x] Release the inference code
- [x] Data preprocessing scripts
- [x] Prepare the pretrained weights
- [x] Release the training code
- [ ] Prepare the huggingface 🤗 demo
- [ ] Release the processed HDTF data
If you find our work useful in your research, please consider citing:
```bibtex
@inproceedings{liu2023MODA,
  title={MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions},
  author={Liu, Yunfei and Lin, Lijian and Yu, Fei and Zhou, Changyin and Li, Yu},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}
```

Acknowledgement

Our code is based on LiveSpeechPortrait and FaceFormer.