MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions


ICCV 2023


🕊️ Description

MODA is a unified system for multi-person, diverse, and high-fidelity talking portrait generation.

🎊 News

  • 2023/08/31 Training code has been released.
  • 2023/08/31 Pretrained models have been released.
  • 2023/08/13 Inference code has been released.
  • 2023/08/13 Data preprocessing scripts have been released.

🛠️ Installation

After cloning the repository, set up the environment by running the install.sh script; it prepares MODA for use.

git clone https://github.com/DreamtaleCore/MODA.git
cd MODA
bash ./install.sh

🚀 Usage

Quick run

python inference.py

After a few minutes ☕, the results will be generated in results/.

Parameters:

usage: Inference entrance for MODA. [-h] [--audio_fp_or_dir AUDIO_FP_OR_DIR] [--person_config PERSON_CONFIG]
                                    [--output_dir OUTPUT_DIR] [--n_sample N_SAMPLE]

optional arguments:
  -h, --help            show this help message and exit
  --audio_fp_or_dir AUDIO_FP_OR_DIR
  --person_config PERSON_CONFIG
  --output_dir OUTPUT_DIR
  --n_sample N_SAMPLE
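
For example, a full invocation might look like the following sketch; the audio file, person config path, and sample count are placeholders, not files shipped with a particular checkout:

# all paths and values below are illustrative (point them at your own files)
python inference.py \
    --audio_fp_or_dir assets/audio/demo.wav \
    --person_config assets/configs/Cathy.yaml \
    --n_sample 1 \
    --output_dir results/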

🍏 Dataset preparation

cd data_prepare

python process.py -i your/video/dir -o your/output/dir

For more information, please refer to here.
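
As a concrete sketch, processing a folder of talking-head videos looks like the following (both directories are hypothetical; substitute your own):

python process.py -i ~/datasets/my_videos -o ~/datasets/my_videos_processed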

🏃 Train

Train the MODA and FaCo models

python train.py --config configs/train/moda.yaml
python train.py --config configs/train/faco.yaml

Train the renderer for new avatar

python train_renderer.py --config configs/train/renderer/Cathy.yaml
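
To train a renderer for your own avatar, a reasonable pattern (the MyAvatar name is a placeholder) is to copy the provided config, edit its paths, and train from the copy:

cp configs/train/renderer/Cathy.yaml configs/train/renderer/MyAvatar.yaml
# edit the dataset and output paths inside MyAvatar.yaml, then:
python train_renderer.py --config configs/train/renderer/MyAvatar.yaml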

Link your models

ln -s your_absolute_dir/TrainMODAVel/Audio2FeatureVertices/best_MODA.pkl assets/ckpts/MODA.pkl
ln -s your_absolute_dir/TrainFaCoModel/Audio2FeatureVertices/best_FaCo_G.pkl assets/ckpts/FaCo.pkl
ln -s your_absolute_dir/Render/TrainRenderCathy/Render/best_Render_G.pkl assets/ckpts/renderer/Cathy.pth

Then update the ckpt filepath in your config files.
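
Note that ln -s does not verify its target, so a mistyped your_absolute_dir silently produces broken links. A quick sanity check that the links resolve, using the target names from the example above:

# -L dereferences the symlinks; a broken link is reported as an error
ls -lL assets/ckpts/MODA.pkl assets/ckpts/FaCo.pkl assets/ckpts/renderer/Cathy.pth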

🚧 TODO

  • Release the inference code
  • Release the data preprocessing scripts
  • Prepare the pretrained weights
  • Release the training code
  • Prepare the Hugging Face 🤗 demo
  • Release the processed HDTF data

🛎 Citation

If you find our work useful in your research, please consider citing:

@inproceedings{liu2023MODA,
  title={MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions},
  author={Liu, Yunfei and Lin, Lijian and Yu, Fei and Zhou, Changyin and Li, Yu},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}

🥂 Acknowledgement

Our code is based on LiveSpeechPortrait and FaceFormer.
