Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

Overview

📖 Depth-Aware Generative Adversarial Network for Talking Head Video Generation (CVPR 2022)

🔥 If you find DaGAN helpful for your photos or projects, please ⭐ this repo or recommend it to your friends. Thanks 🔥

[Paper]   [Project Page]   [Demo]

Fa-Ting Hong, Longhao Zhang, Li Shen, Dan Xu
The Hong Kong University of Science and Technology

Cartoon Sample

cartoon.mp4

Human Sample

celeb.mp4

Voxceleb1 Dataset

🚩 Updates

  • 🔥 🔥 ✅ Added the SPADE model, which produces more natural results.

🔧 Dependencies and Installation

Installation

We now provide a clean version of DaGAN, which does not require customized CUDA extensions.

  1. Clone repo

    git clone https://github.com/harlanhong/CVPR2022-DaGAN.git
    cd CVPR2022-DaGAN
  2. Install dependent packages (a quick sanity check is sketched after these steps)

    pip install -r requirements.txt
    
    ## Install the Face Alignment lib
    cd face-alignment
    pip install -r requirements.txt
    python setup.py install
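
To quickly verify the installation, an import check such as the following can help. This is a minimal sketch and not part of the repo; note that newer releases of the face-alignment library rename LandmarksType._2D to LandmarksType.TWO_D, so adjust accordingly.

    # Assumed sanity check, not part of the repo.
    import torch
    import face_alignment

    print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    # In newer face-alignment versions, use LandmarksType.TWO_D instead of LandmarksType._2D.
    fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device="cpu")
    print("face-alignment model loaded")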

⚡ Quick Inference

We take the paper version as an example. More models can be found here.

YAML configs

See config/vox-adv-256.yaml for a description of each parameter.
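
If you want to inspect the config programmatically before inference or training, a minimal sketch (assuming PyYAML is installed) is:

    # Minimal sketch: load and print the sections of the config (assumes PyYAML).
    import yaml

    with open("config/vox-adv-256.yaml") as f:
        config = yaml.safe_load(f)

    # Sections typically include dataset_params, model_params and train_params.
    for section, value in config.items():
        print(section, "->", list(value) if isinstance(value, dict) else value)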

Pre-trained checkpoint

The pre-trained checkpoint of the face depth network and our DaGAN checkpoints can be found at the following link: OneDrive.

Inference! To run a demo, download a checkpoint and run the following command:

CUDA_VISIBLE_DEVICES=0 python demo.py  --config config/vox-adv-256.yaml --driving_video path/to/driving --source_image path/to/source --checkpoint path/to/checkpoint --relative --adapt_scale --kp_num 15 --generator DepthAwareGenerator 

The result will be stored in result.mp4. The driving videos and source images should be cropped before they can be used in our method. To obtain semi-automatic crop suggestions you can use python crop-video.py --inp some_youtube_video.mp4. It will generate ffmpeg commands for the crops.
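
If you only have a single source image (crop-video.py expects a video), one possible way to obtain a square face crop is to reuse the face-alignment library installed above. This is an illustrative sketch with hand-picked margins, not the exact cropping logic of crop-video.py; the path names are placeholders.

    # Hypothetical helper: square face crop for a single source image.
    import face_alignment
    import imageio
    import numpy as np
    from skimage.transform import resize

    fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device="cpu")
    img = imageio.imread("path/to/source.png")
    landmarks = fa.get_landmarks(img)[0]                 # (68, 2) facial landmarks
    (x0, y0), (x1, y1) = landmarks.min(axis=0), landmarks.max(axis=0)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half = 0.8 * max(x1 - x0, y1 - y0)                   # enlarge the landmark box a bit
    top, bottom = int(max(cy - half, 0)), int(min(cy + half, img.shape[0]))
    left, right = int(max(cx - half, 0)), int(min(cx + half, img.shape[1]))
    crop = resize(img[top:bottom, left:right], (256, 256))  # float values in [0, 1]
    imageio.imwrite("source_cropped.png", (crop * 255).astype(np.uint8))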

💻 Training

Datasets

  1. VoxCeleb. Please follow the instruction from https://github.com/AliaksandrSiarohin/video-preprocessing.

Train on VoxCeleb

To train a model on a specific dataset, run:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --master_addr="0.0.0.0" --master_port=12348 run.py --config config/vox-adv-256.yaml --name DaGAN --rgbd --batchsize 12 --kp_num 15 --generator DepthAwareGenerator

The code will create a folder in the log directory (each run will create a new name-specific directory). Checkpoints will be saved to this folder. To check the loss values during training, see log.txt. By default, the batch size is tuned to run on 8 GeForce RTX 3090 GPUs (you can obtain the best performance after about 150 epochs). You can change the batch size in train_params in the .yaml file.

You can also monitor the training loss by running the following command:

tensorboard --logdir log/DaGAN/log

If you kill your process in the middle of training for some reason, a zombie process may remain; you can kill it using our provided tool:

python kill_port.py PORT

Training on your own dataset

  1. Resize all the videos to the same size, e.g., 256x256; the videos can be '.gif' or '.mp4' files, or folders of images. We recommend the latter: for each video, make a separate folder with all the frames in '.png' format. This format is lossless and has better I/O performance (a frame-extraction sketch follows this list).

  2. Create a folder data/dataset_name with two subfolders, train and test; put training videos in train and testing videos in test.

  3. Create a config config/dataset_name.yaml; in dataset_params, specify the root directory as root_dir: data/dataset_name. Also adjust the number of epochs in train_params.
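
For step 1, a minimal frame-extraction sketch could look like the following. This is an assumed helper, not part of the repo; the directory names are placeholders, and reading '.mp4' files with imageio requires the imageio-ffmpeg backend.

    # Assumed helper: turn each .mp4 into a folder of lossless 256x256 .png frames.
    import glob
    import os

    import imageio
    import numpy as np
    from skimage.transform import resize

    src_dir, dst_dir = "raw_videos/train", "data/dataset_name/train"
    for path in glob.glob(os.path.join(src_dir, "*.mp4")):
        out_dir = os.path.join(dst_dir, os.path.splitext(os.path.basename(path))[0])
        os.makedirs(out_dir, exist_ok=True)
        for i, frame in enumerate(imageio.get_reader(path)):
            frame = (resize(frame, (256, 256)) * 255).astype(np.uint8)
            imageio.imwrite(os.path.join(out_dir, f"{i:07d}.png"), frame)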

📜 Acknowledgement

Our DaGAN implementation is inspired by FOMM. We appreciate the authors of FOMM for making their code publicly available.

📜 BibTeX

@inproceedings{hong2022depth,
  title={Depth-Aware Generative Adversarial Network for Talking Head Video Generation},
  author={Hong, Fa-Ting and Zhang, Longhao and Shen, Li and Xu, Dan},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}

📧 Contact

If you have any questions, please email [email protected].

Comments
  • add web demo/model to Huggingface


    Hi, would you be interested in adding DaGAN to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models, datasets, and Spaces (web demos) can be added to a user account or organization, similar to GitHub.

    Examples from other organizations:
      Keras: https://huggingface.co/keras-io
      Microsoft: https://huggingface.co/microsoft
      Facebook: https://huggingface.co/facebook

    Example Spaces with repos:
      GitHub: https://github.com/salesforce/BLIP / Spaces: https://huggingface.co/spaces/salesforce/BLIP
      GitHub: https://github.com/facebookresearch/omnivore / Spaces: https://huggingface.co/spaces/akhaliq/omnivore

    And here are guides for adding Spaces/models/datasets to your org:
      How to add a Space: https://huggingface.co/blog/gradio-spaces
      How to add models: https://huggingface.co/docs/hub/adding-a-model
      Uploading a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

    Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

    opened by AK391 18
  • Error in running a demo version!


    Hello! Thanks for openly sharing this amazing work! My research is also related to generating talking faces. I get an error when I try to run: CUDA_VISIBLE_DEVICES=0 python demo.py --config config/vox-adv-256.yaml --driving_video data/2.mp4 --source_image data/2.jpg --checkpoint depth/models/weights_19/encoder.pth --relative --adapt_scale --kp_num 15 --generator DepthAwareGenerator [screenshot of the error]. Can you please point out where I made a mistake while running the demo?

    opened by muxiddin19 3
  • Fix some codes about py-feat library


    Hi @harlanhong !

    First, I'm very pleased to see your work, DaGAN. Thanks for your effort. The reason I opened this issue is that I want to fix your code a little bit. In your utils.py, there is some code using the py-feat library, and it causes a problem. I don't know which version of py-feat you use, but you should change the code as follows, because the latest version works this way:

    p1 = out1.facepose().values # AS-IS
    p1 = out1.facepose.values # TO-BE
    

    because the latest version of py-feat exposes facepose as a property, like this:

    @property
    def facepose(self):
        """Returns the facepose data using the columns set in fex.facepose_columns

        Returns:
            DataFrame: facepose data
        """
        return self[self.facepose_columns]
    

    Could you fix this problem for anybody who will use this code?

    opened by samsara-ku 3
  • Size of input


    Hello, thanks for your great work! I have a question: does your model support input resolutions higher than 256px, e.g., 512px? I see that in the code the input video and image are resized to 256px, which causes a loss of visual quality. Is there a way to use 512x512 images/videos without losing quality?

    opened by NikitaKononov 3
  • The generated face remains the same pose


    Thanks for your good work; however, when I tried running the demo, the generated video tends to retain the same pose as the source image, while in the paper (Figure 2) the generated results follow the driving frame's pose (this is also the case for the results in the README). Why is this the case?

    https://user-images.githubusercontent.com/29053705/165462856-da97c242-b091-4609-b122-414c4216f492.mp4

    opened by hallwaypzh 3
  • Error while training on VoxCeleb


    Hi, I am trying to train DaGAN on VoxCeleb. The following error is occurring.

      File "run.py", line 144, in <module>
        train(config, generator, discriminator, kp_detector, opt.checkpoint, log_dir, dataset, opt.local_rank,device,opt,writer)
      File "/home/madhav3101/gan_codes/CVPR2022-DaGAN/train.py", line 66, in train
        losses_generator, generated = generator_full(x)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/madhav3101/gan_codes/CVPR2022-DaGAN/modules/model.py", line 189, in forward
        kp_driving = self.kp_extractor(driving)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/madhav3101/gan_codes/CVPR2022-DaGAN/modules/keypoint_detector.py", line 51, in forward
        feature_map = self.predictor(x) #x bz,4,64,64
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/madhav3101/gan_codes/CVPR2022-DaGAN/modules/util.py", line 252, in forward
        return self.decoder(self.encoder(x))
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/madhav3101/gan_codes/CVPR2022-DaGAN/modules/util.py", line 178, in forward
        out = up_block(out)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/madhav3101/gan_codes/CVPR2022-DaGAN/modules/util.py", line 92, in forward
        out = self.norm(out)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 745, in forward
        self.eps,
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/nn/functional.py", line 2283, in batch_norm
        input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
     (function _print_stack)
      0%|          | 0/3965 [00:26<?, ?it/s]
      0%|          | 0/150 [00:26<?, ?it/s]
    
    Traceback (most recent call last):
      File "run.py", line 144, in <module>
        train(config, generator, discriminator, kp_detector, opt.checkpoint, log_dir, dataset, opt.local_rank,device,opt,writer)
      File "/home/madhav3101/gan_codes/CVPR2022-DaGAN/train.py", line 70, in train
        loss.backward()
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/_tensor.py", line 307, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/autograd/__init__.py", line 156, in backward
        allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32]] is at version 4; expected version 3 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
    /home/madhav3101/env_tf/lib/python3.7/site-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
    and will be removed in future. Use torchrun.
    Note that --use_env is set by default in torchrun.
    If your script expects `--local_rank` argument to be set, please
    change it to read from `os.environ['LOCAL_RANK']` instead. See
    https://pytorch.org/docs/stable/distributed.html#launch-utility for
    further instructions
    
      FutureWarning,
    ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 13113) of binary: /home/madhav3101/env_tf/bin/python
    Traceback (most recent call last):
      File "/home/madhav3101/miniconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/madhav3101/miniconda3/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
        main()
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
        launch(args)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
        run(args)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/distributed/run.py", line 713, in run
        )(*cmd_args)
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
        return launch_agent(self._config, self._entrypoint, list(args))
      File "/home/madhav3101/env_tf/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
        failures=result.failures,
    torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
    ============================================================
    run.py FAILED
    ------------------------------------------------------------
    Failures:
      <NO_OTHER_FAILURES>
    ------------------------------------------------------------
    Root Cause (first observed failure):
    [0]:
      time      : 2022-04-25_17:30:13
      host      : gnode90.local
      rank      : 0 (local_rank: 0)
      exitcode  : 1 (pid: 13113)
      error_file: <N/A>
      traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
    ============================================================
    
    
    opened by mdv3101 3
  • About measurement question


    Hi, @harlanhong. First, I appreciate your nice work in this field.

    I'm just asking how you measured the metric results reported in your tables.

    Did you write your own code to measure those results, or did you import library functions?

    If you wrote the code, could you share it? If not, what library did you use to measure those results?

    Thank you.

    [screenshot]

    opened by samsara-ku 2
  • kp_num


    It seems the parameter 'kp_num' cannot be changed. When I set it to 20, the following error occurs:

    Traceback (most recent call last):
      File "demo.py", line 191, in <module>
        generator, kp_detector = load_checkpoints(config_path=opt.config, checkpoint_path=opt.checkpoint, cpu=opt.cpu)
      File "demo.py", line 46, in load_checkpoints
        generator.load_state_dict(ckp_generator)
      File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1490, in load_state_dict
        raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    RuntimeError: Error(s) in loading state_dict for SPADEDepthAwareGenerator:
      size mismatch for dense_motion_network.hourglass.encoder.down_blocks.0.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 84, 3, 3]).
      size mismatch for dense_motion_network.mask.weight: copying a param with shape torch.Size([16, 128, 7, 7]) from checkpoint, the shape in current model is torch.Size([21, 148, 7, 7]).
      size mismatch for dense_motion_network.mask.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([21]).
      size mismatch for dense_motion_network.occlusion.weight: copying a param with shape torch.Size([1, 128, 7, 7]) from checkpoint, the shape in current model is torch.Size([1, 148, 7, 7]).

    opened by MingZJU 2
  • Error as training on my own dataset, did anyone have this problem before?


    [W python_anomaly_mode.cpp:104] Warning: Error detected in CudnnBatchNormBackward. Traceback of forward call that caused the error:
      File "run.py", line 144, in <module>
        train(config, generator, discriminator, kp_detector, opt.checkpoint, log_dir, dataset, opt.local_rank,device,opt,writer)
      File "/mnt/users/CVPR2022-DaGAN-master/train.py", line 66, in train
        losses_generator, generated = generator_full(x)

    Meanwhile, there is another problem as well:

    Traceback (most recent call last):
      File "run.py", line 144, in <module>
        train(config, generator, discriminator, kp_detector, opt.checkpoint, log_dir, dataset, opt.local_rank,device,opt,writer)
      File "/mnt/users/CVPR2022-DaGAN-master/train.py", line 74, in train
        loss.backward()
      File "/home/anaconda3/envs/DaGAN/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32]] is at version 5; expected version 4 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

    It seems to be an in-place operation problem, but I couldn't find any in-place code anywhere.

    opened by twilight0718 2
  • Suggestion: Add automatic face cropping in demo.py


    The output result depends significantly on the input image. Here are a few samples:

    1. Photo as is
    2. Photo with manual crop
    3. Photo converted to video and cropped with crop-video.py

    Please crop the input image inside demo.py automatically.

    https://user-images.githubusercontent.com/84853762/169252080-db016d04-2f9c-4bb3-9d84-d5de0450d2a6.mp4

    https://user-images.githubusercontent.com/84853762/169252087-5a8c9a5a-0eeb-436f-874f-0745683e64b3.mp4

    https://user-images.githubusercontent.com/84853762/169252089-a8b37f66-897b-4092-a109-86295fecbf15.mp4

    opened by Vadim2S 2
  • Depth map from paper not reproducible


    Hi

    Firstly, thank you for this awesome work. However, I tried to reproduce the depth map from the paper using the "demo.py" script, and the result is quite different from the one shown in Fig. 9 of the paper.

    Result from the paper: depthmapDaGAN paper

    Result running the script: depthmapDaGAN_myRun

    Corresponding depth map as pointcloud: depthmapDaGAN_myRunPCD

    The depth map looks much smoother, and facial details like the nose or mouth are completely lost.

    opened by mrokuss 2
  • How to preprocess the image data?


    If I have a face image as the driving image, how do I properly crop it? Could you provide the script? I tested crop-video.py, but it does not work for a single image.

    opened by ChawDoe 0
  • DaGAN VoxCeleb


    Hello, I see that you've released a face depth model trained on VoxCeleb2. Does it show better results than your previous depth checkpoints? Can I use it with the SPADE or standard DaGAN checkpoints? Can you please tell us when you plan to release a DaGAN checkpoint corresponding to the VoxCeleb2 depth model? Thanks a lot for your great work.

    opened by NikitaKononov 1
  • testing error


    When I run this command: CUDA_VISIBLE_DEVICES=0 python demo.py --config config/vox-adv-256.yaml --driving_video ./example_video.mp4 --source_image ./example_image.png --checkpoint ./checkpoints/SPADE_DaGAN_vox_adv_256.pth.tar --relative --adapt_scale --kp_num 15 --generator SPADEDepthAwareGenerator --result_video results/example_out.mp4 --find_best_frame

    I got the following error: Traceback (most recent call last): File "demo.py", line 169, in depth_encoder.load_state_dict(filtered_dict_enc) File "/home/miniconda3/envs/dagan/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict self.class.name, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for ResnetEncoder: size mismatch for encoder.layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for encoder.layer1.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for encoder.layer2.0.conv1.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]). size mismatch for encoder.layer2.0.downsample.0.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 1, 1]). size mismatch for encoder.layer2.0.downsample.1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for encoder.layer2.0.downsample.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for encoder.layer2.0.downsample.1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for encoder.layer2.0.downsample.1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for encoder.layer2.1.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for encoder.layer3.0.conv1.weight: copying a param with shape torch.Size([256, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for encoder.layer3.0.downsample.0.weight: copying a param with shape torch.Size([1024, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 1, 1]). size mismatch for encoder.layer3.0.downsample.1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for encoder.layer3.0.downsample.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for encoder.layer3.0.downsample.1.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for encoder.layer3.0.downsample.1.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for encoder.layer3.1.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]). size mismatch for encoder.layer4.0.conv1.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 3, 3]). 
size mismatch for encoder.layer4.0.downsample.0.weight: copying a param with shape torch.Size([2048, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 1, 1]). size mismatch for encoder.layer4.0.downsample.1.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for encoder.layer4.0.downsample.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for encoder.layer4.0.downsample.1.running_mean: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for encoder.layer4.0.downsample.1.running_var: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for encoder.layer4.1.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for encoder.fc.weight: copying a param with shape torch.Size([1000, 2048]) from checkpoint, the shape in current model is torch.Size([1000, 512]).

    opened by Ha0Tang 3
  • pip install -r requirements.txt


    Processing /data/fhongac/workspace/src/CVPR22_DaGAN/torch-1.9.0+cu111-cp37-cp37m-linux_x86_64.whl ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/data/fhongac/workspace/src/CVPR22_DaGAN/torch-1.9.0+cu111-cp37-cp37m-linux_x86_64.whl'

    opened by Ha0Tang 1