The official code for BSTRO from the paper "Capturing and Inferring Dense Full-Body Human-Scene Contact" (CVPR 2022)

BSTRO: Body-Scene contact TRansfOrmer

This is the code repository for Capturing and Inferring Dense Full-Body Human-Scene Contact.

Body-Scene contact TRansfOrmer (BSTRO) is a transformer-based method that detects human-scene contact directly from pixels. In this repository, we provide the inference code of BSTRO.

TODO items

  • Release training and validation code (coming soon)

Installation

Check INSTALL.md for installation instructions.

Pre-trained models and other required files

Please download our pre-trained weights from the website, then follow DOWNLOAD.md to prepare the other files required to run our code.

Quick demo

We provide demo code to run end-to-end inference on the test images.

Check DEMO.md for details.

Citations

If you find our work useful in your research, please consider citing:

@inproceedings{huang2022rich,
    title = {Capturing and Inferring Dense Full-Body Human-Scene Contact},
    author = {Huang, Chun-Hao P. and Yi, Hongwei and H{\"o}schle, Markus and Safroshkin, Matvey and Alexiadis, Tsvetelina and Polikovsky, Senya and Scharstein, Daniel and Black, Michael J.},
    booktitle = {IEEE/CVF Conf.~on Computer Vision and Pattern Recognition (CVPR)},
    pages = {13274-13285},
    month = jun,
    year = {2022},
    month_numeric = {6}
}

License

Our research code is released under the MPI license. See LICENSE for details.

METRO is released under the MIT license. See LICENSE for details.

We use the huggingface/transformers submodule. Please see NOTICE for details.

Acknowledgments

Our implementation and experiments are built on top of the open-source GitHub repositories listed below. We thank all the authors who made their code public, which greatly accelerated our project. If you find these works helpful, please consider citing them as well.

microsoft/MeshTransformer

huggingface/transformers

HRNet/HRNet-Image-Classification

nkolot/GraphCMR

Contact

For questions, please contact [email protected]

For commercial licensing (and all related questions for business applications), please contact [email protected].

Comments
  • DEMO Error : can't run ref_vertices.expand(batch_size, -1, -1)

    Hi, I really liked your paper and wanted to try out the demo. I ran the exact command from DEMO.md and got the following error:

    Traceback (most recent call last):
      File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 302, in <module>
        main(args)
      File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 296, in main
        run_inference(args, _bstro_network, smpl, mesh_sampler)
      File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 88, in run_inference
        _, _, pred_contact = BSTRO_model(images, smpl, mesh_sampler)
      File "/home/oscar/anaconda3/envs/bstro2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/oscar/Workspace/bstro/bstro/metro/modeling/bert/modeling_bstro.py", line 203, in forward
        ref_vertices = ref_vertices.expand(batch_size, -1, -1)
    RuntimeError: The expanded size of the tensor (1) must match the existing size (30) at non-singleton dimension 0.  Target sizes: [1, -1, -1].  Tensor sizes: [30, 431, 3]
    

    My setup: Python 3.10.5, PyTorch 1.11.0, torchvision 0.12.0, CUDA 11.3.1

    opened by oscarfossey 10
  • dataset preparation

    Hi, authors. Thanks for the great work!

    Could this repository provide more detailed instructions for preparing the proposed RICH dataset, along with training scripts, so that I can try to train the model?

    opened by YongtaoGe 2
  • ImportError: cannot import name 'METRO' from 'metro.modeling.bert.modeling_bstro'

    Hi, I am very interested in your excellent work! However, I hit a bug when running the demo code from https://github.com/paulchhuang/bstro/blob/main/docs/DEMO.md#human-scene-contact-detection: ImportError: cannot import name 'METRO' from 'metro.modeling.bert.modeling_bstro'

    opened by JiahongWu1995 2
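
The RuntimeError in the first comment above comes from a shape rule: expand can only grow a dimension whose current size is 1, and -1 means "keep the current size". The sketch below is a pure-Python model of that rule, written for illustration only. The helper name expanded_shape is hypothetical and is not part of PyTorch or of this repository. It uses the shapes reported in the traceback and does not diagnose why ref_vertices has a leading size of 30 in that setup.

```python
def expanded_shape(existing, target):
    """Return the shape produced by expanding `existing` to `target`.

    Hypothetical helper that models the shape rule only; it does not
    touch any tensor data. -1 in `target` keeps the current size.
    """
    if len(target) < len(existing):
        raise ValueError("target needs at least as many dimensions as the tensor")
    offset = len(target) - len(existing)
    result = []
    for i, want in enumerate(target):
        if i < offset:
            # A new leading dimension has no current size, so -1 is invalid here.
            if want == -1:
                raise ValueError(f"-1 is not allowed for new leading dimension {i}")
            result.append(want)
            continue
        have = existing[i - offset]
        if want == -1 or want == have:
            result.append(have)   # size is kept unchanged
        elif have == 1:
            result.append(want)   # singleton dimensions may grow
        else:
            raise ValueError(
                f"expanded size ({want}) must match existing size ({have}) "
                f"at non-singleton dimension {i}"
            )
    return tuple(result)


# Shapes from the traceback: leading size 30, requested batch size 1.
try:
    expanded_shape((30, 431, 3), (1, -1, -1))
except ValueError as err:
    print("rejected:", err)

# A template with a singleton leading dimension expands to any batch size.
print(expanded_shape((1, 431, 3), (30, -1, -1)))
```

Under this rule, the call in the traceback can only succeed if the leading dimension of ref_vertices is 1 or already equals batch_size.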
Owner
Paul Huang