The official repo for the paper "An Empirical Study of Remote Sensing Pretraining"

Overview

An Empirical Study of Remote Sensing Pretraining

Di Wang, Jing Zhang, Bo Du, Gui-Song Xia and Dacheng Tao

Updates | Introduction | Usage | Results & Models | Statement |

Current applications

Scene Recognition: Please see Remote Sensing Pretraining for Scene Recognition;

Semantic Segmentation: Please see Remote Sensing Pretraining for Semantic Segmentation;

Object Detection: Please see Remote Sensing Pretraining for Object Detection;

Change Detection: Please see Remote Sensing Pretraining for Change Detection;

ViTAE: Please see ViTAE-Transformer;

Matting: Please see ViTAE-Transformer for matting;

Updates

11/04/2022

The baiduyun links of the pretrained models are provided.

07/04/2022

The paper is posted on arXiv!

06/04/2022

The pretrained models for ResNet-50, Swin-T and ViTAEv2-S are released. The code for pretraining and downstream tasks is also provided for reference.

Introduction

This repository contains codes, models and test results for the paper "An Empirical Study of Remote Sensing Pretraining".

Aerial images are usually captured from a bird's-eye view by cameras mounted on planes or satellites, perceiving a large scope of land use and land cover. Such scenes are often difficult to interpret because of interference from scene-irrelevant regions and the complicated spatial distribution of land objects. Although deep learning has largely reshaped remote sensing research for aerial image understanding and achieved great success, most existing deep models are initialized with ImageNet pretrained weights, and natural images inevitably present a large domain gap relative to aerial images, probably limiting the finetuning performance on downstream aerial scene tasks. This issue motivates us to conduct an empirical study of remote sensing pretraining (RSP). To this end, we train different networks from scratch on MillionAID, the largest remote sensing scene recognition dataset to date, to obtain remote sensing pretrained backbones, including both convolutional neural networks (CNNs) and vision transformers such as Swin and ViTAE, which have shown promising performance on computer vision tasks. We then investigate the impact of ImageNet pretraining (IMP) and RSP on a series of downstream tasks, including scene recognition, semantic segmentation, object detection, and change detection, using these CNN and vision transformer backbones.
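As a rough illustration of the RSP pipeline (a minimal sketch, not our actual training code; the class count and file names below are placeholders), a backbone is first trained from scratch on aerial scene labels and its weights are then exported for downstream initialization:

    import torch
    import torch.nn as nn
    from torchvision.models import resnet50

    NUM_SCENE_CLASSES = 51  # placeholder for the MillionAID category count used in pretraining

    # Train from scratch (no ImageNet weights) on aerial scene images.
    backbone = resnet50(weights=None)
    backbone.fc = nn.Linear(backbone.fc.in_features, NUM_SCENE_CLASSES)

    # ... supervised pretraining loop on the remote sensing dataset goes here ...

    # Keep everything except the classification head and save it as the
    # remote sensing pretrained backbone for downstream tasks.
    state_dict = {k: v for k, v in backbone.state_dict().items() if not k.startswith("fc.")}
    torch.save(state_dict, "rsp_resnet50_backbone.pth")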

Fig. - (a) and (b) are a natural image and an aerial image, both belonging to the "park" category. (c) and (d) are two aerial images from the "school" category. Besides the distinct difference in viewpoint between (a) and (b), (b) contains a playground, which is unusual in park scenes but common in school scenes such as (d). On the other hand, (c) and (d) show different colors as well as significantly different spatial distributions of land objects such as the playground and swimming pool.

Results and Models

MillionAID

Backbone | Input size | Acc@1 | Acc@5 | Param (M) | Pretrained model
RSP-ResNet-50-E300 | 224 × 224 | 98.99 | 99.82 | 23.6 | google & baidu
RSP-Swin-T-E300 | 224 × 224 | 98.59 | 99.88 | 27.6 | google & baidu
RSP-ViTAEv2-S-E100 | 224 × 224 | 98.97 | 99.88 | 18.8 | google & baidu
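The released checkpoints are intended to initialize standard backbones. A minimal loading sketch for the ResNet-50 weights is shown below; the file name and the checkpoint key layout ("model"/"state_dict" wrappers) are assumptions and may need adjusting to the actual files:

    import torch
    from torchvision.models import resnet50

    ckpt = torch.load("rsp-resnet-50-e300-ckpt.pth", map_location="cpu")  # placeholder path
    # Unwrap a possible "model"/"state_dict" container (assumption about the file layout).
    state_dict = ckpt.get("model", ckpt.get("state_dict", ckpt))

    model = resnet50(weights=None)
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    print("missing keys:", missing)
    print("unexpected keys:", unexpected)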

Usage

Please refer to Readme.md for installation, dataset preparation, training and inference.

Citation

If this repo is useful for your research, please consider citing it:

@article{wang2022rsp,
  title={An Empirical Study of Remote Sensing Pretraining},
  author={Wang, Di and Zhang, Jing and Du, Bo and Xia, Gui-Song and Tao, Dacheng},
  journal={arXiv preprint arXiv:2204.02825},
  year={2022}
}

Statement

This project is for research purposes only. For any other questions, please contact di.wang at gmail.com.

Comments
  • change detection

    change detection

    I got the following error when running the example:

    python eval.py
    --backbone 'swin' --dataset 'levir' --mode 'rsp_300'
    --path [model path]

    Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2

    Which PyTorch version did you use? For change detection, I installed the requirements following https://github.com/likyoo/Siam-NestedUNet/blob/master/README.md:

    Python 3.6

    Pytorch 1.4

    torchvision 0.5.0

    other packages needed

    pip install opencv-python tqdm tensorboardX sklearn

    Could you help me analyze this? Thank you.
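
    This error usually indicates a checkpoint saved by PyTorch >= 1.6 (new zip-based serialization) being read by PyTorch 1.4. One possible workaround, assuming access to a newer PyTorch, is to re-save the checkpoint in the legacy format:

    import torch

    # Load with a newer PyTorch (>= 1.6), then re-save in the legacy format
    # so that PyTorch 1.4 can read it.
    ckpt = torch.load("rsp_checkpoint.pth", map_location="cpu")  # placeholder path
    torch.save(ckpt, "rsp_checkpoint_legacy.pth", _use_new_zipfile_serialization=False)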

    opened by zgsxwsdxg 18
  • questions about exp. of semantic seg.

    questions about exp. of semantic seg.

    Hi, thanks for your great work and codebase.

    The batch size is 8 in the paper, but 4 in the config of Swin-T-IMP+UperNet, and I did not find any description of the number of GPUs in the semantic segmentation subsection. In the README.md of semantic segmentation, the command is:

    python -m torch.distributed.launch --nproc_per_node=1 --master_port=40001 tools/train.py \
        configs/upernet/upernet_our_r50_512x512_80k_potsdam_epoch300.py \
        --launcher 'pytorch'
    

    which seems to set num_gpus_per_node to 1. Or is your command intended for 2 single-GPU nodes with batch_size 4 on each (2 × 4)?
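
    If the effective batch size is samples_per_gpu times the number of launched processes (the usual mm-series convention), the two readings could be reconciled like this:

    # Effective batch size in mmsegmentation-style training (assumption: the
    # usual samples_per_gpu x num_gpus convention).
    samples_per_gpu = 4                  # from the Swin-T-IMP + UperNet config
    num_gpus = 2                         # e.g. --nproc_per_node=2
    print(samples_per_gpu * num_gpus)    # 8, matching the batch size reported in the paper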

    opened by Li-Qingyun 15
  • Couldn't reproduce the semantic segmentation experiment

    Couldn't reproduce the semantic segmentation experiment

    I tried to reproduce it by simply evaluating the model, as the notebook 'Semantic segmentatin/demo/inference_demo.ipynb' does, but using the uploaded model instead of ResNet. When providing the config and the RSP-ViTAEv2-S-E100 model in the notebook, init_segmentor does not work because the config references data outside the repo.
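
    For reference, a minimal mmseg 0.x-style inference call looks like the sketch below; the config and checkpoint paths are placeholders, and the config must not reference dataset paths that are missing locally:

    from mmseg.apis import inference_segmentor, init_segmentor

    config_file = "path/to/rsp_vitaev2_s_config.py"     # placeholder
    checkpoint_file = "path/to/rsp-vitaev2-s.pth"       # placeholder

    model = init_segmentor(config_file, checkpoint_file, device="cuda:0")
    result = inference_segmentor(model, "demo/demo_patch.png")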

    I would be pretty happy if you provide some notebook to reproduce that. :)

    Best regards

    opened by vic-torr 11
  • Reproduce the SeCo DOTA result.

    Reproduce the SeCo DOTA result.

    Hi~ I'm recently working on some comparison experiments. When I fine-tuned the official SeCo pretrained model (SeCo-1M) on the DOTA object detection task, the test set mAP was much lower than the paper's (TABLE VIII, 70.07).

    I strictly followed the experimental setup in the paper, but instead of OBBDetection I used mmrotate, and the difference is, I think, not that big.

    Do you have any suggestions for reproduction? Thanks~

    The mmrotate config which I use is given below:

    angle_version = 'le90'
    dataset_type = 'DOTADataset'
    data_root = '../DOTA'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(type='RResize', img_scale=(1024, 1024)),
        dict(
            type='RRandomFlip',
            flip_ratio=[0.25, 0.25, 0.25],
            direction=['horizontal', 'vertical', 'diagonal'],
            version='le90'),
        dict(
            type='Normalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(1024, 1024),
            flip=False,
            transforms=[
                dict(type='RResize'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='DefaultFormatBundle'),
                dict(type='Collect', keys=['img'])
            ])
    ]
    data = dict(
        samples_per_gpu=2,
        workers_per_gpu=2,
        train=dict(
            type='DOTADataset',
            ann_file=data_root + "/trainVal/annfiles",
            img_prefix=data_root + "/trainVal/images",
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations', with_bbox=True),
                dict(type='RResize', img_scale=(1024, 1024)),
                dict(
                    type='RRandomFlip',
                    flip_ratio=[0.25, 0.25, 0.25],
                    direction=['horizontal', 'vertical', 'diagonal'],
                    version='le90'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='DefaultFormatBundle'),
                dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
            ]),
        val=dict(
            type='DOTADataset',
            ann_file=data_root + "/trainVal/annfiles",
            img_prefix=data_root + "/trainVal/images",
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1024, 1024),
                    flip=False,
                    transforms=[
                        dict(type='RResize'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='DefaultFormatBundle'),
                        dict(type='Collect', keys=['img'])
                    ])
            ]),
        test=dict(
            type='DOTADataset',
            ann_file=data_root + "/test/annfiles",
            img_prefix=data_root + "/tests/images",
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1024, 1024),
                    flip=False,
                    transforms=[
                        dict(type='RResize'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='DefaultFormatBundle'),
                        dict(type='Collect', keys=['img'])
                    ])
            ]))
    evaluation = dict(interval=1, metric='mAP')
    optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001)
    optimizer_config = dict(
        type='Fp16OptimizerHook',
        distributed=False,
        grad_clip=dict(max_norm=35.0, norm_type=2))
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=0.3333333333333333,
        step=[8, 11])
    runner = dict(type='EpochBasedRunner', max_epochs=12)
    checkpoint_config = dict(interval=1)
    log_config = dict(
        interval=50,
        hooks=[dict(type='TextLoggerHook'),
               dict(type='TensorboardLoggerHook')])
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]
    opencv_num_threads = 0
    mp_start_method = 'fork'
    model = dict(
        type='OrientedRCNN',
        backbone=dict(
            type='ResNet',
            depth=50,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            norm_cfg=dict(type='BN', requires_grad=True),
            norm_eval=True,
            style='pytorch',
            init_cfg=dict(
                type='Pretrained',
                checkpoint='../pretrain_checkpoint/SeCo1m.pth')),
        neck=dict(
            type='FPN',
            in_channels=[256, 512, 1024, 2048],
            out_channels=256,
            num_outs=5),
        rpn_head=dict(
            type='OrientedRPNHead',
            in_channels=256,
            feat_channels=256,
            version='le90',
            anchor_generator=dict(
                type='AnchorGenerator',
                scales=[8],
                ratios=[0.5, 1.0, 2.0],
                strides=[4, 8, 16, 32, 64]),
            bbox_coder=dict(
                type='MidpointOffsetCoder',
                angle_range='le90',
                target_means=[0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                target_stds=[1.0, 1.0, 1.0, 1.0, 0.5, 0.5]),
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
            loss_bbox=dict(
                type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
        roi_head=dict(
            type='OrientedStandardRoIHead',
            bbox_roi_extractor=dict(
                type='RotatedSingleRoIExtractor',
                roi_layer=dict(
                    type='RoIAlignRotated',
                    out_size=7,
                    sample_num=2,
                    clockwise=True),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
            bbox_head=dict(
                type='RotatedShared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=15,
                bbox_coder=dict(
                    type='DeltaXYWHAOBBoxCoder',
                    angle_range='le90',
                    norm_factor=None,
                    edge_swap=True,
                    proj_xy=True,
                    target_means=(0.0, 0.0, 0.0, 0.0, 0.0),
                    target_stds=(0.1, 0.1, 0.2, 0.2, 0.1)),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))),
        train_cfg=dict(
            rpn=dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.7,
                    neg_iou_thr=0.3,
                    min_pos_iou=0.3,
                    match_low_quality=True,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='RandomSampler',
                    num=256,
                    pos_fraction=0.5,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=False),
                allowed_border=0,
                pos_weight=-1,
                debug=False),
            rpn_proposal=dict(
                nms_pre=2000,
                max_per_img=2000,
                nms=dict(type='nms', iou_threshold=0.8),
                min_bbox_size=0),
            rcnn=dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    iou_calculator=dict(type='RBboxOverlaps2D'),
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='RRandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)),
        test_cfg=dict(
            rpn=dict(
                nms_pre=2000,
                max_per_img=2000,
                nms=dict(type='nms', iou_threshold=0.8),
                min_bbox_size=0),
            rcnn=dict(
                nms_pre=2000,
                min_bbox_size=0,
                score_thr=0.05,
                nms=dict(iou_thr=0.1),
                max_per_img=2000)))
    work_dir = './seco_result'
    auto_resume = False
    gpu_ids = [0]
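
    One frequent cause of such a gap (independent of OBBDetection vs. mmrotate) is that the keys in the init_cfg 'Pretrained' checkpoint do not match the plain ResNet backbone, so the backbone silently starts from random weights. A hedged sanity check, where the wrapper prefix is an assumption to verify against the actual file:

    import torch

    ckpt = torch.load("../pretrain_checkpoint/SeCo1m.pth", map_location="cpu")
    state_dict = ckpt.get("state_dict", ckpt)
    print(list(state_dict.keys())[:5])       # look for a wrapper prefix such as 'encoder_q.'

    prefix = "encoder_q."                    # assumption; adjust to what the print shows
    backbone_sd = {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}
    if backbone_sd:
        torch.save(backbone_sd, "../pretrain_checkpoint/SeCo1m_backbone.pth")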
    
    
    opened by pUmpKin-Co 6
  • use one image  to test issues

    use one image to test issues

    I would like to ask: when I used the ViTAEv2 model you provided for semantic segmentation testing, the given image could not pass the test. The problem occurs at the assert N == H*W in ReductionCell.py, but when I tested the same image with the other two weights, results were produced normally. 1. Does this ViTAEv2 model only support image sizes that are multiples of 2? 2. When I predict on a 1024×512 PNG image, the problem still occurs; a size mismatch also appears at outs.append(x.view(b,wh,wh,-1).permute(0,3,1,2)) in ViTAE_Window_NoShift/base_model.py.

    The weights I used are shown in the image below. image

    Thank you very much for taking the time to answer.
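
    A possible workaround (not the authors' fix) is to pad the input so that its height and width are multiples of the backbone's downsampling/window factor before inference and crop the prediction back afterwards; the factor 32 below is an assumption:

    import torch
    import torch.nn.functional as F

    def pad_to_multiple(img: torch.Tensor, multiple: int = 32):
        # img: (B, C, H, W); returns the padded tensor and the original size.
        _, _, h, w = img.shape
        pad_h = (multiple - h % multiple) % multiple
        pad_w = (multiple - w % multiple) % multiple
        return F.pad(img, (0, pad_w, 0, pad_h), mode="reflect"), (h, w)

    x = torch.randn(1, 3, 1000, 512)
    x_padded, (h, w) = pad_to_multiple(x)
    print(x_padded.shape)   # torch.Size([1, 3, 1024, 512])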

    opened by mstroke-bird 4
  • About the IMP weights on change detection

    About the IMP weights on change detection

    Hello, @DotWang. Your work is great. The results in your paper show that BIT with the IMP-ViTAEv2-S weights performs best, so I wonder whether the IMP pretrained weights for change detection will be released. Thank you very much.

    opened by lauraset 4
  • Where is the train_labels_{}_{}.txt for scene recognition?

    Where is the train_labels_{}_{}.txt for scene recognition?

    I am running the scene recognition code and I have downloaded the AID, UCM, and NWPU datasets from their official webpages.

    But there are no train_labels_{}_{}.txt files in these datasets. Where are the train_labels_{}_{}.txt files used in your code?

    image

    opened by liulingbo918 2
  • About Labels of Million-AID Dataset

    About Labels of Million-AID Dataset

    The original split of MillionAID is used for recognition. Our study is about pretraining, so we resplit the training and testing sets. The obtained training set is relatively large for transferring the pretrained weights to downstream tasks. All RSP pretrained weights are available at https://github.com/ViTAE-Transformer/ViTAE-Transformer-Remote-Sensing/blob/main/README.md

    Thank you very much. Your work is very inspiring to us. We are doing some research in the field of remote sensing image pretraining and would like to use your work as a baseline. However, we found that the open-source MillionAID data has less annotated data than in your paper, which is why we opened this issue.

    Originally posted by @pUmpKin-Co in https://github.com/ViTAE-Transformer/ViTAE-Transformer-Remote-Sensing/issues/3#issuecomment-1123531593

    Hello, I wasn't able to understand what the conclusion is.

    we resplit the training and testing sets.

    I think the annotated data is needed in order to use all the images and re-split them. Are labels provided for the test data? In my understanding, we need both "train_label.txt" and "valid_label.txt" to use MillionAID, but I don't know where I can download them. I appreciate your help.
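
    In case it helps while the official label files are clarified, a hedged sketch that generates "<relative_path> <class_index>" lines from a folder-per-class layout (an assumption about how the images are organized) is:

    import os

    root = "MillionAID/train"                      # placeholder: one folder per class
    classes = sorted(os.listdir(root))
    with open("train_labels.txt", "w") as f:
        for idx, cls in enumerate(classes):
            for name in sorted(os.listdir(os.path.join(root, cls))):
                f.write(f"{cls}/{name} {idx}\n")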

    opened by thanyu-hub 2
  • ModuleNotFoundError: No module named 'mmdet.version'

    ModuleNotFoundError: No module named 'mmdet.version'

    Traceback (most recent call last): File "/home/dgx/workspace/cui/ViTAE/tools/train.py", line 13, in <module> from mmdet.apis import set_random_seed, train_detector File "/home/dgx/workspace/cui/ViTAE/mmdet/__init__.py", line 1, in <module> from .version import __version__, short_version ModuleNotFoundError: No module named 'mmdet.version'

    In mmdet/__init__.py, I found the code is written like this:

    from .version import __version__, short_version

    __all__ = ['__version__', 'short_version']

    but .version is not a Python file; the .version file contains only one line: 2.2.0

    opened by Ariyl 2
  • About download the pretained model with change detection.

    About download the pretained model with change detection.

    Hello, I can't find the links to the pretrained models, such as: RS_CLS_finetune/output/resnet_50_224/epoch120/millionAID_224_None/0.0005_0.05_192/resnet/100/ckpt.pth Swin-Transformer-main/output/swin_tiny_patch4_window7_224/epoch120/swin_tiny_patch4_window7_224/default/ckpt.pth ...

    opened by runauto 0
  • What are the differences between 'Your_ResNet' and MMCV's ResNet vb/vc/vd

    What are the differences between 'Your_ResNet' and MMCV's ResNet vb/vc/vd

    I found that in your config file the backbone network is 'our_resnet'. I have read the code of your ResNet and want to know what the differences are between the general ResNet and yours. Could I load the checkpoint file into MMCV's ResNet vb/vc/vd directly?

    opened by sherwincn 4
  • DIOR-R Benchmark question.

    DIOR-R Benchmark question.

    In your paper, the performance of your model on the DIOR-R dataset is shown in a table. However, there is no information on whether this is single-scale or multi-scale. I would like to know whether the reported performance on DIOR-R is under the single-scale or multi-scale setting. Thank you.

    image
    opened by chagmgang 1
  • KeyError:

    KeyError: "EncoderDecoder: 'ViTAE_Window_NoShift_basic is not in the models registry'"

    Hello, while trying to reproduce the Potsdam experiments from your paper, I ran python -m torch.distributed.launch --nproc_per_node=1 --master_port=40001 tools/train.py configs/vitae_win/upernet_vitae_win_imp_window7_512x512_80k_potsdam.py --launcher 'pytorch', but got KeyError: "EncoderDecoder: 'ViTAE_Window_NoShift_basic is not in the models registry'". Why does this happen, and how can it be solved? I would also like to ask how to correctly reproduce your experiments.

    opened by Akinpzx 7
  • CVE-2007-4559 Patch

    CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.
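
    A minimal sketch of the kind of check such a patch performs (not necessarily the exact submitted code) is:

    import os
    import tarfile

    def safe_extractall(tar: tarfile.TarFile, path: str = ".") -> None:
        # Refuse to extract members whose resolved path escapes the target directory.
        base = os.path.realpath(path)
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(path, member.name))
            if os.path.commonpath([base, target]) != base:
                raise RuntimeError(f"Blocked path traversal in tar member: {member.name}")
        tar.extractall(path)

    with tarfile.open("archive.tar") as tar:      # placeholder archive name
        safe_extractall(tar, "output_dir")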

    If you have further questions you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
  • About Reproducing Training of Remote Sensing Semantic Segmentation Models

    About Reproducing Training of Remote Sensing Semantic Segmentation Models

    Hello, I am interested in the remote sensing semantic segmentation part of this project. I have downloaded all the relevant Potsdam datasets and configured the runtime environment, but the steps to reproduce the full training are still unclear to me. Do the downloaded public datasets require preprocessing, i.e., do the images and label images need to be cut into small patches? If I don't do distributed training, can I simply remove the distributed settings and train on a single card? Do you have a more detailed description of the training steps and of the training configuration (config file) that would allow us to reproduce the model training? Thank you.
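
    On the preprocessing point, the usual practice is to cut the large Potsdam tiles and their label images into fixed-size patches before training; the patch size, stride, and file layout in the sketch below are assumptions rather than the authors' script:

    import os
    from PIL import Image

    def tile_image(src_path: str, out_dir: str, patch: int = 512, stride: int = 512) -> None:
        # Cut one large tile into patch x patch crops saved as individual PNGs.
        os.makedirs(out_dir, exist_ok=True)
        img = Image.open(src_path)
        w, h = img.size
        stem = os.path.splitext(os.path.basename(src_path))[0]
        for top in range(0, h - patch + 1, stride):
            for left in range(0, w - patch + 1, stride):
                img.crop((left, top, left + patch, top + patch)).save(
                    os.path.join(out_dir, f"{stem}_{top}_{left}.png"))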

    opened by smallsnailrunning 9
  • About MillionAID dataset

    About MillionAID dataset

    I downloaded the MillionAID dataset from the official homepage: MillionAID. The training set was found to have only 10K images. The test set is not labeled. May I know how the pre-training data in the paper was obtained?

    opened by pUmpKin-Co 2