[ECCV2022] Motion Sensitive Contrastive Learning for Self-supervised Video Representation

Overview

MSCL

Official code for Motion Sensitive Contrastive Learning for Self-supervised Video Representation (ECCV2022).

Introduction

Contrastive learning has shown great potential in video representation learning. However, existing approaches fail to fully exploit short-term motion dynamics, which are crucial to various downstream video understanding tasks. In this paper, we propose Motion Sensitive Contrastive Learning (MSCL), which injects the motion information captured by optical flow into RGB frames to strengthen feature learning. To achieve this, in addition to clip-level global contrastive learning, we develop Local Motion Contrastive Learning (LMCL) with frame-level contrastive objectives across the two modalities. Moreover, we introduce Flow Rotation Augmentation (FRA) to generate extra motion-shuffled negative samples and Motion Differential Sampling (MDS) to accurately screen training samples. Extensive experiments on standard benchmarks validate the effectiveness of the proposed method. With the commonly used 3D ResNet-18 as the backbone, we achieve top-1 accuracies of 91.5% on UCF101 and 50.3% on Something-Something v2 for video classification, and a Top-1 Recall of 65.6% on UCF101 for video retrieval, notably improving over state-of-the-art methods.
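
To make the idea of a frame-level contrastive objective across the two modalities more concrete, here is a minimal sketch of an InfoNCE-style loss in plain Python. It is an illustration under stated assumptions, not the code in this repo: the function name `frame_info_nce`, the temperature value, and the choice to treat the flow feature of the same frame index as the positive and all other flow frames as negatives are hypothetical simplifications.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def frame_info_nce(rgb_feats, flow_feats, temperature=0.1):
    """Mean InfoNCE loss over frames (illustrative sketch only).

    Assumption: rgb_feats[i] and flow_feats[i] describe the same frame and
    form the positive pair; every other flow frame acts as a negative.
    """
    rgb = [l2_normalize(v) for v in rgb_feats]
    flow = [l2_normalize(v) for v in flow_feats]
    total = 0.0
    for i, query in enumerate(rgb):
        logits = [dot(query, key) / temperature for key in flow]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / len(rgb)


if __name__ == "__main__":
    rgb = [[1.0, 0.0], [0.0, 1.0]]
    aligned_flow = [[1.0, 0.0], [0.0, 1.0]]
    swapped_flow = [[0.0, 1.0], [1.0, 0.0]]
    print(frame_info_nce(rgb, aligned_flow))  # small loss
    print(frame_info_nce(rgb, swapped_flow))  # large loss
```

The loss is low when each RGB frame feature is closest to the flow feature of the same frame, and high when the pairing is broken, which is the behavior a cross-modal frame-level objective relies on.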


Getting Started

This repo is built on the MMAction2 codebase. Please follow the MMAction2 installation instructions to set up the environment, and refer to the MMAction2 documentation for further details.

Training

bash ./tools/dist_train.sh configs/recognition/moco/mscl_r18_cosm_lr2e-2.py 4 --validate --seed 0 --deterministic

Downstream Classification Fine-tuning

bash ./tools/dist_train.sh configs/recognition/ssl_test/test_ssv2_r18.py 1 --validate --seed 0 --deterministic

Downstream Retrieval

The retrieval task supports only a single GPU.

bash ./tools/test_retrival.sh configs/recognition/ssl_test/test_ssv2_r18.py {your checkpoint path}
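
Retrieval results are reported as Top-1 Recall in the introduction. The sketch below shows one common way such a metric is computed, assuming each query clip is matched against a gallery by cosine similarity and counts as a hit if any of its top-k neighbors shares its class label. The function name `recall_at_k` and the toy features are hypothetical and are not taken from this repo's retrieval script.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den


def recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries whose top-k gallery neighbors contain the same label."""
    hits = 0
    for feat, label in zip(query_feats, query_labels):
        # Rank gallery indices from most to least similar to the query.
        ranked = sorted(
            range(len(gallery_feats)),
            key=lambda j: cosine(feat, gallery_feats[j]),
            reverse=True,
        )
        if any(gallery_labels[j] == label for j in ranked[:k]):
            hits += 1
    return hits / len(query_feats)


if __name__ == "__main__":
    gallery = [[1.0, 0.0], [0.0, 1.0]]
    gallery_labels = ["a", "b"]
    queries = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
    query_labels = ["a", "b", "b"]
    print(recall_at_k(queries, query_labels, gallery, gallery_labels, k=1))
```

In this toy setup the third query is closest to a gallery item of the wrong class, so it is a miss at k=1 and a hit at k=2.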

Acknowledgement

This repo is based on MMAction2.
