"Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022.

Overview

ViGA: Video moment retrieval via Glance Annotation

This is the official repository of the paper "Video Moment Retrieval from Text Queries via Single Frame Annotation" published in SIGIR 2022.

https://arxiv.org/abs/2204.09409

Dependencies

This project has been tested in the following conda environment.

$ conda create --name viga python=3.7
$ source activate viga
(viga)$ conda install pytorch=1.10.0 cudatoolkit=11.3.1
(viga)$ pip install numpy scipy pyyaml tqdm 
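
As an optional sanity check (not part of the original setup instructions), the installation can be verified by printing the PyTorch version and whether CUDA is visible:

(viga)$ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"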

Data preparation

This repository already contains our glance annotations. To replicate our work, one should prepare the extra data described below and end up with the following directory structure.

ckpt/                                 our pre-trained model, available at https://drive.google.com/file/d/1S4e8XmIpiVFJKSSJ4Tig4qN0yaCwiVLs/view?usp=sharing
data/
+-- activitynetcaptions/
|   +-- c3d/                    
|   +-- annotations/
|   |   +-- glance/
|   |   |   +-- train.json                
|   |   |   +-- val_1.json                
|   |   |   +-- val_2.json   
|   |   +-- train.json                downloaded
|   |   +-- val_1.json                downloaded
|   |   +-- val_2.json                downloaded
+-- charadessta/
|   +-- i3d/                     
|   +-- c3d/ 
|   +-- vgg/
|   +-- annotations/
|   |   +-- glance/
|   |   |   +-- charades_sta_train.txt
|   |   |   +-- charades_sta_test.txt
|   |   +-- charades_sta_train.txt    downloaded
|   |   +-- charades_sta_test.txt     downloaded
|   |   +-- Charades_v1_train.csv     downloaded
|   |   +-- Charades_v1_test.csv      downloaded
+-- tacos/
|   +-- c3d/ 
|   +-- annotations/
|   |   +-- glance/
|   |   |   +-- train.json                
|   |   |   +-- test.json                 
|   |   |   +-- val.json
|   |   +-- train.json                downloaded
|   |   +-- test.json                 downloaded
|   |   +-- val.json                  downloaded
glove.840B.300d.txt                   downloaded from https://nlp.stanford.edu/data/glove.840B.300d.zip

1. ActivityNet Captions

c3d feature

Downloaded from http://activity-net.org/challenges/2016/download.html. We extracted the features from sub_activitynet_v1-3.c3d.hdf5 as individual files.

The folder contains 19994 vid.npy files, each of shape (T, 500).
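
A minimal sketch of this splitting step is shown below, assuming h5py is installed and that the HDF5 file stores one group per video id with a 'c3d_features' dataset inside (adjust the keys if the actual layout differs). The same pattern also applies to the HDF5-packaged Charades-STA vgg and TACoS c3d features further down.

import os
import h5py
import numpy as np

SRC = "sub_activitynet_v1-3.c3d.hdf5"
DST = "data/activitynetcaptions/c3d"
os.makedirs(DST, exist_ok=True)

with h5py.File(SRC, "r") as f:
    for vid in f.keys():
        # one group per video id, holding its (T, 500) C3D feature matrix
        feat = np.asarray(f[vid]["c3d_features"])
        np.save(os.path.join(DST, vid + ".npy"), feat)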

annotation

Downloaded from https://cs.stanford.edu/people/ranjaykrishna/densevid/

2. Charades-STA

c3d feature

We extracted the C3D features of Charades-STA ourselves. We decided not to make them available for download due to our limited online storage and because the process is easy to replicate. Specifically, we directly adopted the C3D model weights pre-trained on Sports1M, took the fc6 layer activation as the feature of each clip, and sampled clips with a sliding window of step size 8 and window size 16 (a simplified sketch of this windowing appears below). Our code for this extraction was based on https://github.com/DavideA/c3d-pytorch.

The folder contains 9848 vid.npy files, each of shape (T, 4096).
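
The windowing scheme can be sketched as follows. This is a simplified illustration rather than our exact extraction script, and extract_fc6 is a hypothetical helper standing in for a forward pass through the Sports1M pre-trained C3D up to its fc6 layer:

import numpy as np

def sample_clips(frames, window=16, stride=8):
    # frames: array of shape (N, H, W, 3) holding the decoded video frames
    for start in range(0, max(len(frames) - window + 1, 1), stride):
        yield frames[start:start + window]

def video_to_feature(frames, extract_fc6):
    # extract_fc6(clip) -> (4096,) fc6 activation of one 16-frame clip
    feats = [extract_fc6(clip) for clip in sample_clips(frames)]
    return np.stack(feats)  # shape (T, 4096), saved as <vid>.npy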

i3d feature

Downloaded from https://github.com/JonghwanMun/LGI4temporalgrounding. These are the features extracted with I3D (fine-tuned on Charades). We processed them by trimming off unnecessary dimensions.

The folder contains 9848 vid.npy files, each of shape (T, 1024).

vgg feature

Downloaded from https://github.com/microsoft/2D-TAN. We processed the data by converting the downloaded vgg_rgb_features.hdf5 into per-video numpy arrays.

The folder contains 6672 vid.npy files, each of shape (T, 4096).

annotation

Downloaded from https://github.com/jiyanggao/TALL

3. TACoS

c3d feature

Downloaded from https://github.com/microsoft/2D-TAN. We extracted the features from tall_c3d_features.hdf5 as individual files.

The folder contains 127 vid.npy files, each of shape (T, 4096).

annotation

Downloaded from https://github.com/microsoft/2D-TAN

Run

Our models were trained using the following commands.

(viga)$ CUDA_VISIBLE_DEVICES=0 python -m src.experiment.train --task activitynetcaptions
(viga)$ CUDA_VISIBLE_DEVICES=0 python -m src.experiment.train --task charadessta
(viga)$ CUDA_VISIBLE_DEVICES=0 python -m src.experiment.train --task tacos

Our trained models were evaluated using the following commands.

(viga)$ CUDA_VISIBLE_DEVICES=0 python -m src.experiment.eval --exp ckpt/activitynetcaptions
(viga)$ CUDA_VISIBLE_DEVICES=0 python -m src.experiment.eval --exp ckpt/charadessta_c3d
(viga)$ CUDA_VISIBLE_DEVICES=0 python -m src.experiment.eval --exp ckpt/charadessta_i3d
(viga)$ CUDA_VISIBLE_DEVICES=0 python -m src.experiment.eval --exp ckpt/charadessta_vgg
(viga)$ CUDA_VISIBLE_DEVICES=0 python -m src.experiment.eval --exp ckpt/tacos
Comments
  • Errors occur when running the Charades-STA dataset.

    Hi, thanks for your wonderful work. However, I have met the following error when running the Charades-STA dataset (everything goes well when I run the other two datasets, TACoS and ActivityNet). Would you mind helping to figure it out? Thanks.

    Error log

    Training epoch 1 with lr 0.0001: 0%| | 0/24 [00:00<?, ?it/s]
    Training epoch 1 with lr 0.0001: 0%| | 0/24 [00:01<?, ?it/s]
    Traceback (most recent call last):
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/data1/chentongbao/VCMR/ViGA/src/experiment/train.py", line 90, in <module>
        train(config)
      File "/data1/chentongbao/VCMR/ViGA/src/experiment/train.py", line 56, in train
        desc="Training epoch {} with lr {}".format(epoch, model.optimizer.param_groups[0]["lr"])
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/site-packages/tqdm/std.py", line 1195, in __iter__
        for obj in iterable:
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
        data = self._next_data()
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
        return self._process_data(data)
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
        data.reraise()
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/site-packages/torch/_utils.py", line 434, in reraise
        raise exception
    RuntimeError: Caught RuntimeError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
        data = fetcher.fetch(index)
      File "/data1/zhekun/anaconda3/envs/viga/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
        return self.collate_fn(data)
      File "/data1/chentongbao/VCMR/ViGA/src/dataset/dataset.py", line 116, in collate_fn
        batch[k] = torch.stack(batch[k], dim=0)
    RuntimeError: stack expects each tensor to be equal size, but got [87, 1, 42, 1024] at entry 0 and [72, 1, 57, 1024] at entry 1

    opened by tongbaochen