VL-CheckList

Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations.

Updates

  • 07/04/2022: VL-CheckList paper on arxiv https://arxiv.org/abs/2207.00221
  • 07/12/2022: Updated object, relation, attribute splits/dataset
  • 08/01/2022: Released the initial code and example models

Introduction

This repository is the official project page for VL-CheckList, an explainable framework that comprehensively evaluates VLP (vision-language pretraining) models and facilitates deeper understanding. The current practice of evaluating a VLP model solely by its fine-tuned downstream-task performance has a number of limitations, such as poor interpretability, incomparable results, and bias in the data.

The core principles of VL-CheckList are: (1) evaluate a VLP model's fundamental capabilities instead of its performance on applications, and (2) disentangle capabilities into relatively independent variables that are easier to analyze.

VL-CheckList evaluates VLP models from three aspects: Object, Attribute and Relationship. We provide a quantitative performance table and a radar chart based on these three aspects.
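As an illustration of the idea behind such probes (a hypothetical sketch, not the framework's actual implementation), a capability test can be built by perturbing one linguistic element of a caption at a time, e.g. replacing an attribute word to create a hard negative:

```python
# Illustrative sketch only: the captions and word swap below are made up.
# VL-CheckList's real probes come from its curated JSON data files.

def make_attribute_negative(caption: str, original: str, replacement: str) -> str:
    """Replace an attribute word in the caption to form a hard negative."""
    words = caption.split()
    swapped = [replacement if w == original else w for w in words]
    return " ".join(swapped)

positive = "a red car parked on the street"
negative = make_attribute_negative(positive, "red", "blue")
# A VLP model that truly understands color should score the positive
# caption higher than the negative for the matching image.
```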

How to Install VL-CheckList

You can install vl_checklist, import it in your project, and evaluate your models:

pip install vl_checklist

vilt_test.py is an example script showing how to import vl_checklist in your project.

You need to copy the data/ and corpus/ folders to the root of your project and prepare the image datasets: Link.

You can also clone this project and add your model as follows:

git clone https://github.com/om-ai-lab/VL-CheckList.git

Detailed Guidelines: How to Evaluate Your Model

We include several representative example VLP models in the example_models/ folder.

1. Define a config file, e.g. configs/sample.yaml:

MAX_NUM: 2000
MODEL_NAME: "ViLT"
BATCH_SIZE: 4
TASK: "itc"
DATA:
  TYPES: ["Attribute/color"]
  TEST_DATA: ["vg","vaw"]   
OUTPUT: 
  DIR: "output/vilt"
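The config above can also be generated programmatically. The sketch below writes that sample file and sanity-checks its keys; the field names simply mirror the example, and whether vl_checklist requires exactly this schema is an assumption here:

```python
# Sketch: write configs/sample.yaml programmatically and sanity-check it.
# The schema mirrors the sample config above; it is an assumption that
# these are exactly the fields vl_checklist expects.
from pathlib import Path

CONFIG = """\
MAX_NUM: 2000
MODEL_NAME: "ViLT"
BATCH_SIZE: 4
TASK: "itc"
DATA:
  TYPES: ["Attribute/color"]
  TEST_DATA: ["vg", "vaw"]
OUTPUT:
  DIR: "output/vilt"
"""

def write_config(path: str) -> None:
    """Write the sample config, creating parent directories if needed."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(CONFIG)
    # Check that the top-level keys we rely on are present.
    for key in ("MAX_NUM", "MODEL_NAME", "TASK", "DATA", "OUTPUT"):
        assert f"{key}:" in CONFIG, f"missing config key: {key}"

write_config("configs/sample.yaml")
```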

2. Prepare evaluation data. We provide the initial curated JSONs at data/ and the corresponding YAMLs at vl_checklist/corpus. You also need to download the image datasets; you can find detailed instructions here: Link.

3. Load a model that implements predict(), together with the Evaluate class, as follows. You can find an example model class here: Link.
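A minimal sketch of such a model wrapper, assuming Evaluate only needs a predict() that scores image-text pairs; the class name, signature, and return format here are hypothetical, so see the repo's example model classes for the real interface:

```python
# Hypothetical model wrapper for illustration only: the real interface is
# defined by the example models in example_models/.
from typing import Dict, List
import random

class DummyModel:
    """Stand-in VLP model exposing the predict() hook the evaluator calls."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)

    def predict(self, images: List[str], texts: List[str]) -> Dict[str, List[float]]:
        # A real model would encode each (image, text) pair and return
        # matching scores; here we return random placeholders.
        scores = [self.rng.random() for _ in images]
        return {"probs": scores}

model = DummyModel()
out = model.predict(["img1.jpg"], ["a red car"])
```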

4. Run start() as follows. Here is an example:

from example_models.vilt.engine import ViLT
from vl_checklist.evaluate import Evaluate

if __name__ == '__main__':
    model = ViLT('vilt_200k_mlm_itm.ckpt')
    vilt_eval = Evaluate(config="configs/sample.yaml", model=model)
    vilt_eval.start()

5. Check the results in the OUTPUT DIR you defined in the YAML file. You can check the output format here: LINK.

Download Pretrained Weights

We include example models at example_models/. You can download the pretrained weights to the resources/ folder to test our example models:

Demo

We present a demo in a Hugging Face Space; you can try it here: Demo link.
In this demo, you can change the object, and the attribute of the object, in the text prompt. You can also change the size and location of the object.

References

If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

@misc{https://doi.org/10.48550/arxiv.2207.00221,
  doi = {10.48550/ARXIV.2207.00221}, 
  url = {https://arxiv.org/abs/2207.00221},
  author = {Zhao, Tiancheng and Zhang, Tianqi and Zhu, Mingwei and Shen, Haozhan and Lee, Kyusong and Lu, Xiaopeng and Yin, Jianwei},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences},
  title = {VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}