270 Repositories
Python computer-vision Libraries
Benchmarking Attention Mechanism in Vision Transformers.
Vision Transformer Attention Benchmark This repo is a collection of attention mechanisms in vision Transformers. Besides the re-implementation, it prov
A traffic Analytics system using Deep Learning (Desktop Version)
TerrificEye: An Edge Computing System For Traffic Analytics From Videos Requirements Python 3 tensorflow DarkNet (Yolov4) NVIDIA SoC board (e.g. Jetso
The code for 'What You See Helps You Read: Understanding How Vision Enhances Language Semantics, Relations and Commonsense'
Vision Analysis Codes for the paper What You See Helps You Read: Understanding How Vision Enhances Language Semantics, Relations and Commonsense Pretr
Adaptive Token Sampling for Efficient Vision Transformers (ECCV 2022 Oral Presentation)
Adaptive Token Sampling For Efficient Vision Transformers This is the official implementation of the ECCV 2022 paper: "Adaptive Token Sampling for Eff
Official repository for "CLIP model is an Efficient Continual Learner".
Continual-CLIP: CLIP is an Efficient Continual Learner CLIP is an Efficient Continual Learner by Vishal Thengane, Salman Khan, Munawar Hayat, Fahad Kh
[SIGGRAPH Asia 2022 Conference Paper] Shape Completion with Points in the Shadow
Shape Completion with Points in the Shadow Bowen Zhang, Xi Zhao, He Wang, Ruizhen Hu This repository contains the source code for the paper Shape Comp
TensorFlow 2.X reimplementation of CvT: Introducing Convolutions to Vision Transformers, Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang.
CvT-TensorFlow TensorFlow 2.X reimplementation of CvT: Introducing Convolutions to Vision Transformers, Haiping Wu, Bin Xiao, Noel Codella, Mengchen L
Collaborative Brain-Computer Interfaces Toolbox
Collaborative Brain-Computer Interfaces (cBCI) Toolbox This repository contains Python analytical tools and libraries to study the decision-making per
Code and results accompanying the NeurIPS 2022 paper with the title SPD domain-specific batch normalization to crack interpretable unsupervised domain adaptation in EEG
SPD domain-specific batch normalization to crack interpretable unsupervised domain adaptation in EEG This repository contains code and data accompanyi
Prompt Generation Networks for Efficient Adaptation of Frozen Vision Transformers. Jochem Loedeman, Maarten C. Stol, Tengda Han, Yuki M. Asano. Tech Report. 2022
Prompt Generation Networks for Efficient Adaptation of Frozen Vision Transformers This repository is the official implementation of Prompt Generation
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022
SpeechCLIP arXiv link: https://arxiv.org/abs/2210.00705 Prerequisite Install packages pip install -r requirements.txt Download Pretrained Checkpoints ba
Simple COCO Objects Viewer in Tkinter
Simple COCO Objects Viewer in Tkinter. Allows quick viewing on local machine.
👋 Aligning Human & Machine Vision using explainability
Aligning Machine & Human Vision Thomas Fel*, Ivan Felipe Rodriguez*, Drew Linsley*, Thomas Serre Read the official paper » Explore results . Documenta
code for our paper "Attention Distillation: self-supervised vision transformer students need more guidance" in BMVC 2022
Attention Distillation: self-supervised vision transformer students need more guidance (BMVC 2022) Kai Wang, Fei Yang and Joost van de Weijer Requirem
[NeurIPS 2022] The official implementation for "Learning to Discover and Detect Objects"
Learning to Discover and Detect Objects This repository provides the official implementation of the following paper: Learning to Discover and Detect O
Face Liveness Detection (Face Anti Spoofing) Server SDK
Face Liveness Detection SDK For Linux Fully Offline, On-Premise Face Liveness Detection SDK for Linux Documentation at https://docs.faceonlive.c
[CoRL2022] CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers [CORL2022] This is the official implementation of CoRL2022 paper "C
This project is an unofficial summary of the resources related to VALSE and its annual seminar. Its main purpose is to facilitate your communication and learning, and we welcome your additions and suggestions.
简体中文 | English Abstract Official website | Official Accounts | Blog The main purpose of the VALSE annual seminar is to provide a stage for deep academ
UNetFormer: A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery, ISPRS. Also, including other vision transformers and CNNs for satellite, aerial image and UAV image segmentation.
Version 2.0 (stable) Welcome to my homepage! News UNetFormer (accepted by ISPRS, Research Gate) and UAVid dataset are supported. ISPRS Vaihingen and P
This is my first solo project after going through a few Python tutorials and training materials. The program allows users to play the classic game "Rock, Paper, Scissors" against the computer.
DXM Rock, Paper, Scissors Game This is my first solo project after going through a few Python tutorials and training materials. The program will allow u
Official PyTorch implementation of TriHorn-Net
TriHorn-Net This repository contains the PyTorch implementation of TriHorn-Net: A Model for Accurate Depth-Based 3D Hand Pose Estimation. It contains
computer-science-algorithms
cs-algorithms computer-science-algorithms "Trying To Build A Program Without Understanding The Algorithm Is Like Trying To Build A Car Without Underst
Interface with the Roboflow API and Python package for running inference (receiving predictions) and customizing result images from your Roboflow Train computer vision models.
roboflow-computer-vision-utilities Interface with the Roboflow API and Python package for running inference (receiving predictions) from your Roboflow
[NeurIPS 2022] The official repository of Expression Learning with Identity Matching for Facial Expression Recognition
ELIM_FER Optimal Transport-based Identity Matching for Identity-invariant Facial Expression Recognition (NeurIPS 2022) Daeha Kim, Byung Cheol Song CVI
Test&Track bot, FOR EDUCATIONAL PURPOSES ONLY
JakeT23's Test And Track Bot! Educational Purposes Only! Try it Online! work in progress. How to run the proof-of-concept python version: Dependencies
A YOLOv6 based computer vision system to identify whether pets are getting on the couch, and if so - commands them to get down.
PetVision A YOLOv6 based computer vision system to identify whether pets are getting on the couch, and if so - commands them to get down. My dog - Zoe
A computer vision system that detects mask usage (mask and no mask)
MaskDetector-SVM A computer vision system that detects mask usage (mask and no mask). The system diagram is shown below It consists of three stages: F
'Rethinking Knowledge Distillation via Cross-Entropy' and 'ViTKD: Practical Guidelines for ViT feature knowledge distillation'
Knowledge Distillation for Image Classification This repository includes official implementation for the following papers: NKD and tf-NKD: Rethinking
Implementation of efficient backbones for computer vision tasks.
Efficient Backbones Efficient backbones is the project which aims to provide efficient SOTA vision backbones based on PyTorch. Currently following mod
PyTorch implementation of "User-Controllable Latent Transformer for StyleGAN Image Layout Editing" [Computer Graphics Forum (Proc. of Pacific Graphics 2022)]
User-Controllable Latent Transformer for StyleGAN Image Layout Editing This repository contains our implementation of the following paper: Yuki Endo:
World of Warcraft fishing bot, powered by computer vision and deep learning
DeepFish World of Warcraft fishing bot powered by Computer Vision and Deep Learning. Note: Usage of this bot is abuse of World of Warcraft terms of se
Some basic topics in the field of deep learning, including papers, notes, and code, which we hope will be helpful to future readers.
CVM-DL_Base Based on the fundamental topics of deep learning, all content is collected by members of the JLU-CVM Group. 🆕 New features Add the answer
Official implementation of Learning from Unlabeled 3D Environments for Vision-and-Language Navigation (ECCV'22).
Learning from Unlabeled 3D Environments for Vision-and-Language Navigation This repository is the official implementation of Learning from Unlabeled 3
Viewer-Centred Surface Completion for Unsupervised Domain Adaptation in 3D Object Detection
SEE-VCN This is the official codebase for Viewer-Centred Surface Completion for Unsupervised Domain Adaptation in 3D Object Detection Abstract Every a
Storage for some useful connectors for R-Vision SOAR
Collection of connectors for R-Vision SOAR nad_connector.py A connector for enriching incidents generated by an NTA-class system - PT Network Attack Discov
The Neon Genesis Evangelion™ animated computer interfaces FOSS
OpenGUIlion v1.0 OpenGUIlion is a very basic application written in Python that only does one thing: it shows non-interactive, animated GUIs from the
Brain tumor segmentation on 3D MRI images. The model was trained on the BraTS 2020 dataset.
Brain tumors segmentation Advancements in healthcare and biotechnology have led to the growing use and need of AI in medical imaging analysis. AI t
YOLOv5 Object Tracking + Detection + Object Blurring + Streamlit Dashboard Using OpenCV, PyTorch and Streamlit
yolov5-object-tracking New Features YOLOv5 Object Tracking Using Sort Tracker Added Object blurring Option Added Support of Streamlit Dashboard Code c
Generate OMORI character sprites with a GAN.
OMORI GAN Generate OMORI character sprites with Generative Adversarial Networks (GAN) using PyTorch Lightning. Example generated image Spoiler warning:
ICME 2022: Few-shot Multi-modal Sentiment Analysis with Prompt-based Vision-aware Language Modeling
PVLM (Prompt-based Vision-aware Language Modeling) This is the implementation of our ICME 2022 paper "Few-shot Multi-modal Sentiment Analysis with Pro
This file and application help you flood your victim's computer with billions of files.
pc-flooding This file and application help you flood your victim's computer with billions of files. This python script helps you to flood your victi
An annotated bibliography of computer systems research papers.
Systems Bibliography This repository contains an annotated bibliography that I've built from my readings in computer systems research. Contents At the
This repo contains a virtual painting program that uses Python, OpenCV, and a Hand Tracking Module to paint virtually without touching the computer
Virtual Painting SAMPLE OUTPUT INTRODUCTION This Project contains a python file to detect hands and Draw different Shapes and pencil with different Col
Code for paper titled "Failure Prediction with Statistical Guarantees for Vision-Based Robot Control"
Failure Prediction with Statistical Guarantees for Vision-Based Robot Control This repository contains code for running many of the experiments and re
PyTorch implementation of Denoising of 3D MR images using a voxel-wise hybrid residual MLP-CNN model to improve small lesion diagnostic confidence (MICCAI 2022).
Denoising of 3D MR images using a voxel-wise hybrid residual MLP-CNN model to improve small lesion diagnostic confidence PyTorch implementation of Den
Southeast University, Computer Science Engineering class notes
SEU CSE notes Class notes of Southeast University, Computer Science Engineering (Diploma) course We are organizing notes only of Batch 14, Section C.
A RISC-V based toy computer
Riscy-D2 Riscy-D2 is a project to build a RISC-V based computer. The goal is to run it on an FPGA. The CPU currently supports the RV32I ISA. This repo
Reinforcement learning agents take on osu!
neurosama training reinforcement learning agents with a 100% vision based approach custom gym environment is not a wrapper - built without interacting
An image-to-text model based on the Transformer, which can also be used for OCR tasks.
Image2Text Model An image-to-text model based on the Transformer, which can also be used for OCR tasks. NVIDIA FasterTransformer is integrated here to accelerate inference. It also integrates FasterTra
We play the Chrome Dinosaur game using hand movements. The game has two basic moves (up and down), which are replicated by moving the hand up and down.
Chrome-s-Dinosaur-Game-using-Computer-Vision We are basically going to play the Chrome Dinosaur game using hand movements, the game has two basic move
TensorFlow 2.X reimplementation of Global Context Vision Transformers.
GCViT-TensorFlow TensorFlow 2.X reimplementation of Global Context Vision Transformers Ali Hatamizadeh, Hongxu (Danny) Yin, Jan Kautz, Pavlo Molchanov.
VMFormer: End-to-End Video Matting with Transformer
VMFormer: End-to-End Video Matting with Transformer Jiachen Li, Vidit Goel, Marianna Ohanyan, Shant Navasardyan, Yunchao Wei, Humphrey Shi [arXiv] [Pr
Code for VAuLT: Augmenting the Vision-and-Language Transformer with the Propagation of Deep Language Representations.
VAuLT: Vision-and-Augmented-Language Transformer Code for VAuLT: Augmenting the Vision-and-Language Transformer with the Propagation of Deep Language
3D scene reconstruction and camera pose estimation given images from different views (Structure from Motion)
SfM (Structure from Motion - Classical Approach) 3D scene reconstruction and camera pose estimation given images from different views (Structure from
This is the traditional board game "Battleships" programmed in Python, in which the opponent is the computer.
Battleships-Game This is the traditional board game "Battleships" programmed in Python, in which the opponent is the computer. Game Rules You are expe
A neural network for detecting German traffic signs on the GTSRB dataset. Will later be used for a basic self-driving car simulation
A neural network for detecting German traffic signs on the GTSRB dataset. Will later be used for a basic self-driving car simulation Using TensorFlow
Official Released code for MICCAI 2022 paper: CaRTS: Causality-driven Robot Tool Segmentation from Vision and Kinematics Data
CaRTS: Causality-driven Robot Tool Segmentation from Vision and Kinematics Data This repo hosts the code for implementing the CaRTS algorithms for Rob
Implementation of JEPA, Yann LeCun's vision of how AGI would be built, in Pytorch
JEPA - Pytorch (wip) Implementation of JEPA (Joint Embedding Predictive Architectures), Yann LeCun's vision of how AGI would be built, in Pytorch Yann
DM-NeRF in PyTorch
DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images Bing Wang, Lu Chen, Bo Yang* Paper | Video | DM-SR The architecture of our pr
Official PyTorch implementation of "iColoriT: Towards Propagating Local Hint to the Right Region in Interactive Colorization by Leveraging Vision Transformer." (WACV 2023)
iColoriT (WACV 2023) Official Implementation This is the official PyTorch implementation of the paper: iColoriT: Towards Propagating Local Hint to the
Computer vision exercises starting from the basics. This repository is handy for beginners.
ComputerVisionExercise In this repository I will be contributing all the necessary tools with better code explanation regarding OpenCV and its applica
Final project submissions made by 160 students for the computer graphics course (CCE2211), Fall 2022, at Tanta University.
Final projects submissions for computer graphics course (CCE2211) Fall 2022. List of teams group 01 group 02 group 03 group 04 group 05 group 06 group
ProtoPFormer: Concentrating on Prototypical Parts in Vision Transformers for Interpretable Image Recognition
ProtoPFormer Introduction This is the official implementation of paper "ProtoPFormer: Concentrating on Prototypical Parts in Vision Transformers for I
Smart home alarm. Uses computer vision to distinguish humans from other moving objects
Smart home alarm. Uses computer vision to distinguish humans from other moving objects, which makes it useful for outdoor environments. Once a person is seen, the alarm starts and a push notification is sent to the phone via PushBullet
A 3D computer vision development toolkit based on PaddlePaddle. It supports point-cloud object detection, segmentation, and monocular 3D object detection models.
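Paddle3D Paddle3D is PaddlePaddle's official open-source, end-to-end deep learning toolkit for 3D perception. It covers many cutting-edge and classic 3D perception models, supports multiple modalities and tasks, and helps developers conveniently carry out the full workflow, from training to deployment, for models in the autonomous driving domain. Paddle3D has the following features: [Rich model library] Aggregates mainstream 3D perception algorithms and accuracy-tuning strategies, covering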
PyTorch implementation of the WACV 2021 paper "Class-agnostic Few-shot-Object-Counting"
Class agnostic Few shot Object Counting This repository is the unofficial PyTorch implementation of the WACV 2021 paper "Class-agnostic Few-shot-Objec
Official Repository for the 3DV 2022 paper "The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs"
The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs (3DV 2022) Chris Rockwell, Justin Johnson and David F. Fouhey Project
When the computer starts, you can select which other applications you want to launch
When the computer starts, you can select which other applications you want to launch. This application prevents multiple unused applications from opening at startup.
Restoring Vision in Adverse Weather Conditions with Patch-Based Denoising Diffusion Models
This is the code repository of the following paper to train and perform inference with patch-based diffusion models for image restoration under adverse weather conditions.
Tracks an object of a given colour and its centre in space, giving x, y coordinates.
Computer-Vision-Motion-Tracking Tracks a certain coloured object and tracks its centre in space. Currently set to detect a red object using the built
A simple virus that shuts down your computer when it boots to your desktop.
Windows-Shutup-Virus A simple virus that shuts down your computer when it boots to your desktop. It is made using Python with a little help of Batc
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
XMem Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model Ho Kei Cheng, Alexander Schwing University of Illinois Urbana-Champaig
Official PyTorch implementation of A-ViT: Adaptive Tokens for Efficient Vision Transformer (CVPR 2022)
A-ViT: Adaptive Tokens for Efficient Vision Transformer This repository is the official PyTorch implementation of A-ViT: Adaptive Tokens for Efficient
Code for "NeuMesh: Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing", ECCV 2022 Oral
NeuMesh: Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing Project Page | Video | Paper NeuMesh: Learning Disent
Spinning Servo Using Computer Vision
This project uses OpenCV and my hand detection module to spin a servo depending on the distance between the tips of the thumb and index finger. Other specific angles can also be achieved.
This project uses computer vision and accurate hand detection to detect and display via image and text how many fingers are up on a hand on any present webcam.
Comuter-Vision-Finger-Counter This project uses computer vision and accurate hand detection to detect and display via image and text how many fingers
Through computer vision and accurate hand tracking, by using your webcam you can paint on a virtual canvas by just using your fingers.
Python-Virtual-Painter Through computer vision and accurate hand tracking, by using your webcam you can paint on a virtual canvas by just using your f
This repository includes projects I am working on regarding computer vision.
Computer-Vision This repository includes projects I am working on regarding computer vision. SIFT Feature Detection Image of best 100 matches detected
opl is a shell command that will search your entire computer for a file or directory.
opl opl is a shell command that will search your entire computer for a file or directory. Install Install python Clone or download repository cd into
[ECCV 2022] Patch Similarity Aware Data-Free Quantization for Vision Transformers
Patch Similarity Aware Data-Free Quantization for Vision Transformers This repository contains the official PyTorch implementation for the ECCV 2022 p
Codes and Datasets of "Spike Transformer: Monocular Depth Estimation for Spiking Camera" in European Conference on Computer Vision (ECCV) 2022.
MDE-SpikingCamera Codes and Datasets of "Spike Transformer: Monocular Depth Estimation for Spiking Camera". Jiyuan Zhang*, Lulu Tang*, Zhaofei Yu $\da
Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images. ECCV 2022. [ Official ]
SRPO [Official Code] Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images. (Paper Link) By Jinjin
[ECCV 2022] Official pytorch implementation of "mc-BEiT: Multi-choice Discretization for Image BERT Pre-training" in European Conference on Computer Vision (ECCV) 2022.
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training Official pytorch implementation of "mc-BEiT: Multi-choice Discretization for Image BE
A Python package for fast and robust Image Stitching
stitching A Python package for fast and robust Image Stitching. Based on opencv's stitching module and inspired by the stitching_detailed.py python co
A collection of benchmarks I've run
Is the 3090 good for computer vision This repo holds a set of personal benchmarks I've run on my system. OS: Ubuntu 22.04.1 LTS x86_64 CPU: AMD Ryzen
Unified interface to google vision, aws textract, azure, tesseract and other OCR tools
ocrpy Unified interface to google vision, aws textract, azure, tesseract and other OCR tools The Core objective of OcrPy is to let users OCR, Archive,
A command-line based project. The computer plays tic-tac-toe with you, in three increasing levels of difficulty (and intelligence).
tic-tac-toe About tic-tac-toe tic-tac-toe is a tiny, simple, interactive command-line based project. Date of creation: September 10, 2019 The game can
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations.
VL-CheckList Updates 07/04/2022: VL-CheckList paper on arxiv https://arxiv.org/abs/2207.00221 07/12/2022: Updated object, relation, attribute splits/d
MixGen: A New Multi-Modal Data Augmentation
MixGen: A New Multi-Modal Data Augmentation This is the official PyTorch implementation of MixGen, which is a joint data augmentation technique for vi
Self-supervised monocular depth estimation with a vision transformer
MonoViT This is the reference PyTorch implementation for training and testing depth estimation models using the method described in MonoViT: Self-Supe
Repository of SECON paper - ViTag: Online WiFi Finite Time Measurements Aided Vision-Motion Identity Association in Multi-person Environments.
ViTag Repository of our paper accepted in SECON 2022: Bryan Bo Cao, Abrar Alali, Hansi Liu, Nicholas Meegan, Marco Gruteser, Kristin Dana, Ashwin Asho
Rewriting Geometric Rules of a GAN: Warp a GAN model to customized, out-of-domain shapes.
Rewriting Geometric Rules of a GAN Project | Paper | Youtube With our method, a user can edit a GAN model to synthesize many unseen objects with the d
[ECCV2022] Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression
night_enhancement (ECCV'2022) Introduction This is an implementation of the following paper. Unsupervised Night Image Enhancement: When Layer Decompos
REALY: Rethinking the Evaluation of 3D Face Reconstruction (ECCV 2022)
REALY Benchmark This is the official repository for 3D face reconstruction evaluation on the Region-aware benchmark based on the LYHM Benchmark (REALY
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
HorNet Created by Yongming Rao*, Wenliang Zhao*, Yansong Tang, Jie Zhou, Ser-Nam Lim†, Jiwen Lu† This repository contains PyTorch implementation for H
[ECCV'22] The official PyTorch implementation of our ECCV 2022 paper: "AiATrack: Attention in Attention for Transformer Visual Tracking".
AiATrack The official PyTorch implementation of our ECCV 2022 paper: AiATrack: Attention in Attention for Transformer Visual Tracking Shenyuan Gao, Ch
A little Python-Bash-PowerShell project that creates a repo on your GitHub account and clones it on your computer
Create-Project-Automation Description This is a little Python-Bash-PowerShell project that creates a repo on your GitHub account and clones it on your
A machine learning model that can play car and bike driving games and can be used as a reference for self-driving vehicles.
Inception About: It is a machine learning model that can play car and bike driving games and can be used as a reference for self-driving vehicles. Mode
GIT: A Generative Image-to-text Transformer for Vision and Language
Introduction This repo presents example code to reproduce some results in GIT: A Generative Image-to-text Transformer for Vision and Language. I
Official implementation for paper "LightViT: Towards Light-Weight Convolution-Free Vision Transformers"
LightViT Official implementation for paper "LightViT: Towards Light-Weight Convolution-Free Vision Transformers". By Tao Huang, Lang Huang, Shan You,