ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions (SIGGRAPH 2022 - Journal Track)

Difan Liu, Sandesh Shetty, Tobias Hinz, Matthew Fisher, Richard Zhang, Taesung Park, Evangelos Kalogerakis

UMass Amherst and Adobe Research

ACM Transactions on Graphics (SIGGRAPH 2022)

Project page | Paper

Requirements

Testing

Flickr-Landscape

Download the pretrained model, set the config keys in configs/landscape_test.yaml to the path of the pretrained model, then run:

python test.py -t configs/landscape_test.yaml -i data_test/landscape_input.jpg -s data_test/landscape_seg.png -m data_test/landscape_mask.png -c waterother -n water_reflection
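For scripted runs, the command above can be assembled from Python. This is a minimal sketch, not part of the repository: the helper name build_test_command is hypothetical, and the flag meanings given in the comments are assumptions inferred from the example values rather than documented behavior.

```python
import shlex


def build_test_command(config, image, seg, mask, label, name):
    """Return the argv list for test.py (hypothetical helper, not in the repo)."""
    return [
        "python", "test.py",
        "-t", config,  # test config holding the pretrained model paths
        "-i", image,   # input image
        "-s", seg,     # segmentation map
        "-m", mask,    # mask (assumed: marks the region to edit)
        "-c", label,   # assumed: target class label
        "-n", name,    # assumed: name used for the output
    ]


cmd = build_test_command(
    "configs/landscape_test.yaml",
    "data_test/landscape_input.jpg",
    "data_test/landscape_seg.png",
    "data_test/landscape_mask.png",
    "waterother",
    "water_reflection",
)
# Print the command as a single shell-safe string.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) from the repository root would launch the run; keeping the arguments as a list avoids shell quoting issues when paths contain spaces.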

COCO-Stuff

Download the pretrained model, set the config keys in configs/coco_test.yaml to the path of the pretrained model, then run:

python test.py -t configs/coco_test.yaml -i data_test/coco_input.png -s data_test/coco_seg.png -m data_test/coco_mask.png -c pizza -n coco_pizza

Training

Datasets

Flickr-Landscape

Train a VQGAN with:

python main.py --base configs/landscape_vqgan.yaml -t True --gpus -1 --num_gpus 8 --save_dir <path to ckpt>

In configs/landscape_guiding_transformer.yaml, set the config key model.params.first_stage_config.params.ckpt_path to the pretrained VQGAN checkpoint and the config key model.params.cond_stage_config.params.ckpt_path to landscape_VQ_seg_model.ckpt (see Taming Transformers for how to train the VQ_seg_model). Then train the guiding transformer at 256 resolution with:

python main.py --base configs/landscape_guiding_transformer.yaml -t True --gpus -1 --num_gpus 8 --user_lr 3.24e-5 --save_dir <path to ckpt>
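The dotted config keys above name nested entries in the YAML file. As a minimal sketch of what the replacement amounts to, the snippet below sets a dotted key on a plain nested dict. The helper set_by_dotted_key and the placeholder VQGAN path are invented for illustration; in practice the config would be edited by hand or loaded with a YAML library.

```python
def set_by_dotted_key(config, dotted_key, value):
    """Set config[a][b]...[z] = value for dotted_key 'a.b...z' (hypothetical helper).

    Raises KeyError if any part of the key is missing, so a misspelled key
    is reported instead of silently creating a new entry.
    """
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node[part]  # KeyError here means the path does not exist
    if leaf not in node:
        raise KeyError(dotted_key)
    node[leaf] = value
    return config


# Stand-in for the relevant part of the loaded YAML config (structure assumed
# from the dotted keys named in the text).
config = {"model": {"params": {
    "first_stage_config": {"params": {"ckpt_path": None}},
    "cond_stage_config": {"params": {"ckpt_path": None}},
}}}

set_by_dotted_key(config, "model.params.first_stage_config.params.ckpt_path",
                  "<path to VQGAN ckpt>")  # placeholder path
set_by_dotted_key(config, "model.params.cond_stage_config.params.ckpt_path",
                  "landscape_VQ_seg_model.ckpt")
```

Failing on unknown keys is a deliberate choice in this sketch: a typo in a checkpoint key would otherwise leave the real key unset and only surface later during training.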

In configs/landscape_SGA_transformer_512.yaml, set the config keys model.params.guiding_ckpt_path and model.params.ckpt_path to the pretrained guiding transformer checkpoint. Then finetune the SGA transformer at 512 resolution with:

python main.py --base configs/landscape_SGA_transformer_512.yaml -t True --gpus -1 --num_gpus 8 --user_lr 1.25e-5 --save_dir <path to ckpt>

In configs/landscape_SGA_transformer_1024.yaml, set the config key model.params.guiding_ckpt_path to the pretrained guiding transformer checkpoint and the config key model.params.ckpt_path to the SGA transformer finetuned at 512 resolution. Then finetune the SGA transformer at 1024 resolution with:

python main.py --base configs/landscape_SGA_transformer_1024.yaml -t True --gpus -1 --num_gpus 8 --user_lr 5e-6 --save_iters 4000 --val_iters 16000 --accumulate_bs 4 --save_dir <path to ckpt>
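The four Flickr-Landscape stages form a chain in which each stage consumes a checkpoint from an earlier one. The sketch below only records that ordering as data and rebuilds the corresponding commands. The Stage class, the stage names, and the needs field are illustrative assumptions; the config files and flags are copied from the commands above.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    name: str    # illustrative label, not a repo identifier
    config: str  # config file passed to --base
    extra_args: list = field(default_factory=list)
    needs: list = field(default_factory=list)  # earlier stages whose checkpoints are required


STAGES = [
    Stage("vqgan", "configs/landscape_vqgan.yaml"),
    # Also needs landscape_VQ_seg_model.ckpt, which is trained separately.
    Stage("guiding_256", "configs/landscape_guiding_transformer.yaml",
          ["--user_lr", "3.24e-5"], needs=["vqgan"]),
    Stage("sga_512", "configs/landscape_SGA_transformer_512.yaml",
          ["--user_lr", "1.25e-5"], needs=["guiding_256"]),
    Stage("sga_1024", "configs/landscape_SGA_transformer_1024.yaml",
          ["--user_lr", "5e-6", "--save_iters", "4000", "--val_iters", "16000",
           "--accumulate_bs", "4"], needs=["guiding_256", "sga_512"]),
]


def train_command(stage, save_dir="<path to ckpt>"):
    """Rebuild the main.py command line for one stage."""
    args = ["python", "main.py", "--base", stage.config, "-t", "True",
            "--gpus", "-1", "--num_gpus", "8", *stage.extra_args,
            "--save_dir", save_dir]
    return " ".join(args)


def check_order(stages):
    """Verify every stage is listed after the stages it depends on."""
    seen = set()
    for stage in stages:
        missing = [n for n in stage.needs if n not in seen]
        if missing:
            raise ValueError(f"{stage.name} needs {missing} first")
        seen.add(stage.name)
    return True


check_order(STAGES)
for stage in STAGES:
    print(train_command(stage))
```

Between stages, the checkpoint paths in the next stage's config still have to be updated as described above; the sketch does not do that.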

COCO-Stuff

Train a VQGAN with:

python main.py --base configs/coco_vqgan.yaml -t True --gpus -1 --num_gpus 2 --save_dir <path to ckpt>

In configs/coco_guiding_transformer.yaml, set the config key model.params.first_stage_config.params.ckpt_path to the pretrained VQGAN checkpoint and the config key model.params.cond_stage_config.params.ckpt_path to coco_VQ_seg_model.ckpt. Then train the guiding transformer at 256 resolution with:

python main.py --base configs/coco_guiding_transformer.yaml -t True --gpus -1 --num_gpus 8 --user_lr 3.24e-5 --save_dir <path to ckpt>

In configs/coco_SGA_transformer_512.yaml, set the config keys model.params.guiding_ckpt_path and model.params.ckpt_path to the pretrained guiding transformer checkpoint. Then finetune the SGA transformer at 512 resolution with:

python main.py --base configs/coco_SGA_transformer_512.yaml -t True --gpus -1 --num_gpus 8 --user_lr 1.25e-5 --save_dir <path to ckpt>

BibTeX:

@article{liu2022asset,
author = {Liu, Difan and Shetty, Sandesh and Hinz, Tobias and Fisher, Matthew and Zhang, Richard and Park, Taesung and Kalogerakis, Evangelos},
title = {ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions},
year = {2022},
volume = {41},
number = {4},
journal = {ACM Trans. Graph.},
}

Acknowledgment

Our code is developed based on Taming Transformers.

Contact

To ask questions, please email.
