CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP


This project is a PyTorch implementation of the paper "CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP", developed on the Fire-Flyer II (萤火二号) cluster.


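CLIP-GEN is a language-free text-to-image generation method: it needs no paired image-text training samples. By relying on the strong representations of a pretrained CLIP model, it trains a text-to-image model from image data alone. It works as follows. CLIP-GEN first trains a VQ-GAN that maps images into a discrete space. It then trains a GPT model that maps CLIP embeddings into the VQ-GAN's discrete space. Because text and images share one feature space in CLIP, the same procedure maps text into the VQ-GAN's discrete space at inference time, and the result is decoded into an RGB image.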
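
The data flow described above can be illustrated with a toy sketch. It is not the repository's code: the embeddings, token lists, and function names are made up, a nearest-neighbour lookup stands in for the autoregressive GPT, and the final VQ-GAN decoding step is omitted. It only shows why a model trained on image embeddings can be queried with a text embedding.

```python
import math

# Toy 2-D "CLIP" space: an image and a matching caption land close together.
# All values below are invented for illustration.
IMAGE_EMBEDDINGS = {"bus.jpg": (0.9, 0.1), "tower.jpg": (0.1, 0.9)}
TEXT_EMBEDDINGS = {"a city bus": (0.85, 0.15), "a tall tower": (0.2, 0.8)}

# Stage 1 (VQ-GAN stand-in): each image is encoded as discrete token indices.
IMAGE_TOKENS = {"bus.jpg": [3, 7, 7, 1], "tower.jpg": [5, 2, 0, 5]}


def train_prior(image_embeddings, image_tokens):
    """Stage 2 stand-in: pair each image embedding with its tokens.

    Only images are used here; no captions are involved in "training".
    """
    return [(image_embeddings[name], image_tokens[name]) for name in image_embeddings]


def predict_tokens(prior, embedding):
    """Return the tokens whose training embedding is nearest to `embedding`."""
    _, tokens = min(prior, key=lambda pair: math.dist(pair[0], embedding))
    return tokens


prior = train_prior(IMAGE_EMBEDDINGS, IMAGE_TOKENS)

# Inference: a text embedding takes the place of the image embedding.
tokens = predict_tokens(prior, TEXT_EMBEDDINGS["a city bus"])
print(tokens)  # a real pipeline would pass these tokens to the VQ-GAN decoder
```

In the actual method the lookup is replaced by a GPT that generates the token sequence conditioned on the CLIP embedding, and the tokens are decoded to pixels by the VQ-GAN.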

Requirements

Training

Supported datasets: coco, imagenet, googlecc

  1. Download the pretrained CLIP model

    Download CLIP and place it at pretrained/clip_vit_b32.pt. This pretrained model comes from OpenAI.

  2. Train VQGAN on COCO

    Submit the job to the Fire-Flyer cluster:

    hfai python train_vqgan.py --ds coco -- -n 1 -p 30

    Run locally:

    python train_vqgan.py --ds coco

  3. Train the Conditional GPT on COCO

    Submit the job to the Fire-Flyer cluster:

    hfai python train_gpt.py --ds coco --vqgan_ckpt /path/to/vqgan/ckpt -- -n 4 -p 30

    Run locally:

    python train_gpt.py --ds coco --vqgan_ckpt /path/to/vqgan/ckpt
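
The two local training stages can be assembled programmatically, which may help when switching datasets. The sketch below only builds the command lines shown above. The helper name `local_training_commands` is hypothetical, and it assumes `--ds` accepts each of the supported dataset names exactly as listed.

```python
import shlex

# Dataset names listed as supported in this README.
SUPPORTED_DATASETS = ("coco", "imagenet", "googlecc")


def local_training_commands(ds, vqgan_ckpt):
    """Return the argv lists for the two local training stages, in order.

    Hypothetical helper: it mirrors the commands documented above and
    assumes `--ds` takes any name in SUPPORTED_DATASETS.
    """
    if ds not in SUPPORTED_DATASETS:
        raise ValueError(f"unsupported dataset {ds!r}; expected one of {SUPPORTED_DATASETS}")
    return [
        # Stage 1: train the VQGAN.
        ["python", "train_vqgan.py", "--ds", ds],
        # Stage 2: train the conditional GPT on top of the VQGAN checkpoint.
        ["python", "train_gpt.py", "--ds", ds, "--vqgan_ckpt", vqgan_ckpt],
    ]


for argv in local_training_commands("coco", "/path/to/vqgan/ckpt"):
    print(shlex.join(argv))
```

Each argv list could be passed to `subprocess.run(argv, check=True)` so that the second stage starts only if the first one succeeds.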

Demo

Download the VQGAN and GPT models trained on COCO and place them at pretrained/vqgan_coco.pt and pretrained/gpt_coco.pt respectively; then run:

python demo.py --text "A city bus driving on the city street" --out "bus.jpg"

NOTE: the demo does not depend on hfai; it can be run directly in any environment with PyTorch installed.
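
Before running the demo, it can be useful to confirm that the checkpoint files are where the README expects them. The following is a small sketch, not part of the repository: `missing_checkpoints` is a hypothetical helper, and it assumes the demo also needs the CLIP checkpoint at the path given in the training section.

```python
from pathlib import Path

# Paths taken from this README. Including the CLIP checkpoint is an
# assumption: the README names it for training, not explicitly for the demo.
EXPECTED_CHECKPOINTS = (
    "pretrained/clip_vit_b32.pt",
    "pretrained/vqgan_coco.pt",
    "pretrained/gpt_coco.pt",
)


def missing_checkpoints(root="."):
    """Return the expected checkpoint paths that are not files under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_CHECKPOINTS if not (base / rel).is_file()]


missing = missing_checkpoints()
if missing:
    print("Missing checkpoint files:", ", ".join(missing))
else:
    print("All checkpoint files found.")
```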

Samples

Below are some text-to-image samples:

[Sample images: tower, bus, living, train, skiing]

References

Citation

@article{wang2022clip,
  title={CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP},
  author={Wang, Zihao and Liu, Wei and He, Qian and Wu, Xinglong and Yi, Zili},
  journal={arXiv preprint arXiv:2203.00386},
  year={2022}
}

TODO

  • Pretrained models
  • FFRecord data