SDUtils: Stable Diffusion Utility Wrapper

A general Stable Diffusion utilities wrapper including: Video to Video, Image to Image, a Template Prompt Generation system and more, for use with any Stable Diffusion model.

Note: this is far from a finished project, and it will be continuously improved. It is kept as modular as possible, save for the torch code included in the genutils.generate function. As I use Stable Diffusion from code, I keep running into the need for specific utilities, and they end up here.

Features

  • Easy prompt randomization through templating
  • Keyframe-based prompt updates
  • Image2Image generation with seed control
  • Video2Video generation with seed control and overrides
  • Multiple Image to Multiple Image generation
  • Batch Processing for Image2Image and Video2Video - Get as many as you want from one input
  • Outputs all seeds, strengths and prompts generated into a file called vidmap.json (see the sketch after this list)
  • Stores all output videos and images with seed data and an index number for easy association with the stored map
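
Since every run records its seeds, strengths and prompts in vidmap.json, you can always reconnect an output file with the settings that produced it. Below is a minimal sketch of reading the map back, assuming vidmap.json is plain JSON with one record per output index; the exact schema may differ, so inspect your own file first.

import json

# Hypothetical schema: one seed/strength/prompt record per output index.
with open('vidmap.json') as f:
    vidmap = json.load(f)

for index, entry in vidmap.items():
    print(index, entry)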

PromptGenerator (promptgen.py) is a standalone class that generates prompts from the possibilities you give it, so you can easily swap any word or phrase in your prompt for a random entry from a list. It makes prompt templates reusable: just change the data to get different prompts.

Scaffold (genutils.py) is a standalone class that holds all the utility functions. It takes a (device, scheduler, pipe) triple from the torch environment, wrapping the initial Stable Diffusion functionality and utility functions into one class, and expands on that functionality with a Video2Video function and batching.

sdunlock.py includes an unlocked version of the pipelines for Text to Image and Image to Image.

The greatest standalone feature here: PromptGenerator

This is (as far as I know) a unique way to separate the data you want to rotate or randomize from the prompt itself.

Imagine you have a prompt:

_prompt = 'A tall, skinny man walking along a tight rope at night.'

But you want options for how the man looks, what he's doing, and when, and you want Stable Diffusion to render from those options randomly or in rotation. Let's even say you want that to be different on every new image you render. What do you do? You write your prompt like this:

_prompt = 'A $height, $composure $sex $action $timeofday'

Then you give the prompt the data to fill into that template:

data = {
  'height': [
    'tall',
    'short',
    'average height'
  ],
  'composure': [
    'fat',
    'skinny',
    'lanky',
    'fit',
    'muscular',
    'wide shouldered',
  ],
  'sex': [
    'man',
    'woman',
  ],
  'action': [
    'walking on a tightrope',
    'drinking a coffee',
    'playing soccer',
    'doing yoga',
    'on a trapeze',
    'dressed as a clown',
  ],
  'timeofday': [
    'at night',
    'in the morning',
    'in the afternoon',
    'at dawn',
    'at sunset',
    'during a hailstorm',
  ],
}

Add the prompt and data to a PromptGenerator object:

prompt = PromptGenerator(_prompt, data)

Then every time you call prompt.generate(), it will give you a newly generated text prompt:

for i in range(0, 10):
  print(prompt.generate())

This should output something along these lines:

('A average height, fat woman walking on a tightrope during a hailstorm', 0.5, {'height': 'average height', 'composure': 'fat', 'sex': 'woman', 'action': 'walking on a tightrope', 'timeofday': 'during a hailstorm'})

('A tall, fat woman dressed as a clown in the morning', 0.5, {'height': 'tall', 'composure': 'fat', 'sex': 'woman', 'action': 'dressed as a clown', 'timeofday': 'in the morning'})

('A short, skinny woman on a trapeze at dawn', 0.5, {'height': 'short', 'composure': 'skinny', 'sex': 'woman', 'action': 'on a trapeze', 'timeofday': 'at dawn'})

('A short, fit woman walking on a tightrope at sunset', 0.5, {'height': 'short', 'composure': 'fit', 'sex': 'woman', 'action': 'walking on a tightrope', 'timeofday': 'at sunset'})

('A tall, fat woman drinking a coffee at sunset', 0.5, {'height': 'tall', 'composure': 'fat', 'sex': 'woman', 'action': 'drinking a coffee', 'timeofday': 'at sunset'})

('A average height, wide shouldered man drinking a coffee at dawn', 0.5, {'height': 'average height', 'composure': 'wide shouldered', 'sex': 'man', 'action': 'drinking a coffee', 'timeofday': 'at dawn'})

('A tall, fat man dressed as a clown at night', 0.5, {'height': 'tall', 'composure': 'fat', 'sex': 'man', 'action': 'dressed as a clown', 'timeofday': 'at night'})

('A tall, wide shouldered woman drinking a coffee at dawn', 0.5, {'height': 'tall', 'composure': 'wide shouldered', 'sex': 'woman', 'action': 'drinking a coffee', 'timeofday': 'at dawn'})

('A tall, fat man doing yoga at sunset', 0.5, {'height': 'tall', 'composure': 'fat', 'sex': 'man', 'action': 'doing yoga', 'timeofday': 'at sunset'})

('A average height, wide shouldered woman walking on a tightrope at sunset', 0.5, {'height': 'average height', 'composure': 'wide shouldered', 'sex': 'woman', 'action': 'walking on a tightrope', 'timeofday': 'at sunset'})
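
Incidentally, the $placeholder syntax matches Python's built-in string.Template, which is enough to illustrate the fill mechanism. This is only a sketch of the idea, not necessarily how promptgen implements it:

import random
from string import Template

# Illustrative only: fill each $placeholder with a random pick
# from the corresponding list in `data`.
filled = Template(_prompt).substitute(
    {key: random.choice(options) for key, options in data.items()}
)
print(filled)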

The tuple that prompt.generate() returns is (prompt, strength, data_map):

  • prompt - The prompt string to feed to your diffusion model
  • strength - For use with img2img and video2video
  • data_map - Use this if you want to store the attributes of your creations in an index file, for instance; this is how Scaffold uses it (a minimal sketch follows)
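
For example, here is a minimal sketch of using data_map to keep your own index of what went into each image. The filename scheme is made up for illustration:

import json

index = []
for i in range(10):
    text_prompt, strength, data_map = prompt.generate()
    # ... render and save an image for text_prompt here ...
    index.append({'file': f'output_{i}.png', 'strength': strength, **data_map})

with open('index.json', 'w') as f:
    json.dump(index, f, indent=2)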

How to use PromptGenerator and Scaffold with Stable Diffusion

First things first, you have to import the libraries:

from promptgen import PromptGenerator
from genutils import Scaffold

Make sure your device, scheduler and pipe are declared in your torch code, then hand them to the Scaffold:

scaffold = Scaffold(device, scheduler, pipe)
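
If you need a starting point, here is one way to declare that triple using the Hugging Face diffusers library. This is a minimal sketch under those assumptions; any setup that yields the same three objects should work:

import torch
from diffusers import StableDiffusionImg2ImgPipeline

device = 'cuda' if torch.cuda.is_available() else 'cpu'
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    'runwayml/stable-diffusion-v1-5',  # any Stable Diffusion checkpoint
    torch_dtype=torch.float16 if device == 'cuda' else torch.float32,
).to(device)
scheduler = pipe.scheduler

scaffold = Scaffold(device, scheduler, pipe)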

Prompt Generation and data setup

Setting up a new Prompt with different possible outputs

map = {
  'attribute1': ['tall', 'short', 'lumpy'],
  'attribute2': ['headed', 'framed', 'legged'],
  ...
}
prompt = PromptGenerator('A self portrait of a $attribute1 $attribute2 person', map)

What about getting a text prompt from this?

text_prompt, strength, prompt_data = prompt.generate()

Examples

Image2Image Example generating 1 image with a random seed

img_path = 'path_to_some_image.jpg'

scaffold.generate(
  img_path,
  prompt,
  num_seeds = 1
)

Image2Image Example generating many images with different seeds from one image

scaffold.generate(
  img_path,
  prompt,
  num_seeds = 100
)

Video2Video Example

video_path = 'path_to_some_video.mp4'
map = { ... }
prompt = PromptGenerator( ... )

scaffold.generate_video(
  video_path,
  prompt
)

Batch Video2Video Example

# Generate 10 different videos from one video
scaffold.generate_batch_videos(
  video_path,
  10,
  prompt
)

Prompts that change at Keyframes using PromptGenerator

# Make a bunch of videos with different subjects and the same sequence of scenes
map = {
  'aspect': ['hairy', 'hairless', 'suspicious looking', 'happy'],
  'creature': ['chinchilla', 'armadillo', 'cockatiel', 'ferret', 'rooster']
}

prompt = PromptGenerator({
  0: 'a $aspect $creature walking on a tightrope. wide angle shot from the side.',
  122: 'a $aspect $creature falling in the air tumbling. wide angle shot from below.',
  184: 'a $aspect $creature falling into a small drinking glass.',
  192: 'a $aspect $creature squished inside a small drinking glass.'
}, map)

scaffold.generate_batch_videos(
  video_path,
  100,
  prompt
)
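
Presumably the dictionary keys are video frame numbers, and the base prompt in effect at any frame is the entry with the largest keyframe at or below it. A hypothetical sketch of that lookup (promptgen may implement keyframes differently):

def template_for_frame(keyframe_templates, frame):
    # Hypothetical helper: pick the template whose keyframe number is
    # the largest one that is <= the current frame.
    active = max(k for k in keyframe_templates if k <= frame)
    return keyframe_templates[active]

# With the map above, frames 0-121 would use the tightrope template,
# frames 122-183 the tumbling template, and so on.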

Notes about Video2Video and getting it just right

If you pass the strength override as an argument to the scaffold.generate_video and scaffold.generate_batch_videos functions, you can experiment to find the right strength for your set of prompt possibilities. This helps you find a good tradeoff between matching your prompt set more closely and getting a choppy animation.

For example, at a strength around 0.1 you can get really smooth video but not much dreaming toward the prompt, whereas at 0.7 you can get a pretty choppy but still good video that matches your prompt much better.
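
One practical approach is a simple sweep over the override. This sketch assumes the override is a keyword argument named strength; check the actual signatures in genutils.py:

# Hypothetical sweep: render the same source video at several strengths
# and compare smoothness vs. prompt adherence.
for strength in (0.1, 0.3, 0.5, 0.7):
    scaffold.generate_video(video_path, prompt, strength=strength)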

My suggestion is to use a source video that approximates the style and actions you want to recreate with your prompts, AND to make the prompt possibilities mostly include things that are going to be very different, using the !! and (( emphasis syntax to aid this. That lets you keep the strength relatively low, so you get videos that are unique enough but not so choppy.

This will largely be fixed by the upcoming integration of /u/aqwis's gist code with the K_Euler sampler reversal. That should let you bring the strength down while still getting a proper noise match for each frame of the video. The one downside of the new method is that the "seed" parameter is eliminated, replaced by the noise that best matches the initial input image for each frame. This should not be much of a problem: if you have enough possibilities in your PromptGenerator map space, you should not have to worry about needing seeds to create enough video variations. You can check this by running the prompt.stats() function, which tells you how many distinct prompts your template and data will create.
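
That count is just the product of the option-list lengths. For the example data above it is 3 x 6 x 2 x 6 x 6 = 1,296 distinct prompts, which you can verify yourself (prompt.stats() presumably reports the same kind of figure):

import math

# Product of the option-list lengths: 1296 for the example `data` above.
combinations = math.prod(len(options) for options in data.values())
print(combinations)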

Yes, Yes, There is an NFTUtil now...

Look, I know that SD people on reddit and NFT people on reddit don't get along, but we're on GitHub now, and this is a utility library that I hope to make into a multi-stack library. That means you should be able to, with a few lines of code, do everything from generating, upscaling, cropping, and batching to launching an NFT collection on IPFS with full metadata control in one run. This is just the first direction of that stack I have taken. Each class in this repository should work standalone or with minimal in-library requirements. Imagine the near future when we've got this building entire landing pages from a prompt. SD and other LDMs are going to disrupt so many industries that I'm amazed it has gotten this far as an open source project. Anywho, sure, NFTs are whatever you want them to be, and here you can make a collection of 10,000 of these in about 60 Google Colab hours.

TL;DR: Leave politics to reddit.

Using the NFTUtil class in your workflows

# example coming soon

Coming Soon

  • K_Euler sampler reversal trick for Video2Video consistency
  • Image/Video in-painting and out-painting