MegDiffusion

Overview

MegEngine implementation of Diffusion Models (in early development).

Current maintainer: @MegChai

Usage

Infer with pre-trained models

Now users can use megengine.hub to get pre-trained models directly:

megengine.hub.list("MegEngine/MegDiffusion:main")
megengine.hub.help("MegEngine/MegDiffusion:main", "ddpm_cifar10")
model = megengine.hub.load("MegEngine/MegDiffusion:main", "ddpm_cifar10", pretrained=True)
model.eval()

Or, if you have downloaded or installed MegDiffusion, you can get pre-trained models from the model module.

from megdiffusion.model import ddpm_cifar10
model = ddpm_cifar10(pretrained=True)
model.eval()

The inference script shows how to generate 64 CIFAR10-like images and make a grid of them:

python3 -m megdiffusion.scripts.inference
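Under the hood, DDPM sampling iterates the reverse-diffusion update from Ho et al. (2020), the paper cited below. A minimal NumPy sketch of that loop, with a stand-in noise predictor (`fake_eps_model` is a hypothetical placeholder, not the project's API):

```python
import numpy as np

# Linear beta schedule as in Ho et al. (2020).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def fake_eps_model(x, t):
    """Stand-in for the trained noise-prediction network."""
    return np.zeros_like(x)

def ddpm_reverse_step(x_t, t, rng):
    """One reverse-diffusion step: x_t -> x_{t-1}."""
    eps = fake_eps_model(x_t, t)
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (x_t - coef * eps) / np.sqrt(alphas[t])
    if t > 0:
        sigma = np.sqrt(betas[t])  # simple choice: sigma_t^2 = beta_t
        return mean + sigma * rng.standard_normal(x_t.shape)
    return mean

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 3, 32, 32))  # start from pure Gaussian noise
for t in reversed(range(T)):
    x = ddpm_reverse_step(x, t, rng)
print(x.shape)  # (1, 3, 32, 32)
```

The real script replaces `fake_eps_model` with the pre-trained network and maps the final tensor back to image range.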

Train from scratch

  • Take DDPM on CIFAR10 as an example:

    python3 -m megdiffusion.scripts.train \
        --flagfile ./megdiffusion/config/ddpm-cifar10.txt
  • [Optional] Overwrite arguments:

    python3 -m megdiffusion.scripts.train \
       --flagfile ./megdiffusion/config/ddpm-cifar10.txt \
       --logdir ./path/to/logdir \
       --batch_size=64 \
       --save_step=100000 \
       --parallel=True

Known issues:

  • Training with a single GPU while using gradient clipping causes an error in MegEngine v1.9.x.

Development

python3 -m pip install -r requirements.txt
python3 -m pip install -v -e .

Develop on a new local branch and remember to add the necessary test code. When finished, submit a Pull Request against the main branch and wait for review.

Acknowledgment

The following open-source projects were referenced here:

Thanks to @gaohuazuo, @xxr3376, @P2Oileen and the other contributors for supporting this project. The R&D platform and the resources required for the experiments were provided by MEGVII Inc. The deep learning framework used in this project is MegEngine -- a magic weapon.

Citations

@article{ho2020denoising,
    title   = {Denoising Diffusion Probabilistic Models},
    author  = {Jonathan Ho and Ajay Jain and Pieter Abbeel},
    year    = {2020},
    eprint  = {2006.11239},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}
Comments
  • eps for GroupNorm

    Great work! The parameter `eps` in GroupNorm is initialized to 1e-5 by default. However, GroupNorm in TensorFlow differs slightly: it is initialized with 1e-6. Maybe it doesn't influence the training results, but could you change this (for all GroupNorm layers in the code) for alignment? I want to convert trained models from Torch or TF to MegEngine, and the smaller the error, the better.
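    To see how small that gap is in isolation, here is a plain-NumPy sketch (not MegEngine code) comparing group normalization with the two eps values:

```python
import numpy as np

def group_norm(x, num_groups, eps):
    """Plain group normalization over (N, C, H, W), no affine parameters."""
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(n, c, h, w)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 32, 8, 8))
y5 = group_norm(x, num_groups=32, eps=1e-5)  # MegEngine/PyTorch default
y6 = group_norm(x, num_groups=32, eps=1e-6)  # TensorFlow implementation
print(np.abs(y5 - y6).max())  # tiny per layer, but it compounds across a deep network
```

    The per-layer discrepancy is on the order of 1e-5, which is why it matters mainly for checkpoint conversion rather than training.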

    opened by Asthestarsfalll 5
  • About padding in Downsample

    I'm willing to upload my conversion code, but it doesn't work well after converting: the error between the MegEngine and PyTorch implementations is high given the same input. The cause is that the convolution padding in Downsample differs -- the PyTorch implementation uses asymmetric padding. After I modified the MegEngine implementation, the result:

    class DownSample(M.Module):
        """A downsampling layer with an optional convolution.

        Args:
            in_ch: channels in the inputs and outputs.
            with_conv: if ``True``, apply convolution to do downsampling; otherwise use pooling.
        """

        def __init__(self, in_ch, with_conv=True):
            super().__init__()
            self.with_conv = with_conv
            if with_conv:
                self.main = M.Conv2d(in_ch, in_ch, 3, stride=2)
            else:
                self.main = M.AvgPool2d(2, stride=2)

        def _initialize(self):
            for module in self.modules():
                if isinstance(module, M.Conv2d):
                    init.xavier_uniform_(module.weight)
                    init.zeros_(module.bias)

        def forward(self, x, temb):  # unused temb param kept so the interface matches other blocks
            if self.with_conv:
                # Pad only the bottom/right of H and W, matching TF 'SAME' padding here.
                x = F.nn.pad(x, [*[(0, 0) for i in range(x.ndim - 2)], (0, 1), (0, 1)])
            return self.main(x)
    

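    The shape effect of that asymmetric (0, 1) padding can be checked with plain NumPy (a sketch, not MegEngine code): for a stride-2, kernel-3 convolution on an even-sized input, symmetric (1, 1) and asymmetric (0, 1) padding give the same output size but different alignment, which is why the converted weights only match with the asymmetric variant:

```python
import numpy as np

def out_size(n, k, s):
    """Output length of a valid (unpadded) convolution along one axis."""
    return (n - k) // s + 1

x = np.zeros((1, 64, 16, 16))
# Symmetric padding: one pixel on every spatial side.
sym = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)))
# Asymmetric padding: bottom/right only, as TF 'SAME' produces for this case.
asym = np.pad(x, ((0, 0), (0, 0), (0, 1), (0, 1)))

print(sym.shape, asym.shape)                # (1, 64, 18, 18) (1, 64, 17, 17)
print(out_size(sym.shape[-1], k=3, s=2))    # 8 -- same output size...
print(out_size(asym.shape[-1], k=3, s=2))   # 8 -- ...but shifted receptive fields
```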

    Btw, I'm also a beginner in ddpm, your blog helps me a lot!

    Originally posted by @Asthestarsfalll in https://github.com/MegEngine/MegDiffusion/issues/5#issuecomment-1193254961

    opened by ChaiEnjoy 3
  • Gradient clipping issues in MegEngine v1.9.x

    Description

    Training with a single GPU & gradient clipping in this codebase causes an error in MegEngine v1.9.x. After one iteration of autodiff & parameter update, the next forward pass of the model breaks. Error message:

    RuntimeError: assertion `filter.ndim == img_ndim + 2 || filter.ndim == img_ndim + 4' failed at ../../../../../../imperative/src/impl/ops/convolution.cpp:61: megdnn::TensorLayout mgb::imperative::{anonymous}::convolution::do_shape_infer(const mgb::imperative::OpDef&, size_t, megdnn::TensorLayout, megdnn::TensorLayout)
    extra message: bad filter ndim for dense convolution: spatial_ndim=2 filter_ndim=0
    

    Here is the simplest example to reproduce this problem:

    import megengine
    import megengine.functional as F
    import megengine.module as M
    import megengine.optimizer as optim
    import megengine.autodiff as autodiff
    
    megengine.async_level = 0
    
    class SimpleModel(M.Module):
        def __init__(self, in_ch):
            super().__init__()
            self.conv1 = M.Conv2d(in_ch, in_ch, 3, stride=1, padding=1)
            self.conv2 = M.Conv2d(in_ch, in_ch, 3, stride=1, padding=1)
    
        def forward(self, x):
            x = self.conv1(x)
            x = F.nn.interpolate(x, scale_factor=1, mode="nearest")
            x = self.conv2(x)
            return x
    
    if __name__ == "__main__":
        x = F.ones((1, 1, 2, 2))
        model = SimpleModel(in_ch = 1)
    
        optimizer = optim.SGD(model.parameters(), lr=1e-3)
        gm = autodiff.GradManager()
        gm.attach(model.parameters())
    
        with gm:
            loss = model(x) + 0
            gm.backward(loss)
            
        optim.clip_grad_norm(model.parameters(), max_norm=1.)
        optimizer.step()
        y = model(x)
    

    Workarounds

    • Solution 1: Comment this line in megdiffusion.scripts.train:

      optim.clip_grad_norm(model.parameters(), FLAGS.grad_clip)
      

      Then we can train the model without gradient clipping. (But that's not what we want... 😣)

    • Solution 2: Use distributed training -- this situation does not happen there.

    • Solution 3: Try changing loss = model(x) + 0 to loss = model(x) 🤔🤔🤔

    • Solution 4: Try deleting x = F.nn.interpolate(x, scale_factor=1, mode="nearest") 🤔🤔🤔
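    For reference, gradient clipping by global norm (what `optim.clip_grad_norm` implements) computes the L2 norm over all gradients and scales them down when it exceeds `max_norm`. A framework-independent NumPy sketch of that rule:

```python
import numpy as np

def clip_grad_norm(grads, max_norm):
    """Scale all gradients in place so their global L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(float((g ** 2).sum()) for g in grads))
    scale = max_norm / (total_norm + 1e-6)  # small eps guards against division by zero
    if scale < 1.0:
        for g in grads:
            g *= scale
    return total_norm

grads = [np.ones((4, 4)), np.ones((8,))]
norm_before = clip_grad_norm(grads, max_norm=1.0)
norm_after = np.sqrt(sum(float((g ** 2).sum()) for g in grads))
print(norm_before, norm_after)  # ~4.90 before, ~1.0 after
```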

    Issue Track

    This problem was fixed in https://github.com/MegEngine/MegEngine/commit/df5ebd3da7495b8eb8f079651dbe980c5f4d7d37, so you can wait for the MegEngine v1.10 release or build MegEngine from source at a commit newer than that one.

    bug 
    opened by ChaiEnjoy 1
  • Handle checkpoint-saving failures

    If the machine is preemptible, it might be scheduled to be preempted (or go down for other reasons). If a checkpoint is being saved at that exact moment, the original data will be corrupted. It is therefore reasonable to keep multiple local backups. Considering disk-space usage, it would be even better to support cloud storage such as AWS S3.
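    One way to realize the multiple-backup idea is to write each checkpoint to a temporary file, atomically rename it into place, and keep only the newest N copies. A minimal pure-Python sketch (filenames and the `save_checkpoint` helper are hypothetical, not the project's API):

```python
import os
import shutil
import tempfile

def save_checkpoint(state: bytes, logdir: str, step: int, keep: int = 3) -> str:
    """Atomically write a checkpoint and prune old ones, keeping the newest `keep`."""
    os.makedirs(logdir, exist_ok=True)
    path = os.path.join(logdir, f"ckpt-{step:08d}.pkl")
    # Write to a temp file first, so a crash mid-write never corrupts `path`.
    fd, tmp = tempfile.mkstemp(dir=logdir)
    with os.fdopen(fd, "wb") as f:
        f.write(state)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename on POSIX
    # Prune: zero-padded step numbers make lexical sort equal chronological sort.
    ckpts = sorted(p for p in os.listdir(logdir) if p.startswith("ckpt-"))
    for old in ckpts[:-keep]:
        os.remove(os.path.join(logdir, old))
    return path

# Usage sketch with dummy checkpoint bytes:
d = tempfile.mkdtemp()
for step in range(5):
    save_checkpoint(b"fake-model-bytes", d, step, keep=3)
print(sorted(os.listdir(d)))  # only the three newest checkpoints remain
shutil.rmtree(d)
```

    Uploading the pruned-away copies to S3 before deletion would add the cloud-storage layer suggested above.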

    opened by ChaiEnjoy 0
Releases(v0.0.2)
  • v0.0.2(Aug 4, 2022)

    Try using MegDiffusion to sample images with a DDPM pretrained model. (It's still a demo script for now.)

    python3 -m megdiffusion.pipeline.ddpm.sample --config ./configs/ddpm/cifar10.yaml
    

    Other configs are available here.

    Thanks to @Asthestarsfalll for converting the original TensorFlow checkpoints to MegEngine (from @pesser's PyTorch pretrained diffusion models and the pretrained models on Hugging Face provided by Google).

    Source code(tar.gz)
    Source code(zip)
Owner
旷视天元 MegEngine