Repo for external large-scale work

Overview

Metaseq

A codebase for working with Open Pre-trained Transformers.

Using OPT with 🤗 Transformers

The OPT models from 125M to 30B parameters are now available in Hugging Face Transformers.
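
A minimal usage sketch (the facebook/opt-125m Hub model name is an assumption based on the standard Hugging Face naming for these checkpoints, not something this README documents):

    # Hedged sketch: text generation with one of the OPT checkpoints via Hugging Face
    # Transformers; assumes a recent transformers release that includes the OPT architecture.
    from transformers import pipeline

    generator = pipeline("text-generation", model="facebook/opt-125m")
    print(generator("Hello, I am conscious and", max_new_tokens=20))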

Getting Started in Metaseq

Follow setup instructions here to get started.

Documentation on workflows

Background Info

Support

If you have any questions, bug reports, or feature requests regarding either the codebase or the models released in the projects section, please don't hesitate to post on our GitHub Issues page.

Please remember to follow our Code of Conduct.

Contributing

We welcome PRs from the community!

You can find information about contributing to metaseq in our Contributing document.

The Team

Metaseq is currently maintained by the CODEOWNERS: Susan Zhang, Stephen Roller, Anjali Sridhar, Naman Goyal, Punit Singh Koura, Moya Chen, and Christopher Dewan.

License

The majority of metaseq is licensed under the MIT license; however, portions of the project are available under separate license terms.

Comments
  • [Community] OPT Inference in HF Transformers

    You can now use the OPT models in Hugging Face Transformers

    Go here for details: https://twitter.com/huggingface/status/1524783489593360385

    (Edited by admin. Original post below)


    We're working hard at Hugging Face on adding all the checkpoints to Transformers. Thanks to @stephenroller and co., we've now managed to correctly convert the checkpoints. They are all uploaded here: https://huggingface.co/models?other=opt_metasq

    If you go into a specific repo, you'll find a detailed explanation on how to run them.

    question 
    opened by patrickvonplaten 28
  • [scripts] Convert resharded MP checkpoints to unflattened.

    Patch Description: Adds a new script which meets the requirements of #31.

    Testing steps: Ran on 125m. The usage example in the docstring shows real output.

    cla signed 
    opened by stephenroller 27
  • 175B model outputting gibberish?

    🐛 Bug

    To Reproduce

    I'm running python3 -m metaseq_cli.interactive_hosted on a DGX machine, not from SLURM. To avoid the SLURM initialization path I'm setting force_distributed=True to hit _infer_single_node_init.

    I'm using the defaults from constants.py except:

    • I commented out --distributed-port (to avoid SLURM init)
    • MODEL_SHARED_FOLDER = "/home/hlang/opt_175b_consolidated/"
    • I set CHECKPOINT_LOCAL = os.path.join(MODEL_SHARED_FOLDER, "reshard.pt")
      • putting "reshard.pt" at the end here seemed very important? That way in load_model_ensemble_and_task this line:
       filename = filename.replace(".pt", suffix + ".pt")
      

      correctly sets the filename to be reshard-model_part-{model_part_number}.pt for each worker. This is the main part I'm not sure about?
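
    For illustration, here is a small standalone sketch (not metaseq code) of how that suffix substitution turns a CHECKPOINT_LOCAL ending in reshard.pt into the per-worker filenames:

    import os

    # Values taken from the setup described above.
    MODEL_SHARED_FOLDER = "/home/hlang/opt_175b_consolidated/"
    CHECKPOINT_LOCAL = os.path.join(MODEL_SHARED_FOLDER, "reshard.pt")

    # Mirrors the filename.replace(".pt", suffix + ".pt") line quoted above,
    # once per model-parallel worker (8 parts for the consolidated 175B checkpoint).
    for model_part_number in range(8):
        suffix = f"-model_part-{model_part_number}"
        print(CHECKPOINT_LOCAL.replace(".pt", suffix + ".pt"))
    # -> .../reshard-model_part-0.pt through .../reshard-model_part-7.pt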

    MODEL_SHARED_FOLDER looks like:

    (base) $ ls -1 /home/hlang/opt_175b_consolidated/
    dict.txt
    gpt2-merges.txt
    gpt2-vocab.json
    reshard-model_part-0.pt
    reshard-model_part-1.pt
    reshard-model_part-2.pt
    reshard-model_part-3.pt
    reshard-model_part-4.pt
    reshard-model_part-5.pt
    reshard-model_part-6.pt
    reshard-model_part-7.pt
    

    I got the gpt2-* files from assets/ and the dict.txt from the file @stephenroller posted in https://github.com/facebookresearch/metaseq/issues/19.

    That all seems to work: I get all the way to:

    2022-05-10 03:00:29 | INFO | metaseq.modules.fused_bias_gelu | Done with compiling and loading fused kernels.
    2022-05-10 03:00:39 | INFO | metaseq.checkpoint_utils | Done loading state dict
    2022-05-10 03:00:39 | INFO | metaseq_cli.interactive | loaded model 0
    2022-05-10 03:00:39 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:12 to store for rank: 0
    2022-05-10 03:00:47 | INFO | torch.distributed.distributed_c10d | Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 8 nodes.
    2022-05-10 03:00:50 | INFO | metaseq_cli.interactive | Worker engaged! 192.168.2.1:6010
     * Serving Flask app 'interactive_hosted' (lazy loading)
     * Environment: production
       WARNING: This is a development server. Do not use it in a production deployment.
       Use a production WSGI server instead.
     * Debug mode: off
    

    Running a prompt

    With the server up and running, sending:

    wget -O- --post-data='{"prompt": "Paris is the capital of ", "max_tokens": "1", "temperature": "0.0"}' --header='Content-Type: application/json' localhost:6010/completions
    

    gives the response:

    "choices":[{
       "logprobs":{
          "finish_reason":"length",
          "text_offset":[0],
          "token_logprobs":[56.087547302246094],
          "tokens":[" hoped"],
          "top_logprobs":null},
       "text":" hoped"}]
    
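
    Equivalently, the same completion call can be made from Python (assuming the requests package and the server above listening on localhost:6010), which avoids the shell-quoting pitfalls of the wget command:

    import requests

    # Same request as the wget call above, with the JSON body built by the library.
    resp = requests.post(
        "http://localhost:6010/completions",
        json={"prompt": "Paris is the capital of ", "max_tokens": "1", "temperature": "0.0"},
    )
    print(resp.json())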

    The server logs for this request look pretty benign (should "input" say something else though?):

    2022-05-10 03:44:32 | INFO | metaseq.hub_utils | Preparing generator with settings {'_name': None, 'beam': 1, 'nbest': 1, 'max_len_a': 0, 'max_len_b': 8, 'min_len': 7, 'sampling': True, 'sampling_topk': -1, 'sampling_topp': 1.0, 'temperature': 1.0, 'no_seed_provided': False, 'buffer_size': 4194304, 'input': '-'}
    2022-05-10 03:44:32 | INFO | metaseq.hub_utils | Executing generation on input tensor size torch.Size([1, 7])
    2022-05-10 03:44:32 | INFO | metaseq.hub_utils | Total time: 0.312 seconds; generation time: 0.309
    2022-05-10 03:44:32 | INFO | werkzeug | 127.0.0.1 - - [10/May/2022 03:44:32] "POST /completions HTTP/1.1" 200 -
    

    Other queries gave similarly weird text:

    wget -O- --post-data='{"prompt": "Review: This movie was terrible! Sentiment: ", "max_tokens": "10"}' --header='Content-Type: application/json' localhost:6010/completions
    

    yields: "text":" play little to become heard sure intended reg to seriously"

    bug 
    opened by hunterlang 17
  • 175B opt model seems to output gibberish

    ❓ Questions and Help

    What is your question?

    We recently downloaded the open-sourced OPT-175B weights, managed to set them up using our own Alpa system, and tried to play with them. Thanks for the effort of making it public!

    Unfortunately, we found that the 175B model outputs gibberish. An example:

    Prompt: Computer science is the study of computation
    Output: Computer science is the study of computation and a gotten hoped planned done hoped also to practice to to a come to become nothing to to good taken become hoped to placed originally better only to to just gotten better just given done issues hoped good come come some'

    The way we process the 175B model is as follows:

    1. use the scripts/reshard_mp.py to merge FSDP shards (192) into one for each model-parallel part, resulting in 8 parts.
    2. use the scripts/consolidate_fsdp_shard.py to merge the 8 parts into one consolidated version (the core function used here is glue_megatron_parts())
    3. then we load this model into our system.

    All the other models processed following this procedure (from 125M to 30B ones) work pretty well, except that the 175B outputs gibberish.

    We also cross-checked with HuggingFace's implementation for models from 125M to 30B -- the output logits are the same across models. Hence it shouldn't be a problem with the Alpa system.

    Here is our model spec: https://github.com/alpa-projects/alpa/blob/torch_transformer_model/playground/model/opt_model.py#L499

    We're wondering if you could give any hints? Several questions:

    1. we downloaded the 992 shards about 2 weeks ago. Since then, have you updated the weights?
    2. are there any architectural changes specific to the 175B weights we're unaware of?

    What's your environment?

    • metaseq Version (e.g., 1.0 or master): master
    • PyTorch Version (e.g., 1.0): 1.10
    • OS (e.g., Linux): Ubuntu
    • How you installed metaseq (pip, source): source
    • Build command you used (if compiling from source):
    • Python version: 3.8
    • CUDA/cuDNN version: 11.1
    • GPU models and configuration: 32x V100
    • Any other relevant information:
    question 
    opened by zhisbug 14
  • Add Aim logging to progress_bar

    Patch Description

    • Added AimProgressBarWrapper class; an implementation of BaseProgressBar using Aim, an open-source experiment tracker: https://github.com/aimhubio/aim.
    • Added two new arguments to CommonConfig used to control AimProgressBarWrapper creation/usage:
      • aim_repo: path to the Aim repository.
      • aim_run_hash: if specified, it will be used to determine the aim.Run for metadata tracking. If skipped, checkpoint.save_dir will be used to find and re-open the existing Run; if no matching Run is found, a new one is created.

    Testing steps: Make sure progress_bar logs data to the selected outputs, with the Aim logger enabled and disabled. Run the unit tests.

    • [ ] Was this discussed/approved via a Github issue?
    • [x] Did you read the contributor guideline?
    • [ ] Did you make sure to update the docs?
    • [ ] Did you write any new necessary tests?
    cla signed 
    opened by alberttorosyan 14
  • Loading models

    ❓ Questions and Help

    Before asking:

    1. search the issues.
    2. search the docs.

    What is your question?

    I am trying to load the OPT models in a similar way as the fairseq models, but I seem to be hitting quite a few snags. Does anyone have a good example of how to load these models?

    Code

    from metaseq.models.transformer_lm import TransformerLanguageModel
    model = TransformerLanguageModel.from_pretrained("./model_location")
    

    What have you tried?

    I have already tried various alternatives, but I always get a notification that the model could not be loaded, that it's missing something, etc.

    What's your environment?

    • fairseq Version (e.g., 1.0 or master): master
    • PyTorch Version (e.g., 1.0): 1.11
    • OS (e.g., Linux): Linux
    • How you installed fairseq (pip, source): source
    • Build command you used (if compiling from source): pip install -e .
    • Python version: 3.9
    • CUDA/cuDNN version: N/A
    • GPU models and configuration: N/A
    • Any other relevant information: Just wanting to load the weights for conversion.
    question 
    opened by mrseeker 14
  • Availability of 175B model

    Hi,

    I filled out the Google form for accessing the 175B model last week from my corporate email ID, but I haven't heard anything since. Does anyone know how long it takes to be granted (or denied) model access? Thanks

    question 
    opened by Mrs-Hudson 11
  • continue opt175b download from the latest available file

    Patch Description: Adds a script to continue the download after the latest available opt175b file in the case of an internet issue or interruption.

    Testing steps: The following cases are tested:

    • the file does not exist.
    • the file exists with zero bytes, in the case of running the script without internet.

    related request #288

    cla signed 
    opened by Barqawiz 10
  • OPT 175B inference using accelerate

    What is your question?

    We have run the OPT 30B model for inference using the accelerate library with a multi-GPU configuration. Reference notebook: Accelerate_OPT. So, can we use accelerate to run the OPT 175B model for inference by loading the consolidated shards?
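
    For reference, a minimal sketch of the 30B multi-GPU inference setup described above (assuming the facebook/opt-30b Hub weights and transformers with accelerate installed; this is not the notebook's exact code):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # device_map="auto" lets accelerate place the model's layers across all visible GPUs.
    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-30b", use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(
        "facebook/opt-30b",
        device_map="auto",
        torch_dtype=torch.float16,
    )

    inputs = tokenizer("Paris is the capital of", return_tensors="pt").to(0)
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=10)[0]))

    Whether the same device_map approach scales to the 175B consolidated shards is exactly the open question here.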

    What's your environment?

    • metaseq Version (e.g., 1.0 or master):
    • PyTorch Version (e.g., 1.0):
    • OS (e.g., Linux): Linux
    • How you installed metaseq (pip, source): source
    • Build command you used (if compiling from source): Referred to this link: setup
    • Python version: 3.7.12
    • CUDA/cuDNN version: 11.2
    • GPU models and configuration: 16 A100 40GB
    • Any other relevant information:

    question 
    opened by BalajiAJ 10
  • How to load sharded checkpoints?

    ❓ Questions and Help

    After having set up the libraries as described in https://github.com/facebookresearch/metaseq/blob/main/docs/setup.md, it is possible to load the 350m checkpoint (since it's not sharded) as follows:

    wget https://dl.fbaipublicfiles.com/opt/v1_20220502/350m/reshard.pt ./
    
    1. Next we need to comment out one line in the Megatron-LM library which is only relevant for training (initializing different random seeds across pp ranks): comment out this line: https://github.com/ngoyal2707/Megatron-LM/blob/ae0b844c1f6725c3433a95e42cac760b3885170b/megatron/initialize.py#L65 in your local clone of Megatron-LM.

    2. Now we write the following Python script to a run_model.py file:

    import os
    
    from transformers import AutoTokenizer, GPT2Tokenizer
    from megatron.initialize import initialize_megatron
    from metaseq import checkpoint_utils
    import torch
    
    path = "./"
    
    # arguments taken from: https://arxiv.org/pdf/2205.01068.pdf | table 1
    initialize_megatron(args_defaults={
        "micro_batch_size": 1, 
        "num_layers": 24, 
        "hidden_size": 1024, 
        "num_attention_heads": 16,
        "max_position_embeddings": 2048, 
        "encoder_seq_length": 2048 
    })
    
    tokenizer = GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")
    tokenizer.save_pretrained(path)
    
    checkpoint = checkpoint_utils.load_model_ensemble_and_task(
        [os.path.join(path, "reshard.pt")],
        arg_overrides={
            "vocab_filename": os.path.join(path, "vocab.json"),
            "merges_filename": os.path.join(path, "merges.txt"),
        }
    )
    
    model = checkpoint[0][0].eval()
    
    3. We can load the checkpoint when running:
    torchrun run_model.py --pipeline-model-parallel-size 1 --tensor-model-parallel-size 1
    

    Problem: This only works for the 350m checkpoint! For the other checkpoints it doesn't work. E.g., when replacing [os.path.join(path, "reshard.pt")] with [os.path.join(path, "reshard-model_part-0.pt"), os.path.join(path, "reshard-model_part-1.pt")] (part 0 and part 1 of the 125M model), we get an error because the weights are all flattened into 1D arrays.

    Using https://github.com/facebookresearch/metaseq/pull/29 sadly also doesn't help, since the checkpoints don't seem to be in the *shard* format as required here: https://github.com/facebookresearch/metaseq/blob/48b9b6c083237f9b95c2eb67afc10005e10d67ee/metaseq/distributed/stitch_fsdp_ckpt.py#L45

    The parameter flattening seems to come from Fairscale, and we've found some functionality to unflatten it here: https://github.com/facebookresearch/fairscale/blob/51b53ddb6c3aa77426c7d5cc0b543b79628053c4/fairscale/nn/misc/flatten_params_wrapper.py#L358 , but we haven't managed to wrap our heads around how to make it work exactly.

    @stephenroller @suchenzang @zhiqwang - any pointers on how we could load the 125M model (and the others) into a model instance of metaseq?

    question 
    opened by patrickvonplaten 9
  • 125m checkpoint outputting gibberish

    Converting the sharded checkpoints of 125m to a singleton checkpoint with https://github.com/facebookresearch/metaseq/pull/60:

        $ ls 125m
        dict.txt
        gpt2-merges.txt
        gpt2-vocab.json
        reshard-model_part-0.pt
        reshard-model_part-1.pt
        $ python -m metaseq.scripts.convert_to_singleton 125m
    

    gives a new

    restored.pt
    

    file.

    I then transformed the checkpoint into the same format as 350m to test some generation on it:

    import torch
    orig_state = torch.load("./reshard-model_part-0.pt")
    model = torch.load("./restored.pt")
    
    orig_state["model"] = model  # this format allows one to use the standard `checkpoint_utils.load_model_ensemble_and_task` function
    orig_state["cfg"]["model"]._name = "transformer_lm"  # we change the architecture name to "transformer_lm" to be able to run it in a non-CUDA environment
    torch.save(orig_state, "./reshard.pt")
    

    I tried running an inference example on the model to see whether the generation works as expected. Here the code:

    import os
    
    from transformers import GPT2Tokenizer
    from metaseq import checkpoint_utils
    import torch
    
    path = "/home/patrick/add_opt"
    
    """
    $ ls path
    vocab.json
    merges.txt
    reshard.pt
    """
    
    
    tokenizer = GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")
    tokenizer.save_pretrained(path)
    
    paths = [os.path.join(path, "reshard.pt")]
    
    checkpoint = checkpoint_utils.load_model_ensemble_and_task(
        paths,
        arg_overrides={
            "vocab_filename": os.path.join(path, "vocab.json"),
            "merges_filename": os.path.join(path, "merges.txt"),
        }
    )
    
    model = checkpoint[0][0].eval()
    
    
    # forward passes
    def single_batch_forward_logits(prompts):
        input_ids = tokenizer(prompts, return_tensors="pt").input_ids
        input_ids = torch.cat([torch.tensor([[2]]), input_ids], dim=-1)
        logits = model(input_ids)[0]
        return logits
    
    
    prompts = [
        "Today is a beautiful day and I want to",
        "In the city of",
        "Paris is the capital of France and",
        "Computers and mobile phones have taken",
    ]
    
    
    print("Next word generation")
    for prompt in prompts:
        print("-------------")
        print(f"Prompt: {prompt}...\n")
        logits = single_batch_forward_logits(prompt)
        pred_next_token = torch.argmax(logits[0, -1], -1)
        next_token = tokenizer.convert_ids_to_tokens([pred_next_token])
        next_token = next_token[0].replace("Ġ", "")
        print(f"Next word: {next_token}")
        print("-------------")
    

    This sadly gives gibberish:

    Next word generation
    -------------
    Prompt: Today is a beautiful day and I want to...
    
    Next word: Robbins
    -------------
    -------------
    Prompt: In the city of...
    
    Next word: of
    -------------
    -------------
    Prompt: Paris is the capital of France and...
    
    Next word: Robbins
    -------------
    -------------
    Prompt: Computers and mobile phones have taken...
    
    Next word: Robbins
    -------------
    

    Note that this script works perfectly fine with the 350m checkpoint.

    @stephenroller - any ideas?

    bug 
    opened by patrickvonplaten 8
  • Changes flag

    We need Changes=true during training for doc attention to work, and Changes=False during eval. This PR handles that using an environment variable called EVAL.
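
    A minimal sketch of the switch this describes (only the EVAL variable name comes from the PR text; how the Changes flag is actually consumed inside metaseq is not shown here):

    import os

    # Assumption: setting EVAL in the environment flips the flag off for evaluation.
    changes = "EVAL" not in os.environ  # True during training, False when EVAL is set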

    cla signed 
    opened by sriniiyer 1
  • E2E tests for training resumption

    🚀 Feature Request

    Adding tests for restarting a training run after some interruption

    Motivation

    Restoring training runs from checkpoints is a critical feature for large-scale models and has multiple steps, including downloading the checkpoint from blob storage, advancing the data loader to the correct token, storing checkpoints after certain conditions are met, etc.

    Tests would allow us to safeguard this critical flow against potentially breaking changes.

    Pitch

    A CI test that trains a small model, stores a checkpoint file to blob storage, downloads that checkpoint file, and successfully continues the training run.

    Alternatives

    Unit-tests that cover each step separately.

    Additional context

    https://github.com/facebookresearch/metaseq/commit/e3ea5070a8c1bae77703aef7fc0f5537bd437963 -- this commit caused some checkpoint files to stop being stored.

    enhancement help wanted 
    opened by ruanslv 2
  • Python package location issue during setup

    Hi, I have some kind of problem installing Metaseq on Ubuntu Jammy. It looks like some packages are being installed in a place where the ./run.sh script (for running the OPT-2.7B model, following this guide: https://www.pragnakalp.com/exploring-the-text-generation-with-opt-open-pre-trained-transformers/) can't find them. I've tried uninstalling metaseq and reinstalling following the current official repo here: https://github.com/facebookresearch/metaseq/blob/main/docs/setup.md. I'm using a virtual env.

    Basically when running ./run.sh I get this error: "ImportError: cannot import name 'data_dir' from 'helpers' (/home/user/.local/lib/python3.10/site-packages/helpers/__init__.py)". If I instead use sudo with sudo pip3 install -e ., I get a warning about installing as root but this error goes away. This suggests to me that helpers is installed in a directory only accessible by root, though this is the first time I installed a pip package with sudo.

    So I tried reinstalling with no sudo and I get a couple of warnings/errors: "Defaulting to user installation because normal site-packages is not writeable" and "[Errno 13] Permission denied: '/usr/local/lib/python3.10/dist-packages/test-easy-install-294163.write-test'"

    So I tried rolling with it and ran ./run.sh as sudo, but then got "ModuleNotFoundError: No module named 'torch._C'", even though pip show torch reports "Version: 1.12.1+cu116" at /home/user/.local/lib/python3.10/site-packages (a user install location, not root).

    I'm trying some other LLM installs, but wanted to report here in case either there is a simple fix, or others run into the same issue. Thanks

    bug 
    opened by auwsom 3
  • Any recommended way to improve training speed on hardware with low VRAM?

    ❓ Questions and Help

    Before asking:

    1. search the issues.
    2. search the docs.

    What is your question?

    Hi, I am working on training a 10B model with limited resources (16 x A100 40GB). The problem I am facing is that I cannot achieve the FLOP/s target (130T/s ~ 150T/s). I have tuned parameters, but the most prominent speed-up comes from reducing the model size and increasing the batch size. So I am thinking the reason may be the small batch size I have to use to accommodate the lower VRAM.

    n params (B) | hidden | ffw | # heads | # layers | # tensor parallel | batch size | wps | Tflop/s/A100
    -- | -- | -- | -- | -- | -- | -- | -- | --
    8.172 | 4096 | 16384 | 32 | 40 | 2 | 8 | 17k | 69
    8.172 | 4096 | 16384 | 32 | 40 | 4 | 16 | OOM | OOM
    4.144 | 4096 | 16384 | 32 | 20 | 2 | 16 | 43k | 89
    4.144 | 4096 | 16384 | 32 | 20 | 4 | 32 | 27k | 56

    The most straightforward way to validate this is to increase the parallel size. However, from my observations, increasing the tensor parallel size from 2 to 4 only slows down training. Is this expected? If it is, is there any other way to improve the training speed here?

    Seq_len is 2048. FLOP/s calculation: wps * n_params * 8 / n_gpus.
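
    As a sanity check of that formula against the first row of the table above:

    # FLOP/s per GPU = wps * n_params * 8 / n_gpus, using the first table row.
    wps = 17_000          # words per second
    n_params = 8.172e9    # model parameters
    n_gpus = 16
    print(wps * n_params * 8 / n_gpus / 1e12)  # ~69.5 TFLOP/s per A100, matching the table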

    Code

    What have you tried?

    What's your environment?

    • metaseq Version (e.g., 1.0 or master): master
    • PyTorch Version (e.g., 1.0) nightly 1.13.0a0+d321be6
    • OS (e.g., Linux): Ubuntu 20.04
    • How you installed metaseq (pip, source): source
    • Build command you used (if compiling from source): pip install . -e
    • Python version: 3.9
    • CUDA/cuDNN version: 450.80.02/11.7/8600
    • GPU models and configuration: A100 40GB * 16
    • Any other relevant information:
    question 
    opened by QIU-Shuo 1
  • Delete Heartbeat Timeout

    The heartbeat timeout is no longer necessary. It has since been replaced by simple use of the NCCL_ASYNC_ERROR_HANDLING=1 environment variable.

    Let's clean up this dead code.

    help wanted 
    opened by stephenroller 5