BanglaNLG

This repository contains the official release of the model "BanglaT5" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaNLG: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla".

Models

The BanglaT5 model checkpoint is available on the Hugging Face model hub.

To use this model for the downstream tasks supported in this repository, see Training & Evaluation.

We also release the following finetuned checkpoints:

Model Name          Task Name
banglat5_nmt_bn_en  Bengali-English MT
banglat5_nmt_en_bn  English-Bengali MT

Note: This model was pretrained using a specific normalization pipeline available here. All finetuning scripts in this repository use this normalization by default. If you need to adapt the pretrained model for a different task, make sure the text units are normalized with this pipeline before tokenizing to get the best results. A basic example is available at the model page.
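The ordering described in the note (normalize first, then tokenize) can be sketched as below. This is a minimal illustration, not the repository's code: `placeholder_normalize` is a hypothetical stand-in built on Unicode NFKC and is not the normalization pipeline used for pretraining, and `toy_tokenizer` is a hypothetical whitespace splitter. In real use, pass the actual normalization function and the model's tokenizer instead.

```python
import unicodedata
from typing import Callable, List


def placeholder_normalize(text: str) -> str:
    """Hypothetical stand-in normalizer (NOT the pretraining pipeline).

    Applies Unicode NFKC and collapses runs of whitespace so that the
    sketch runs without extra dependencies.
    """
    return " ".join(unicodedata.normalize("NFKC", text).split())


def normalize_batch(
    texts: List[str],
    normalize_fn: Callable[[str], str] = placeholder_normalize,
) -> List[str]:
    """Normalize every text unit before it reaches the tokenizer."""
    return [normalize_fn(t) for t in texts]


def encode_batch(texts, tokenizer, normalize_fn=placeholder_normalize):
    """Normalize first, then tokenize.

    `tokenizer` is any callable that accepts a list of strings.
    """
    return tokenizer(normalize_batch(texts, normalize_fn))


def toy_tokenizer(batch):
    """Hypothetical whitespace tokenizer used only to make the sketch runnable."""
    return [s.split() for s in batch]


if __name__ == "__main__":
    print(encode_batch(["  আমি   ভাত  খাই "], toy_tokenizer))
```

Keeping the normalizer as a parameter makes it easy to swap the placeholder for the real pipeline without touching the tokenization step.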

Setup

To install the necessary requirements, run the following commands:

$ git clone https://github.com/csebuetnlp/BanglaNLG
$ cd BanglaNLG/
$ conda create python==3.7.9 pytorch==1.8.1 torchvision==0.9.1 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch -p ./env
$ conda activate ./env # or source activate ./env (for older versions of anaconda)
$ bash setup.sh 
  • Use the newly created environment for running the scripts in this repository.

Training & Evaluation

While all tasks we consider are modeled as seq2seq tasks, some tasks need specific data preprocessing for preparing the input and output sequences. See below for task-specific finetuning/inference scripts:

  • Sequence to Sequence
    • For general sequence-to-sequence tasks such as:
      • Machine Translation
      • Text Summarization
      • News Headline Generation, etc.
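Because every task is modeled as seq2seq, each training example reduces to one input sequence and one output sequence. The sketch below shows one possible way to serialize such pairs as JSON Lines. The JSON Lines layout and the key names `source` and `target` are assumptions made for illustration and are not necessarily what the task scripts expect, so check the task-specific instructions before preparing real data.

```python
import json
from typing import Iterable, List, Tuple


def to_jsonl_lines(
    pairs: Iterable[Tuple[str, str]],
    source_key: str = "source",  # hypothetical key name
    target_key: str = "target",  # hypothetical key name
) -> List[str]:
    """Turn (input, output) pairs into JSON Lines records.

    Pairs with an empty side are skipped, since a seq2seq example
    needs both an input and an output sequence.
    """
    lines = []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if not src or not tgt:
            continue
        # ensure_ascii=False keeps Bangla text readable in the output file.
        lines.append(json.dumps({source_key: src, target_key: tgt}, ensure_ascii=False))
    return lines


pairs = [
    ("আমি ভাত খাই", "I eat rice"),  # translation-style pair
    ("", "orphan target"),          # incomplete pair, dropped
]
lines = to_jsonl_lines(pairs)

if __name__ == "__main__":
    print("\n".join(lines))
```

The same structure covers summarization or headline generation by placing the article text in the input field and the summary or headline in the output field.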

Benchmarks

  • Supervised fine-tuning
Model           Params  MT (SacreBLEU)  TS (ROUGE-2)  QA (EM/F1)  MD (SacreBLEU-1)  NHG (ROUGE-2)  XLS (ROUGE-2)  BNLG score
mT5 (base)      582M    36.6/22.5       10.3          59.0/65.3   17.5              9.6            2.7/0.7        24.9
XLM-ProphetNet  616M    23.3/16.4       7.8           53.0/57.3   20.0              9.5            6.2/2.7        21.8
mBART-50        611M    23.6/16.7       10.4          53.4/58.9   18.5              11.2           5.4/3.7        22.4
IndicBART       244M    22.7/13.1       8.1           53.3/58.8   14.8              7.9            6.3/2.5        20.8
BanglaT5        247M    38.8/25.2       13.7          68.5/74.8   19.0              13.8           6.4/4.0        29.4
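The numbers in the table are consistent with the BNLG score being the unweighted mean of the nine per-task values in each row, with slash-separated entries counted as two values. This reading is inferred from the table itself rather than from a definition given in this section, so treat the sketch below as a sanity check and not as the official scoring code.

```python
from statistics import mean

# Per-task values copied from the table, in column order:
# MT (two values), TS, QA (EM, F1), MD, NHG, XLS (two values).
scores = {
    "mT5 (base)":     [36.6, 22.5, 10.3, 59.0, 65.3, 17.5, 9.6, 2.7, 0.7],
    "XLM-ProphetNet": [23.3, 16.4, 7.8, 53.0, 57.3, 20.0, 9.5, 6.2, 2.7],
    "mBART-50":       [23.6, 16.7, 10.4, 53.4, 58.9, 18.5, 11.2, 5.4, 3.7],
    "IndicBART":      [22.7, 13.1, 8.1, 53.3, 58.8, 14.8, 7.9, 6.3, 2.5],
    "BanglaT5":       [38.8, 25.2, 13.7, 68.5, 74.8, 19.0, 13.8, 6.4, 4.0],
}

# BNLG scores as reported in the last column of the table.
reported = {
    "mT5 (base)": 24.9,
    "XLM-ProphetNet": 21.8,
    "mBART-50": 22.4,
    "IndicBART": 20.8,
    "BanglaT5": 29.4,
}


def bnlg_estimate(values):
    """Unweighted mean of the per-task values, rounded to one decimal."""
    return round(mean(values), 1)


if __name__ == "__main__":
    for model, values in scores.items():
        print(f"{model}: estimated {bnlg_estimate(values)}, reported {reported[model]}")
```

For every row, the estimate matches the reported score to one decimal place.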

The benchmarking datasets are as follows:

License

Contents of this repository are restricted to non-commercial research purposes only under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Citation

If you use any of the datasets, models or code modules, please cite the following paper:

@article{bhattacharjee2022banglanlg,
  author    = {Abhik Bhattacharjee and Tahmid Hasan and Wasi Uddin Ahmad and Rifat Shahriyar},
  title     = {BanglaNLG: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla},
  journal   = {CoRR},
  volume    = {abs/2205.11081},
  year      = {2022},
  url       = {https://arxiv.org/abs/2205.11081},
  eprinttype = {arXiv},
  eprint    = {2205.11081}
}