✂️ Fast slice finding for Machine Learning model debugging.

Overview

Sliceline

Sliceline is a Python library for fast slice finding for Machine Learning model debugging.

It implements the approach described in SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging, by Svetlana Sagadeeva and Matthias Boehm of Graz University of Technology.

👉 Getting started

Given an input dataset X and a model error vector errors, SliceLine finds the top slices in X, that is, the subsets of the data where an ML model performs significantly worse.

You can use sliceline as follows:

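from sliceline.slicefinder import Slicefinder

# X is the input dataset and errors is the model error vector (one value per row of X).
slice_finder = Slicefinder()
slice_finder.fit(X, errors)

# Inspect the top slices found in X.
print(slice_finder.top_slices_)

# Apply the fitted slice finder to X.
X_trans = slice_finder.transform(X)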
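
To make the idea of a "slice where the model performs worse" concrete, here is a minimal pure-Python sketch. It is not the SliceLine algorithm and does not use the sliceline API: it only scans single feature values and compares their mean error with the overall mean. The function name `naive_single_feature_slices`, the `min_support` parameter, and the toy data are all made up for illustration.

```python
from collections import defaultdict


def naive_single_feature_slices(rows, errors, min_support=2):
    """Rank (feature, value) slices by how far their mean error exceeds the overall mean.

    Illustration only: this checks one feature at a time, whereas the real
    library also considers combinations of features.
    """
    overall_mean = sum(errors) / len(errors)

    # Group the errors of all rows that share a given (feature, value) pair.
    buckets = defaultdict(list)
    for row, err in zip(rows, errors):
        for feature, value in row.items():
            buckets[(feature, value)].append(err)

    ranked = []
    for key, errs in buckets.items():
        if len(errs) >= min_support:  # ignore slices that are too small
            gap = sum(errs) / len(errs) - overall_mean
            ranked.append((gap, key, len(errs)))

    # Largest error gap first.
    ranked.sort(reverse=True)
    return ranked


# Hypothetical toy dataset: one dict per sample, one error per sample.
rows = [
    {"country": "FR", "device": "mobile"},
    {"country": "FR", "device": "desktop"},
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "US", "device": "mobile"},
    {"country": "FR", "device": "mobile"},
]
errors = [0.9, 0.1, 0.8, 0.0, 1.0, 0.7]

top = naive_single_feature_slices(rows, errors)
for gap, (feature, value), size in top:
    print(f"{feature}={value}: size={size}, error gap={gap:+.3f}")
```

In this toy data the `device=mobile` slice comes out first, because its four samples carry most of the error.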

🛠 Installation

Sliceline is intended to work with Python 3.9 or above. Installation can be done with pip:

pip install sliceline

Wheels are available for Linux, macOS, and Windows, so you most likely won’t have to build Sliceline from source.

You can install the latest development version from GitHub as follows:

pip install git+https://github.com/DataDome/sliceline --upgrade

Or, through SSH:

pip install git+ssh://git@github.com/datadome/sliceline.git --upgrade
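
After installing, you may want to confirm which version pip picked up. The sketch below uses only the standard library's `importlib.metadata`, so it does not depend on the package exposing a `__version__` attribute. The helper name `installed_version` is hypothetical, and the distribution name `sliceline` is assumed to match the pip package name used above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None when sliceline is not installed.
print(installed_version("sliceline"))
```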

🔗 Useful links

👐 Contributing

Feel free to contribute in any way you like. We’re always open to new ideas and approaches.

  • Open a discussion if you have any question or enquiry whatsoever. It’s more useful to ask your question in public rather than sending us a private email. It’s also encouraged to open a discussion before contributing, so that everyone is aligned and unnecessary work is avoided.
  • Feel welcome to open an issue if you think you’ve spotted a bug or a performance issue.

Please check out the contribution guidelines if you want to make changes to the code base.

📝 License

Sliceline is free and open-source software licensed under the 3-clause BSD license.

Comments
  • Add `__version__` attribute to the package

    Currently: `sliceline.__version__` raises an AttributeError.

    Goal: Keep version management simple, with a single source of truth. Add the `__version__` attribute to the package. See an implementation of BumpVer.

    Note: The version number is managed through the GitHub release number (i.e. the commit tag). The version property in pyproject.toml is just a placeholder that is modified by the Release job in the GitHub Actions CI.

    opened by adedaran 0
  • Remove the `_dummify` method

    Currently: The Slicefinder class has a `_dummify` method that implements one-hot encoding with an additional argument, `n_col_x_encoded`. This argument makes it possible to generate extra columns of zeros so that the output fits the required shape.

    Goal: Remove this method. It is only used in `_create_and_score_basic_slices`. The scikit-learn OneHotEncoder may be able to do the job.

    opened by adedaran 0
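
For readers unfamiliar with the behavior discussed in the `_dummify` issue above, the following is a minimal pure-Python sketch of one-hot encoding to a fixed number of columns. It is not the library's implementation: the function name `one_hot_fixed_width`, the `n_cols` parameter (standing in for `n_col_x_encoded`), and the assumption of 0-based integer codes are all illustrative choices.

```python
def one_hot_fixed_width(codes, n_cols):
    """One-hot encode 0-based integer codes into rows of exactly n_cols columns.

    Columns that no code refers to stay all-zero, which is how the output
    is padded to the required shape.
    """
    rows = []
    for code in codes:
        if not 0 <= code < n_cols:
            raise ValueError(f"code {code} does not fit in {n_cols} columns")
        row = [0] * n_cols
        row[code] = 1
        rows.append(row)
    return rows


# Three samples with codes 0..2, encoded into 5 columns (the last two stay zero).
encoded = one_hot_fixed_width([0, 2, 1], n_cols=5)
for row in encoded:
    print(row)
```

With scikit-learn, a similar fixed output width can probably be obtained by passing an explicit list of categories to `OneHotEncoder` through its `categories` parameter, though whether that fits the needs of this code base is left open here.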
Releases(0.2.7)
  • 0.2.7(Sep 1, 2022)

    What's Changed

    • Add demo notebooks by @adedaran in https://github.com/DataDome/sliceline/pull/18
    • Change notebook format by @adedaran in https://github.com/DataDome/sliceline/pull/19

    Full Changelog: https://github.com/DataDome/sliceline/compare/0.2.6...0.2.7
  • 0.2.6(Jul 6, 2022)

    What's Changed

    • Add LICENSE and CONTRIBUTING.md by @adedaran in https://github.com/DataDome/sliceline/pull/15

    Full Changelog: https://github.com/DataDome/sliceline/compare/0.2.4...0.2.6
  • 0.2.4(Jul 5, 2022)

    What's Changed

    • Increase python version range by @adedaran in https://github.com/DataDome/sliceline/pull/13
    • Update manually numpy and python version by @adedaran in https://github.com/DataDome/sliceline/pull/14

    Full Changelog: https://github.com/DataDome/sliceline/compare/0.2.3...0.2.4
  • 0.2.3(Jul 4, 2022)

    What's Changed

    • Chore/reset licence by @adedaran in https://github.com/DataDome/sliceline/pull/11
    • Fix README by @adedaran in https://github.com/DataDome/sliceline/pull/12

    Full Changelog: https://github.com/DataDome/sliceline/compare/0.2.0...0.2.3
  • 0.2.0(Jul 1, 2022)

    What's Changed

    • Feature/add pypi in ci by @adedaran in https://github.com/DataDome/sliceline/pull/8
    • Fix env by @adedaran in https://github.com/DataDome/sliceline/pull/9

    Full Changelog: https://github.com/DataDome/sliceline/compare/0.1.0...0.2.0
  • 0.1.0(Jul 1, 2022)

    What's Changed

    • Add documentation by @adedaran in https://github.com/DataDome/sliceline/pull/1
    • Bug/fix documentation by @adedaran in https://github.com/DataDome/sliceline/pull/2
    • Add GitHub Action CI by @adedaran in https://github.com/DataDome/sliceline/pull/3
    • Add pre commit hooks by @adedaran in https://github.com/DataDome/sliceline/pull/4
    • Write readme by @adedaran in https://github.com/DataDome/sliceline/pull/5
    • Make doc locally buildable by @adedaran in https://github.com/DataDome/sliceline/pull/6
    • Add code of conduct by @adedaran in https://github.com/DataDome/sliceline/pull/7

    New Contributors

    • @adedaran made their first contribution in https://github.com/DataDome/sliceline/pull/1

    Full Changelog: https://github.com/DataDome/sliceline/commits/0.1.0
Owner
DataDome