We present embo, a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables X and Y, the IB finds the stochastic mapping M of X that encodes the most information about Y, subject to a constraint on the information that M is allowed to retain about X. Despite the popularity of the IB, an accessible implementation of the reference algorithm oriented towards ease of use on empirical data was missing. Embo is optimized for the common case of discrete, lowdimensional data. Embo is fast, provides a standard data-processing pipeline, offers a parallel implementation of key computational steps, and includes reasonable defaults for the method parameters. Embo is broadly applicable to different problem domains, as it can be employed with any dataset consisting in joint observations of two discrete variables. It is available from the Python Package Index (PyPI), Zenodo and GitLab.

Embo: a Python package for empirical data analysis using the Information Bottleneck / Piasini, E.; Filipowicz, A. L. S.; Levine, J.; Gold, J. I.. - In: JOURNAL OF OPEN RESEARCH SOFTWARE. - ISSN 2049-9647. - 9:1(2021), pp. 1-7. [10.5334/JORS.322]

Embo: a Python package for empirical data analysis using the Information Bottleneck

Piasini, E.
;
2021-01-01

Abstract

We present embo, a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables X and Y, the IB finds the stochastic mapping M of X that encodes the most information about Y, subject to a constraint on the information that M is allowed to retain about X. Despite the popularity of the IB, an accessible implementation of the reference algorithm oriented towards ease of use on empirical data was missing. Embo is optimized for the common case of discrete, lowdimensional data. Embo is fast, provides a standard data-processing pipeline, offers a parallel implementation of key computational steps, and includes reasonable defaults for the method parameters. Embo is broadly applicable to different problem domains, as it can be employed with any dataset consisting in joint observations of two discrete variables. It is available from the Python Package Index (PyPI), Zenodo and GitLab.
2021
9
1
1
7
http://doi.org/10.5334/jors.322
Piasini, E.; Filipowicz, A. L. S.; Levine, J.; Gold, J. I.
File in questo prodotto:
File Dimensione Formato  
322-4971-2-PB.pdf

accesso aperto

Tipologia: Versione Editoriale (PDF)
Licenza: Creative commons
Dimensione 861.19 kB
Formato Adobe PDF
861.19 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11767/127535
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? ND
social impact