Python for Scientific Audio
!Awesome (https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg) (https://github.com/sindresorhus/awesome) !Build Status (https://github.com/faroit/awesome-python-scientific-audio/workflows/CI/badge.svg) (https://github.com/faroit/awesome-python-scientific-audio/actions?query=workflow%3ACI+branch%3Amaster+event%3Apush)

The aim of this repository is to create a comprehensive, curated list of Python software and tools used for scientific research in audio and music applications.

Contents
⟡ Audio Related Packages (#audio-related-packages)
  - **Read/Write** (#read-write)
  - **Transformations - General DSP** (#transformations---general-dsp)
  - **Feature extraction** (#feature-extraction)
  - **Data augmentation** (#data-augmentation)
  - **Speech Processing** (#speech-processing)
  - **Environmental Sounds** (#environmenta)
  - **Perceptual Models - Auditory Models** (#perceptial-models---auditory-models)
  - **Source Separation** (#source-separation)
  - **Music Information Retrieval** (#music-information-retrieval)
  - **Deep Learning** (#deep-learning)
  - **Symbolic Music - MIDI - Musicology** (#symbolic-music---midi---musicology)
  - **Realtime applications** (#realtime-applications)
  - **Web - Audio** (#web-audio)
  - **Audio related APIs and Datasets** (#audio-related-apis-and-datasets)
  - **Wrappers for Audio Plugins** (#wrappers-for-audio-plugins)
⟡ Tutorials (#tutorials)
⟡ Books (#books)
⟡ Scientific Papers (#scientific-papers)
⟡ Other Resources (#other-resources)
⟡ Related lists (#related-lists)
⟡ Contributing (#contributing)
⟡ License (#license)

Audio Related Packages
- Total number of packages: 66

Read-Write
⟡ audiolazy (https://github.com/danilobellini/audiolazy) :octocat: (https://github.com/danilobellini/audiolazy) :package: (https://pypi.python.org/pypi/audiolazy/) - Expressive Digital Signal Processing (DSP) package for Python.
⟡ audioread (https://github.com/beetbox/audioread) :octocat: (https://github.com/beetbox/audioread) :package: (https://pypi.python.org/pypi/audioread/) - Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
⟡ mutagen (https://mutagen.readthedocs.io/) :octocat: (https://github.com/quodlibet/mutagen) :package: (https://pypi.python.org/pypi/mutagen) - Reads and writes all kinds of audio metadata for various formats.
⟡ pyAV (http://docs.mikeboers.com/pyav/) :octocat: (https://github.com/mikeboers/PyAV) - PyAV is a Pythonic binding for FFmpeg or Libav.
⟡ (Py)Soundfile (http://pysoundfile.readthedocs.io/) :octocat: (https://github.com/bastibe/PySoundFile) :package: (https://pypi.python.org/pypi/SoundFile) - Library based on libsndfile, CFFI, and NumPy.
⟡ pySox (https://github.com/rabitt/pysox) :octocat: (https://github.com/rabitt/pysox) :package: (https://pypi.python.org/pypi/pysox/) - Wrapper for sox.
⟡ stempeg (https://github.com/faroit/stempeg) :octocat: (https://github.com/faroit/stempeg) :package: (https://pypi.python.org/pypi/stempeg/) - Read/write of STEMS multistream audio.
⟡ tinytag (https://github.com/devsnd/tinytag) :octocat: (https://github.com/devsnd/tinytag) :package: (https://pypi.python.org/pypi/tinytag/) - Reads music metadata from MP3, OGG, FLAC and Wave files.

Transformations - General DSP
⟡ acoustics (http://python-acoustics.github.io/python-acoustics/) :octocat: (https://github.com/python-acoustics/python-acoustics/) :package: (https://pypi.python.org/pypi/acoustics) - Useful tools for acousticians.
⟡ AudioTK (https://github.com/mbrucher/AudioTK) :octocat: (https://github.com/mbrucher/AudioTK) - DSP filter toolbox (lots of filters).
⟡ AudioTSM (https://audiotsm.readthedocs.io/) :octocat: (https://github.com/Muges/audiotsm) :package: (https://pypi.python.org/pypi/audiotsm/) - Real-time audio time-scale modification procedures.
⟡ Gammatone (https://github.com/detly/gammatone) :octocat: (https://github.com/detly/gammatone) - Gammatone filterbank implementation.
⟡ pyFFTW (http://pyfftw.github.io/pyFFTW/) :octocat: (https://github.com/pyFFTW/pyFFTW) :package: (https://pypi.python.org/pypi/pyFFTW/) - Wrapper for FFTW(3).
⟡ NSGT (https://grrrr.org/research/software/nsgt/) :octocat: (https://github.com/grrrr/nsgt) :package: (https://pypi.python.org/pypi/nsgt) - Non-stationary Gabor transform, constant-Q.
⟡ matchering (https://github.com/sergree/matchering) :octocat: (https://github.com/sergree/matchering) :package: (https://pypi.org/project/matchering/) - Automated reference audio mastering.
⟡ MDCT (https://github.com/nils-werner/mdct) :octocat: (https://github.com/nils-werner/mdct) :package: (https://pypi.python.org/pypi/mdct) - MDCT transform.
⟡ pydub (http://pydub.com) :octocat: (https://github.com/jiaaro/pydub) :package: (https://pypi.python.org/pypi/pydub) - Manipulate audio with a simple and easy high-level interface.
⟡ pytftb (http://tftb.nongnu.org) :octocat: (https://github.com/scikit-signal/pytftb) - Implementation of the MATLAB Time-Frequency Toolbox.
⟡ pyroomacoustics (https://github.com/LCAV/pyroomacoustics) :octocat: (https://github.com/LCAV/pyroomacoustics) :package: (https://pypi.python.org/pypi/pyroomacoustics) - Room Acoustics Simulation (RIR generator).
⟡ PyRubberband (https://github.com/bmcfee/pyrubberband) :octocat: (https://github.com/bmcfee/pyrubberband) :package: (https://pypi.python.org/pypi/pyrubberband/) - Wrapper for rubberband (http://breakfastquay.com/rubberband/) to do pitch-shifting and time-stretching.
⟡ PyWavelets (http://pywavelets.readthedocs.io) :octocat: (https://github.com/PyWavelets/pywt) :package: (https://pypi.python.org/pypi/PyWavelets) - Discrete Wavelet Transform in Python.
⟡ Resampy (http://resampy.readthedocs.io) :octocat: (https://github.com/bmcfee/resampy) :package: (https://pypi.python.org/pypi/resampy) - Sample rate conversion.
⟡ SFS-Python (http://www.sfstoolbox.org) :octocat: (https://github.com/sfstoolbox/sfs-python) :package: (https://pypi.python.org/pypi/sfs/) - Sound Field Synthesis Toolbox.
⟡ sound_field_analysis (https://appliedacousticschalmers.github.io/sound_field_analysis-py/) :octocat: (https://github.com/AppliedAcousticsChalmers/sound_field_analysis-py) :package: (https://pypi.org/project/sound-field-analysis/) - Analyze, visualize and process sound field data recorded by spherical microphone arrays.
⟡ STFT (http://stft.readthedocs.io) :octocat: (https://github.com/nils-werner/stft) :package: (https://pypi.python.org/pypi/stft) - Standalone package for Short-Time Fourier Transform.

Feature extraction
⟡ aubio (http://aubio.org/) :octocat: (https://github.com/aubio/aubio) :package: (https://pypi.python.org/pypi/aubio) - Feature extractor, written in C, Python interface.
⟡ audioFlux (https://github.com/libAudioFlux/audioFlux) :octocat: (https://github.com/libAudioFlux/audioFlux) :package: (https://pypi.python.org/pypi/audioflux) - A library for audio and music analysis, feature extraction.
⟡ audiolazy (https://github.com/danilobellini/audiolazy) :octocat: (https://github.com/danilobellini/audiolazy) :package: (https://pypi.python.org/pypi/audiolazy/) - Realtime Audio Processing lib, general purpose.
⟡ essentia (http://essentia.upf.edu) :octocat: (https://github.com/MTG/essentia) - Music related low level and high level feature extractor, C++ based, includes Python bindings.
⟡ python_speech_features (https://github.com/jameslyons/python_speech_features) :octocat: (https://github.com/jameslyons/python_speech_features) :package: (https://pypi.python.org/pypi/python_speech_features) - Common speech features for ASR.
⟡ pyYAAFE (https://github.com/Yaafe/Yaafe) :octocat: (https://github.com/Yaafe/Yaafe) - Python bindings for YAAFE feature extractor.
⟡ speechpy (https://github.com/astorfi/speechpy) :octocat: (https://github.com/astorfi/speechpy) :package: (https://pypi.python.org/pypi/speechpy) - Library for Speech Processing and Recognition, mostly feature extraction for now.
⟡ spafe (https://github.com/SuperKogito/spafe) :octocat: (https://github.com/SuperKogito/spafe) :package: (https://pypi.org/project/spafe/) - Python library for feature extraction from audio files.

Data augmentation
⟡ audiomentations (https://github.com/iver56/audiomentations) :octocat: (https://github.com/iver56/audiomentations) :package: (https://pypi.org/project/audiomentations/) - Audio Data Augmentation.
⟡ muda (https://muda.readthedocs.io/en/latest/) :octocat: (https://github.com/bmcfee/muda) :package: (https://pypi.python.org/pypi/muda) - Musical Data Augmentation.
⟡ pydiogment (https://github.com/SuperKogito/pydiogment) :octocat: (https://github.com/SuperKogito/pydiogment) :package: (https://pypi.org/project/pydiogment/) - Audio Data Augmentation.

Speech Processing
⟡ aeneas (https://www.readbeyond.it/aeneas/) :octocat: (https://github.com/readbeyond/aeneas/) :package: (https://pypi.python.org/pypi/aeneas/) - Forced aligner, based on MFCC+DTW, 35+ languages.
⟡ deepspeech (https://github.com/mozilla/DeepSpeech) :octocat: (https://github.com/mozilla/DeepSpeech) :package: (https://pypi.org/project/deepspeech/) - Pretrained automatic speech recognition.
⟡ gentle (https://github.com/lowerquality/gentle) :octocat: (https://github.com/lowerquality/gentle) - Forced aligner built on Kaldi.
⟡ Parselmouth (https://github.com/YannickJadoul/Parselmouth) :octocat: (https://github.com/YannickJadoul/Parselmouth) :package: (https://pypi.org/project/praat-parselmouth/) - Python interface to the Praat (http://www.praat.org) phonetics and speech analysis, synthesis, and manipulation software.
⟡ persephone (https://persephone.readthedocs.io/en/latest/) :octocat: (https://github.com/persephone-tools/persephone) :package: (https://pypi.org/project/persephone/) - Automatic phoneme transcription tool.
⟡ pyannote.audio (https://github.com/pyannote/pyannote-audio) :octocat: (https://github.com/pyannote/pyannote-audio) :package: (https://pypi.org/project/pyannote-audio/) - Neural building blocks for speaker diarization.
⟡ pyAudioAnalysis (https://github.com/tyiannak/pyAudioAnalysis) :octocat: (https://github.com/tyiannak/pyAudioAnalysis) :package: (https://pypi.python.org/pypi/pyAudioAnalysis/) - Feature Extraction, Classification, Diarization.
⟡ py-webrtcvad (https://github.com/wiseman/py-webrtcvad) :octocat: (https://github.com/wiseman/py-webrtcvad) :package: (https://pypi.python.org/pypi/webrtcvad/) - Interface to the WebRTC Voice Activity Detector.
⟡ pypesq (https://github.com/vBaiCai/python-pesq) :octocat: (https://github.com/vBaiCai/python-pesq) - Wrapper for the PESQ score calculation.
⟡ pystoi (https://github.com/mpariente/pystoi) :octocat: (https://github.com/mpariente/pystoi) :package: (https://pypi.org/project/pystoi) - Short Term Objective Intelligibility measure (STOI).
⟡ PyWorldVocoder (https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder) :octocat: (https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder) - Wrapper for Morise's World Vocoder.
⟡ Montreal Forced Aligner (https://montrealcorpustools.github.io/Montreal-Forced-Aligner/) :octocat: (https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) - Forced aligner, based on Kaldi (HMM), English (others can be trained).
⟡ SIDEKIT (http://lium.univ-lemans.fr/sidekit/) :package: (https://pypi.python.org/pypi/SIDEKIT/) - Speaker and Language recognition.
⟡ SpeechRecognition (https://github.com/Uberi/speech_recognition) :octocat: (https://github.com/Uberi/speech_recognition) :package: (https://pypi.python.org/pypi/SpeechRecognition/) - Wrapper for several ASR engines and APIs, online and offline.

Environmental Sounds
⟡ sed_eval (http://tut-arg.github.io/sed_eval) :octocat: (https://github.com/TUT-ARG/sed_eval) :package: (https://pypi.org/project/sed_eval/) - Evaluation toolbox for Sound Event Detection.

Perceptual Models - Auditory Models
⟡ cochlea (https://github.com/mrkrd/cochlea) :octocat: (https://github.com/mrkrd/cochlea) :package: (https://pypi.python.org/pypi/cochlea/) - Inner ear models.
⟡ Brian2 (http://briansimulator.org/) :octocat: (https://github.com/brian-team/brian2) :package: (https://pypi.python.org/pypi/Brian2) - Spiking neural networks simulator, includes cochlea model.
⟡ Loudness (https://github.com/deeuu/loudness) :octocat: (https://github.com/deeuu/loudness) - Perceived loudness, includes Zwicker, Moore/Glasberg model.
⟡ pyloudnorm (https://www.christiansteinmetz.com/projects-blog/pyloudnorm) :octocat: (https://github.com/csteinmetz1/pyloudnorm) - Audio loudness meter and normalization, implements ITU-R BS.1770-4.
⟡ Sound Field Synthesis Toolbox (http://www.sfstoolbox.org) :octocat: (https://github.com/sfstoolbox/sfs-python) :package: (https://pypi.python.org/pypi/sfs/) - Sound Field Synthesis Toolbox.

Source Separation
⟡ commonfate (https://github.com/aliutkus/commonfate) :octocat: (https://github.com/aliutkus/commonfate) :package: (https://pypi.python.org/pypi/commonfate) - Common Fate Model and Transform.
⟡ NTFLib (https://github.com/stitchfix/NTFLib) :octocat: (https://github.com/stitchfix/NTFLib) - Sparse Beta-Divergence Tensor Factorization.
⟡ NUSSL (https://interactiveaudiolab.github.io/project/nussl.html) :octocat: (https://github.com/interactiveaudiolab/nussl) :package: (https://pypi.python.org/pypi/nussl) - Holistic source separation framework including DSP methods and deep learning methods.
⟡ NIMFA (http://nimfa.biolab.si) :octocat: (https://github.com/marinkaz/nimfa) :package: (https://pypi.python.org/pypi/nimfa) - Several flavors of non-negative matrix factorization.

Music Information Retrieval
⟡ Catchy (https://github.com/jvbalen/catchy) :octocat: (https://github.com/jvbalen/catchy) - Corpus Analysis Tools for Computational Hook Discovery.
⟡ chord-detection (https://github.com/sevagh/chord-detection) :octocat: (https://github.com/sevagh/chord-detection) - Algorithms for chord detection and key estimation.
⟡ Madmom (https://madmom.readthedocs.io/en/latest/) :octocat: (https://github.com/CPJKU/madmom) :package: (https://pypi.python.org/pypi/madmom) - MIR package with a strong focus on beat detection, onset detection and chord recognition.
⟡ mir_eval (http://craffel.github.io/mir_eval/) :octocat: (https://github.com/craffel/mir_eval) :package: (https://pypi.python.org/pypi/mir_eval) - Common scores for various MIR tasks. Also includes a bss_eval implementation.
⟡ msaf (http://pythonhosted.org/msaf/) :octocat: (https://github.com/urinieto/msaf) :package: (https://pypi.python.org/pypi/msaf) - Music Structure Analysis Framework.
⟡ librosa (http://librosa.github.io/librosa/) :octocat: (https://github.com/librosa/librosa) :package: (https://pypi.python.org/pypi/librosa) - General audio and music analysis.
Deep Learning
⟡ Kapre (https://github.com/keunwoochoi/kapre) :octocat: (https://github.com/keunwoochoi/kapre) :package: (https://pypi.python.org/pypi/kapre) - Keras Audio Preprocessors.
⟡ TorchAudio (https://github.com/pytorch/audio) :octocat: (https://github.com/pytorch/audio) - PyTorch Audio Loaders.
⟡ nnAudio (https://github.com/KinWaiCheuk/nnAudio) :octocat: (https://github.com/KinWaiCheuk/nnAudio) :package: (https://pypi.org/project/nnAudio/) - Accelerated audio processing using 1D convolution networks in PyTorch.

Symbolic Music - MIDI - Musicology
⟡ Music21 (http://web.mit.edu/music21/) :octocat: (https://github.com/cuthbertLab/music21) :package: (https://pypi.python.org/pypi/music21) - Toolkit for Computer-Aided Musicology.
⟡ Mido (https://mido.readthedocs.io/en/latest/) :octocat: (https://github.com/olemb/mido) :package: (https://pypi.python.org/pypi/mido) - Realtime MIDI wrapper.
⟡ mingus (https://github.com/bspaans/python-mingus) :octocat: (https://github.com/bspaans/python-mingus) :package: (https://pypi.org/project/mingus) - Advanced music theory and notation package with MIDI file and playback support.
⟡ Pretty-MIDI (http://craffel.github.io/pretty-midi/) :octocat: (https://github.com/craffel/pretty-midi) :package: (https://pypi.python.org/pypi/pretty-midi) - Utility functions for handling MIDI data in a nice/intuitive way.

Realtime applications
⟡ Jupylet (https://github.com/nir/jupylet) :octocat: (https://github.com/nir/jupylet) - Subtractive, additive, FM, and sample-based sound synthesis.
⟡ PYO (http://ajaxsoundstudio.com/software/pyo/) :octocat: (https://github.com/belangeo/pyo) - Realtime audio DSP engine.
⟡ python-sounddevice (https://github.com/spatialaudio/python-sounddevice) :octocat: (http://python-sounddevice.readthedocs.io) :package: (https://pypi.python.org/pypi/sounddevice) - PortAudio wrapper providing realtime audio I/O with NumPy.
⟡ ReTiSAR (https://github.com/AppliedAcousticsChalmers/ReTiSAR) :octocat: (https://github.com/AppliedAcousticsChalmers/ReTiSAR) - Binaural rendering of streamed or IR-based high-order spherical microphone array signals.

Web Audio
⟡ TimeSide (Beta) (https://github.com/Parisson/TimeSide/tree/dev) :octocat: (https://github.com/Parisson/TimeSide/tree/dev) - High-level audio analysis, imaging, transcoding, streaming and labelling.

Audio Dataset and Dataloaders
⟡ beets (http://beets.io/) :octocat: (https://github.com/beetbox/beets) :package: (https://pypi.python.org/pypi/beets) - Music library manager and MusicBrainz (https://musicbrainz.org/) tagger.
⟡ musdb (http://dsdtools.readthedocs.io) :octocat: (https://github.com/sigsep/sigsep-mus-db) :package: (https://pypi.python.org/pypi/musdb) - Parse and process the MUSDB18 dataset.
⟡ medleydb (http://medleydb.readthedocs.io) :octocat: (https://github.com/marl/medleydb) - Parse medleydb (http://medleydb.weebly.com/) audio + annotations.
⟡ Soundcloud API (https://github.com/soundcloud/soundcloud-python) :octocat: (https://github.com/soundcloud/soundcloud-python) :package: (https://pypi.python.org/pypi/soundcloud) - Wrapper for Soundcloud API (https://developers.soundcloud.com/).
⟡ Youtube-Downloader (http://rg3.github.io/youtube-dl/) :octocat: (https://github.com/rg3/youtube-dl) :package: (https://pypi.python.org/pypi/youtube_dl) - Download YouTube videos (and the audio).
⟡ audiomate (https://github.com/ynop/audiomate) :octocat: (https://github.com/ynop/audiomate) :package: (https://pypi.python.org/pypi/audiomate/) - Loading different types of audio datasets.
⟡ mirdata (https://mirdata.readthedocs.io/en/latest/) :octocat: (https://github.com/mir-dataset-loaders/mirdata) :package: (https://pypi.python.org/pypi/mirdata) - Common loaders for Music Information Retrieval (MIR) datasets.
Wrappers for Audio Plugins
⟡ VamPy Host (https://code.soundsoftware.ac.uk/projects/vampy-host) :package: (https://pypi.python.org/pypi/vamp) - Interface to compiled Vamp plugins.

Tutorials
⟡ Whirlwind Tour Of Python (https://jakevdp.github.io/WhirlwindTourOfPython/) :octocat: (https://github.com/jakevdp/WhirlwindTourOfPython) - Fast-paced introduction to Python essentials, aimed at researchers and developers.
⟡ Introduction to Numpy and Scipy (http://www.scipy-lectures.org/index.html) :octocat: (https://github.com/scipy-lectures/scipy-lecture-notes) - Highly recommended tutorial, covers large parts of the scientific Python ecosystem.
⟡ Numpy for MATLAB® Users (https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html) - Short overview of equivalent Python functions for switchers.
⟡ MIR Notebooks (http://musicinformationretrieval.com/) :octocat: (https://github.com/stevetjoa/stanford-mir) - Collection of instructional IPython notebooks for music information retrieval (MIR).
⟡ Selected Topics in Audio Signal Processing (https://github.com/spatialaudio/selected-topics-in-audio-signal-processing-exercises) - Exercises as IPython notebooks.
⟡ Live-coding a music synthesizer (https://www.youtube.com/watch?v=SSyQ0kRHzis) - Live-coding video showing how to use the SoundDevice library to reproduce realistic sounds. Code (https://github.com/cool-RR/python_synthesizer).

Books
⟡ Python Data Science Handbook (https://github.com/jakevdp/PythonDataScienceHandbook) - Jake Vanderplas, excellent book and accompanying tutorial notebooks.
⟡ Fundamentals of Music Processing (https://www.audiolabs-erlangen.de/fau/professor/mueller/bookFMP) - Meinard Müller, comes with Python exercises.

Scientific Papers
⟡ Python for audio signal processing (http://eprints.maynoothuniversity.ie/4115/1/40.pdf) - John C. Glover, Victor Lazzarini and Joseph Timoney, Linux Audio Conference 2011.
⟡ librosa: Audio and Music Signal Analysis in Python (http://conference.scipy.org/proceedings/scipy2015/pdfs/brian_mcfee.pdf), Video (https://www.youtube.com/watch?v=MhOdbtPhbLU) - Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, Scipy 2015.
⟡ pyannote.audio: neural building blocks for speaker diarization (https://arxiv.org/abs/1911.01255), Video (https://www.youtube.com/watch?v=37R_R82lfwA) - Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020.

Other Resources
⟡ Coursera Course (https://www.coursera.org/learn/audio-signal-processing) - Audio Signal Processing, Python-based course from UPF Barcelona and Stanford University.
⟡ Digital Signal Processing Course (http://dsp-nbsphinx.readthedocs.io/en/nbsphinx-experiment/index.html) - Masters course material (University of Rostock) with many Python examples.
⟡ Slack Channel (https://mircommunity.slack.com) - Music Information Retrieval Community.

Related lists
There is already PythonInMusic (https://wiki.python.org/moin/PythonInMusic), but it is not up to date and includes too many special-interest packages that are mostly not relevant for scientific applications. Awesome-Python (https://github.com/vinta/awesome-python) is a large curated list of Python packages. However, its audio section is very small.

Contributing
Your contributions are always welcome! Please take a look at the contribution guidelines (CONTRIBUTING.md) first. I will keep some pull requests open if I'm not sure whether those libraries are awesome; you can vote for them by adding a 👍 to them.

License
!License: CC BY 4.0 (https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg) (https://creativecommons.org/licenses/by/4.0/)