Awesome Python Chemistry 

A curated list of awesome Python frameworks, libraries, software and
resources related to Chemistry.
Inspired by awesome-python.
General Chemistry
Packages and tools for general chemistry.
- AQME - Ensemble of
automated QM workflows that can be run through jupyter notebooks,
command lines and yaml files.
- aizynthfinder -
A tool for retrosynthetic planning.
- batchcalculator - A
GUI app based on wxPython for calculating the correct amount of
reactants (batch) for a particular composition given by the molar ratio
of its components.
- cctbx - The Computational
Crystallography Toolbox.
- ChemFormula -
ChemFormula provides a class for working with chemical formulas. It
allows parsing chemical formulas, calculating formula weights, and
generating formatted output strings (e.g. in HTML, LaTeX, or
Unicode).
- chemlib - A
robust and easy-to-use package that solves a variety of chemistry
problems.
- chempy - ChemPy is a
package useful for chemistry (mainly physical/inorganic/analytical
chemistry).
- datamol -
Molecular Manipulation Made Easy. A light wrapper built on top of
RDKit.
- GoodVibes - A
Python program to compute quasi-harmonic thermochemical data from
Gaussian frequency calculations.
- hgraph2graph -
Hierarchical Generation of Molecular Graphs using Structural
Motifs.
- ionize -
Calculates the properties of individual ionic species in aqueous
solution, as well as aqueous solutions containing arbitrary sets of
ions.
- LModeA-nano -
Calculates the intrinsic chemical bond strength based on local
vibrational mode theory in solids and molecules.
- mendeleev -
A package that provides a Python API for accessing various properties of
elements from the periodic table of elements.
- nmrglue - A
package for working with nuclear magnetic resonance (NMR) data including
functions for reading common binary file formats and processing NMR
data.
- Open Babel - A
chemical toolbox designed to speak the many languages of chemical
data.
- periodictable -
This package provides a periodic table of the elements with support for
mass, density and xray/neutron scattering information.
- propka -
Predicts the pKa values of ionizable groups in proteins and
protein-ligand complexes based on the 3D structure.
- pybel
- Pybel provides convenience functions and classes that make it simpler
to use the Open Babel libraries from Python.
- pycroscopy
- Scientific analysis of nanoscale materials imaging data.
- pyEQL - A
set of tools for conventional calculations involving solutions
(mixtures) and electrolytes.
- pyiron - An integrated
development environment (IDE) for computational materials science.
- pymatgen - Python Materials
Genomics is a robust, open-source library for materials analysis.
- pymatviz - A
toolkit for visualizations in materials informatics.
- symfit - A
curve-fitting library ideally suited to chemistry problems, including
fitting experimental kinetics data.
- symmetry - Symmetry
is a library for materials symmetry analysis.
- stk - A library
for building, manipulating, analyzing and automatic design of molecules,
including a genetic algorithm.
- spectrochempy
- A library for processing, analyzing and modeling spectroscopic
data.
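
Several entries above (for example ChemFormula, chempy and periodictable) deal with parsing chemical formulas and computing formula weights. The sketch below shows the underlying idea in plain Python. It does not use the API of any listed package: the names `parse_formula`, `formula_weight` and `ATOMIC_WEIGHTS` are made up for this illustration, the weight table is a small rounded subset, and parentheses, hydrates and charges are not handled.

```python
import re

# Rounded standard atomic weights in g/mol (illustrative subset only).
ATOMIC_WEIGHTS = {
    "H": 1.008, "C": 12.011, "N": 14.007,
    "O": 15.999, "Na": 22.990, "Cl": 35.45,
}

# One element symbol (capital letter plus optional lowercase letter)
# followed by an optional integer count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Return {element: count} for a flat formula such as 'C6H12O6'."""
    if not formula:
        raise ValueError("empty formula")
    counts = {}
    pos = 0
    for match in _TOKEN.finditer(formula):
        # Tokens must be contiguous; a gap means unsupported syntax.
        if match.start() != pos:
            raise ValueError(f"cannot parse {formula!r} at position {pos}")
        symbol, digits = match.groups()
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
        pos = match.end()
    if pos != len(formula):
        raise ValueError(f"cannot parse {formula!r} at position {pos}")
    return counts


def formula_weight(formula):
    """Sum atomic weights over all atoms in the formula (g/mol)."""
    total = 0.0
    for symbol, count in parse_formula(formula).items():
        if symbol not in ATOMIC_WEIGHTS:
            raise ValueError(f"no atomic weight for {symbol!r} in this sketch")
        total += ATOMIC_WEIGHTS[symbol] * count
    return total


if __name__ == "__main__":
    for f in ("H2O", "C6H12O6", "NaCl"):
        print(f, parse_formula(f), round(formula_weight(f), 3))
```

For real work, the listed packages cover the full periodic table, isotopes and nested groups, which this sketch deliberately leaves out.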
Machine Learning
Packages and tools for employing machine learning and data
science in chemistry.
- amp - Is an
open-source package designed to easily bring machine-learning to
atomistic calculations.
- atom3d - Enables
machine learning on three-dimensional molecular structure.
- chainer-chemistry
- A deep learning framework (based on Chainer) with applications in
Biology and Chemistry.
- chemml - A
machine learning and informatics program suite for the analysis, mining,
and modeling of chemical and materials data.
- chemprop -
Message Passing Neural Networks for Molecule Property Prediction.
- cgcnn - Crystal graph
convolutional neural networks for predicting material properties.
- deepchem - Deep-learning models
for Drug Discovery and Quantum Chemistry.
- DeepPurpose -
A Deep Learning Library for Compound and Protein Modeling DTI, Drug
Property, PPI, DDI, Protein Function Prediction.
- DescriptaStorus
- Descriptor computation (chemistry) and (optional) storage for machine
learning.
- DScribe -
Descriptor library containing a variety of fingerprinting techniques,
including the Smooth Overlap of Atomic Positions (SOAP).
- graphein - Provides
functionality for producing geometric representations of protein and RNA
structures, and biological interaction networks.
- Matminer
- Library of descriptors to aid in the data-mining of materials
properties, created by the Lawrence Berkeley National Laboratory.
- MoleOOD - A
robust molecular representation learning framework against distribution
shifts.
- megnet -
Graph Networks as a Universal Machine Learning Framework for Molecules
and Crystals.
- MAML -
Aims to provide useful high-level interfaces that make ML for materials
science as easy as possible.
- MORFEUS -
Library for fast calculations of molecular features from 3D
structures for machine learning with a focus on steric descriptors.
- olorenchemengine
- Molecular property prediction with unified API for diverse models and
representations, with integrated uncertainty quantification,
interpretability, and hyperparameter/architecture tuning.
- ROBERT - Ensemble
of automated machine learning protocols that can be run sequentially
through a single command line. The program works for regression and
classification problems.
- schnetpack
- Deep Neural Networks for Atomistic Systems.
- selfies
- Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular
string representation.
- Summit
- Package for optimizing chemical reactions using machine learning
(contains 10 algorithms + several benchmarks).
- TDC - Therapeutics
Data Commons (TDC) is the first unifying framework to systematically
access and evaluate machine learning across the entire range of
therapeutics.
- XenonPy -
Library with several compositional and structural material descriptors,
along with a few pre-trained neural network models of material
properties.
Generative Molecular Design
Packages and tools for generating molecular species.
- GraphINVENT
- A platform for graph-based molecular generation using graph neural
networks.
- GuacaMol - A
package for benchmarking of models for de novo molecular
design.
- moses - A
benchmarking platform for molecular generation models.
- perses -
Experiments with expanded ensembles to explore chemical space.
Simulations
Packages for atomistic simulations and computational
chemistry.
- alchemlyb -
Makes alchemical free energy calculations easier by leveraging the full
power and flexibility of the PyData stack.
- atomate2
- atomate2 is a library of computational materials science
workflows.
- Atomic Simulation
Environment (ASE) - Is a set of tools and modules for setting up,
manipulating, running, visualizing and analyzing atomistic
simulations.
- basis_set_exchange
- A library containing basis sets for use in quantum chemistry
calculations. In addition, this library has functionality for
manipulation of basis set data.
- CACTVS - Cactvs is
a universal, scriptable cheminformatics toolkit, with a large collection
of modules for property computation, chemistry data file I/O and other
tasks.
- CalcUS - Quantum
chemistry web platform that brings all the necessary tools to perform
quantum chemistry in a user-friendly web interface.
- cantera - A
collection of object-oriented software tools for problems involving
chemical kinetics, thermodynamics, and transport processes.
- CatKit -
General purpose tools for high-throughput catalysis.
- ccinput - A tool
and library for creating quantum chemistry input files.
- cclib - A library for parsing
output files of various quantum chemical programs.
- cinfony - A common API to
several cheminformatics toolkits (Open Babel, RDKit, the CDK, Indigo,
JChem, OPSIN and cheminformatics webservices).
- chemlab -
Is a library that can help the user with chemistry-relevant
calculations.
- emmet - A
package to ‘build’ collections of materials properties from the output
of computational materials calculations.
- fromage
- The “FRamewOrk for Molecular AGgregate Excitations” enables localised
QM/QM’ excited state calculations in a solid state environment.
- GPAW - Is a
density-functional theory (DFT) Python code based on the
projector-augmented wave (PAW) method and the atomic simulation
environment (ASE).
- horton -
Helpful Open-source Research TOol for N-fermion systems, a
quantum-chemistry program that can perform computations involving model
Hamiltonians.
- HTMD -
High-Throughput Molecular Dynamics: Programming Environment for
Molecular Discovery.
- Indigo - Universal
cheminformatics libraries, utilities and database search tools.
- Jarvis-tools - An
open-access software package for atomistic data-driven materials
design.
- mathchem - Is a free open
source package for calculating topological indices and other invariants
of molecular graphs.
- MDAnalysis - Is an
object-oriented library to analyze trajectories from molecular dynamics
(MD) simulations in many popular formats.
- MDTraj - Package for manipulating
molecular dynamics trajectories with support for multiple formats.
- MMTK - The
Molecular Modeling Toolkit is an Open Source program library for
molecular simulation applications.
- MolMod - A
library with many components that are useful to write molecular modeling
programs.
- oddt - Open Drug
Discovery Toolkit, a modular and comprehensive toolkit for use in
cheminformatics, molecular modeling etc.
- OPEM - Open source PEM
(Proton Exchange Membrane) fuel cell simulation tool.
- openmmtools
- A batteries-included toolkit for the GPU-accelerated OpenMM molecular
simulation engine.
- overreact - A
library and command-line tool for building and analyzing complex
homogeneous microkinetic models from quantum chemistry calculations,
with support for quasi-harmonic thermochemistry, quantum tunnelling
corrections, molecular symmetries and more.
- ParmEd -
Parameter/topology editor and molecular simulator with visualization
capability.
- pGrAdd
- A library for estimating thermochemical properties of molecules and
adsorbates using group additivity.
- phonopy - An open
source package for phonon calculations at harmonic and quasi-harmonic
levels.
- PLAMS - Python Library
for Automating Molecular Simulation: input preparation, job execution,
file management, output processing and building data workflows.
- pMuTT - A
library for ab-initio thermodynamic and kinetic parameter
estimation.
- PorePy - A
Simulation Tool for Fractured and Deformable Porous Media.
- ProDy - An open source
package for protein structural dynamics analysis with a flexible and
responsive API.
- ProLIF -
Interaction Fingerprints for protein-ligand complexes and more.
- Psi4 - A hybrid Python/C++
open-source package for quantum chemistry.
- Psi4NumPy -
Psi4-based reference implementations and Jupyter notebook-based
tutorials for foundational quantum chemistry methods.
- pyEMMA - Library
for the estimation, validation and analysis of Markov models of molecular
kinetics and other kinetic and thermodynamic models from molecular
dynamics data.
- pygauss -
An interactive tool for supporting the life cycle of a computational
molecular chemistry investigation.
- PyQuante - Is an
open-source suite of programs for developing quantum chemistry
methods.
- pysic - A calculator
incorporating various empirical pair and many-body potentials.
- PySCF - A quantum
chemistry package written in Python.
- pyvib2 - A program for
analyzing vibrational motion and vibrational spectra.
- RDKit - Open-Source
Cheminformatics Software.
- ReNView
- A program to visualize reaction networks.
- stk - A library
for building, manipulating, analyzing and automatic design of
molecules.
- QMsolve - A
module for solving and visualizing the Schrödinger equation.
- QUIP - A collection of
software tools to carry out molecular dynamics simulations.
- torchmd -
End-To-End Molecular Dynamics (MD) Engine using PyTorch.
- tsase - A library
that builds on ASE to tackle transition state calculations.
- yank - An open,
extensible Python framework for GPU-accelerated alchemical free energy
calculations.
Force Fields
Packages related to force fields.
- CHGNet -
Pretrained universal neural network potential for charge-informed
atomistic modeling.
- FitSNAP - A Package
For Training SNAP Interatomic Potentials for use in the LAMMPS molecular
dynamics package.
- fftool - Tool to
build force field input files for molecular simulation.
- FLARE - A package
for creating fast and accurate interatomic potentials.
- global-chem -
A Chemical Knowledge Graph and Toolkit, written in IUPAC/SMILES/SMARTS,
for common small molecules from diverse communities to aid users in
selecting compounds for force field parameterization.
- matbench-discovery
- A benchmark for ML-guided high-throughput materials discovery.
- NeuralForceField
- Neural Network Force Field based on PyTorch.
- openff-toolkit
- The Open Forcefield Toolkit provides implementations of the SMIRNOFF
format, parameterization engine, and other tools.
Molecular Visualization
Packages for viewing molecular structures.
- ase-gui
- The graphical user-interface allows users to visualize, manipulate,
and render molecular systems and atoms objects.
- chemiscope -
An interactive structure/property explorer for materials and
molecules.
- chemview -
An interactive molecular viewer designed for the IPython notebook.
- imolecule -
An embeddable webGL molecule viewer and file format converter.
- moleculekit -
A molecule manipulation library.
- nglview - A Jupyter widget to interactively view
molecular structures and trajectories.
- PyMOL - A user-sponsored molecular
visualization system on an open-source foundation, maintained and
distributed by Schrödinger.
- pymoldyn
- A viewer for atomic clusters, crystalline and amorphous materials in a
unit cell corresponding to one of the seven 3D Bravais lattices.
- sumo - A toolkit
for plotting and analysis of ab initio solid-state calculation
data.
- surfinpy -
A library for the analysis, plotting and visualisation of ab initio
surface calculation data.
- trident-chemwidgets
- Jupyter Widgets to interact with molecular datasets.
Database Wrappers
Packages providing a Python layer for accessing chemical
databases.
- ccdc
- An API for the Cambridge Structural Database System.
- ChemSpiPy -
ChemSpider wrapper, that allows
chemical searches, chemical file downloads, depiction and retrieval of
chemical properties.
- CIRpy - An
interface for the Chemical Identifier Resolver (CIR) by the CADD Group
at the NCI/NIH.
- pubchempy -
PubChemPy provides a way to interact with PubChem in Python.
- chembl-downloader
- Automate downloading and querying the latest (or a given) version of
ChEMBL.
- drugbank-downloader
- Automate downloading, opening, and parsing DrugBank.
Learning Resources
Resources for learning to apply Python to chemistry.
- An
Introduction to Applied Bioinformatics - A Jupyter book
demonstrating working with biochemical data using the scikit-bio library
for tasks such as sequence alignment and calculating Hamming
distances.
- Computational
Thermodynamics - This collection of Jupyter notebooks demonstrates
solutions to a range of thermodynamic problems including solving
chemical equilibria, comparing real versus ideal gas behavior, and
calculating the temperature and composition of a combustion
reaction.
- SciCompforChemists
- Scientific Computing for Chemists with Python is a Jupyter book
that teaches basic Python skills for chemistry, including relevant libraries,
and applies them to solving chemical problems.
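
The first resource above mentions calculating Hamming distances between sequences. A minimal sketch of that calculation in plain Python follows. It is not the scikit-bio API; the function name `hamming_distance` is chosen here for illustration, and the sequences are assumed to be aligned and of equal length.

```python
def hamming_distance(seq_a, seq_b):
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance needs sequences of equal length")
    # Compare the sequences position by position and count mismatches.
    return sum(a != b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    print(hamming_distance("GATTACA", "GACTATA"))
```

Dividing the count by the sequence length gives the fraction of mismatched positions, which is the normalized form often reported for aligned sequences.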
Miscellaneous Awesome
- Colorful
Nuclide Chart - A beautiful, interactive visualization of nuclides
with access to a variety of nuclear properties. It also allows saving
high-quality images for publications, presentations and outreach.
See Also
- awesome-cheminformatics
Another list focused on cheminformatics, including tools not only in
Python.
- awesome-small-molecule-ml
A collection of papers, datasets, and packages for small-molecule drug
discovery. Most links to code are in Python.
- awesome-molecular-docking
A curated list of molecular docking software, datasets, and papers.
- jarvis Joint Automated
Repository for Various Integrated Simulations is a repository designed
to automate materials discovery and optimization using classical
force-field, density functional theory, machine learning calculations
and experiments.
- polypharmacy-ddi-synergy-survey
A collection of research papers (with Python implementations) focusing
on drug-drug interactions, synergy and polypharmacy.