Cheminformatics (also known as chemoinformatics, chemioinformatics
and chemical informatics) is the use of computer and informational
techniques applied to a range of problems in the field of chemistry. (Source: Wikipedia)
A curated list of awesome Cheminformatics software, resources, and
libraries. Mostly command line based, and free or open-source. Please
feel free to contribute!
## Applications
### Visualization
- PyMOL -
Python-enhanced molecular graphics tool.
- Jmol - Browser-based
HTML5 viewer and stand-alone Java viewer for chemical structures in
3D.
- VMD - Molecular
visualization program for displaying, animating, and analyzing large
biomolecular systems using 3-D graphics and built-in scripting.
- Chimera - Highly
extensible program for interactive molecular visualization and analysis.
Source
is available.
- ChimeraX - The
next-generation molecular visualization program, following UCSF Chimera.
Source is available.
- DataWarrior
- A program for data visualization and analysis which combines dynamic
graphical views and interactive row filtering with chemical
intelligence.
### Command Line Tools
- Open Babel -
Chemical toolbox designed to speak the many languages of chemical
data.
- MayaChemTools
- Collection of Perl and Python scripts, modules, and classes that
support day-to-day computational discovery needs.
- Packmol -
Initial configurations for molecular dynamics simulations by packing
optimization.
- BCL::Commons
### Docking
- AutoDock Vina - Molecular
docking and virtual screening.
- smina -
Fork of AutoDock Vina customized to
better support scoring function development and high-performance energy
minimization.
### Virtual Machine
- myChEMBL
- A version of ChEMBL built using open-source software (Ubuntu,
PostgreSQL, RDKit).
- 3D e-Chem
Virtual Machine - Virtual machine with all software and sample data
to run 3D-e-Chem KNIME workflows.
## Libraries
### General Purpose
- RDKit - Collection of
cheminformatics and machine-learning software written in C++ and
Python.
- Indigo - Universal
molecular toolkit that can be used for molecular fingerprinting,
substructure search, and molecular visualization. Written in C++,
with Java, C#, and Python wrappers.
- CDK (Chemistry
Development Kit) - Algorithms for structural chemo- and
bioinformatics, implemented in Java.
- ChemmineR
- Cheminformatics package for analyzing drug-like small molecule data in
R.
- ChemPy - A Python
package useful for chemistry (mainly physical/inorganic/analytical
chemistry)
- MolecularGraph.jl
- A graph-based molecule modeling and chemoinformatics analysis toolkit
fully implemented in Julia
- datamol -
Molecular manipulation made easy. A light wrapper built on top of
RDKit.
- CGRtools -
Toolkit for processing molecules, reactions and condensed graphs of
reactions. Can be used for chemical standardization, MCS search, and
tautomer generation, with backward compatibility with RDKit and
NetworkX.
### Format Checking
### Visualization
- Kekule.js -
Front-end JavaScript library for providing the ability to represent,
draw, edit, compare and search molecule structures on web browsers.
- 3Dmol.js - An
object-oriented, WebGL-based JavaScript library for online molecular
visualization.
- JChemPaint -
Chemical 2D structure editor application/applet based on the Chemistry Development
Kit.
- rdeditor - Simple
RDKit molecule editor GUI using PySide.
- nglviewer -
Interactive molecular graphics for Jupyter notebooks.
- RDKit.js -
Official JavaScript distribution of cheminformatics functionality from
the RDKit - a C++ library for cheminformatics.
### Molecular Descriptors
- mordred
- Molecular descriptor calculator based on RDKit.
- DescriptaStorus
- Descriptor computation (chemistry) and (optional) storage for machine
learning.
- mol2vec - Vector
representations of molecular substructures.
- Align-it
- Align molecules according to their pharmacophores.
- Rcpi - R/Bioconductor
package to generate various descriptors of proteins, compounds and their
interactions.
### Machine Learning
- DeepChem - Deep
learning library for chemistry based on TensorFlow.
- Chemprop -
Directed message passing neural networks for property prediction of
molecules and reactions with uncertainty and interpretation.
- ChemML - Machine
learning and informatics program suite for the analysis,
mining, and modeling of chemical and materials data (based on
TensorFlow).
- olorenchemengine
- Molecular property prediction with unified API for diverse models and
representations, with integrated uncertainty quantification,
interpretability, and hyperparameter/architecture tuning.
- OpenChem -
Deep learning toolkit for computational chemistry with a
PyTorch backend.
- DGL-LifeSci -
DGL-based package for
various applications in life science with graph neural networks.
- chainer-chemistry
- A Library for Deep Learning in Biology and Chemistry.
- pytorch-geometric
- A PyTorch library that provides implementations of many graph convolution
algorithms.
- chemmodlab - A
Cheminformatics Modeling Laboratory for Fitting and Assessing Machine
Learning Models in R.
- Summit
- A Python package for optimizing chemical reactions using machine
learning (contains 10 algorithms + several benchmarks).
### Web APIs
### Databases
### Docking
- Rosetta - A
comprehensive software suite for modeling macromolecular structures.
Used largely for protein-protein docking.
- DOCKSTRING -
Automates and standardizes ligand preparation for AutoDock Vina.
### Molecular Dynamics
- Gromacs - Molecular dynamics
package mainly designed for simulations of proteins, lipids and nucleic
acids.
- OpenMM - High performance toolkit
for molecular simulation including extensive language bindings for
Python, C, C++, and even Fortran.
- NAMD - a
parallel molecular dynamics code designed for high-performance
simulation of large biomolecular systems.
- MDTraj - Analysis of
molecular dynamics trajectories.
- cclib - Parsers and
algorithms for computational chemistry logfiles.
- ProDy - A Python
package for protein dynamics analysis
### Others
- eiR - Accelerated
similarity searching of small molecules
- OPSIN - Open Parser
for Systematic IUPAC nomenclature
- Cookiecutter
for Computational Molecular Sciences - Python-centric Cookiecutter
for Molecular Computational Chemistry Packages by MolSSI.
- Auto-QChem
- an automated workflow for the generation and storage of DFT
calculations for organic molecules.
- Gypsum-DL
- a program for converting 2D SMILES strings to 3D models.
- RDchiral -
Wrapper for RDKit’s RunReactants to improve stereochemistry
handling
- confgen -
Webapp for generating conformers
## Journals
## Resources
### Courses
### Blogs
- Open
Source Molecular Modeling - Updateable catalog of open source
molecular modeling software.
- PubChem Blog -
News, updates and tutorials about PubChem.
- The ChEMBL-og blog -
Stories and news from Computational Chemical Biology Group at EMBL-EBI.
- ChEMBL blog - ChEMBL on
GitHub.
- SteinBlog
- Blog of Christoph
Steinbeck, who is the head of cheminformatics and metabolism at the
EMBL-EBI.
- Practical
Cheminformatics - Blog with in-depth examples of practical
application of cheminformatics.
- So much to do, so little time -
Trying to squeeze sense out of chemical data - Blog of Rajarshi Guha, who is a
research scientist at NIH Center for Advancing Translational Science.
  - Some old blogs 1 2.
- Noel O’Blog - Blog of
Noel O’Boyle, who is a
Senior Software Engineer at NextMove Software.
- chem-bla-ics - Blog
of Egon Willighagen, who is an
assistant professor at Maastricht University.
- steeveslab-blog - Some
examples using RDKit.
- Macs in Chemistry - A
resource for chemists using Apple Macintosh computers.
- DrugDiscovery.NET - Blog
of Andreas Bender, who is a
Reader for Molecular Informatics at University of Cambridge.
- Is life worth
living? - Some examples for cheminformatics libraries.
- Cheminformatics 2.0 - Blog of
Alex M. Clark, a research
scientist at Collaborative Drug Discovery.
- Depth-First - Blog of Richard L. Apodaca, a chemist
living in La Jolla, California.
- Cheminformania - Blog
of Esben Jannik Bjerrum, Ph.D.,
who is a Principal Scientist and a machine
learning and AI specialist at AstraZeneca.
### Books
## See Also
## License
