
Materials informatics is a young discipline at the junction of materials
science, computer science, and data science. It aims to unite the currently
competing physics-intensive and data-intensive efforts in what is the most
impactful applied science, one that transformed our society in the 20th century.
Contributions are very welcome - please follow the guidelines.
Software and products
- AFLOW -
High-Throughput ab-initio Computing (C++).
- AiiDA - Automated Infrastructure and
Database for Ab-initio design (Python).

- ASE - Atomic Simulation
Environment (Python).

- ASR - Atomic Simulation
Recipes, based on ASE (Python).

- atomate -
Materials science workflows based on FireWorks, developed at LBNL
(Python).

- aviary - Predict
materials properties using compositions and Wyckoff representations
(Python).

- BIOVIA
Materials Studio - Proprietary simulation
infrastructure.
- CAMD - Agent-based
sequential learning software for materials discovery
(Python).

- cclib - Parse and interpret
the results of computational chemistry packages
(Python).

- cctbx - Computational
Crystallography Toolbox (C++).

- CDVAE -
Crystal Diffusion Variational AutoEncoder, generates novel stable
materials via inverse design (Python).

- CrabNet -
Predict materials properties using only the composition information
(Python).

- Crystal Toolkit - A
framework for building web apps for materials science powering the new
Materials Project website.

- Custodian -
Simple, robust and flexible just-in-time (JIT) job management framework
(Python).

- datamol -
Molecular Manipulation Made Easy. A light wrapper built on top of RDKit
(Python).

- ElMD - Quantify the
chemical similarity between two compositions using the Element Movers
Distance.

- FireWorks - Workflow engine developed at LBNL (Python).

- Granta MI -
Proprietary enterprise infrastructure for materials data.
- Grobid superconductors - Open source Grobid module for
extracting superconductor materials and related properties.
- httk - High-throughput
toolkit (Python).

- ICMD - A digital
materials design platform in the cloud from QuesTek Innovations LLC
(proprietary).
- ioChem-BD - Solution to
manage computational chemistry Big Data (Java).
- MAST-ML - An
open-source Python package designed to broaden and accelerate the use of
machine learning in materials science research
(Python).

- matador - A library
for aggregation and analysis of high-throughput DFT
(Python).

- matbench - Benchmarks for materials science property prediction
(Python).

- matbench-genmetrics - Generative materials benchmarking metrics,
inspired by guacamol and CDVAE (Python).

- matminer - A library for data mining in materials science
(Python).

- MatSciBERT - A Materials Domain Language Model for Text Mining and
Information Extraction (Python).

- mat_discover -
Find high-performance candidates in chemical spaces, composition-only
(Python).

- MDCS - Materials
Data Curation System (Python).

- MedeA -
Proprietary computational Tcl environment by
Materials Design, Inc.
- MODNet - Select
optimal descriptions and build models for predicting materials
properties (Python).

- mp-time-split -
Use time-based cross-validation splits from Materials Project for
generative modeling benchmarking (Python).

- NOMAD Oasis - Web-based software to manage and share materials data
(Python/JavaScript).

- OACIS - Job
management software for simulation studies using a Ruby on
Rails webserver.

- optimade-python-tools - Tools for OPTIMADE APIs in
Python.

- piro - Software for
evaluating Pareto-optimal synthesis pathways (Python).

- pyiron - Integrated
development environment (IDE) for computational materials science
(Python).

- pymatflow -
Toolbox for (high-throughput) DFT modeling of materials
(Python).

- Pymatgen - A robust, open-source
Python library for materials analysis.

- Pymatviz - A
toolkit for visualizations in materials informatics.

- pymks - Materials Knowledge System
(Python).

- QMForge -
Python framework and GUI for analyzing results of
quantum chemistry codes.
- QMflows -
Python library for input generation and task handling
in computational chemistry.

- qmpy -
Python backend creating and running the Open Quantum
Materials Database.

- quacc -
Python platform for high-throughput, database-driven
computational materials science and quantum chemistry.

- RDKit - A collection of
cheminformatics and machine-learning software written in
C++ and Python.

- SEAMM - Simulation
Environment for Atomistic and Molecular Modeling
(Python).

- SuperCon2 - A
user interface for curating superconductor materials and properties
extracted by grobid-superconductors.
- SLAMD - An
open source web app for data-driven acceleration of cement and concrete
development through digital lab twin and AI optimization
(Python/JavaScript).

- tilde -
Python framework for ab initio data repositories.

- XenonPy - A
Python library that implements a comprehensive set of machine learning
tools for materials informatics.

- xtal2png -
Python package for invertibly representing crystal
structures as PNG images for screening state-of-the-art image-processing
generative models.

- Absolidix - An early preview of
the on-demand cloud simulations of materials from MPDS
(PAULING FILE) with AiiDA framework.
- AiiDAlab - Web
platform & GUI for AiiDA in the Cloud (cf.
AiiDA framework).
- Ångström AI - Accelerating
molecular simulation using generative AI (California, USA).
- Atomic Tessellator -
Computational chemistry cloud and AI lab from New Zealand.
- Azure Quantum Elements -
Microsoft's quantum computing offering, including generative chemistry and
accelerated DFT.
- Compular - New materials
development cloud from Sweden.
- CuspAI - Combat Climate Change
with AI-Designed Materials (Cambridge, UK).
- Dunia Innovations - A Berlin-based
materials discovery startup (Germany).
- Entalpic - AI-driven company for
discovering new chemical processes and materials (France).
- LMDS - The Liverpool
materials discovery server hosts computational tools to help
experimental chemists search for new materials.
- Mat3ra - Materials Modeling 2.0
(cloud engine from Silicon Valley).

- MatCloud - Cloud-based
computational infrastructure of the Chinese Materials Genome Project
(China).
- Materials Square - Ab
initio and CALPHAD simulations cloud (South Korea).
- Matlantis - Accelerated
materials discovery platform (Japan).

- Orbital Materials -
Advanced materials, made with AI (UK).
- Periodic Labs - A new
materials AI startup from OpenAI and Google DeepMind (USA and UK).
- Radical AI - Accelerating
materials R&D (New York, USA).
- Quantistry Lab -
Cloud-based simulations of syntheses, designing novel materials,
computational chemistry (Germany).
- SIT Rolos - Research platform for
materials from Schaffhausen Institute of Technology (Switzerland).
Machine-readable materials datasets
- AFLOW - Flow for Materials
Discovery repository (cf. AFLOW
framework).
- ATB - Automated Topology Builder
and Repository.
- AtomWork and AtomWork-Adv - Data platform
of NIMS, Japan (based on the PAULING FILE experimental database).
- Baikov Institute of Metallurgy and
Materials Science - Databases of Russian Academy of Sciences.
- Carolina Materials
Database - An ML-DFT database of the University of South
Carolina.
- CascadesDB - Molecular dynamics
simulations of collision cascades, by the International Atomic Energy
Agency.
- Catalysis Hub -
Web-platform for sharing data and software for computational catalysis
research.
- cccbdb - Computational
Chemistry Comparison and Benchmark Database.
- CCDC - Cambridge
Crystallographic Data Centre (partly proprietary).
- Citrination - AI-Powered
Materials Data Platform (partly proprietary).
- CMR - Computational
Materials Repository (cf. ASE framework).
- COD - Crystallography
Open Database (including theoretical database).
- ESP - Electronic
Structure Project.
- HybriD3 Materials
Database - A comprehensive collection of experimental and
computational materials data for crystalline organic-inorganic
compounds.
- ICSD -
Inorganic Crystal Structure Database (partly proprietary).
- JARVIS - Joint Automated
Repository for Various Integrated Simulations (NIST).
- Khazana - Repository for
data created in atomistic simulations, features also the polymer
genome.
- Materials Cloud - A
Platform for Open Materials Science (cf. AiiDA
framework).
- Materials Genome Engineering
Databases of China - National integration platform (cf.
MatCloud).
- MaterialsMine - An
open-source repository for nanocomposite data (NanoMine) and mechanical
metamaterials data (MetaMine).
- Materials Project -
Computed information on known and predicted materials (cf.
Pymatgen framework).
- MDF - Materials Data
Facility, a set of data services built specifically to support materials
science researchers.
- MolSSI - The MolSSI
Quantum Chemistry Archive.
- MPDS - Materials Platform for Data
Science (based on the PAULING FILE experimental database, partly
proprietary).
- MPOD - Material Properties
Open Database.
- MSE - Test Set for
Materials Science and Engineering.
- nanoHUB - Place for
computational nanotechnology research, education, and
collaboration.
- NOMAD - Novel Materials
Discovery, Repository, and Laboratory (cf. NOMAD
Oasis).
- NREL MatDB - Computational
database of thermochemical and electronic properties of materials for
renewable energy applications.
- Organic Materials Database -
Electronic structure database for 3-dimensional organic crystals
(Nordita).
- Open Materials Database -
Materials-genome-type repository from ab initio calculations
(cf. httk framework).
- OpenKIM - Repository of
interatomic potential implementations and computational protocols for
testing them.
- OQMD - Open Quantum Materials Database
(cf. qmpy framework).
- Phonon database at Kyoto
University - Computational phonon band structures, density of states
and thermal properties.
- Pitt Quantum Repository -
Molecular properties predicted from quantum mechanics.
- ROD - Raman
Open Database.
- SuperMat - A
dataset of superconductor materials.
- Topological
Materials Database - A Complete Catalogue of High-Quality
Topological Materials.
Standardization initiatives
- Blue Obelisk - Movement
for open data, open source and open standards in chemistry and materials
science (by Murray-Rust).
- CIF -
Crystallographic Information File, a standard for crystallographic
information (by IUCr, International Union of Crystallography).
- CML - Chemical Markup Language:
molecules, compounds, reactions, spectra, crystals etc. (by
Murray-Rust).
- ColabFit - Collaborative
infrastructure for the development and distribution of state-of-the-art
data-driven interatomic potentials (DDIPs).
- EMMO - European
Materials Modelling Ontology.
- ESCDF - Electronic Structure Common Data Format.
- ESSE - Exabyte
Source of Schemas and Examples designed for digital materials
science.
- GEMD -
Graphical Expression of Materials Data (by Citrine), supersedes
PIF.
- JCAMP-DX - Electronic data
standards for chemical and spectroscopy information (by IUPAC).
- KIM API - API standard
for connecting molecular simulation codes with interatomic models.
- NOMAD Meta Info - Schema for storing results of ab initio and
force-field atomistic simulations (by NOMAD Laboratory).
- OPTIMADE - Open Databases
Integration for Materials Design, a REST API standard for exchanging
materials information.
- PIF - Physical Information File schema (by Citrine), superseded by
GEMD.
- Semantic Assets for
Materials Science - Task group within the vocabulary
services interest group of the Research Data Alliance.
- Open Force
Field Toolkit - Specification for encoding molecular mechanics force
fields (by Open Force Field
Initiative).
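
To make the OPTIMADE entry above more concrete, here is a minimal sketch of how a query against an OPTIMADE-compliant database might be assembled. The base URL is a placeholder and the helper name `build_structures_query` is hypothetical; the `/v1/structures` endpoint, the `filter`, `response_fields` and `page_limit` parameters, and the `elements HAS ALL` filter syntax are taken from the OPTIMADE specification as we understand it. The code only builds the URL and sends no request.

```python
from urllib.parse import urlencode

# Placeholder base URL (an assumption): substitute a real OPTIMADE provider.
BASE_URL = "https://example.org/optimade"


def build_structures_query(base_url, elements, nelements=None, page_limit=10):
    """Build a /v1/structures URL that filters entries by chemical elements."""
    # OPTIMADE filters quote string values, e.g. elements HAS ALL "Si", "O"
    quoted = ", ".join(f'"{el}"' for el in elements)
    filter_parts = [f"elements HAS ALL {quoted}"]
    if nelements is not None:
        filter_parts.append(f"nelements={nelements}")
    params = {
        "filter": " AND ".join(filter_parts),
        "response_fields": "chemical_formula_reduced,nelements",
        "page_limit": page_limit,
    }
    # urlencode percent-escapes quotes, commas and spaces in the filter
    return f"{base_url.rstrip('/')}/v1/structures?{urlencode(params)}"


# Binary compounds containing both Si and O
url = build_structures_query(BASE_URL, ["Si", "O"], nelements=2)
print(url)
```

Because every compliant provider is expected to expose the same endpoints and filter grammar, the same URL-building logic should in principle work across several of the databases listed above; optimade-python-tools offers a fuller client for real use.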
Similar compilations
License
