<a href="https://krzjoa.github.io/awesome-python-data-science/"><img width="250" height="250" src="img/py-datascience.png" alt="pyds"></a>
<br>
<br>
<br>

# Awesome Python Data Science


Probably the best curated list of data science software in Python

Contents

Machine Learning

General Purpose Machine Learning

Gradient Boosting

Ensemble Methods

Imbalanced Datasets

Random Forests

Kernel Methods

Deep Learning

PyTorch

TensorFlow

JAX

Others

Automated Machine Learning

Natural Language Processing

Computer Audition

Computer Vision

Time Series

Reinforcement Learning

Graph Machine Learning

Learning-to-Rank & Recommender Systems

Probabilistic Graphical Models

Probabilistic Methods

Model Explanation

Genetic Programming

## Optimization

* Optuna - A hyperparameter optimization framework.
* pymoo - Multi-objective optimization in Python.
* pycma - Python implementation of CMA-ES.
* Spearmint - Bayesian optimization.
* BoTorch - Bayesian optimization in PyTorch. (PyTorch based/compatible)
* scikit-opt - Heuristic algorithms for optimization.
* sklearn-genetic-opt - Hyperparameter tuning and feature selection using evolutionary algorithms. (sklearn)
* SMAC3 - Sequential Model-based Algorithm Configuration.
* Optunity - A library containing various optimizers for hyperparameter tuning.
* hyperopt - Distributed asynchronous hyperparameter optimization in Python.
* hyperopt-sklearn - Hyperparameter optimization for sklearn. (sklearn)
* sklearn-deap - Use evolutionary algorithms instead of grid search in scikit-learn. (sklearn)
* sigopt_sklearn - SigOpt wrappers for scikit-learn methods. (sklearn)
* Bayesian Optimization - A Python implementation of global optimization with Gaussian processes.
* SafeOpt - Safe Bayesian optimization.
* scikit-optimize - Sequential model-based optimization with a scipy.optimize interface.
* Solid - A comprehensive gradient-free optimization framework written in Python.
* PySwarms - A research toolkit for particle swarm optimization in Python.
* Platypus - A free and open source Python library for multiobjective optimization.
* GPflowOpt - Bayesian optimization using GPflow. (sklearn)
* POT - Python Optimal Transport library.
* Talos - Hyperparameter optimization for Keras models.
* nlopt - Library for nonlinear optimization (global and local, constrained or unconstrained).
* OR-Tools - An open-source software suite for optimization by Google; provides a unified programming interface to half a dozen solvers: SCIP, GLPK, GLOP, CP-SAT, CPLEX, and Gurobi.
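
Many of the hyperparameter tuners listed in this section automate some variant of one loop: propose parameters, evaluate an objective, keep the best result. The sketch below illustrates that loop with plain random search using only the standard library. It does not use the API of any listed package; `random_search`, `objective`, and the bounds are hypothetical names and values chosen for illustration.

```python
import random


def random_search(objective, bounds, n_trials=500, seed=0):
    """Minimize `objective` by sampling each parameter uniformly within `bounds`.

    `bounds` maps a parameter name to a (low, high) tuple.
    Returns the best parameter dict found and its objective value.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_params, best_value = None, float("inf")
    for _ in range(n_trials):
        # Propose a candidate by sampling every parameter independently.
        params = {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        value = objective(params)
        # Keep the candidate only if it improves on the best seen so far.
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value


def objective(params):
    # Hypothetical objective: a quadratic bowl with its minimum at x=2, y=-1.
    return (params["x"] - 2.0) ** 2 + (params["y"] + 1.0) ** 2


bounds = {"x": (-5.0, 5.0), "y": (-5.0, 5.0)}
best_params, best_value = random_search(objective, bounds)
print(best_params, best_value)
```

The listed libraries generally replace the uniform sampling step with something smarter, such as Bayesian, evolutionary, or swarm-based proposals, while the surrounding loop stays recognizably the same.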

Feature Engineering

General

Feature Selection

Visualization

General Purposes

NLP

Deployment

Statistics

Data Manipulation

Data Frames

Pipelines

Data-centric AI

Synthetic Data

Distributed Computing

Experimentation

Data Validation

Evaluation

Computations

Web Scraping

Spatial Analysis

Quantum Computing

Conversion

## Contributing

Contributions are welcome! :sunglasses:
Read the contribution guideline.

## License

This work is licensed under the Creative Commons Attribution 4.0 International License - CC BY 4.0

pythondatascience.md Github