Awesome H2O


Below is a curated list of all the awesome projects, applications,
research, tutorials, courses and books that use H2O, an open source,
distributed machine learning platform. H2O offers parallelized
implementations of many supervised and unsupervised machine learning
algorithms such as Generalized Linear Models, Gradient Boosting Machines
(including XGBoost), Random Forests, Deep Neural Networks (Deep
Learning), Stacked Ensembles, Naive Bayes, Cox Proportional Hazards,
K-means, PCA, and Word2Vec, as well as a fully automatic machine learning
algorithm (AutoML).
H2O.ai produces many tutorials, blog posts, presentations and videos about H2O, but the
list below consists of awesome content produced by the greater H2O
user community.
We are just getting started with this list, so pull requests are very
much appreciated! Please review the contribution guidelines before making a pull
request. If you're not a GitHub user and want to make a contribution,
please send an email to community@h2o.ai.
If you think H2O is awesome too, please star the H2O GitHub repository.
Blog Posts & Tutorials
- Using
H2O AutoML to simplify training process (and also predict wine
quality) Aug 4, 2020
- Visualizing ML Models with
LIME
- Parallel
Grid Search in H2O Jan 17, 2020
- Importing,
Inspecting and Scoring with MOJO models inside H2O Dec 10, 2019
- Artificial
Intelligence Made Easy with H2O.ai: A Comprehensive Guide to Modeling
with H2O.ai and AutoML in Python June 12, 2019
- Anomaly
Detection With Isolation Forests Using H2O Dec 03, 2018
- Predicting
residential property prices in Bratislava using recipes - H2O Machine
learning Nov 25, 2018
- Inspecting
Decision Trees in H2O Nov 07, 2018
- Gentle
Introduction to AutoML from H2O.ai Sep 13, 2018
- Machine
Learning With H2O - Hands-On Guide for Data Scientists Jun 27,
2018
- Using
machine learning with LIME to understand employee churn June 25,
2018
- Analytics at Scale:
h2o, Apache Spark and R on AWS EMR June 21, 2018
- Automated
and unmysterious machine learning in cancer detection Nov 7,
2017
- Time
series machine learning with h2o+timetk Oct 28, 2017
- Sales
Analytics: How to use machine learning to predict and optimize product
backorders Oct 16, 2017
- HR
Analytics: Using machine learning to predict employee turnover Sep
18, 2017
- Autoencoders
and anomaly detection with machine learning in fraud analytics May
1, 2017
- Building
deep neural nets with h2o and rsparkling that predict arrhythmia of the
heart Feb 27, 2017
- Predicting
food preferences with sparklyr (machine learning) Feb 19, 2017
- Moving
largish data from R to H2O - spam detection with Enron emails Feb
18, 2016
- Deep
learning & parameter tuning with mxnet, h2o package in R Jan 30,
2017
Books
- Big
data in psychiatry and neurology, Chapter 11: A scalable medication
intake monitoring system Diane Myung-Kyung Woodbridge and Kevin
Bengtson Wong. (2021)
- Hands
on Time Series with R Rami Krispin. (2019)
- Mastering
Machine Learning with Spark 2.x Alex Tellez, Max Pumperla, Michal
Malohlava. (2017)
- Machine
Learning Using R Karthik Ramasubramanian, Abhishek Singh.
(2016)
- Practical
Machine Learning with H2O: Powerful, Scalable Techniques for Deep
Learning and AI Darren Cook. (2016)
- Disruptive
Analytics Thomas Dinsmore. (2016)
- Computer Age
Statistical Inference: Algorithms, Evidence, and Data Science
Bradley Efron, Trevor Hastie. (2016)
- R
Deep Learning Essentials Joshua F. Wiley. (2016)
- Spark in
Action Petar Zečević, Marko Bonaći. (2016)
- Handbook
of Big Data Peter Bühlmann, Petros Drineas, Michael Kane, Mark J.
van der Laan. (2015)
Research Papers
- Automated
machine learning: AI-driven decision making in business analytics
Marc Schmitt. (2023)
- Water-Quality
Prediction Based on H2O AutoML and Explainable AI Techniques Hamza
Ahmad Madni, Muhammad Umer, Abid Ishaq, Nihal Abuzinadah, Oumaima
Saidani, Shtwai Alsubai, Monia Hamdi, Imran Ashraf. (2023)
- Which
model to choose? Performance comparison of statistical and machine
learning models in predicting PM2.5 from high-resolution satellite
aerosol optical depth Padmavati Kulkarni, V. Sreekanth, Adithi
R. Upadhya, Hrishikesh Chandra Gautam. (2022)
- Prospective
validation of a transcriptomic severity classifier among patients with
suspected acute infection and sepsis in the emergency department Noa
Galtung, Eva Diehl-Wiesenecker, Dana Lehmann, Natallia Markmann, Wilma H
Bergström, James Wacker, Oliver Liesenfeld, Michael Mayhew, Ljubomir
Buturovic, Roland Luethy, Timothy E Sweeney, Rudolf Tauber, Kai
Kappert, Rajan Somasundaram, Wolfgang Bauer. (2022)
- Depression Level Prediction in
People with Parkinson's Disease during the COVID-19 Pandemic
Hashneet Kaur, Patrick Ka-Cheong Poon, Sophie Yuefei Wang, Diane
Myung-kyung Woodbridge. (2021)
- Machine Learning-based Meal
Detection Using Continuous Glucose Monitoring on Healthy Participants:
An Objective Measure of Participant Compliance to Protocol Victor
Palacios, Diane Myung-kyung Woodbridge, Jean L. Fry. (2021)
- Maturity
of gray matter structures and white matter connectomes, and their
relationship with psychiatric symptoms in youth Alex Luna, Joel
Bernanke, Kakyeong Kim, Natalie Aw, Jordan D. Dworkin, Jiook Cha,
Jonathan Posner. (2021)
- Appendectomy
during the COVID-19 pandemic in Italy: a multicenter ambispective cohort
study by the Italian Society of Endoscopic Surgery and new technologies
(the CRAC study) Alberto Sartori, Mauro Podda, Emanuele Botteri,
Roberto Passera, Ferdinando Agresta, Alberto Arezzo. (2021)
- Forecasting
Canadian GDP Growth with Machine Learning Shafiullah Qureshi, Ba
Chu, Fanny S. Demers. (2021)
- Morphological
traits of reef corals predict extinction risk but not conservation
status Nussaïbah B. Raja, Andreas Lauchstedt, John M. Pandolfi, Sun
W. Kim, Ann F. Budd, Wolfgang Kiessling. (2021)
- Machine
Learning as a Tool for Improved Housing Price Prediction Henrik I W.
Wolstad and Didrik Dewan. (2020)
- Citizen
Science Data Show Temperature-Driven Declines in Riverine Sentinel
Invertebrates Timothy J. Maguire, Scott O. C. Mundle. (2020)
- Predicting
Risk of Delays in Postal Deliveries with Neural Networks and Gradient
Boosting Machines Matilda Söderholm. (2020)
- Stock
Market Analysis using Stacked Ensemble Learning Method Malkar Takle.
(2020)
- H2O
AutoML: Scalable Automatic Machine Learning. Erin LeDell, Sebastien
Poirier. (2020)
- Single-cell
mass cytometry on peripheral blood identifies immune cell subsets
associated with primary biliary cholangitis Jin Sung Jang, Brian D.
Juran, Kevin Y. Cunningham, Vinod K. Gupta, Young Min Son, Ju Dong Yang,
Ahmad H. Ali, Elizabeth Ann L. Enninga, Jaeyun Sung, Konstantinos
N. Lazaridis. (2020)
- Prediction
of the functional impact of missense variants in BRCA1 and BRCA2 with
BRCA-ML Steven N. Hart, Eric C. Polley, Hermella Shimelis,
Siddhartha Yadav, Fergus J. Couch. (2020)
- Innovative deep
learning artificial intelligence applications for predicting
relationships between individual tree height and diameter at breast
height İlker Ercanlı. (2020)
- An
Open Source AutoML Benchmark Peter Gijsbers, Erin LeDell, Sebastien
Poirier, Janek Thomas, Berndt Bischl, Joaquin Vanschoren. (2019)
- Machine Learning in
Python: Main developments and technology trends in data science, machine
learning, and artificial intelligence Sebastian Raschka, Joshua
Patterson, Corey Nolet. (2019)
- Human
actions recognition in video scenes from multiple camera viewpoints
Fernando Itano, Ricardo Pires, Miguel Angelo de Abreu de Sousa, Emilio
Del-Moral-Hernandez. (2019)
- Extending
MLP ANN hyper-parameters Optimization by using Genetic Algorithm
Fernando Itano, Miguel Angelo de Abreu de Sousa, Emilio
Del-Moral-Hernandez. (2018)
- askMUSIC:
Leveraging a Clinical Registry to Develop a New Machine Learning Model
to Inform Patients of Prostate Cancer Treatments Chosen by Similar
Men Gregory B. Auffenberg, Khurshid R. Ghani, Shreyas Ramani, Etiowo
Usoro, Brian Denton, Craig Rogers, Benjamin Stockton, David C. Miller,
Karandeep Singh. (2018)
- Machine
Learning Methods to Perform Pricing Optimization. A Comparison with
Standard GLMs Giorgio Alfredo Spedicato, Christophe Dutang, and
Leonardo Petrini. (2018)
- Comparative Performance
Analysis of Neural Networks Architectures on H2O Platform for Various
Activation Functions Yuriy Kochura, Sergii Stirenko, Yuri Gordienko.
(2017)
- Algorithmic
trading using deep neural networks on high frequency data Andrés
Arévalo, Jaime Niño, German Hernandez, Javier Sandoval, Diego León,
Arbey Aragón. (2017)
- Generic online
animal activity recognition on collar tags Jacob W. Kamminga, Helena
C. Bisby, Duc V. Le, Nirvana Meratnia, Paul J. M. Havinga. (2017)
- Soil
nutrient maps of Sub-Saharan Africa: assessment of soil nutrient content
at 250 m spatial resolution using machine learning Tomislav Hengl,
Johan G. B. Leenaars, Keith D. Shepherd, Markus G. Walsh, Gerard B. M.
Heuvelink, Tekalign Mamo, Helina Tilahun, Ezra Berkhout, Matthew Cooper,
Eric Fegraus, Ichsani Wheeler, Nketia A. Kwabena. (2017)
- Robust and flexible
estimation of data-dependent stochastic mediation effects: a proposed
method and example in a randomized trial setting Kara E. Rudolph,
Oleg Sofrygin, Wenjing Zheng, and Mark J. van der Laan. (2017)
- Automated versus
do-it-yourself methods for causal inference: Lessons learned from a data
analysis competition Vincent Dorie, Jennifer Hill, Uri Shalit, Marc
Scott, Dan Cervone. (2017)
- Using
deep learning to predict the mortality of leukemia patients Reena
Shaw Muthalaly. (2017)
- Use
of a machine learning framework to predict substance use disorder
treatment success Laura Acion, Diana Kelmansky, Mark van der Laan,
Ethan Sahker, DeShauna Jones, Stephan Arndt. (2017)
- Ultra-wideband
antenna-induced error prediction using deep learning on channel response
data Janis Tiemann, Johannes Pillmann, Christian Wietfeld.
(2017)
- Inferring
passenger types from commuter eigentravel matrices Erika Fille T.
Legara, Christopher P. Monterola. (2017)
- Deep
neural networks, gradient-boosted trees, random forests: Statistical
arbitrage on the S&P 500 Christopher Krauss, Xuan Anh Do,
Nicolas Huck. (2016)
- Identifying
IT purchases anomalies in the Brazilian government procurement system
using deep learning Silvio L. Domingos, Rommel N. Carvalho, Ricardo
S. Carvalho, Guilherme N. Ramos. (2016)
- Predicting
recovery of credit operations on a Brazilian bank Rogério G. Lopes,
Rommel N. Carvalho, Marcelo Ladeira, Ricardo S. Carvalho. (2016)
- Deep
learning anomaly detection as support fraud investigation in Brazilian
exports and anti-money laundering Ebberth L. Paula, Marcelo Ladeira,
Rommel N. Carvalho, Thiago Marzagão. (2016)
- Deep learning and
association rule mining for predicting drug response in cancer
Konstantinos N. Vougas, Thomas Jackson, Alexander Polyzos, Michael
Liontos, Elizabeth O. Johnson, Vassilis Georgoulias, Paul Townsend, Jiri
Bartek, Vassilis G. Gorgoulis. (2016)
- The
value of points of interest information in predicting cost-effective
charging infrastructure locations Stéphanie Florence Visser.
(2016)
- Adaptive
modelling of spatial diversification of soil classification units.
Journal of Water and Land Development Krzysztof Urbański, Stanisław
Gruszczyński. (2016)
- Scalable
ensemble learning and computationally efficient variance estimation
Erin LeDell. (2015)
- Superchords:
decoding EEG signals in the millisecond range Rogerio Normand, Hugo
Alexandre Ferreira. (2015)
- Understanding random
forests: from theory to practice Gilles Louppe. (2014)
Benchmarks
Presentations
Courses
Software
License

To the extent possible under law, H2O.ai
has waived all copyright and related or neighboring rights to this
work.