<div data-align="center">
<img src="./assets/head.jpg">
</div>
<h1 id="awesome-data-science">AWESOME DATA SCIENCE</h1>
<p><a href="https://github.com/sindresorhus/awesome"><img
src="https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg"
alt="Awesome" /></a></p>
<p><strong>An open-source Data Science repository to learn and apply
towards solving real-world problems.</strong></p>
<p>This is a shortcut path to start studying <strong>Data
Science</strong>. Just follow the steps to answer the questions, “What
is Data Science and what should I study to learn Data Science?”</p>
<h2 id="sponsors">Sponsors</h2>
<table>
<thead>
<tr class="header">
<th>Sponsor</th>
<th>Pitch</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td></td>
<td>Be the first to sponsor! <code>github@academic.io</code></td>
</tr>
</tbody>
</table>
<p><br></p>
<h2 id="table-of-contents">Table of Contents</h2>
<ul>
<li><a href="#what-is-data-science">What is Data Science?</a></li>
<li><a href="#where-do-i-start">Where do I Start?</a></li>
<li><a href="#training-resources">Training Resources</a>
<ul>
<li><a href="#tutorials">Tutorials</a></li>
<li><a href="#free-courses">Free Courses</a></li>
<li><a href="#moocs">Massive Open Online Courses</a></li>
<li><a href="#intensive-programs">Intensive Programs</a></li>
<li><a href="#colleges">Colleges</a></li>
</ul></li>
<li><a href="#the-data-science-toolbox">The Data Science Toolbox</a>
<ul>
<li><a href="#algorithms">Algorithms</a>
<ul>
<li><a href="#supervised-learning">Supervised Learning</a></li>
<li><a href="#unsupervised-learning">Unsupervised Learning</a></li>
<li><a href="#semi-supervised-learning">Semi-Supervised
Learning</a></li>
<li><a href="#reinforcement-learning">Reinforcement Learning</a></li>
<li><a href="#data-mining-algorithms">Data Mining Algorithms</a></li>
<li><a href="#deep-learning-architectures">Deep Learning
Architectures</a></li>
</ul></li>
<li><a href="#general-machine-learning-packages">General Machine
Learning Packages</a></li>
<li><a href="#deep-learning-packages">Deep Learning Packages</a>
<ul>
<li><a href="#pytorch-ecosystem">PyTorch Ecosystem</a></li>
<li><a href="#tensorflow-ecosystem">TensorFlow Ecosystem</a></li>
<li><a href="#keras-ecosystem">Keras Ecosystem</a></li>
</ul></li>
<li><a href="#visualization-tools">Visualization Tools</a></li>
<li><a href="#miscellaneous-tools">Miscellaneous Tools</a></li>
</ul></li>
<li><a href="#literature-and-media">Literature and Media</a>
<ul>
<li><a href="#books">Books</a>
<ul>
<li><a href="#book-deals-affiliated-">Book Deals (Affiliated)</a></li>
</ul></li>
<li><a href="#journals-publications-and-magazines">Journals,
Publications, and Magazines</a></li>
<li><a href="#newsletters">Newsletters</a></li>
<li><a href="#bloggers">Bloggers</a></li>
<li><a href="#presentations">Presentations</a></li>
<li><a href="#podcasts">Podcasts</a></li>
<li><a href="#youtube-videos--channels">YouTube Videos &amp;
Channels</a></li>
</ul></li>
<li><a href="#socialize">Socialize</a>
<ul>
<li><a href="#facebook-accounts">Facebook Accounts</a></li>
<li><a href="#twitter-accounts">Twitter Accounts</a></li>
<li><a href="#telegram-channels">Telegram Channels</a></li>
<li><a href="#slack-communities">Slack Communities</a></li>
<li><a href="#github-groups">GitHub Groups</a></li>
<li><a href="#data-science-competitions">Data Science
Competitions</a></li>
</ul></li>
<li><a href="#fun">Fun</a>
<ul>
<li><a href="#infographics">Infographics</a></li>
<li><a href="#datasets">Datasets</a></li>
<li><a href="#comics">Comics</a></li>
</ul></li>
<li><a href="#other-awesome-lists">Other Awesome Lists</a>
<ul>
<li><a href="#hobby">Hobby</a></li>
</ul></li>
</ul>
<h2 id="what-is-data-science">What is Data Science?</h2>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<p>Data Science is one of the hottest topics in computing today.
Organizations have been gathering data from applications and systems for
years, and now is the time to analyze it. The next steps are producing
suggestions from the data and making predictions about the future. <a
href="https://www.quora.com/Data-Science/What-is-data-science">Here</a>
you can find the central question about <strong>Data Science</strong> and
hundreds of answers from experts.</p>
<table>
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<thead>
<tr class="header">
<th>Link</th>
<th>Preview</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="https://www.oreilly.com/ideas/what-is-data-science">What is
Data Science @ Oreilly</a></td>
<td><em>Data scientists combine entrepreneurship with patience, the
willingness to build data products incrementally, the ability to
explore, and the ability to iterate over a solution. They are inherently
interdisciplinary. They can tackle all aspects of a problem, from
initial data collection and data conditioning to drawing conclusions.
They can think outside the box to come up with new ways to view the
problem, or to work with very broadly defined problems: “here's a lot of
data, what can you make from it?”</em></td>
</tr>
<tr class="even">
<td><a
href="https://www.quora.com/Data-Science/What-is-data-science">What is
Data Science @ Quora</a></td>
<td>Data Science combines several aspects of working with data, such as
technology, algorithm development, and data inference, to study data,
analyse it, and find innovative solutions to difficult problems. At its
core, Data Science is about analysing data and finding creative ways to
drive business growth.</td>
</tr>
<tr class="odd">
<td><a
href="https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century">The
Sexiest Job of the 21st Century</a></td>
<td><em>Data scientists today are akin to Wall Street “quants” of the
1980s and 1990s. In those days people with backgrounds in physics and
math streamed to investment banks and hedge funds, where they could
devise entirely new algorithms and data strategies. Then a variety of
universities developed master's programs in financial engineering, which
churned out a second generation of talent that was more accessible to
mainstream firms. The pattern was repeated later in the 1990s with
search engineers, whose rarefied skills soon came to be taught in
computer science programs.</em></td>
</tr>
<tr class="even">
<td><a
href="https://en.wikipedia.org/wiki/Data_science">Wikipedia</a></td>
<td><em>Data science is an interdisciplinary field that uses scientific
methods, processes, algorithms and systems to extract knowledge and
insights from many structural and unstructured data. Data science is
related to data mining, machine learning and big data.</em></td>
</tr>
<tr class="odd">
<td><a
href="https://www.mastersindatascience.org/careers/data-scientist/">How
to Become a Data Scientist</a></td>
<td><em>Data scientists are big data wranglers, gathering and analyzing
large sets of structured and unstructured data. A data scientist's role
combines computer science, statistics, and mathematics. They analyze,
process, and model data then interpret the results to create actionable
plans for companies and other organizations.</em></td>
</tr>
<tr class="even">
<td><a
href="https://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/">A
Very Short History of Data Science</a></td>
<td><em>The story of how data scientists became sexy is mostly the story
of the coupling of the mature discipline of statistics with a very young
one: computer science. The term “Data Science” has emerged only recently
to specifically designate a new profession that is expected to make
sense of the vast stores of big data. But making sense of data has a
long history and has been discussed by scientists, statisticians,
librarians, computer scientists and others for years. The following
timeline traces the evolution of the term “Data Science” and its use,
attempts to define it, and related terms.</em></td>
</tr>
<tr class="odd">
<td><a
href="https://www.rstudio.com/blog/software-development-resources-for-data-scientists/">Software
Development Resources for Data Scientists</a></td>
<td><em>Data scientists concentrate on making sense of data through
exploratory analysis, statistics, and models. Software developers apply
a separate set of knowledge with different tools. Although their focus
may seem unrelated, data science teams can benefit from adopting
software development best practices. Version control, automated testing,
and other dev skills help create reproducible, production-ready code and
tools.</em></td>
</tr>
</tbody>
</table>
<h2 id="where-do-i-start">Where do I Start?</h2>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<p>While not strictly necessary, knowing a programming language is an
important skill for working effectively as a data scientist. Currently,
the most popular language is <em>Python</em>, closely followed by
<em>R</em>. Python is a general-purpose scripting language used in a
wide variety of fields. R is a domain-specific language for statistics
that includes many common statistical tools out of the box.</p>
<p><a href="https://python.org/">Python</a> is by far the most popular
language in science, due in no small part to the ease with which it can
be used and its vibrant ecosystem of user-contributed packages. There
are two main ways to install packages: Pip (invoked as
<code>pip install</code>), the package manager that comes bundled with
Python, and <a href="https://www.anaconda.com">Anaconda</a> (invoked as
<code>conda install</code>), a powerful package manager that can install
packages for both Python and R and can also download executables such as
Git.</p>
<p>Unlike R, Python was not built from the ground up with data science
in mind, but there are plenty of third-party libraries to make up for
this. A much more exhaustive list of packages can be found later in this
document, but these four packages are a good set of choices to start
your data science journey with: <a
href="https://scikit-learn.org/stable/index.html">Scikit-Learn</a> is a
general-purpose data science package that implements the most popular
algorithms and includes rich documentation, tutorials, and examples of
the models it implements. Even if you prefer to write your own
implementations, Scikit-Learn is a valuable reference to the
nuts-and-bolts behind many of the common algorithms you'll find. With <a
href="https://pandas.pydata.org/">Pandas</a>, you can collect your data
into a convenient table format and analyze it. <a
href="https://numpy.org/">NumPy</a> provides very fast tooling for
mathematical operations, with a focus on vectors and matrices. <a
href="https://seaborn.pydata.org/">Seaborn</a>, itself based on the <a
href="https://matplotlib.org/">Matplotlib</a> package, is a quick way to
generate beautiful visualizations of your data, with many good defaults
available out of the box, as well as a gallery showing how to produce
many common visualizations.</p>
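<p>As a rough illustration of how these packages fit together, here is a
minimal sketch. The synthetic data, the column names <code>x</code> and
<code>y</code>, and the choice of a linear model are assumptions made for
this example only; Seaborn is mentioned in a comment rather than called so
that the snippet runs without a display.</p>

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# NumPy: generate synthetic data (an assumption for illustration):
# y is roughly 3*x + 2 plus a little Gaussian noise.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=200)

# Pandas: hold the data in a labelled table and summarize it.
df = pd.DataFrame({"x": x, "y": y})
summary = df.describe()

# Scikit-Learn: fit a linear model. Features must be 2-D, hence df[["x"]].
model = LinearRegression().fit(df[["x"]], df["y"])
slope = float(model.coef_[0])
intercept = float(model.intercept_)

print(summary.loc[["mean", "std"]])
print(f"slope={slope:.2f}, intercept={intercept:.2f}")

# Seaborn (optional, needs a plotting backend):
#   import seaborn as sns; sns.regplot(data=df, x="x", y="y")
```

<p>The fitted slope and intercept should land close to the values used to
generate the data, which is a quick sanity check that the pieces are wired
together correctly.</p>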
<p>When embarking on your journey to becoming a data scientist, the
choice of language isn't particularly important, and both Python and R
have their pros and cons. Pick a language you like, and check out one of
the <a href="#free-courses">Free courses</a> we've listed below!</p>
<h2 id="real-world">Real World</h2>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<p>Data science is a powerful tool that is utilized in various fields to
solve real-world problems by extracting insights and patterns from
complex data.</p>
<h3 id="disaster">Disaster</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://huggingface.co/deprem-ml">deprem-ml</a> <a
href="https://linktr.ee/acikyazilimagi">AYA: Açık Yazılım Ağı</a> (+25k
developers) is working to support disaster response using artificial
intelligence. Everything is open-sourced at <a
href="https://afet.org">afet.org</a>.</li>
</ul>
<h2 id="training-resources">Training Resources</h2>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<p>How do you learn data science? By doing data science, of course!
Okay, okay - that might not be particularly helpful when you're first
starting out. In this section, we've listed some learning resources, in
rough order from least to greatest commitment - <a
href="#tutorials">Tutorials</a>, <a href="#moocs">Massive Open Online
Courses (MOOCs)</a>, <a href="#intensive-programs">Intensive
Programs</a>, and <a href="#colleges">Colleges</a>.</p>
<h3 id="tutorials">Tutorials</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://cloud.blobcity.com/#/ps/explore">1000 Data Science
Projects</a> you can run in the browser with IPython.</li>
<li><a
href="https://github.com/rfordatascience/tidytuesday">#tidytuesday</a> A
weekly data project aimed at the R ecosystem.</li>
<li><a href="https://github.com/jadianes/data-science-your-way">Data
science your way</a></li>
<li><a href="https://github.com/kevinschaich/pyspark-cheatsheet">PySpark
Cheatsheet</a></li>
<li><a
href="https://www.manning.com/livevideo/machine-learning-data-science-and-deep-learning-with-python">Machine
Learning, Data Science and Deep Learning with Python</a></li>
<li><a href="https://www.lighttag.io/how-to-label-data/">How To Label
Data</a></li>
<li><a
href="https://medium.com/@lettier/how-does-lda-work-ill-explain-using-emoji-108abf40fa7d">Your
Guide to Latent Dirichlet Allocation</a></li>
<li><a href="https://classpert.com/search/data-science">Over 1000 Data
Science Online Courses at Classpert Online Search Engine</a></li>
<li><a
href="https://github.com/handcraftsman/GeneticAlgorithmsWithPython">Tutorials
of source code from the book Genetic Algorithms with Python by Clinton
Sheppard</a></li>
<li><a
href="https://github.com/jinglescode/python-signal-processing">Tutorials
to get started on signal processing for machine learning</a></li>
<li><a href="https://www.microprediction.com/python-1">Realtime
deployment</a> Tutorial on Python time-series model deployment.</li>
<li><a
href="https://learntocodewith.me/posts/python-for-data-science/">Python
for Data Science: A Beginner's Guide</a></li>
<li><a
href="https://github.com/khangich/machine-learning-interview">Minimum
Viable Study Plan for Machine Learning Interviews</a></li>
<li><a href="http://mlzoomcamp.com/">Understand and Know Machine
Learning Engineering by Building Solid Projects</a></li>
<li><a
href="https://www.datawars.io/articles/12-free-data-science-projects-to-practice-python-and-pandas">12
free Data Science projects to practice Python and Pandas</a></li>
</ul>
<h3 id="free-courses">Free Courses</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://www.datacamp.com/tracks/data-scientist-with-r">Data
Scientist with R</a></li>
<li><a
href="https://www.datacamp.com/tracks/data-scientist-with-python">Data
Scientist with Python</a></li>
<li><a
href="https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-intelligence-fall-2010/lecture-videos/lecture-1-introduction-and-scope/">Genetic
Algorithms OCW Course</a></li>
<li><a href="https://github.com/AMAI-GmbH/AI-Expert-Roadmap">AI Expert
Roadmap</a> - Roadmap to becoming an Artificial Intelligence Expert</li>
<li><a href="https://www.edx.org/course/convex-optimization">Convex
Optimization</a> - Convex Optimization (basics of convex analysis;
least-squares, linear and quadratic programs, semidefinite programming,
minimax, extremal volume, and other problems; optimality conditions,
duality theory…)</li>
<li><a
href="https://skillcombo.com/courses/development/data-science/free/">Skillcombo
- Data Science</a> - 1000+ free online Data Science courses</li>
<li><a href="https://home.work.caltech.edu/telecourse.html">Learning
from Data</a> - Introduction to machine learning covering basic theory,
algorithms and applications</li>
<li><a href="https://www.kaggle.com/learn">Kaggle</a> - Learn about Data
Science, Machine Learning, Python, etc.</li>
<li><a href="https://arize.com/ml-observability-fundamentals/">ML
Observability Fundamentals</a> - Learn how to monitor and root-cause
production ML issues.</li>
<li><a
href="https://www.wandb.courses/courses/effective-mlops-model-development">Weights
&amp; Biases Effective MLOps: Model Development</a> - Free course and
certification for building an end-to-end machine learning pipeline using W&amp;B</li>
<li><a
href="https://globalaihub.com/courses/introduction-to-python-the-road-to-machine-learning/">Python
for Machine Learning</a> - Start your journey to machine learning with
Python, one of the most powerful programming languages.</li>
<li><a
href="https://www.scaler.com/topics/course/python-for-data-science/">Python
for Data Science by Scaler</a> - This course is designed to empower
beginners with the essential skills to excel in today's data-driven
world. The comprehensive curriculum will give you a solid foundation in
statistics, programming, data visualization, and machine learning.</li>
<li><a
href="https://github.com/jacopotagliabue/MLSys-NYU-2022/tree/main">MLSys-NYU-2022</a>
- Slides, scripts and materials for the Machine Learning in Finance
course at NYU Tandon, 2022.</li>
<li><a
href="https://github.com/Paulescu/hands-on-train-and-deploy-ml">Hands-on
Train and Deploy ML</a> - A hands-on course to train and deploy a
serverless API that predicts crypto prices.</li>
<li><a href="https://www.comet.com/site/llm-course/">LLMOps: Building
Real-World Applications With Large Language Models</a> - Learn to build
modern software with LLMs using the newest tools and techniques in the
field.</li>
</ul>
<h3 id="moocs">MOOCs</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a
href="https://www.coursera.org/specializations/data-science">Coursera
Introduction to Data Science</a></li>
<li><a
href="https://www.coursera.org/specializations/jhu-data-science">Data
Science - 9 Steps Courses, A Specialization on Coursera</a></li>
<li><a href="https://www.coursera.org/specializations/data-mining">Data
Mining - 5 Steps Courses, A Specialization on Coursera</a></li>
<li><a
href="https://www.coursera.org/specializations/machine-learning">Machine
Learning - 5 Steps Courses, A Specialization on Coursera</a></li>
<li><a href="https://cs109.github.io/2015/">CS 109 Data Science</a></li>
<li><a href="https://www.openintro.org/">OpenIntro</a></li>
<li><a href="https://www.cs171.org/#!index.md">CS 171
Visualization</a></li>
<li><a href="https://www.coursera.org/learn/process-mining">Process
Mining: Data science in Action</a></li>
<li><a href="https://www.cs.ox.ac.uk/projects/DeepLearn/">Oxford Deep
Learning</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu">Oxford
Deep Learning - video</a></li>
<li><a href="https://www.cs.ox.ac.uk/research/ai_ml/index.html">Oxford
Machine Learning</a></li>
<li><a href="https://www.cs.ubc.ca/~nando/540-2013/lectures.html">UBC
Machine Learning - video</a></li>
<li><a href="https://github.com/DataScienceSpecialization/courses">Data
Science Specialization</a></li>
<li><a href="https://www.coursera.org/specializations/big-data">Coursera
Big Data Specialization</a></li>
<li><a
href="https://www.edx.org/course/statistical-thinking-for-data-science-and-analytic">Statistical
Thinking for Data Science and Analytics by edX</a></li>
<li><a href="https://cognitiveclass.ai/">Cognitive Class AI by
IBM</a></li>
<li><a
href="https://www.udacity.com/course/intro-to-tensorflow-for-deep-learning--ud187">Udacity
- Deep Learning</a></li>
<li><a href="https://www.manning.com/livevideo/keras-in-motion">Keras in
Motion</a></li>
<li><a
href="https://academy.microsoft.com/en-us/professional-program/tracks/data-science/">Microsoft
Professional Program for Data Science</a></li>
<li><a href="https://tdgunes.com/COMP6246-2019Fall/">COMP3222/COMP6246 -
Machine Learning Technologies</a></li>
<li><a href="https://cs231n.github.io/">CS 231 - Convolutional Neural
Networks for Visual Recognition</a></li>
<li><a
href="https://www.coursera.org/professional-certificates/tensorflow-in-practice">Coursera
TensorFlow in Practice</a></li>
<li><a
href="https://www.coursera.org/specializations/deep-learning">Coursera
Deep Learning Specialization</a></li>
<li><a href="https://365datascience.com/">365 Data Science
Course</a></li>
<li><a
href="https://www.coursera.org/specializations/natural-language-processing">Coursera
Natural Language Processing Specialization</a></li>
<li><a
href="https://www.coursera.org/specializations/generative-adversarial-networks-gans">Coursera
GAN Specialization</a></li>
<li><a
href="https://www.codecademy.com/learn/paths/data-science">Codecademy's
Data Science</a></li>
<li><a
href="https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/">Linear
Algebra</a> - Linear Algebra course by Gilbert Strang</li>
<li><a
href="https://ocw.mit.edu/resources/res-18-010-a-2020-vision-of-linear-algebra-spring-2020/">A
2020 Vision of Linear Algebra (G. Strang)</a></li>
<li><a
href="https://intellipaat.com/academy/course/python-for-data-science-free-training/">Python
for Data Science Foundation Course</a></li>
<li><a
href="https://www.coursera.org/specializations/data-science-statistics-machine-learning">Data
Science: Statistics &amp; Machine Learning</a></li>
<li><a
href="https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops">Machine
Learning Engineering for Production (MLOps)</a></li>
<li><a
href="https://www.coursera.org/specializations/recommender-systems">Recommender
Systems Specialization from University of Minnesota</a> - An
intermediate/advanced-level specialization on recommender systems,
offered on the Coursera platform.</li>
<li><a
href="https://online.stanford.edu/programs/artificial-intelligence-professional-program">Stanford
Artificial Intelligence Professional Program</a></li>
<li><a
href="https://app.datacamp.com/learn/career-tracks/data-scientist-with-python">Data
Scientist with Python</a></li>
<li><a
href="https://www.udemy.com/course/programming-with-julia/">Programming
with Julia</a></li>
<li><a href="https://www.scaler.com/data-science-course/">Scaler Data
Science &amp; Machine Learning Program</a></li>
</ul>
<h3 id="intensive-programs">Intensive Programs</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://www.s2ds.org/">S2DS</a></li>
</ul>
<h3 id="colleges">Colleges</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a
href="https://github.com/ryanswanstrom/awesome-datascience-colleges">A
list of colleges and universities offering degrees in data
science.</a></li>
<li><a href="https://ischoolonline.berkeley.edu/data-science/">Data
Science Degree @ Berkeley</a></li>
<li><a href="https://datascience.virginia.edu/">Data Science Degree @
UVA</a></li>
<li><a href="https://datasciencedegree.wisconsin.edu/">Data Science
Degree @ Wisconsin</a></li>
<li><a href="https://study.iitm.ac.in/ds/">BS in Data Science &amp;
Applications</a></li>
<li><a
href="https://www.bu.edu/online/programs/graduate-programs/computer-information-systems-masters-degree/">MS
in Computer Information Systems @ Boston University</a></li>
<li><a
href="https://asuonline.asu.edu/online-degree-programs/graduate/master-science-business-analytics/">MS
in Business Analytics @ ASU Online</a></li>
<li><a
href="https://ischool.syr.edu/academics/applied-data-science-masters-degree/">MS
in Applied Data Science @ Syracuse</a></li>
<li><a
href="https://www.leuphana.de/en/graduate-school/masters-programmes/management-data-science.html">M.S.
Management &amp; Data Science @ Leuphana</a></li>
<li><a
href="https://study.unimelb.edu.au/find/courses/graduate/master-of-data-science/#overview">Master
of Data Science @ Melbourne University</a></li>
<li><a
href="https://www.ed.ac.uk/studying/postgraduate/degrees/index.php?r=site/view&amp;id=902">MSc
in Data Science @ The University of Edinburgh</a></li>
<li><a href="https://smith.queensu.ca/grad_studies/mma/index.php">Master
of Management Analytics @ Queen's University</a></li>
<li><a
href="https://www.iit.edu/academics/programs/data-science-mas">Master of
Data Science @ Illinois Institute of Technology</a></li>
<li><a
href="https://www.si.umich.edu/programs/master-applied-data-science-online">Master
of Applied Data Science @ The University of Michigan</a></li>
<li><a
href="https://www.tue.nl/en/education/graduate-school/master-data-science-and-artificial-intelligence/">Master
Data Science and Artificial Intelligence @ Eindhoven University of
Technology</a></li>
<li><a href="https://masteres.ugr.es/datcom/">Master's Degree in Data
Science and Computer Engineering @ University of Granada</a></li>
</ul>
<h2 id="the-data-science-toolbox">The Data Science Toolbox</h2>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<p>This section is a collection of packages, tools, algorithms, and
other useful items in the data science world.</p>
<h3 id="algorithms">Algorithms</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<p>These are some Machine Learning and Data Mining algorithms and models
that help you understand your data and derive meaning from it.</p>
<h4 id="three-kinds-of-machine-learning-systems">Three kinds of Machine
Learning Systems</h4>
<ul>
<li>Based on training with human supervision</li>
<li>Based on learning incrementally on the fly</li>
<li>Based on comparing data points and detecting patterns</li>
</ul>
<h4 id="supervised-learning">Supervised Learning</h4>
<ul>
<li><a
href="https://en.wikipedia.org/wiki/Regression">Regression</a></li>
<li><a href="https://en.wikipedia.org/wiki/Linear_regression">Linear
Regression</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Ordinary_least_squares">Ordinary
Least Squares</a></li>
<li><a href="https://en.wikipedia.org/wiki/Logistic_regression">Logistic
Regression</a></li>
<li><a href="https://en.wikipedia.org/wiki/Stepwise_regression">Stepwise
Regression</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Multivariate_adaptive_regression_spline">Multivariate
Adaptive Regression Splines</a></li>
<li><a
href="https://d2l.ai/chapter_linear-classification/softmax-regression.html">Softmax
Regression</a></li>
<li><a href="https://en.wikipedia.org/wiki/Local_regression">Locally
Estimated Scatterplot Smoothing</a></li>
<li>Classification
<ul>
<li><a
href="https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm">k-nearest
neighbor</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Support_vector_machine">Support
Vector Machines</a></li>
<li><a href="https://en.wikipedia.org/wiki/Decision_tree">Decision
Trees</a></li>
<li><a href="https://en.wikipedia.org/wiki/ID3_algorithm">ID3
algorithm</a></li>
<li><a href="https://en.wikipedia.org/wiki/C4.5_algorithm">C4.5
algorithm</a></li>
</ul></li>
<li><a
href="https://scikit-learn.org/stable/modules/ensemble.html">Ensemble
Learning</a>
<ul>
<li><a
href="https://en.wikipedia.org/wiki/Boosting_(machine_learning)">Boosting</a></li>
<li><a
href="https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python">Stacking</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Bootstrap_aggregating">Bagging</a></li>
<li><a href="https://en.wikipedia.org/wiki/Random_forest">Random
Forest</a></li>
<li><a href="https://en.wikipedia.org/wiki/AdaBoost">AdaBoost</a></li>
</ul></li>
</ul>
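<p>To make the supervised setting concrete, the sketch below trains two of
the methods listed above, k-nearest neighbor and a random forest, on
labelled data and scores them on held-out examples. The Iris dataset, the
70/30 split, and the hyperparameters are assumptions chosen for
illustration, not recommendations.</p>

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Labelled data: feature matrix X and class labels y.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows so the score reflects unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Two classifiers from the list above, with illustrative settings.
models = {
    "k-nearest neighbor": KNeighborsClassifier(n_neighbors=5),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)  # learn from labelled examples
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {scores[name]:.2f}")
```

<p>Because every scikit-learn estimator exposes the same
<code>fit</code>/<code>predict</code> interface, swapping in another
algorithm from the list usually means changing only the entry in
<code>models</code>.</p>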
<h4 id="unsupervised-learning">Unsupervised Learning</h4>
<ul>
<li><a
href="https://scikit-learn.org/stable/modules/clustering.html#clustering">Clustering</a>
<ul>
<li><a
href="https://scikit-learn.org/stable/modules/clustering.html#hierarchical-clustering">Hierarchical
clustering</a></li>
<li><a
href="https://scikit-learn.org/stable/modules/clustering.html#k-means">k-means</a></li>
<li><a
href="https://scikit-learn.org/stable/modules/clustering.html#dbscan">Density-based
clustering</a></li>
<li><a href="https://en.wikipedia.org/wiki/Fuzzy_clustering">Fuzzy
clustering</a></li>
<li><a href="https://en.wikipedia.org/wiki/Mixture_model">Mixture
models</a></li>
</ul></li>
<li><a
href="https://en.wikipedia.org/wiki/Dimensionality_reduction">Dimension
Reduction</a>
<ul>
<li><a
href="https://scikit-learn.org/stable/modules/decomposition.html#principal-component-analysis-pca">Principal
Component Analysis (PCA)</a></li>
<li><a
href="https://scikit-learn.org/stable/modules/manifold.html#t-sne">t-distributed
Stochastic Neighbor Embedding (t-SNE)</a></li>
<li><a
href="https://scikit-learn.org/stable/modules/decomposition.html#factor-analysis">Factor
Analysis</a></li>
<li><a
href="https://scikit-learn.org/stable/modules/decomposition.html#latent-dirichlet-allocation-lda">Latent
Dirichlet Allocation (LDA)</a></li>
</ul></li>
<li><a href="https://en.wikipedia.org/wiki/Neural_network">Neural
Networks</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Self-organizing_map">Self-organizing
map</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Adaptive_resonance_theory">Adaptive
resonance theory</a></li>
<li><a href="https://en.wikipedia.org/wiki/Hidden_Markov_model">Hidden
Markov Models (HMM)</a></li>
</ul>
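<p>The unsupervised case works without labels. As a minimal sketch, the
snippet below groups synthetic points with k-means and then projects them
to two dimensions with PCA. The generated blobs, the choice of three
clusters, and the five input features are assumptions made for this
example.</p>

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic, unlabelled data: 300 points in 5 dimensions (labels discarded).
X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=0)

# Clustering: assign each point to one of three groups.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Dimension reduction: keep the two directions with the most variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
explained = float(pca.explained_variance_ratio_.sum())

print("cluster sizes:", sorted(int((labels == k).sum()) for k in range(3)))
print(f"variance explained by 2 components: {explained:.2f}")
```

<p>The two-dimensional output <code>X_2d</code> is what you would typically
pass to a scatter plot, colored by <code>labels</code>, to inspect the
clusters visually.</p>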
<h4 id="semi-supervised-learning">Semi-Supervised Learning</h4>
<ul>
<li>S3VM</li>
<li><a
href="https://en.wikipedia.org/wiki/Weak_supervision#Cluster_assumption">Clustering</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Weak_supervision#Generative_models">Generative
models</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Weak_supervision#Low-density_separation">Low-density
separation</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Weak_supervision#Laplacian_regularization">Laplacian
regularization</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Weak_supervision#Heuristic_approaches">Heuristic
approaches</a></li>
</ul>
<h4 id="reinforcement-learning">Reinforcement Learning</h4>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Q-learning">Q-Learning</a></li>
<li><a
href="https://en.wikipedia.org/wiki/State%E2%80%93action%E2%80%93reward%E2%80%93state%E2%80%93action">SARSA
(State-Action-Reward-State-Action) algorithm</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Temporal_difference_learning">Temporal
difference learning</a></li>
</ul>
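<p>For a feel of how these methods learn from reward, here is a small
tabular Q-learning sketch using only the Python standard library. The
five-state corridor environment, the reward of 1 at the right end, and the
values of <code>ALPHA</code>, <code>GAMMA</code>, and <code>EPSILON</code>
are all hypothetical choices for illustration.</p>

```python
import random

# Hypothetical toy environment: a corridor of 5 states.
# Reaching the right-most state ends the episode with reward 1.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration


def step(state, action):
    """Move along the corridor, staying inside its bounds."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy selection; ties are broken at random.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: bootstrap from the best next action.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q


q_table = train()
# Greedy policy for every non-terminal state.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

<p>SARSA differs only in the update target: it would use the value of the
action actually taken in the next state instead of the maximum over
actions.</p>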
<h4 id="data-mining-algorithms">Data Mining Algorithms</h4>
<ul>
<li><a href="https://en.wikipedia.org/wiki/C4.5_algorithm">C4.5</a></li>
<li><a
href="https://en.wikipedia.org/wiki/K-means_clustering">k-Means</a></li>
<li><a href="https://en.wikipedia.org/wiki/Support_vector_machine">SVM
(Support Vector Machine)</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Apriori_algorithm">Apriori</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm">EM
(Expectation-Maximization)</a></li>
<li><a href="https://en.wikipedia.org/wiki/PageRank">PageRank</a></li>
<li><a href="https://en.wikipedia.org/wiki/AdaBoost">AdaBoost</a></li>
<li><a
href="https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm">KNN
(K-Nearest Neighbors)</a></li>
<li><a href="https://en.wikipedia.org/wiki/Naive_Bayes_classifier">Naive
Bayes</a></li>
<li><a href="https://en.wikipedia.org/wiki/Decision_tree_learning">CART
(Classification and Regression Trees)</a></li>
</ul>
<h4 id="deep-learning-architectures">Deep Learning Architectures</h4>
<ul>
<li><a
href="https://en.wikipedia.org/wiki/Multilayer_perceptron">Multilayer
Perceptron</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Convolutional_neural_network">Convolutional
Neural Network (CNN)</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Recurrent_neural_network">Recurrent
Neural Network (RNN)</a></li>
<li><a href="https://en.wikipedia.org/wiki/Boltzmann_machine">Boltzmann
Machines</a></li>
<li><a
href="https://www.tensorflow.org/tutorials/generative/autoencoder">Autoencoder</a></li>
<li><a
href="https://developers.google.com/machine-learning/gan/gan_structure">Generative
Adversarial Network (GAN)</a></li>
<li><a
href="https://en.wikipedia.org/wiki/Self-organizing_map">Self-Organizing
Maps</a></li>
<li><a
href="https://www.tensorflow.org/text/tutorials/transformer">Transformer</a></li>
<li><a
href="https://towardsdatascience.com/conditional-random-fields-explained-e5b8256da776">Conditional
Random Field (CRF)</a></li>
</ul>
<h3 id="general-machine-learning-packages">General Machine Learning
Packages</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://scikit-learn.org/">scikit-learn</a></li>
<li><a
href="https://github.com/scikit-multilearn/scikit-multilearn">scikit-multilearn</a></li>
<li><a
href="https://github.com/tmadl/sklearn-expertsys">sklearn-expertsys</a></li>
<li><a
href="https://github.com/jundongl/scikit-feature">scikit-feature</a></li>
<li><a
href="https://github.com/EpistasisLab/scikit-rebate">scikit-rebate</a></li>
<li><a href="https://github.com/larsmans/seqlearn">seqlearn</a></li>
<li><a
href="https://github.com/AmazaspShumik/sklearn-bayes">sklearn-bayes</a></li>
<li><a
href="https://github.com/TeamHG-Memex/sklearn-crfsuite">sklearn-crfsuite</a></li>
<li><a
href="https://github.com/rsteca/sklearn-deap">sklearn-deap</a></li>
<li><a
href="https://github.com/sigopt/sigopt-sklearn">sigopt_sklearn</a></li>
<li><a
href="https://github.com/edublancas/sklearn-evaluation">sklearn-evaluation</a></li>
<li><a
href="https://github.com/scikit-image/scikit-image">scikit-image</a></li>
<li><a
href="https://github.com/guofei9987/scikit-opt">scikit-opt</a></li>
<li><a
href="https://github.com/maximtrp/scikit-posthocs">scikit-posthocs</a></li>
<li><a href="https://github.com/pystruct/pystruct">pystruct</a></li>
<li><a href="https://www.shogun-toolbox.org/">Shogun</a></li>
<li><a href="https://github.com/aksnzhy/xlearn">xLearn</a></li>
<li><a href="https://github.com/rapidsai/cuml">cuML</a></li>
<li><a href="https://github.com/uber/causalml">causalml</a></li>
<li><a href="https://github.com/mlpack/mlpack">mlpack</a></li>
<li><a href="https://github.com/rasbt/mlxtend">MLxtend</a></li>
<li><a href="https://github.com/modAL-python/modAL">modAL</a></li>
<li><a
href="https://github.com/lensacom/sparkit-learn">Sparkit-learn</a></li>
<li><a
href="https://github.com/danielhanchen/hyperlearn">hyperlearn</a></li>
<li><a href="https://github.com/davisking/dlib">dlib</a></li>
<li><a href="https://github.com/csinva/imodels">imodels</a></li>
<li><a href="https://github.com/christophM/rulefit">RuleFit</a></li>
<li><a href="https://github.com/dswah/pyGAM">pyGAM</a></li>
<li><a
href="https://github.com/deepchecks/deepchecks">Deepchecks</a></li>
<li><a
href="https://scikit-survival.readthedocs.io/en/stable">scikit-survival</a></li>
</ul>
<h3 id="deep-learning-packages">Deep Learning Packages</h3>
<h4 id="pytorch-ecosystem">PyTorch Ecosystem</h4>
<ul>
<li><a href="https://github.com/pytorch/pytorch">PyTorch</a></li>
<li><a href="https://github.com/pytorch/vision">torchvision</a></li>
<li><a href="https://github.com/pytorch/text">torchtext</a></li>
<li><a href="https://github.com/pytorch/audio">torchaudio</a></li>
<li><a href="https://github.com/pytorch/ignite">ignite</a></li>
<li><a href="https://github.com/pytorch/tnt">PyTorchNet</a></li>
<li><a href="https://github.com/GRAAL-Research/poutyne">Poutyne (formerly PyToune)</a></li>
<li><a href="https://github.com/skorch-dev/skorch">skorch</a></li>
<li><a href="https://github.com/ctallec/pyvarinf">PyVarInf</a></li>
<li><a
href="https://github.com/pyg-team/pytorch_geometric">pytorch_geometric</a></li>
<li><a
href="https://github.com/cornellius-gp/gpytorch">GPyTorch</a></li>
<li><a href="https://github.com/pyro-ppl/pyro">pyro</a></li>
<li><a
href="https://github.com/catalyst-team/catalyst">Catalyst</a></li>
<li><a
href="https://github.com/manujosephv/pytorch_tabular">pytorch_tabular</a></li>
<li><a href="https://github.com/ultralytics/yolov3">YOLOv3</a></li>
<li><a href="https://github.com/ultralytics/yolov5">YOLOv5</a></li>
<li><a href="https://github.com/ultralytics/ultralytics">YOLOv8</a></li>
</ul>
<h4 id="tensorflow-ecosystem">TensorFlow Ecosystem</h4>
<ul>
<li><a
href="https://github.com/tensorflow/tensorflow">TensorFlow</a></li>
<li><a
href="https://github.com/tensorlayer/TensorLayer">TensorLayer</a></li>
<li><a href="https://github.com/tflearn/tflearn">TFLearn</a></li>
<li><a href="https://github.com/deepmind/sonnet">Sonnet</a></li>
<li><a
href="https://github.com/tensorpack/tensorpack">tensorpack</a></li>
<li><a href="https://github.com/deepmind/trfl">TRFL</a></li>
<li><a href="https://github.com/polyaxon/polyaxon">Polyaxon</a></li>
<li><a href="https://github.com/itdxer/neupy">NeuPy</a></li>
<li><a href="https://github.com/riga/tfdeploy">tfdeploy</a></li>
<li><a
href="https://github.com/ROCmSoftwarePlatform/tensorflow-upstream">tensorflow-upstream</a></li>
<li><a href="https://github.com/tensorflow/fold">TensorFlow
Fold</a></li>
<li><a href="https://github.com/batzner/tensorlm">tensorlm</a></li>
<li><a
href="https://github.com/bsautermeister/tensorlight">TensorLight</a></li>
<li><a href="https://github.com/tensorflow/mesh">Mesh
TensorFlow</a></li>
<li><a href="https://github.com/ludwig-ai/ludwig">Ludwig</a></li>
<li><a href="https://github.com/tensorflow/agents">TF-Agents</a></li>
<li><a
href="https://github.com/tensorforce/tensorforce">TensorForce</a></li>
</ul>
<h4 id="keras-ecosystem">Keras Ecosystem</h4>
<ul>
<li><a href="https://keras.io">Keras</a></li>
<li><a
href="https://github.com/keras-team/keras-contrib">keras-contrib</a></li>
<li><a href="https://github.com/maxpumperla/hyperas">Hyperas</a></li>
<li><a href="https://github.com/maxpumperla/elephas">Elephas</a></li>
<li><a href="https://github.com/keplr-io/hera">Hera</a></li>
<li><a
href="https://github.com/danielegrattarola/spektral">Spektral</a></li>
<li><a href="https://github.com/google/qkeras">qkeras</a></li>
<li><a href="https://github.com/keras-rl/keras-rl">keras-rl</a></li>
<li><a href="https://github.com/autonomio/talos">Talos</a></li>
</ul>
<h3 id="visualization-tools">Visualization Tools</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://altair-viz.github.io/">altair</a></li>
<li><a
href="https://opensource.addepar.com/ember-charts/#/overview">addepar</a></li>
<li><a href="https://www.amcharts.com/">amcharts</a></li>
<li><a href="https://www.anychart.com/">anychart</a></li>
<li><a href="https://bokeh.org/">bokeh</a></li>
<li><a
href="https://www.comet.com/site/products/ml-experiment-tracking/?utm_source=awesome-datascience">Comet</a></li>
<li><a href="https://slemma.com/">slemma</a></li>
<li><a href="https://cartodb.github.io/odyssey.js/">cartodb</a></li>
<li><a href="https://square.github.io/cube/">Cube</a></li>
<li><a href="https://d3plus.org/">d3plus</a></li>
<li><a href="https://d3js.org/">Data-Driven Documents (D3.js)</a></li>
<li><a href="https://dygraphs.com/">dygraphs</a></li>
<li><a href="https://echarts.baidu.com/index-en.html">ECharts</a></li>
<li><a href="https://www.simile-widgets.org/exhibit/">exhibit</a></li>
<li><a href="https://gephi.org/">gephi</a></li>
<li><a href="https://ggplot2.tidyverse.org/">ggplot2</a></li>
<li><a href="http://docs.glueviz.org/en/latest/index.html">Glue</a></li>
<li><a
href="https://developers.google.com/chart/interactive/docs/gallery">Google
Chart Gallery</a></li>
<li><a href="https://www.highcharts.com/">Highcharts</a></li>
<li><a href="https://www.import.io/">import.io</a></li>
<li><a href="https://www.jqplot.com/">jqplot</a></li>
<li><a href="https://matplotlib.org/">Matplotlib</a></li>
<li><a href="https://nvd3.org/">nvd3</a></li>
<li><a href="https://github.com/lutzroeder/netron">Netron</a></li>
<li><a href="https://openrefine.org/">OpenRefine</a></li>
<li><a href="https://plot.ly/">plot.ly</a></li>
<li><a href="https://rawgraphs.io">RAWGraphs</a></li>
<li><a href="https://github.com/abistarun/resseract-lite">Resseract
Lite</a></li>
<li><a href="https://seaborn.pydata.org/">Seaborn</a></li>
<li><a href="https://techanjs.org/">techanjs</a></li>
<li><a href="https://timeline.knightlab.com/">Timeline</a></li>
<li><a
href="https://variancecharts.com/index.html">variancecharts</a></li>
<li><a href="https://vida.io/">vida</a></li>
<li><a href="https://github.com/vizzuhq/vizzu-lib">vizzu</a></li>
<li><a href="http://vis.stanford.edu/wrangler/">Wrangler</a></li>
<li><a
href="https://www.r2d3.us/visual-intro-to-machine-learning-part-1/">r2d3</a></li>
<li><a href="https://networkx.org/">NetworkX</a></li>
<li><a href="https://redash.io/">Redash</a></li>
<li><a href="https://c3js.org/">C3</a></li>
<li><a
href="https://github.com/microsoft/tensorwatch">TensorWatch</a></li>
<li><a href="https://pypi.org/project/geomap/">geomap</a></li>
</ul>
<h3 id="miscellaneous-tools">Miscellaneous Tools</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<table>
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<thead>
<tr class="header">
<th>Link</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="https://github.com/dslp/dslp">The Data Science Lifecycle
Process</a></td>
<td>The Data Science Lifecycle Process is a process for taking data
science teams from idea to value repeatedly and sustainably. The process
is documented in this repo.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/dslp/dslp-repo-template">Data Science
Lifecycle Template Repo</a></td>
<td>Template repository for a data science lifecycle project.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/AstraZeneca/rexmex">RexMex</a></td>
<td>A general purpose recommender metrics library for fair
evaluation.</td>
</tr>
<tr class="even">
<td><a
href="https://github.com/AstraZeneca/chemicalx">ChemicalX</a></td>
<td>A PyTorch based deep learning library for drug pair scoring.</td>
</tr>
<tr class="odd">
<td><a
href="https://github.com/benedekrozemberczki/pytorch_geometric_temporal">PyTorch
Geometric Temporal</a></td>
<td>Representation learning on dynamic graphs.</td>
</tr>
<tr class="even">
<td><a
href="https://github.com/benedekrozemberczki/littleballoffur">Little
Ball of Fur</a></td>
<td>A graph sampling library for NetworkX with a Scikit-Learn like
API.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/benedekrozemberczki/karateclub">Karate
Club</a></td>
<td>An unsupervised machine learning extension library for NetworkX with
a Scikit-Learn like API.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/ml-tooling/ml-workspace">ML
Workspace</a></td>
<td>All-in-one web-based IDE for machine learning and data science. The
workspace is deployed as a Docker container and is preloaded with a
variety of popular data science libraries (e.g., TensorFlow, PyTorch)
and dev tools (e.g., Jupyter, VS Code).</td>
</tr>
<tr class="odd">
<td><a href="https://neptune.ai">Neptune.ai</a></td>
<td>Community-friendly platform supporting data scientists in creating
and sharing machine learning models. Neptune facilitates teamwork,
infrastructure management, model comparison, and reproducibility.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/minerva-ml/steppy">steppy</a></td>
<td>Lightweight Python library for fast and reproducible machine
learning experimentation. Introduces a very simple interface that enables
clean machine learning pipeline design.</td>
</tr>
<tr class="odd">
<td><a
href="https://github.com/minerva-ml/steppy-toolkit">steppy-toolkit</a></td>
<td>Curated collection of neural networks, transformers, and models
that make your machine learning work faster and more effective.</td>
</tr>
<tr class="even">
<td><a href="https://cloud.google.com/datalab/docs/">Datalab from
Google</a></td>
<td>Easily explore, visualize, analyze, and transform data interactively
using familiar languages such as Python and SQL.</td>
</tr>
<tr class="odd">
<td><a
href="https://www.cloudera.com/downloads/hortonworks-sandbox.html">Hortonworks
Sandbox</a></td>
<td>A personal, portable Hadoop environment that comes with a dozen
interactive Hadoop tutorials.</td>
</tr>
<tr class="even">
<td><a href="https://www.r-project.org/">R</a></td>
<td>A free software environment for statistical computing and
graphics.</td>
</tr>
<tr class="odd">
<td><a href="https://www.tidyverse.org/">Tidyverse</a></td>
<td>An opinionated collection of R packages designed for data
science. All packages share an underlying design philosophy, grammar,
and data structures.</td>
</tr>
<tr class="even">
<td><a href="https://www.rstudio.com">RStudio</a></td>
<td>A powerful IDE for R. It's free and open source, and
works on Windows, Mac, and Linux.</td>
</tr>
<tr class="odd">
<td><a href="https://www.anaconda.com">Python - Pandas -
Anaconda</a></td>
<td>Completely free enterprise-ready Python distribution for large-scale
data processing, predictive analytics, and scientific computing</td>
</tr>
<tr class="even">
<td><a href="https://github.com/adrotog/PandasGUI">Pandas GUI</a></td>
<td>A GUI for viewing and analyzing Pandas DataFrames</td>
</tr>
<tr class="odd">
<td><a href="https://scikit-learn.org/stable/">Scikit-Learn</a></td>
<td>Machine Learning in Python</td>
</tr>
<tr class="even">
<td><a href="https://numpy.org/">NumPy</a></td>
<td>NumPy is fundamental for scientific computing with Python. It
supports large, multi-dimensional arrays and matrices and includes an
assortment of high-level mathematical functions to operate on these
arrays.</td>
</tr>
<tr class="odd">
<td><a href="https://vaex.io/">Vaex</a></td>
<td>Vaex is a Python library that allows you to visualize large datasets
and calculate statistics at high speeds.</td>
</tr>
<tr class="even">
<td><a href="https://scipy.org/">SciPy</a></td>
<td>SciPy works with NumPy arrays and provides efficient routines for
numerical integration and optimization.</td>
</tr>
<tr class="odd">
<td><a href="https://www.coursera.org/learn/data-scientists-tools">Data
Science Toolbox</a></td>
<td>Coursera Course</td>
</tr>
<tr class="even">
<td><a href="https://datasciencetoolbox.org/">Data Science
Toolbox</a></td>
<td>Blog</td>
</tr>
<tr class="odd">
<td><a href="https://www.wolfram.com/data-science-platform/">Wolfram
Data Science Platform</a></td>
<td>Take numerical, textual, image, GIS or other data and give it the
Wolfram treatment: carry out a full spectrum of data science analysis
and visualization and automatically generate rich interactive
reports, all powered by the knowledge-based Wolfram
Language.</td>
</tr>
<tr class="even">
<td><a href="https://www.datadoghq.com/">Datadog</a></td>
<td>Solutions, code, and devops for high-scale data science.</td>
</tr>
<tr class="odd">
<td><a href="https://variancecharts.com/">Variance</a></td>
<td>Build powerful data visualizations for the web without writing
JavaScript</td>
</tr>
<tr class="even">
<td><a href="https://kitesdk.org/docs/current/index.html">Kite
Development Kit</a></td>
<td>The Kite Software Development Kit (Apache License, Version 2.0), or
Kite for short, is a set of libraries, tools, examples, and
documentation focused on making it easier to build systems on top of the
Hadoop ecosystem.</td>
</tr>
<tr class="odd">
<td><a href="https://www.dominodatalab.com">Domino Data Lab</a></td>
<td>Run, scale, share, and deploy your models — without any
infrastructure or setup.</td>
</tr>
<tr class="even">
<td><a href="https://flink.apache.org/">Apache Flink</a></td>
<td>A platform for efficient, distributed, general-purpose data
processing.</td>
</tr>
<tr class="odd">
<td><a href="https://hama.apache.org/">Apache Hama</a></td>
<td>Apache Hama is an Apache Top-Level open source project, allowing you
to do advanced analytics beyond MapReduce.</td>
</tr>
<tr class="even">
<td><a href="https://www.cs.waikato.ac.nz/ml/weka/">Weka</a></td>
<td>Weka is a collection of machine learning algorithms for data mining
tasks.</td>
</tr>
<tr class="odd">
<td><a href="https://www.gnu.org/software/octave/">Octave</a></td>
<td>GNU Octave is a high-level interpreted language, primarily intended
for numerical computations (a free alternative to MATLAB).</td>
</tr>
<tr class="even">
<td><a href="https://spark.apache.org/">Apache Spark</a></td>
<td>Lightning-fast cluster computing</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/Hydrospheredata/mist">Hydrosphere
Mist</a></td>
<td>A service for exposing Apache Spark analytics jobs and machine
learning models as real-time, batch, or reactive web services.</td>
</tr>
<tr class="even">
<td><a href="https://www.datamechanics.co">Data Mechanics</a></td>
<td>A data science and engineering platform making Apache Spark more
developer-friendly and cost-effective.</td>
</tr>
<tr class="odd">
<td><a href="https://caffe.berkeleyvision.org/">Caffe</a></td>
<td>Deep Learning Framework</td>
</tr>
<tr class="even">
<td><a href="https://torch.ch/">Torch</a></td>
<td>A scientific computing framework for LuaJIT</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/NervanaSystems/neon">Nervana's
Python-based Deep Learning Framework</a></td>
<td>Intel® Nervana™ reference deep learning framework committed to best
performance on all hardware.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/skale-me/skale">Skale</a></td>
<td>High-performance distributed data processing in Node.js</td>
</tr>
<tr class="odd">
<td><a href="https://airbnb.io/aerosolve/">Aerosolve</a></td>
<td>A machine learning package built for humans.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/intel/idlf">Intel framework</a></td>
<td>Intel® Deep Learning Framework</td>
</tr>
<tr class="odd">
<td><a href="https://www.datawrapper.de/">Datawrapper</a></td>
<td>An open source data visualization platform helping everyone to
create simple, correct and embeddable charts. Also at <a
href="https://github.com/datawrapper/datawrapper">github.com</a></td>
</tr>
<tr class="even">
<td><a href="https://www.tensorflow.org/">TensorFlow</a></td>
<td>TensorFlow is an open source software library for machine
intelligence</td>
</tr>
<tr class="odd">
<td><a href="https://www.nltk.org/">Natural Language Toolkit</a></td>
<td>An introductory yet powerful toolkit for natural language processing
and classification</td>
</tr>
<tr class="even">
<td><a href="https://www.johnsnowlabs.com/annotation-lab/">Annotation
Lab</a></td>
<td>Free End-to-End No-Code platform for text annotation and DL model
training/tuning. Out-of-the-box support for Named Entity Recognition,
Classification, Relation extraction and Assertion Status Spark NLP
models. Unlimited support for users, teams, projects, documents.</td>
</tr>
<tr class="odd">
<td><a href="https://www.npmjs.com/package/nlp-toolkit">nlp-toolkit for
node.js</a></td>
<td>This module covers some basic NLP principles and implementations.
The main focus is performance. When we deal with sample or training data
in NLP, we quickly run out of memory. Therefore, every implementation in
this module is written as a stream, so that only the data currently being
processed at any step is held in memory.</td>
</tr>
<tr class="even">
<td><a href="https://julialang.org">Julia</a></td>
<td>High-level, high-performance dynamic programming language for
technical computing</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/JuliaLang/IJulia.jl">IJulia</a></td>
<td>A Julia-language backend combined with the Jupyter interactive
environment</td>
</tr>
<tr class="even">
<td><a href="https://zeppelin.apache.org/">Apache Zeppelin</a></td>
<td>Web-based notebook that enables data-driven, interactive data
analytics and collaborative documents with SQL, Scala and more</td>
</tr>
<tr class="odd">
<td><a
href="https://github.com/alteryx/featuretools">Featuretools</a></td>
<td>An open source framework for automated feature engineering written
in Python</td>
</tr>
<tr class="even">
<td><a href="https://github.com/hi-primus/optimus">Optimus</a></td>
<td>Cleansing, pre-processing, feature engineering, exploratory data
analysis and easy ML with PySpark backend.</td>
</tr>
<tr class="odd">
<td><a
href="https://github.com/albumentations-team/albumentations">Albumentations</a></td>
<td>A fast, framework-agnostic image augmentation library that
implements a diverse set of augmentation techniques. Supports
classification, segmentation, and detection out of the box. It has been
used to win a number of deep learning competitions on Kaggle and Topcoder
and at CVPR workshops.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/iterative/dvc">DVC</a></td>
<td>An open-source data science version control system. It helps track,
organize, and make data science projects reproducible. In its most basic
use, it helps version-control and share large data and model
files.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/asavinov/lambdo">Lambdo</a></td>
<td>A workflow engine that significantly simplifies data analysis by
combining in one analysis pipeline (i) feature engineering and machine
learning, (ii) model training and prediction, and (iii) table population
and column evaluation.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/feast-dev/feast">Feast</a></td>
<td>A feature store for the management, discovery, and access of machine
learning features. Feast provides a consistent view of feature data for
both model training and model serving.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/polyaxon/polyaxon">Polyaxon</a></td>
<td>A platform for reproducible and scalable machine learning and deep
learning.</td>
</tr>
<tr class="even">
<td><a href="https://www.lighttag.io/">LightTag</a></td>
<td>Text Annotation Tool for teams</td>
</tr>
<tr class="odd">
<td><a href="https://ubiai.tools">UBIAI</a></td>
<td>Easy-to-use text annotation tool for teams with comprehensive
auto-annotation features. Supports NER, relations, and document
classification, as well as OCR annotation for invoice labeling.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/allegroai/clearml">ClearML (formerly Trains)</a></td>
<td>Auto-Magical Experiment Manager, Version Control &amp; DevOps for
AI</td>
</tr>
<tr class="odd">
<td><a
href="https://github.com/logicalclocks/hopsworks">Hopsworks</a></td>
<td>Open-source data-intensive machine learning platform with a feature
store. Ingest and manage features for both online (MySQL Cluster) and
offline (Apache Hive) access, train and serve models at scale.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/mindsdb/mindsdb">MindsDB</a></td>
<td>MindsDB is an Explainable AutoML framework for developers. With
MindsDB you can build, train, and use state-of-the-art ML models with as
little as one line of code.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/mindsdb/lightwood">Lightwood</a></td>
<td>A PyTorch-based framework that breaks down machine learning problems
into smaller blocks that can be glued together seamlessly, with the
objective of building predictive models with one line of code.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/awslabs/aws-data-wrangler">AWS Data
Wrangler</a></td>
<td>An open-source Python package that extends the power of the Pandas
library to AWS, connecting DataFrames and AWS data-related services
(Amazon Redshift, AWS Glue, Amazon Athena, Amazon EMR, etc.).</td>
</tr>
<tr class="odd">
<td><a href="https://aws.amazon.com/rekognition/">Amazon
Rekognition</a></td>
<td>Amazon Rekognition is a service that lets developers working with
Amazon Web Services add image analysis to their applications. Catalog
assets, automate workflows, and extract meaning from your media and
applications.</td>
</tr>
<tr class="even">
<td><a href="https://aws.amazon.com/textract/">Amazon Textract</a></td>
<td>Automatically extract printed text, handwriting, and data from any
document.</td>
</tr>
<tr class="odd">
<td><a href="https://aws.amazon.com/lookout-for-vision/">Amazon Lookout
for Vision</a></td>
<td>Spot product defects using computer vision to automate quality
inspection. Identify missing product components, vehicle and structure
damage, and irregularities for comprehensive quality control.</td>
</tr>
<tr class="even">
<td><a href="https://aws.amazon.com/codeguru/">Amazon CodeGuru</a></td>
<td>Automate code reviews and optimize application performance with
ML-powered recommendations.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/iterative/cml">CML</a></td>
<td>An open source toolkit for using continuous integration in data
science projects. Automatically train and test models in production-like
environments with GitHub Actions &amp; GitLab CI, and autogenerate
visual reports on pull/merge requests.</td>
</tr>
<tr class="even">
<td><a href="https://dask.org/">Dask</a></td>
<td>An open source Python library to painlessly transition your
analytics code to distributed computing systems (Big Data)</td>
</tr>
<tr class="odd">
<td><a
href="https://www.statsmodels.org/stable/index.html">Statsmodels</a></td>
<td>A Python-based inferential statistics, hypothesis testing and
regression framework</td>
</tr>
<tr class="even">
<td><a href="https://radimrehurek.com/gensim/">Gensim</a></td>
<td>An open-source library for topic modeling of natural language
text</td>
</tr>
<tr class="odd">
<td><a href="https://spacy.io/">spaCy</a></td>
<td>A performant natural language processing toolkit</td>
</tr>
<tr class="even">
<td><a href="https://github.com/ricklamers/gridstudio">Grid
Studio</a></td>
<td>Grid Studio is a web-based spreadsheet application with full
integration of the Python programming language.</td>
</tr>
<tr class="odd">
<td><a
href="https://github.com/jakevdp/PythonDataScienceHandbook">Python Data
Science Handbook</a></td>
<td>Python Data Science Handbook: full text in Jupyter Notebooks</td>
</tr>
<tr class="even">
<td><a
href="https://github.com/benedekrozemberczki/shapley">Shapley</a></td>
<td>A data-driven framework to quantify the value of classifiers in a
machine learning ensemble.</td>
</tr>
<tr class="odd">
<td><a href="https://dagshub.com">DAGsHub</a></td>
<td>A platform built on open source tools for data, model and pipeline
management.</td>
</tr>
<tr class="even">
<td><a href="https://deepnote.com">Deepnote</a></td>
<td>A new kind of data science notebook. Jupyter-compatible, with
real-time collaboration and running in the cloud.</td>
</tr>
<tr class="odd">
<td><a href="https://valohai.com">Valohai</a></td>
<td>An MLOps platform that handles machine orchestration, automatic
reproducibility and deployment.</td>
</tr>
<tr class="even">
<td><a href="https://docs.pymc.io/">PyMC3</a></td>
<td>A Python library for probabilistic programming (Bayesian inference
and machine learning)</td>
</tr>
<tr class="odd">
<td><a href="https://pypi.org/project/pystan/">PyStan</a></td>
<td>Python interface to Stan (Bayesian inference and modeling)</td>
</tr>
<tr class="even">
<td><a href="https://pypi.org/project/hmmlearn/">hmmlearn</a></td>
<td>Unsupervised learning and inference of Hidden Markov Models</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/chaos-genius/chaos_genius/">Chaos
Genius</a></td>
<td>ML-powered analytics engine for outlier/anomaly detection and root
cause analysis</td>
</tr>
<tr class="even">
<td><a href="https://nimblebox.ai/">Nimblebox</a></td>
<td>A full-stack MLOps platform designed to help data scientists and
machine learning practitioners around the world discover, create, and
launch multi-cloud apps from their web browser.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/towhee-io/towhee">Towhee</a></td>
<td>A Python library that helps you encode your unstructured data into
embeddings.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/LineaLabs/lineapy">LineaPy</a></td>
<td>Ever been frustrated with cleaning up long, messy Jupyter notebooks?
With LineaPy, an open source Python library, it takes as little as two
lines of code to transform messy development code into production
pipelines.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/tensorchord/envd">envd</a></td>
<td>🏕️ machine learning development environment for data science and
AI/ML engineering teams</td>
</tr>
<tr class="even">
<td><a href="https://kandi.openweaver.com/explore/data-science">Explore
Data Science Libraries</a></td>
<td>A search engine 🔎 tool to discover &amp; find a curated list of
popular &amp; new libraries, top authors, trending project kits,
discussions, tutorials &amp; learning resources</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/iterative/mlem">MLEM</a></td>
<td>🐶 Version and deploy your ML models following GitOps
principles</td>
</tr>
<tr class="even">
<td><a href="https://mlflow.org/">MLflow</a></td>
<td>MLOps framework for managing ML models across their full
lifecycle</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/cleanlab/cleanlab">cleanlab</a></td>
<td>Python library for data-centric AI and automatically detecting
various issues in ML datasets</td>
</tr>
<tr class="even">
<td><a href="https://github.com/awslabs/autogluon">AutoGluon</a></td>
<td>AutoML to easily produce accurate predictions for image, text,
tabular, time-series, and multi-modal data</td>
</tr>
<tr class="odd">
<td><a href="https://arize.com/">Arize AI</a></td>
<td>Community-tier observability tool for monitoring machine
learning models in production and root-causing issues such as data
quality and performance drift.</td>
</tr>
<tr class="even">
<td><a href="https://aureo.io">Aureo.io</a></td>
<td>Aureo.io is a low-code platform that focuses on building artificial
intelligence. It lets users create pipelines and automations and
integrate them with artificial intelligence models, all using their
basic data.</td>
</tr>
<tr class="odd">
<td><a href="https://www.erdlab.io/">ERD Lab</a></td>
<td>Free cloud-based entity-relationship diagram (ERD) tool made for
developers.</td>
</tr>
<tr class="even">
<td><a href="https://docs.arize.com/phoenix">Arize-Phoenix</a></td>
<td>MLOps in a notebook: uncover insights, surface problems, monitor,
and fine-tune your models.</td>
</tr>
<tr class="odd">
<td><a href="https://github.com/comet-ml/comet-examples">Comet</a></td>
<td>An MLOps platform with experiment tracking, model production
management, a model registry, and full data lineage to support your ML
workflow from training straight through to production.</td>
</tr>
<tr class="even">
<td><a href="https://github.com/comet-ml/comet-llm">CometLLM</a></td>
<td>Log, track, visualize, and search your LLM prompts and chains in one
easy-to-use, 100% open-source tool.</td>
</tr>
<tr class="odd">
<td><a href="https://synthical.com">Synthical</a></td>
<td>AI-powered collaborative environment for research. Find relevant
papers, create collections to manage bibliography, and summarize content
— all in one place</td>
</tr>
<tr class="even">
<td><a href="https://github.com/mmore500/teeplot">teeplot</a></td>
<td>Workflow tool to automatically organize data visualization
output</td>
</tr>
</tbody>
</table>
<h2 id="literature-and-media">Literature and Media</h2>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<p>This section includes some additional reading material, channels to
watch, and talks to listen to.</p>
<h3 id="books">Books</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a
href="https://www.amazon.com/Data-Science-Scratch-Principles-Python-dp-1492041130/dp/1492041130/ref=dp_ob_title_bk">Data
Science From Scratch: First Principles with Python</a></li>
<li><a
href="https://www.tutorialspoint.com/artificial_intelligence_with_python/artificial_intelligence_with_python_tutorial.pdf">Artificial
Intelligence with Python - Tutorialspoint</a></li>
<li><a
href="https://dafriedman97.github.io/mlbook/content/introduction.html">Machine
Learning from Scratch</a></li>
<li><a href="https://probml.github.io/pml-book/book1.html">Probabilistic
Machine Learning: An Introduction</a></li>
<li><a
href="https://www.eecs189.org/static/resources/comprehensive-guide.pdf">A
Comprehensive Guide to Machine Learning</a></li>
<li><a
href="https://www.manning.com/books/how-to-lead-in-data-science">How to
Lead in Data Science</a> - Early Access</li>
<li><a
href="https://www.manning.com/books/fighting-churn-with-data">Fighting
Churn With Data</a></li>
<li><a
href="https://www.manning.com/books/data-science-with-python-and-dask">Data
Science at Scale with Python and Dask</a></li>
<li><a
href="https://jakevdp.github.io/PythonDataScienceHandbook/">Python Data
Science Handbook</a></li>
<li><a href="https://www.thedatasciencehandbook.com/">The Data Science
Handbook: Advice and Insights from 25 Amazing Data Scientists</a></li>
<li><a
href="https://www.manning.com/books/think-like-a-data-scientist">Think
Like a Data Scientist</a></li>
<li><a
href="https://www.manning.com/books/introducing-data-science">Introducing
Data Science</a></li>
<li><a
href="https://www.manning.com/books/practical-data-science-with-r">Practical
Data Science with R</a></li>
<li><a
href="https://www.amazon.com/dp/B08TZ1MT3W/ref=cm_sw_r_cp_apa_fabc_a0ceGbWECF9A8">Everyday
Data Science</a> &amp; <a href="https://gum.co/everydaydata">(cheaper
PDF version)</a></li>
<li><a
href="https://www.manning.com/books/exploring-data-science">Exploring
Data Science</a> - free eBook sampler</li>
<li><a
href="https://www.manning.com/books/exploring-the-data-jungle">Exploring
the Data Jungle</a> - free eBook sampler</li>
<li><a
href="https://www.manning.com/books/classic-computer-science-problems-in-python">Classic
Computer Science Problems in Python</a></li>
<li><a href="https://www.manning.com/books/math-for-programmers">Math
for Programmers</a> - Early Access</li>
<li><a href="https://www.manning.com/books/r-in-action-third-edition">R
in Action, Third Edition</a> - Early Access</li>
<li><a href="https://www.manning.com/books/data-science-bookcamp">Data
Science Bookcamp</a> - Early Access</li>
<li><a href="https://www.springer.com/gp/book/9783319950914">Data
Science Thinking: The Next Scientific, Technological and Economic
Revolution</a></li>
<li><a href="https://www.springer.com/gp/book/9783030118204">Applied
Data Science: Lessons Learned for the Data-Driven Business</a></li>
<li><a
href="https://www.amazon.com/Data-Science-Handbook-Field-Cady/dp/1119092949">The
Data Science Handbook</a></li>
<li><a
href="https://www.manning.com/books/getting-started-with-natural-language-processing">Essential
Natural Language Processing</a> - Early access</li>
<li><a href="https://www.mmds.org/">Mining Massive Datasets</a> - free
e-book complemented by an online course</li>
<li><a href="https://www.manning.com/books/pandas-in-action">Pandas in
Action</a> - Early access</li>
<li><a href="https://www.taylorfrancis.com/books/9780429141973">Genetic
Algorithms and Genetic Programming</a></li>
<li><a
href="https://www.intechopen.com/books/advances_in_evolutionary_algorithms">Advances
in Evolutionary Algorithms</a> - Free Download</li>
<li><a
href="https://www.intechopen.com/books/genetic-programming-new-approaches-and-successful-applications">Genetic
Programming: New Approaches and Successful Applications</a> - Free
Download</li>
<li><a
href="https://www.intechopen.com/books/evolutionary-algorithms">Evolutionary
Algorithms</a> - Free Download</li>
<li><a href="https://www.cs.bham.ac.uk/~wbl/aigp3/">Advances in Genetic
Programming, Vol. 3</a> - Free Download</li>
<li><a href="https://www.it-weise.de/projects/book.pdf">Global
Optimization Algorithms: Theory and Application</a> - Free Download</li>
<li><a
href="https://www.talkorigins.org/faqs/genalg/genalg.html">Genetic
Algorithms and Evolutionary Computation</a> - Free Download</li>
<li><a
href="https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf">Convex
Optimization</a> - Convex Optimization book by Stephen Boyd - Free
Download</li>
<li><a
href="https://www.manning.com/books/data-analysis-with-python-and-pyspark">Data
Analysis with Python and PySpark</a> - Early Access</li>
<li><a href="https://r4ds.had.co.nz/">R for Data Science</a></li>
<li><a
href="https://www.manning.com/books/build-a-career-in-data-science">Build
a Career in Data Science</a></li>
<li><a href="https://mlbookcamp.com/">Machine Learning Bookcamp</a> -
Early access</li>
<li><a
href="https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/">Hands-On
Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd
Edition</a></li>
<li><a
href="https://www.manning.com/books/effective-data-science-infrastructure">Effective
Data Science Infrastructure</a></li>
<li><a href="https://valohai.com/mlops-ebook/">Practical MLOps: How to
Get Ready for Production Models</a></li>
<li><a
href="https://www.manning.com/books/regression-a-friendly-guide">Regression,
a Friendly Guide</a> - Early Access</li>
<li><a
href="https://www.oreilly.com/library/view/streaming-systems/9781491983867/">Streaming
Systems: The What, Where, When, and How of Large-Scale Data
Processing</a></li>
<li><a
href="https://www.oreilly.com/library/view/data-science-at/9781491947845/">Data
Science at the Command Line: Facing the Future with Time-Tested
Tools</a></li>
<li><a
href="https://www.cin.ufpe.br/~cavmj/Machine%20-%20Learning%20-%20Tom%20Mitchell.pdf">Machine
Learning - CIn UFPE</a></li>
<li><a
href="https://www.tutorialspoint.com/machine_learning_with_python/machine_learning_with_python_tutorial.pdf">Machine
Learning with Python - Tutorialspoint</a></li>
<li><a href="https://www.deeplearningbook.org/">Deep Learning</a></li>
<li><a
href="https://www.manning.com/books/designing-cloud-data-platforms">Designing
Cloud Data Platforms</a> - Early Access</li>
<li><a href="https://www.statlearning.com/">An Introduction to
Statistical Learning with Applications in R</a></li>
<li><a href="https://hastie.su.domains/ElemStatLearn/">The Elements of
Statistical Learning: Data Mining, Inference, and Prediction</a></li>
<li><a
href="https://www.simonandschuster.com/books/Deep-Learning-with-PyTorch/Eli-Stevens/9781617295263">Deep
Learning with PyTorch</a></li>
<li><a href="https://neuralnetworksanddeeplearning.com">Neural Networks
and Deep Learning</a></li>
<li><a
href="https://www.oreilly.com/library/view/deep-learning-cookbook/9781491995839/">Deep
Learning Cookbook</a></li>
<li><a
href="https://www.oreilly.com/library/view/introduction-to-machine/9781449369880/">Introduction
to Machine Learning with Python</a></li>
<li><a href="https://artint.info/index.html">Artificial Intelligence:
Foundations of Computational Agents, 2nd Edition</a> - Free HTML
version</li>
<li><a href="https://ai.stanford.edu/~nilsson/QAI/qai.pdf">The Quest for
Artificial Intelligence: A History of Ideas and Achievements</a> - Free
Download</li>
<li><a
href="https://www.manning.com/books/graph-algorithms-for-data-science">Graph
Algorithms for Data Science</a> - Early Access</li>
<li><a href="https://www.manning.com/books/data-mesh-in-action">Data
Mesh in Action</a> - Early Access</li>
<li><a
href="https://www.manning.com/books/julia-for-data-analysis">Julia for
Data Analysis</a> - Early Access</li>
<li><a
href="https://www.manning.com/books/causal-inference-for-data-science">Causal
Inference for Data Science</a> - Early Access</li>
<li><a
href="https://www.manning.com/books/regular-expression-puzzles-and-ai-coding-assistants">Regular
Expression Puzzles and AI Coding Assistants</a> by David Mertz</li>
<li><a href="https://d2l.ai/">Dive into Deep Learning</a></li>
<li><a href="https://www.manning.com/books/data-for-all">Data for
All</a></li>
<li><a
href="https://christophm.github.io/interpretable-ml-book/">Interpretable
Machine Learning: A Guide for Making Black Box Models Explainable</a> -
Free GitHub version</li>
<li><a href="https://www.cs.cornell.edu/jeh/book.pdf">Foundations of
Data Science</a> - Free Download</li>
<li><a
href="https://www.amazon.com/Comet-Data-Science-Enhance-optimize/dp/1801814430">Comet
for Data Science: Enhance your ability to manage and optimize the life
cycle of your data science project</a></li>
<li><a
href="https://www.manning.com/books/software-engineering-for-data-scientists">Software
Engineering for Data Scientists</a> - Early Access</li>
<li><a href="https://www.manning.com/books/julia-for-data-science">Julia
for Data Science</a> - Early Access</li>
<li><a href="https://www.statlearning.com/">An Introduction to
Statistical Learning</a> - Download Page</li>
<li><a
href="https://www.amazon.in/Machine-Learning-Absolute-Beginners-Introduction-ebook/dp/B07335JNW1">Machine
Learning For Absolute Beginners</a></li>
</ul>
<h4 id="book-deals-affiliated">Book Deals (Affiliated) 🛍</h4>
<ul>
<li><p><a
href="https://www.manning.com/?utm_source=mikrobusiness&amp;utm_medium=affiliate&amp;utm_campaign=ebook_sale_8_8_22">eBook
sale - Save up to 45% on eBooks!</a></p></li>
<li><p><a
href="https://www.manning.com/books/causal-machine-learning?utm_source=mikrobusiness&amp;utm_medium=affiliate&amp;utm_campaign=book_ness_causal_7_26_22&amp;a_aid=mikrobusiness&amp;a_bid=43a2198b">Causal
Machine Learning</a></p></li>
<li><p><a
href="https://www.manning.com/books/managing-machine-learning-projects?utm_source=mikrobusiness&amp;utm_medium=affiliate&amp;utm_campaign=book_thompson_managing_6_14_22">Managing
ML Projects</a></p></li>
<li><p><a
href="https://www.manning.com/books/causal-inference-for-data-science?utm_source=mikrobusiness&amp;utm_medium=affiliate&amp;utm_campaign=book_ruizdevilla_causal_6_6_22">Causal
Inference for Data Science</a></p></li>
<li><p><a
href="https://www.manning.com/books/data-for-all?utm_source=mikrobusiness&amp;utm_medium=affiliate">Data
for All</a></p></li>
</ul>
<h3 id="journals-publications-and-magazines">Journals, Publications and
Magazines</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://icml.cc/2015/">ICML</a> - International Conference
on Machine Learning</li>
<li><a
href="https://gecco-2019.sigevo.org/index.html/HomePage">GECCO</a> - The
Genetic and Evolutionary Computation Conference (GECCO)</li>
<li><a
href="https://epjdatascience.springeropen.com/">epjdatascience</a></li>
<li><a href="https://jds-online.org/journal/JDS">Journal of Data
Science</a> - an international journal devoted to applications of
statistical methods at large</li>
<li><a href="https://www.journals.elsevier.com/big-data-research">Big
Data Research</a></li>
<li><a href="https://journalofbigdata.springeropen.com/">Journal of Big
Data</a></li>
<li><a href="https://journals.sagepub.com/home/bds">Big Data &amp;
Society</a></li>
<li><a href="https://www.jstage.jst.go.jp/browse/dsj">Data Science
Journal</a></li>
<li><a href="https://www.datatau.com/news">datatau.com/news</a> - Like
Hacker News, but for data</li>
<li><a href="https://trello.com/b/rbpEfMld/data-science">Data Science
Trello Board</a></li>
<li><a href="https://medium.com/tag/data-science">Medium Data Science
Topic</a> - Data science publications on Medium</li>
<li><a
href="https://towardsdatascience.com/introduction-to-genetic-algorithms-including-example-code-e396e98d8bf3#:~:text=A%20genetic%20algorithm%20is%20a,offspring%20of%20the%20next%20generation.">Towards
Data Science Genetic Algorithm Topic</a> - Genetic algorithm
publications on Towards Data Science</li>
<li><a href="https://allainews.com/">all AI news</a> - The AI/ML/Big
Data news aggregator platform</li>
</ul>
<h3 id="newsletters">Newsletters</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://aidigest.net/">AI Digest</a>. A weekly newsletter
to keep up to date with AI, machine learning, and data science. <a
href="https://aidigest.net/digests">Archive</a>.</li>
<li><a href="https://datatalks.club">DataTalks.Club</a>. A weekly
newsletter about data-related things. <a
href="https://us19.campaign-archive.com/home/?u=0d7822ab98152f5afc118c176&amp;id=97178021aa">Archive</a>.</li>
<li><a href="https://roundup.getdbt.com/about">The Analytics Engineering
Roundup</a>. A newsletter about data science. <a
href="https://roundup.getdbt.com/archive">Archive</a>.</li>
</ul>
<h3 id="bloggers">Bloggers</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://wesmckinney.com/archives.html">Wes McKinney</a> -
Wes McKinney Archives.</li>
<li><a href="https://miningthesocialweb.com/">Matthew Russell</a> -
Mining The Social Web.</li>
<li><a href="https://www.gregreda.com/">Greg Reda</a> - Greg Reda
Personal Blog</li>
<li><a href="https://kldavenport.com/">Kevin Davenport</a> - Kevin
Davenport Personal Blog</li>
<li><a href="https://jvns.ca/">Julia Evans</a> - Recurse Center
alumna</li>
<li><a href="https://www.cse.unr.edu/~hkardes/">Hakan Kardes</a> -
Personal Web Page</li>
<li><a href="https://seanjtaylor.com/">Sean J. Taylor</a> - Personal Web
Page</li>
<li><a href="https://drewconway.com/">Drew Conway</a> - Personal Web
Page</li>
<li><a href="https://hilarymason.com/">Hilary Mason</a> - Personal Web
Page</li>
<li><a href="https://complexdiagrams.com/">Noah Iliinsky</a> - Personal
Blog</li>
<li><a href="https://hairysun.com/">Matt Harrison</a> - Personal
Blog</li>
<li><a href="https://allthingsds.wordpress.com/">Vamshi Ambati</a> -
AllThings Data Science</li>
<li><a href="https://www.mdmgeek.com/">Prash Chan</a> - Tech Blog on
Master Data Management And Every Buzz Surrounding It</li>
<li><a href="https://datasciencemasters.org/">Clare Corthell</a> - The
Open Source Data Science Masters</li>
<li><a href="https://cloudofdata.com/">Paul Miller</a> - Based in the UK
and working globally, Cloud of Data's consultancy services help clients
understand the implications of taking data and more to the Cloud.</li>
<li><a href="https://datasciencelondon.org/">Data Science London</a> -
A non-profit organization dedicated to the free, open dissemination of
data science. It describes itself as the largest data science community
in Europe, with more than 3,190 data scientists and data geeks.</li>
<li><a href="http://www.datawrangling.org">Datawrangling</a> by Peter
Skomoroch - Machine learning, data mining, and more</li>
<li><a href="https://www.quora.com/topic/Data-Science">Quora Data
Science</a> - Data Science Questions and Answers from experts</li>
<li><a href="https://openresearch.wordpress.com/">Siah</a> - A PhD
student at Berkeley</li>
<li><a href="https://www.ownml.co/blog/">Louis Dorard</a> - A technology
guy with a penchant for the web and for data, big and small</li>
<li><a href="https://machinelearningmastery.com/">Machine Learning
Mastery</a> - Helps professional programmers confidently apply machine
learning algorithms to complex problems.</li>
<li><a href="https://www.danielforsyth.me/">Daniel Forsyth</a> -
Personal Blog</li>
<li><a href="https://www.datascienceweekly.org/">Data Science Weekly</a>
- Weekly News Blog</li>
<li><a href="https://blog.revolutionanalytics.com/">Revolution
Analytics</a> - Data Science Blog</li>
<li><a href="https://www.r-bloggers.com/">R Bloggers</a> - R
Bloggers</li>
<li><a href="https://practicalquant.blogspot.com/">The Practical
Quant</a> - Big data</li>
<li><a href="https://yet-another-data-blog.blogspot.com/">Yet Another
Data Blog</a> Yet Another Data Blog</li>
<li><a href="https://spenczar.com/">Spenczar</a> - A data scientist at
<em>Twitch</em> who handles the whole data pipeline, from tracking to
model-building to reporting.</li>
<li><a href="https://www.kdnuggets.com/">KD Nuggets</a> - Data mining,
analytics, big data, and data science. A portal rather than a blog.</li>
<li><a href="https://www.metabrown.com/blog/">Meta Brown</a> - Personal
Blog</li>
<li><a href="https://datascientists.net/">Data Scientist</a> - Building
the data scientist culture.</li>
<li><a href="https://whatsthebigdata.com/">WhatsTheBigData</a> - A blog
exploring the impact of big data on information technology, the
business world, government agencies, and our lives.</li>
<li><a href="https://magnus-notitia.blogspot.com/">Tevfik Kosar</a> -
Magnus Notitia</li>
<li><a href="https://newdatascientist.blogspot.com/">New Data
Scientist</a> - How a Social Scientist Jumps into the World of Big
Data</li>
<li><a href="https://harvarddatascience.com/">Harvard Data Science</a> -
Thoughts on Statistical Computing and Visualization</li>
<li><a href="https://ryanswanstrom.com/datascience101/">Data Science
101</a> - Learning To Be A Data Scientist</li>
<li><a href="https://www.chioka.in/kaggle-competition-solutions/">Kaggle
Past Solutions</a></li>
<li><a
href="https://datascientistjourney.wordpress.com/category/data-science/">DataScientistJourney</a></li>
<li><a href="https://chriswhong.github.io/nyctaxi/">NYC Taxi
Visualization Blog</a></li>
<li><a href="https://learninglover.com/blog/">Learning Lover</a></li>
<li><a href="https://www.dataists.com/">Dataists</a></li>
<li><a href="https://www.data-mania.com/">Data-Mania</a></li>
<li><a href="https://data-magnum.com/">Data-Magnum</a></li>
<li><a href="https://www.p-value.info/">P-value</a> - Musings on data
science, machine learning, and stats.</li>
<li><a
href="https://datascopeanalytics.com/blog/">datascopeanalytics</a></li>
<li><a href="https://tarrysingh.com/">Digital transformation</a></li>
<li><a href="https://www.data-mania.com/blog/">Data Mania Blog</a></li>
<li><a href="https://chris-said.io/">The File Drawer</a> - Chris Said's
science blog</li>
<li><a href="https://www.emilio.ferrara.name/">Emilio Ferrara's web
page</a></li>
<li><a href="https://datanews.tumblr.com/">DataNews</a></li>
<li><a href="https://www.reddit.com/r/textdatamining/">Reddit
TextMining</a></li>
<li><a href="https://periscopic.com/#!/news">Periscopic</a></li>
<li><a href="https://hilaryparker.com/">Hilary Parker</a></li>
<li><a href="https://datastori.es/">Data Stories</a></li>
<li><a href="https://datasciencelab.wordpress.com/">Data Science
Lab</a></li>
<li><a href="https://www.kennybastani.com/">Meaning of</a></li>
<li><a href="https://blog.smola.org">Adventures in Data Land</a></li>
<li><a href="https://blog.data-miners.com/">DATA MINERS BLOG</a></li>
<li><a href="https://theblog.okcupid.com/">Dataclysm</a></li>
<li><a href="https://flowingdata.com/">FlowingData</a> - Visualization
and Statistics</li>
<li><a href="https://www.calculatedriskblog.com/">Calculated
Risk</a></li>
<li><a
href="https://www.oreilly.com/content/topics/oreilly-learning/">O'Reilly
Learning Blog</a></li>
<li><a href="https://blog.dominodatalab.com/">Dominodatalab</a></li>
<li><a href="https://iamtrask.github.io/">i am trask</a> - A Machine
Learning Craftsmanship Blog</li>
<li><a href="https://datasciencevademecum.wordpress.com/">Vademecum of
Practical Data Science</a> - Handbook and recipes for data-driven
solutions of real-world problems</li>
<li><a href="https://dataconomy.com/">Dataconomy</a> - A blog on the
newly emerging data economy</li>
<li><a href="https://www.springboard.com/blog/">Springboard</a> - A blog
with resources for data science learners</li>
<li><a href="https://www.analyticsvidhya.com/">Analytics Vidhya</a> - A
full-fledged website about data science and analytics study
material.</li>
<li><a href="https://www.kaushik.net/avinash/">Occam's Razor</a> -
Focused on Web Analytics.</li>
<li><a href="https://www.dataschool.io/">Data School</a> - Data science
tutorials for beginners!</li>
<li><a href="https://colah.github.io">Colah's Blog</a> - Blog for
understanding Neural Networks!</li>
<li><a href="https://ruder.io/#open">Sebastian's Blog</a> - Blog for NLP
and transfer learning!</li>
<li><a href="https://distill.pub">Distill</a> - Dedicated to clear
explanations of machine learning!</li>
<li><a href="https://chrisalbon.com/">Chris Albon's Website</a> - Data
Science and AI notes</li>
<li><a href="https://andrewnc.github.io/blog/blog.html">Andrew Carr</a>
- Data Science with Esoteric programming languages</li>
<li><a
href="https://blog.floydhub.com/introduction-to-genetic-algorithms/">floydhub</a>
- Blog for Evolutionary Algorithms</li>
<li><a href="https://jinglescode.github.io/">Jingles</a> - Review and
extract key concepts from academic papers</li>
<li><a href="https://www.nbshare.io/notebooks/data-science/">nbshare</a>
- Data Science notebooks</li>
<li><a href="https://deep-and-shallow.com/">Deep and Shallow</a> - All
things Deep and Shallow in Data Science</li>
<li><a href="https://ltetrel.github.io/">Loic Tetrel</a> - Data science
blog</li>
<li><a href="https://huyenchip.com/blog/">Chip Huyen's Blog</a> - ML
Engineering, MLOps, and the use of ML in startups</li>
<li><a href="https://www.mariakhalusova.com/">Maria Khalusova</a> - Data
science blog</li>
<li><a href="https://medium.com/@aditi2507rastogi">Aditi Rastogi</a> -
ML, DL, and Data Science blog</li>
<li><a href="https://medium.com/@santiagobasulto">Santiago Basulto</a> -
Data Science with Python</li>
<li><a href="https://medium.com/@akhil0435">Akhil Soni</a> - ML, DL and
Data Science</li>
<li><a href="https://akhilworld.hashnode.dev/">Akhil Soni</a> - ML, DL
and Data Science</li>
</ul>
<h3 id="presentations">Presentations</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a
href="https://www.slideshare.net/ryanorban/how-to-become-a-data-scientist">How
to Become a Data Scientist</a></li>
<li><a
href="https://www.slideshare.net/NikoVuokko/introduction-to-data-science-25391618">Introduction
to Data Science</a></li>
<li><a
href="https://www.slideshare.net/pacoid/intro-to-data-science-for-enterprise-big-data">Intro
to Data Science for Enterprise Big Data</a></li>
<li><a
href="https://www.slideshare.net/dtunkelang/how-to-interview-a-data-scientist">How
to Interview a Data Scientist</a></li>
<li><a href="https://github.com/jtleek/datasharing">How to Share Data
with a Statistician</a></li>
<li><a
href="https://www.slideshare.net/katemats/the-science-of-a-great-career-in-data-science">The
Science of a Great Career in Data Science</a></li>
<li><a
href="https://www.slideshare.net/datasciencelondon/big-data-sorry-data-science-what-does-a-data-scientist-do">What
Does a Data Scientist Do?</a></li>
<li><a
href="https://www.slideshare.net/medriscoll/driscoll-strata-buildingdatastartups25may2011clean">Building
Data Start-Ups: Fast, Big, and Focused</a></li>
<li><a
href="https://www.slideshare.net/0xdata/how-to-win-data-science-competitions-with-deep-learning">How
to win data science competitions with Deep Learning</a></li>
<li><a
href="https://www.slideshare.net/AlexeyGrigorev/fullstack-data-scientist">Full-Stack
Data Scientist</a></li>
</ul>
<h3 id="podcasts">Podcasts</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a
href="https://podcasts.apple.com/us/podcast/data-science-at-home/id1069871378">AI
at Home</a></li>
<li><a href="https://www.cognilytica.com/aitoday/">AI Today</a></li>
<li><a href="https://adversariallearning.com/">Adversarial
Learning</a></li>
<li><a
href="https://www.becomingadatascientist.com/category/podcast/">Becoming
a Data Scientist</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PLLvvXm0q8zUbiNdoIazGzlENMXvZ9bd3x">Chai
time Data Science</a></li>
<li><a href="https://datacrunchcorp.com/data-crunch-podcast/">Data
Crunch</a></li>
<li><a href="https://www.dataengineeringpodcast.com/">Data Engineering
Podcast</a></li>
<li><a href="https://datascienceathome.com/">Data Science at
Home</a></li>
<li><a
href="https://community.alteryx.com/t5/Data-Science-Mixer/bg-p/mixer">Data
Science Mixer</a></li>
<li><a href="https://dataskeptic.com/">Data Skeptic</a></li>
<li><a href="https://datastori.es/">Data Stories</a></li>
<li><a
href="https://jameskle.com/writes/category/Datacast">Datacast</a></li>
<li><a
href="https://www.datacamp.com/community/podcast">DataFramed</a></li>
<li><a href="https://anchor.fm/datatalksclub">DataTalks.Club</a></li>
<li><a href="https://wandb.ai/fully-connected/gradient-dissent">Gradient
Dissent</a></li>
<li><a href="https://www.learningmachines101.com/">Learning Machines
101</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PLn_z5E4dh_Lj5eogejMxfOiNX3nOhmhmM">Let's
Data (Brazil)</a></li>
<li><a href="https://lineardigressions.com/">Linear Digressions</a></li>
<li><a href="https://nssdeviations.com/">Not So Standard
Deviations</a></li>
<li><a
href="https://www.oreilly.com/radar/topics/oreilly-data-show-podcast/">O'Reilly
Data Show Podcast</a></li>
<li><a href="https://partiallyderivative.com/">Partially
Derivative</a></li>
<li><a
href="https://www.superdatascience.com/podcast/">Superdatascience</a></li>
<li><a href="https://www.dataengineeringshow.com/">The Data Engineering
Show</a></li>
<li><a href="https://www.radicalai.org/">The Radical AI Podcast</a></li>
<li><a href="https://www.therobotbrains.ai/">The Robot Brains
Podcast</a></li>
<li><a href="https://fivethirtyeight.com/tag/whats-the-point/">What's
The Point</a></li>
<li><a href="https://how-ai-built-this.captivate.fm/">How AI Built
This</a></li>
</ul>
<h3 id="youtube-videos-channels">YouTube Videos &amp; Channels</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://www.youtube.com/watch?v=WXHM_i-fgGo">What is
machine learning?</a></li>
<li><a href="https://www.youtube.com/watch?v=n1ViNeWhC24">Andrew Ng:
Deep Learning, Self-Taught Learning and Unsupervised Feature
Learning</a></li>
<li><a
href="https://www.youtube.com/c/TomiMesterData36comDataScienceForBeginners">Data36
- Data Science for Beginners by Tomi Mester</a></li>
<li><a href="https://www.youtube.com/watch?v=czLI3oLDe8M">Deep Learning:
Intelligence from Big Data</a></li>
<li><a href="https://www.youtube.com/watch?v=1Wp3IIpssEc">Interview with
Google's AI and Deep Learning Godfather Geoffrey Hinton</a></li>
<li><a href="https://www.youtube.com/watch?v=S75EdAcXHKk">Introduction
to Deep Learning with Python</a></li>
<li><a href="https://www.youtube.com/watch?v=elojMnjn4kk">What is
machine learning, and how does it work?</a></li>
<li><a
href="https://www.youtube.com/channel/UCnVzApLJE2ljPZSeQylSEyg">Data
School</a> - Data Science Education</li>
<li><a href="https://www.youtube.com/watch?v=Cu6A96TUy_o">Neural Nets
for Newbies by Melanie Warrick (May 2015)</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH">Neural
Networks video series by Hugo Larochelle</a></li>
<li><a href="https://www.youtube.com/watch?v=evNCyRL3DOU">Google
DeepMind co-founder Shane Legg - Machine Super Intelligence</a></li>
<li><a
href="https://www.youtube.com/watch?v=cHzvYxBN9Ls&amp;list=PLPqVjP3T4RIRsjaW07zoGzH-Z4dBACpxY">Data
Science Primer</a></li>
<li><a href="https://www.youtube.com/watch?v=lpD38NxTOnk">Data Science
with Genetic Algorithms</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PL2zq7klxX5ATMsmyRazei7ZXkP1GHt-vs">Data
Science for Beginners</a></li>
<li><a
href="https://www.youtube.com/channel/UCDvErgK0j5ur3aLgn6U-LqQ">DataTalks.Club</a></li>
<li><a
href="https://www.youtube.com/channel/UCYBSjwkGTK06NnDnFsOcR7g">Mildlyoverfitted
- Tutorials on intermediate ML/DL topics</a></li>
<li><a
href="https://www.youtube.com/channel/UCYBSjwkGTK06NnDnFsOcR7g">mlops.community
- Interviews of industry experts about production ML</a></li>
<li><a href="https://www.youtube.com/c/machinelearningstreettalk">ML
Street Talk - Unabashedly technical and non-commercial, so you will hear
no annoying pitches.</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi">Neural
networks by 3Blue1Brown</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PLQVvvaa0QuDcjD5BAw2DxE6OF2tius3V3">Neural
networks from scratch by Sentdex</a></li>
<li><a
href="https://www.youtube.com/c/ManningPublications/featured">Manning
Publications YouTube channel</a></li>
<li><a href="https://youtu.be/JYuQZii5o58">Ask Dr Chong: How to Lead in
Data Science - Part 1</a></li>
<li><a href="https://youtu.be/SzqIXV-O-ko">Ask Dr Chong: How to Lead in
Data Science - Part 2</a></li>
<li><a href="https://youtu.be/Ogwm7k_smTA">Ask Dr Chong: How to Lead in
Data Science - Part 3</a></li>
<li><a href="https://youtu.be/a9usjdzTxTU">Ask Dr Chong: How to Lead in
Data Science - Part 4</a></li>
<li><a href="https://youtu.be/MYdQq-F3Ws0">Ask Dr Chong: How to Lead in
Data Science - Part 5</a></li>
<li><a href="https://youtu.be/LOOt4OVC3hY">Ask Dr Chong: How to Lead in
Data Science - Part 6</a></li>
<li><a href="https://www.youtube.com/watch?v=9Hk8K8jhiOo">Regression
Models: Applying simple Poisson regression</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PLv8Cp2NvcY8DpVcsmOT71kymgMmcr59Mf">Deep
Learning Architectures</a></li>
<li><a
href="https://www.youtube.com/playlist?list=PL3N9eeOlCrP5cK0QRQxeJd6GrQvhAtpBK">Time
Series Modelling and Analysis</a></li>
</ul>
<h2 id="socialize">Socialize</h2>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<p>Below are some Social Media links. Connect with other data
scientists!</p>
<ul>
<li><a href="#facebook-accounts">Facebook Accounts</a></li>
<li><a href="#twitter-accounts">Twitter Accounts</a></li>
<li><a href="#telegram-channels">Telegram Channels</a></li>
<li><a href="#slack-communities">Slack Communities</a></li>
<li><a href="#github-groups">GitHub Groups</a></li>
<li><a href="#data-science-competitions">Data Science
Competitions</a></li>
</ul>
<h3 id="facebook-accounts">Facebook Accounts</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://www.facebook.com/data">Data</a></li>
<li><a href="https://www.facebook.com/Bigdatascientist">Big Data
Scientist</a></li>
<li><a href="https://www.facebook.com/datascienceday/">Data Science
Day</a></li>
<li><a href="https://www.facebook.com/nycdatascience">Data Science
Academy</a></li>
<li><a
href="https://www.facebook.com/pages/Data-science/431299473579193?ref=br_rs">Facebook
Data Science Page</a></li>
<li><a
href="https://www.facebook.com/pages/Data-Science-London/226174337471513">Data
Science London</a></li>
<li><a
href="https://www.facebook.com/DataScienceTechnologyCorporation?ref=br_rs">Data
Science Technology and Corporation</a></li>
<li><a
href="https://www.facebook.com/groups/1394010454157077/?ref=br_rs">Data
Science - Closed Group</a></li>
<li><a
href="https://www.facebook.com/centerdatasciences?ref=br_rs">Center for
Data Science</a></li>
<li><a href="https://www.facebook.com/groups/bigdatahadoop/">Big data
hadoop NOSQL Hive Hbase</a></li>
<li><a href="https://www.facebook.com/groups/data.analytics/">Analytics,
Data Mining, Predictive Modeling, Artificial Intelligence</a></li>
<li><a href="https://www.facebook.com/groups/434352233255448/">Big Data
Analytics using R</a></li>
<li><a href="https://www.facebook.com/groups/rhadoop/">Big Data
Analytics with R and Hadoop</a></li>
<li><a href="https://www.facebook.com/groups/bigdatalearnings/">Big Data
Learnings</a></li>
<li><a href="https://www.facebook.com/groups/bigdatastatistics/">Big
Data, Data Science, Data Mining &amp; Statistics</a></li>
<li><a
href="https://www.facebook.com/groups/BigDataExpert/">BigData/Hadoop
Expert</a></li>
<li><a href="https://www.facebook.com/groups/machinelearningforum/">Data
Mining / Machine Learning / AI</a></li>
<li><a
href="https://www.facebook.com/groups/dataminingsocialnetworks/">Data
Mining/Big Data - Social Network Ana</a></li>
<li><a href="https://www.facebook.com/datasciencevademecum">Vademecum of
Practical Data Science</a></li>
<li><a href="https://www.facebook.com/groups/veribilimiistanbul/">Veri
Bilimi Istanbul</a></li>
<li><a href="https://www.facebook.com/theDataScienceBlog/">The Data
Science Blog</a></li>
</ul>
<h3 id="twitter-accounts">Twitter Accounts</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<table>
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<thead>
<tr class="header">
<th>Twitter</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="https://twitter.com/BigDataCombine">Big Data
Combine</a></td>
<td>Rapid-fire, live tryouts for data scientists seeking to monetize
their models as trading strategies</td>
</tr>
<tr class="even">
<td>Big Data Mania</td>
<td>Data Viz Wiz, Data Journalist, Growth Hacker, Author of Data Science
for Dummies (2015)</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/analyticbridge">Big Data
Science</a></td>
<td>Big Data, Data Science, Predictive Modeling, Business Analytics,
Hadoop, Decision and Operations Research.</td>
</tr>
<tr class="even">
<td>Charlie Greenbacker</td>
<td>Director of Data Science at <span class="citation"
data-cites="ExploreAltamira">@ExploreAltamira</span></td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/Chris_Said">Chris Said</a></td>
<td>Data scientist at Twitter</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/clarecorthell">Clare Corthell</a></td>
<td>Dev, Design, Data Science <span class="citation"
data-cites="mattermark">@mattermark</span> #hackerei</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/DadiCharles">DADI
Charles-Abner</a></td>
<td>#datascientist <span class="citation"
data-cites="Ekimetrics">@Ekimetrics</span>, #machinelearning #dataviz
#DynamicCharts #Hadoop #R #Python #NLP #Bitcoin #dataenthousiast</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/DataScienceCtrl">Data Science
Central</a></td>
<td>Data Science Central is the industry's single resource for Big Data
practitioners.</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/ds_ldn">Data Science London</a></td>
<td>Data Science. Big Data. Data Hacks. Data Junkies. Data Startups.
Open Data</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/BecomingDataSci">Data Science
Renee</a></td>
<td>Documenting my path from SQL Data Analyst pursuing an Engineering
Master's Degree to Data Scientist</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/TedOBrien93">Data Science
Report</a></td>
<td>Mission is to help guide &amp; advance careers in Data Science &amp;
Analytics</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/datasciencetips">Data Science
Tips</a></td>
<td>Tips and Tricks for Data Scientists around the world! #datascience
#bigdata</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/DataVisualizati">Data Vizzard</a></td>
<td>DataViz, Security, Military</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/DataScienceX">DataScienceX</a></td>
<td></td>
</tr>
<tr class="odd">
<td>deeplearning4j</td>
<td></td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/dpatil">DJ Patil</a></td>
<td>White House Data Chief, VP @ RelateIQ.</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/DominoDataLab">Domino Data Lab</a></td>
<td></td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/drewconway">Drew Conway</a></td>
<td>Data nerd, hacker, student of conflict.</td>
</tr>
<tr class="odd">
<td>Emilio Ferrara</td>
<td>#Networks, #MachineLearning and #DataScience. I work on #Social
Media. Postdoc at <span class="citation"
data-cites="IndianaUniv">@IndianaUniv</span></td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/erinbartolo">Erin Bartolo</a></td>
<td>Running with #BigData, enjoying a love/hate relationship with its
hype. <span class="citation" data-cites="iSchoolSU">@iSchoolSU</span>
#DataScience Program Mgr.</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/gjreda">Greg Reda</a></td>
<td>Working @ <em>GrubHub</em> about data and pandas</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/kdnuggets">Gregory Piatetsky</a></td>
<td>KDnuggets President, Analytics/Big Data/Data Mining/Data Science
expert, KDD &amp; SIGKDD co-founder, was Chief Scientist at 2 startups,
part-time philosopher.</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/hadleywickham">Hadley Wickham</a></td>
<td>Chief Scientist at RStudio, and an Adjunct Professor of Statistics
at the University of Auckland, Stanford University, and Rice
University.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/hakan_kardes">Hakan Kardes</a></td>
<td>Data Scientist</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/hmason">Hilary Mason</a></td>
<td>Data Scientist in Residence at <span class="citation"
data-cites="accel">@accel</span>.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/hackingdata">Jeff Hammerbacher</a></td>
<td>ReTweeting about data science</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/johnmyleswhite">John Myles
White</a></td>
<td>Scientist at Facebook and Julia developer. Author of Machine
Learning for Hackers and Bandit Algorithms for Website Optimization.
Tweets reflect my views only.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/BDataScientist">Juan Miguel
Lavista</a></td>
<td>Principal Data Scientist @ Microsoft Data Science Team</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/b0rk">Julia Evans</a></td>
<td>Hacker - Pandas - Data Analyze</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/kncukier">Kenneth Cukier</a></td>
<td>The Economist's Data Editor and co-author of Big Data
(http://www.big-data-book.com/).</td>
</tr>
<tr class="odd">
<td>Kevin Davenport</td>
<td>Organizer of
https://www.meetup.com/San-Diego-Data-Science-R-Users-Group/</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/justmarkham">Kevin Markham</a></td>
<td>Data science instructor, and founder of <a
href="https://www.dataschool.io/">Data School</a></td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/krees">Kim Rees</a></td>
<td>Interactive data visualization and tools. Data flaneur.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/KirkDBorne">Kirk Borne</a></td>
<td>DataScientist, PhD Astrophysicist, Top #BigData Influencer.</td>
</tr>
<tr class="odd">
<td>Linda Regber</td>
<td>Data storyteller, visualizations.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/lmrei">Luis Rei</a></td>
<td>PhD Student. Programming, Mobile, Web. Artificial Intelligence,
Intelligent Robotics Machine Learning, Data Mining, Natural Language
Processing, Data Science.</td>
</tr>
<tr class="odd">
<td>Mark Stevenson</td>
<td>Data Analytics Recruitment Specialist at Salt (<span
class="citation" data-cites="SaltJobs">@SaltJobs</span>) Analytics -
Insight - Big Data - Data science</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/__mharrison__">Matt Harrison</a></td>
<td>Opinions of full-stack Python guy, author, instructor, currently
playing Data Scientist. Occasional fathering, husbanding, organic
gardening.</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/ptwobrussell">Matthew Russell</a></td>
<td>Mining the Social Web.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/mertnuhoglu">Mert Nuhoğlu</a></td>
<td>Data Scientist at BizQualify, Developer</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/mrogati">Monica Rogati</a></td>
<td>Data @ Jawbone. Turned data into stories &amp; products at LinkedIn.
Text mining, applied machine learning, recommender systems. Ex-gamer,
ex-machine coder; namer.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/noahi">Noah Iliinsky</a></td>
<td>Visualization &amp; interaction designer. Practical cyclist. Author
of vis books: https://www.oreilly.com/pub/au/4419</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/PaulMiller">Paul Miller</a></td>
<td>Cloud Computing/ Big Data/ Open Data Analyst &amp; Consultant.
Writer, Speaker &amp; Moderator. Gigaom Research Analyst.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/peteskomoroch">Peter Skomoroch</a></td>
<td>Creating intelligent systems to automate tasks &amp; improve
decisions. Entrepreneur, ex-Principal Data Scientist <span
class="citation" data-cites="LinkedIn">@LinkedIn</span>. Machine
Learning, ProductRei, Networks</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/MDMGeek">Prash Chan</a></td>
<td>Solution Architect @ IBM, Master Data Management, Data Quality &amp;
Data Governance Blogger. Data Science, Hadoop, Big Data &amp;
Cloud.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/q_datascience">Quora Data
Science</a></td>
<td>Quora's data science topic</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/Rbloggers">R-Bloggers</a></td>
<td>Tweet blog posts from the R blogosphere, data science conferences,
and (!) open jobs for data scientists.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/randhindi">Rand Hindi</a></td>
<td></td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/randal_olson">Randy Olson</a></td>
<td>Computer scientist researching artificial intelligence. Data
tinkerer. Community leader for <span class="citation"
data-cites="DataIsBeautiful">@DataIsBeautiful</span>. #OpenScience
advocate.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/EROLRecep">Recep Erol</a></td>
<td>Data Science geek @ UALR</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/ryanorban">Ryan Orban</a></td>
<td>Data scientist, genetic origamist, hardware aficionado</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/seanjtaylor">Sean J. Taylor</a></td>
<td>Social Scientist. Hacker. Facebook Data Science Team. Keywords:
Experiments, Causal Inference, Statistics, Machine Learning,
Economics.</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/silviakspiva">Silvia K. Spiva</a></td>
<td>#DataScience at Cisco</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/harshbg">Harsh B. Gupta</a></td>
<td>Data Scientist at BBVA Compass</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/spenczar_n">Spencer Nelson</a></td>
<td>Data nerd</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/tozCSS">Talha Oz</a></td>
<td>Enjoys ABM, SNA, DM, ML, NLP, HI, Python, Java. Top percentile
Kaggler/data scientist</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/anskarl">Tasos Skarlatidis</a></td>
<td>Complex Event Processing, Big Data, Artificial Intelligence and
Machine Learning. Passionate about programming and open-source.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/Terry_Timko">Terry Timko</a></td>
<td>InfoGov; Bigdata; Data as a Service; Data Science; Open, Social
&amp; Business Data Convergence</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/TonyBaer">Tony Baer</a></td>
<td>IT analyst with Ovum covering Big Data &amp; data management with
some systems engineering thrown in.</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/tonyojeda3">Tony Ojeda</a></td>
<td>Data Scientist, Author, Entrepreneur. Co-founder <span
class="citation" data-cites="DataCommunityDC">@DataCommunityDC</span>.
Founder <span class="citation"
data-cites="DistrictDataLab">@DistrictDataLab</span>. #DataScience
#BigData #DataDC</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/vambati">Vamshi Ambati</a></td>
<td>Data Science @ PayPal. #NLP, #machinelearning; PhD, Carnegie Mellon
alumni (Blog: https://allthingsds.wordpress.com)</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/wesmckinn">Wes McKinney</a></td>
<td>Pandas (Python Data Analysis library).</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/WileyEd">WileyEd</a></td>
<td>Senior Manager - <span class="citation"
data-cites="Seagate">@Seagate</span> Big Data Analytics <span
class="citation" data-cites="McKinsey">@McKinsey</span> Alum #BigData +
#Analytics Evangelist #Hadoop, #Cloud, #Digital, &amp; #R
Enthusiast</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/datanews">WNYC Data News Team</a></td>
<td>The data news crew at <span class="citation"
data-cites="WNYC">@WNYC</span>. Practicing data-driven journalism,
making it visual, and showing our work.</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/Al_Grigor">Alexey Grigorev</a></td>
<td>Data science author</td>
</tr>
<tr class="even">
<td><a href="https://twitter.com/ilkerarslan_35">İlker Arslan</a></td>
<td>Data science author. Shares mostly about Julia programming</td>
</tr>
<tr class="odd">
<td><a href="https://twitter.com/WeAreInevitable">INEVITABLE</a></td>
<td>AI &amp; Data Science Start-up Company based in England, UK</td>
</tr>
</tbody>
</table>
<h3 id="telegram-channels">Telegram Channels</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://t.me/opendatascience">Open Data Science</a> - First
Telegram Data Science channel. Covers all technical and popular stuff
related to Data Science: AI, Big Data, Machine Learning, Statistics,
general Math, and the applications of the former.</li>
<li><a href="https://t.me/loss_function_porn">Loss function porn</a> -
Beautiful posts on DS/ML themes with video or graphic visualization.</li>
<li><a
href="https://t.me/ai_machinelearning_big_data">Machinelearning</a> -
Daily ML news.</li>
</ul>
<h3 id="slack-communities">Slack Communities</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://datatalks.club">DataTalks.Club</a></li>
<li><a href="https://www.womenwhocode.com/datascience">Women Who Code -
Data Science</a></li>
</ul>
<h3 id="github-groups">GitHub Groups</h3>
<ul>
<li><a href="https://github.com/BIDS">Berkeley Institute for Data
Science</a></li>
</ul>
<h3 id="data-science-competitions">Data Science Competitions</h3>
<p>Some data mining competition platforms:</p>
<ul>
<li><a href="https://www.kaggle.com/">Kaggle</a></li>
<li><a href="https://www.drivendata.org/">DrivenData</a></li>
<li><a href="https://datahack.analyticsvidhya.com/">Analytics
Vidhya</a></li>
<li><a href="https://www.innocentive.com/">InnoCentive</a></li>
<li><a
href="https://www.microprediction.com/python-1">Microprediction</a></li>
</ul>
<h2 id="fun">Fun</h2>
<ul>
<li><a href="#infographics">Infographics</a></li>
<li><a href="#datasets">Datasets</a></li>
<li><a href="#comics">Comics</a></li>
</ul>
<h3 id="infographics">Infographics</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<table>
<colgroup>
<col style="width: 48%" />
<col style="width: 51%" />
</colgroup>
<thead>
<tr class="header">
<th>Preview</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a
href="https://i.imgur.com/0OoLaa5.png"><img src="https://i.imgur.com/0OoLaa5.png" width="150" /></a></td>
<td><a
href="https://searchbusinessanalytics.techtarget.com/feature/Key-differences-of-a-data-scientist-vs-data-engineer">Key
differences of a data scientist vs. data engineer</a></td>
</tr>
<tr class="even">
<td><a
href="https://s3.amazonaws.com/assets.datacamp.com/blog_assets/DataScienceEightSteps_Full.png"><img src="https://cloud.githubusercontent.com/assets/182906/19517857/604f88d8-960c-11e6-97d6-16c9738cb824.png" width="150" /></a></td>
<td>A visual guide to Becoming a Data Scientist in 8 Steps by <a
href="https://www.datacamp.com">DataCamp</a> <a
href="https://s3.amazonaws.com/assets.datacamp.com/blog_assets/DataScienceEightSteps_Full.png">(img)</a></td>
</tr>
<tr class="odd">
<td><a
href="https://i.imgur.com/FxsL3b8.png"><img src="https://i.imgur.com/W2t2Roz.png" width="150" /></a></td>
<td>Mindmap on required skills (<a
href="https://i.imgur.com/FxsL3b8.png">img</a>)</td>
</tr>
<tr class="even">
<td><a
href="https://nirvacana.com/thoughts/wp-content/uploads/2013/07/RoadToDataScientist1.png"><img src="https://i.imgur.com/rb9ruaa.png" width="150" /></a></td>
<td>Swami Chandrasekaran made a <a
href="http://nirvacana.com/thoughts/2013/07/08/becoming-a-data-scientist/">Curriculum
via Metro map</a>.</td>
</tr>
<tr class="odd">
<td><a
href="https://i.imgur.com/4ZBBvb0.png"><img src="https://i.imgur.com/XBgKF2l.png" width="150" /></a></td>
<td>by <a href="https://twitter.com/kzawadz"><span class="citation"
data-cites="kzawadz">@kzawadz</span></a> via <a
href="https://twitter.com/MktngDistillery/status/538671811991715840">twitter</a></td>
</tr>
<tr class="even">
<td><a
href="https://i.imgur.com/xLY3XZn.jpg"><img src="https://i.imgur.com/l9ZGtal.jpg" width="150" /></a></td>
<td>By <a href="https://www.datasciencecentral.com/">Data Science
Central</a></td>
</tr>
<tr class="odd">
<td><a
href="https://i.imgur.com/0TydZ4M.png"><img src="https://i.imgur.com/TWkB4X6.png" width="150" /></a></td>
<td>Data Science Wars: R vs Python</td>
</tr>
<tr class="even">
<td><a
href="https://i.imgur.com/HnRwlce.png"><img src="https://i.imgur.com/gtTlW5I.png" width="150" /></a></td>
<td>How to select statistical or machine learning techniques</td>
</tr>
<tr class="odd">
<td><a
href="https://scikit-learn.org/stable/_static/ml_map.png"><img src="https://scikit-learn.org/stable/_static/ml_map.png" width="150" /></a></td>
<td>Choosing the Right Estimator</td>
</tr>
<tr class="even">
<td><a
href="https://i.imgur.com/uEqMwZa.png"><img src="https://i.imgur.com/3JSyUq1.png" width="150" /></a></td>
<td>The Data Science Industry: Who Does What</td>
</tr>
<tr class="odd">
<td><a
href="https://i.imgur.com/RsHqY84.png"><img src="https://i.imgur.com/DQqFwwy.png" width="150" /></a></td>
<td>Data Science <del>Venn</del> Euler Diagram</td>
</tr>
<tr class="even">
<td><a
href="https://www.springboard.com/blog/wp-content/uploads/2016/03/20160324_springboard_vennDiagram.png"><img src="https://www.springboard.com/blog/wp-content/uploads/2016/03/20160324_springboard_vennDiagram.png" width="150" height="150" /></a></td>
<td>Different Data Science Skills and Roles from <a
href="https://www.springboard.com/blog/data-science-career-paths-different-roles-industry/">this
article</a> by Springboard</td>
</tr>
<tr class="odd">
<td><a
href="https://data-literacy.geckoboard.com/poster/"><img src="https://data-literacy.geckoboard.com/assets/img/data-fallacies-to-avoid-preview.jpg" width="150" alt="Data Fallacies To Avoid" /></a></td>
<td>A simple and friendly way of teaching your non-data
scientist/non-statistician colleagues <a
href="https://data-literacy.geckoboard.com/poster/">how to avoid
mistakes with data</a>. From Geckoboard's <a
href="https://data-literacy.geckoboard.com/">Data Literacy
Lessons</a>.</td>
</tr>
</tbody>
</table>
<h3 id="datasets">Datasets</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a href="https://academictorrents.com/">Academic Torrents</a></li>
<li><a href="https://www.adsbexchange.com/data-samples/">ADS-B
Exchange</a> - Specific datasets for aircraft and Automatic Dependent
Surveillance-Broadcast (ADS-B) sources.</li>
<li><a href="https://catalog.data.gov/dataset">data.gov</a> - The home
of the U.S. Government's open data</li>
<li><a href="https://www.census.gov/">United States Census
Bureau</a></li>
<li><a href="https://usgovxml.com/">usgovxml.com</a></li>
<li><a href="https://enigma.com/">enigma.com</a> - Navigate the world of
public data - Quickly search and analyze billions of public records
published by governments, companies and organizations.</li>
<li><a href="https://datahub.io/">datahub.io</a></li>
<li><a
href="https://aws.amazon.com/datasets/">aws.amazon.com/datasets</a></li>
<li><a href="https://datacite.org/">datacite.org</a></li>
<li><a href="https://data.europa.eu/en">The official portal for European
data</a></li>
<li><a href="https://data.nasdaq.com/">NASDAQ:DATA</a> - Nasdaq Data
Link, a premier source for financial, economic and alternative
datasets.</li>
<li><a href="https://figshare.com/">figshare.com</a></li>
<li><a href="https://dev.maxmind.com/geoip">GeoLite Legacy Downloadable
Databases</a></li>
<li><a
href="https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public">Quora's
Big Datasets Answer</a></li>
<li><a
href="https://hadoopilluminated.com/hadoop_illuminated/Public_Bigdata_Sets.html">Public
Big Data Sets</a></li>
<li><a href="https://www.kaggle.com/datasets">Kaggle Datasets</a></li>
<li><a href="https://www.internationalgenome.org/data">A Deep Catalog of
Human Genetic Variation</a></li>
<li><a href="https://developers.google.com/freebase/">A
community-curated database of well-known people, places, and
things</a></li>
<li><a href="https://www.google.com/publicdata/directory">Google Public
Data</a></li>
<li><a href="https://data.worldbank.org/">World Bank Data</a></li>
<li><a href="https://chriswhong.github.io/nyctaxi/">NYC Taxi
data</a></li>
<li><a href="https://www.opendataphilly.org/">Open Data Philly</a> -
Connecting people with data for Philadelphia</li>
<li><a href="https://grouplens.org/datasets/">grouplens.org</a> - Sample
movie (with ratings), book and wiki datasets</li>
<li><a href="https://archive.ics.uci.edu/ml/">UC Irvine Machine Learning
Repository</a> - contains data sets good for machine learning</li>
<li><a
href="https://web.archive.org/web/20150320022752/https://bitly.com/bundles/hmason/1">research-quality
data sets</a> by <a
href="https://web.archive.org/web/20150501033715/https://bitly.com/u/hmason/bundles">Hilary
Mason</a></li>
<li><a href="https://www.ncei.noaa.gov/">National Centers for
Environmental Information</a></li>
<li><a href="https://www.climatedata.us/">ClimateData.us</a> (related:
<a href="https://toolkit.climate.gov/">U.S. Climate Resilience
Toolkit</a>)</li>
<li><a href="https://www.reddit.com/r/datasets/">r/datasets</a></li>
<li><a href="https://www.maplight.org/data-series">MapLight</a> -
provides a variety of data free of charge for uses that are freely
available to the general public.</li>
<li><a href="https://ghdx.healthdata.org/">GHDx</a> - Institute for
Health Metrics and Evaluation - a catalog of health and demographic
datasets from around the world and including IHME results</li>
<li><a href="https://fred.stlouisfed.org/">St. Louis Federal Reserve
Economic Data - FRED</a></li>
<li><a href="https://data1850.nz/">New Zealand Institute of Economic
Research Data1850</a></li>
<li><a href="https://github.com/datasciencemasters/data">Open Data
Sources</a></li>
<li><a href="https://data.unicef.org/">UNICEF Data</a></li>
<li><a href="https://data.un.org/">undata</a></li>
<li><a href="https://sedac.ciesin.columbia.edu/">NASA SocioEconomic Data
and Applications Center - SEDAC</a></li>
<li><a href="https://www.gdeltproject.org/">The GDELT Project</a></li>
<li><a href="https://www.scb.se/en/">Statistics Sweden</a></li>
<li><a href="https://data.stackexchange.com">StackExchange Data
Explorer</a> - an open source tool for running arbitrary queries against
public data from the Stack Exchange network.</li>
<li><a href="https://socialgrep.com/datasets">SocialGrep</a> - a
collection of open Reddit datasets.</li>
<li><a href="https://datasf.org/opendata/">San Francisco Government Open
Data</a></li>
<li><a href="https://developer.ibm.com/exchanges/data/">IBM Asset
Dataset</a></li>
<li><a href="https://index.okfn.org/">Open Data Index</a></li>
<li><a
href="https://github.com/src-d/datasets/tree/master/PublicGitArchive">Public
Git Archive</a></li>
<li><a href="https://ghtorrent.org/">GHTorrent</a></li>
<li><a href="https://msropendata.com/">Microsoft Research Open
Data</a></li>
<li><a href="https://data.gov.in/">Open Government Data Platform
India</a></li>
<li><a href="https://datasetsearch.research.google.com/">Google Dataset
Search (beta)</a></li>
<li><a href="https://github.com/naynco/nayn.data">NAYN.CO Turkish News
with categories</a></li>
<li><a href="https://github.com/datasets/covid-19">Covid-19</a></li>
<li><a
href="https://github.com/google-research/open-covid-19-data">Covid-19
Google</a></li>
<li><a href="https://www.cs.cmu.edu/~./enron/">Enron Email
Dataset</a></li>
<li><a href="https://github.com/alexeygrigorev/clothing-dataset">5000
Images of Clothes</a></li>
<li><a href="https://data.ibb.gov.tr/en/">IBB Open Portal</a></li>
<li><a href="https://data.humdata.org/">The Humanitarian Data
Exchange</a></li>
</ul>
<h3 id="comics">Comics</h3>
<p><strong><a
href="#awesome-data-science"><code>^ back to top ^</code></a></strong></p>
<ul>
<li><a
href="https://medium.com/@nikhil_garg/a-compilation-of-comics-explaining-statistics-data-science-and-machine-learning-eeefbae91277">Comic
compilation</a></li>
<li><a
href="https://www.kdnuggets.com/websites/cartoons.html">Cartoons</a></li>
</ul>
<h2 id="other-awesome-lists">Other Awesome Lists</h2>
<ul>
<li>Other amazingly awesome lists can be found in the <a
href="https://github.com/bayandin/awesome-awesomeness">awesome-awesomeness</a> list</li>
<li><a
href="https://github.com/josephmisiti/awesome-machine-learning">Awesome
Machine Learning</a></li>
<li><a href="https://github.com/jnv/lists">lists</a></li>
<li><a
href="https://github.com/javierluraschi/awesome-dataviz">awesome-dataviz</a></li>
<li><a
href="https://github.com/vinta/awesome-python">awesome-python</a></li>
<li><a
href="https://github.com/donnemartin/data-science-ipython-notebooks">Data
Science IPython Notebooks</a></li>
<li><a href="https://github.com/qinwf/awesome-R">awesome-r</a></li>
<li><a
href="https://github.com/awesomedata/awesome-public-datasets">awesome-datasets</a></li>
<li><a
href="https://github.com/ujjwalkarn/Machine-Learning-Tutorials/blob/master/README.md">awesome-Machine
Learning &amp; Deep Learning Tutorials</a></li>
<li><a href="https://github.com/JosPolfliet/awesome-ai-usecases">Awesome
Data Science Ideas</a></li>
<li><a
href="https://github.com/ZuzooVn/machine-learning-for-software-engineers">Machine
Learning for Software Engineers</a></li>
<li><a href="https://hackr.io/tutorials/learn-data-science">Community
Curated Data Science Resources</a></li>
<li><a
href="https://github.com/src-d/awesome-machine-learning-on-source-code">Awesome
Machine Learning On Source Code</a></li>
<li><a
href="https://github.com/benedekrozemberczki/awesome-community-detection">Awesome
Community Detection</a></li>
<li><a
href="https://github.com/benedekrozemberczki/awesome-graph-classification">Awesome
Graph Classification</a></li>
<li><a
href="https://github.com/benedekrozemberczki/awesome-decision-tree-papers">Awesome
Decision Tree Papers</a></li>
<li><a
href="https://github.com/benedekrozemberczki/awesome-fraud-detection-papers">Awesome
Fraud Detection Papers</a></li>
<li><a
href="https://github.com/benedekrozemberczki/awesome-gradient-boosting-papers">Awesome
Gradient Boosting Papers</a></li>
<li><a
href="https://github.com/nerox8664/awesome-computer-vision-models">Awesome
Computer Vision Models</a></li>
<li><a
href="https://github.com/benedekrozemberczki/awesome-monte-carlo-tree-search-papers">Awesome
Monte Carlo Tree Search</a></li>
<li><a
href="https://www.analyticsvidhya.com/glossary-of-common-statistics-and-machine-learning-terms/">Glossary
of common statistics and ML terms</a></li>
<li><a href="https://github.com/mhagiwara/100-nlp-papers">100 NLP
Papers</a></li>
<li><a
href="https://github.com/leomaurodesenv/game-datasets#readme">Awesome
Game Datasets</a></li>
<li><a
href="https://github.com/alexeygrigorev/data-science-interviews">Data
Science Interview Questions</a></li>
<li><a
href="https://github.com/AstraZeneca/awesome-explainable-graph-reasoning">Awesome
Explainable Graph Reasoning</a></li>
<li><a
href="https://www.interviewbit.com/data-science-interview-questions/">Top
Data Science Interview Questions</a></li>
<li><a
href="https://github.com/AstraZeneca/awesome-drug-pair-scoring">Awesome
Drug Synergy, Interaction and Polypharmacy Prediction</a></li>
<li><a
href="https://www.adaface.com/blog/deep-learning-interview-questions/">Deep
Learning Interview Questions</a></li>
<li><a
href="https://medium.com/the-modern-scientist/top-future-trends-in-data-science-in-2023-3e616c8998b8">Top
Future Trends in Data Science in 2023</a></li>
<li><a
href="https://hbr.org/2022/11/how-generative-ai-is-changing-creative-work">How
Generative AI Is Changing Creative Work</a></li>
<li><a
href="https://www.techtarget.com/searchenterpriseai/definition/generative-AI">What
is generative AI?</a></li>
</ul>
<h3 id="hobby">Hobby</h3>
<ul>
<li><a href="https://github.com/ad-si/awesome-music-production">Awesome
Music Production</a></li>
</ul>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-YL0RV0E5XZ"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-YL0RV0E5XZ');
</script>