Data Science Tutorials & Resources for Beginners
Awesome (https://github.com/sindresorhus/awesome)

If you want to know more about Data Science but don't know where to start, this list is for you! 📈

No previous knowledge is required, but Python and statistics basics will definitely come in handy. These resources have been used successfully by many beginners at my local Data Science student group ML-KA (http://ml-ka.de/).

What is Data Science?

- 'What is Data Science?' on Quora (https://www.quora.com/What-is-data-science)
- Explanation of important vocabulary (https://www.quora.com/What-is-the-difference-between-Data-Analytics-Data-Analysis-Data-Mining-Data-Science-Machine-Learning-and-Big-Data-1?share=1) - Differentiation of Big Data, Machine Learning, and Data Science.
- Data Science for Business (Book) (https://amzn.to/2voPJUi) - An introduction to Data Science and its use as a business asset.
- Data Science Process: A Beginner’s Comprehensive Guide (https://www.scaler.com/blog/data-science-process/) - Emphasizes the practical technical skills needed throughout the data science process.

Common Algorithms and Procedures

- Supervised vs unsupervised learning (https://stackoverflow.com/questions/1832076/what-is-the-difference-between-supervised-learning-and-unsupervised-learning) - The two most common types of Machine Learning algorithms.
- 9 important Data Science algorithms and their implementation (https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.05-Naive-Bayes.ipynb)
- Cross validation (https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.03-Hyperparameters-and-Model-Validation.ipynb) - Evaluate the performance of your algorithm/model.
- Feature engineering (https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.04-Feature-Engineering.ipynb) - Modify the data to improve model predictions.
- Scientific introduction to 10 important Data Science algorithms (http://www.cs.umd.edu/%7Esamir/498/10Algorithms-08.pdf)
- Model ensemble: Explanation (https://www.analyticsvidhya.com/blog/2017/02/introduction-to-ensembling-along-with-implementation-in-r/) - Combine multiple models into one for better performance.
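
To give a rough feel for the cross-validation idea listed above, here is a minimal pure-Python sketch of k-fold splitting. The function name `k_fold_indices` is hypothetical and not taken from any of the linked resources; it only illustrates how the data can be partitioned so that every sample is used for testing exactly once.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

In practice you would usually shuffle the data first and rely on a library implementation rather than this sketch.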

Data Science using Python

This list covers only Python, as many are already familiar with this language. For R, see Data Science tutorials using R (https://github.com/ujjwalkarn/DataScienceR).

General

- O'Reilly Data Science from Scratch (Book) (https://amzn.to/2GSjjrK) - Data processing, implementation, and visualization with example code.
- Coursera Applied Data Science (https://www.coursera.org/specializations/data-science-python) - Online course using Python that covers most of the relevant toolkits.

Learning Python

- YouTube tutorial series by sentdex (https://www.youtube.com/watch?v=oVp1vrfL_w4&list=PLQVvvaa0QuDe8XSftW-RAxdo6OmaeL85M)
- Interactive Python tutorial website (http://www.learnpython.org/)

numpy

numpy (http://www.numpy.org/) is a Python library which provides large multidimensional arrays and fast mathematical operations on them.

- Numpy tutorial on DataCamp (https://www.datacamp.com/community/tutorials/python-numpy-tutorial#gs.h3DvLnk)
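
As a small illustration of what "multidimensional arrays and fast mathematical operations" looks like, here is a minimal sketch, assuming numpy is installed. The values are made up for the example.

```python
import numpy as np

# A 2x3 array (two rows, three columns).
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Arithmetic is applied element-wise to the whole array, without an explicit loop.
scaled = a * 10

# Reductions can run over the whole array or along one axis.
column_means = a.mean(axis=0)  # mean of each column
total = a.sum()                # sum of all elements

print(a.shape)
print(scaled)
print(column_means)
print(total)
```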

pandas

pandas (http://pandas.pydata.org/index.html) provides efficient data structures and analysis tools for Python. It is built on top of numpy.

- Introduction to pandas (http://www.synesthesiam.com/posts/an-introduction-to-pandas.html)
- DataCamp pandas foundations (https://www.datacamp.com/courses/pandas-foundations) - Paid course, but account creation includes 30 free days (enough to complete the course).
- Pandas cheatsheet (https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf) - Quick overview of the most important functions.
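
For a first impression of the pandas data structures, here is a minimal sketch, assuming pandas is installed. The table contents are hypothetical toy data, chosen only to show a `DataFrame`, a conversion to a numpy array, and a simple group-and-aggregate step.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({
    "city": ["Karlsruhe", "Berlin", "Karlsruhe", "Berlin"],
    "price": [300.0, 450.0, 320.0, 470.0],
})

# A column can be handed to numpy-based code as a plain numpy array.
prices = df["price"].to_numpy()

# Group rows by a column and aggregate each group.
mean_price = df.groupby("city")["price"].mean()

print(prices)
print(mean_price)
```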

scikit-learn

scikit-learn (http://scikit-learn.org/stable/) is the most common library for Machine Learning and Data Science in Python.

- Introduction and first model application (https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.02-Introducing-Scikit-Learn.ipynb)
- Rough guide for choosing estimators (http://scikit-learn.org/stable/tutorial/machine_learning_map/)
- Scikit-learn complete user guide (http://scikit-learn.org/stable/user_guide.html)
- Model ensemble: Implementation in Python (http://machinelearningmastery.com/ensemble-machine-learning-algorithms-python-scikit-learn/)
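
To show roughly what a first model looks like, here is a minimal sketch, assuming scikit-learn is installed. The dataset (the bundled iris data) and the model (logistic regression) are my own choices for illustration and are not prescribed by the linked tutorials.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Small example dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# max_iter is raised so the solver has enough iterations to converge on this data.
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation: fit on four folds, score on the fifth, five times.
scores = cross_val_score(model, X, y, cv=5)

print(scores)
print(scores.mean())
```

The same pattern (create an estimator, then evaluate it with cross-validation) carries over to most other scikit-learn models.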

Jupyter Notebook

Jupyter Notebook (https://jupyter.org/) is a web application for easy data visualisation and code presentation.

- Downloading and running first Jupyter notebook (https://jupyter.org/install.html)
- Example notebook for data exploration (https://www.kaggle.com/sudalairajkumar/simple-exploration-notebook-instacart)
- Seaborn data visualization tutorial (https://elitedatascience.com/python-seaborn-tutorial) - Plot library that works great with Jupyter.

Various other helpful tools and resources

- Template folder structure for organizing Data Science projects (https://github.com/drivendata/cookiecutter-data-science)
- Anaconda Python distribution (https://www.continuum.io/downloads) - Contains most of the important Python packages for Data Science.
- Spacy (https://spacy.io/) - Open source toolkit for working with text-based data.
- LightGBM gradient boosting framework (https://github.com/Microsoft/LightGBM) - Successfully used in many Kaggle challenges.
- Amazon AWS (https://aws.amazon.com/) - Rent cloud servers for more time-consuming calculations (r4.xlarge server is a good place to start).

Data Science Challenges for Beginners

Sorted by increasing complexity.

- Walkthrough: House prices challenge (https://www.dataquest.io/blog/kaggle-getting-started/) - A walkthrough of a simple challenge on house prices.
- Blood Donation Challenge (https://www.drivendata.org/competitions/2/warm-up-predict-blood-donations/) - Predict if a donor will donate again.
- Titanic Challenge (https://www.kaggle.com/c/titanic) - Predict survival on the Titanic.
- Water Pump Challenge (https://www.drivendata.org/competitions/7/pump-it-up-data-mining-the-water-table/) - Predict the operating condition of water pumps in Africa.

More advanced resources and lists

- Awesome Data Science (https://github.com/bulutyazilim/awesome-datascience)
- Data Science Python (https://github.com/ujjwalkarn/DataSciencePython)
- Machine Learning Tutorials (https://github.com/ujjwalkarn/Machine-Learning-Tutorials)

Contribute

Contributions welcome! Read the contribution guidelines (contributing.md) first.

License

CC0 (http://creativecommons.org/publicdomain/zero/1.0)

To the extent possible under law, Simon Böhm has waived all copyright and related or neighboring rights to this work. Disclaimer: Some of the links are affiliate links.

learndatascience GitHub: https://github.com/siboehm/awesome-learn-datascience