The goal of this note is to introduce the “science” of data science. It explains the core machine learning theory for data scientists who have already learned some machine learning techniques. Understanding the origins of machine learning concepts such as over-fitting, regularization, and cross validation prepares data scientists to quickly master newly emerging machine learning techniques, as their theoretical foundations are likely to remain unchanged. A grasp of learning theory also enables data scientists to evaluate, modify, and invent new machine learning techniques as needed, with confidence.
Part I. Machine Learning
The goals of each chapter are outlined as follows:
Chapter 1. What is Machine Learning?
- Introduction
- Is Machine Learning Really a Hype?
- What is Learning?
- Learning Resource
- The Learning Problem
Chapter 2. Learning Theory
- Evaluating a Single Hypothesis - Hoeffding Inequality
- Hypothesis Set and Learning Algorithm
- VC Dimension
- Where Does Over-fitting Come From?
- Regularization
- The Learning Curve
This chapter explains the core of machine learning theory from the frequentist point of view: why machines can learn. It introduces the intuition behind VC theory to explain why our typical models contain only a finite number of effective hypotheses. The multiple-testing problem originates from the multiple hypotheses we consider during learning, which directly leads to a regularization error. It is important that we do not just minimize the in-sample error, which leads to over-fitting; instead, the best model should be identified by minimizing the sum of the in-sample error and the regularization error.
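For reference, the standard textbook statements behind this argument can be written as below. The notation is an assumption, since this summary does not fix one: $N$ is the number of training samples, $\epsilon$ the tolerance, $M$ the number of hypotheses, $m_{\mathcal{H}}$ the growth function of the hypothesis set $\mathcal{H}$, $\delta$ the allowed failure probability, and $g$ the final hypothesis. The square-root term in the last line appears to play the role of the penalty that the chapter calls the regularization error.

```latex
% Single fixed hypothesis h (Hoeffding inequality)
P\left[\,|E_{in}(h) - E_{out}(h)| > \epsilon\,\right] \le 2 e^{-2\epsilon^2 N}

% Any of M hypotheses (union bound, i.e. the multiple-testing penalty)
P\left[\,\exists h \in \mathcal{H}: |E_{in}(h) - E_{out}(h)| > \epsilon\,\right] \le 2 M e^{-2\epsilon^2 N}

% VC generalization bound, holding with probability at least 1 - \delta
E_{out}(g) \le E_{in}(g) + \sqrt{\frac{8}{N} \ln \frac{4\, m_{\mathcal{H}}(2N)}{\delta}}
```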
Chapter 3. Regularization and Cross Validation
- Bayesian Interpretation of Regularization
- Bias and Variance
- Learning Theory Review
- Unbiased $E_{out}$ Estimation with Leave-One-Out Cross Validation
- Hyper-parameter Selection and Model Selection
This chapter first completes the learning theory and continues to emphasize the necessity of the regularization error term. Model hyper-parameters, such as the $\lambda$ within the regularization error term, should be determined using a validation data set. To make the most effective use of our limited training data, the Leave-One-Out Cross Validation scheme is used to provide a nearly unbiased performance estimate for our final hypothesis. In practice, we may need two separate workflows: one to identify the best hypothesis, another to estimate its performance.
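As a rough illustration of choosing $\lambda$ with Leave-One-Out Cross Validation, here is a minimal sketch in plain Python. The model (a one-parameter ridge regression $y \approx w x$ without intercept), the function names, and the toy data are all assumptions made for this example; they are not taken from the chapter.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for y ~ w * x (no intercept, hypothetical toy model)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def loo_cv_error(xs, ys, lam):
    """Mean squared error over all leave-one-out splits for a given lambda."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Train on every point except the i-th one.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        w = fit_ridge_1d(train_x, train_y, lam)
        # Validate on the single held-out point.
        total += (ys[i] - w * xs[i]) ** 2
    return total / n


def select_lambda(xs, ys, candidates):
    """Pick the candidate lambda with the smallest leave-one-out error."""
    return min(candidates, key=lambda lam: loo_cv_error(xs, ys, lam))


# Toy data lying exactly on y = 2x, so no regularization is needed here.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
best_lam = select_lambda(xs, ys, [0.0, 1.0, 10.0])
print(best_lam)
```

Note that the error of the winning $\lambda$ has been used for selection, so it is no longer a clean performance estimate; that is one reason a second, separate workflow may be needed for the final estimate.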
Chapter 4. Linear Models
- Linear Classification - Accuracy
- Logistic Regression - Cross Entropy Loss Function
- Linear Regression - Maximum Likelihood
- Feature Selection Capability of Linear Models
- Non-linear Classifier
- Regularization Again
This chapter discusses three popular applications of linear models: classification, probabilistic classification, and regression. We discuss why linear models have a built-in feature selection capability and how they provide a general framework that can be extended to non-linear models through a mapping function.
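To make the mapping-function idea concrete, the sketch below applies an ordinary linear classifier after mapping each 2D point $(x_1, x_2)$ to $(x_1^2, x_2^2)$. The mapping, the hand-picked weights, and the function names are assumptions chosen for illustration; nothing is learned here. The decision boundary is a straight line in the mapped space but a unit circle in the original space.

```python
def feature_map(x):
    """Hypothetical non-linear mapping: (x1, x2) -> (x1^2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, x2 * x2)


def linear_classify(z, weights, bias):
    """Plain linear classifier: sign of the weighted sum plus bias."""
    score = sum(w * zi for w, zi in zip(weights, z)) + bias
    return 1 if score >= 0 else -1


def circle_classifier(x):
    """Linear in the mapped space, circular in the original space.

    The weights (-1, -1) and bias 1 are hand-picked so that the boundary
    is x1^2 + x2^2 = 1; a real model would learn them from data.
    """
    return linear_classify(feature_map(x), weights=(-1.0, -1.0), bias=1.0)


print(circle_classifier((0.1, 0.2)))  # point inside the unit circle
print(circle_classifier((2.0, 0.0)))  # point outside the unit circle
```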
Extra: Multiclass Classifier
Chapter 5. Support Vector Machines
- Maximum Margin Linear Classifier
- SVM Engine
- Cross Validation of SVM
- Non-linear SVM and Kernel
- RBF Kernel - One Kernel for All
- RBF Kernel for Higher-dimensional Input
- Classification with Probability
This chapter discusses how SVM works. We analyze the popular RBF kernel in great detail. By constructing the infinite-dimensional linear space for the RBF kernel, we explain why the RBF kernel can represent both linear and polynomial models, and why it is the one kernel for all.
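The infinite-dimensional construction can be checked numerically in the simplest setting. The sketch below assumes one-dimensional inputs and a kernel of the form $K(x, y) = e^{-(x-y)^2}$ (a kernel width of 1, chosen only for this example). Expanding $e^{2xy}$ as a Taylor series gives feature coordinates $\phi_k(x) = e^{-x^2}\sqrt{2^k/k!}\,x^k$, so the first coordinates are constant, linear, and quadratic in $x$ up to the common factor $e^{-x^2}$, which hints at how linear and polynomial models are contained in the same space. Function names are hypothetical.

```python
import math


def rbf_kernel(x, y):
    """Exact 1D RBF kernel with width parameter fixed to 1 (an assumption)."""
    return math.exp(-(x - y) ** 2)


def rbf_features(x, n_terms):
    """First n_terms coordinates of the infinite feature vector phi(x)."""
    return [
        math.exp(-x * x) * math.sqrt(2.0 ** k / math.factorial(k)) * x ** k
        for k in range(n_terms)
    ]


def truncated_kernel(x, y, n_terms):
    """Ordinary dot product of the truncated feature vectors."""
    fx = rbf_features(x, n_terms)
    fy = rbf_features(y, n_terms)
    return sum(a * b for a, b in zip(fx, fy))


x, y = 0.5, -0.3
for n_terms in (1, 2, 3, 20):
    # The dot product approaches the exact kernel as more terms are kept.
    print(n_terms, truncated_kernel(x, y, n_terms), rbf_kernel(x, y))
```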
Chapter 6. The Path Forward
- Review
- Bayesian and Graphical Models
- Deep Learning and Feature Learning
- Data Engineering: Big Data and Cloud
- Summary
Part II. Deep Learning
To a large extent, the recent data science revolution is the deep learning revolution. Deep Learning offers a few frameworks that can be applied to very complicated problems such as image classification, language translation, gaming, etc. Most traditional machine learning techniques are either inapplicable to such problems or perform poorly.
We will gradually cover the basics of this new field. Instead of discussing implementation techniques (e.g., TensorFlow), we focus on explaining the concepts behind the why and how. Because the theoretical foundation is still immature, the so-called explanations are often other people's or my own "arguments". Be warned that I am just a new learner and my opinions could be quite wrong. Nevertheless, I hope my gut feelings can help make the field look a bit less mysterious.
Chapter 7. Introduction to Deep Learning
- Resource
- Feature Learning
- GO Game as a Function
- Deep Learning Implementation
- A Blackbox DNN Solution
This chapter explains why deep learning is also known as feature learning. Deep learning is basically a function approximation technique; in fact, games as complicated as GO can be modeled as a function. We outline some implementation details and propose that one can write a generic blackbox deep learning solution capable of solving traditional machine learning problems.
Bioinformatics Application: Gene Expression Inference with Deep Learning.
Chapter 8. Convolutional Neural Network (CNN)
- Introduction
- CNN for Face Recognition
- VGGNet for ImageNet
- Miscellaneous Topics
This chapter explains how a convolutional neural network (CNN) constructs a hierarchy of features that recognize spatial patterns, from simple ones (such as lines or corners) to more complex ones (such as eyes or wheels), which addresses the translation-invariance requirement in computer vision. The idea that different CNN neurons respond to their own specific patterns is very interesting, as it seems to shed light on how our biological visual recognition system might function.
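The translation property can be seen in a one-dimensional toy case. The sketch below slides a small edge-detecting filter over a signal (the sliding dot product that deep learning libraries call convolution) and then takes the maximum response, a crude stand-in for pooling. The filter values, the signals, and the function name are assumptions made for this example, not part of any particular CNN.

```python
def correlate_1d(signal, kernel):
    """Slide the kernel over the signal and take a dot product at each offset."""
    width = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]


# Hypothetical filter that fires on a rising edge (0 followed by 1).
edge_filter = [-1, 1]

signal_a = [0, 0, 1, 1, 1, 0, 0, 0, 0]
signal_b = [0, 0, 0, 0, 1, 1, 1, 0, 0]  # same pattern shifted right by 2

response_a = correlate_1d(signal_a, edge_filter)
response_b = correlate_1d(signal_b, edge_filter)

# The peak location moves with the pattern ...
print(response_a.index(max(response_a)), response_b.index(max(response_b)))
# ... but the pooled (maximum) response does not change.
print(max(response_a), max(response_b))
```

The same filter weights are reused at every position, which is why a shifted pattern produces a shifted but otherwise identical response.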
Bioinformatics Application: Motif Discovery for DNA- and RNA-binding Proteins by Deep Learning.
Chapter 9. Recurrent Neural Network (RNN)
- Introduction
- Architecture
- Memory
- Running Average and Derivative
- RNN Applications
- LSTM
This chapter explains how to handle variable-length input sequences, where the response of a system depends not only on the current input but also on the historical inputs. An RNN maintains a state vector encoding the trajectory of all past inputs. This memory capability enables RNNs to handle many of the most exciting deep learning applications, such as translation, video classification, and sentiment analysis.
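The role of the state can be sketched with a deliberately stripped-down recurrence. The example below assumes a scalar state, scalar weights, and no non-linearity, so it is only a linear caricature of an RNN cell; with both weights set to 0.5 it computes an exponentially weighted running average of the inputs. The function and parameter names are hypothetical.

```python
def rnn_step(state, x, w_state, w_input):
    """One recurrent update: the new state mixes the old state and the input."""
    return w_state * state + w_input * x


def run_rnn(inputs, w_state, w_input, initial_state=0.0):
    """Feed a sequence of any length through the same cell, one step at a time."""
    state = initial_state
    states = []
    for x in inputs:
        state = rnn_step(state, x, w_state, w_input)
        states.append(state)
    return states


# The same weights handle sequences of different lengths.
short_run = run_rnn([1.0, 1.0, 1.0], w_state=0.5, w_input=0.5)
long_run = run_rnn([1.0, 1.0, 1.0, 0.0, 0.0], w_state=0.5, w_input=0.5)
print(short_run)
print(long_run)
```

After the inputs drop to zero, the state decays gradually instead of resetting, which is the sense in which it remembers earlier inputs. A practical RNN would use a vector state, weight matrices, and a non-linear activation.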
Bioinformatics Application: LSTM Networks for Predicting Subcellular Localization of Proteins