Course Most Recently Updated Nov/2018!
Thank you all for the huge response to this emerging course! We are delighted to have over 20,000 students in over 160 different countries. I’m genuinely touched by the overwhelmingly positive and thoughtful reviews. It’s such a privilege to share and introduce this important topic with everyday people in a clear and understandable way.
I’m also excited to announce that I have created real closed captions for all course material, so whether you need them due to a hearing impairment or find it easier to follow along (great for ESL students!), I’ve got you covered.
Most importantly:
To make this course “real”, we’ve expanded. In November of 2018, the course went from 41 lectures and 8 sections, to 62 lectures and 15 sections! We hope you enjoy the new content!
Unlock the secrets of understanding Machine Learning for Data Science!
In this introductory course, the “Backyard Data Scientist” will guide you through the wilderness of Machine Learning for Data Science. Accessible to everyone, this introductory course explains not only Machine Learning, but also where it fits in the “techno sphere around us”, why it’s important now, and how it will dramatically change our world today and for days to come.
Our exotic journey will include the core concepts of:
The train wreck definition of computer science, and one that will actually make sense instead.
An explanation of data that will have you seeing data everywhere that you look!
One of the “greatest lies” ever sold about the future of computer science.
A genuine explanation of Big Data, and how to avoid falling into the marketing hype.
What is Artificial Intelligence? Can a computer actually think? How do computers do things like navigate with a GPS or play games anyway?
What is Machine Learning? And if a computer can think – can it learn?
What Data Science is, and how it relates to magical unicorns!
How Computer Science, Artificial Intelligence, Machine Learning, Big Data and Data Science interrelate to one another.
We’ll then explore the past and the future while touching on the importance, impacts and examples of Machine Learning for Data Science:
How a perfect storm of data, computers and Machine Learning algorithms has come together to make this important right now.
We’ll actually make sense of how computer technology has changed over time while covering off a journey from 1956 to 2014. Do you have a super computer in your home? You might be surprised to learn the truth.
We’ll discuss the kinds of problems Machine Learning solves, and visually explain regression, clustering and classification in a way that will intuitively make sense.
Best of all, we’ll show how this is changing our lives. Not just the lives of business leaders, but most importantly… yours too!
To make sense of the Machine part of Machine Learning, we’ll explore the Machine Learning process:
How do you solve problems with Machine Learning and what are five things you must do to be successful?
How to ask the right question, to be solved by Machine Learning.
Identifying, obtaining and preparing the right data … and dealing with dirty data!
How every mess is “unique” but that tidy data is like families!
How to identify and apply Machine Learning algorithms, with exotic names like “Decision Trees”, “Neural Networks”, “K-Nearest Neighbors” and “Naive Bayesian Classifiers”.
And the biggest pitfalls to avoid and how to tune your Machine Learning models to help ensure a successful result for Data Science.
Our final section of the course will prepare you to begin your future journey into Machine Learning for Data Science after the course is complete. We’ll explore:
How to start applying Machine Learning without losing your mind.
What equipment Data Scientists use (the answer might surprise you!)
The top five tools used for Data Science, including some surprising ones.
And for each of the top five tools – we’ll explain what they are, and how to get started using them.
And we’ll close off with some cautionary tales, so you can be the most successful you can be in applying Machine Learning to Data Science problems.
Bonus Course! To make this “really real”, I’ve included a bonus course!
Most importantly in the bonus course I’ll include information at the end of every section titled “Further Magic to Explore” which will help you to continue your learning experience.
In this bonus course we’ll explore:
Creating a real live Machine Learning Example of Titanic proportions. That’s right – we are going to predict survivability onboard the Titanic!
Using Anaconda, Jupyter and Python 3.x
A crash course in Python – covering all the core concepts of Python you need to make sense of the code examples that follow. See the included free cheat sheet!
Hands on running Python! (Interactively, with scripts, and with Jupyter)
Basics of how to use Jupyter Notebooks
Reviewing and reinforcing core concepts of Machine Learning (that we’ll soon apply!)
Foundations of essential Machine Learning and Data Science modules:
NumPy – An Array Implementation
Pandas – The Python Data Analysis Library
Matplotlib – A plotting library which produces quality figures in a variety of formats
SciPy – The fundamental package for scientific computing in Python
Scikit-Learn – Simple and efficient tools for data mining, data analysis, and Machine Learning
In the Titanic hands-on example we’ll follow all the steps of the Machine Learning workflow throughout:
1. Asking the right question.
2. Identifying, obtaining, and preparing the right data
3. Identifying and applying a Machine Learning algorithm
4. Evaluating the performance of the model and adjusting
5. Using and presenting the model
We’ll also see real-world examples of problems in Machine Learning, including underfitting and overfitting.
The bonus course finishes with a conclusion and further resources to continue your Machine Learning journey.
So I invite you to join me, the Backyard Data Scientist, on an exquisite journey into unlocking the secrets of Machine Learning for Data Science… for, you know, everyday people… like you!
Sign up right now, and we’ll see you – on the other side!
Why should you buy this course? Begin here to see what we'll cover and what this course will bring to you!
I'm pleased to announce that my course has closed captioning on every lecture, which I have personally proofread, edited and corrected. I hope this helps all my students better enjoy the course material. Please view this lecture for a personal message from me.
My personal thank you for entrusting me with your time. It's a privilege to share this amazing topic with you.
A taste of what's to come - the course overview outlines what we'll be discussing in each section of this course.
SECRET SAUCE!: Top tips on how to get the most out of this course! Don't skip this lecture - it's worth your time!
Due to popular request - I have updated all the lectures with links, as well as created two guides!
Find out how to access these resources!
Take a quick moment to think about why you are taking the course and what you dream of doing after it!
Please pause the course and visit http://www.tbdatascientist.com/surveys.html to let me know why you're here and what you hope to accomplish after!
What will we discover with core concepts? Here I'll give you a brief overview of all the exciting lectures contained in this section.
The current definition of computer science is an incomprehensible train wreck! Find out why in this lecture!
In order to better understand what computer science is, it's useful to understand what DATA is. By the end of this lecture you'll be able to see DATA EVERYWHERE you look!
There are two different kinds of data - Structured and Unstructured. This is a key concept that we are going to come back to time and time again later on. Important, and delivered in under 3 minutes!
Test your understanding of structured vs. unstructured data in this quick quiz!
Here we revisit the definition of what Computer Science is, with something that's actually comprehensible. Wondering what an algorithm is? We've got that covered too! And while we're at it - we'll even dive into programming.
Finally, we'll touch on what I call "One of the greatest lies - Ever SOLD".
So what is Big Data? Learn the three V's of big data, what it is... and what it isn't!
This lecture will educate you so you don't fall for the "marketing hype" often associated with Big Data.
A quiz on the ideas of big data.
This is a longer lecture, however within 12 minutes we'll cover off the most fundamental parts of Artificial Intelligence.
Do you know how a computer plans a route in a GPS? Or how it would play a game like Tic-Tac-Toe? The answers might surprise you! This lecture has several animations to help illustrate the concepts and, importantly, the challenges of AI in search.
And! We'll also cover off one of the most interesting questions - "Can a computer really think?"
At last! We are discussing Machine Learning! In this lecture, we'll clearly define Machine Learning. We'll give a simplified overview of the Machine Learning Process, which we'll expand on later in section 4. We'll discuss some applications of Machine Learning, as well as what Machine Learning gives AI.
By the end of this lecture, you'll have an idea of what Machine Learning can be used for.
In this Animated Example, we'll show a simple Machine Learning application. While it's a very simple example, it will show how data can be looked at and examined for patterns, and will discuss the difference between sensitivity and specificity. These are key concepts in Machine Learning and important to understand when applying it.
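To make the difference between sensitivity and specificity concrete, here is a minimal sketch in Python. The labels and predictions are made-up illustration values (not data from the course), and the function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of real positives that were caught
    specificity = tn / (tn + fp)  # share of real negatives correctly left alone
    return sensitivity, specificity


# Made-up example: 4 real positives followed by 6 real negatives
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A model can score well on one measure and poorly on the other, which is why both are worth checking.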
What is Data Science? Magical Unicorns? (Yes really!). Battling Venn Diagrams (I'm not kidding!)
In this lecture, we'll define what Data Science is and what a Data Scientist does.
Big Data! AI! Machine Learning! Computer Science! Data Science!
How does this all fit together? Where does one "start" and the other "stop"? In this lecture, we'll use an animated diagram to explain how all these different domains interrelate. Confusion stops here!
What will we discover with "Impacts, Importance and Examples"? Here I'll give you a brief overview of all the exciting lectures contained in this section.
Why are we talking about this? Why is this important now?
In this lecture we'll uncover the convergence of events that have come together in a perfect storm of digital change.
Computers exploding?! Everyone always gives lip service to "how much technology has actually changed". But what does it really mean? In this longer lecture, we'll take a journey from 1956 to 2014, and really explain how the world has changed.
Do you have a super computer in your house? You might be surprised to find out the truth...!
In this brief lecture, we'll cover the three different problems Machine Learning solves really well.
Pictures will help make sense of every concept, and they will be the bedrock for seeing later how different problems can be solved by Machine Learning. While watching this lecture, be sure to look at how a problem can be solved in different ways, using different approaches to Machine Learning.
We've covered off - what it is. How it works. What it provides....
Now the question is: how is this changing our lives?
In this lecture we'll talk about what we'll likely see. What happens when Machine Learning goes wrong. And we'll touch on ethics - which is not just a case of banning killer robots, but much more subtle as well.
What will we discover with "The Machine Learning Process"? Here I'll give you a brief overview of all the exciting lectures contained in this section.
In this lecture we'll cover off each of the five steps of the Machine Learning Process, sometimes called a "pipeline" or "workflow". Any problem being solved by Machine Learning will have to touch all five of these steps - sometimes more than once.
This key lecture will discuss how the parts of the process work together. Not to be missed!
What question are you asking? What are your goals?
What does done look like? How good must our prediction be?
All these things are key parts of 1 - Asking the right question in the first place....
In this tell all lecture:
What are you waiting for! Go to your lecture (room) and clean that (data) up! Not all messes are created equal.
It's science and it's art. In this lecture we'll discuss how Machine Learning algorithms interact with data to model answers to your problems. We'll discuss and illustrate four common Machine Learning algorithms. For each, we'll cover off how they work, and what workloads work best for them. You'll become a master of the digitally arcane, with powers over:
How do you evaluate the performance of your Machine Learning algorithm anyway? And if it's not working the way you expected - how do you fix it? In this tell-all lecture, we'll discuss common problems of Machine Learning - and how to address them.
Finally! We've reached the end goal! Or have we?
In this brief lecture, we'll cover off four important things to keep in mind when using your Machine Learning model.
A quiz on the process of Machine Learning
How do you get started in your journey to applying Machine Learning for Data Science? In this brief overview, we'll describe the tell-all lectures that will give you a place to start applying Machine Learning and Data Science.
HOW NOT TO LOSE YOUR MIND.
Really. This lecture is an important one, because it will give you guidance on how to get started in your journey without losing your mind along the way.
What do you need to do Machine Learning? Is it expensive? Out of reach?
In this surprising lecture, we'll pull back the curtain on what Data Scientists are actually using. We'll also list the top five tools for Data Science, which we will dive into in the following lectures.
The number one tool for Data Science is "R", a powerhouse for Machine Learning applications. We'll describe the tool, as well as provide links and important tips on using it.
The second most popular tool for Data Science is "Python". Python is a general programming language with incredible power, versatility and flexibility. It's gaining on R year by year, and has powerful Data Science and Machine Learning capabilities.
We'll describe Python, as well as provide links and important tips on using it.
The third most common tool for Data Science is SQL. Pronounced SEA-QUEL, this is a database language. In this lecture we'll describe what SQL is, and why it has shown up in third place for Data Science tools.
The fourth most common tool for Data Science is Microsoft Excel? Yes - really! In this lecture we'll describe Microsoft Excel and its value as a Data Science tool.
Finally, we'll give you the "real deal" when it comes to doing Machine Learning in Excel. The answer will surprise you!
The final top five tool for Data Science is RapidMiner. In this lecture we'll discuss using Software as a Service, and some things to think about when using RapidMiner.
You made it! In this final lecture of section 5, we'll talk about things to watch out for when doing Machine Learning. This lecture will give you key information on how to avoid obstacles on your way to success!
Congratulations on your journey into Machine Learning and Data Science. We sincerely hope you enjoyed it - and we hope to see you again... in our next course!
NOTE: November 2018 - The next course is *IN THIS COURSE*! That's right - check out the next lecture for our included bonus course "Machine Learning in Python and Jupyter for Beginners"!
Introductions!Â Â Who am I?
Who are you?
Starting the Anaconda download.
Prerequisite knowledge
Topics for the course
What won't we cover today?
How the course will be delivered.
Titanic survivability project - what we'll be building.
Introducing Kaggle
Where is the Titanic example?
Starting Anaconda Installation.
Platform Selections - Why Python?
Platform Selections - Why Python 3.x?
Platform Selections - Why Anaconda?
Comments.
Basic Variable and Assignments.
Notes about Data Types.
Data type Summary.
Basic Type Casting.
Advanced Assignments.
Advanced Assignments - Error situations.
Strings - Basic String Assignment.
Strings - Unusual String Assignment.
Strings - Basic String Operations.
Strings - Core Concept - Immutability.
Slices.
Lists.
Lists - Basic List Operations
Lists - Additional Operations.
Lists - Advanced Topics.
Notes about Expressions.
Arithmetic and Bitwise Operators.
Relational, Logical, and Identity Operators.
Identity Operators.
Assignment Operators and Membership Operators
Conditional Logic and "if" statements.
Iterations and Loops - Simple while loop.
Iterations and Loops - Advanced loops and for loops.
Functions and variable Scope.
Dictionaries.
Dictionaries - Error Situations.
Dictionaries - Further Example.
Getting Help!
Further magic to explore - Where to go from here (to continue your learning on Python)
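A few of the crash-course topics listed above (string immutability, slices, lists and dictionaries) can be sketched in a handful of lines. This is an illustrative sketch only, not code taken from the lectures; all variable names and values are made up for the example.

```python
# Strings are immutable: assigning to a single character raises TypeError
name = "Titanic"
try:
    name[0] = "B"
except TypeError:
    changed = False
else:
    changed = True

# Slices build new objects and leave the original untouched
first_three = name[:3]       # characters at positions 0, 1 and 2
reversed_name = name[::-1]   # a step of -1 walks the string backwards

# Lists, by contrast, are mutable
classes = ["first", "second"]
classes.append("third")

# Dictionaries map keys to values; .get() avoids a KeyError for missing keys
ports = {"S": "Southampton", "C": "Cherbourg", "Q": "Queenstown"}
unknown = ports.get("X", "unknown")

print(changed, first_three, reversed_name, classes, unknown)
```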
Completing the Anaconda Installation
Running Python Interactively.
Running Python stand alone scripts.
Running Python in Jupyter notebooks.
How to use Jupyter:
Creating notebooks
Using notebooks
Saving notebooks
Types of cells
How the Kernel works (and how to manage it)
Getting help
Help with Jupyter Markdown language.
What is Data Science?
What are Data Scientists?
Data Science areas.
What kinds of problems does Machine Learning Solve?
Classification
Regression
Clustering
Can a Machine Learn?
What is Machine Learning? - Simplified Overview
What does data look like?
What does the data in our Titanic example look like?
Types of Machine Learning:
Supervised
Unsupervised
Reinforcement Learning
The 5 steps of a Machine Learning Workflow
Example algorithms
Overview - Decision Trees
Overview - Naive Bayesian Classifiers
Overview - Neural Networks
Overview - kNN
Evaluating the performance of a model and adjusting
Overfitting
Underfitting
Further magic to explore - Where to go from here (to continue your learning).
Overview - Highest to lowest level.
Overview:
SciKit-learn
SciPy
Matplotlib
Pandas
NumPy
Basics of NumPy
Basic Creation and Assignments
Updating Values
Array Builders - Ones
Array Builders - Zeros
Array Builders - Choose your own
Matrices
NumPy: Further magic to explore - Where to go from here (to continue your learning)
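The NumPy basics listed above can be sketched as follows. This is a minimal illustration, not the lecture code; in particular, using `np.full` for the "choose your own" array builder is an assumption about what that lecture covers.

```python
import numpy as np

a = np.array([1, 2, 3])          # basic creation from a Python list
a[0] = 10                        # updating a value in place

ones = np.ones((2, 3))           # 2x3 array filled with 1.0
zeros = np.zeros(4)              # 1-D array of four 0.0 values
sevens = np.full((2, 2), 7)      # assumed "choose your own" builder: pick the fill value

m = np.array([[1, 2], [3, 4]])   # a 2x2 matrix as a two-dimensional array
product = m @ m                  # matrix multiplication (not element-wise)

print(a, ones.shape, zeros, sevens, product, sep="\n")
```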
Introducing Pandas - the Python Data Analysis Library
Introducing Matplotlib - plotting library which produces publication quality figures in a variety of formats.
Pandas:
Basic Series Creation and Assignments.
Basic Data Frame Creation and Assignments.
Creating a Data frame from CSV and reviewing it.
Exploring the Data - Data Shapes and Types.
Accessing and Changing the Data - Rows (cases) and Columns (features)
Removing Data
Filtering Data
Determining Unique Values
Simple Analysis
Matplotlib:
Simple analysis and plotting
Pandas:
Simple analysis and plotting
Matplotlib - Further magic to explore - Where to go from here (to continue your learning).
Pandas - Further magic to explore - Where to go from here (to continue your learning).
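The Pandas topics listed above can be sketched on a tiny made-up table. The rows, the inline CSV text, the `IsAdult` column and the age threshold of 21 are all assumptions for illustration; the lectures work with the real Titanic CSV file instead.

```python
import io

import pandas as pd

# Basic Series creation
ages = pd.Series([29, 40, 19, 35], name="Age")

# Creating a DataFrame from CSV text (made-up rows standing in for a real file)
csv_text = "Name,Pclass,Age\nAnn,1,29\nBob,3,40\nCy,3,19\nDee,2,35\n"
df = pd.read_csv(io.StringIO(csv_text))

shape = df.shape                      # (rows, columns)
dtypes = df.dtypes                    # data type of each column

df["IsAdult"] = df["Age"] >= 21       # changing the data: add a column (feature)
df = df.drop(columns=["Name"])        # removing data: drop a column
third_class = df[df["Pclass"] == 3]   # filtering rows (cases)
unique_classes = sorted(int(c) for c in df["Pclass"].unique())
mean_age = df["Age"].mean()           # simple analysis

print(shape, unique_classes, mean_age)
print(third_class)
```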
SciPy
The fundamental package for scientific computing with Python
Sparse matrix (example)
SciPy - Further magic to explore - Where to go from here (to continue your learning).
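As a rough idea of what a sparse matrix example looks like, here is a minimal sketch using SciPy's CSR format. The choice of `csr_matrix` and the sample values are assumptions; the lecture may use a different sparse format.

```python
import numpy as np
from scipy.sparse import csr_matrix

# A mostly-zero matrix: only 2 of its 9 entries carry information
dense = np.array([[0, 0, 3],
                  [0, 0, 0],
                  [4, 0, 0]])

sparse = csr_matrix(dense)      # stores only the non-zero entries and their positions
stored = sparse.nnz             # number of values actually stored
round_trip = sparse.toarray()   # convert back to a regular dense array

print(stored)
print(round_trip)
```

The saving is trivial here, but it grows quickly for large matrices where almost every entry is zero.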
Scikit-Learn:
Simple and efficient tools for data mining, data analysis, and Machine Learning
Let's get started by applying the 5 steps of the Machine Learning Workflow to the Titanic.
Asking the right question.
Identifying, obtaining, and preparing the right data.
Identifying and applying a Machine Learning algorithm.
Evaluating the performance of the model and adjusting
Using and presenting the model.
Step #1 - Asking the right question
Creating our Titanic Example file
Reviewing the data, and data dictionary
Importing our modules - Pandas, NumPy, Matplotlib, and Scikit-learn
Loading the dataframe
Step #2 - Identifying, obtaining, and preparing the right data.
Reviewing the data, identifying gaps and problems with the data set.
Step #2 - Identifying, obtaining, and preparing the right data.
Exploring the data with Pandas and Matplotlib - understanding people in the data set in terms of:
Survival of the disaster
Gender of people onboard
Age of passengers (histogram)
Classes of passengers
Age distribution in the Classes of passengers
Embarkation location
Note: The goal of this lecture (and the next lecture), is to identify the right data and features to use in the Machine Learning algorithm.
Step #2 - Identifying, obtaining, and preparing the right data.
Exploring the data with Pandas and Matplotlib - understanding people in the data set in terms of:
Survival in relation to age (Scatter plot)
Survival in relation to gender
Survival in relation to passenger class
Survival in relation to passenger class and gender.
Note: The goal of this lecture (and the previous lecture), is to identify the right data and features to use in the Machine Learning algorithm.
Step #2 - Identifying, obtaining, and preparing the right data.
Preparing the right data - adjusting gender.
Preparing the right data - filling in missing ages.
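The two preparation steps above might look something like this in Pandas. This is a sketch on made-up rows: the 0/1 encoding for gender and the use of the median to fill missing ages are assumptions, and the lecture may choose differently.

```python
import numpy as np
import pandas as pd

# Made-up rows using the Titanic column names "Sex" and "Age"
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, np.nan, 30.0, np.nan],
})

# Adjusting gender: turn text labels into numbers an algorithm can use
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

# Filling in missing ages: replace gaps with the median of the known ages
median_age = df["Age"].median()
df["Age"] = df["Age"].fillna(median_age)

print(df)
```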
Applying a basic hypothesis:
Step #3 - Applying an algorithm (a basic one).
Step #4 - Evaluating the performance of the hypothesis and adjusting.
Applying Linear Regression:
Step #2 - Preparing the data (building the training features, and training target)
Applying Linear Regression (continued)
Step #3 - Applying the algorithm (running fit)
Step #4 - Evaluating the performance of Linear Regression (Cross validation)
Applying a polynomial regression
Step #3 - Applying the algorithm (running fit)
Step #4 - Evaluating the performance of Polynomial Regression (Cross validation)
Applying Decision Trees:
Step #3 - Applying the algorithm (running fit)
Step #4 - Evaluating the performance of Decision tree (Cross validation)
What happened??? - Overfit!! (Note: See resources in this lecture for the charts)
Adjusting the algorithm
Step #3 - Applying the algorithm (running fit)
Step #4 - Evaluating the performance of Decision tree (Cross validation)
Step #5 - Using and presenting the model.
Conclusion of the Decision tree model. What features did it decide are most important?
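The overfitting story above (fit, cross-validate, adjust, then inspect feature importance) can be sketched with Scikit-learn. This is an illustration under stated assumptions: it uses synthetic data from `make_classification` rather than the Titanic data, and limiting `max_depth` to 3 is just one possible adjustment, not necessarily the one made in the lecture.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with some label noise (NOT the Titanic data)
X, y = make_classification(n_samples=400, n_features=6, n_informative=3,
                           flip_y=0.2, random_state=0)

deep = DecisionTreeClassifier(random_state=0)                  # grows until leaves are pure
shallow = DecisionTreeClassifier(max_depth=3, random_state=0)  # adjusted: limited depth

deep_train = deep.fit(X, y).score(X, y)                 # accuracy on data it has already seen
deep_cv = cross_val_score(deep, X, y, cv=5).mean()      # accuracy on held-out folds
shallow_cv = cross_val_score(shallow, X, y, cv=5).mean()

# Which features did the adjusted tree rely on most?
importances = shallow.fit(X, y).feature_importances_

print(f"deep tree:    train={deep_train:.2f}  cross-val={deep_cv:.2f}")
print(f"shallow tree: cross-val={shallow_cv:.2f}")
print("feature importances:", importances.round(2))
```

A large gap between the training score and the cross-validation score is the typical signature of overfitting: the tree has memorized noise in the training data rather than learning a pattern that generalizes.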
In conclusion:
Concept: "The algorithm with the most data selection wins!"
Thoughts on:
Feature engineering
Data selection
Algorithm selection
Further magic to explore - Where to go from here (to continue your learning)
Kaggle
Link to an amazing blog post
Links to several amazing Jupyter notebooks
How to contact me!
Thank you!
Attached is an article I wrote, in early 2017 of one of the most important developments of 2016. I think it's as relevant today as it was back then.
I hope you enjoy it! It's included in HTML format, as well as attached in PDF.