4.67 out of 5 (8651 reviews on Udemy)

Artificial Intelligence: Reinforcement Learning in Python

Complete guide to Reinforcement Learning, with Stock Trading and Online Advertising Applications
Instructor: Lazy Programmer Team
40,912 students enrolled
English [Auto]
  • Apply gradient-based supervised machine learning methods to reinforcement learning

  • Understand reinforcement learning on a technical level

  • Understand the relationship between reinforcement learning and psychology

  • Implement 17 different reinforcement learning algorithms

When people talk about artificial intelligence, they usually don’t mean supervised and unsupervised machine learning.

These tasks are pretty trivial compared to what we think of AIs doing – playing chess and Go, driving cars, and beating video games at a superhuman level.

Reinforcement learning has recently become popular for doing all of that and more.

Much like deep learning, a lot of the theory was discovered in the 70s and 80s, but it wasn't until recently that we were able to observe firsthand the amazing results that are possible.

In 2016 we saw Google’s AlphaGo beat the world champion in Go.

We saw AIs playing video games like Doom and Super Mario.

Self-driving cars have started driving on real roads with other drivers and even carrying passengers (Uber), all without human assistance.

If that sounds amazing, brace yourself for the future because the law of accelerating returns dictates that this progress is only going to continue to increase exponentially.

Learning about supervised and unsupervised machine learning is no small feat. To date I have over TWENTY FIVE (25!) courses just on those topics alone.

And yet reinforcement learning opens up a whole new world. As you’ll learn in this course, the reinforcement learning paradigm is very different from both supervised and unsupervised learning.

It’s led to new and amazing insights both in behavioral psychology and neuroscience. As you’ll learn in this course, there are many analogous processes when it comes to teaching an agent and teaching an animal or even a human. It’s the closest thing we have so far to a true artificial general intelligence.

What’s covered in this course?

  • The multi-armed bandit problem and the explore-exploit dilemma

  • Ways to calculate means and moving averages and their relationship to stochastic gradient descent (see the first sketch after this list)

  • Markov Decision Processes (MDPs)

  • Dynamic Programming

  • Monte Carlo

  • Temporal Difference (TD) Learning (Q-Learning and SARSA); see the second sketch after this list

  • Approximation Methods (i.e. how to plug a deep neural network or other differentiable model into your RL algorithm)

  • How to use OpenAI Gym, with zero code changes

  • Project: Apply Q-Learning to build a stock trading bot
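
To make the first two bullets concrete, here is a minimal, hypothetical sketch of an epsilon-greedy bandit agent that estimates each arm's value with an incremental sample mean. The names (BanditArm, run_experiment), the Gaussian rewards, and the constants are illustrative assumptions rather than the course's actual code; the point is that the update Q_new = Q_old + (1/N)(reward - Q_old) is exactly a stochastic-gradient step on squared error with learning rate 1/N, which is the connection to SGD mentioned above.

```python
import numpy as np

class BanditArm:
    """One slot-machine arm with an unknown mean reward (illustrative, not the course's code)."""
    def __init__(self, true_mean):
        self.true_mean = true_mean   # hidden from the agent
        self.estimate = 0.0          # running sample mean of observed rewards
        self.n = 0                   # number of pulls so far

    def pull(self):
        # Gaussian reward centered on the true mean (an assumption for this sketch)
        return np.random.randn() + self.true_mean

    def update(self, reward):
        # Incremental sample mean: Q_new = Q_old + (1/N) * (reward - Q_old),
        # i.e. an SGD step on squared error with learning rate 1/N.
        self.n += 1
        self.estimate += (reward - self.estimate) / self.n

def run_experiment(true_means=(0.1, 0.2, 0.3), eps=0.1, num_trials=10_000):
    arms = [BanditArm(m) for m in true_means]
    rewards = np.empty(num_trials)
    for t in range(num_trials):
        if np.random.random() < eps:
            j = np.random.randint(len(arms))                # explore: random arm
        else:
            j = int(np.argmax([a.estimate for a in arms]))  # exploit: current best arm
        r = arms[j].pull()
        arms[j].update(r)
        rewards[t] = r
    return rewards.mean()

if __name__ == "__main__":
    print("average reward per pull:", run_experiment())
```

And for the TD Learning and stock-trading bullets, a single tabular Q-Learning backup might look like the following. Again, the function name and the dictionary representation of Q are assumptions made for illustration, not the course's implementation; SARSA would use the action actually taken in the next state instead of the max.

```python
def q_learning_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.9):
    """One hypothetical tabular Q-Learning backup:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s2,a') - Q(s,a)).
    Q is a dict mapping (state, action) pairs to value estimates."""
    q_sa = Q.get((s, a), 0.0)
    max_next = max(Q.get((s2, a2), 0.0) for a2 in actions)
    Q[(s, a)] = q_sa + alpha * (r + gamma * max_next - q_sa)
```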

If you’re ready to take on a brand new challenge, and learn about AI techniques that you’ve never seen before in traditional supervised machine learning, unsupervised machine learning, or even deep learning, then this course is for you.

See you in class!

“If you can’t implement it, you don’t understand it”

  • Or as the great physicist Richard Feynman said: “What I cannot create, I do not understand”.

  • My courses are the ONLY courses where you will learn how to implement machine learning algorithms from scratch

  • Other courses will teach you how to plug your data into a library, but do you really need help with 3 lines of code?

  • After doing the same thing with 10 datasets, you realize you didn’t learn 10 things. You learned 1 thing, and just repeated the same 3 lines of code 10 times…

Suggested Prerequisites:

  • Calculus

  • Probability

  • Object-oriented programming

  • Python coding: if/else, loops, lists, dicts, sets

  • Numpy coding: matrix and vector operations

  • Linear regression

  • Gradient descent

WHAT ORDER SHOULD I TAKE YOUR COURSES IN?:

  • Check out the lecture “Machine Learning and AI Prerequisite Roadmap” (available in the FAQ of any of my courses, including the free Numpy course)

Welcome

1. Introduction
2. Course Outline and Big Picture
3. Where to get the Code
4. How to Succeed in this Course
5. Warmup

Return of the Multi-Armed Bandit

1. Section Introduction: The Explore-Exploit Dilemma
2. Applications of the Explore-Exploit Dilemma
3. Epsilon-Greedy Theory
4. Calculating a Sample Mean (pt 1)
5. Epsilon-Greedy Beginner's Exercise Prompt
6. Designing Your Bandit Program
7. Epsilon-Greedy in Code
8. Comparing Different Epsilons
9. Optimistic Initial Values Theory
10. Optimistic Initial Values Beginner's Exercise Prompt
11. Optimistic Initial Values Code
12. UCB1 Theory
13. UCB1 Beginner's Exercise Prompt
14. UCB1 Code
15. Bayesian Bandits / Thompson Sampling Theory (pt 1)
16. Bayesian Bandits / Thompson Sampling Theory (pt 2)
17. Thompson Sampling Beginner's Exercise Prompt
18. Thompson Sampling Code
19. Thompson Sampling With Gaussian Reward Theory
20. Thompson Sampling With Gaussian Reward Code
21. Why don't we just use a library?
22. Nonstationary Bandits
23. Bandit Summary, Real Data, and Online Learning
24. (Optional) Alternative Bandit Designs
25. Suggestion Box

High Level Overview of Reinforcement Learning

1. What is Reinforcement Learning?
2. From Bandits to Full Reinforcement Learning

Markov Decision Processes

1. MDP Section Introduction
2. Gridworld
3. Choosing Rewards
4. The Markov Property
5. Markov Decision Processes (MDPs)
6. Future Rewards
7. Value Functions
8. The Bellman Equation (pt 1)
9. The Bellman Equation (pt 2)
10. The Bellman Equation (pt 3)
11. Bellman Examples
12. Optimal Policy and Optimal Value Function (pt 1)
13. Optimal Policy and Optimal Value Function (pt 2)
14. MDP Summary

Dynamic Programming

1. Dynamic Programming Section Introduction
2. Iterative Policy Evaluation
3. Designing Your RL Program
4. Gridworld in Code
5. Iterative Policy Evaluation in Code
6. Windy Gridworld in Code
7. Iterative Policy Evaluation for Windy Gridworld in Code
8. Policy Improvement
9. Policy Iteration
10. Policy Iteration in Code
11. Policy Iteration in Windy Gridworld
12. Value Iteration
13. Value Iteration in Code
14. Dynamic Programming Summary

Monte Carlo

1. Monte Carlo Intro
2. Monte Carlo Policy Evaluation
3. Monte Carlo Policy Evaluation in Code
4. Monte Carlo Control
5. Monte Carlo Control in Code
6. Monte Carlo Control without Exploring Starts
7. Monte Carlo Control without Exploring Starts in Code
8. Monte Carlo Summary

Temporal Difference Learning

1. Temporal Difference Introduction
2. TD(0) Prediction
3. TD(0) Prediction in Code
4. SARSA
5. SARSA in Code
6. Q Learning
7. Q Learning in Code
8. TD Learning Section Summary

Approximation Methods

1. Approximation Methods Section Introduction
2. Linear Models for Reinforcement Learning
3. Feature Engineering
4. Approximation Methods for Prediction
5. Approximation Methods for Prediction Code
6. Approximation Methods for Control
7. Approximation Methods for Control Code
8. CartPole
9. CartPole Code
10. Approximation Methods Exercise
11. Approximation Methods Section Summary

Interlude: Common Beginner Questions

1. This Course vs. RL Book: What's the Difference?

Stock Trading Project with Reinforcement Learning

1. Beginners, halt! Stop here if you skipped ahead
2. Stock Trading Project Section Introduction
4.7 out of 5 (8651 Ratings)

Detailed Rating

5 stars: 5212
4 stars: 2714
3 stars: 494
2 stars: 132
1 star: 92
30-Day Money-Back Guarantee

Includes

  • 15 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion

Price: $218.98
