4.79 out of 5
198 reviews on Udemy

Machine Learning with Imbalanced Data

Learn multiple techniques to tackle data imbalance and improve the performance of your machine learning models.
Instructor:
Soledad Galli
2,463 students enrolled
English [Auto]
Random under-sampling methods
Under-sampling methods which focus on observations that are harder to classify
Under-sampling methods that ignore potentially noisy observations
Over-sampling methods to increase the number of minority observations
Ways of creating synthetic data to increase the examples of the minority class
SMOTE and its variants
Use ensemble methods with sampling techniques to improve model performance
The most suitable evaluation metrics to use with imbalanced datasets

Welcome to Machine Learning with Imbalanced Datasets. In this course, you will learn multiple techniques which you can use with imbalanced datasets to improve the performance of your machine learning models.

If you are working with imbalanced datasets right now and want to improve the performance of your models, or you simply want to learn more about how to tackle data imbalance, this course will show you how.

We’ll take you step-by-step through engaging video tutorials and teach you everything you need to know about working with imbalanced datasets. Throughout this comprehensive course, we cover almost every available methodology for working with imbalanced datasets, discussing their logic, their implementation in Python, their advantages and shortcomings, and the considerations to keep in mind when using each technique. Specifically, you will learn:

  • Under-sampling methods, either at random or focused on specific groups of observations

  • Over-sampling methods at random and those which create new examples based on existing observations

  • Ensemble methods that leverage the power of multiple weak learners in conjunction with sampling techniques to boost model performance

  • Cost-sensitive methods which penalize wrong decisions more severely for minority classes

  • The appropriate metrics to evaluate model performance on imbalanced datasets

By the end of the course, you will be able to decide which technique is suitable for your dataset, and to apply the different methods and compare the performance improvement each returns across multiple datasets.

This comprehensive machine learning course includes over 50 lectures spanning about 8 hours of video, and ALL topics include hands-on Python code examples which you can use for reference and for practice, and re-use in your own projects.

In addition, the code is updated regularly to keep up with new trends and new Python library releases.

So what are you waiting for? Enroll today, learn how to work with imbalanced datasets and build better machine learning models.

Introduction

1. Introduction
2. Course Curriculum Overview
3. Course Material
4. Code | Jupyter notebooks
5. Presentations covered in the course
6. Python package Imbalanced-learn
7. Download Datasets
8. Additional resources for Machine Learning and Python programming

Machine Learning with Imbalanced Data: Overview

1. Imbalanced classes - Introduction
2. Nature of the imbalanced class
3. Approaches to work with imbalanced datasets - Overview
4. Additional Reading Resources (Optional)

Evaluation Metrics

1. Introduction to Performance Metrics
2. Accuracy
3. Accuracy - Demo
4. Precision, Recall and F-measure
5. Install Yellowbrick
6. Precision, Recall and F-measure - Demo
7. Confusion tables, FPR and FNR
8. Confusion tables, FPR and FNR - Demo
9. Geometric Mean, Dominance, Index of Imbalanced Accuracy
10. Geometric Mean, Dominance, Index of Imbalanced Accuracy - Demo
11. ROC-AUC
12. ROC-AUC - Demo
13. Precision-Recall Curve
14. Precision-Recall Curve - Demo
15. Additional reading resources (Optional)
16. Probability
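
To give a flavour of why accuracy alone can mislead on imbalanced data, here is a minimal sketch in plain Python. It is not taken from the course notebooks; the function name `imbalance_metrics` and the example counts are assumptions made purely for illustration. It computes accuracy, precision, recall, F1 and the geometric mean from confusion-table counts, treating the minority class as the positive class.

```python
import math


def imbalance_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-table counts.

    The positive class is assumed to be the minority class.
    Ratios with a zero denominator are reported as 0.0.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    g_mean = math.sqrt(recall * specificity)              # geometric mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical model that predicts the majority class for every observation:
# 990 negatives are all correct, 10 positives are all missed.
metrics = imbalance_metrics(tp=0, fp=0, fn=10, tn=990)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this made-up scenario the accuracy is high even though no minority observation is detected, while recall, F1 and the geometric mean all drop to zero.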

Undersampling

1. Under-Sampling Methods - Introduction
2. Random Under-Sampling - Intro
3. Random Under-Sampling - Demo
4. Condensed Nearest Neighbours - Intro
5. Condensed Nearest Neighbours - Demo
6. Tomek Links - Intro
7. Tomek Links - Demo
8. One Sided Selection - Intro
9. One Sided Selection - Demo
10. Edited Nearest Neighbours - Intro
11. Edited Nearest Neighbours - Demo
12. Repeated Edited Nearest Neighbours - Intro
13. Repeated Edited Nearest Neighbours - Demo
14. All KNN - Intro
15. All KNN - Demo
16. Neighbourhood Cleaning Rule - Intro
17. Neighbourhood Cleaning Rule - Demo
18. NearMiss - Intro
19. NearMiss - Demo
20. Instance Hardness Threshold - Intro
21. Instance Hardness Threshold - Demo
22. Undersampling Method Comparison
23. Summary Table
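
As a rough idea of the simplest technique in this section, random under-sampling can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the course code and not the Imbalanced-learn implementation; the helper name `random_under_sample` and the toy data are assumptions.

```python
import random
from collections import Counter


def random_under_sample(X, y, seed=0):
    """Randomly drop observations so every class matches the smallest class."""
    rng = random.Random(seed)

    # Group row indices by class label.
    indices_by_class = {}
    for i, label in enumerate(y):
        indices_by_class.setdefault(label, []).append(i)

    # Size of the smallest class: the target size for every class.
    n_min = min(len(idx) for idx in indices_by_class.values())

    # Sample n_min rows without replacement from each class.
    keep = []
    for idx in indices_by_class.values():
        keep.extend(rng.sample(idx, n_min))
    keep.sort()  # preserve the original row order

    return [X[i] for i in keep], [y[i] for i in keep]


# Toy dataset: 10 minority observations (label 1) and 90 majority (label 0).
X = [[float(i)] for i in range(100)]
y = [1] * 10 + [0] * 90

X_res, y_res = random_under_sample(X, y)
print(Counter(y_res))
```

The other methods in this section differ mainly in how they choose which majority observations to drop, rather than dropping them at random.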

Oversampling

1. Over-Sampling Methods - Introduction
2. Random Over-Sampling
3. Random Over-Sampling - Demo
4. SMOTE
5. SMOTE - Demo
6. SMOTE-NC
7. SMOTE-NC - Demo
8. ADASYN
9. ADASYN - Demo
10. Borderline SMOTE
11. Borderline SMOTE - Demo
12. SVM SMOTE
13. SVM SMOTE - Demo
14. K-Means SMOTE
15. K-Means SMOTE - Demo
16. Over-Sampling Method Comparison
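
The core idea behind SMOTE, creating synthetic minority examples by interpolating between a minority observation and one of its nearest minority neighbours, can be sketched as follows. This is a deliberately simplified illustration in plain Python (requires Python 3.8+ for `math.dist`), not the Imbalanced-learn implementation or the course code; the name `smote_like`, its parameters and the toy points are assumptions.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by SMOTE-style interpolation.

    minority: list of minority-class points (each a list of floats),
              with at least two points.
    k:        number of nearest minority neighbours to choose from.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment from base towards neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features.
minority = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [2.5, 2.5]]
new_points = smote_like(minority, n_new=5)
for point in new_points:
    print(point)
```

Each synthetic point lies on a line segment between two existing minority points. The variants listed above mainly change which minority observations are used as starting points.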

Over and Undersampling

1. Combining Over and Under-sampling - Intro
2. Combining Over and Under-sampling - Demo
3. Comparison of Over and Under-sampling Methods

Ensemble Methods

1. Ensemble methods with Imbalanced Data
2. Foundations of Ensemble Learning
3. Bagging
4. Bagging plus Over- or Under-Sampling
5. Boosting
6. Boosting plus Re-Sampling
7. Hybrid Methods
8. Ensemble Methods - Demo
9. Additional Reading Resources
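
As a hedged illustration of the resampling step behind "bagging plus under-sampling", the sketch below builds several balanced bootstrap bags from an imbalanced label vector. It covers only the data-sampling side; fitting a base learner on each bag and combining the predictions is left out. The function name `balanced_bootstrap_bags` and its defaults are assumptions, and this is not the Imbalanced-learn implementation.

```python
import random


def balanced_bootstrap_bags(y, n_bags=3, minority_label=1, seed=0):
    """Return n_bags lists of row indices, each balanced between classes.

    Every bag draws, with replacement, as many majority rows as there are
    minority rows, so each base learner would see a balanced sample.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]

    bags = []
    for _ in range(n_bags):
        # Bootstrap (sampling with replacement) from each class separately.
        bag = rng.choices(minority, k=len(minority))
        bag += rng.choices(majority, k=len(minority))
        bags.append(bag)
    return bags


# Toy labels: 5 minority observations (1) and 45 majority observations (0).
y = [1] * 5 + [0] * 45
bags = balanced_bootstrap_bags(y)
for bag in bags:
    print(sorted(bag))
```

Because each bag draws a different random subset of the majority class, an ensemble trained on these bags can use more of the majority data than a single under-sampled model would.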

Cost Sensitive Learning

1. Cost-sensitive Learning - Intro
2. Types of Cost
3. Obtaining the Cost
4. Cost Sensitive Approaches
5. Misclassification Cost in Logistic Regression
6. Misclassification Cost in Decision Trees
7. Cost Sensitive Learning with Scikit-learn - Demo
8. Find Optimal Cost with hyperparameter tuning
9. Bayes Conditional Risk
10. MetaCost
11. MetaCost - Demo
12. Optional: MetaCost Base Code
13. Additional Reading Resources
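
To illustrate the idea of Bayes conditional risk, the sketch below picks the class with the lowest expected misclassification cost instead of the most probable class. The cost values, the probabilities and the function name `min_risk_class` are assumptions chosen for illustration; they are not taken from the course.

```python
def min_risk_class(probabilities, cost_matrix):
    """Choose the prediction with the lowest expected cost.

    probabilities[j]  : estimated probability that the true class is j
    cost_matrix[i][j] : cost of predicting class i when the true class is j
    Returns (best_class, list_of_expected_costs).
    """
    risks = [
        sum(cost_matrix[i][j] * probabilities[j] for j in range(len(probabilities)))
        for i in range(len(cost_matrix))
    ]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# Assumed costs: missing a minority case (class 1) costs 10,
# a false alarm costs 1, and correct predictions cost nothing.
cost = [
    [0, 10],  # predict 0: true class 0 -> 0, true class 1 -> 10
    [1, 0],   # predict 1: true class 0 -> 1, true class 1 -> 0
]

# The model believes the observation is class 0 with probability 0.8.
decision, risks = min_risk_class([0.8, 0.2], cost)
print("expected cost per prediction:", risks)
print("chosen class:", decision)
```

With these assumed costs, the minority class is chosen even though its estimated probability is only 0.2, because a missed minority case is ten times more expensive than a false alarm.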
4.8 out of 5
198 Ratings

Detailed Rating

Stars 5
140
Stars 4
47
Stars 3
6
Stars 2
1
Stars 1
2
30-Day Money-Back Guarantee

Includes

8 hours on-demand video
13 articles
Full lifetime access
Access on mobile and TV
Certificate of Completion
Price:
$218.98 $169
