Forecasting with Machine Learning

Overview

Forecast single and multiple time series with machine learning models like linear regression, random forests and xgboost. Implement backtesting to evaluate models before deployment.

Forecast single and multiple time series with regression models.

If you're disappointed for whatever reason, you'll get a full refund.

Kishan is a machine learning and data science lead, course instructor, and open source software contributor. He has contributed to well-known Python packages including Statsmodels, Feature-engine, and Prophet. He presents at data science conferences including ODSC and PyData. Kishan attained a PhD in Physics from Imperial College London in applied large-scale time-series analysis and modelling of cardiac arrhythmias; during this time he taught and supervised undergraduates and master's students.

Sole is a lead data scientist, instructor and developer of open source software. She created and maintains the Python library Feature-engine, which allows us to impute data, encode categorical variables, transform, create and select features.

Sole is also the author of the "Python Feature Engineering Cookbook", published by Packt.

In 2024, Sole was recognized as one of LinkedIn's voices in data science.

Welcome to the course “Time Series Forecasting with Machine Learning.” In this course, you will learn how to forecast multiple time series by using traditional machine learning algorithms like linear regression, decision trees, random forests and gradient boosting machines.

Forecasting is the process of predicting future values of a time series based on historical data. Traditionally, we’ve used statistical methods like ARIMA, SARIMA or exponential smoothing for forecasting. These forecasting models take a time series as input, and return a time series as output. They are simple, explainable, don’t require a lot of data preprocessing, and in many cases, they make accurate predictions.

We can also forecast with deep learning models like recurrent neural networks (RNNs) or long short-term memory networks (LSTMs). However, while these models are powerful in some cases, neural networks require huge datasets to offer a significant performance improvement over simpler models. For real-world use cases like demand forecasting or forecasting air pollution concentration, where data is limited, we can use simpler algorithms that are faster to train and easier to explain.

In recent years, there’s been a growing trend toward using traditional machine learning models, such as XGBoost and linear regression, for forecasting. These machine learning methods have been shown to be effective in dealing with multiple time series, which are often enriched with variables from additional datasets, and where it is highly desirable to learn across all of our data simultaneously. Indeed, models such as LightGBM have been shown to be highly effective at large-scale time series forecasting.
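To make the idea of learning across several series at once more concrete, here is a minimal sketch of forecasting as regression. It is not the course's own code: the helper names (make_lag_table, add_intercept) and the two toy series are hypothetical, and NumPy's least-squares solver stands in for a linear regression model. Each series is turned into a table of lagged values, the tables are stacked, and a single model is fitted to all rows.

```python
import numpy as np


def make_lag_table(series, n_lags):
    """Return (X, y): each row of X holds the n_lags values that precede y.

    Columns are ordered from the oldest lag to the most recent one.
    """
    series = np.asarray(series, dtype=float)
    X = np.column_stack(
        [series[start : len(series) - n_lags + start] for start in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y


def add_intercept(X):
    """Prepend a column of ones so the linear model can learn an offset."""
    return np.column_stack([np.ones(len(X)), X])


# Two toy series (made up for illustration), each with a linear trend.
series_a = np.arange(1.0, 13.0)          # 1, 2, ..., 12
series_b = 5.0 + 2.0 * np.arange(12.0)   # 5, 7, ..., 27

n_lags = 3
tables = [make_lag_table(s, n_lags) for s in (series_a, series_b)]

# Respect time order: train on the earlier rows of every series and
# hold out the last row of each series for evaluation.
X_train = np.vstack([X[:-1] for X, _ in tables])
y_train = np.concatenate([y[:-1] for _, y in tables])
X_test = np.vstack([X[-1:] for X, _ in tables])
y_test = np.array([y[-1] for _, y in tables])

# One set of coefficients is fitted on the rows of both series.
coef, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)
y_pred = add_intercept(X_test) @ coef

print("held-out values:", y_test)
print("predictions:    ", y_pred)
```

Because both toy series are exactly linear, the held-out values are recovered almost exactly; real data will not behave this neatly. The same table of lagged values could be passed to any regression model, such as a random forest or a gradient boosting machine.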

Syllabus

Welcome
- Introduction
- Course curriculum
- Course overview
- Course requirements
- Refer a friend program
Course material
- Course material
- Download Jupyter notebooks
- Download presentations
- Download datasets
- How did you hear about us?
Time series as regression
- Time series overview
- Forecasting overview
- Forecasting as regression
- Feature engineering
- Feature engineering - demo
- Feature engineering with Feature-engine
- Benchmark models
- Benchmark models - demo
- Single step forecasting with ML - demo
- Forecasting pipelines
- Summary
- A word from your instructor
- Quiz
- Additional resources
Multistep forecasting
- Multistep forecasting
- Recursive forecasting
- Recursive forecasting with lags - demo
- Recursive forecasting with future known features - demo
- Recursive forecasting with window features - demo
- Exercise 1 - recursive forecasting
- Direct forecasting
- Direct forecasting with skforecast
- Direct forecasting with sklearn
- Direct forecasting with future known features
- Direct forecasting with future known features - skforecast
- Exercise 2 - direct forecasting
- Wrap up
- A word from your instructor
- Quiz
- Additional reading resources
- Extra Treat: Our Reading Suggestion 📕
Multiseries forecasting
- Introduction to forecasting multiple time series
- Multiple time series
- Local and global forecasting
- Forecasting multiple independent time series
- Forecasting multiple independent time series - data: demo
- Forecasting multiple independent time series - local forecasting: demo
- Forecasting multiple independent time series - global forecasting: demo
- Forecasting multiple independent time series - weights: demo
- Exercise 1: forecasting multiple independent time series
- Forecasting multiple dependent time series
- Forecasting multiple dependent time series - data: demo
- Forecasting multiple dependent time series - forecasting: demo
- Exercise 2: forecasting multiple dependent time series
- Summary
- Quiz
- Additional reading materials
Backtesting
- Introduction to backtesting
- Backtesting basics
- Backtesting multiple time series
- Backtesting strategies
- How to choose a backtesting strategy
- Backtesting without refitting the model
- Backtesting without refitting - single time series: demo
- Backtesting without refitting - custom error metrics: demo
- Backtesting without refitting - multiple time series: demo
- Backtesting with refitting the model and expanding training window
- Backtesting with refitting and expanding training window - single time series: demo
- Backtesting with refitting and expanding training window - multiple time series: demo
- Backtesting with refitting the model and rolling training window
- Backtesting with refitting and rolling training window - single time series: demo
- Backtesting with refitting and rolling training window - multiple time series: demo
- Backtesting with intermittent refitting
- Backtesting with intermittent refitting - single time series: demo
- Backtesting with intermittent refitting - multiple time series: demo
- Backtesting with a gap
- Backtesting with a gap - single time series: demo
- Backtesting with a gap - multiple time series: demo
- Exercise 1: backtesting with a single time series
- Exercise 2: backtesting with multiple time series
- Summary
- Quiz
- Additional reading material
Error metrics
- Introduction to error metrics
- Structure of error metrics (part 1): overview of error metrics
- Structure of error metrics (part 2): multiple time series
- Structure of error metrics (part 3): what factors to consider
- Scale-dependent error metrics (part 1): definition
- Scale-dependent error metrics (part 2): pros, cons, and guidance
- Percentage error metrics (part 1): definition, pros and cons
- Percentage error metrics (part 2): modifications
- Percentage error metrics (part 3): guidance
- Relative error metrics and measures: definition, pros, cons, and guidance
- Scaled error metrics: definition, pros, cons, and guidance
- Error metrics for multiple time series
- Measuring bias
- Summary
- Exercise 1: Error metrics
- Quiz
- Additional reading material
Final section | Next steps
- Congratulations
- Next steps