Machine Learning in Climate and Environmental Sciences

Content:

This module covers key concepts for real-world applications of machine learning, focusing on environmental data science. These include: 

• foundations of machine learning (e.g., curse of dimensionality, cross-validation, cost functions, feature engineering)

• several widely applied regression, classification, and unsupervised learning algorithms (e.g., LASSO, random forests, Gaussian processes, neural networks, LSTMs, transformers, self-organizing maps)

• time series forecasting and causal inference

• explainable AI (e.g., SHAP value analyses, feature permutation methods, intrinsically interpretable methods)

These concepts will be discussed in applied contexts, using current research examples from the climate and environmental sciences. Examples include climate change modelling, machine learning emulation of numerical models, forecasting of air pollution and wildfires, understanding coupled dynamical systems such as global teleconnections in climate science, challenges in modelling non-stationary systems (e.g., predicting extreme weather events under global warming), and anomaly detection in measurement data.

The lectures are accompanied by computer exercises in which students learn how to implement and modify machine learning modelling pipelines first-hand.

Workload:

In terms of in-person teaching, this is a 4 SWS module (SWS: weekly contact hours per semester): 2 SWS for lectures and 2 SWS for exercises.

Overall:

(2 SWS lectures + 2 SWS exercises + 1.5 x 4 SWS preparation and homework) x 15 weeks + 30 h exam preparation = 180 h = 6 ECTS

Competency/Goals:

Learning objectives:

Students will be able to effectively address complex data science challenges. They will be able to design and apply robust modelling strategies and pipelines for machine learning applications in the climate and environmental sciences, and to transfer these to other disciplines.

Their acquired knowledge will include the major classes of machine learning techniques, how to choose among algorithms in a variety of problem settings, how to assess important data properties that could, for example, help or interfere with modelling goals, and methods for combining data-driven modelling with prior scientific understanding of a system to increase the performance and trustworthiness of machine learning.

Students will learn how to implement these approaches in Python, using major machine learning software packages.

Language of instruction: English