Large-Scale AI Engineering

Abstract

This course focuses on the engineering principles and practices required to develop and optimize large-scale AI systems. Students will gain hands-on experience with high-performance computing (HPC) infrastructures, emphasizing the deployment and scaling of AI models on advanced GPU clusters.

Learning Objectives

By the end of this course, students will be able to:

  1. Understand the architecture and components of large-scale AI systems.
  2. Apply HPC techniques to enhance the performance of AI model training and inference.
  3. Implement optimizations, such as model parallelization, in AI workflows.
  4. Collaborate effectively in teams to improve AI system throughput and scalability.

Course Catalog

Learn more in the ETH Zurich course catalog entry.
