Semester and Thesis Projects

The ETH AI Center offers a wide range of semester and thesis projects for students at ETH Zurich as well as other universities. Please see the list below for the projects that are currently available.

Are you a student? Check out our Semester and Thesis projects below!

Need help? If you have any questions, feel free to send them to wayne.zeng@ai.ethz.ch.

ETH Zurich uses SiROP to publish and search scientific projects. For more information visit sirop.org.

Extending Functional Scene Graphs to Include Articulated Object States

Computer Vision and Geometry Group

While traditional [1] and functional [2] scene graphs are capable of capturing the spatial relationships and functional interactions between objects and spaces, they encode each object as static, with fixed geometry. In this project, we aim to enable the estimation of the state of articulated objects and include it in the functional scene graph.

Keywords

scene understanding, scene graph, exploration

Labels

Master Thesis

PLEASE LOG IN TO SEE DESCRIPTION

This project's publisher has set it to limited visibility. To see the description, open the project via the "Open this project..." link below and log in to SiROP with your university login (or create an account). If your affiliation is not created automatically, follow these instructions: http://bit.ly/sirop-affiliate

More information

Open this project... 

Published since: 2025-03-25 , Earliest start: 2025-03-25

Applications limited to ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne

Organization Computer Vision and Geometry Group

Hosts Bauer Zuria, Dr. , Trisovic Jelena , Zurbrügg René

Topics Information, Computing and Communication Sciences , Engineering and Technology

Next-Gen Augmented Auditory Perception

ETH Competence Center - ETH AI Center

Join the Sensors Group (https://sensors.ini.ch/) at the Institute of Neuroinformatics (INI), UZH-ETH Zurich to develop next generation audio systems for augmented auditory perception! This project explores advanced audio processing techniques to enhance human hearing beyond natural capabilities. You will develop and optimize algorithms that amplify, filter, and selectively enhance sounds in complex auditory environments. Applications include assistive listening devices, augmented reality audio, and situational awareness systems. The work involves designing deep learning models, improving real-time processing efficiency, and optimizing hardware cost on embedded platforms.

Keywords

audio signal processing, neural networks, real-time computation, embedded platforms, FPGA, brain machine interface, neural recording analysis, control systems, deep learning, CUDA, Verilog, hardware acceleration, Jetson, VR/AR, hearing aids

Labels

Semester Project , Master Thesis , ETH Zurich (ETHZ)

Description

Topics:

  • Audio retrieval with spoken language queries
  • Neural network-based active noise control
  • Auditory attention decoding from open-source neural recording datasets
  • Hardware acceleration of auditory state space models

Application process: Write to us about your interests, including your CV and transcript, and we can arrange a meeting. We can supervise students from UZH and ETH. We offer semester projects as well as bachelor's and master's thesis projects.

Specific project descriptions:

--------------------------------------------------------------------------------------------

Audio Retrieval with Spoken Language Queries

Keywords: audio retrieval, spoken language understanding, deep learning

This project focuses on developing targeted sound extraction algorithms based on natural speech queries. You will design and implement signal processing techniques and machine learning models tailored for hearing aids or embedded audio devices. The work involves optimizing neural architectures, exploring trade-offs between computation cost and accuracy, and validating the system using real-world audio data.

Qualifications: Experience with Python, signal processing, and training neural networks, preferably also experience with embedded audio processing.

--------------------------------------------------------------------------------------------

Neural Network-Based Active Noise Control

Keywords: neural networks, adaptive filtering, real-time processing

This project aims to develop deep learning models for active noise control (ANC), improving noise-reduction performance in real-world environments. You will work on end-to-end neural-network-based ANC algorithms, optimized for real-time adaptation to changing environments. The project involves data-driven modeling of noise environments, designing low-computation-cost online learning algorithms, and implementing the system on edge hardware.
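
For intuition, here is a toy, self-contained sketch of the online-adaptation idea (simulated primary/secondary acoustic paths and placeholder sizes, not the project's actual setup): a small network learns to emit an anti-noise block that cancels the noise arriving at a simulated error microphone.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    block = 256

    # Toy secondary path (speaker-to-error-mic coupling) as a fixed FIR filter.
    sec_path = nn.Conv1d(1, 1, kernel_size=16, padding=15, bias=False)
    sec_path.weight.data = 0.1 * torch.randn(1, 1, 16)
    sec_path.weight.requires_grad_(False)

    # Small network mapping a reference-mic block to an anti-noise block.
    model = nn.Sequential(nn.Linear(block, 128), nn.ReLU(), nn.Linear(128, block))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for step in range(2000):
        ref = torch.randn(1, block)          # reference-mic noise (toy, white)
        noise_at_ear = ref                   # toy primary path: identity
        anti = model(ref)                    # anti-noise emitted by the speaker
        anti_at_ear = sec_path(anti.unsqueeze(1)).squeeze(1)[:, :block]
        residual = noise_at_ear + anti_at_ear
        loss = (residual ** 2).mean()        # drive residual noise toward zero
        opt.zero_grad(); loss.backward(); opt.step()

A real system would replace the toy paths with measured transfer functions, enforce causality and latency budgets, and run the update loop on embedded hardware.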

Qualifications: Experience with Python, Pytorch/TensorFlow, signal processing, and control systems.

--------------------------------------------------------------------------------------------

Auditory Attention Decoding from Neural Recordings

Keywords: brain-computer interface, EEG/MEG analysis

This project investigates neural mechanisms underlying auditory attention and decoding attention patterns from neural recordings (e.g., EEG/MEG). You will develop machine learning pipelines to analyze neural data, extract relevant auditory attention features, and optimize decoding algorithms for real-time applications. The goal is to enhance auditory prosthetics and brain-computer interfaces by decoding which sound source our brain is focusing on.
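
For orientation, a minimal sketch of the classic stimulus-reconstruction approach to attention decoding (ridge regression from EEG features to the attended speech envelope; preprocessing and lagged-feature construction are omitted, and all names are illustrative):

    import numpy as np

    def fit_decoder(eeg, envelope, lam=1.0):
        # eeg: (time, features) matrix; envelope: (time,) attended envelope.
        X, y = eeg, envelope
        return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

    def decode_attention(eeg, env_a, env_b, w):
        # Reconstruct an envelope from EEG, then pick the speaker whose true
        # envelope correlates best with the reconstruction.
        rec = eeg @ w
        r_a = np.corrcoef(rec, env_a)[0, 1]
        r_b = np.corrcoef(rec, env_b)[0, 1]
        return "A" if r_a > r_b else "B"

The project would go beyond such linear decoders, but they remain the standard baseline for real-time auditory attention decoding.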

Qualifications: Experience with Python, machine learning, and preferably experience with neural recording data analysis.

--------------------------------------------------------------------------------------------

Hardware Acceleration of Auditory State Space Models

Keywords: embedded systems, edge computing, neural network accelerator

This project focuses on designing hardware accelerators for the forward (and backward) processes of state-space models in audio tasks, such as audio separation. Your work may involve optimizing neural architectures for energy-efficient execution, implementing hardware-friendly inference, and validating performance on embedded platforms.
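
For reference, the per-sample recurrence that such accelerators typically target looks roughly like the toy numpy sketch below (placeholder matrices, not a production kernel); the hardware question is how to pipeline this step, or its parallel-scan equivalent, over long audio streams.

    import numpy as np

    def ssm_step(A, B, C, D, x, u):
        # One sample of the discrete state-space recurrence:
        #   y[k] = C x[k] + D u[k];   x[k+1] = A x[k] + B u[k]
        y = C @ x + D * u
        x = A @ x + B * u
        return y, x

    n = 8
    A = 0.95 * np.eye(n)                   # toy stable state matrix
    B, C, D = np.ones(n), np.ones(n) / n, 0.0
    x = np.zeros(n)
    audio = np.random.randn(16000)         # 1 s of toy audio at 16 kHz
    out = np.empty_like(audio)
    for k, u in enumerate(audio):
        out[k], x = ssm_step(A, B, C, D, x, u)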

Qualifications: Experience with Python and implementing neural networks on hardware platforms.

Contact Details

More information

Open this project... 

Published since: 2025-03-12 , Earliest start: 2025-03-12 , Latest end: 2026-05-13

Applications limited to University of Zurich , ETH Zurich

Organization ETH Competence Center - ETH AI Center

Hosts Liu Shih-Chii , Moure Pehuen

Topics Information, Computing and Communication Sciences , Engineering and Technology

Project in Neuroscience, Machine Learning, and Human-Computer Interaction

ETH Competence Center - ETH AI Center

Project Title: Designing a Human-in-the-Loop Model for Behavior Classification in Videos

Description: We are looking for a motivated student to join an exciting interdisciplinary project that combines neuroscience, machine learning, and human-computer interaction. The project involves building a robust model for behavior classification in videos with a human-in-the-loop approach. The data for this project has already been recorded, and the next steps involve integrating new data, improving the model, and implementing machine learning solutions using Python and popular ML libraries.

Keywords

human-computer-interaction, machine learning, neuroscience

Labels

Master Thesis

Description

As part of this project, your tasks will include:

  • Integrating new data into the existing dataset.

  • Developing and refining machine learning models for behavior classification in video data.

  • Implementing efficient code using Python and ML frameworks such as TensorFlow or PyTorch.

  • Collaborating with an interdisciplinary team to improve model performance and explore new research directions.

We are looking for candidates who have:

  • A background in Engineering, Computer Science, or a related field (e.g., Computer Vision, Human-Computer-Interaction, Biomedical, Computational Neuroscience).

  • Experience with deep learning frameworks such as TensorFlow or PyTorch.

  • Strong coding skills in Python and a solid understanding of ML concepts.

  • Motivation to contribute to an interdisciplinary project that bridges neuroscience and engineering.

  • Willingness to collaborate with a diverse team, including experts in ML and neuroscience.

You would be working together with researchers from:

  • ETH AI Center

  • Institute of Neuroinformatics in Zurich

  • Fraunhofer-Institut für Angewandte Informationstechnik FIT (chair of human-centric AI)

Goal

The goal will be defined together with you :)

Contact Details

If you are interested, we would love to get to know you! Please write an Email to all of us including the following:

  • Details about your coding experience, including languages, years of experience, and relevant projects.

  • Your CV.

  • Your transcript of records.

Our emails are:

More information

Open this project... 

Published since: 2025-03-11 , Earliest start: 2023-02-01

Organization ETH Competence Center - ETH AI Center

Hosts Vo Anh

Topics Information, Computing and Communication Sciences , Engineering and Technology , Biology

Leveraging Human Motion Data from Videos for Humanoid Robot Motion Learning

ETH Competence Center - ETH AI Center

The advancement of humanoid robotics has reached a stage where mimicking complex human motions with high accuracy is crucial for tasks ranging from entertainment to human-robot interaction in dynamic environments. Traditional approaches to motion learning for humanoid robots rely heavily on motion capture (MoCap) data. However, acquiring large amounts of high-quality MoCap data is both expensive and logistically challenging. In contrast, video footage of human activities, such as sports events or dance performances, is widely available and offers an abundant source of motion data.

Building on recent advances in extracting and utilizing human motion from videos, such as WHAM and the method of "Learning Physically Simulated Tennis Skills from Broadcast Videos" (see the related literature below), this project aims to develop a system that extracts human motion from videos and applies it to teach a humanoid robot how to perform similar actions. The primary focus will be on extracting dynamic and expressive motions from videos, such as soccer player celebrations, and using these extracted motions as reference data for reinforcement learning (RL) and imitation learning on a humanoid robot.

Labels

Master Thesis

Description

Work packages

Literature research

Global motion reconstruction from videos.

Learning from reconstructed motion demonstrations with reinforcement learning on a humanoid robot.

Requirements

Strong programming skills in Python

Experience in computer vision and reinforcement learning

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to machine learning / computer vision / robotics conferences.

Related literature

Yuan, Y., Iqbal, U., Molchanov, P., Kitani, K. and Kautz, J., 2022. GLAMR: Global occlusion-aware human mesh recovery with dynamic cameras. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11038-11049).

Yuan, Y. and Makoviychuk, V., 2023. Learning physically simulated tennis skills from broadcast videos.

Shin, S., Kim, J., Halilaj, E. and Black, M.J., 2024. WHAM: Reconstructing world-grounded humans with accurate 3D motion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2070-2080).

Peng, X.B., Abbeel, P., Levine, S. and Van de Panne, M., 2018. DeepMimic: Example-guided deep reinforcement learning of physics-based character skills. ACM Transactions on Graphics (TOG), 37(4), pp.1-14.

Goal

The objective of this project is to develop a robust system for extracting human motions from video footage and transferring these motions to a humanoid robot using learning from demonstration techniques. The system will be designed to handle the noisy data typically associated with video-based motion extraction and ensure that the humanoid robot can replicate the extracted motions with high fidelity while respecting physical rules.

Proposed Methodology

Video Data Collection and Motion Extraction:

  • Collect video footage of soccer player celebrations and other dynamic human activities.

  • Start from existing monocular human pose/motion estimation algorithms to extract 3D motion data from the videos.

  • Incorporate physics-based corrections similar to those employed in WHAM to address issues like jitter, foot sliding, and ground penetration in the extracted motion data.

Motion Learning:

  • Apply existing learning-from-demonstration algorithms in a simulated environment to replicate the kinematic motions reconstructed from the videos while respecting physical constraints, using reinforcement learning; a minimal sketch of such an imitation reward follows below.
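
As a minimal sketch of the kind of imitation reward used in DeepMimic-style learning from demonstration (weights and scales below are placeholders, not the project's prescribed reward):

    import numpy as np

    def imitation_reward(q, q_ref, v, v_ref, w_pose=0.65, w_vel=0.35):
        # Exponentiated joint-position and joint-velocity tracking errors;
        # q/v are the robot's joint positions/velocities, q_ref/v_ref come
        # from the motion reconstructed from video.
        pose_err = np.sum((q - q_ref) ** 2)
        vel_err = np.sum((v - v_ref) ** 2)
        return w_pose * np.exp(-2.0 * pose_err) + w_vel * np.exp(-0.1 * vel_err)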

Implementation on Humanoid Robot:

  • This is encouraged since we have our robot lying there waiting for you.

Contact Details

Please include your CV and transcript in the submission.

Manuel Kaufmann

https://ait.ethz.ch/people/kamanuel

kamanuel@inf.ethz.ch

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

More information

Open this project... 

Published since: 2025-02-25

Applications limited to ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne

Organization ETH Competence Center - ETH AI Center

Hosts Li Chenhao , Kaufmann Manuel

Topics Engineering and Technology

Learning Agile Dodgeball Behaviors for Humanoid Robots

ETH Competence Center - ETH AI Center

Agility and rapid decision-making are vital for humanoid robots to safely and effectively operate in dynamic, unstructured environments. In human contexts—whether in crowded spaces, industrial settings, or collaborative environments—robots must be capable of reacting to fast, unpredictable changes in their surroundings. This includes not only planned navigation around static obstacles but also rapid responses to dynamic threats such as falling objects, sudden human movements, or unexpected collisions. Developing such reactive capabilities in legged robots remains a significant challenge due to the complexity of real-time perception, decision-making under uncertainty, and balance control.

Humanoid robots, with their human-like morphology, are uniquely positioned to navigate and interact with human-centered environments. However, achieving fast, dynamic responses—especially while maintaining postural stability—requires advanced control strategies that integrate perception, motion planning, and balance control within tight time constraints.

The task of dodging fast-moving objects, such as balls, provides an ideal testbed for studying these capabilities. It encapsulates several core challenges: rapid object detection and trajectory prediction, real-time motion planning, dynamic stability maintenance, and reactive behavior under uncertainty. Moreover, it presents a simplified yet rich framework to investigate more general collision avoidance strategies that could later be extended to complex real-world interactions.

In robotics, reactive motion planning for dynamic environments has been widely studied, but primarily in the context of wheeled robots or static obstacle fields. Classical approaches focus on precomputed motion plans or simple reactive strategies, often unsuitable for highly dynamic scenarios where split-second decisions are critical. In the domain of legged robotics, maintaining balance while executing rapid, evasive maneuvers remains a challenging problem. Previous work on dynamic locomotion has addressed agile behaviors like running, jumping, or turning (e.g., Hutter et al., 2016; Kim et al., 2019), but these movements are often planned in advance rather than triggered reactively. More recent efforts have leveraged reinforcement learning (RL) to enable robots to adapt to dynamic environments, demonstrating success in tasks such as obstacle avoidance, perturbation recovery, and agile locomotion (Peng et al., 2017; Hwangbo et al., 2019). However, many of these approaches still struggle with real-time constraints and robustness in high-speed, unpredictable scenarios.

Perception-driven control in humanoids, particularly for tasks requiring fast reactions, has seen advances through sensor fusion, visual servoing, and predictive modeling. For example, integrating vision-based object tracking with dynamic motion planning has enabled robots to perform tasks like ball catching or blocking (Ishiguro et al., 2002; Behnke, 2004). Yet, dodging requires a fundamentally different approach: instead of converging toward an object (as in catching), the robot must predict and strategically avoid the object's trajectory while maintaining balance—often in the presence of limited maneuvering time.

Dodgeball-inspired robotics research has been explored in limited contexts, primarily using wheeled robots or simplified agents in simulations. Few studies have addressed the challenges of high-speed evasion combined with the complexities of humanoid balance and multi-joint coordination. This project aims to bridge that gap by developing learning-based methods that enable humanoid robots to reactively avoid fast-approaching objects in real time, while preserving stability and agility.

Labels

Master Thesis

Description

Work packages

Literature research

Utilize simulation platforms (e.g., Isaac Lab) for initial policy development and training.

Explore model-free RL approaches, potentially incorporating curriculum learning to gradually increase task complexity.

Investigate perception models for object detection and trajectory forecasting, possibly leveraging lightweight deep learning architectures for real-time processing.

Implement and test learned behaviors on a physical humanoid robot, addressing the challenges of sim-to-real transfer through domain randomization or fine-tuning.

Requirements

Solid foundation in robotics, control theory, and machine learning.

Experience with reinforcement learning frameworks (e.g., PyTorch, TensorFlow, or RLlib).

Familiarity with robot simulation environments (e.g., MuJoCo, Gazebo) and real-world robot control.

Strong programming skills (Python, C++) and experience with sensor data processing.

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to machine learning / robotics conferences.

Goal

Perception & Prediction

  • Develop a real-time perception pipeline capable of detecting and tracking incoming projectiles. Utilize camera data or external motion capture systems to predict ball trajectories accurately under varying speeds and angles.
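
To make the trajectory-prediction step concrete, a minimal (purely illustrative) sketch fits a ballistic model to a few tracked 3D positions and extrapolates; a real pipeline would add outlier rejection, air drag, and uncertainty estimates.

    import numpy as np

    def predict_ball(ts, ps, t_query):
        # Fit p(t) = p0 + v0*t + 0.5*a*t^2 per axis by least squares.
        ts, ps = np.asarray(ts), np.asarray(ps)          # (N,), (N, 3)
        A = np.stack([np.ones_like(ts), ts, 0.5 * ts**2], axis=1)
        coef, *_ = np.linalg.lstsq(A, ps, rcond=None)    # rows: p0, v0, a
        return np.array([1.0, t_query, 0.5 * t_query**2]) @ coef

    # Example: positions tracked at 100 Hz, predict 250 ms after the first frame.
    ts = np.arange(5) * 0.01
    ps = np.stack([2.0 - 5.0 * ts, 0.1 * ts, 1.5 + 3.0 * ts - 4.905 * ts**2], axis=1)
    print(predict_ball(ts, ps, 0.25))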

Reactive Motion Planning

  • Design algorithms that plan evasive maneuvers (e.g., side-steps, ducks, or rotational movements) within milliseconds of detecting an incoming threat, ensuring the robot’s center of mass remains stable throughout.

Learning-Based Control

  • Apply reinforcement learning or imitation learning to optimize dodge behaviors, balancing between minimal energy expenditure and maximum evasive success. Investigate policy architectures that enable rapid reactions while handling noisy observations and sensor delays.

Robustness & Evaluation

  • Test the system under diverse scenarios, including multi-ball environments and varying throw speeds. Evaluate the robot’s success rate, energy efficiency, and post-dodge recovery capabilities.

Implementation on Humanoid Robot:

  • This is encouraged since we have our robot lying there waiting for you.

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

More information

Open this project... 

Published since: 2025-02-25

Applications limited to ETH Zurich , EPFL - Ecole Polytechnique Fédérale de Lausanne

Organization ETH Competence Center - ETH AI Center

Hosts Li Chenhao

Topics Engineering and Technology

Learning Real-time Human Motion Tracking on a Humanoid Robot

ETH Competence Center - ETH AI Center

Humanoid robots, designed to mimic the structure and behavior of humans, have seen significant advancements in kinematics, dynamics, and control systems. Teleoperation of humanoid robots involves complex control strategies to manage bipedal locomotion, balance, and interaction with environments. Research in this area has focused on developing robots that can perform tasks in environments designed for humans, from simple object manipulation to navigating complex terrains.

Reinforcement learning has emerged as a powerful method for enabling robots to learn from interactions with their environment, improving their performance over time without explicit programming for every possible scenario. In the context of humanoid robotics and teleoperation, RL can be used to optimize control policies, adapt to new tasks, and improve the efficiency and safety of human-robot interactions. Key challenges include the high dimensionality of the action space, the need for safe exploration, and the transfer of learned skills across different tasks and environments.

Integrating human motion tracking with reinforcement learning on humanoid robots represents a cutting-edge area of research. This approach involves using human motion data as input to train RL models, enabling the robot to learn more natural and human-like movements. The goal is to develop systems that can not only replicate human actions in real time but also adapt and improve their responses over time through learning. Challenges in this area include ensuring real-time performance, dealing with the variability of human motion, and maintaining the stability and safety of the humanoid robot.

Keywords

real-time, humanoid, reinforcement learning, representation learning

Labels

Master Thesis

Description

Work packages

Literature research

Human motion capture and retargeting

Skill space development

Hardware validation encouraged upon availability

Requirements

Strong programming skills in Python

Experience in reinforcement learning and imitation learning frameworks

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.

Related literature

Peng, Xue Bin, et al. "DeepMimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions on Graphics (TOG) 37.4 (2018): 1-14.

Starke, Sebastian, et al. "DeepPhase: Periodic autoencoders for learning motion phase manifolds." ACM Transactions on Graphics (TOG) 41.4 (2022): 1-13.

Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."

Serifi, A., Grandia, R., Knoop, E., Gross, M. and Bächer, M., 2024, December. VMP: Versatile motion priors for robustly tracking motion on physical characters. In Computer Graphics Forum (Vol. 43, No. 8, p. e15175).

Fu, Z., Zhao, Q., Wu, Q., Wetzstein, G. and Finn, C., 2024. HumanPlus: Humanoid shadowing and imitation from humans. arXiv preprint arXiv:2406.10454.

He, T., Luo, Z., Xiao, W., Zhang, C., Kitani, K., Liu, C. and Shi, G., 2024, October. Learning human-to-humanoid real-time whole-body teleoperation. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 8944-8951). IEEE.

He, T., Luo, Z., He, X., Xiao, W., Zhang, C., Zhang, W., Kitani, K., Liu, C. and Shi, G., 2024. OmniH2O: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. arXiv preprint arXiv:2406.08858.

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

More information

Open this project... 

Published since: 2025-02-25

Organization ETH Competence Center - ETH AI Center

Hosts Li Chenhao

Topics Information, Computing and Communication Sciences

Loosely Guided Reinforcement Learning for Humanoid Parkour

ETH Competence Center - ETH AI Center

Humanoid robots hold the promise of navigating complex, human-centric environments with agility and adaptability. However, training these robots to perform dynamic behaviors such as parkour—jumping, climbing, and traversing obstacles—remains a significant challenge due to the high-dimensional state and action spaces involved. Traditional Reinforcement Learning (RL) struggles in such settings, primarily due to sparse rewards and the extensive exploration needed for complex tasks.

This project proposes a novel approach to address these challenges by incorporating loosely guided references into the RL process. Instead of relying solely on task-specific rewards or complex reward shaping, we introduce a simplified reference trajectory that serves as a guide during training. This trajectory, often limited to the robot's base movement, reduces the exploration burden without constraining the policy to strict tracking, allowing the emergence of diverse and adaptable behaviors.

Reinforcement learning has demonstrated remarkable success in training agents for tasks ranging from game playing to robotic manipulation. However, its application to high-dimensional, dynamic tasks like humanoid parkour is hindered by two primary challenges:

  • Exploration complexity: the vast state-action space of humanoids leads to slow convergence, often requiring millions of training steps.
  • Reward design: sparse rewards make it difficult for the agent to discover meaningful behaviors, while dense rewards demand intricate and often brittle design efforts.

By introducing a loosely guided reference—a simple trajectory representing the desired flow of the task—we aim to reduce the exploration space while maintaining the flexibility of RL. This approach bridges the gap between pure RL and demonstration-based methods, enabling the learning of complex maneuvers like climbing, jumping, and dynamic obstacle traversal without heavy reliance on reward engineering or exact demonstrations.
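
For intuition, loose guidance can be as simple as an additive soft bonus on the robot's base position, rather than a hard tracking objective; a minimal sketch with placeholder weights (not the project's prescribed reward):

    import numpy as np

    def loosely_guided_reward(task_r, base_pos, ref_pos, w_guide=0.5, sigma=0.5):
        # Soft bonus for staying near a coarse base-trajectory reference;
        # w_guide and sigma are tunable and could be annealed to zero as
        # training progresses, leaving only the task reward.
        guide = np.exp(-np.sum((base_pos - ref_pos) ** 2) / sigma ** 2)
        return task_r + w_guide * guide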

Keywords

humanoid, reinforcement learning, loosely guided

Labels

Master Thesis

Description

Work packages

Design a Loosely Guided RL Framework that integrates simple reference trajectories into the training loop.

Evaluate Exploration Efficiency by comparing baseline RL methods with the guided approach.

Demonstrate Complex Parkour Behaviors such as climbing, jumping, and dynamic traversal using the guided RL policy.

Hardware validation encouraged

Requirements

Strong programming skills in Python

Experience in reinforcement learning and imitation learning frameworks

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.

Related literature

Peng, Xue Bin, et al. "DeepMimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions on Graphics (TOG) 37.4 (2018): 1-14.

Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.

Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."

Serifi, A., Grandia, R., Knoop, E., Gross, M. and Bächer, M., 2024, December. VMP: Versatile motion priors for robustly tracking motion on physical characters. In Computer Graphics Forum (Vol. 43, No. 8, p. e15175).

Fu, Z., Zhao, Q., Wu, Q., Wetzstein, G. and Finn, C., 2024. HumanPlus: Humanoid shadowing and imitation from humans. arXiv preprint arXiv:2406.10454.

He, T., Luo, Z., Xiao, W., Zhang, C., Kitani, K., Liu, C. and Shi, G., 2024, October. Learning human-to-humanoid real-time whole-body teleoperation. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 8944-8951). IEEE.

He, T., Luo, Z., He, X., Xiao, W., Zhang, C., Zhang, W., Kitani, K., Liu, C. and Shi, G., 2024. OmniH2O: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. arXiv preprint arXiv:2406.08858.

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

More information

Open this project... 

Published since: 2025-02-25

Organization ETH Competence Center - ETH AI Center

Hosts Li Chenhao

Topics Information, Computing and Communication Sciences

Learning World Models for Legged Locomotion

Robotic Systems Lab

Model-based reinforcement learning learns a world model from which an optimal control policy can be extracted. Understanding and predicting the forward dynamics of legged systems is crucial for effective control and planning. Forward dynamics involves predicting the next state of the robot given its current state and the applied actions. While traditional physics-based models can provide a baseline understanding, they often struggle with the complexities and non-linearities inherent in real-world scenarios, particularly due to the varying contact patterns of the robot's feet with the ground.

The project aims to develop and evaluate neural-network-based models for predicting the dynamics of legged systems, focusing on accounting for varying contact patterns and non-linearities. This involves collecting and preprocessing data from experiments in various simulation environments, designing neural network architectures that incorporate the necessary structure, and exploring hybrid models that combine physics-based predictions with neural network corrections. The models will be trained and evaluated on autoregressive prediction accuracy, with an emphasis on robustness and generalization across different noise perturbations. By the end of the project, the goal is an accurate, robust, and generalizable predictive model for the forward dynamics of legged systems.
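
As a minimal sketch of the kind of learned forward-dynamics model and autoregressive evaluation described above (assuming PyTorch; all sizes and names are placeholders):

    import torch
    import torch.nn as nn

    class DynamicsModel(nn.Module):
        # f(s, a) -> next state, predicted as a delta on the current state;
        # dimensions are placeholders for a quadruped-like system.
        def __init__(self, s_dim=48, a_dim=12, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(s_dim + a_dim, hidden), nn.ELU(),
                nn.Linear(hidden, hidden), nn.ELU(),
                nn.Linear(hidden, s_dim),
            )

        def forward(self, s, a):
            return s + self.net(torch.cat([s, a], dim=-1))

    @torch.no_grad()
    def autoregressive_rollout(model, s0, actions):
        # Feed predictions back in; comparing this rollout against ground
        # truth is how compounding prediction error is measured.
        s, traj = s0, []
        for a in actions:
            s = model(s, a)
            traj.append(s)
        return torch.stack(traj)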

Keywords

forward dynamics, non-smooth dynamics, neural networks, model-based reinforcement learning

Labels

Master Thesis

Description

Work packages

Literature research

Understand the training pipeline of the paper "Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics".

Explore the possibility of using a first-order gradient in optimizing the policy.

Requirements

Strong programming skills in Python

Experience in machine learning frameworks, especially model-based reinforcement learning.

Publication

This project will mostly focus on simulated environments. Promising results will be submitted to machine learning conferences, where the method will be thoroughly evaluated and tested on different systems (e.g., simple MuJoCo environments to complex systems such as quadrupeds and bipeds).

Related literature

Hafner, D., Lillicrap, T., Ba, J. and Norouzi, M., 2019. Dream to control: Learning behaviors by latent imagination. arXiv preprint arXiv:1912.01603.

Hafner, D., Lillicrap, T., Norouzi, M. and Ba, J., 2020. Mastering Atari with discrete world models. arXiv preprint arXiv:2010.02193.

Hafner, D., Pasukonis, J., Ba, J. and Lillicrap, T., 2023. Mastering diverse domains through world models. arXiv preprint arXiv:2301.04104.

Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.

Song, Y., Kim, S. and Scaramuzza, D., 2024. Learning Quadruped Locomotion Using Differentiable Simulation. arXiv preprint arXiv:2403.14864.

Li, C., Krause, A. and Hutter, M., 2025. Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics. arXiv preprint arXiv:2501.10100.

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

More information

Open this project... 

Published since: 2025-02-25

Organization Robotic Systems Lab

Hosts Li Chenhao

Topics Engineering and Technology

Development and Optimization of a Spirometry Data Collection Module for the Alex Digital Health Assistant

ETH Competence Center - ETH AI Center

This master’s thesis project, part of the Alex Project (https://brc.ch/research/alex/), focuses on designing, developing, and optimizing a user-centered spirometry data collection module integrated into a smartphone-based Digital Health Assistant (DHA). Targeting adolescents aged 10–19 years, the project emphasizes creating an age-appropriate graphical user interface (GUI) that guides users through spirometry testing, provides real-time feedback, and visually represents lung function data to enhance usability, compliance, and measurement accuracy.

Keywords

Digital Health Assistant (DHA), Spirometry Data Collection, GUI Design, UI/UX, Mobile App Development, Adolescents, Real-Time Feedback, Usability Testing, Flutter & Dart

Labels

Master Thesis

Description

The thesis is a sub-project within the Alex Project—a digital health initiative aimed at improving asthma control and disease management among adolescents. The focus of this work is the development, testing, and refinement of a spirometry module that integrates with a portable spirometer. A key component of the module is an intuitive, age-appropriate GUI designed for two distinct adolescent age groups:

  • 10 to 13 years (younger adolescents)
  • 14 to 19 years (older adolescents)

The interface will provide step-by-step instructions on performing spirometry tests, deliver real-time graphical feedback to ensure correct testing technique (inspiration and forced expiration), and visualize the lung function results in a clear and engaging manner. The project will involve UI/UX design using Dart and Flutter, integration with the spirometer’s SDK and the CLAID middleware, and comprehensive usability studies to validate the design with real users and healthcare professionals.

Ideal Candidate Requirements:

  • Academic Background: Currently pursuing a Master’s degree in Human-Computer Interaction, Biomedical Engineering, Computer Science, or a related field.

  • Technical Skills: Experience in mobile app development ideally with a focus on Flutter and Dart, as well as Java; familiarity with Kotlin is an advantage.

  • Design & Usability Expertise: Demonstrated experience or a strong interest in UI/UX design, graphic design principles, front-end development, and usability testing.

  • Problem-Solving: Excellent analytical and problem-solving skills to tackle challenges in digital health technology development.

  • Domain Interest: Passion for digital health technologies and an understanding of how to design effective, age-appropriate interfaces for adolescent users.

Goal

  • User-Centered GUI Design: Create an engaging and intuitive interface tailored to the cognitive and visual needs of both younger and older adolescents, including a step-by-step instructional system for conducting spirometry tests.
  • Real-Time Feedback & Optimization: Develop and implement age-appropriate, dynamic graphical feedback that guides users to achieve correct spirometry technique.
  • Graphical Representation of Results: Design visualizations that make lung function data easily understandable and engaging for the target age groups.
  • Testing and Validation: Conduct usability studies with adolescents, gather feedback from healthcare professionals, and iteratively optimize the GUI to enhance spirometry compliance and data accuracy.

Contact Details

If interested, please send your CV, transcript, and cover letter to: Edgar Delgado-Eckert, University Children's Hospital of Basel (edgar.delgado-eckert@unibas.ch) AND Filipe Barata, Centre for Digital Health Interventions, ETH Zurich (fbarata@ethz.ch)

More information

Open this project... 

Published since: 2025-02-03 , Earliest start: 2025-02-10

Organization ETH Competence Center - ETH AI Center

Hosts Da Conceição Barata Filipe

Topics Medical and Health Sciences , Mathematical Sciences , Information, Computing and Communication Sciences , Engineering and Technology

Brain Machine Interface - Visual Neuroprosthetics

ETH Competence Center - ETH AI Center

Join the Sensors Group at the Institute of Neuroinformatics (INI) to develop next-generation visual neuroprosthetics and advance the future of brain-machine interfaces!

Topics include:

  • Developing neural networks that learn optimal stimulation patterns
  • Utilizing recurrent/spiking neural networks for creating stimulation patterns
  • Implementing real-time computation on embedded platforms (FPGA, uC, Jetson)
  • Investigating closed-loop control strategies for electrical brain stimulation

Application process: Write to us about your interests, including your CV and transcript, and we can arrange a meeting. We can supervise students from UZH and ETH. We offer semester projects as well as bachelor's and master's thesis projects.

Keywords

brain machine interface, visual neuroprosthetics, bmi, neural networks, Real-time computation, Embedded platforms, FPGA, Closed-loop control, neural recording analysis, Control systems, Deep learning, Verilog, Vivado, hls4ml, Hardware acceleration, Jetson, VR, Android, Unity/Blender

Labels

Semester Project , Master Thesis , ETH Zurich (ETHZ)

PLEASE LOG IN TO SEE DESCRIPTION

This project's publisher has set it to limited visibility. To see the description, open the project via the "Open this project..." link below and log in to SiROP with your university login (or create an account). If your affiliation is not created automatically, follow these instructions: http://bit.ly/sirop-affiliate

More information

Open this project... 

Published since: 2025-01-16 , Earliest start: 2025-02-01 , Latest end: 2026-12-31

Applications limited to ETH Zurich , University of Zurich

Organization ETH Competence Center - ETH AI Center

Hosts Moure Pehuen , Liu Shih-Chii

Topics Information, Computing and Communication Sciences , Engineering and Technology

Safe RL for Robot Social Navigation

Spinal Cord Injury & Artificial Intelligence Lab

Developing a constrained RL framework for social navigation, emphasizing explicit safety constraints to reduce reliance on reward tuning.

Keywords

Navigation, Robot Planning, Reinforcement Learning, RL, Social Navigation

Labels

Master Thesis

Description

This project proposes the development of a constrained reinforcement learning (RL) framework for social navigation, emphasizing explicit safety and collision avoidance constraints rather than relying solely on the reward function. By utilizing constrained RL, the approach seeks to eliminate the need for intricate reward tuning that often leads to overly conservative or risky behaviors. This work involves exploring various constraints based on collision detection, velocity limits, and geometric considerations like velocity obstacles.
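
For intuition, explicit constraints are often handled with a Lagrangian relaxation: the policy maximizes reward minus a learned penalty, while a dual variable rises whenever the measured cost exceeds its budget. A minimal sketch (hyperparameters and names are placeholders):

    class LagrangianConstraint:
        # Single constraint J_c(pi) <= budget, e.g. an episodic collision rate.
        def __init__(self, budget=0.1, lr_dual=0.05):
            self.lmbda, self.budget, self.lr_dual = 0.0, budget, lr_dual

        def dual_update(self, mean_episode_cost):
            self.lmbda = max(0.0, self.lmbda
                             + self.lr_dual * (mean_episode_cost - self.budget))

        def shaped_reward(self, r, c):
            # Handed to the underlying RL algorithm (e.g., PPO) in place of r.
            return r - self.lmbda * c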

Tasks

  • Perform a literature review on Constrained Reinforcement Learning.

  • Design a Constrained Reinforcement Learning Framework for Social Navigation.

  • Train and Evaluate RL Agents for Social Navigation.

  • Conduct Validation Experiments.

Requirements

  • Proficient programming skills in Python.

  • Solid understanding of RL algorithms, preferably constrained RL algorithms.

  • Experience with neural networks, including frameworks like PyTorch.

  • Familiarity with robotics, navigation algorithms, and path planning.

Goal

A safe navigation policy for social environments.

Contact Details

More information

Open this project... 

Published since: 2024-12-13 , Earliest start: 2025-01-01 , Latest end: 2025-12-31

Applications limited to ETH Zurich

Organization Spinal Cord Injury & Artificial Intelligence Lab

Hosts Alyassi Rashid

Topics Engineering and Technology

Lifelike Agility on ANYmal by Learning from Animals

ETH Competence Center - ETH AI Center

The remarkable agility of animals, characterized by their rapid, fluid movements and precise interaction with their environment, serves as an inspiration for advancements in legged robotics. Recent progress in the field has underscored the potential of learning-based methods for robot control. These methods streamline the development process by optimizing control mechanisms directly from sensory inputs to actuator outputs, often employing deep reinforcement learning (RL) algorithms. By training in simulated environments, these algorithms can develop locomotion skills that are subsequently transferred to physical robots.

Although this approach has led to significant achievements in robust locomotion, mimicking the wide range of agile capabilities observed in animals remains a significant challenge. Traditionally, manually crafted controllers have succeeded in replicating complex behaviors, but their development is labor-intensive and demands a high level of expertise in each specific skill. Reinforcement learning offers a promising alternative by potentially reducing the manual labor involved in controller development. However, crafting learning objectives that lead to the desired behaviors in robots also requires considerable expertise, specific to each skill.

Keywords

learning from demonstrations, imitation learning, reinforcement learning

Labels

Master Thesis

Description

Work packages

Literature research

Skill development from an animal dataset (available)

Hardware deployment

Requirements

Strong programming skills in Python

Experience in reinforcement learning and imitation learning frameworks

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.

Related literature

This project and the following literature will make you a master in imitation/demonstration/expert learning.

Peng, Xue Bin, et al. "DeepMimic: Example-guided deep reinforcement learning of physics-based character skills." ACM Transactions on Graphics (TOG) 37.4 (2018): 1-14.

Peng, X.B., Coumans, E., Zhang, T., Lee, T.W., Tan, J. and Levine, S., 2020. Learning agile robotic locomotion skills by imitating animals. arXiv preprint arXiv:2004.00784.

Peng, X.B., Ma, Z., Abbeel, P., Levine, S. and Kanazawa, A., 2021. AMP: Adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics (TOG), 40(4), pp.1-20.

Escontrela, A., Peng, X.B., Yu, W., Zhang, T., Iscen, A., Goldberg, K. and Abbeel, P., 2022, October. Adversarial motion priors make good substitutes for complex reward functions. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 25-32). IEEE.

Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.

Tessler, C., Kasten, Y., Guo, Y., Mannor, S., Chechik, G. and Peng, X.B., 2023, July. CALM: Conditional adversarial latent models for directable virtual characters. In ACM SIGGRAPH 2023 Conference Proceedings (pp. 1-9).

Starke, Sebastian, et al. "DeepPhase: Periodic autoencoders for learning motion phase manifolds." ACM Transactions on Graphics (TOG) 41.4 (2022): 1-13.

Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning."

Han, L., Zhu, Q., Sheng, J., Zhang, C., Li, T., Zhang, Y., Zhang, H., Liu, Y., Zhou, C., Zhao, R. and Li, J., 2023. Lifelike agility and play on quadrupedal robots using reinforcement learning and generative pre-trained models. arXiv preprint arXiv:2308.15143.

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

Victor Klemm

https://www.linkedin.com/in/vklemm/?originalSubdomain=ch

vklemm@ethz.ch

More information

Open this project... 

Published since: 2024-11-26

Organization ETH Competence Center - ETH AI Center

Hosts Li Chenhao , Klemm Victor

Topics Information, Computing and Communication Sciences

Pushing the Limit of Quadruped Running Speed with Autonomous Curriculum Learning

Robotic Systems Lab

The project aims to explore curriculum learning techniques to push the limits of quadruped running speed using reinforcement learning. By systematically designing and implementing curricula that guide the learning process, the project seeks to develop a quadruped controller capable of achieving the fastest possible forward locomotion. This involves not only optimizing the learning process but also ensuring the robustness and adaptability of the learned policies across various running conditions.

Keywords

curriculum learning, fast locomotion

Labels

Master Thesis

Description

Quadruped robots have shown remarkable versatility in navigating diverse terrains, demonstrating capabilities ranging from basic locomotion to complex maneuvers. However, achieving high-speed forward locomotion remains a challenging task due to the intricate dynamics and control requirements involved. Traditional reinforcement learning (RL) approaches have made significant strides in this area, but they often face issues related to sample efficiency, convergence speed, and stability when applied to tasks with high degrees of freedom like quadruped locomotion.

Curriculum learning (CL), a concept inspired by the way humans and animals learn progressively from simpler to more complex tasks, offers a promising solution to these challenges. In the context of reinforcement learning, curriculum learning involves structuring the learning process by starting with simpler tasks and gradually increasing the complexity as the agent's proficiency improves. This approach can lead to faster convergence and better generalization by enabling the agent to build foundational skills before tackling more difficult scenarios.
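
As a point of reference, the hand-crafted velocity curriculum that an autonomous method would be compared against can be as simple as this sketch (thresholds and step sizes are placeholders):

    import numpy as np

    class VelocityCurriculum:
        # Widen the commanded-velocity range whenever the policy tracks the
        # current range well enough.
        def __init__(self, v_max=1.0, v_cap=8.0, step=0.25, thresh=0.8):
            self.v_max, self.v_cap = v_max, v_cap
            self.step, self.thresh = step, thresh

        def update(self, tracking_success_rate):
            if tracking_success_rate > self.thresh:
                self.v_max = min(self.v_max + self.step, self.v_cap)

        def sample_command(self, rng):
            return rng.uniform(0.0, self.v_max)

    curriculum = VelocityCurriculum()
    rng = np.random.default_rng(0)
    cmd = curriculum.sample_command(rng)   # commanded forward speed for an episode

An autonomous curriculum would replace the fixed threshold rule with a measure of learning progress.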

Work packages

Literature research

Development of autonomous curriculum

Comparison with baselines (no curriculum, hand-crafted curriculum)

Requirements

Strong programming skills in Python

Experience in reinforcement learning

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.

Related literature

This project and the following literature will make you a master in curriculum/active/open-ended learning.

Oudeyer, P.Y., Kaplan, F. and Hafner, V.V., 2007. Intrinsic motivation systems for autonomous mental development. IEEE transactions on evolutionary computation, 11(2), pp.265-286.

Baranes, A. and Oudeyer, P.Y., 2009. R-IAC: Robust intrinsically motivated exploration and active learning. IEEE Transactions on Autonomous Mental Development, 1(3), pp.155-169.

Wang, R., Lehman, J., Clune, J. and Stanley, K.O., 2019. Paired open-ended trailblazer (POET): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint arXiv:1901.01753.

Pitis, S., Chan, H., Zhao, S., Stadie, B. and Ba, J., 2020, November. Maximum entropy gain exploration for long horizon multi-goal reinforcement learning. In International Conference on Machine Learning (pp. 7750-7761). PMLR.

Portelas, R., Colas, C., Hofmann, K. and Oudeyer, P.Y., 2020, May. Teacher algorithms for curriculum learning of deep rl in continuously parameterized environments. In Conference on Robot Learning (pp. 835-853). PMLR.

Margolis, G.B., Yang, G., Paigwar, K., Chen, T. and Agrawal, P., 2024. Rapid locomotion via reinforcement learning. The International Journal of Robotics Research, 43(4), pp.572-587.

Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

Marco Bagatella

https://marbaga.github.io/

mbagatella@ethz.ch

More information

Open this project... 

Published since: 2024-11-26

Organization Robotic Systems Lab

Hosts Li Chenhao , Bagatella Marco

Topics Engineering and Technology

Humanoid Locomotion Learning and Finetuning from Human Feedback

ETH Competence Center - ETH AI Center

In the burgeoning field of deep reinforcement learning (RL), agents autonomously develop complex behaviors through a process of trial and error. Yet, the application of RL across various domains faces notable hurdles, particularly in devising appropriate reward functions. Traditional approaches often resort to sparse rewards for simplicity, though these prove inadequate for training efficient agents. Consequently, real-world applications may necessitate elaborate setups, such as employing accelerometers for door interaction detection, thermal imaging for action recognition, or motion capture systems for precise object tracking.

Despite these advanced solutions, crafting an ideal reward function remains challenging due to the propensity of RL algorithms to exploit the reward system in unforeseen ways. Agents might fulfill objectives in unexpected manners, highlighting the complexity of encoding desired behaviors, like adherence to social norms, into a reward function.

An alternative strategy, imitation learning, circumvents the intricacies of reward engineering by having the agent learn through the emulation of expert behavior. However, acquiring a sufficient number of high-quality demonstrations for this purpose is often impractically costly.

Humans, in contrast, learn with remarkable autonomy, benefiting from intermittent guidance from educators who provide tailored feedback based on the learner's progress. This interactive learning model holds promise for artificial agents, offering a customized learning trajectory that mitigates reward exploitation without extensive reward function engineering. The challenge lies in ensuring the feedback process is both manageable for humans and rich enough to be effective. Despite its potential, the implementation of human-in-the-loop (HiL) RL remains limited in practice. Our research endeavors to significantly lessen the human labor involved in HiL learning, leveraging both unsupervised pre-training and preference-based learning to enhance agent development with minimal human intervention.

Keywords

reinforcement learning from human feedback, preference learning

Labels

Master Thesis

Description

Work packages

Literature research

Reinforcement learning from human feedback

Preference learning

Requirements

Strong programming skills in Python

Experience in reinforcement learning frameworks

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.

Related literature

Christiano, Paul F., et al. "Deep reinforcement learning from human preferences." Advances in neural information processing systems 30 (2017).

Lee, Kimin, Laura Smith, and Pieter Abbeel. "PEBBLE: Feedback-efficient interactive reinforcement learning via relabeling experience and unsupervised pre-training." arXiv preprint arXiv:2106.05091 (2021).

Wang, Xiaofei, et al. "Skill preferences: Learning to extract and execute robotic skills from human feedback." Conference on Robot Learning. PMLR, 2022.

Li, Chenhao, et al. "FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning." arXiv preprint arXiv:2402.13820 (2024).

Goal

The goal of the project is to learn and finetune humanoid locomotion policies using reinforcement learning from human feedback. The challenge lies in learning effective reward models from an efficient representation of motion clips, as opposed to single-state frames. The tentative pipeline works as follows:

  1. A self-supervised motion representation pretraining phase that learns efficient trajectory representations, potentially using Fourier Latent Dynamics, with data generated by some initial policies.

  2. Reward learning from human feedback, conditioned on the trajectory representation learned in the first step. Human preference from visualizing the motions is thus embedded in this latent trajectory representation.

  3. Policy training with the learned reward. The trajectories induced by the learned policy are used to augment the training set for the first two steps, and the process iterates.
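
A minimal sketch of the reward-learning objective typically used in step 2, the Bradley-Terry preference model of Christiano et al. (2017) applied to whole motion clips (names are placeholders):

    import torch
    import torch.nn.functional as F

    def preference_loss(r_hat_a, r_hat_b, prefer_a):
        # P(a preferred over b) = sigmoid(R(a) - R(b)), where r_hat_a and
        # r_hat_b are predicted rewards summed over each clip (here, computed
        # from the latent trajectory representation) and prefer_a is 1.0
        # where the human preferred clip a, else 0.0.
        return F.binary_cross_entropy_with_logits(r_hat_a - r_hat_b, prefer_a)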

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

Xin Chen

https://www.xccyn.com/

xin.chen@inf.ethz.ch

More information

Open this project... 

Published since: 2024-11-26

Organization ETH Competence Center - ETH AI Center

Hosts Li Chenhao , Li Chenhao , Chen Xin , Li Chenhao

Topics Information, Computing and Communication Sciences , Engineering and Technology

Online Safe Locomotion Learning in the Wild

ETH Competence Center - ETH AI Center

Reinforcement learning (RL) can potentially solve complex problems in a purely data-driven manner. Still, the state of the art in applying RL to robotics relies heavily on high-fidelity simulators. While learning in simulation circumvents the sample-complexity challenges common in model-free RL, even a slight distribution shift ("sim-to-real gap") between simulation and the real system can cause these algorithms to fail. Recent advances in model-based reinforcement learning have led to superior sample efficiency, enabling online learning without a simulator. Nonetheless, learning online must not cause any damage and should adhere to safety requirements (for obvious reasons). The proposed project aims to demonstrate how existing safe model-based RL methods can be used to address these challenges.

Keywords

safe model-based RL, online learning, legged robotics

Labels

Master Thesis

Description

The project aims to answer the following research questions:

How can safe locomotion tasks on a real robotic system be modeled as a constrained RL problem? Can we use existing methods, such as the one proposed by As et al. (2022), to safely learn effective locomotion policies?
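
One standard way to formalize the first question is as a constrained Markov decision process (the specific costs and budgets would be a design choice of the project):

    \max_{\pi} \; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
    \quad \text{s.t.} \quad
    \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \le d_i,
    \qquad i = 1, \dots, k,

where r is the locomotion reward and each cost c_i penalizes a safety-relevant event (for example, excessive contact forces or joint-limit violations) with budget d_i.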

Answering the above questions will encompass hands-on experience with a real robotic system (such as ANYmal) together with learning to implement and test cutting-edge RL methods. As RL on real hardware is not yet fully explored, we expect to unearth various challenges concerning the effectiveness of our methods in the online learning setting. Accordingly, an equally important goal of the project is to accurately identify these challenges and propose methodological improvements that can help address them.

A starting point would be to create a model of a typical locomotion task in Isaac Orbit as a proof-of-concept. Following that, the second part of the project will be dedicated to extending the proof-of-concept to a real system.

Contact Details

If you are a Master's student with

  • basic knowledge of reinforcement learning, for instance from the Probabilistic Artificial Intelligence or Foundations of Reinforcement Learning courses;
  • a strong background in robotics and programming (C++, ROS),

please reach out to Yarden As (yarden.as@inf.ethz.ch) or Chenhao Li (chenhao.li@inf.ethz.ch). Feel free to share any previous materials, such as public code that you wrote, that could be relevant in demonstrating the above requirements.

More information

Open this project... 

Published since: 2024-11-26

Organization ETH Competence Center - ETH AI Center

Hosts Li Chenhao

Topics Engineering and Technology

Autonomous Curriculum Learning for Increasingly Challenging Tasks

Robotic Systems Lab

While the history of machine learning so far largely comprises problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at various stages would become stepping stones towards solving even more challenging problems later in the process.

Consider the realm of legged locomotion: training a robot via reinforcement learning to track a velocity command illustrates this concept. Initially, tracking a low velocity is simpler due to algorithm initialization and environmental setup. By manually crafting a curriculum, we can start with low-velocity targets and incrementally increase them as the robot demonstrates competence. This method works well when the difficulty correlates clearly with the target, as with higher velocities or more challenging terrains.

However, challenges arise when the relationship between task difficulty and control parameters is unclear. For instance, if a parameter dictates various human dance styles for the robot to mimic, it is not obvious whether jazz is easier than hip-hop. In such scenarios, the difficulty distribution does not align with the control parameter. How, then, can we devise an effective curriculum?

In the conventional RSL training setting for locomotion over challenging terrains, a handcrafted learning schedule dictates increasingly hard terrain levels, but the schedule is unified across multiple terrain types. With a smart autonomous curriculum learning algorithm, can we master separate terrain types asynchronously and thus achieve better overall performance or higher data efficiency?
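
As a toy illustration of the asynchronous idea, here is a hypothetical sketch of a per-terrain-type curriculum in which each terrain type is promoted or demoted independently, based on the robot's recent success rate on that type alone. All names and thresholds are invented for exposition and are not a proposed design:

```python
# Hypothetical sketch: per-terrain-type curriculum that adapts asynchronously.
# Each terrain type keeps its own difficulty level; levels move up or down
# based on the robot's recent success rate on that type alone.
import random

class AsyncTerrainCurriculum:
    def __init__(self, terrain_types, num_levels=10,
                 promote_at=0.8, demote_at=0.4):
        self.levels = {t: 0 for t in terrain_types}
        self.num_levels = num_levels
        self.promote_at, self.demote_at = promote_at, demote_at

    def sample_task(self):
        """Pick a terrain type and return it with its current difficulty."""
        terrain = random.choice(list(self.levels))
        return terrain, self.levels[terrain]

    def update(self, terrain, success_rate):
        """Promote or demote only this terrain type, independently of others."""
        if success_rate > self.promote_at:
            self.levels[terrain] = min(self.levels[terrain] + 1,
                                       self.num_levels - 1)
        elif success_rate < self.demote_at:
            self.levels[terrain] = max(self.levels[terrain] - 1, 0)

# Usage: curriculum = AsyncTerrainCurriculum(["slope", "stairs", "gravel"])
```

The project would go beyond such success-rate heuristics, for example by estimating learning progress per task, but the sketch shows the key departure from a single shared schedule.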

Keywords

curriculum learning, open-ended learning, self-evolution, progressive task solving

Labels

Master Thesis

Description

Work packages

Literature research

Development of autonomous curriculum

Comparison with baselines (no curriculum, hand-crafted curriculum)

Requirements

Strong programming skills in Python

Experience in reinforcement learning

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.

Related literature

This project and the following literature will make you a master of curriculum/active/open-ended learning.

Oudeyer, P.Y., Kaplan, F. and Hafner, V.V., 2007. Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11(2), pp.265-286.

Baranes, A. and Oudeyer, P.Y., 2009. R-IAC: Robust intrinsically motivated exploration and active learning. IEEE Transactions on Autonomous Mental Development, 1(3), pp.155-169.

Wang, R., Lehman, J., Clune, J. and Stanley, K.O., 2019. Paired Open-Ended Trailblazer (POET): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint arXiv:1901.01753.

Pitis, S., Chan, H., Zhao, S., Stadie, B. and Ba, J., 2020, November. Maximum entropy gain exploration for long horizon multi-goal reinforcement learning. In International Conference on Machine Learning (pp. 7750-7761). PMLR.

Portelas, R., Colas, C., Hofmann, K. and Oudeyer, P.Y., 2020, May. Teacher algorithms for curriculum learning of deep RL in continuously parameterized environments. In Conference on Robot Learning (pp. 835-853). PMLR.

Margolis, G.B., Yang, G., Paigwar, K., Chen, T. and Agrawal, P., 2024. Rapid locomotion via reinforcement learning. The International Journal of Robotics Research, 43(4), pp.572-587.

Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

Marco Bagatella

https://marbaga.github.io/

mbagatella@ethz.ch

More information

Open this project... 

Published since: 2024-11-26

Organization Robotic Systems Lab

Hosts Li Chenhao , Bagatella Marco

Topics Engineering and Technology

Humanoid Locomotion Learning with Human Motion Priors

ETH Competence Center - ETH AI Center

Humanoid robots, designed to replicate human structure and behavior, have made significant strides in kinematics, dynamics, and control systems. Research aims to develop robots capable of performing tasks in human-centric settings, from simple object manipulation to navigating complex terrains.

Reinforcement learning (RL) has proven to be a powerful method for enabling robots to learn from their environment, enhancing their performance over time without explicit programming for every possible scenario. In humanoid robotics, RL is used to optimize control policies, adapt to new tasks, and improve the efficiency and safety of human-robot interactions. However, one of the primary challenges is the high dimensionality of the action space, where handcrafted reward functions fall short of generating natural, lifelike motions.

Incorporating motion priors into the learning process addresses these challenges effectively. Motion priors can significantly reduce the exploration space in RL, leading to faster convergence and reduced training time. They ensure that learned policies prioritize stability and safety, reducing the risk of unpredictable or hazardous actions. Additionally, they guide the learning process towards more natural, human-like movements, improving the robot's ability to perform tasks intuitively and seamlessly in human environments. Motion priors are therefore crucial for efficient, stable, and realistic humanoid locomotion learning, enabling robots to better navigate and interact with the world around them.
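
As one concrete way to inject a motion prior, the sketch below follows the spirit of AMP (Peng et al., 2021, listed in the literature below): a discriminator is trained to distinguish reference motion-capture transitions from policy transitions, and its output defines a style reward added to the task reward. PyTorch is assumed, and the network and state dimensions are placeholders:

```python
# Minimal AMP-style sketch: a discriminator scores (state, next_state)
# transitions, and its output becomes a style reward that pushes the policy
# toward human-like motion. Dimensions are hypothetical.
import torch
import torch.nn as nn

class MotionDiscriminator(nn.Module):
    def __init__(self, state_dim: int = 48, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, s, s_next):
        return self.net(torch.cat([s, s_next], dim=-1)).squeeze(-1)

def style_reward(disc, s, s_next):
    """Reward from discriminator scores, as in least-squares AMP variants:
    clipped so confidently 'fake' transitions still receive zero, not
    negative, reward. The discriminator itself is trained on reference vs.
    policy transitions with a least-squares objective (omitted here)."""
    with torch.no_grad():
        d = disc(s, s_next)
        return torch.clamp(1.0 - 0.25 * (d - 1.0) ** 2, min=0.0)
```

This project would build on such priors with structured skill representations (e.g., FLD-style latent spaces) rather than the raw adversarial objective alone.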

Keywords

motion priors, humanoid, reinforcement learning, representation learning

Labels

Master Thesis

Description

Work packages

Literature research

Human motion capture and retargeting

Skill space development

Hardware validation encouraged upon availability

Requirements

Strong programming skills in Python

Experience in reinforcement learning and imitation learning frameworks

Publication

This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.

Related literature

Peng, X.B., Abbeel, P., Levine, S. and van de Panne, M., 2018. DeepMimic: Example-guided deep reinforcement learning of physics-based character skills. ACM Transactions on Graphics (TOG), 37(4), pp.1-14.

Peng, X.B., Coumans, E., Zhang, T., Lee, T.W., Tan, J. and Levine, S., 2020. Learning agile robotic locomotion skills by imitating animals. arXiv preprint arXiv:2004.00784.

Peng, X.B., Ma, Z., Abbeel, P., Levine, S. and Kanazawa, A., 2021. AMP: Adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics (TOG), 40(4), pp.1-20.

Escontrela, A., Peng, X.B., Yu, W., Zhang, T., Iscen, A., Goldberg, K. and Abbeel, P., 2022, October. Adversarial motion priors make good substitutes for complex reward functions. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 25-32). IEEE.

Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F. and Martius, G., 2023, March. Learning agile skills via adversarial imitation of rough partial demonstrations. In Conference on Robot Learning (pp. 342-352). PMLR.

Tessler, C., Kasten, Y., Guo, Y., Mannor, S., Chechik, G. and Peng, X.B., 2023, July. CALM: Conditional adversarial latent models for directable virtual characters. In ACM SIGGRAPH 2023 Conference Proceedings (pp. 1-9).

Starke, S., Mason, I. and Komura, T., 2022. DeepPhase: Periodic autoencoders for learning motion phase manifolds. ACM Transactions on Graphics (TOG), 41(4), pp.1-13.

Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.

Han, L., Zhu, Q., Sheng, J., Zhang, C., Li, T., Zhang, Y., Zhang, H., Liu, Y., Zhou, C., Zhao, R. and Li, J., 2023. Lifelike agility and play on quadrupedal robots using reinforcement learning and generative pre-trained models. arXiv preprint arXiv:2308.15143.

Contact Details

Please include your CV and transcript in the submission.

Chenhao Li

https://breadli428.github.io/

chenhli@ethz.ch

More information

Open this project... 

Published since: 2024-11-26

Organization ETH Competence Center - ETH AI Center

Hosts Li Chenhao

Topics Information, Computing and Communication Sciences