Joining forces to reveal and address the risks of Generative AI

The Swiss Call for Trust & Transparency has today launched a Pilot Gen AI Redteaming Network. The network unites all stakeholders - tech companies and public research institutions alike - to work collectively on disclosing, replicating, and mitigating the most urgent safety issues of generative AI systems. As of mid January 2024, 12 major tech companies have committed to joining forces with the network, thereby significantly advancing AI safety.

16.01.2024 by Helga Rietz-Pankoke

Read
Number of comments

Large language models (LLMs) are natural language processing programs that use artificial neural networks to generate written responses. They pose some risks that are not yet fully explored, which include (1) Potential biased, inaccurate content, depending on the data they were trained on; (2) Vulnerability to abuse, such as being used to create custom malware; (3) Potential legal issues, such as copyright violation; and (4) Potential behavioral issues, such as providing harmful advice. These issues raise ethical challenges that need addressing to make LLMs beneficial and equitable and to enable their wider adoption

Tech companies working on LLMs are making marked efforts to assess and manage risks. After all, they, too, are invested in gaining the public's trust and driving forward wide, safe adoption of their products. However, these attempts are individual to each company and are therefore fragmented. Another issue is that users cannot always ascertain whether an AI system has been tested or verified.

“Securing AI systems is a team sport,” says Catrin Hinkel, CEO of Microsoft Switzerland. “At Microsoft, we firmly believe that when you create powerful technologies, you also must ensure the technology is developed and used responsibly. We are committed to a practice of responsible AI by design, guided by a core set of principles: fairness, reliability and safety, privacy and security, inclusiveness, transparency and accountability.”

Exploring and disclosing threats for the benefit of all users

This is where the Swiss Call for Trust and Transparency in AI comes in, a joint initiative of the Swiss Foreign Ministry and ETH AI Center. As a cornerstone of their work, the initiators have launched today, at AI House Davos, a Risk Exploration and Mitigation Network that will investigate from both the attacker's and the defender's perspective and share its findings with all participants.

The academic lead is shared between external page Florian Tramèr, Professor at ETH Zurich, where he leads the Secure and Private AI Lab, and an associated faculty member of ETH AI Center; and external page Carmela Troncoso, Associate Professor at the Security and Privacy Engineering Laboratory of EPFL. The coordination of the efforts is overseen by Alexander Ilic, Executive Direct of ETH AI Center.

As of mid-January 2024, 12 major tech companies have committed to taking part in the network. Those are: Aleph Alpha, appliedAI Institute for Europe, AWS, Cohere, Hugging Face, IBM, Microsoft, Roche, SAP, Swisscom, The Global Fund, and Zurich Insurance Group.

"Safe and responsible AI is a must for a data driven and scaled organization like Zurich Insurance Group", says Ericson Chan, Group Chief Information & Digital Officer at Zurich Insurance Group. "It is critical for us to work together, from Gen AI model training to inferencing, so we continue to inspire Digital Trust at the dawn of this hyper-innovation era."

Towards effective testing and regulation

As a fully transparent system, this red-teaming network will allow all stakeholders – including tech companies and public research institutions – to work collectively on disclosing and mitigating the most urgent issues. Those efforts will also aid regulators in developing effective, standardized AI testing.

“Looking at AI systems with the mindset of a bad actor tells us not just how to secure AI but also how to make the digital space as a whole more resilient.” said Sebastian Hallensleben, Co-Chair for AI Risk & Accountability at OECD ONE.AI, during the launch event today in Davos.

Companies in the network will share scenarios and threat models for AI models with researchers, so that they can be tested and attacked to reveal potential vulnerabilities. Results will be shared first within the group, so that the participants can work on mitigations before they are disclosed in public. All results will be fed into a database of attack vectors and mitigation strategies, thereby fostering collaboration and knowledge-sharing among all stakeholders.

“AI is crucial for the world, that’s clear – the technology’s benefits have been paramount. But we as a society cannot overlook potential risks. They need to be mitigated from the get-go, and for that we need openness and transparency. We at IBM welcome the creation of the Risk Exploration and Mitigation Network for Generative AI, which will be a great complement to other initiatives, such as the recently launched AI Alliance, to ensure AI is developed and deployed safely.” said Alessandro Curioni, IBM Research VP of Europe and Africa and the Director of IBM Research Europe – Zurich.

Joining forces to reveal and address the risks of Generative AI

Exploring and disclosing threats for the benefit of all users

Towards effective testing and regulation

For further information: