Microsoft explains how it’s Red Teaming GPT-4 and other AI models


How harmful and dangerous is AI? One way to find out is through “red teaming”.

Red teaming is a strategy used in many fields, including AI development. Basically, the “red team” is an independent group that tries to investigate or deliberately infiltrate the system, project, process, or whatever for vulnerabilities. The goal is to make the system more secure.

AI systems can also have such vulnerabilities or exhibit unexpected or undesirable behavior. This is where red teaming comes in: a red team in AI development acts as a kind of “independent auditor”. It tests the AI, trying to manipulate it or find flaws in its processes, preferably before the system is deployed in a real environment.

OpenAI says it invested more than six months in red teaming GPT-4 and using the results to improve the model. According to the test results, the unfiltered GPT-4 was able to detail cyberattacks on military systems, for example.


Model and system level: Microsoft uses two-tier Red Teaming

Microsoft uses Red Teaming to investigate large foundational models, such as GPT-4, as well as at the application level, such as Bing Chat, which accesses GPT-4 with additional functionality. These investigations influence the development of the models and the systems through which users interact with the models, Microsoft says.

The tech giant says it has expanded its Red Team for AI and is committed to responsible AI in addition to security. With generative AI, Microsoft says there are two types of risks: intentional manipulation, which is the exploitation of security vulnerabilities by users with malicious intent, but also security risks that arise from the normal use of large language models, such as the generation of false information.

Microsoft cites Bing Chat, of all things, as an example of extensive red-teaming. This seems odd, since Bing Chat intentionally went live in an unsafe version and generated abusive responses. So much so that Microsoft initially had to reduce the number of chats not long after launch.

If anything, Bing Chat didn’t seem to have undergone extensive security testing, and OpenAI reportedly warned Microsoft not to launch Bing Chat prematurely. But Microsoft didn’t care because ChatGPT was already on its way to the moon.

AI needs more than your standard red team

Another challenge for AI red-teaming, according to Microsoft: Traditional red-teaming is deterministic-the same input produces the same output. AI red-teaming, on the other hand, has to work with probabilities.


red teaming large language models on its Azure Learning Platform.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top