OpenAI has unveiled a new feature called Lockdown Mode aimed at protecting users' chat sessions from certain types of attacks. The company says this mode is designed to safeguard against what it calls "prompt injection attacks," which are when an attacker uses external information to trick AI models into responding in specific ways.
While OpenAI claims that even with Lockdown Mode activated, the risk of prompt injection remains high, researchers argue that reducing the likelihood of such occurrences is a worthwhile effort. Prompt injections can be particularly problematic for sensitive data, as they allow attackers to manipulate responses and potentially compromise user privacy. By introducing a more secure way to interact with AI models, OpenAI aims to minimize the risk of sensitive information being leaked.
The company says that Lockdown Mode works by limiting how easily external information is passed to an AI model during a conversation. This helps to prevent attackers from using pre-computed "cheat codes" or other methods that can be used to inject specific prompts into chat sessions. While some users may still be able to exploit vulnerabilities in the system, researchers say that Lockdown Mode provides a more robust defense against prompt injection attacks than OpenAI's current model.