AI chatbots trained to break out other chatbots.

General
AI chatbots trained to break out other chatbots.

As the ethics of AI continue to be a hot-button issue, and as corporations and world governments continue to wrestle with the moral implications of a technology that they often struggle to define, let alone control, a bit of discouraging news has come out: AI chatbots are already being used by other trained to break chatbots out of jail, and they seem to be surprisingly good at it.

Researchers at Nanyang Technological University in Singapore have successfully compromised several popular chatbots, including ChatGPT, Google Bard and Microsoft Bing Chat (via Tom's Hardware), all using different LLM (Large Language model) was used to do this. Once effectively compromised, the jailbroken bot could then use "replies under a personality of lack of moral restraint." Crikey.

This process is called "master keying," and in its most basic form it boils down to a two-step method. First, a trained AI outwits existing chatbots and reverse engineers a database of prompts that have already proven successful in hacking chatbots to avoid blacklisted keywords. Armed with this knowledge, the AI can then automatically generate prompts to further jailbreak other chatbots.

Finally, this method allows attackers to use compromised chatbots to generate unethical content and is three times more effective at jailbreaking LLM models than standard prompts, mainly because AI attackers can learn quickly and adapt from their mistakes It has been claimed to be up to 3 times more effective.

Noticing the effectiveness of this technique, NTU researchers reported the problem to the relevant chatbot service provider, but given that it is believed that this technique can quickly adapt and bypass new processes designed to defeat it, it is unclear how easy it would be for said provider to prevent such an It remains unclear how easy it would be to prevent the attack.

The full NTU research paper will be presented at the upcoming Network and Distributed System Security Symposium in San Diego in February 2024.

Be that as it may, using AI to circumvent the moral and ethical constraints of another AI seems like a step in a somewhat frightening direction. The fractal nature of pitting LLMs against each other, as well as the ethical issues raised by chatbots creating abusive or violent content, such as Microsoft's infamous "Tay," is enough to make one pause for thought.

While we as a species seem to be hurtling headlong into an AI future that we sometimes struggle to understand, the possibility of technology being used against itself for malicious purposes seems to be an ever growing threat.

Categories