A worm that uses clever prompt engineering and injection is able to trick generative AI (GenAI) apps like ChatGPT into propagating malware and more.
In a laboratory setting, three Israeli researchers demonstrated how an attacker could design "adversarial self-replicating prompts" that convince a generative model into replicating input as output – if a malicious prompt comes in, the model will turn around and push it back out, allowing it to spread to further AI agents. The prompts can be used for stealing information, spreading spam, poisoning models, and more.