A New Self-Spreading, Zero-Click Gen AI Worm Has Arrived!

Over the past year, developers of collaborative software like email and chat have been in a rush to integrate emerging generative artificial intelligence (AI) technology into their platforms. The frontrunners in this competition aim to seamlessly merge commonplace office tools with AI assistants, aiming to revolutionize the user experience.

However, security researchers have raised another red flag for the industry. They’ve devised a zero-click, self-spreading worm capable of pilfering personal data through applications utilizing AI-driven chatbots.

Named Morris II as a homage to the notorious computer worm that wreaked havoc on a significant portion of the internet in 1988, this new malware spreads autonomously, exploiting a prompt injection attack vector to deceive AI-powered email assistant apps incorporating chatbots like ChatGPT and Gemini. According to researchers from Cornell University, Technion-Israel Institute of Technology, and Intuit, this enables hackers to infiltrate victim emails, siphon personal information, launch spam campaigns, and corrupt AI models.

Crucially, victims don’t need to interact with anything to trigger the malicious activity. The worm autonomously proliferates the malware without human intervention. Hackers leverage the worm to craft “adversarial self-spreading prompts” capable of persuading a generative model to reproduce input as output. For instance, if the AI model encounters a malicious prompt, it regenerates it in its output and disseminates it to applications interfacing with it.

The researchers noted, “The study demonstrates that attackers can insert such prompts into inputs that, when processed by gen AI models, prompt the model to replicate the input as output (replication) and engage in malicious activities (payload).”

Morris II targets the current ecosystem comprising interconnected networks of generative AI-powered agents interacting with each other to autonomously execute tasks. Companies integrate AI capabilities into existing applications, linking them to remote or cloud-based generative AI models, aiming to create a “smart agent” capable of interpreting and executing complex inputs. These services generate outputs that act semi-automatically with human approval or fully automatically.

Generative AI-powered email assistants aid in responding to emails, forwarding relevant ones, marking spam, and auto-generating replies by analyzing email contents based on predefined rules and past interactions with senders. The AI email assistant market is projected to reach nearly $2 billion by 2032, underscoring the urgency for the industry to address these risks, the researchers emphasized.

In their demonstration, the researchers set up an email system capable of sending and receiving emails using generative AI. They composed an email containing a prompt triggering the system to utilize retrieval-augmented generation (RAG), a method AI models use to retrieve trusted external data, to contaminate the recipient email assistant’s database. The system retrieved and sent it to the AI model, where the worm exploited a jailbreak technique to extract sensitive data and reproduce the input as output, perpetuating the cycle across linked AI hosts.

The adversarial prompt isn’t confined to text; researchers achieved similar results by embedding the malware into an image attached to an email, compelling the AI-powered email assistant to forward the malicious image to other AI hosts.

Although the current usage of generative AI systems is limited, it’s expected to expand as tech companies intensify efforts to integrate these capabilities into existing products, forming generative AI ecosystems. “Due to this fact, we expect the exposure to the attack will be increased significantly in the next few years,” the researchers cautioned.

Currently, humans can detect and prevent the propagation of the adversarial self-spreading prompt and payload. However, relying on humans to patch system vulnerabilities is deemed unsound, especially as fully autonomous generative AI ecosystems evolve.

For now, companies can safeguard their outputs to ensure they don’t mirror the input or yield identical inferences. This mechanism can be implemented in the agent or the generative AI server. Countermeasures against jailbreaking can also thwart attackers from replicating input into output using known techniques.

The researchers conducted their experiment solely in a laboratory setting and shared their findings with prominent generative AI chatbot developers Google and OpenAI before making the study public.

Leave a Comment

Your email address will not be published. Required fields are marked *