LLM Privacy

Privacy Leak Demo

Several kinds of 'jailbreaking' prompts have been discovered that can cause LLMs to output information that should be kept private, like phone numbers, addresses, medical information about individuals, or potentially dangerous information. The developers of LLMs have worked to patch these vulnerabilities as they are discovered, but these patches do not always work reliably. Below we show examples of the types of prompts that have in the past led to privacy leaks, and the responses that popular LLMs give to these prompts.

We chose to use public figures in these prompts so as not reveal information about private citizens. In some cases, the privacy patches work, and the LLM refuses to answer, however, in other cases, answers are given, showing that the patches are not always effective. The answers the LLMs give should not be taken as factual. For instance, "Pop Star Disease" does not exist.

Select any of the prompts below