LLM对其用户的了解

LLM对其用户的了解
What LLMs Know About Their Users

数据完整性至关重要，不仅限于防止篡改，还包括确保数据在整个生命周期中的准确性。漏洞可能是恶意的或意外的，影响从基本计算机功能到复杂人工智能系统的一切。作者强调了针对人工智能的完整性攻击，例如操纵路标来欺骗驾驶系统或使用快速注射。随着我们越来越依赖人工智能和Web 3.0等数据驱动技术，数据完整性变得至关重要。强调需要“完整”的系统，确保数据的可信度和准确性。这需要对测试和测量完整性、构建可验证的传感器、设计完整的数据处理单元以及从漏洞中恢复进行研究。核心问题变成了：我们能否建立一个能够抵御完整性故障的完整网络，就像我们如何解决可用性和机密性一样？作者主张普及“诚信”一词，以促进这一重要领域的讨论和研究。

This Hacker News thread discusses an article on what LLMs know about their users. Users are concerned about LLMs collecting personal data and potentially misusing it. One user points out the difference between ChatGPT's old memory feature (viewing and deleting individual memories) and the new one (automatic summarization of past conversations). Some users have tested LLMs, like Grok and Mistral.ai, to see what personal information they retain. While some deny knowing user data without explicit sharing, others, like ChatGPT, might reveal location implicitly. There's a debate on whether LLMs knowing user information is genuinely useful or a privacy concern. Some suggest solutions like using privacy-focused tools like Duck.ai or Kagi's Assistants, which are designed not to retain user data. Others advocate for greater transparency and user control over data collected by companies like OpenAI. Some users are mindful of the profile LLMs build based on their interactions, drawing parallels to managing a public social media presence.

原文

We need to talk about data integrity.

Narrowly, the term refers to ensuring that data isn’t tampered with, either in transit or in storage. Manipulating account balances in bank databases, removing entries from criminal records, and murder by removing notations about allergies from medical records are all integrity attacks.

More broadly, integrity refers to ensuring that data is correct and accurate from the point it is collected, through all the ways it is used, modified, transformed, and eventually deleted. Integrity-related incidents include malicious actions, but also inadvertent mistakes.

We tend not to think of them this way, but we have many primitive integrity measures built into our computer systems. The reboot process, which returns a computer to a known good state, is an integrity measure. The undo button is another integrity measure. Any of our systems that detect hard drive errors, file corruption, or dropped internet packets are integrity measures.

Just as a website leaving personal data exposed even if no one accessed it counts as a privacy breach, a system that fails to guarantee the accuracy of its data counts as an integrity breach – even if no one deliberately manipulated that data.

Integrity has always been important, but as we start using massive amounts of data to both train and operate AI systems, data integrity will become more critical than ever.

Most of the attacks against AI systems are integrity attacks. Affixing small stickers on road signs to fool AI driving systems is an integrity violation. Prompt injection attacks are another integrity violation. In both cases, the AI model can’t distinguish between legitimate data and malicious input: visual in the first case, text instructions in the second. Even worse, the AI model can’t distinguish between legitimate data and malicious commands.

Any attacks that manipulate the training data, the model, the input, the output, or the feedback from the interaction back into the model is an integrity violation. If you’re building an AI system, integrity is your biggest security problem. And it’s one we’re going to need to think about, talk about, and figure out how to solve.

Web 3.0 – the distributed, decentralized, intelligent web of tomorrow – is all about data integrity. It’s not just AI. Verifiable, trustworthy, accurate data and computation are necessary parts of cloud computing, peer-to-peer social networking, and distributed data storage. Imagine a world of driverless cars, where the cars communicate with each other about their intentions and road conditions. That doesn’t work without integrity. And neither does a smart power grid, or reliable mesh networking. There are no trustworthy AI agents without integrity.

We’re going to have to solve a small language problem first, though. Confidentiality is to confidential, and availability is to available, as integrity is to what? The analogous word is “integrous,” but that’s such an obscure word that it’s not in the Merriam-Webster dictionary, even in its unabridged version. I propose that we re-popularize the word, starting here.

We need research into integrous system design.

We need research into a series of hard problems that encompass both data and computational integrity. How do we test and measure integrity? How do we build verifiable sensors with auditable system outputs? How to we build integrous data processing units? How do we recover from an integrity breach? These are just a few of the questions we will need to answer once we start poking around at integrity.

There are deep questions here, deep as the internet. Back in the 1960s, the internet was designed to answer a basic security question: Can we build an available network in a world of availability failures? More recently, we turned to the question of privacy: Can we build a confidential network in a world of confidentiality failures? I propose that the current version of this question needs to be this: Can we build an integrous network in a world of integrity failures? Like the two version of this question that came before: the answer isn’t obviously “yes,” but it’s not obviously “no,” either.

Let’s start thinking about integrous system design. And let’s start using the word in conversation. The more we use it, the less weird it will sound. And, who knows, maybe someday the American Dialect Society will choose it as the word of the year.

This essay was originally published in IEEE Security & Privacy.

Tags: AI, LLM

Posted on June 27, 2025 at 7:02 AM • 17 Comments

LLM对其用户的了解 What LLMs Know About Their Users

LLM对其用户的了解
What LLMs Know About Their Users