Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files

Original link: https://alexschapiro.com/security/vulnerability/2025/12/02/filevine-api-100k

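## Filevine Vulnerability Disclosure - Summary

In October 2025, a security researcher discovered a critical vulnerability in Filevine, a fast-growing legal-tech platform valued at over a billion dollars. Subdomain enumeration turned up an exposed subdomain (“margolis.filevine.com”). Digging through that site's JavaScript revealed an API endpoint that required no authentication; querying it returned a fully scoped admin token for a law firm's Box filesystem.

The token granted access to *all* confidential data in the law firm's Box account, including client files, internal documents, and sensitive information protected by HIPAA and court orders. The researcher stopped testing immediately and responsibly disclosed the issue to Filevine on October 27, 2025.

Filevine's security team responded promptly and professionally, and confirmed on November 21, 2025 that the vulnerability had been fixed. Following coordinated disclosure, a technical blog post describing the vulnerability in detail was published on December 3, 2025. The case underscores how important sound security practices are for AI-driven legal tools that handle highly sensitive data.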


Original article

Timeline & Responsible Disclosure

Initial Contact: Upon discovering this vulnerability on October 27, 2025, I immediately reached out to Filevine’s security team via email.

November 4, 2025: Filevine’s security team thanked me for the writeup and confirmed they would review the vulnerability and fix it quickly.

November 20, 2025: I followed up to confirm the patch was in place from my end, and informed them of my intention to write a technical blog post.

November 21, 2025: Filevine confirmed the issue was resolved and thanked me for responsibly reporting it.

Publication: December 3, 2025.

The Filevine team was responsive, professional, and took the findings seriously throughout the disclosure process. They acknowledged the severity, worked to remediate the issues, allowed responsible disclosure, and maintained clear communication. This is another great example of how organizations should handle security disclosures.


AI legal-tech companies are exploding in value, and Filevine, now valued at over a billion dollars, is one of the fastest-growing platforms in the space. Law firms feed tools like this enormous amounts of highly confidential information.

Because I’d recently been working with Yale Law School on a related project, I decided to take a closer look at how Filevine handles data security. What I discovered should concern every legal professional using AI systems today.

When I first navigated to the site to see how it worked, it seemed that I needed to be part of a law firm to actually play around with the tooling, or request an official demo. However, I know that companies often have a demo environment that is open, so I used a technique called subdomain enumeration (which I had first heard about in Gal Nagli’s article last year) to see if there was a demo environment. I found something much more interesting instead.
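The post does not say which tool or wordlist was used for subdomain enumeration. As a rough sketch of the idea only, the Python below tries a list of candidate labels against a domain and keeps the ones that resolve in DNS. The function names, the injectable `resolver` parameter, and any wordlist you pass in are assumptions made for illustration, not the author's method.

```python
import socket
from typing import Callable, Iterable, List


def resolves(hostname: str) -> bool:
    """Return True if the hostname resolves to an address via DNS."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def enumerate_subdomains(
    domain: str,
    labels: Iterable[str],
    resolver: Callable[[str], bool] = resolves,
) -> List[str]:
    """Build label.domain candidates and keep those the resolver accepts."""
    found = []
    for label in labels:
        candidate = f"{label}.{domain}"
        if resolver(candidate):
            found.append(candidate)
    return found
```

Real enumeration tools usually combine wordlists like this with other sources such as certificate transparency logs; this sketch covers only the brute-force DNS part.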

I saw a subdomain called margolis.filevine.com. When I navigated to that site, I was greeted with a loading page that never resolved:

Loading page screenshot placeholder

I wanted to see what was actually loading, so I opened Chrome’s developer tools, but saw no Fetch/XHR requests (the requests you would expect to see if a page is loading data). So I decided to dig through some of the JavaScript files to figure out what was supposed to be happening. In one JS file I saw a snippet like a POST made with await fetch(`${BOX_SERVICE}/recommend`). This piqued my interest: recommend what? And what is BOX_SERVICE? That variable was not defined in the JS file the fetch would be called from, but (after looking through minified code, which SUCKS to do) I found it in another one: “dxxxxxx9.execute-api.us-west-2.amazonaws.com/prod”. Now I had a new endpoint to test; I just had to figure out the payload structure it expected. After reading more minified JS, I was able to construct a working payload for /prod/recommend:
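The search through minified bundles was done by hand in the post. A small script can narrow down where to look; the sketch below is one way to do that, assuming the bundle text has already been downloaded as a string. Both regular expressions are illustrative guesses at the patterns described above (a fetch call on a template literal, and an AWS API Gateway hostname), not something taken from the original article.

```python
import re
from typing import List, Tuple

# Matches fetch(`${SOME_VAR}/some/path` ... and captures the variable and the path.
FETCH_TEMPLATE = re.compile(r"fetch\(\s*`\$\{(\w+)\}(/[\w/-]*)`")

# Matches AWS API Gateway hosts such as <id>.execute-api.<region>.amazonaws.com/<stage>.
API_GATEWAY_HOST = re.compile(
    r"[\w-]+\.execute-api\.[\w-]+\.amazonaws\.com(?:/[\w-]+)*"
)


def find_fetch_targets(js_source: str) -> List[Tuple[str, str]]:
    """Return (variable_name, path) pairs for template-literal fetch calls."""
    return FETCH_TEMPLATE.findall(js_source)


def find_api_gateway_hosts(js_source: str) -> List[str]:
    """Return the distinct API Gateway hosts (with stage path) found in the source."""
    return sorted(set(API_GATEWAY_HOST.findall(js_source)))
```

Running `find_fetch_targets` on one bundle and `find_api_gateway_hosts` on the others mirrors the two-step search described above: first spot the call, then find where its base URL is defined.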

{"projectName":"Very sensitive Project"}

(the name could be anything of course). No authorization tokens needed, and I was greeted with the response:
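To make "no authorization tokens needed" concrete, here is a minimal sketch of what such a request looks like when built with Python's standard library. The function name, the `Content-Type` header, and the `https://` scheme are assumptions; the post only gives the redacted host, the `/prod/recommend` path, and the `projectName` payload. The sketch builds the request object without sending it.

```python
import json
import urllib.request


def build_recommend_request(base_url: str, project_name: str) -> urllib.request.Request:
    """Build (but do not send) a POST to <base_url>/recommend with a JSON body.

    Note that no Authorization header is set anywhere: per the post, the
    endpoint answered without one.
    """
    body = json.dumps({"projectName": project_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/recommend",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not from the post
        method="POST",
    )
```

An endpoint that hands back storage credentials would normally reject a request shaped like this outright, since it carries nothing that identifies the caller.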

Response screenshot placeholder

At first I didn’t entirely understand the impact of what I saw. No matter what project name I passed in, I was recommended the same boxFolders and couldn’t seem to access any files. Then, still not realizing I had stumbled upon something massive, I turned my attention to the boxToken in the response.

After reading some of the Box API documentation, I realized this was a maximum-access, fully scoped admin token for this law firm’s entire Box filesystem (think of an internal shared Google Drive). That includes all confidential files, logs, user information, etc. Once I had proven the impact (by searching for “confidential” and getting nearly 100k results back), I immediately stopped testing and responsibly disclosed this to Filevine. They responded quickly and professionally and remediated the issue.

Search results placeholder
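The post does not say which Box API call was used for the “confidential” search. One plausible candidate is Box's search endpoint (`GET https://api.box.com/2.0/search`), which takes a `query` parameter and a bearer token and reports a `total_count` in its JSON response. The sketch below builds such a request and reads the count from an already-parsed response; the function names are hypothetical and the request is never sent.

```python
import urllib.parse
import urllib.request

BOX_SEARCH_URL = "https://api.box.com/2.0/search"


def build_box_search_request(token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a Box search request authorized by a bearer token."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url=f"{BOX_SEARCH_URL}?{params}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def total_results(response_body: dict) -> int:
    """Read the number of matches from a parsed search response (0 if absent)."""
    return int(response_body.get("total_count", 0))
```

A single count like this is enough to demonstrate impact without opening or downloading any file, which matches the author's decision to stop testing at that point.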

If someone had malicious intent, they would have been able to extract every single file used by Margolis lawyers: countless records protected by HIPAA and other legal standards, internal memos and payroll data, literally millions of the most sensitive documents this law firm has in its possession. Documents protected by court orders! This could have been a real nightmare for both the law firm and the clients whose data would have been exposed.

To companies who feel pressure to rush into the AI craze in their industry: be careful! Always make sure the companies you hand your most sensitive information to actually secure that data.

