Shai-Hulud Themed Malware Found in the PyTorch Lightning AI Training Library

Original link: https://semgrep.dev/blog/2026/malicious-dependency-in-pytorch-lightning-used-for-ai-training/

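## PyPI 'lightning' package compromised in supply chain attack

Versions 2.6.2 and 2.6.3 of the popular deep learning package 'lightning' were found to be compromised in a supply chain attack on April 30, 2026. This affects users of frameworks for image classification, LLMs, diffusion models, and time-series forecasting. Installing via `pip install lightning` is enough to activate the malicious code. The attack is attributed to the same actor behind the "Mini Shai-Hulud" campaign; it installs an obfuscated JavaScript payload through a hidden `_runtime` directory. The malware steals credentials (including cloud secrets, authentication tokens, and environment variables) and attempts to poison GitHub repositories, creating repositories with "Shai-Hulud"-themed names. It exfiltrates data through HTTPS POST requests, GitHub commit search, and newly created public GitHub repositories. It also uses compromised GitHub tokens to inject malicious code into victims' repositories and spreads through npm. It targets local files, CI/CD pipelines, and all major cloud providers (AWS, Azure, GCP).

**Key indicators of compromise:** commits prefixed with "EveryBoiWeBuildIsAWormyBoi", GitHub repositories with the description "A Mini Shai-Hulud has Appeared", and the presence of specific files and directories (`_runtime`, `.claude/`, `.vscode/`). Affected systems should be treated as fully compromised, requiring credential rotation and a thorough audit.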

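## PyTorch Lightning supply chain attack

The PyTorch Lightning AI training library recently suffered a supply chain attack in which implanted malware steals credentials, authentication tokens, and cloud secrets and attempts to poison GitHub repositories. The attackers used leaked PyPI credentials to publish the malicious versions (2.6.2 and 2.6.3) rather than compromising the main GitHub repository. The incident highlights the growing trend of supply chain attacks, which are increasing in both frequency and value. Discussion focused on whether detection capabilities can keep pace with the rising threat, and on the lack of accessible security tooling as projects grow from hobby code into widely used dependencies. Many commenters pointed to systemic problems in the Python ecosystem as contributing factors, including heavy reliance on large numbers of dependencies, weak dependency management, and insufficient security measures. Proposed solutions included stricter dependency pinning, sandboxing, improved package verification, and better tools for detecting malicious code. The incident also sparked debate about the responsibility of large companies and the need for more proactive security measures within open source projects.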

Original text

The PyPI package 'lightning', a widely-used deep learning framework, was compromised in a supply chain attack affecting versions 2.6.2 and 2.6.3 published on April 30, 2026. Teams building image classifiers, fine-tuning LLMs, running diffusion models, or developing time-series forecasters frequently have lightning somewhere in their dependency tree.

Running pip install lightning is all it takes to be exposed. The malicious versions contain a hidden _runtime directory with an obfuscated JavaScript payload that executes automatically upon module import. The attack steals credentials, authentication tokens, environment variables, and cloud secrets, while also attempting to poison GitHub repositories. It carries Shai-Hulud themes, including creating public repositories called EveryBoiWeBuildIsaWormBoi.

We believe that this attack is the work of the same threat actor behind the Mini Shai-Hulud campaign. The IOC structure is consistent with that operation: the malicious commit messages follow the same Dune-themed naming convention, with this campaign using the prefix EveryBoiWeBuildIsAWormyBoi to distinguish it from the original Mini Shai-Hulud attack.

Affected Packages

- lightning version 2.6.2

- lightning version 2.6.3
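
As a quick first check, the installed version can be read from package metadata without importing the package (which matters here, since the payload runs on import). The following is a minimal sketch; the helper names `is_affected` and `installed_lightning_version` are illustrative and not part of any Semgrep tooling.

```python
from importlib import metadata
from typing import Optional

# Versions named as malicious in the list above.
AFFECTED_VERSIONS = {"2.6.2", "2.6.3"}


def is_affected(version: str) -> bool:
    """Return True if the given lightning version is one of the malicious releases."""
    return version.strip() in AFFECTED_VERSIONS


def installed_lightning_version() -> Optional[str]:
    """Read the installed version from metadata only; lightning itself is never imported."""
    try:
        return metadata.version("lightning")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_lightning_version()
    if found is None:
        print("lightning is not installed in this environment")
    elif is_affected(found):
        print(f"lightning {found} is an affected version")
    else:
        print(f"lightning {found} is not one of the affected versions")
```

A clean result only describes the current environment. It says nothing about versions that were installed earlier and later upgraded or removed, so lockfiles and CI logs are worth checking too.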

For Semgrep Customers

Semgrep has published an advisory and a rule covering this attack so you can check your projects.

- Trigger a new scan on your projects if you haven't run one recently.

- Check the advisories page to see if any projects have installed these package versions recently: https://semgrep.dev/orgs/-/advisories

- Check your dependency filter for matches. If you see "No matching dependencies", you are not actively using the malicious dependency in any of your projects. If you did match, additional advice on remediation and indicators of compromise is below.

If you matched: Also audit your repositories for the injected files listed in the IOCs below (.claude/ and .vscode/ directories with unexpected contents), and rotate any GitHub tokens, cloud credentials, or API keys that may have been present in the affected environment.

Cross-Ecosystem Spread: PyPI to npm

Unlike Mini Shai-Hulud, which targeted npm directly, the entry point here is PyPI. The malware payload is still JavaScript, and the worm propagation happens through npm.

Once running, if the malware finds npm publish credentials, it injects a setup.mjs dropper and router_runtime.js into every package that token can publish to, sets scripts.preinstall to execute the dropper, bumps the patch version, and republishes. Any downstream developer who installs one of those packages then runs the full malware on their machine, has their tokens stolen, and has their own packages wormed in turn.
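
Maintainers who want to check their own npm packages for this pattern could inspect each package.json for a preinstall script that references the dropper. This is a rough sketch: the exact command string the malware writes is not given here, so the check only looks for the two file names mentioned above, and `suspicious_preinstall` is a hypothetical helper name.

```python
import json

# File names of the dropper and payload described above.
DROPPER_NAMES = ("setup.mjs", "router_runtime.js")


def suspicious_preinstall(package_json_text: str) -> bool:
    """Return True if scripts.preinstall mentions one of the dropper file names."""
    manifest = json.loads(package_json_text)
    if not isinstance(manifest, dict):
        return False
    scripts = manifest.get("scripts")
    if not isinstance(scripts, dict):
        return False
    preinstall = scripts.get("preinstall", "")
    if not isinstance(preinstall, str):
        return False
    return any(name in preinstall for name in DROPPER_NAMES)
```

An unexpected patch version bump that nobody on the team published is a second signal worth checking alongside this one.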

How it Works

The exfiltration component shares its design with the "Mini Shai-Hulud" mechanism from the actor's previous campaign, using four parallel channels so stolen data gets out even if individual paths are blocked.

- HTTPS POST to C2. Stolen data is immediately POSTed to an attacker-controlled server over port 443. The domain and path are stored as encrypted strings in the payload, making static analysis harder.

- GitHub commit search dead-drop. The malware polls the GitHub commit search API for commit messages prefixed with EveryBoiWeBuildIsAWormyBoi, which carry a double-base64-encoded token in the format EveryBoiWeBuildIsAWormyBoi:<base64(base64(token))>. Once decoded, the token is used to authenticate an Octokit client for further operations.

- Attacker-controlled public GitHub repo. A new public repository is created with a randomly chosen Dune-word name and the description "A Mini Shai-Hulud has Appeared", which is directly searchable on GitHub. Stolen credentials are committed as results/results-<timestamp>-<n>.json (base64-encoded via the API, plain JSON inside), with files over 30 MB split into numbered chunks. Commit messages use chore: update dependencies as cover.

- Push to victim's own repo. If the malware obtains a ghs_ GitHub server token, it pushes stolen data directly to all branches of the victim's own GITHUB_REPOSITORY.
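
For responders who find a dead-drop commit and need to know which token it carried (so it can be revoked), the format above can be unpacked as follows. This is a sketch under assumptions: it assumes the standard base64 alphabet and that the encoded value sits on the first line directly after the colon. `decode_dead_drop` is an illustrative name.

```python
import base64
import binascii
from typing import Optional

PREFIX = "EveryBoiWeBuildIsAWormyBoi:"


def decode_dead_drop(commit_message: str) -> Optional[str]:
    """Return the token carried by a dead-drop commit message, or None if it does not match."""
    stripped = commit_message.strip()
    if not stripped:
        return None
    first_line = stripped.splitlines()[0]
    if not first_line.startswith(PREFIX):
        return None
    blob = first_line[len(PREFIX):].strip()
    try:
        inner = base64.b64decode(blob, validate=True)   # outer base64 layer
        token = base64.b64decode(inner, validate=True)  # inner base64 layer
        return token.decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
```

Any token recovered this way should be treated as already exposed, since the commit message is publicly searchable.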

What Gets Stolen

The malware targets credentials across local files, environment, CI/CD pipelines, and cloud providers:

- Filesystem: Scans 80+ credential file paths for ghp_, gho_, and npm_ tokens (up to 5 MB per file).

- Shell / Environment: Runs gh auth token and dumps all environment variables from process.env.

- GitHub Actions: On Linux runners, dumps Runner.Worker process memory via embedded Python and extracts all secrets marked "isSecret":true, along with GITHUB_REPOSITORY and GITHUB_WORKFLOW.

- GitHub orgs: Checks token scopes (repo, workflow) and iterates GitHub Actions org secrets.

- AWS: Tries environment variables, ~/.aws/credentials profiles, IMDSv2 (169.254.169.254), and ECS (169.254.170.2) to call sts:GetCallerIdentity; additionally enumerates and fetches all Secrets Manager values and SSM parameters.

- Azure: Uses DefaultAzureCredential to enumerate subscriptions and access Key Vault secrets.

- GCP: Authenticates via GoogleAuth and enumerates and fetches all Secret Manager secrets.

The targeting covers local dev environments, CI runners, and all three major cloud providers. Any machine that imported the malicious package during the affected window should be treated as fully compromised.
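
To build a rotation list, it can help to inventory which of the token types named above were present on an affected machine. The sketch below scans text for the three prefixes and reports only the prefix, never the secret value. The minimum length of 20 characters after the prefix is an assumption chosen to cut down on false positives, not a documented token format.

```python
import re
from typing import Set

# Prefixes the malware is described as searching for. The {20,} length
# threshold is an assumption for this sketch, not an official format.
TOKEN_PATTERN = re.compile(r"\b(ghp_|gho_|npm_)[A-Za-z0-9]{20,}")


def find_token_prefixes(text: str) -> Set[str]:
    """Return the set of token prefixes that appear in the text with a plausible token body."""
    return {match.group(1) for match in TOKEN_PATTERN.finditer(text)}
```

Running this over shell profiles, .npmrc, and exported environment dumps gives a starting list; cloud credentials still need to be reviewed separately because they do not share these prefixes.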

Persistence via Developer Tooling

Once inside a repository, the malware plants persistence hooks targeting two of the most common developer tools: Claude Code and VS Code. This may be among the first documented instances of malware abusing Claude Code's hook system in a real-world attack.

Claude Code: .claude/settings.json. The malware writes a SessionStart hook with matcher: "*" into the repository's Claude Code settings, pointing to node .vscode/setup.mjs. It fires every time a developer opens Claude Code in the infected repo — no tool use or user action required beyond launching the session.

VS Code: .vscode/tasks.json. A parallel hook targets VS Code users via a runOn: folderOpen task that runs node .claude/setup.mjs every time the project folder is opened.

The dropper: setup.mjs. Both hooks invoke setup.mjs, a self-contained Bun runtime bootstrapper. If Bun isn't installed, it silently downloads bun-v1.3.13 from GitHub releases, handling Linux x64/arm64/musl, macOS x64/arm64, and Windows x64/arm64. It then executes .claude/router_runtime.js (the full 14.8 MB payload) and cleans up from /tmp.
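
One way to audit a repository for these hooks is to look for any string in .claude/settings.json or .vscode/tasks.json that references the dropper or payload file names. The sketch below walks the parsed JSON without relying on a particular schema, and falls back to a line-by-line text search when the file does not parse as strict JSON (tasks.json may contain comments). `suspicious_strings` is an illustrative helper name.

```python
import json
from typing import Any, Iterator, List

MARKERS = ("setup.mjs", "router_runtime.js")


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value found anywhere inside a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def suspicious_strings(config_text: str) -> List[str]:
    """Return config strings (or raw lines, if parsing fails) that mention the dropper files."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError:
        # Fallback for JSON with comments: plain substring search per line.
        lines = (line.strip() for line in config_text.splitlines())
        return [line for line in lines if any(m in line for m in MARKERS)]
    return [s for s in iter_strings(config) if any(m in s for m in MARKERS)]
```

A hit here is a strong signal, but an empty result does not clear the repository: the files themselves should also be checked for, and any unexpected SessionStart hook or folderOpen task deserves a manual look.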

Bonus payload: malicious GitHub Actions workflow. If the malware holds a GitHub token with write access, it pushes a workflow named Formatter to the victim's repository. On every push it dumps all repository secrets via ${{ toJSON(secrets) }} and uploads them as a downloadable Actions artifact named format-results. The actions are pinned to specific commit SHAs to appear legitimate.

Any repository that received the infected lightning package during CI and held a token with write access should be audited for these files.
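
The workflow described above has three recognizable traits that can be checked with a plain text scan of each file under .github/workflows. This is a heuristic sketch, not a YAML parser; `workflow_findings` is a hypothetical helper, and a legitimate workflow could share the name Formatter, so findings should be reviewed rather than acted on automatically.

```python
import re
from typing import List

# Matches the expression used to serialize every secret at once.
SECRETS_DUMP = re.compile(r"toJSON\(\s*secrets\s*\)", re.IGNORECASE)
# Matches a top-level or nested "name: Formatter" line, with optional quotes.
FORMATTER_NAME = re.compile(r"^\s*name:\s*['\"]?Formatter['\"]?\s*$", re.MULTILINE)


def workflow_findings(workflow_text: str) -> List[str]:
    """Return a list of traits from the malicious workflow that appear in the text."""
    findings = []
    if SECRETS_DUMP.search(workflow_text):
        findings.append("serializes all secrets via toJSON(secrets)")
    if FORMATTER_NAME.search(workflow_text):
        findings.append('workflow or step named "Formatter"')
    if "format-results" in workflow_text:
        findings.append('references an artifact named "format-results"')
    return findings
```

If such a workflow ran even once, the artifact it uploaded should be assumed downloaded, and every repository secret rotated.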

Indicators of Compromise

Look for the following indicators:

- A commit message prefixed with EveryBoiWeBuildIsAWormyBoi (dead-drop token carrier, searchable via GitHub commit search)

- GitHub repos with description: "A Mini Shai-Hulud has Appeared" (attacker exfil repos, directly searchable)
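
Both indicators can be looked up through GitHub's REST search endpoints. The sketch below only builds the request URLs; sending them (with whatever authentication and rate limiting your environment requires) is left to the reader. The function names are illustrative, and scoping the queries to a specific organization or user would need additional search qualifiers.

```python
from urllib.parse import urlencode

COMMIT_PREFIX = "EveryBoiWeBuildIsAWormyBoi"
REPO_DESCRIPTION = "A Mini Shai-Hulud has Appeared"


def commit_search_url() -> str:
    """URL for GitHub's commit search, matching the dead-drop commit prefix."""
    return "https://api.github.com/search/commits?" + urlencode({"q": COMMIT_PREFIX})


def repo_search_url() -> str:
    """URL for GitHub's repository search, matching the exfil repo description."""
    query = f'"{REPO_DESCRIPTION}" in:description'
    return "https://api.github.com/search/repositories?" + urlencode({"q": query})
```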

Packages

- lightning versions 2.6.2 and 2.6.3

Files / System Artifacts

- `_runtime/start.py`: Python loader that initializes the payload on import

- `_runtime/router_runtime.js`: Obfuscated JavaScript payload (14.8 MB, Bun runtime)

- `_runtime/`: Directory added to the malicious package versions

- `.claude/router_runtime.js`: Malware copy injected into victim repos

- `.claude/settings.json`: Claude Code hook config injected into victim repos

- `.claude/setup.mjs`: Dropper injected into victim repos

- `.vscode/tasks.json`: VS Code auto-run task injected into victim repos

- `.vscode/setup.mjs`: Dropper injected into victim repos
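
The file artifacts above lend themselves to a simple filesystem sweep. The following sketch checks a repository checkout for the injected files and searches a directory tree (for example a site-packages folder) for `_runtime` directories. Function names are illustrative. Note that `.claude/settings.json` and `.vscode/tasks.json` also exist in many clean repositories, and other packages may ship a `_runtime` directory, so hits on those are prompts for review rather than proof of infection.

```python
from pathlib import Path
from typing import List

# Repository-level artifacts from the list above.
REPO_IOC_PATHS = [
    ".claude/router_runtime.js",
    ".claude/settings.json",
    ".claude/setup.mjs",
    ".vscode/tasks.json",
    ".vscode/setup.mjs",
]


def find_repo_iocs(repo_root: str) -> List[str]:
    """Return the listed artifact paths that exist as files under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REPO_IOC_PATHS if (root / rel).is_file()]


def find_runtime_dirs(search_root: str) -> List[str]:
    """Return relative paths of every directory named _runtime under search_root."""
    root = Path(search_root)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("_runtime")
        if path.is_dir()
    )
```

For example, `find_repo_iocs(".")` run at the top of a checkout lists which of the five paths are present, and `find_runtime_dirs` can be pointed at each Python environment that may have installed the package.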