// field note 73

AI Security

From npm to Your Terminal: When the AI Supply Chain Becomes the Kill Chain

A 22-minute npm attack pushed 637 malicious versions across 317 packages, designed to hijack AI coding agents through session hooks rather than simply steal passwords.

Figure: an npm package pipeline compromised by a malicious actor injecting code into an AI coding agent. 637 malicious versions. 317 packages. 22 minutes. The kill chain just moved upstream.

On May 19, 2026, the npm account "atool" was compromised. In the 22 minutes that followed, the attacker published 637 malicious versions across 317 packages, an automated burst that moved faster than any human incident response team could react.

THE 22-MINUTE BREACH

The numbers tell part of the story. size-sensor, a package with 4.2 million monthly downloads, got three malicious versions. So did echarts-for-react (3.8M downloads/month), jest-canvas-mock, and jest-date-mock. Another 309 packages each received exactly two poisoned versions, one per wave, published between 01:39 and 02:06 UTC while most of the Western hemisphere slept.

But the real story is not the volume. It is the target.

The payload was not designed to steal your passwords and move on. It was designed to hijack your AI coding agent. It injects persistence into Claude Code's SessionStart hooks, meaning every time you open a new Claude Code session, the malware re-executes. It does the same thing to OpenAI Codex. It patches your VS Code tasks.json so the malware fires on folderOpen.

Then it installs a systemd service or macOS LaunchAgent called "kitty-monitor" that polls a GitHub dead-drop C2 every hour, accepting RSA-signed commands hidden in commit messages containing the keyword "firedalazer."

This is the Mini Shai-Hulud attack, named by SafeDep researchers who published their analysis on May 19. It is the first documented supply chain attack designed specifically to persist inside AI agent sessions.

And nobody was watching for it.

HOW THE ATTACK WORKS

The mechanism is elegant in a way that makes every security engineer's stomach turn.

Each compromised package version makes exactly two changes to its package.json. First, it adds a preinstall hook: bun run index.js. Bun is a separate JavaScript runtime from Node.js, so the payload sidesteps security tooling that only instruments Node. Second, 630 of 637 versions inject an optionalDependencies entry pointing to orphan commits in the antvis/G2 GitHub repository: commits that do not exist on any branch but are reachable through GitHub's fork object-sharing mechanism. It is a second delivery channel hiding in plain sight.
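Both manifest changes are mechanical enough to check for. What follows is a minimal sketch in Python, not SafeDep's detection logic: the function name audit_manifest, the list of lifecycle hooks, and the patterns used to recognise git-hosted dependency specs are all assumptions made for illustration, and the heuristic will also flag legitimate packages that use install scripts or git dependencies.

```python
import json
import re

# Lifecycle scripts that npm-compatible clients run automatically on install.
LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall")
# Dependency specs that resolve from a git host instead of the registry.
GIT_PREFIXES = ("git+", "git://", "github:", "https://github.com/")
# Bare "owner/repo#ref" shorthand for a GitHub source.
GITHUB_SHORTHAND = re.compile(r"^[\w.-]+/[\w.-]+(#.+)?$")


def audit_manifest(manifest_text: str) -> list[str]:
    """Return human-readable findings for one package.json document."""
    manifest = json.loads(manifest_text)
    findings = []

    scripts = manifest.get("scripts") or {}
    for hook in LIFECYCLE_HOOKS:
        command = scripts.get(hook)
        if not command:
            continue
        findings.append(f"lifecycle script {hook!r}: {command}")
        # The May 19 payload was launched through Bun rather than Node.
        if "bun" in command.split():
            findings.append(f"{hook!r} invokes the Bun runtime")

    optional = manifest.get("optionalDependencies") or {}
    for name, spec in optional.items():
        if spec.startswith(GIT_PREFIXES) or GITHUB_SHORTHAND.match(spec):
            findings.append(f"optional dependency {name!r} resolves from git: {spec}")

    return findings
```

Run against every package.json in a lockfile diff, a non-empty result is a reason to look closer, not proof of compromise.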

Figure: the Mini Shai-Hulud attack flow, from package.json to persistent agent hijack.

The payload itself is a 498KB obfuscated Bun script, structurally identical to the Mini Shai-Hulud toolkit used in the SAP compromise three weeks earlier. Same scanner architecture. Same credential regex set. Same obfuscation pattern. SafeDep confirmed it.

What does it harvest? Everything. The payload aggressively targets:

Cloud Infrastructure: AWS credentials across the full chain (environment variables, config files, EC2 IMDS metadata, ECS container credentials, Secrets Manager), plus Kubernetes service account tokens and HashiCorp Vault tokens.

Development Assets: GitHub Personal Access Tokens, npm publish tokens, and SSH keys.

Local Secrets: Password manager vaults for 1Password, Bitwarden, pass, and gopass.

Service Configurations: Database connection strings, Stripe keys, Slack tokens, and Docker authentication configs.

Stolen data goes out through two parallel channels. Channel one: the data gets committed as Git objects to public GitHub repositories created under the stolen token, with the User-Agent forged as python-requests/2.31.0. The repository names follow a Dune-themed pattern with 256,000 possible combinations: words like "fremen," "harkonnen," "sardaukar" combined with "sandworm," "melange," "ornithopter," plus a random number.

Channel two: RSA-plus-AES encrypted HTTPS POSTs to t.m-kosche[.]com disguised as OpenTelemetry trace data.

In CI/CD environments, the payload goes further. It exchanges GitHub Actions OIDC tokens for npm publish tokens, signs artifacts via Sigstore using the stolen identity, and injects persistence into .github/workflows/codeql.yml with a workflow named "Run Copilot" that dumps toJSON(secrets) as a GitHub Actions artifact before self-cleaning the workflow run and resetting the branch.
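The workflow persistence leaves two textual traces that can be searched for: an expression that serializes the whole secrets context, and the workflow name "Run Copilot". The sketch below scans a checkout for both. The function name scan_workflows and the regular expressions are illustrative assumptions, and because the payload reportedly resets the branch after running, a clean result on the current checkout does not prove the workflow never existed.

```python
import re
from pathlib import Path

# Matches toJSON(secrets) regardless of letter case or inner whitespace.
SECRETS_DUMP = re.compile(r"toJSON\(\s*secrets\s*\)", re.IGNORECASE)
# Matches a top-level "name: Run Copilot" line, with or without quotes.
WORKFLOW_NAME = re.compile(r"^name:\s*['\"]?Run Copilot['\"]?\s*$", re.MULTILINE)


def scan_workflows(repo_root) -> list[tuple[str, str]]:
    """Return (file name, reason) pairs for suspicious workflow files."""
    workflow_dir = Path(repo_root) / ".github" / "workflows"
    findings = []
    if not workflow_dir.is_dir():
        return findings

    for path in sorted(workflow_dir.iterdir()):
        if path.suffix not in (".yml", ".yaml"):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if SECRETS_DUMP.search(text):
            findings.append((path.name, "serializes the whole secrets context"))
        if WORKFLOW_NAME.search(text):
            findings.append((path.name, 'workflow is named "Run Copilot"'))
    return findings
```

Checking the repository's Actions run history and audit log for deleted runs covers the case where the file itself has already been cleaned up.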

The Docker socket gets checked. If it is present, the payload attempts container escape through three sequential methods: direct Docker API container creation, Docker API create-plus-start, and an execSync fallback. The container config runs with Privileged: true and AutoRemove: true.

Then comes the AI agent hijacking layer. The SessionStart hook injection is what makes this attack categorically new.

Claude Code reads project-level .claude/settings.json when a session starts and executes every registered hook. The malware writes or patches this file so that every new Claude Code session in the affected project automatically re-executes the payload. This is not a one-time install script. This is a persistence mechanism that survives package removal, survives branch switches, and re-fires every time the developer opens a coding session.

Codex gets the same treatment. VS Code gets a tasks.json with "runOn": "folderOpen". The payload copies itself into .claude/ in the project root so the hooks reference a persistent local copy rather than the now-removed package directory. Even if you uninstall the compromised npm package, the hook and the payload copy remain.
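Because the persistence lives in two small JSON files, a project can be audited for it directly. The following sketch lists every command registered under SessionStart in .claude/settings.json and every task set to run on folderOpen. It assumes that hook commands appear as "command" string fields somewhere under the SessionStart key, and that the VS Code file sits at the usual .vscode/tasks.json path with runOn nested under runOptions. The function names are hypothetical. Files that fail strict JSON parsing (tasks.json may legally contain comments) are reported for manual review, not skipped silently.

```python
import json
from pathlib import Path


def _collect_commands(node) -> list[str]:
    """Recursively gather every "command" string under a JSON node."""
    if isinstance(node, dict):
        found = [node["command"]] if isinstance(node.get("command"), str) else []
        for value in node.values():
            found.extend(_collect_commands(value))
        return found
    if isinstance(node, list):
        return [cmd for item in node for cmd in _collect_commands(item)]
    return []


def _load_json(path: Path):
    """Return parsed JSON, or None if the file is unreadable or not strict JSON."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None


def audit_project(project_root) -> dict:
    """List auto-run commands configured for agent sessions and folder opens."""
    root = Path(project_root)
    report = {"session_start_commands": [], "folder_open_tasks": [], "unreadable": []}

    settings_path = root / ".claude" / "settings.json"
    if settings_path.exists():
        settings = _load_json(settings_path)
        if isinstance(settings, dict):
            hooks = settings.get("hooks")
            start = hooks.get("SessionStart") if isinstance(hooks, dict) else None
            report["session_start_commands"] = _collect_commands(start)
        else:
            report["unreadable"].append(settings_path.relative_to(root).as_posix())

    tasks_path = root / ".vscode" / "tasks.json"
    if tasks_path.exists():
        tasks_doc = _load_json(tasks_path)
        if isinstance(tasks_doc, dict):
            for task in tasks_doc.get("tasks") or []:
                if not isinstance(task, dict):
                    continue
                run_on = (task.get("runOptions") or {}).get("runOn")
                if run_on == "folderOpen":
                    report["folder_open_tasks"].append(task.get("label", "<unlabeled>"))
        else:
            report["unreadable"].append(tasks_path.relative_to(root).as_posix())

    return report
```

Any entry in the report that the developer did not add deliberately deserves investigation, along with whatever file the command points to.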

Earlier variants of Mini Shai-Hulud, documented by SafeDep separately, took a different approach to the same problem. Five typosquatting packages shipped a hidden 4.5MB ELF binary inside .claude/settings, triggered by a preinstall script and re-executed via the same SessionStart hook pattern. The C2 endpoint there was 207.90.194.2:443, flagged as malicious infrastructure.

The May 19 variant is the evolution: bigger blast radius, more sophisticated C2, more credential targets, and the same AI agent persistence architecture.

THE CONVERGENCE: AI SECURITY MEETS SUPPLY CHAIN

This did not happen in a vacuum. Three events in the same week paint a picture that the industry is not yet looking at.

Google Cloud's Threat Horizons Report (H1 2026), released May 18, documented that the window between vulnerability disclosure and mass exploitation collapsed from weeks to days. React2Shell (CVE-2025-55182) was weaponized within 48 hours of disclosure. The report found that most attacks now target unpatched third-party code rather than cloud infrastructure directly, and that attackers increasingly gain initial access by exploiting software vulnerabilities rather than by abusing stolen credentials.

Translation: the package manager is now the primary attack surface, and the exploitation window is shorter than most organizations' patch cycles.

The same week, Hugging Face confirmed a fake "OpenAI" package was distributing malware through its model hub. The package impersonated legitimate OpenAI libraries and was downloaded by users who trusted the name. Hugging Face removed it, but the incident demonstrated that the open model distribution ecosystem has the same trust model as npm circa 2018: the name looks right, so it must be safe.

The pattern is consistent across every open distribution channel. npm, PyPI, Hugging Face, cargo, and every other registry were designed around a security model that assumed a human being would review what they were downloading.

That assumption no longer holds.

AI coding agents pull dependencies automatically. Claude Code, Codex, Cursor, and GitHub Copilot all resolve imports and install packages without human intervention. The agent sees a missing dependency, it runs npm install, the preinstall hook fires, and the entire machine is compromised before the developer even reads the agent's output. There is no "does this look safe?" checkpoint. There is no human in the loop.

This is not prompt injection. Prompt injection tricks the LLM into doing something malicious through its text interface. This is runtime injection. The attacker does not need to trick your model. They just need to compromise the packages your model pulls.

WHY NOBODY IS FIXING THIS

Four structural reasons this threat category is wide open:

First, the registry security models are obsolete. npm, PyPI, and cargo were all designed for human review workflows. Package signing and provenance technologies like Sigstore and SLSA exist, but adoption remains vanishingly low. Industry estimates place provenance coverage at roughly 2 to 3 percent across major registries. When the @cap-js SAP packages were compromised in the earlier Mini Shai-Hulud wave in April, the attack exploited OIDC misconfiguration: the trusted publisher was scoped to the entire repository rather than a specific workflow on a specific branch. The packages carried valid provenance attestations. They were signed. They were malicious. Provenance tells you a package came from a specific CI pipeline. It does not tell you if the pipeline itself was hijacked.

Second, agent frameworks have zero package trust verification. None of them. Claude Code, Codex, Cursor, and Copilot all install dependencies with the same blind trust as a junior developer copy-pasting from Stack Overflow. There is no sandbox for dependency resolution. There is no provenance check before install. There is no integrity verification of what comes back from the registry. The agent runs npm install, the preinstall hook fires, game over.

Third, the economics are inverted. The attacker invests in compromising one maintainer account or one CI pipeline. The blast radius is every developer and every CI runner that pulls from that package, multiplied by every organization those developers and runners have access to. The Mini Shai-Hulud campaign compromised the TanStack namespace (React Router alone has 12 million weekly downloads), UiPath's entire npm toolchain, MistralAI's official TypeScript client, and hundreds of other packages. One attack, thousands of downstream victims.

Fourth, and this is the uncomfortable one: the AI industry is building autonomous coding agents at a pace that completely outstrips the security infrastructure they run on. The industry spent 2025 teaching AI agents to write code autonomously. It did not teach them to verify the packages they import. Every new agent capability that reduces the human in the loop also removes the last safety check that was catching supply chain attacks.

WHAT THE DEFENSE LOOKS LIKE

This is not hopeless. There is a practical defense stack. It just requires treating package consumption as a security boundary, which nobody is doing yet.

Layer One: Package provenance verification. Agent frameworks like Claude Code, Codex, Cursor, and Copilot should refuse to install packages that do not carry verifiable provenance from a trusted publisher. This is a single configuration change on npm's side and a feature flag on the framework side. It is not technically difficult. It has not been prioritized.

Layer Two: Sandboxed dependency resolution. When an AI agent identifies a missing dependency, it should resolve and audit that package in an isolated environment before it ever touches the developer's machine or the CI runner. Think of it as a package airlock. The dependency gets pulled into a sandbox, its preinstall and postinstall scripts get analyzed, its provenance gets verified, and only then does it get promoted to the real environment.
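The first gate of such an airlock can be sketched without executing anything: read package.json straight out of the package tarball in memory and hold back any package that declares install-time scripts. This is a minimal illustration under stated assumptions, not a full sandbox. The names read_manifest and airlock_decision are hypothetical, the tarball bytes are assumed to have been fetched already by some means that runs no scripts, and a real airlock would add provenance checks and script analysis on top.

```python
import io
import json
import tarfile

# Scripts that run automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def read_manifest(tarball_bytes: bytes) -> dict:
    """Parse package.json from a gzipped package tarball without unpacking it to disk."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:gz") as archive:
        candidates = [
            member for member in archive.getmembers()
            if member.isfile()
            and member.name.count("/") == 1
            and member.name.endswith("/package.json")
        ]
        if not candidates:
            raise ValueError("tarball has no top-level package.json")
        with archive.extractfile(candidates[0]) as handle:
            return json.load(handle)


def airlock_decision(tarball_bytes: bytes) -> tuple[bool, list[str]]:
    """Return (promote, reasons); promote is False if any install hook is declared."""
    scripts = read_manifest(tarball_bytes).get("scripts") or {}
    reasons = [f"{hook}: {scripts[hook]}" for hook in INSTALL_HOOKS if scripts.get(hook)]
    return (not reasons, reasons)
```

A package that is held back is not necessarily malicious; the point is that a human or a deeper automated analysis sees the script before the developer's machine runs it.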

Layer Three: Session integrity checks. Agent frameworks should detect unauthorized modifications to their hook files, settings files, and task configurations. If .claude/settings.json suddenly contains a SessionStart hook that was not there before, the framework should flag it and refuse to execute it until the user explicitly approves.
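Until frameworks ship such a check, the same idea can be approximated from outside the agent with a hash baseline. The sketch below records SHA-256 digests of the files an agent or editor auto-executes from, and reports any that changed or appeared since the baseline. The watched file list and the function names snapshot and changed_files are assumptions for illustration; the baseline itself needs to be stored somewhere the payload cannot rewrite.

```python
import hashlib
from pathlib import Path

# Files whose contents can trigger automatic command execution (illustrative list).
WATCHED_FILES = (".claude/settings.json", ".vscode/tasks.json")


def snapshot(project_root, watched=WATCHED_FILES) -> dict:
    """Map each watched path to its SHA-256 hex digest, or None if the file is absent."""
    root = Path(project_root)
    digests = {}
    for relative in watched:
        path = root / relative
        if path.is_file():
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            digests[relative] = None
    return digests


def changed_files(baseline: dict, current: dict) -> list[str]:
    """Return watched paths that were added, removed, or modified since the baseline."""
    names = baseline.keys() | current.keys()
    return sorted(name for name in names if baseline.get(name) != current.get(name))
```

Taking a snapshot before an install and comparing afterwards turns a silent hook injection into a visible diff that can block the next session until someone approves it.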

Layer Four: AI-powered package auditing. This is ironic but necessary. Use AI to audit packages before your AI agent touches them. Scan for obfuscated code, suspicious preinstall scripts, network connections to unknown IPs, and credential-harvesting regex patterns. The same technology that enables the attack can also detect it.

The Ultimate Failsafe: Sovereign AI Infrastructure. Relying on cloud-connected autonomous agents that pull dependencies with blind trust is a fundamental vulnerability. The ultimate defense is moving toward a Sovereign AI architecture: running local models on your own hardware, fully isolated from the public package registry chaos. When you own your compute and strictly air-gap your reasoning engines from live execution environments, a compromised global supply chain cannot penetrate your perimeter.

As a practical immediate measure: set a minimum release age on your package manager. npm, pnpm, Yarn, and Bun all now support a minimum-release-age setting, although the option name and time unit differ between tools (pnpm, for example, calls it minimumReleaseAge and counts in minutes). Setting it to 24 hours or more means freshly published malicious versions will not auto-resolve before security researchers have time to flag them. The May 19 burst published 637 malicious versions that semver ranges would have resolved immediately on any clean install. A 24-hour cooldown window would have prevented that.
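The rule a cooldown enforces is simple to state in code. This sketch filters a package's versions by publish time, using the shape of the "time" map that the npm registry returns in package metadata (version strings mapped to ISO 8601 timestamps, plus "created" and "modified" bookkeeping keys). The function names and the example version numbers are hypothetical, and a real package manager applies this inside its resolver; it is not a separate step.

```python
from datetime import datetime, timedelta, timezone

# Keys in the registry "time" map that are not version numbers.
NON_VERSION_KEYS = {"created", "modified"}


def parse_timestamp(value: str) -> datetime:
    """Parse a registry timestamp such as 2026-05-19T01:39:00.000Z."""
    # Replace the Z suffix so older Python versions can parse it too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def eligible_versions(
    publish_times: dict[str, str],
    now: datetime,
    minimum_age: timedelta = timedelta(hours=24),
) -> list[str]:
    """Return versions published at least minimum_age before now, in input order."""
    cutoff = now - minimum_age
    return [
        version
        for version, published in publish_times.items()
        if version not in NON_VERSION_KEYS and parse_timestamp(published) <= cutoff
    ]
```

With a 24-hour minimum, a version published at 01:39 UTC on May 19 stays ineligible until 01:39 UTC on May 20, so a resolver applying this rule falls back to the newest older version that satisfies the semver range.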

THE UNCOMFORTABLE TRUTH

The Mini Shai-Hulud campaign is not an isolated incident. It is a proof of concept that worked.

TeamPCP, the threat actor attributed to this campaign, has been running variants of this attack since at least April 2026. They started with SAP developer packages. They moved to Bitwarden. They hit TanStack, UiPath, MistralAI, and hundreds of smaller packages. Each wave got more sophisticated. Each wave added new persistence mechanisms. The May 19 attack is the most advanced yet, but it will not be the last.

The security model we inherited from the pre-AI era assumed humans would review dependencies. AI agents do not review dependencies. They install them. At scale. Automatically. Every autonomous coding agent is currently one compromised npm package away from total compromise.

This is not theoretical. It happened. In 22 minutes.

The industry is building billions of dollars of AI agent infrastructure on a supply chain that was never designed to support it. The kill chain does not need to trick your LLM. It just needs to own the packages your LLM pulls.

And right now, nobody is watching that door.