Alex Smith

Copy Fail: The 9-Year-Old Linux Bug That Lets Anyone Become Root — and the AI That Found It in an Hour

A Bug That Hid for Nine Years. An AI Found It in an Hour.

Imagine a skeleton key that fits every lock on the planet. Now imagine it’s been sitting in a drawer since 2017, and nobody noticed until a machine went looking for it.

That’s essentially what happened with CVE-2026-31431, nicknamed “Copy Fail” — a high-severity Linux kernel vulnerability that lets any unprivileged user escalate to full root access on virtually every major Linux distribution. It affects Ubuntu, Red Hat, SUSE, Amazon Linux, Debian, Fedora, Arch — the whole lineup. And it’s been there for nine years.

Here’s the part that should keep everyone in IT up at night: it was found by an AI system in about an hour.

What Happened

On April 29, 2026, security research firm Theori publicly disclosed CVE-2026-31431 — a local privilege escalation (LPE) vulnerability in the Linux kernel’s cryptographic subsystem. Specifically, it lives in the algif_aead module, which is part of the AF_ALG userspace crypto API.

The bug allows an unprivileged local user to perform a controlled 4-byte write into the kernel’s page cache — the shared memory that holds copies of files the system is actively using. By carefully corrupting the in-memory version of a privileged binary like /usr/bin/su, an attacker can execute that corrupted binary and gain root access.

And here’s the kicker: the exploit is a 732-byte Python script. No compilation needed, no network access required, no race conditions to win. It just works. Across distributions. Reliably.

Oh, and a fully working proof-of-concept is already public.

Why It Matters

It’s a Nine-Year-Old Bug in Everything

The flaw was introduced in 2017 via a kernel commit (72548b093ee3) that added an in-place optimization to the crypto subsystem — essentially reusing source memory as destination during cryptographic operations. It’s a performance tweak that introduced a subtle but devastating logic error. And it’s been baked into every major Linux distribution since then.

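That means Linux servers, container hosts, cloud instances, and embedded devices running an unpatched kernel from 2017 onwards have been exposed wherever algif_aead is available. Given how widely those kernels ship, that covers a very large share of deployed Linux systems.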

Containers Are Not the Safety Net You Think They Are

If your security model relies on containers as isolation boundaries, Copy Fail just punched a hole through it. Containers on a host share one kernel and one page cache, so a compromised container can corrupt cached file contents that other tenants on that machine, and the host itself, go on to read and execute.

This is especially dangerous for:

  • Multi-tenant Kubernetes clusters — Namespace isolation doesn’t protect against page cache corruption
  • CI/CD runners — Self-hosted GitHub Actions runners, GitLab runners, Jenkins agents executing code from pull requests
  • AI sandboxes — Any system that lets an AI agent execute arbitrary code inside a container

As Bugcrowd’s analysis put it: if your isolation story has the word “container” in it without “microVM,” “gVisor,” or “dedicated host” right next to it, Copy Fail is in your threat model.

Figure 1: How Copy Fail enables container escape and multi-tenant compromise

CISA Has Taken Notice

The Cybersecurity and Infrastructure Security Agency (CISA) has already added CVE-2026-31431 to its Known Exploited Vulnerabilities (KEV) catalog, meaning federal agencies are required to patch — and everyone else should consider it a priority too.

Microsoft Defender reports seeing preliminary exploit testing activity, and expects increased threat actor exploitation in the near term.

The Technical Bit (For the Curious)

Alright, let’s get slightly nerdy. Here’s how Copy Fail actually works:

The flaw is in the interaction between the AF_ALG socket interface and the splice() system call within the algif_aead module. The 2017 in-place optimization allows a page-cache page to end up in the kernel’s writable destination scatterlist for an AEAD (Authenticated Encryption with Associated Data) operation.

The attack works like this:

  1. An unprivileged process opens an AF_ALG socket and sets up an AEAD operation
  2. It uses splice() to feed data into that socket from a target file it can read but not write
  3. Due to the in-place optimization bug, the kernel performs a failed copy operation that incorrectly writes back into the page cache
  4. The attacker controls a precise 4-byte write into any readable file’s in-memory cache
  5. By targeting the right bytes in a setuid binary like /usr/bin/su, execution of that binary grants root
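
Step 1 on its own is harmless, and it doubles as a quick exposure check: if an unprivileged process cannot open and bind an AF_ALG AEAD socket, the path described above is not reachable from that account. Here is a minimal sketch using Python's standard socket module. The function name and the default algorithm "gcm(aes)" are illustrative assumptions, not details from Theori's write-up, and the probe stops at the bind call, so it neither touches any file nor tells you whether the kernel is patched.

```python
import socket


def af_alg_aead_reachable(algorithm: str = "gcm(aes)") -> bool:
    """Return True if this process can open and bind an AF_ALG AEAD socket."""
    family = getattr(socket, "AF_ALG", None)  # only defined on Linux builds
    if family is None:
        return False
    try:
        with socket.socket(family, socket.SOCK_SEQPACKET, 0) as sock:
            # Binding selects the algorithm type and name. The kernel may
            # auto-load algif_aead as a side effect of this call.
            sock.bind(("aead", algorithm))
        return True
    except OSError:
        # Covers an unsupported address family, a missing algorithm,
        # and permission or seccomp denials.
        return False


if __name__ == "__main__":
    print("AF_ALG AEAD reachable:", af_alg_aead_reachable())
```

A result of False after applying a mitigation suggests the interface is blocked for that user. A result of True only means the interface is available, which is also the case on a fully patched kernel.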

Why it’s nasty:

  • Deterministic — no race conditions, no timing tricks. It works every time
  • In-memory only — the on-disk file is never modified, making forensic detection harder
  • Bypasses MAC — SELinux and AppArmor are effectively neutralized once you’re root
  • Universal — the same exploit script works across Ubuntu, RHEL, SUSE, Amazon Linux without modification

The closest historical comparison is Dirty Pipe (CVE-2022-0847) — the 2022 vulnerability that allowed similar page cache manipulation through the pipe subsystem. Copy Fail is the same class of primitive, just living in a different neighborhood of the kernel.

As for why it went unnoticed for nearly a decade? The crypto subsystem is heavily reviewed — but from a cryptographic perspective. Researchers look for things like side channels, parameter validation, and algorithm correctness. Copy Fail is fundamentally a memory ownership bug — a question about where memory came from and whether the kernel should be writing through it. Different lens, blind spot.

What You Should Do

Right now (0–24 hours):

  • Inventory your Linux systems. Know what kernel versions you’re running across servers, containers, VMs, and edge devices
  • Apply kernel patches immediately where available. Patches have been released in kernel versions 6.18.22, 6.19.12, and 7.0. Check your distribution’s security advisories for backported fixes
  • Identify high-risk surfaces: multi-tenant Kubernetes clusters, shared CI/CD runners, AI code-execution sandboxes
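
For the inventory step, a rough first pass is to compare each machine's kernel release string against the upstream versions listed above. The sketch below is a triage aid under stated assumptions: the helper name and the three-way result are illustrative, release candidates are not handled specially, and distribution kernels often carry backported fixes under older version numbers, so anything it cannot classify still needs a look at the vendor advisory.

```python
import platform
import re
from typing import Optional

# Minimum fixed patch level per upstream series, from the versions named above.
FIXED_IN_SERIES = {(6, 18): 22, (6, 19): 12}
FIRST_FIXED_SERIES = (7, 0)


def upstream_fix_status(release: str) -> Optional[bool]:
    """True: at or above a listed fixed version. False: in a listed series
    but below the fix. None: unknown, check the distribution advisory."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    if (major, minor) >= FIRST_FIXED_SERIES:
        return True
    if (major, minor) in FIXED_IN_SERIES:
        return patch >= FIXED_IN_SERIES[(major, minor)]
    return None  # older or other series: the fix may exist only as a backport


if __name__ == "__main__":
    release = platform.release()
    print(release, "->", upstream_fix_status(release))
```

Run across a fleet, the None results are the ones to chase first, since they are the machines where only the distribution's advisory can say whether the fix is present.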

Short-term (1–7 days):

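  • If you can’t patch immediately, consider blocking the algif_aead kernel module. As root: echo "blacklist algif_aead" > /etc/modprobe.d/algif_aead.conf followed by a reboot. Note that blacklist only suppresses alias-based autoloading; adding a line such as install algif_aead /bin/false to the same file is a common, stricter way to stop the module loading on demand. Either way, this blocks the specific attack vector but may break applications that depend on kernel-accelerated AEAD crypto, and it has no effect if the code is built into the kernel rather than shipped as a loadable module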
  • Audit for compromise indicators — look for unexpected root access, privilege escalations, or suspicious Python execution from unprivileged accounts
  • Review container isolation — for critical workloads, consider migrating from standard containers to microVMs (like Firecracker), gVisor, or dedicated hosts
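
After applying the module workaround, it is worth confirming that it took effect. The following sketch reads /proc/modules to see whether algif_aead is currently loaded and scans /etc/modprobe.d for blacklist or install directives that name it. The function names are hypothetical and the parsing is deliberately simple. It also cannot see a driver compiled directly into the kernel, which never appears in /proc/modules.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Parse /proc/modules content; the first field of each line is the name."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def blocking_directives(conf_text: str, module: str = "algif_aead") -> set[str]:
    """Return which of the 'blacklist' / 'install' directives name the module."""
    found = set()
    for raw in conf_text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        fields = stripped.split()
        # modprobe treats '-' and '_' in module names as equivalent
        if (len(fields) >= 2 and fields[0] in {"blacklist", "install"}
                and fields[1].replace("-", "_") == module):
            found.add(fields[0])
    return found


def module_status(module: str = "algif_aead",
                  proc_modules: str = "/proc/modules",
                  conf_dir: str = "/etc/modprobe.d") -> dict:
    """Summarise whether the module is loaded and which directives target it."""
    try:
        text = Path(proc_modules).read_text(errors="replace")
        loaded = module in loaded_modules(text)
    except OSError:
        loaded = None  # not Linux, or /proc is unavailable
    directives = set()
    for conf in sorted(Path(conf_dir).glob("*.conf")):
        try:
            directives |= blocking_directives(conf.read_text(errors="replace"), module)
        except OSError:
            continue  # unreadable file: skip rather than fail the whole check
    return {"loaded": loaded, "directives": sorted(directives)}
```

A host that reports loaded as True after the change most likely still needs the reboot, or a manual unload, before the workaround means anything.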

Longer term:

  • Rethink your trust boundaries. If you’re running untrusted code in containers on shared kernels, this bug is your wake-up call. The container-as-security-boundary model has now been repeatedly challenged
  • Implement defense-in-depth: seccomp profiles, auditd rules, and kernel module restrictions all add layers that make exploitation harder even when individual bugs exist
  • Watch for updated advisories. The understanding of this vulnerability’s impact is still evolving
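
As one concrete example of the seccomp layer, containers that never need the kernel crypto API can be denied AF_ALG sockets outright. The sketch below generates a rule in the JSON profile format used by Docker-style runtimes. The specific rule is an assumption of this example rather than guidance from the advisories, and the allow-by-default profile is for illustration only: in practice you would merge the rule into your runtime's existing default profile and test how the two interact.

```python
import json

AF_ALG = 38  # Linux address-family number for AF_ALG


def build_af_alg_deny_profile() -> dict:
    """Build a seccomp profile that fails socket(AF_ALG, ...) with an error."""
    return {
        "defaultAction": "SCMP_ACT_ALLOW",  # illustration only: permissive default
        "syscalls": [
            {
                "names": ["socket"],
                "action": "SCMP_ACT_ERRNO",
                # Match only when the first argument (the address family) is AF_ALG.
                "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_af_alg_deny_profile(), indent=2))
```

The output can be saved to a file and passed to a container runtime that accepts custom seccomp profiles. Workloads that legitimately use AF_ALG will fail at socket creation, which is the intended trade-off.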

The Bigger Story: AI Found This in an Hour

Let’s talk about the elephant in the room. Theori’s AI-powered vulnerability discovery system, Xint Code, reportedly surfaced this bug with one operator prompt and about an hour of scan time against the Linux crypto subsystem. No custom harnessing, no manual instrumentation.

This is a bug that sat in the Linux kernel — one of the most scrutinized codebases on the planet — for nine years. Exploit brokers like Zerodium historically paid up to $500,000 for Linux zero-days of this quality. An AI system found it in the time it takes to watch a movie.

Theori isn’t some unknown startup either — they’ve won DEF CON CTF nine times and placed third in DARPA’s AI Cyber Challenge. When they say “one prompt, one hour,” the security community takes notice.

This isn’t just about one bug. It’s about what happens when vulnerability discovery becomes dramatically faster and more accessible. The skill required to find critical bugs is starting to look a lot more like the skill required to describe what you’re looking for.

The Bottom Line

Copy Fail (CVE-2026-31431) is a textbook example of a simple logic error with catastrophic consequences. Nine years old, universally present, trivially exploitable, and already in CISA’s crosshairs. Patch your Linux systems — especially shared infrastructure running containers. And if your security model treats containers as a hard boundary, it might be time to have a conversation about microVMs. The era of bugs hiding for a decade is ending. The question is whether your patching speed can keep up.
