Copy Fail (CVE-2026-31431): A 4‑Byte Kernel Bug That Lets Attackers Gain Root on Major Linux Distros


Microsoft Defender Security Research recently disclosed CVE-2026-31431, nicknamed “Copy Fail”: a high‑severity local privilege escalation in the Linux kernel’s crypto subsystem that lets an unprivileged user escalate to root. The vulnerability affects kernels released since 2017 and has broad implications for cloud and container environments, because the exploit corrupts the in-memory (page cache) copy of any readable file, including setuid binaries, without changing the file on disk. Detected exploitation has been limited so far, but public proof-of-concept code exists, and the combination of a reliable exploit and widespread exposure means defenders need to act quickly.

Vulnerability details

  • Vulnerability type: Local privilege escalation
  • Attack vector: Code execution as an unprivileged user
  • Prerequisites for exploitation: Local access to the machine as a non-privileged user
  • Brief technical explanation: A bug in the Linux kernel’s crypto subsystem lets an attacker corrupt the page cache copy of any readable file, including setuid binaries. An unprivileged user can carry out this corruption and use it to execute code with root privileges, escalating to root without authorization.

How the flaw works

The root cause traces to an in-place optimization in the AF_ALG userspace crypto API introduced in 2017. The kernel reuses source memory as the destination during certain cryptographic operations. By combining AF_ALG socket usage with the splice() system call and exploiting improper error handling during a failed copy operation, an attacker can force a controlled 4‑byte write into the kernel’s page cache for any readable file. Because page cache entries are shared between containers and the host, this in-memory corruption can change how privileged binaries behave at runtime—without touching the filesystem—allowing deterministic escalation to UID 0.
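For readers unfamiliar with AF_ALG, the sketch below shows the interface in its ordinary, intended use: an unprivileged process asks the kernel to compute a SHA-256 digest over a socket. It is not related to the exploit itself; it only illustrates the bind/accept/send/recv model that the paragraph above refers to. The function name sha256_via_af_alg is hypothetical, and the code assumes a Linux host with a Python build that exposes socket.AF_ALG.

```python
import hashlib
import socket


def sha256_via_af_alg(data: bytes):
    """Hash a small input with the kernel's sha256 through AF_ALG.

    Returns the 32-byte digest, or None when AF_ALG is not usable
    (non-Linux platform, missing kernel support, or a seccomp/LSM block).
    """
    if not hasattr(socket, "AF_ALG"):
        return None
    try:
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            # The address is (algorithm type, algorithm name).
            alg.bind(("hash", "sha256"))
            # accept() yields a per-operation socket for this transform.
            op, _ = alg.accept()
            with op:
                # A single send without MSG_MORE finalizes the hash, so this
                # sketch is only meant for small inputs.
                op.sendall(data)
                return op.recv(32)
    except OSError:
        return None


if __name__ == "__main__":
    digest = sha256_via_af_alg(b"copy fail")
    if digest is None:
        print("AF_ALG is not available to this process")
    else:
        print(digest.hex(), digest == hashlib.sha256(b"copy fail").digest())
```

The point for defenders is that none of these calls need elevated privileges, which is why blocking AF_ALG socket creation is a meaningful interim control.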

Technical analysis and attack chain

  • Reconnaissance: From any local foothold (a compromised container, CI runner, or user account), an attacker can determine the host kernel version and whether AF_ALG is available. Containers share the host kernel, so a vulnerable kernel on one node exposes all containers on that node.
  • Exploit delivery: The publicly demonstrated exploit is compact (on the order of hundreds of bytes) and relies only on standard kernel interfaces available to unprivileged processes—no compilation, network access, or kernel modules required—making it effective in restricted environments.
  • Memory corruption: The exploit leverages splice() and AF_ALG interactions to perform an in-place write to the page cache. The corruption is performed in kernel memory and bypasses conventional user‑space protections.
  • Privilege escalation: By corrupting in-memory structures or the runtime image of setuid binaries, the attacker can execute code as root. The exploit is deterministic and does not depend on timing or race conditions.
  • Post‑exploitation impact: Full root on the host enables container escape, lateral movement, and compromise of multi-tenant environments, and can neutralize LSMs (SELinux/AppArmor) and other local defenses.
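The reconnaissance step is equally useful to defenders as an exposure check. A minimal sketch, assuming a Python interpreter is available on the host or inside the container; the function name probe_af_alg and the report keys are hypothetical. It reports the kernel release (which, inside a container, is the host's kernel) and whether the current process may create an AF_ALG socket. A "reachable" result does not prove the kernel is vulnerable, since patched kernels still expose AF_ALG; it only shows that the interface has not been blocked.

```python
import errno
import platform
import socket


def probe_af_alg():
    """Report the kernel release and whether AF_ALG sockets can be created."""
    report = {"kernel_release": platform.release(), "af_alg": "unsupported"}
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        # Non-Linux platform or a Python build without AF_ALG support.
        return report
    try:
        # Creating and immediately closing the socket is enough for the probe.
        with socket.socket(family, socket.SOCK_SEQPACKET, 0):
            report["af_alg"] = "reachable"
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        report["af_alg"] = "blocked: " + name
    return report


if __name__ == "__main__":
    print(probe_af_alg())
```

Running this from a representative workload on each node gives a quick picture of which environments still expose the interface; the kernel release still has to be compared against the vendor's fixed versions.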

Attack scenarios and impact

Because the vulnerability requires only local code execution by an unprivileged account, its greatest operational risk is in environments where untrusted code runs as non-root users: CI systems, build runners, public cloud containers, and multi-tenant hosts. A single vulnerable kernel on a Kubernetes node can be weaponized from any container on that node, turning container RCE into host compromise. The CVSS score of 7.8 (High) reflects the severity and potential for systemic impact; successful exploitation undermines confidentiality, integrity, and availability.
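The 7.8 score can be reproduced from the CVSS v3.1 base-score formula. The sketch below assumes the vector AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H, which matches the description above (local access, low privileges, no user interaction, high impact on all three properties). The vector is an assumption, not something stated in this article, so check the NVD entry for the authoritative one. Only the scope-unchanged case of the formula is implemented.

```python
# CVSS v3.1 metric weights for the scope-unchanged case (FIRST specification).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """Compute the CVSS v3.1 base score for a scope-unchanged vector."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    # Assumed vector: AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
    print(base_score_scope_unchanged("L", "L", "L", "N", "H", "H", "H"))
```

The local attack vector is what keeps the score below the Critical band; the same impact with a network vector and no required privileges would score 9.8.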

Mitigation and protection guidance

Immediate actions (0–24 hours):

  • Inventory: Identify all systems running affected kernels (distributions and versions). Prioritize cloud nodes, build runners, and multi-tenant hosts.
  • Patch: Apply vendor patches immediately where available. Refer to NVD and vendor advisories (for example, Red Hat, Ubuntu, SUSE) for links and guidance.
  • Interim mitigations if patches are not yet available:
    • Disable the affected feature (block AF_ALG socket creation where possible).
    • Enforce network and management isolation for vulnerable workloads.
    • Harden access controls and reduce the number of accounts that can run untrusted code.
  • Response posture:
    • Treat any container RCE as potential host compromise and consider rapid node recycling after indicators of compromise.
    • Review logs for signs of exploitation and hunt for the known PoC indicators.
    • Apply least privilege in CI/CD and ensure ephemeral runners are properly isolated.
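One way to block AF_ALG socket creation for containers is a seccomp rule that rejects socket() calls whose first argument is AF_ALG. The sketch below generates such a rule in the JSON profile format used by Docker and OCI runtimes. It is an illustration under assumptions, not vendor guidance: the function names are hypothetical, the standalone profile allows every other syscall (so it would replace, not extend, a runtime's default profile), and architectures that route socket creation through socketcall(2) are not covered. In practice the rule would be merged into the profile already in use and tested.

```python
import json

# Linux address family number for AF_ALG (from <linux/socket.h>).
AF_ALG = 38


def deny_af_alg_rule():
    """Seccomp rule: fail socket() when the domain argument equals AF_ALG."""
    return {
        "names": ["socket"],
        "action": "SCMP_ACT_ERRNO",
        "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
    }


def minimal_profile():
    """Allow-by-default profile containing only the AF_ALG deny rule.

    Assumption: used for illustration only. A production profile should
    start from the runtime's default profile and add the rule to it.
    """
    return {"defaultAction": "SCMP_ACT_ALLOW", "syscalls": [deny_af_alg_rule()]}


if __name__ == "__main__":
    print(json.dumps(minimal_profile(), indent=2))
```

With Docker, a profile file is applied per container through the --security-opt seccomp=<file> option. On hosts where the AF_ALG code is built as loadable modules, preventing those modules from loading is another option, but that depends on the distribution's kernel configuration.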

Microsoft Defender coverage

Microsoft Defender XDR and related products provide detections that help identify exploitation attempts and vulnerable devices. Coverage includes antivirus signatures and behavioral alerts for known exploit variants and exploitation behavior.

Tactic: Execution
Observed activity: Exploitation of CVE-2026-31431
Microsoft Defender coverage:

  • Microsoft Defender Antivirus
    • Exploit:Linux/CopyFailExpDl.A
    • Exploit:Python/CopyFail.A
    • Exploit:Linux/CVE-2026-31431.A
    • Behavior:Linux/CVE-2026-31431
  • Microsoft Defender for Endpoint
    • Possible CVE-2026-31431 (“Copy Fail”) vulnerability exploitation
  • Microsoft Defender for Cloud
    • Potential exploitation of copy-fail vulnerability detected

Recommendations for defenders

  • Patch immediately where vendor fixes are available and validate kernel versions across fleets.
  • Block AF_ALG socket creation if patching is not yet feasible for some environments.
  • Harden CI/CD and container platforms: run untrusted workloads in highly constrained runtimes, minimize capability exposure, and rotate nodes.
  • Hunt proactively for the known exploit signatures and behavior and treat any detection of the PoC or post-exploitation activity as an incident with potential host compromise.
  • Apply zero-trust principles for runtime workloads and consider automation to replace or rebuild nodes suspected of compromise.
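Because the corruption lives in the page cache, an ordinary read of a setuid binary returns the cached (possibly tampered) bytes, so hashing those bytes and comparing them with a trusted baseline is one possible hunting aid. The sketch below is a minimal illustration with hypothetical function names. It assumes the baseline comes from a trusted source, such as a clean host with the same package versions, and it cannot be relied on once an attacker already holds root on the machine being checked.

```python
import hashlib
import os
import stat


def hash_setuid_files(root):
    """Map each setuid regular file under root to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
                if not (stat.S_ISREG(mode) and mode & stat.S_ISUID):
                    continue
                sha = hashlib.sha256()
                # A normal read is served from the page cache when possible.
                with open(path, "rb") as handle:
                    for chunk in iter(lambda: handle.read(1 << 20), b""):
                        sha.update(chunk)
                digests[path] = sha.hexdigest()
            except OSError:
                continue  # unreadable or vanished file; skip it
    return digests


def changed_since(baseline, current):
    """Return paths present in both maps whose digests differ."""
    return sorted(
        path for path, digest in current.items()
        if path in baseline and baseline[path] != digest
    )


if __name__ == "__main__":
    for path, digest in sorted(hash_setuid_files("/usr/bin").items()):
        print(digest, path)
```

A mismatch is a lead rather than proof, since package updates also change digests; correlate it with patch history before treating it as an incident.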

Closing thoughts

Copy Fail is notable for its simplicity, determinism, and wide reach across distributions and cloud workloads. Its ability to corrupt in-memory images without touching disk makes detection and containment harder. Fast patching, robust isolation of untrusted workloads, and vigilant hunting will be the most effective immediate defenses while vendors and defenders continue to refine protections.
