Linux Kernel NFSv4 LOCK Replay Heap Overflow
TL;DR
- CVE-2026-31402 is a heap overflow in the Linux kernel `nfsd` NFSv4.0 LOCK replay cache that causes slab out-of-bounds writes in kernel heap memory.
- An unauthenticated attacker can trigger it remotely with two cooperating NFSv4.0 clients by provoking a LOCK denial response that overflows a fixed 112-byte replay buffer.
- Severity is high, with CVSS 3.1 8.2 and CVSS 4.0 8.8 from SUSE, and upstream fixes have landed in kernels 6.1.167, 6.6.130, 6.12.78, 6.18.20, 6.19.10, and 7.0-rc5, with distribution backports in progress.
- I recommend treating any NFSv4.0 server on a vulnerable kernel as high priority: lock down NFS exposure and move to vendor kernels that include the replay cache bounds check as soon as possible.
Vulnerability Summary
| Field | Value |
|---|---|
| CVE ID | CVE-2026-31402 |
| CVSS Score | 8.2 (CVSS 3.1, SUSE) / 8.8 (CVSS 4.0, SUSE) |
| CVSS Vector | `CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:L/A:H`, `CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:L/VA:H/SC:N/SI:N/SA:N` |
| CVSS Version | 3.1, 4.0 |
| CVSS Source | SUSE (NVD currently AWAITING ANALYSIS) |
| Vulnerability Type | Heap overflow, slab out-of-bounds write in Linux kernel `nfsd` NFSv4.0 LOCK replay cache |
| Affected Software | Linux kernel NFS server `nfsd` implementing NFSv4.0 |
| Affected Versions | Upstream 2.6.12 up to, but not including, fixed releases in the 6.1, 6.6, 6.12, 6.18, 6.19, and 7.0-rc series |
| Patch Status | Fixed in upstream and stable trees, with vendor kernels (Debian, Ubuntu, SUSE, Amazon Linux, and others) rolling out updates |
| PoC Publicly Available | Not disclosed in upstream records or vendor advisories at time of writing |
Context
When I look at CVE-2026-31402, I see a classic protocol edge case that sat
in plain sight for years. The NFSv4 replay cache made a reasonable sizing assumption
based on OPEN responses, then quietly inherited a completely different
response shape from
LOCK
denials that included a much larger opaque field. Nobody updated the buffer size or
added a guard, so the gap between those two assumptions became a heap overflow.
In practice, any Linux system exporting NFSv4.0 through
nfsd
on a vulnerable kernel gives an unauthenticated attacker a clean path to corrupt kernel
heap memory in a predictable slab cache. Even if we never see a public
PoC
that turns this into reliable remote code execution, a remotely triggerable kernel panic
on file servers and storage nodes is already enough to treat this as a serious
availability problem.
The organizations that need to care most about this bug are the ones that quietly depend on NFS every day: mixed Linux estates in data centers, cloud environments that mount shared volumes over NFSv4.0, and legacy workloads that still speak NFSv4.0 because nobody wants to touch them. If you run fleets of Linux servers where NFS is part of the plumbing between application tiers, this vulnerability is your problem even if you do not think of yourself as a kernel engineer.
Technical Detail
Replay cache design and the bad assumption
The vulnerable code lives in the NFSv4 replay cache inside
nfsd. The replay cache exists to support idempotent semantics by caching
encoded responses for certain operations so that retransmitted requests can be answered
with exactly the same reply. To do that, each NFSv4 state owner structure embeds an
inline buffer
rp_ibuf
sized by the constant
NFSD4_REPLAY_ISIZE.
Historically, NFSD4_REPLAY_ISIZE was set to
112
bytes. The comment in fs/nfsd/state.h
spelled out the calculation:
4(status) + 8(stateid) + 20(changeinfo) + 4(rflags) + 8(verifier) + 4(deleg. type) +
8(deleg. stateid) + 4(deleg. recall flag) + 20(deleg. space limit) + ~32(deleg. ace) =
112 bytes. That is a decent approximation for an OPEN response with delegation, and
for years it seemed good enough.
The problem is that LOCK denial responses can be much larger. When a
LOCK
request is denied due to a conflict, the NFSv4.0 specification allows the server to
return information about the conflicting lock, including the lock owner as an opaque
field up to
NFS4_OPAQUE_LIMIT, which is 1024 bytes. That means a denial
response can easily exceed the 112-byte inline buffer by hundreds of bytes,
and the replay cache never accounted for that.
Vulnerable code path in nfsd4_encode_operation
The overflow happens in nfsd4_encode_operation in
fs/nfsd/nfs4xdr.c. This function encodes an NFSv4 operation into XDR, then
populates the replay cache with the response for potential reuse. The function computes
a len for the encoded payload, stores the operation status in
so->so_replay.rp_status, and then copies
len
bytes from the XDR buffer into
so->so_replay.rp_buf
using
read_bytes_from_xdr_buf.
Comparing the vulnerable version against the fixed diff, the logic looked like this:
```diff
@@ -5934,9 +5934,14 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
 	int len = xdr->buf->len - (op_status_offset + XDR_UNIT);

 	so->so_replay.rp_status = op->status;
-	so->so_replay.rp_buflen = len;
-	read_bytes_from_xdr_buf(xdr->buf, op_status_offset + XDR_UNIT,
-			so->so_replay.rp_buf, len);
+	if (len <= NFSD4_REPLAY_ISIZE) {
+		so->so_replay.rp_buflen = len;
+		read_bytes_from_xdr_buf(xdr->buf,
+				op_status_offset + XDR_UNIT,
+				so->so_replay.rp_buf, len);
+	} else {
+		so->so_replay.rp_buflen = 0;
+	}
 }
```
Before the patch, there was no
len <= NFSD4_REPLAY_ISIZE
check. The code unconditionally set rp_buflen = len and copied
len
bytes into rp_ibuf, even though the buffer was only 112 bytes
long. For a large LOCK denial response with a 1024-byte owner string, this
resulted in a slab out-of-bounds write of up to 944 bytes into adjacent slab
objects.
The fix is intentionally conservative. If the encoded reply length is less than or equal
to NFSD4_REPLAY_ISIZE, the function caches it as before. If it is larger,
the code sets
rp_buflen = 0
and skips the payload copy. The replay cache still records rp_status, which
is enough to satisfy the important semantic requirement that repeat requests see the
same status.
Updated commentary on
NFSD4_REPLAY_ISIZE
The patch also updates the comment in fs/nfsd/state.h
for NFSD4_REPLAY_ISIZE to document the behavior more clearly:
```diff
-/* A reasonable value for REPLAY_ISIZE was estimated as follows:
- * The OPEN response, typically the largest, requires
- * 4(status) + 8(stateid) + 20(changeinfo) + 4(rflags) + 8(verifier) +
- * 4(deleg. type) + 8(deleg. stateid) + 4(deleg. recall flag) +
- * 20(deleg. space limit) + ~32(deleg. ace) = 112 bytes
+/*
+ * REPLAY_ISIZE is sized for an OPEN response with delegation:
+ * 4(status) + 8(stateid) + 20(changeinfo) + 4(rflags) +
+ * 8(verifier) + 4(deleg. type) + 8(deleg. stateid) +
+ * 4(deleg. recall flag) + 20(deleg. space limit) +
+ * ~32(deleg. ace) = 112 bytes
+ *
+ * Some responses can exceed this. A LOCK denial includes the conflicting
+ * lock owner, which can be up to 1024 bytes (NFS4_OPAQUE_LIMIT). Responses
+ * larger than REPLAY_ISIZE are not cached in rp_ibuf; only rp_status is
+ * saved. Enlarging this constant increases the size of every
+ * nfs4_stateowner.
  */
```
I like this kind of comment because it documents the design tradeoff in code. The replay
buffer is intentionally sized for a common case,
OPEN
with delegation, and the comment explicitly calls out that some responses, such as
LOCK
denials with large owners, skip payload caching to avoid bloating every
nfs4_stateowner.
Trigger conditions and attacker workflow
To turn this into a working exploit primitive, an attacker needs two cooperating NFSv4.0
clients that can reach the same nfsd
server. The high-level workflow looks like this.
First, client A takes a LOCK on a file region with an
oversized lock owner string, up to the 1024
byte NFS4_OPAQUE_LIMIT. Second, client B
issues a conflicting LOCK on the same region, which causes the server to
deny the request and include the conflicting lock information from client
A
in the denial reply. Third, the server encodes that denial into XDR. Because the owner
string is large, the encoded response length now significantly exceeds
NFSD4_REPLAY_ISIZE. Finally, the vulnerable
nfsd4_encode_operation
path copies `len` bytes into a 112-byte buffer and corrupts
whatever objects happen to be adjacent in the slab.
From there, the impact depends on allocator layout, hardening options such as
KASAN
or hardened slab allocators, and the attacker’s ability to shape which objects share the
slab. A determined attacker might be able to turn this into more than a crash, but even
a reliable remotely triggerable panic is enough to justify patching aggressively.
Mitigation
Reduce NFSv4.0 exposure
If I could not patch immediately, the first step I would take is to treat NFSv4.0 as a
high risk service and put it behind strict network boundaries. That starts with making
sure port 2049 for
nfsd
is never directly exposed to the public internet. Only networks where you have
legitimate NFS clients should be able to reach it, and those paths should be controlled
through firewalls or cloud security groups.
On top of that, I would review which hosts actually need to export NFSv4.0. In many
environments, NFS server capabilities live in base images or old automation, and there
are nodes that run nfsd
simply because nobody ever disabled it. Turning NFS server components off on those
systems removes them from the attack surface completely.
Review protocol versions and deployment patterns
If your environment and clients allow it, you can also look at how you use NFSv4.0 versus newer protocol flavors. For some clusters, it might be possible to prefer NFSv4.1 or later, or to restrict the set of protocol versions that clients negotiate. I would not rely on this alone as a mitigation, because protocol changes have a habit of surfacing unpleasant edge cases in legacy workloads. It is still a lever you can use if you have room to maneuver.
At the same time, I would take this as an opportunity to rationalize where NFS is used. Over time, NFS often becomes an invisible dependency that lives under many tiers of an application stack. Mapping which applications depend on which exports helps you prioritize which servers to patch and which network segments deserve the most attention.
Monitor for suspicious locking behavior
Even before you roll out kernel updates, you can start watching for patterns that look like someone trying to poke this bug.
I would focus on NFS clients that repeatedly send LOCK
requests with very large owner strings, clients that seem to get a lot of
LOCK
denials against the same file or region, and kernel logs that show slab corruption,
KASAN
reports, or
nfsd
crashes. None of these are perfect signatures, but they are reasonable starting points
for detection while you are still in the mitigation phase.
Remediation
Upstream fixed versions and commits
Upstream, the Linux CNA record for
CVE-2026-31402
lists several commits that carry the fix into different stable trees. The key ones
include
c9452c0797c95cf2378170df96cf4f4b3bca7eff,
8afb437ea1f70cacb4bbdf11771fb5c4d720b965,
dad0c3c0a8e5d1d6eb0fc455694ce3e25e6c57d0,
0f0e2a54a31a7f9ad2915db99156114872317388,
ae8498337dfdfda71bdd0b807c9a23a126011d76, and
5133b61aaf437e5f25b1b396b14242a6bb0508e2. All of them implement the
len <= NFSD4_REPLAY_ISIZE
guard and the documentation improvements in the NFSv4 replay cache.
The same record also translates those commits into semantic version ranges. Kernels at
or above 6.1.167, 6.6.130,
6.12.78, 6.18.20, 6.19.10, and
7.0-rc5
are marked unaffected in their respective branches. If you track upstream or stable
directly, those are the minimum versions you want to be on.
Vendor kernels: Debian, Ubuntu, SUSE, Amazon Linux
Most people do not run mainline kernels on production NFS servers, so the real story is in vendor kernels. Each major distribution is handling this in its own way, but the pattern is consistent.
On Debian, the security tracker and OSV
entry for DEBIAN-CVE-2026-31402 map the issue to the
linux
source package and list which versions are affected or fixed per release. On
Ubuntu, the Charmhub security entry for CVE-2026-31402 shows
each supported series and kernel flavor with a status field that tells you whether the
fix has shipped. On SUSE, the CVE page enumerates
SUSE Linux Enterprise
and openSUSE products across flavors like kernel-default,
kernel-rt, and kernel-source and ties each to a
CVSS
score and fix state. On Amazon Linux, the Security Center entry for
CVE-2026-31402
highlights
Amazon Linux 2
and Amazon Linux 2023 kernels and marks many as
Pending Fix
until the relevant
ALAS
bulletins go out.
My remediation strategy in each environment is simple. Identify which kernel package backs your NFS servers, check its version against the vendor advisory for your release, and schedule an update window to roll out the fixed kernel as soon as it is available and tested.
Unpatched and end-of-life systems
There is one more uncomfortable detail in the Tenable plugin for
CVE-2026-31402, which calls out “Linux distros unpatched” for this
vulnerability. Some systems will never receive a fix because the distribution or product
is beyond its support window. On those hosts, you can stack mitigations like network
isolation, but you cannot wait for a vendor patch that is never coming.
If you are stuck on an end-of-life platform, your choices are limited. You can backport the upstream patch into a custom kernel and take responsibility for maintaining it. Or you can migrate the workload to a supported platform where the vendor ships kernels that already contain the fix. For anything that faces untrusted NFS clients, I would strongly lean toward migration.
Verification
Confirm that your kernel is in range
Before you can decide what to patch, you need to know exactly which kernels your NFS servers are running. In practice, I start with the usual check:
```shell
uname -r
```
On rpm based systems like RHEL,
CentOS, SUSE, and Amazon Linux, I also query the
installed kernel package:
```shell
rpm -q kernel
```
On Debian and Ubuntu, I look at the installed
linux-image
packages:
```shell
dpkg -l | grep linux-image
```
With those versions in hand, you can line them up against your vendor’s advisories and decide which hosts are below the fixed builds and need to be moved.
Check that the fix is present
If you have matching kernel sources or debug packages on disk, you can also verify that
the replay cache patch is present directly. On systems that keep sources under
/usr/src, I search for the
len <= NFSD4_REPLAY_ISIZE
guard:
```shell
grep -R "len <= NFSD4_REPLAY_ISIZE" -n /usr/src/linux-*
```
If that string appears in the nfs4xdr.c file for the kernel you are
actually running, you know that the bounds check has been applied.
Watch for residual signs of trouble
After patching, I like to keep an eye on kernel logs and monitoring data while NFSv4.0
servers are under real workload. I watch for new kernel panics or
oops
reports that mention
nfsd, slab corruption or KASAN reports involving NFS slabs,
and NFS client complaints that might indicate unexpected behavior changes around
locking.
If the environment stays quiet under normal load, that is a good sign that you have both removed the vulnerability and preserved NFS stability.
References
- CVE-2026-31402 on cve.org
- NIST NVD – CVE-2026-31402 Detail
- kernel.org – nfsd: fix heap overflow in NFSv4.0 LOCK replay cache (dad0c3c0a8e5d1d6eb0fc455694ce3e25e6c57d0 and related stable backports)
- Amazon Linux Security Center – CVE-2026-31402
- SUSE – CVE-2026-31402 Common Vulnerabilities and Exposures
- Debian Security Bug Tracker – CVE-2026-31402
- Ubuntu / Charmhub Security – CVE-2026-31402
- Tenable Nessus – Linux Distros Unpatched Vulnerability: CVE-2026-31402
- SentinelOne – CVE-2026-31402: Linux Kernel Buffer Overflow Vulnerability
- Feedly – CVE-2026-31402 – Exploits & Severity
- Snyk – Incorrect Calculation of Buffer Size in kernel-devel | CVE-2026-31402