Cloudflare pushes limits of Linux networking stack

Cloudflare engineers have encountered significant challenges in expanding their use of soft-unicast functionality within the Linux networking stack, driven by complex routing and anycast configurations for redundancy. Attempts to bypass limitations using advanced socket options ultimately led back to a simpler proxy solution. The experience highlights the difficulties in customizing Linux for high-scale networking demands.

Cloudflare's network infrastructure relies on intricate routing and configurations that test the boundaries of the Linux networking stack. As detailed in a recent blog post by engineer Chris Branch, the company sought to expand its soft-unicast capabilities, which complement its heavy use of anycast for redundancy across external networks.

The core problem arose from the way the Netfilter connection tracking module, known as conntrack, and the Linux socket subsystem interact when packets are rewritten. Soft-unicast requires multiple processes to recognize the same connection, but the design of the Linux stack prevented the packets from being rewritten effectively. Initially, the team implemented a local proxy to handle this, though it introduced performance overhead.

To remove that overhead, engineers explored abusing the TCP_REPAIR socket option, which is typically used to migrate live network connections, for example when moving virtual machines or containers between hosts. The option let them fully describe the state of a connection and "repair" a socket into that state without a handshake taking place. They paired it with TCP Fast Open (TFO), using a TFO cookie to bypass the standard handshake. Even so, some issues remained, and an early demux mechanism was proposed as a partial fix.
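As a rough illustration of what "repairing" a socket involves, the following Python sketch places a TCP socket directly into the established state with caller-supplied sequence numbers. It is not Cloudflare's code. The helper name `repaired_socket`, the addresses, and the sequence numbers are hypothetical, and the numeric constants are copied from `<linux/tcp.h>` because Python's `socket` module does not export them. The sketch assumes Linux and the CAP_NET_ADMIN capability, and it leaves out TCP Fast Open as well as the restoration of window sizes and TCP options.

```python
import socket
import struct

# Values from <linux/tcp.h>; Python's socket module does not export them.
TCP_REPAIR = 19
TCP_REPAIR_QUEUE = 20
TCP_QUEUE_SEQ = 21
TCP_RECV_QUEUE = 1
TCP_SEND_QUEUE = 2


def repaired_socket(local, remote, snd_seq, rcv_seq):
    """Return a TCP socket forced into the established state, or None if
    the kernel refuses repair mode (no CAP_NET_ADMIN, or not Linux)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Enter repair mode; this is the privileged step.
        sock.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 1)
    except OSError:
        sock.close()
        return None

    # Sequence numbers are unsigned 32-bit values, so pack them explicitly.
    for queue, seq in ((TCP_SEND_QUEUE, snd_seq), (TCP_RECV_QUEUE, rcv_seq)):
        sock.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR_QUEUE, queue)
        sock.setsockopt(socket.IPPROTO_TCP, TCP_QUEUE_SEQ,
                        struct.pack("I", seq))

    sock.bind(local)
    sock.connect(remote)  # in repair mode no SYN is sent
    # Leave repair mode so the socket behaves like a normal connection.
    sock.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 0)
    return sock
```

A caller would need to supply sequence numbers that match what the remote peer expects. Otherwise the peer discards or resets the connection as soon as real traffic flows.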

In the end, the complexity proved too high. The team opted for the more straightforward local proxy approach, which terminates TCP connections and redirects traffic to a local socket. This decision underscores that fully escaping the Linux networking stack remains a formidable challenge, even for a company like Cloudflare at the forefront of internet infrastructure.
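The local proxy pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not Cloudflare's implementation: the function names `pipe`, `handle`, and `serve_proxy` are hypothetical, and a production proxy would use non-blocking I/O and proper error handling rather than a thread per connection. The proxy terminates each incoming TCP connection itself, opens a second connection to a local backend socket, and copies bytes in both directions.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until EOF, then half-close dst."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, backend_addr):
    """Terminate one client connection and relay it to the backend."""
    backend = socket.create_connection(backend_addr)
    upstream = threading.Thread(target=pipe, args=(client, backend),
                                daemon=True)
    upstream.start()
    pipe(backend, client)
    upstream.join()
    client.close()
    backend.close()


def serve_proxy(listen_addr, backend_addr):
    """Start accepting connections in the background; return the listener."""
    listener = socket.create_server(listen_addr)

    def accept_loop():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(target=handle, args=(client, backend_addr),
                             daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

Because the proxy owns both connections, the kernel sees two ordinary sockets and no connection state has to be forged. The price is an extra copy of every byte through user space, which is the overhead the article mentions.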

Related articles


Cloudflare outage disrupts X and ChatGPT access


Cloudflare suffered a major outage on November 18, 2025, rendering millions of websites worldwide, including X and ChatGPT, inaccessible for about three hours. The company confirmed the issue stemmed from an old bug triggered by a routine configuration change, not a cyber attack. Cloudflare apologized for the global impact on customers.

Phoronix has reported on updated Linux patches aimed at managing out-of-memory behavior through BPF technology. These developments focus on improving how the Linux kernel handles memory shortages. The updates are part of ongoing efforts in open-source Linux advancements.


A critical vulnerability in React Server Components, known as React2Shell and tracked as CVE-2025-55182, is being actively exploited to deploy a new Linux backdoor called PeerBlight. This malware turns compromised servers into covert proxy and command-and-control nodes. Attackers use a single crafted HTTP request to execute arbitrary code on vulnerable Next.js and React applications.

Cisco Talos has reported a China-linked threat actor known as UAT-7290 that has been spying on telecommunications companies since 2022. The group uses Linux malware, exploits on edge devices, and ORB infrastructure to maintain access to targeted networks.


Cyber threat actors in Operation Zero Disco have exploited a vulnerability in Cisco's SNMP service to install persistent Linux rootkits on network devices. The campaign targets older Cisco switches and uses crafted packets to achieve remote code execution. Trend Micro researchers disclosed the attacks on October 16, 2025, highlighting risks to unpatched systems.

Linux systems face significant risks from unpatched vulnerabilities, challenging the notion of their inherent security. Experts emphasize the need for automated patch management to protect open-source enterprises effectively.


Developers have resolved a performance regression in the Linux kernel 6.19's Slab allocator, which slowed module loading due to NUMA policy alterations. The issue, identified through benchmarking, affected memory management efficiency on high-core systems. The fix restores proper allocation behavior and has been merged into the mainline kernel.

 

 

 
