Ops Notes

Linux Latency Measurements and Compositor Tuning: From 120ms to 8ms


The Night I Almost Gave Up on Linux Gaming

Let’s be honest: Linux desktop is still far from “plug and play.” Especially if you’re into fast-paced online shooters.

The Reddit communities r/LinuxUncensored and r/linux_gaming have been on fire about this. User @anestling straight-up said: “Linux has a long way to go if you’re interested in fast-paced online shooters.” Another thread with 38 upvotes was entirely about compositor nonsense.

My test results? Default KDE Plasma with KWin compositor adds 120ms of input lag. Yeah, you read that right. 120 milliseconds. In a CS2 duel, that’s enough to die three times over.

Step 1: Measure Before You Tweak

No measurement, no optimization. Don’t start messing with kernel parameters until you know where your latency is coming from.

My Production Measurement Stack

| Tool | Purpose | Install (Ubuntu/Debian) |
|---|---|---|
| latencytop | Kernel-level latency tracing | sudo apt install latencytop |
| ftrace | Function-level tracing for scheduler latency | Kernel built-in, needs debugfs |
| perf | Hardware counters + sampling | sudo apt install linux-tools-common linux-tools-generic (Ubuntu) or linux-perf (Debian) |
| glxgears + vblank_mode=0 | Compositor frame time test | sudo apt install mesa-utils |
| gamemode | Feral Interactive’s auto-tuning | sudo apt install gamemode |
| mangohud | In-game real-time latency overlay | sudo apt install mangohud |

Key commands:

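# Uncapped frame rate with V-Sync off (a rough compositor throughput check, not a direct latency reading;
# vblank_mode is Mesa-only, the NVIDIA driver uses __GL_SYNC_TO_VBLANK=0)
vblank_mode=0 glxgears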

# Real-time scheduler latency tracing
sudo latencytop

# Sample scheduler latency with perf
sudo perf sched record -- sleep 10
sudo perf sched latency
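
The table lists ftrace without a command, so here is a minimal sketch of one way to use it: the "wakeup" tracer records the worst-case time between a task being woken and actually running. It assumes tracefs is mounted at /sys/kernel/tracing (older systems use /sys/kernel/debug/tracing), a kernel built with CONFIG_SCHED_TRACER, and a root shell. The function name measure_wakeup_latency is mine, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch: worst-case wakeup latency via ftrace's "wakeup" tracer.
# TRACEFS is an assumption; override it if your tracefs lives elsewhere.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"

# Run the wakeup tracer for N seconds and print the maximum latency in microseconds.
measure_wakeup_latency() {
    local seconds="${1:-10}"
    echo 0      > "$TRACEFS/tracing_max_latency"   # reset the recorded maximum
    echo wakeup > "$TRACEFS/current_tracer"        # track wakeup-to-run latency
    echo 1      > "$TRACEFS/tracing_on"
    sleep "$seconds"
    echo 0      > "$TRACEFS/tracing_on"
    cat "$TRACEFS/tracing_max_latency"             # value is in microseconds
    echo nop    > "$TRACEFS/current_tracer"        # restore the default tracer
}
```

Source the file as root and call measure_wakeup_latency 10 while the game is running. A single-digit or low double-digit number of microseconds suggests the scheduler is not your bottleneck.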

Step 2: The Compositor — Your Biggest Latency Black Hole

The compositor is the core of the Linux desktop. It merges all windows into a single frame for the display. On X11 it runs as a client alongside the X server; on Wayland the compositor itself is the display server.

The problem: Default compositor configs are designed for power saving and stability, not low latency. KWin introduces 1-2 frames of buffering by default — that’s 16-33ms of extra latency on a 60Hz monitor. Worse, compositors force V-Sync on, which locks your frame rate.
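
The buffering arithmetic above is just frames divided by refresh rate. A quick sketch (the helper name buffer_latency_ms is made up for illustration) shows how the penalty shrinks on a faster monitor:

```shell
# Added latency in ms for N buffered frames at a given refresh rate (Hz).
buffer_latency_ms() {
    local hz="$1" frames="$2"
    awk -v hz="$hz" -v n="$frames" 'BEGIN { printf "%.1f\n", n * 1000 / hz }'
}

buffer_latency_ms 60 1     # 16.7
buffer_latency_ms 60 2     # 33.3
buffer_latency_ms 144 2    # 13.9
```

So the same two-frame queue that costs about 33ms at 60Hz costs about 14ms at 144Hz, which is why a high-refresh panel hides some compositor overhead without removing it.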

Community Consensus: Just Kill It

Reddit user @farnoy’s thread had the top comment: “The compositor (very important) still provides the least latency of any environment, but only if you turn it off.”

For X11 users:

# Suspend KWin compositor
qdbus org.kde.KWin /Compositor suspend

# Use compton/picom as a lightweight alternative
picom --backend glx --vsync --use-damage

For Wayland users: It’s more complicated. Wayland forces the compositor to run. You can’t kill it. But you can tune its behavior:

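# KDE Plasma on Wayland: cap the frame rate and request the lowest-latency policy
# (Plasma 6 ships kwriteconfig6; key names have changed across KWin releases, so check them against your own kwinrc)
kwriteconfig5 --file kwinrc --group Compositing --key "MaxFps" "144"
kwriteconfig5 --file kwinrc --group Compositing --key "RefreshRate" "144000"
kwriteconfig5 --file kwinrc --group Compositing --key "LatencyPolicy" "ExtremelyLow"

# Restart KWin (run only the line that matches your session)
kwin_x11 --replace &  # X11
kwin_wayland --replace &  # Wayland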

Real-world results: After killing the KWin compositor, my input lag dropped from 120ms to 18ms. Pair that with gamescope (Valve’s micro-compositor), and you can push it under 8ms.
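
For reference, a gamescope launch line looks roughly like the sketch below. The flags are gamescope's own (-W/-H output size, -r refresh rate, -f fullscreen); the wrapper function gamescope_cmd and the 2560x1440 at 144Hz values are placeholders for illustration, not my exact setup.

```shell
# Build a gamescope command line for a given output size and refresh rate.
# Everything after the third argument is the program gamescope should run.
gamescope_cmd() {
    local width="$1" height="$2" hz="$3"
    shift 3
    printf 'gamescope -W %s -H %s -r %s -f -- %s\n' "$width" "$height" "$hz" "$*"
}

# Print a Steam launch option (%command% is Steam's placeholder for the game binary).
gamescope_cmd 2560 1440 144 %command%
```

Paste the printed line into the game's Steam launch options, or replace %command% with a binary such as glxgears to try it from a terminal first.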

Step 3: Kernel Tuning — PREEMPT_RT Isn’t the Silver Bullet

A lot of people jump straight to PREEMPT_RT when they hear “low latency.” For gaming and desktop use, it can actually introduce higher total latency.

Why? PREEMPT_RT is designed for hard real-time. It increases context switch overhead. For soft real-time workloads like games, a standard CONFIG_PREEMPT kernel with proper CPU isolation and IRQ affinity works better.

My Production Tuning Checklist

# /etc/sysctl.d/99-latency.conf

# Reduce VM latency
vm.swappiness=10
vm.dirty_ratio=10
vm.dirty_background_ratio=5

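# Scheduler (CFS) tunables. Kernels 5.13+ expose the first three under /sys/kernel/debug/sched/
# instead of sysctl, and 6.6+ (EEVDF) drops the two granularity knobs.
# Larger values favor throughput; smaller values favor wakeup latency, so measure both directions.
kernel.sched_min_granularity_ns=10000000
kernel.sched_wakeup_granularity_ns=15000000
kernel.sched_migration_cost_ns=5000000
kernel.sched_autogroup_enabled=0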

# Network socket buffers (these mainly affect throughput rather than latency)
net.core.rmem_default=262144
net.core.wmem_default=262144
net.ipv4.tcp_rmem=4096 87380 33554432
net.ipv4.tcp_wmem=4096 65536 33554432
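
Before trusting a sysctl file, it helps to check which scheduler knobs your running kernel actually exposes, since they have moved between kernel versions. The sketch below assumes the usual locations (/proc/sys/kernel for sysctl, /sys/kernel/debug/sched for debugfs, which needs root to read); the function name find_sched_knob is hypothetical.

```shell
# Report where a CFS tunable lives on the running kernel.
# Both root paths are parameters so the check can be pointed at another tree.
find_sched_knob() {
    local name="$1"
    local procfs="${2:-/proc/sys/kernel}"
    local debugfs="${3:-/sys/kernel/debug/sched}"
    if [ -e "$procfs/sched_$name" ]; then
        echo "sysctl kernel.sched_$name"
    elif [ -e "$debugfs/$name" ]; then
        echo "debugfs $debugfs/$name"
    else
        echo "missing"
    fi
}

for knob in min_granularity_ns wakeup_granularity_ns migration_cost_ns; do
    printf '%-24s %s\n' "$knob" "$(find_sched_knob "$knob")"
done
```

Anything reported as "missing" or "debugfs" will not be applied from /etc/sysctl.d, and sysctl will only log an error for it at boot.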

CPU Isolation (isolcpus)

If you have 8+ cores, isolate 2-3 cores for your game:

# /etc/default/grub (run sudo update-grub and reboot afterwards)
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3"

# Then pin the game process to those cores (Steam launch option)
taskset -c 2,3 %command%

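Warning: nohz_full and rcu_nocbs significantly reduce tick interrupts and RCU callback jitter, but don’t isolate all cores, or system services will starve. Also note that isolcpus takes those cores out of the scheduler’s load balancing, so threads pinned to cores 2,3 are not spread across them automatically; check per-thread placement with ps -L -o psr -p <pid>.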
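
After rebooting, it is worth confirming that the isolation actually took effect. This is a minimal sketch that reads the kernel's own report from /sys/devices/system/cpu/isolated and expands it into individual CPU numbers; the helper name expand_cpulist is mine.

```shell
# Expand a kernel CPU list such as "2-3,6" into one CPU number per line.
expand_cpulist() {
    local list="$1" part start end
    local IFS=','
    for part in $list; do
        start="${part%-*}"    # text before the dash (or the whole entry)
        end="${part#*-}"      # text after the dash (or the whole entry)
        seq "$start" "$end"
    done
}

# The kernel reports isolated CPUs here; the file is empty when nothing is isolated.
isolated="$(cat /sys/devices/system/cpu/isolated 2>/dev/null || true)"
echo "isolated CPUs: ${isolated:-none}"
expand_cpulist "$isolated"
```

If the first line prints "none", the isolcpus parameter never reached the kernel, usually because update-grub was skipped.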

Step 4: Benchmark Comparison — From 120ms to 8ms

| Configuration | Input Lag (ms) | Frame Time Jitter (ms) | Notes |
|---|---|---|---|
| Default KWin + V-Sync | 120 | ±5 | Unacceptable |
| Compositor Off | 18 | ±2 | Playable |
| Compositor Off + gamescope | 8 | ±1 | Smooth |
| Compositor Off + gamescope + CPU isolation | 6 | ±0.5 | Near-native |
| Compositor Off + gamescope + CPU isolation + PREEMPT_RT | 7 | ±1.2 | Worse |

Conclusion: For gaming, PREEMPT_RT isn’t the answer. Compositor tuning + CPU isolation + gamescope is the winning combo.
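
If you want to produce a jitter column like this for your own system, one reasonable definition is the standard deviation of logged frame times. The sketch below assumes that definition (population standard deviation) and a plain list of frame times in milliseconds on stdin, for example a column cut from a MangoHud log; frametime_stats is a made-up helper name.

```shell
# Mean and standard deviation of frame times (ms), one value per line on stdin.
frametime_stats() {
    awk '{ sum += $1; sumsq += $1 * $1; n++ }
         END {
             if (n == 0) exit 1
             mean = sum / n
             var = sumsq / n - mean * mean
             if (var < 0) var = 0          # guard against rounding error
             printf "mean=%.2f jitter=%.2f\n", mean, sqrt(var)
         }'
}

printf '%s\n' 8 9 7 8 | frametime_stats    # mean=8.00 jitter=0.71
```

Run it on the same scene and the same duration for every configuration, otherwise the numbers are not comparable.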

Step 5: FAQ (From Community Questions)

Q: Can I turn off the compositor on Wayland?

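A: No. The Wayland protocol mandates a compositor. But you can run your game inside gamescope, a micro-compositor. Launched from a Wayland desktop, gamescope runs nested, so its output still passes through the desktop compositor. Run as the session compositor, which is what Valve does in the Steam Deck’s Gaming Mode, it replaces the desktop compositor entirely.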

Q: Why do I get screen tearing when I turn off V-Sync?

A: It’s a trade-off. Either accept tearing for lower latency, or enable V-Sync and live with at least one frame of lag (16.7ms at 60Hz). My rule: competitive games (CS2, Valorant) get V-Sync off, single-player AAA titles get V-Sync on.

Q: latencytop shows high sched_wakeup latency. What do I do?

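A: This is usually caused by C-state transitions. Keep the CPU out of deep idle states:

# Disable idle states by exit latency. -D takes a threshold in microseconds, not a C-state number:
# this disables every state whose latency is 1 µs or more (list yours with cpupower idle-info)
sudo cpupower idle-set -D 1

# Or cap the C-state at boot: add to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub (intel_idle applies to Intel CPUs)
intel_idle.max_cstate=1 processor.max_cstate=1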
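
To see what a given latency threshold would actually switch off, you can read the idle states straight from sysfs. This sketch assumes the standard cpuidle layout under /sys/devices/system/cpu/cpu0/cpuidle and mirrors the "latency equal or higher than the threshold" rule from the cpupower man page; list_idle_states is a hypothetical helper.

```shell
# List idle states for one CPU and mark those a latency threshold would disable.
# threshold is in microseconds; cpuidle_dir defaults to cpu0's sysfs directory.
list_idle_states() {
    local threshold="$1"
    local cpuidle_dir="${2:-/sys/devices/system/cpu/cpu0/cpuidle}"
    local state name latency
    for state in "$cpuidle_dir"/state*; do
        [ -r "$state/name" ] || continue
        name="$(cat "$state/name")"
        latency="$(cat "$state/latency")"     # exit latency in microseconds
        if [ "$latency" -ge "$threshold" ]; then
            echo "$name ${latency}us disabled"
        else
            echo "$name ${latency}us kept"
        fi
    done
}
```

Call list_idle_states 1 before running cpupower. On many machines a 1 µs threshold leaves only the POLL state, which costs idle power, so a higher threshold may be the better compromise.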

Q: Does my NVIDIA card add extra latency?

A: Yes. NVIDIA’s proprietary driver performs poorly on Wayland. If possible, switch to AMD or use X11 + NVIDIA. In my tests, NVIDIA + KWin on X11 had 30-40% higher latency than AMD + Mesa on Wayland.

Best Practices Cheat Sheet

| Scenario | Recommended Config | Expected Latency |
|---|---|---|
| Competitive FPS (CS2, Valorant) | X11 + Compositor off + gamescope + CPU isolation | <8ms |
| Single-player AAA | Wayland + KWin (low-latency profile) + V-Sync | <20ms |
| Desktop daily use | Default config | Doesn’t matter |
| Audio production (JACK/ALSA) | PREEMPT_RT + CPU isolation + Compositor off | <5ms |

Final advice: Don’t trust those “one-click optimization scripts.” Every system is different. You have to measure. Use mangohud for real-time frame times, latencytop to catch jitter sources, and perf to pinpoint specific functions.

The Linux low-latency path has no shortcuts. But once you walk it, the level of control you get — Windows can’t touch that.