The Linux kernel ships with conservative defaults. They’re chosen to work reliably on a Raspberry Pi, a 2013 laptop with 4 GB of RAM, and a 32-core server in the same kernel tree. That means your Proxmox host, Docker server, or storage box is leaving performance on the table.
This post covers the sysctl changes that actually matter for a homelab — network throughput, connection handling, memory limits, filesystem notifiers, and Docker-specific settings. Every parameter includes what it does and the safe range, not cargo-culted Stack Overflow snippets.
Before You Tune: Save Your Current Baseline
You need numbers before and after. Don’t skip this.
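Something like this works (the target host and paths are placeholders; run iperf3 in server mode on another machine first):

```bash
# Snapshot every current sysctl value for later diffing
sysctl -a > /root/sysctl-baseline-$(date +%F).txt 2>/dev/null

# Throughput baseline: run "iperf3 -s" on 192.168.1.50 first
iperf3 -c 192.168.1.50 -t 30

# Latency baseline: note the avg and mdev figures
ping -c 100 192.168.1.50 | tail -2
```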
Now you can compare before/after. “Feels faster” isn’t a metric.
The Config File Layout
All changes go into /etc/sysctl.d/ — never edit /etc/sysctl.conf
directly. Debian/Ubuntu loads *.conf files from this directory in
alphabetical order.
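For example, with a single drop-in file (the filename is arbitrary; only the prefix matters):

```bash
# The 99- prefix sorts last, so these values win any conflicts
sudoedit /etc/sysctl.d/99-homelab.conf

# Load everything under /etc/sysctl.d/ without rebooting
sudo sysctl --system
```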
Use sysctl -w parameter=value for temporary runtime testing before
making permanent changes.
1. Network Throughput — TCP Buffer Tuning
This is the single biggest performance win. Default TCP buffers are usually 4 MB max. On a 10 Gbps link with even 10 ms latency, that caps throughput at roughly 3.2 Gbps (4 MB / 10 ms).
The bandwidth-delay product formula: throughput × RTT = required buffer. 10 Gbps × 10 ms = 100 Mbit, or about 12.5 MB of buffer.
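A starting point for a 10 Gbps link; the 128 MB ceilings are a judgment call that leaves headroom above the 12.5 MB BDP, and the min/default autotuning values are the kernel's own:

```ini
# Socket buffer ceilings: 128 MB
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
# TCP autotuning: min, default, max (bytes)
net.ipv4.tcp_rmem = 4096 131072 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
```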
For a 1 Gbps homelab (most common), 64 MB max is enough:
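```ini
# Same shape, smaller ceilings: 64 MB
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 131072 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
```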
2. BBR Congestion Control
BBR (Bottleneck Bandwidth and RTT) is Google’s TCP congestion algorithm. On links with any packet loss (which includes basically all real-world links), BBR dramatically outperforms CUBIC.
Available since kernel 4.9. Check yours:
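```bash
uname -r    # needs to be 4.9 or newer

# bbr should appear in this list; if not, try loading the module
sysctl net.ipv4.tcp_available_congestion_control
sudo modprobe tcp_bbr
```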
Enable it:
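```ini
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
```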
BBR works best with fq (Fair Queue) as the packet scheduler. Kernels before 4.13 require fq for BBR's pacing; newer kernels can pace internally, but fq is still the recommended qdisc.
Benchmark on a real 1 Gbps link with 0.5% packet loss:
| Algorithm | Throughput | Latency Under Load |
|---|---|---|
| CUBIC | ~480 Mbps | 45-120 ms jitter |
| BBR | ~860 Mbps | 5-15 ms stable |
Test it yourself:
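One way to run the comparison (host and interface names are placeholders; only inject netem loss on a lab link, never a production NIC):

```bash
# Confirm bbr is active
sysctl net.ipv4.tcp_congestion_control

# Throughput test: run "iperf3 -s" on the far end
iperf3 -c 192.168.1.50 -P 4 -t 30

# Optional: emulate 0.5% loss on eth0 to reproduce the table above;
# undo afterwards with "tc qdisc del dev eth0 root"
sudo tc qdisc add dev eth0 root netem loss 0.5%
```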
3. Connection Handling — Backlog and TIME_WAIT
High connection rates (reverse proxies, Docker Swarm, monitoring scrapers) hit the connection backlog long before CPU or bandwidth.
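Something like the following; the backlog numbers are judgment calls sized for a busy reverse proxy:

```ini
# Queue of fully established connections waiting for accept()
net.core.somaxconn = 65535
# Queue of half-open (SYN received) connections
net.ipv4.tcp_max_syn_backlog = 8192
# Per-CPU packet backlog before the kernel starts dropping
net.core.netdev_max_backlog = 16384
# Reuse TIME_WAIT sockets for new outbound connections
net.ipv4.tcp_tw_reuse = 1
```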
tcp_tw_reuse only applies to outbound connections. It’s safe for
servers that make many outbound connections (proxies, web scrapers,
monitoring agents). It won’t help with inbound connections to the same
server — those need SO_REUSEADDR at the application level.
Do not use tcp_tw_recycle — it was removed in Linux 4.12 and was
never safe behind NAT or load balancers.
4. Ephemeral Port Range
Outbound connections eat ephemeral ports. Default range is 32768-60999 (~28K ports). A busy Traefik or Nginx proxy can exhaust this in minutes.
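```ini
net.ipv4.ip_local_port_range = 1024 65535
```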
This gives you ~64K ports per destination (IP, port) pair. Monitor exhaustion:
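```bash
# Socket state summary: watch the timewait counter
ss -s

# Count sockets currently in TIME_WAIT
ss -tan state time-wait | wc -l
```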
5. Keepalive for Long-Lived Connections
Database connections, WebSocket clients, and persistent TCP tunnels go through firewalls that silently drop idle connections. Default keepalive waits 2 hours before probing — an eternity.
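```ini
# Idle seconds before the first probe
net.ipv4.tcp_keepalive_time = 60
# Seconds between probes
net.ipv4.tcp_keepalive_intvl = 10
# Failed probes before the connection is declared dead
net.ipv4.tcp_keepalive_probes = 6
```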
This detects a dead connection in 60 + (10 × 6) = 120 seconds, versus the default 7200 + (75 × 9) = 7875 seconds (~2.2 hours).
6. Memory Management — OOM and Swappiness
The kernel’s out-of-memory killer doesn’t always pick the right victim. By default it scans every process and kills the one with the highest heuristic badness score, which can take out a well-behaved database instead of the runaway process. You can tell it to kill the task whose allocation actually triggered the OOM.
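Assuming the knob in question is vm.oom_kill_allocating_task:

```ini
# Kill the task whose allocation triggered the OOM, not the heuristic pick
vm.oom_kill_allocating_task = 1
# Never panic the whole host on OOM (0 is already the default)
vm.panic_on_oom = 0
```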
Swappiness: the kernel swaps when it thinks it’s a good idea. On a homelab server with plenty of RAM, you want it to think twice.
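```ini
# Default is 60; 10 means swap only under real pressure
vm.swappiness = 10
```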
For a NAS or ZFS box, consider even lower (1-5). ZFS uses its own ARC cache in RAM, and you don’t want the kernel to swap out ZFS metadata pages.
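```ini
# ZFS/NAS hosts: keep ARC and metadata in RAM, swap only as a last resort
vm.swappiness = 1
```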
7. Dirty Page Writeback
The kernel batches dirty pages before writing to disk. Default settings favor write aggregation for spinning rust. On SSD/NVMe storage, you can tune for lower latency.
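A reasonable general-purpose starting point; the list below adjusts these per host type:

```ini
# Start background writeback once 5% of RAM is dirty
vm.dirty_background_ratio = 5
# Throttle writers once 20% of RAM is dirty
vm.dirty_ratio = 20
```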
- SSD/NVMe host: keep these ratios lower. Background at 3%, max at 15%.
- ZFS host: ZFS manages its own writeback. Keep kernel ratio low (dirty_background_ratio=2, dirty_ratio=10) so ZFS gets first look.
- RAID with battery backup: can push higher ratios but the homelab benefit is marginal.
8. Filesystem — inotify Watchers
Every Docker container, file sync tool (Syncthing, rclone), and log tailer consumes inotify watches. Default limit is 8192 — you’ll hit it with more than a few Docker containers.
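```ini
fs.inotify.max_user_watches = 524288
# Instances (open inotify fds) hit limits too; default is only 128
fs.inotify.max_user_instances = 1024
```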
When you hit the limit, inotify_add_watch fails with “No space left on device” (ENOSPC), which apps surface as that message or as an “inotify watch limit reached” error. Check current usage:
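One way to approximate usage (run as root so every process is visible):

```bash
# Current limit
sysctl fs.inotify.max_user_watches

# Count inotify instances (one per open inotify fd)
find /proc/*/fd -lname anon_inode:inotify 2>/dev/null | wc -l

# Rough total of watches across all processes
grep -c '^inotify' /proc/*/fdinfo/* 2>/dev/null | awk -F: '{s+=$2} END {print s}'
```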
9. File Descriptor Limits
systemd sets per-service limits, but the system-wide kernel limit often stays low.
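```ini
# System-wide cap on open file handles
fs.file-max = 2097152
# Per-process ceiling; must be at least any NOFILE limit you configure
fs.nr_open = 2097152
```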
Also set user limits via /etc/security/limits.conf or systemd’s
DefaultLimitNOFILE=2097152 in /etc/systemd/system.conf.
Check usage:
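```bash
# Three fields: allocated handles, allocated-but-unused, system max
cat /proc/sys/fs/file-nr
```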
10. Docker-Specific sysctls
Docker containers inherit sysctls from the host for most parameters. Some are namespace-aware and can be set per-container. These host-level settings improve container performance:
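A sketch of the host-level settings this usually means (the conntrack ceiling is a judgment call for a busy Docker host):

```ini
# Container traffic is routed; Docker normally enables this itself
net.ipv4.ip_forward = 1
# Connection tracking: every NATed container connection consumes an entry
# (requires the nf_conntrack module, loaded once Docker starts)
net.netfilter.nf_conntrack_max = 262144
```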
For namespaced sysctls (primarily under net.*, which live in the container's
network namespace), set them in docker-compose or docker run:
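A minimal compose sketch; the service name, image, and values are illustrative:

```yaml
services:
  proxy:
    image: nginx
    sysctls:
      net.core.somaxconn: 4096
      net.ipv4.ip_local_port_range: "1024 65535"
```

The docker run equivalent is --sysctl net.core.somaxconn=4096.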
Check if your kernel supports per-container sysctls:
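```bash
# If the overridden value prints back, namespaced sysctls work here
docker run --rm --sysctl net.core.somaxconn=1024 alpine sysctl net.core.somaxconn
```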
11. Kernel Same-page Merging (KSM) for Proxmox
If this is a Proxmox host, KSM deduplicates identical memory pages across VMs and containers. It reduces memory usage but costs CPU.
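The scan-rate numbers below are tunables, not requirements (Proxmox also ships ksmtuned, which adjusts them dynamically):

```bash
# Enable KSM
echo 1 > /sys/kernel/mm/ksm/run
# Pages scanned per wake-up: higher = more dedup, more CPU
echo 1000 > /sys/kernel/mm/ksm/pages_to_scan
# Sleep between scan batches (ms)
echo 20 > /sys/kernel/mm/ksm/sleep_millisecs
```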
These are not sysctl values (they’re sysfs files), so add them to a startup script or systemd service that runs after boot.
Check KSM savings:
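```bash
# pages_shared = deduped pages kept; pages_sharing = mappings pointing at them
grep . /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing

# Rough savings in MB, assuming 4 KB pages
echo $(( ( $(cat /sys/kernel/mm/ksm/pages_sharing) \
         - $(cat /sys/kernel/mm/ksm/pages_shared) ) * 4 / 1024 )) MB saved
```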
12. NUMA Tuning (Multi-Socket Hosts)
On dual-socket servers (common in homelabs from the used server market), NUMA awareness matters. The default zone reclaim mode can cause performance regressions when one NUMA node runs out of memory.
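```bash
# See what you're running now; older kernels sometimes enabled
# zone reclaim automatically on NUMA hardware
sysctl vm.zone_reclaim_mode
```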
zone_reclaim_mode = 1 means “reclaim memory on the local node before allocating from a remote node”: lower memory latency for NUMA-pinned workloads, but aggressive reclaim can stall allocations. Value 0 allows remote-node allocations freely, trading some memory latency for smoother allocation behavior.
For NUMA-unaware applications (many Docker containers), leave at 0:
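```ini
vm.zone_reclaim_mode = 0
```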
Check your NUMA topology:
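```bash
numactl --hardware     # nodes, CPUs per node, per-node free memory
lscpu | grep -i numa   # quick summary
```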
The Complete Homelab Tuning File
Here’s the full config for a Proxmox or Docker host with 32-64 GB RAM and a 1-10 Gbps network:
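A consolidated sketch of everything above, sized for a 1 Gbps link (swap in the 128 MB values from section 1 for 10 Gbps, and drop the Docker or NUMA lines if they don't apply to the host):

```ini
# /etc/sysctl.d/99-homelab.conf

# --- TCP buffers (1 Gbps sizing) ---
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 131072 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864

# --- Congestion control ---
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr

# --- Connection handling ---
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 8192
net.core.netdev_max_backlog = 16384
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535

# --- Keepalive ---
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6

# --- Memory ---
vm.swappiness = 10
vm.oom_kill_allocating_task = 1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 20

# --- Filesystem ---
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 1024
fs.file-max = 2097152
fs.nr_open = 2097152

# --- Docker hosts ---
net.ipv4.ip_forward = 1
net.netfilter.nf_conntrack_max = 262144

# --- Multi-socket (NUMA) hosts ---
vm.zone_reclaim_mode = 0
```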
Apply and verify:
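```bash
sudo sysctl --system

# Spot-check a few values
sysctl net.ipv4.tcp_congestion_control net.core.rmem_max vm.swappiness
```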
Testing Your Changes
Run specific benchmarks before and after tuning:
Network throughput:
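```bash
iperf3 -c 192.168.1.50 -P 4 -t 30    # parallel streams
iperf3 -c 192.168.1.50 -R -t 30      # reverse direction
```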
Connection rate (use a dedicated test host):
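wrk is one option; the thread/connection counts are illustrative:

```bash
# ~1000 concurrent connections for 30 seconds against the tuned host
wrk -t4 -c1000 -d30s http://192.168.1.60/
```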
Latency:
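```bash
ping -c 100 192.168.1.50 | tail -2   # avg and mdev
mtr -rwc 100 192.168.1.50            # per-hop latency report
```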
Inotify limit test (run docker compose up on a stack with 20+ services):
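```bash
docker compose up -d

# Watch for watch-limit errors as the containers start
journalctl -f | grep -i inotify
```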
What NOT to Tune
Some sysctls get recommended in older blog posts but don’t touch them:
- net.ipv4.tcp_sack — disabling SACK reduces throughput on any link with packet loss. Leave it enabled (default: 1).
- net.ipv4.tcp_congestion_control = htcp — H-TCP was useful in the 2000s. Use BBR or CUBIC instead.
- net.ipv4.tcp_tw_recycle — removed in kernel 4.12, doesn’t exist on modern kernels.
- kernel.sched_* — process scheduler tuning is workload-specific and easy to get wrong. Leave defaults unless you’ve measured a specific scheduler bottleneck.
- vm.min_free_kbytes — lowering this to reclaim memory causes kernel allocation failures. Leave it at the kernel-computed default.
Summary
| Area | Key Setting | Default | Tuned | Why |
|---|---|---|---|---|
| TCP buffers | rmem_max / wmem_max | ~4 MB | 64-128 MB | Bandwidth-delay product |
| Congestion | tcp_congestion_control | cubic | bbr | Better throughput with loss |
| Connections | somaxconn | 128 | 65535 | Accept new connections faster |
| Ephemeral ports | ip_local_port_range | 32768-60999 | 1024-65535 | More outbound connections |
| Keepalive | tcp_keepalive_time | 7200 | 60 | Detect dead connections fast |
| inotify | max_user_watches | 8192 | 524288 | Docker containers need this |
| Swappiness | swappiness | 60 | 10 | Keep cache in RAM |
These settings have been running on production Proxmox, Docker, and bare-metal Linux hosts without issues. As always: test on your own hardware, one parameter at a time, and keep your baseline data.