Every homelab has a service that depends on a database. Your web app starts before PostgreSQL finishes initializing. Your API gateway tries to reach Redis before it’s accepting connections. Your monitoring stack fails to start because InfluxDB isn’t ready yet.
The typical fix is a hack: sleep 30, a wait-for-it.sh script, or
manual restarts. These work inconsistently and slow down deployments.
Docker Compose health checks solve this properly. Combined with
depends_on conditions, they let you define exactly when a service
is considered ready — not just “running,” but “accepting connections.”
This post covers how to implement health checks for the most common homelab services, with real Compose configurations you can drop into your stack today.
How Docker Health Checks Work
A health check tells Docker how to verify a container is functioning correctly. Docker runs the check command periodically, and the container’s state changes based on the result:
- starting — container started, initial grace period
- healthy — the check command exited with code 0
- unhealthy — the check command failed (non-zero exit) past the retry threshold
In docker ps, you’ll see the status column show healthy or
unhealthy when health checks are configured:
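The output looks roughly like this (names, IDs, and timings are illustrative, and some columns are trimmed for width):

```shell
$ docker ps
CONTAINER ID   IMAGE          STATUS                    NAMES
a1b2c3d4e5f6   postgres:16    Up 2 minutes (healthy)    postgres
f6e5d4c3b2a1   myapp:latest   Up 90 seconds             app
```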
The key Compose health check parameters:
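A representative healthcheck block with typical values (tune the numbers for your hardware):

```yaml
healthcheck:
  test: ["CMD", "pg_isready", "-U", "postgres"]  # command Docker runs; exit 0 = healthy
  interval: 10s       # how often to run the check
  timeout: 5s         # how long a single check may take before it counts as failed
  retries: 5          # consecutive failures before the container is marked unhealthy
  start_period: 30s   # grace period: failures in this window don't count toward retries
```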
The start_period is critical. During this window, failures don’t
count toward the retry threshold. A database that takes 30 seconds to
initialize won’t be flagged as unhealthy — Docker just waits.
depends_on with Conditions
The real power comes from combining health checks with depends_on
conditions:
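A minimal sketch of the pattern (service names and images are placeholders):

```yaml
services:
  postgres:
    image: postgres:16
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 10s
      retries: 5
      start_period: 30s

  app:
    image: myapp:latest
    depends_on:
      postgres:
        condition: service_healthy  # wait for healthy, not merely started
```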
This tells Compose: “Don’t start app until postgres reports
healthy.” No more sleep hacks. No more wait scripts. The dependency
is driven by actual service readiness.
Without condition: service_healthy, depends_on only waits for the
container to start — which means “the process is running” not “the
service is ready.” That’s why your app crashes at startup even with
depends_on in place.
Example 1: Web App with PostgreSQL
This is the most common pattern: a web application that connects to PostgreSQL. The database can take 15-60 seconds to initialize, especially on the first run when it creates its data directories.
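A sketch of the full pattern; image tags, credentials, and timings are illustrative:

```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: changeme
      POSTGRES_DB: appdb
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      # pg_isready exits 0 once the server accepts connections
      test: ["CMD-SHELL", "pg_isready -U app -d appdb"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 60s   # generous: first run initializes the data directory

  app:
    image: myapp:latest
    environment:
      DATABASE_URL: postgres://app:changeme@postgres:5432/appdb
    depends_on:
      postgres:
        condition: service_healthy

volumes:
  pgdata:
```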
The pg_isready command is built into the PostgreSQL image and exits
with code 0 when the server is accepting connections. It’s the correct
health check for PostgreSQL — not a TCP port check, not a custom HTTP
endpoint.
Example 2: Redis Cache with Application
Redis starts almost instantly, but on first boot with AOF persistence
enabled, it may take a moment to load data. A PING check confirms
the server is responding:
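A sketch with AOF persistence enabled (app image is a placeholder):

```yaml
services:
  redis:
    image: redis:7
    command: ["redis-server", "--appendonly", "yes"]
    healthcheck:
      # PONG means the server is up and responding to commands
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5
      start_period: 10s

  app:
    image: myapp:latest
    depends_on:
      redis:
        condition: service_healthy
```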
redis-cli ping returns PONG when the server is ready. Simple,
reliable, built into the image.
Example 3: MySQL / MariaDB with Application
MySQL’s health check uses mysqladmin ping, which requires credentials:
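A sketch assuming the root password comes from a .env file; note the doubled $$ so the container's shell, not Compose, expands the variable:

```yaml
services:
  mariadb:
    image: mariadb:11
    environment:
      MARIADB_ROOT_PASSWORD: ${MARIADB_ROOT_PASSWORD}  # interpolated from .env by Compose
    healthcheck:
      # $$ escapes the dollar sign so the variable expands inside the container at check time
      test: ["CMD-SHELL", "mysqladmin ping -h localhost -u root -p$$MARIADB_ROOT_PASSWORD"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s

  app:
    image: myapp:latest
    depends_on:
      mariadb:
        condition: service_healthy
```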
Note how the password reaches the check. Compose interpolates ${MARIADB_ROOT_PASSWORD} from the host environment or .env file when it parses the file. To expand the variable inside the container instead (where the environment is guaranteed to be set), escape it as $$MARIADB_ROOT_PASSWORD in a CMD-SHELL test so the container's shell resolves it at check time.
Example 4: Nginx with Backend Service Dependency
In a reverse proxy setup, you want Nginx to wait for the backend application to be healthy before it’s marked as ready:
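A sketch assuming the backend image ships curl and serves /health on port 8000 (both are placeholders):

```yaml
services:
  backend:
    image: mybackend:latest
    healthcheck:
      # -f makes curl exit non-zero on HTTP errors, so a 500 counts as unhealthy
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    depends_on:
      backend:
        condition: service_healthy
```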
The backend exposes a /health endpoint that checks database
connectivity, cache connectivity, and internal state. Nginx waits
until the backend returns HTTP 200 before starting.
Custom Health Checks with curl
For services that don’t have a built-in health check command, curl
inside the container is the universal fallback. Most Alpine-based
images don’t include curl by default, so you need to install it:
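For an Alpine-based image that's one extra layer; the port and path below are placeholders:

```dockerfile
FROM alpine:3.20
RUN apk add --no-cache curl
# ... rest of your image ...
HEALTHCHECK --interval=10s --timeout=5s --retries=5 \
  CMD curl -f http://localhost:8080/health || exit 1
```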
Or handle it in Compose if you prefer not to modify the Dockerfile:
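Since Alpine ships BusyBox wget, a Compose-level check can avoid touching the image at all (port and path are placeholders):

```yaml
services:
  app:
    image: myapp:alpine
    healthcheck:
      # --spider requests the URL without downloading the body
      test: ["CMD-SHELL", "wget -q --spider http://localhost:8080/health || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s
```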
Some lightweight images have wget instead of curl. Alpine includes
wget by default but not curl. Check what’s available before writing
your health check command.
Common Pitfalls and Debugging
Health Check Never Runs
If you see starting status indefinitely, the start_period hasn’t
elapsed yet. Wait for the full period to pass. If it stays in
starting after the period, your health check command is probably
wrong:
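A quick way to find out is to run the command by hand inside the container (container name and check command are placeholders):

```shell
docker exec -it myapp sh -c 'curl -f http://localhost:8080/health; echo "exit code: $?"'
```

If the command isn't found or the exit code is non-zero even though the service works, the check itself is the problem.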
depends_on condition Not Honored
Compose v2.20+ supports condition: service_healthy. Older versions
ignore it silently — the service starts without waiting:
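Check which implementation you're running; the legacy Python binary (docker-compose, with a hyphen) is the one that drops conditions:

```shell
docker compose version    # the v2 plugin prints something like "Docker Compose version v2.x"
```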
Circular Dependencies
If service A depends on service B and service B depends on service A, Compose refuses to start. Design your dependency graph as a DAG:
- App → Database (one way)
- Proxy → App → Database (chain)
- Never: App → DB → App
Health Check Logging
Docker keeps the most recent health check results, including each check's exit code and captured output. View them with:
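The Health object holds the current status plus a log of recent checks (container name is a placeholder):

```shell
docker inspect --format '{{json .State.Health}}' postgres
```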
Advanced: Chained Dependencies
Complex stacks often have multiple layers of dependencies. Here’s a full monitoring stack with proper health chaining:
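A sketch of such a stack, assuming the InfluxDB 2.x image (whose bundled influx CLI exits 0 on a successful ping):

```yaml
services:
  influxdb:
    image: influxdb:2
    volumes:
      - influxdb:/var/lib/influxdb2
    healthcheck:
      # influx ping exits 0 once the HTTP API is up
      test: ["CMD", "influx", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 60s

  telegraf:
    image: telegraf:latest
    depends_on:
      influxdb:
        condition: service_healthy   # no point collecting before the sink exists

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    # no depends_on: Grafana connects to its data sources lazily

volumes:
  influxdb:
```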
Grafana doesn’t depend on InfluxDB or Telegraf here because Grafana handles missing data sources gracefully — it starts, serves the UI, and connects to InfluxDB lazily. This avoids unnecessary sequential startup delays.
The general rule: only enforce dependencies that cause crashes on missing connections. If a service retries its backend internally, let it start independently.
When Not to Use Health Check Dependencies
Health check conditions are not always the right tool:
1. Services with internal retry logic
Most modern web frameworks (FastAPI, Express, Django, Spring Boot)
have configurable database connection retries. If your app retries
the database for 30 seconds internally, it doesn’t need a
depends_on condition — it handles the delay itself.
2. Stateless sidecars and log shippers
A log aggregator like Loki or Fluentd doesn’t depend on any application. It should start first, not wait for anything.
3. Reverse proxies with health check endpoints
Nginx, Caddy, and Traefik can mark backends as down and retry them automatically. Let the proxy manage availability — don’t delay the proxy startup.
4. Monitoring targets that should always be running
Grafana, Prometheus, and alert managers monitor the health of other services. They shouldn’t wait for them. If Grafana starts before InfluxDB, Grafana logs a connection error and Grafana itself stays healthy.
Only block startup when a missing dependency causes a crash loop.
Testing Your Health Checks
Before deploying to production, validate that your health checks work correctly:
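One way to exercise the checks, with placeholder service names:

```shell
# Bring the stack up and watch health states change
docker compose up -d
watch docker compose ps        # STATUS moves from "starting" to "healthy"

# Stop a dependency and confirm the consumer waits for it
docker compose stop postgres
docker compose up -d app       # should block until postgres is healthy again
```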
If a service doesn’t reach healthy within the expected time:
- Increase start_period — databases on homelab hardware (especially spinning disks or ZFS with slow sync) take longer to initialize
- Run the health check command inside the container manually to verify it works
- Check the container logs for startup errors
- Verify the health check command exists in the image (no curl on Alpine by default)
Summary
Docker health checks with depends_on conditions eliminate the most
common startup failure in homelab deployments: services that start
before their dependencies.
The setup is minimal:
- Add a healthcheck: block to each database service
- Add depends_on with condition: service_healthy to each consumer
- Use the database's native check command (pg_isready, mysqladmin ping, redis-cli ping) — not TCP port checks
No more sleep 30 hacks. No more wait scripts. Your services start
when they’re actually ready, not when Docker decides the container is
running.