
Solving Docker Frontend Health Check Failures: A DevOps Debugging Journey

The Problem I Discovered

My Docker deployment appeared to be working perfectly - users could access the application, all features functioned correctly, and five out of six containers showed healthy status. However, the frontend container had accumulated 1172 consecutive health check failures and remained stubbornly "unhealthy" despite serving traffic without issues.

Root Cause Analysis

Through systematic investigation, I identified a simple but critical misconfiguration. The health check in my docker-compose.prod-windows.yml was attempting to reach http://localhost:8080/health, but nginx inside the frontend container was actually listening on port 80.

I discovered this by examining the port mappings:

docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" | grep -E "NAME|pcvn"

The output revealed that the frontend container mapped external port 3003 to internal port 80 (0.0.0.0:3003->80/tcp), confirming the port mismatch.
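
Putting those two facts together, the relevant pieces of the compose file looked roughly like this - a simplified sketch, where the curl URL and the 3003:80 mapping are the real values from my setup and the interval/timeout/retries values are just illustrative:

frontend:
  ports:
    - "3003:80"                # host port 3003 -> container port 80
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:8080/health"]   # wrong: nothing listens on 8080 inside the container
    interval: 30s              # illustrative values
    timeout: 10s
    retries: 3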

The Investigation Process

First, I examined the Docker Compose configuration to understand the health check setup:

grep -A 5 -B 5 "healthcheck:" docker-compose.prod-windows.yml | grep -A 5 -B 5 "frontend"

Then I inspected the complete frontend service definition:

sed -n '/^  frontend:/,/^  [a-z]/p' docker-compose.prod-windows.yml | head -n -1

This revealed the misconfigured health check attempting to reach port 8080 instead of port 80.

Verifying the Nginx Configuration

Before making changes, I checked whether nginx had the /health endpoint configured:

cat ./erp-frontend/Dockerfile
cat ./erp-frontend/nginx.conf

I discovered that the nginx configuration already included a properly configured health endpoint:

location /health {
    access_log off;
    return 200 "healthy\n";
    add_header Content-Type text/plain;
}

This meant I only needed to fix the port number in the Docker Compose file - the health endpoint itself was already properly implemented.
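
As an extra sanity check, the endpoint can also be exercised from outside through the published host port - 3003, per the mapping found earlier (the -i flag is just my preference for seeing the status line):

curl -i http://localhost:3003/health

Per the location block above, this should return HTTP 200 with the plain-text body healthy.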

The Solution

The fix required changing just one line in the Docker Compose configuration. I followed these steps:

  1. Created a backup before making any changes:

cp docker-compose.prod-windows.yml docker-compose.prod-windows.yml.backup-$(date +%Y%m%d-%H%M%S)

  2. Fixed the port mismatch using sed:

sed -i 's|http://localhost:8080/health|http://localhost:80/health|g' docker-compose.prod-windows.yml

  3. Verified the change was applied correctly (the corrected healthcheck block is sketched just after this list):

sed -n '/^  frontend:/,/^  [a-z]/p' docker-compose.prod-windows.yml | grep -A 4 "healthcheck:"

  4. Applied the fix with minimal downtime by recreating only the frontend container, leaving its dependencies untouched:

docker-compose -f docker-compose.prod-windows.yml up -d --no-deps frontend
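
After the edit, the healthcheck portion of the earlier sketch reads as follows - only the port number changed:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:80/health"]   # now matches the port nginx actually listens on
  interval: 30s              # illustrative values, as before
  timeout: 10s
  retries: 3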

The Result

Within three minutes, the frontend container transitioned from "unhealthy" to "healthy" status. I verified this with:

docker ps --format "table {{.Names}}\t{{.Status}}\t{{.State}}" | grep frontend

The output showed: pcvn-erp-frontend-prod Up 3 minutes (healthy) running
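
For more detail than docker ps offers, docker inspect exposes the container's full health state, including the output of recent probes; piping through jq here is optional and purely for readability:

docker inspect --format '{{.State.Health.Status}}' pcvn-erp-frontend-prod
docker inspect --format '{{json .State.Health}}' pcvn-erp-frontend-prod | jq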

Finally, I confirmed all six services were now healthy:

docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" | grep -E "NAME|pcvn"

Key Takeaways from This Experience

This debugging experience reinforced several important lessons I've learned about containerized deployments:

  1. Health checks can be misleading - My application was functioning perfectly despite the "unhealthy" status. The health check was simply configured to check the wrong port.

  2. Minimal changes minimize risk - I fixed the issue by changing just one number (8080 to 80) in one configuration file. This surgical precision avoided unnecessary risk to the production system.

  3. Always verify before and after - I checked the current state before making changes, verified my edits were correct, then confirmed the fix worked. This methodical approach prevented surprises.

  4. Targeted container recreation is powerful - Using docker-compose up -d --no-deps frontend allowed me to replace just the frontend container with virtually zero downtime. The seamless container swap took less than two seconds.

  5. Understanding port mappings is crucial - The confusion between the container's internal port (80), the published host port (3003), and the incorrect port in the health check URL (8080) was at the heart of this issue. Inside the container, services must be reached on their internal ports.

The Technical Details That Mattered

The issue boiled down to Docker's network architecture. When a health check runs, it executes inside the container's network namespace. The health check command curl -f http://localhost:8080/health was running inside the frontend container, trying to reach port 8080 - but nginx inside that same container was listening on port 80. The external port mapping (3003:80) was irrelevant here, because published ports only matter for traffic crossing the container boundary.
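
You can observe this distinction directly - assuming curl is available inside the image, which it must be for a curl-based health check to pass at all:

# From the host, the published port answers:
curl -f http://localhost:3003/health

# Inside the container's network namespace, only the internal port answers:
docker exec pcvn-erp-frontend-prod curl -f http://localhost:80/health      # succeeds
docker exec pcvn-erp-frontend-prod curl -f http://localhost:8080/health    # connection refused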

Conclusion

What started as 1172 consecutive health check failures turned into a simple one-line fix. My Docker deployment went from 5/6 healthy services to 6/6 healthy services by correcting a single port number. This experience reminded me that in complex systems, sometimes the most persistent problems have the simplest solutions. The key is methodical investigation, understanding the architecture, and making precise, minimal changes to fix issues without introducing new problems.


If you enjoyed this article, you can also find it published on LinkedIn and Medium.