How I Debugged a Dockerized ERP System: From Network Errors to Full Resolution

The Challenge

I recently faced a critical deployment issue with a Dockerized ERP system that taught me valuable lessons about container networking, build-time vs. runtime configuration, and systematic debugging. The system consisted of a React frontend, Rails API backend, PostgreSQL database, and Redis cache—all orchestrated through Docker Compose. What started as a simple “Network Error” during user authentication evolved into a deep dive into containerized application troubleshooting.

Understanding the Architecture

Before diving into the issues, here's what I was working with:

  • Frontend: React 18.2 with TypeScript, built using Vite and served by nginx on port 8080
  • Backend: Rails 8.0.2 API served by Thruster on port 80 inside the container (mapped to 3000 on the host)
  • Database: PostgreSQL 15, exposed on host port 5433
  • Cache: Redis 7
  • Orchestration: Docker Compose with comprehensive health checks
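
A quick way to sanity-check this layout is to list the running services and their published ports straight from Compose. A minimal sketch, assuming the services are named frontend, backend, db, and redis:

# Show each service, its health status, and its published port mappings
docker-compose ps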

The Investigation Process

Starting with the Symptoms

The primary symptom was straightforward: users couldn't log in, receiving a "Network Error" message. However, the root cause proved to be a cascade of misconfigurations that required methodical investigation.

Issue 1: Unmasking the Frontend API URL Problem

My first discovery came from inspecting the compiled JavaScript bundles. I used a simple but effective approach to examine what was actually deployed:

# Check what API URL the frontend was compiled with
docker-compose exec frontend sh -c "grep -o 'VITE_API_URL:[^,}]*' /usr/share/nginx/html/assets/index-*.js"

The output revealed the problem immediately: the frontend was compiled with http://localhost:3000/api/v1 hardcoded into the bundle. This meant that client-side JavaScript was trying to reach the backend directly from the browser, completely bypassing the nginx proxy I had carefully configured.

The root cause was a misunderstanding of how Vite handles environment variables. Unlike runtime environment variables, Vite bakes these values into the code at build time. My solution involved modifying the Dockerfile to properly accept and use build arguments:

ARG VITE_API_URL=/api/v1
ENV VITE_API_URL=$VITE_API_URL

Then I rebuilt the frontend with the correct relative path, ensuring all API calls would route through nginx.
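
For reference, the build argument can be supplied on the command line; the service name frontend is an assumption here, and the same value can instead live under build.args in docker-compose.yml:

# Rebuild the frontend image with the relative API path baked in, then recreate the container
docker-compose build --build-arg VITE_API_URL=/api/v1 frontend
docker-compose up -d frontend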

Issue 2: Discovering Hidden Rate Limiting

Even after fixing the API URLs, I encountered intermittent 503 errors. The nginx logs revealed an unexpected culprit: overly aggressive rate limiting. The configuration allowed just 10 requests per second with a burst capacity of 10, far too restrictive for a modern single-page application that fires multiple concurrent API calls during initialization.

I traced through the nginx configuration:

# Inspect current nginx settings
docker-compose exec frontend cat /etc/nginx/nginx.conf | grep -A 5 "location /api"

My solution increased the limits to more reasonable values for a development environment:

limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;

location /api {
    limit_req zone=api burst=50 nodelay;
    proxy_pass http://backend:80;
}
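
To confirm the relaxed limits hold up, a burst of rapid requests through the proxy should no longer produce 503s. A rough check, reusing the login endpoint purely to exercise the limiter (the status codes are what matter here):

# Fire 30 rapid requests through nginx and tally the HTTP status codes;
# with rate=100r/s and burst=50, none of them should come back as 503
for i in $(seq 1 30); do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -X POST http://localhost:8080/api/v1/auth/login \
    -H "Content-Type: application/json" \
    -d '{"email":"admin@pcvn.com","password":"password123"}'
done | sort | uniq -c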

Issue 3: The Port Mapping Mystery

The most subtle issue involved a misunderstanding of Docker's internal networking. My nginx configuration was attempting to proxy requests to http://backend:3000, but this kept failing with connection refused errors.

Through systematic investigation, I discovered that Rails with Thruster actually runs on port 80 inside the container, not port 3000. The docker-compose.yml mapping of 3000:80 meant that port 3000 was only accessible from the host machine, not from other containers.

I used these commands to verify the internal port configuration:

# Test internal connectivity directly
docker-compose exec frontend wget -qO- http://backend:80/up

# Verify the service is actually listening on port 80 internally
docker-compose exec backend netstat -tlpn
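
Compose can also report the same mapping from the host's side, which makes the internal-versus-external split explicit:

# Show which host address and port are bound to the backend's internal port 80
docker-compose port backend 80

# Dump the port bindings Docker has recorded for the backend container
docker inspect --format '{{json .NetworkSettings.Ports}}' $(docker-compose ps -q backend)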

The fix was straightforward once I understood the issue: update the proxy_pass directive to use the correct internal port.
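
After editing the configuration, a quick syntax check before reloading avoids taking the proxy down over a typo:

# Validate the updated nginx configuration, then reload it without dropping connections
docker-compose exec frontend nginx -t
docker-compose exec frontend nginx -s reload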

My Debugging Methodology

Throughout this process, I developed a systematic approach that I now apply to all container debugging:

1. Verify Basic Connectivity

I always start by testing whether services can reach each other at all:

# Test backend API directly from host
curl -X POST http://localhost:3000/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"admin@pcvn.com","password":"password123"}'

# Test through the nginx proxy
curl -X POST http://localhost:8080/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"admin@pcvn.com","password":"password123"}'

2. Examine the Actual Deployed Configuration

Rather than assuming what configuration is deployed, I always verify:

# Check environment variables
docker-compose exec frontend printenv | grep -i api
docker-compose exec backend printenv | grep -i rails

# Search for hardcoded values in compiled code
docker-compose exec frontend grep -r "localhost:3000" /usr/share/nginx/html/
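
It also pays to look at the fully resolved Compose configuration, since that is what Docker actually acts on after variable interpolation:

# Print the effective docker-compose configuration with all variables and defaults resolved
docker-compose config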

3. Apply Fixes Incrementally

I learned to apply and test fixes one at a time, using techniques that don't require full container rebuilds when possible:

# Apply nginx config changes without rebuilding
docker cp nginx-fixed.conf pcvn-erp-frontend:/etc/nginx/nginx.conf
docker-compose exec frontend nginx -s reload
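
When a rebuild is unavoidable, I still limit its scope to the service that changed:

# Rebuild and recreate only the affected service
docker-compose up -d --build frontend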

Key Lessons Learned

This debugging journey reinforced several critical principles for containerized applications:

Build-time vs. Runtime Configuration: Understanding when configuration values are set is crucial. Vite's environment variables are resolved at build time, meaning they must be provided during the Docker image build process, not when the container starts.

Internal vs. External Networking: Container-to-container communication uses internal ports and service names. The port mappings in docker-compose.yml only affect host-to-container communication, not container-to-container.
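
The distinction is easy to demonstrate with this setup, using the /up health path referenced earlier:

# From the host, the published port works
curl -s http://localhost:3000/up

# From another container, only the service name and internal port work
docker-compose exec frontend wget -qO- http://backend:80/up

# This fails: port 3000 exists only as a host-side mapping, not inside the Docker network
docker-compose exec frontend wget -qO- http://backend:3000/up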

Systematic Debugging Wins: Starting with basic connectivity tests and gradually working up to specific configuration issues is more effective than making assumptions about complex problems.

Development Environment Considerations: Development environments need different configurations than production, especially for rate limiting and debugging capabilities.

Prevention Strategies I Now Implement

Based on this experience, I now follow these practices in all my Docker deployments:

  1. Use Relative URLs: Configure frontends to use relative paths like /api instead of absolute URLs with hostnames and ports.

  2. Document Port Mappings Clearly: I always add comments in docker-compose.yml explaining the internal vs. external port mappings.

  3. Implement Comprehensive Health Checks: Every service gets a proper health check that validates actual functionality, not just process availability (see the sketch after this list).

  4. Provide Configuration Templates: I include .env.example files with correct default values and extensive comments explaining each variable's purpose and when it's applied.

  5. Build-time Arguments in Dockerfiles: I explicitly declare all ARG directives needed for build-time configuration at the top of Dockerfiles with clear documentation.
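
For item 3, the check I have in mind exercises a real request rather than just confirming the process exists. A minimal sketch of the command I would wire into the backend service's healthcheck in docker-compose.yml, assuming Rails' default /up endpoint:

# Fail the health check unless the app answers its health endpoint inside the container
curl -fsS http://localhost:80/up || exit 1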

The Outcome

After applying these fixes systematically, the entire ERP system came online successfully. All health checks passed, authentication worked smoothly, and the system has been stable since. More importantly, I documented the entire process, creating a playbook for future debugging sessions.

Final Thoughts

This experience reminded me that even seemingly simple errors like "Network Error" can have complex, interconnected root causes in containerized environments. The key to resolution isn't making random changes but rather applying systematic investigation techniques, understanding the fundamental architecture, and testing hypotheses methodically.

Every debugging session is an opportunity to deepen understanding of the underlying systems. In this case, what started as a frustrating authentication error became a valuable lesson in Docker networking, nginx configuration, and the importance of understanding the full stack of technologies in modern web applications.


Technical Stack: Docker, Docker Compose, nginx, React 18.2, Rails 8.0.2, PostgreSQL 15, Redis 7, Vite

Debugging Tools Used: curl, wget, grep, docker-compose logs, netstat


If you enjoyed this article, you can also find it published on LinkedIn and Medium.