Debugging a Rails Authorization System: My Journey from 403 Errors to Resolution
The Crisis: When Everything Returns 403
I recently found myself staring at a production management system that was completely inaccessible. Every API endpoint I tried returned either 403 Forbidden or 500 Internal Server Error. The browser console was flooded with errors, and the application, which had been running in Docker for 8 hours, suddenly seemed completely broken.
The initial symptoms were overwhelming:
- WebSocket connections failing with 404 errors
- Multiple API endpoints returning 403 Forbidden
- Critical dashboard endpoints throwing 500 errors
- Authentication appearing to work (200 OK on login) but authorization failing everywhere
Starting the Investigation
My first instinct was to verify what was actually running. I needed to understand if this was an infrastructure problem or an application-level issue.
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}\t{{.Image}}"
The output (image column omitted here) showed all containers were healthy and had been running for 8 hours:
NAMES STATUS PORTS
pcvn-erp-backend-prod Up 8 hours (healthy) 0.0.0.0:3002->3002/tcp
pcvn-erp-frontend-prod Up 8 hours (healthy) 0.0.0.0:3003->80/tcp
pcvn-erp-db-prod Up 8 hours (healthy) 0.0.0.0:5432->5432/tcp
This told me the Docker infrastructure was fine. The problem had to be within the application logic itself.
Diving Into the Backend Logs
Next, I examined the backend container logs to understand what was happening when requests were made:
docker logs pcvn-erp-backend-prod --tail 50
The logs revealed three distinct patterns:
- Authentication was working: POST /api/v1/auth/login returned Completed 200 OK
- Authorization was failing:
Filter chain halted as :authorize_production_access! rendered or redirected
- Code errors existed:
NameError (uninitialized constant Api::V1::MetricsController::ProductionStagesOrder)
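When the log tail is long, it can help to bucket lines by the three patterns above before reading them one by one. The following is a minimal sketch in plain Ruby; the helper names (`classify_log_line`, `tally_log_lines`) and the regular expressions are my own assumptions based on the excerpts shown here, not part of the application.

```ruby
# Hypothetical triage helper: bucket Rails log lines into the three
# patterns seen in this incident. The regexes are assumptions derived
# from the log excerpts quoted in the article.
LOG_PATTERNS = {
  auth_ok:      /Completed 200 OK/,
  authz_halted: /Filter chain halted as :authorize_/,
  code_error:   /NameError \(uninitialized constant/
}.freeze

# Returns the label of the first matching pattern, or :other.
def classify_log_line(line)
  LOG_PATTERNS.each do |label, pattern|
    return label if line.match?(pattern)
  end
  :other
end

# Counts how many lines fall into each bucket.
def tally_log_lines(lines)
  lines.map { |line| classify_log_line(line) }.tally
end

sample = [
  'Completed 200 OK',
  'Filter chain halted as :authorize_production_access! rendered or redirected',
  'NameError (uninitialized constant Api::V1::MetricsController::ProductionStagesOrder)',
  'an unrelated line'
]
p tally_log_lines(sample)
```

Feeding it the output of `docker logs` (for example via `$stdin.each_line`) gives a quick count of how many requests were halted by authorization versus how many failed on code errors.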
Tracing the Authorization Logic
I needed to understand how the authorization system was supposed to work. I examined the controller structure:
docker exec pcvn-erp-backend-prod cat app/controllers/api/v1/production_orders_controller.rb | head -30
This revealed that the controller expected specific roles: production managers, line supervisors, and regular employees. The authorization methods were being called, but something was preventing them from working correctly.
The First Dead End
I searched for the authorization method in what I thought was the obvious place:
docker exec pcvn-erp-backend-prod grep -n "authorize_production_access" app/controllers/application_controller.rb
Nothing. The method wasn't where I expected it to be. This led me to search more broadly:
docker exec pcvn-erp-backend-prod bash -c "grep -r 'def authorize_production_access' app/ 2>/dev/null"
I found the methods were defined as private methods within each individual controller, not in a shared location as I'd initially assumed.
Understanding the Authorization Implementation
I extracted the actual authorization logic:
docker exec pcvn-erp-backend-prod sed -n '/def authorize_production_access!/,/^ def \|^ end/p' app/controllers/api/v1/production_orders_controller.rb
The output was revealing:
def authorize_production_access!
  unless can_access_production?
    render json: { error: "Forbidden" }, status: :forbidden
  end
end
This led me to find the actual permission check:
docker exec pcvn-erp-backend-prod sed -n '198,210p' app/controllers/api/v1/production_orders_controller.rb
def can_access_production?
  current_user.has_role?("admin") ||
    current_user.has_role?("production_manager") ||
    current_user.has_role?("line_supervisor")
end
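To make the flow concrete outside of Rails, here is a small plain-Ruby model of the same check. `Role` and `User` are Struct stand-ins rather than the real models, and the body of `has_role?` is an assumption (one role per user, where a missing role never matches); only the three role names come from the controller above.

```ruby
# Plain-Ruby stand-ins for the Rails models (assumptions, not the real classes).
Role = Struct.new(:name)

User = Struct.new(:email, :role) do
  # Assumed behaviour: a user has at most one role; nil never matches.
  def has_role?(name)
    !role.nil? && role.name == name
  end
end

# The role names checked by can_access_production? in the controller.
PRODUCTION_ROLES = %w[admin production_manager line_supervisor].freeze

def can_access_production?(user)
  PRODUCTION_ROLES.any? { |name| user.has_role?(name) }
end

# 403 when the filter would halt the request; 200 stands in for
# "the filter passes and the action runs".
def production_access_status(user)
  can_access_production?(user) ? 200 : 403
end

without_role = User.new('admin@pcvn.com', nil)
with_role    = User.new('admin@pcvn.com', Role.new('admin'))

puts production_access_status(without_role)
puts production_access_status(with_role)
```

The model shows that the check fails closed: a user whose role is missing is rejected on every protected endpoint, no matter what their email or intended position is.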
The Root Cause Discovery
Now I understood the authorization flow, but I needed to know why it was failing. I checked the user's actual state:
docker exec pcvn-erp-backend-prod bundle exec rails runner "u = User.find_by(email: 'admin@pcvn.com'); puts \"Role ID: #{u.role_id || 'nil'}\"; puts \"Role: #{u.role&.name || 'No role assigned'}\""
The output was the smoking gun:
Role ID: nil
Role: No role assigned
My admin user had no role assigned at all! The authorization system was working correctly; the user simply had no role for it to check.
Checking the Role System
I investigated whether roles even existed in the database:
docker exec pcvn-erp-backend-prod bundle exec rails runner "puts 'Total roles: ' + Role.count.to_s"
Total roles: 0
The database had zero roles. The entire role-based access control system was built and functional, but the role data it depended on had never been created.
The Failed First Attempt
I tried to create the roles:
docker exec pcvn-erp-backend-prod bundle exec rails runner "['admin', 'production_manager'].each {|name| Role.create!(name: name)}"
But I got confusing errors about roles already existing, yet the count was still zero. This led me to dig deeper into the Role model requirements:
docker exec pcvn-erp-backend-prod bundle exec rails runner "r = Role.new(name: 'admin'); puts 'Valid?: ' + r.valid?.to_s; unless r.valid?; puts 'Errors:'; r.errors.full_messages.each {|e| puts ' - ' + e}; end"
The real problem emerged:
Valid?: false
Errors:
- Description can't be blank
The Successful Fix
Armed with the knowledge that roles required both name and description, I created them properly:
docker exec pcvn-erp-backend-prod bundle exec rails runner "[['admin', 'System administrator with full access'], ['production_manager', 'Manages production operations'], ['line_supervisor', 'Supervises production lines'], ['employee', 'Regular employee']].each {|name, desc| Role.create!(name: name, description: desc)}"
Success! The roles were created with UUIDs as primary keys. The first two, for example:
c7af64d2-5520-4a02-b12d-14fd507fdf26: admin - System administrator with full access
125aed27-0621-4751-803c-bb7318baf382: production_manager - Manages production operations
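A one-off runner command fixes the immediate problem, but the same roles will be missing again on a fresh database. One way to guard against that is an idempotent seed step. The sketch below models the idea in plain Ruby with an in-memory stand-in: `RoleTable` is hypothetical and not a Rails API, and its two presence checks mirror only the name and description requirements observed above.

```ruby
# In-memory stand-in for the roles table. RoleTable is hypothetical;
# it exists only to illustrate idempotent seeding with validation.
class RoleTable
  RoleRow = Struct.new(:name, :description)

  def initialize
    @rows = {}
  end

  # Mirrors the two requirements observed above: name and description.
  def errors_for(name, description)
    errors = []
    errors << "Name can't be blank" if name.to_s.strip.empty?
    errors << "Description can't be blank" if description.to_s.strip.empty?
    errors
  end

  # Idempotent: returns the existing row when the name is already present.
  def find_or_create!(name, description)
    return @rows[name] if @rows.key?(name)

    errors = errors_for(name, description)
    raise ArgumentError, errors.join(', ') unless errors.empty?

    @rows[name] = RoleRow.new(name, description)
  end

  def count
    @rows.size
  end
end

ROLE_SEEDS = {
  'admin'              => 'System administrator with full access',
  'production_manager' => 'Manages production operations',
  'line_supervisor'    => 'Supervises production lines',
  'employee'           => 'Regular employee'
}.freeze

table = RoleTable.new
# Running the seed twice must not create duplicates.
2.times { ROLE_SEEDS.each { |name, desc| table.find_or_create!(name, desc) } }
puts table.count
```

In a Rails application the same idea would typically live in db/seeds.rb and use `Role.find_or_create_by!(name: name) { |role| role.description = desc }`, assuming the model matches what the validation errors suggested.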
Assigning the Admin Role
Finally, I assigned the admin role to my user:
docker exec pcvn-erp-backend-prod bundle exec rails runner "u = User.find_by(email: 'admin@pcvn.com'); r = Role.find_by(name: 'admin'); u.update!(role: r); puts \"Has admin access: #{u.has_role?('admin')}\""
Has admin access: true
Verification
I verified the fix was working by creating a test JWT token and making a direct API call:
docker exec pcvn-erp-backend-prod bundle exec rails runner "require 'net/http'; u = User.find_by(email: 'admin@pcvn.com'); token = JWT.encode({user_id: u.id}, Rails.application.credentials.secret_key_base); uri = URI('http://localhost:3002/api/v1/production_orders'); req = Net::HTTP::Get.new(uri); req['Authorization'] = \"Bearer #{token}\"; res = Net::HTTP.new(uri.host, uri.port).request(req); puts \"Status: #{res.code} #{res.message}\""
Status: 200 OK
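It is worth seeing what such a test token actually contains. The sketch below builds and verifies an HS256-signed token using only Ruby's standard library, so the payload is visible without the jwt gem. The helper names and the secret are made up for illustration; I am assuming HS256 because it is the jwt gem's default when no algorithm is passed, and the real application may configure things differently.

```ruby
require 'json'
require 'openssl'

# URL-safe Base64 without padding, as used by JWT segments.
def b64url_encode(bytes)
  [bytes].pack('m0').tr('+/', '-_').delete('=')
end

def b64url_decode(text)
  padded = text + '=' * ((4 - text.length % 4) % 4)
  padded.tr('-_', '+/').unpack1('m0')
end

def hs256_sign(data, secret)
  OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA256'), secret, data)
end

# Builds header.payload.signature for the given payload hash.
def encode_hs256(payload, secret)
  header = { alg: 'HS256', typ: 'JWT' }
  signing_input = [header, payload].map { |part| b64url_encode(JSON.generate(part)) }.join('.')
  "#{signing_input}.#{b64url_encode(hs256_sign(signing_input, secret))}"
end

# Verifies the signature and returns the payload as a Hash.
# Sketch only: production code should use a constant-time comparison.
def decode_hs256(token, secret)
  header_b64, payload_b64, signature_b64 = token.split('.')
  expected = b64url_encode(hs256_sign("#{header_b64}.#{payload_b64}", secret))
  raise ArgumentError, 'bad signature' unless signature_b64 == expected

  JSON.parse(b64url_decode(payload_b64))
end

secret = 'dev-only-secret' # made-up value for illustration
token  = encode_hs256({ user_id: 42 }, secret)
p decode_hs256(token, secret)
```

The decoded payload holds a user ID and nothing about roles. With a token shaped like this, the server has to look the role up on each request, which is why the new role took effect as soon as it was assigned.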
Reflections on the Journey
This debugging experience taught me several valuable lessons about systematic troubleshooting:
- Start with infrastructure verification: I first confirmed all Docker containers were healthy before diving into application logic.
- Follow the error messages systematically: The "Filter chain halted" message led me directly to the authorization system.
- Never assume, always verify: I initially assumed authorization methods would be in ApplicationController, but searching thoroughly revealed they were implemented differently.
- Understand the complete flow: Tracing from the controller through the authorization methods to the actual role checking logic revealed the complete picture.
- Check data prerequisites: The code was working perfectly; it was the missing data (roles) that caused the failures.
- Read validation errors carefully: The first attempt to create roles failed because I didn't know about the description field requirement.
The entire authorization system was like a perfectly functional lock with no keys manufactured yet. Once I created the keys (roles) and gave one to the admin user, everything worked as designed.
Technical Takeaways
Working through this issue reinforced several technical best practices:
- Always check both code AND data when debugging authorization issues
- Docker container health doesn't guarantee application functionality
- Rails validation errors need careful attention - they tell you exactly what's wrong
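- JWT tokens that embed role or permission claims are snapshots in time, so users need new tokens after permissions change; a token that carries only a user ID, like the test token above, picks up the new role on the next request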
- Systematic debugging beats random attempts every time
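The "check both code AND data" point can be turned into a small preflight check that runs at deploy time. The sketch below is plain Ruby with hypothetical helper names. The required role list combines the three names from `can_access_production?` with `employee`, and the inputs are assumed to come from queries such as `Role.pluck(:name)` and `User.where(role_id: nil).count` in the real application.

```ruby
# Role names the code expects to exist (from the controller check, plus 'employee').
REQUIRED_ROLES = %w[admin production_manager line_supervisor employee].freeze

# Returns the required role names that are absent from the data.
def missing_roles(existing_names, required: REQUIRED_ROLES)
  required - existing_names
end

# Returns a list of human-readable problems; empty means the data
# prerequisites for authorization are in place.
def role_preflight_report(existing_names, users_without_role)
  problems = []
  missing = missing_roles(existing_names)
  problems << "missing roles: #{missing.join(', ')}" unless missing.empty?
  problems << "#{users_without_role} user(s) have no role" if users_without_role.positive?
  problems
end

# The state of the database at the start of this investigation:
# zero roles, and at least one user (the admin) without a role.
puts role_preflight_report([], 1)
```

A check like this, run after migrations, would have flagged the empty roles table before any user saw a 403.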
The most satisfying part was that fixing the authorization failures required no code changes, no configuration updates, and no Docker container modifications. It was purely a matter of understanding what data the existing system expected and providing it. (The NameError seen in the logs is a separate, code-level issue that role data alone would not address.) Sometimes the most complex-seeming problems have surprisingly simple solutions when approached methodically.
If you enjoyed this article, you can also find it published on LinkedIn and Medium.