From Local to Cloud: Implementing HashiCorp Vault for CI/CD Secret Management

The Journey: Solving Real-World Secret Management Challenges

I recently tackled a common but critical challenge in modern DevOps: transitioning from hardcoded secrets and environment variables to a proper secret management system using HashiCorp Vault. Starting with a local development setup using ngrok tunnels, I successfully deployed Vault to Google Cloud Platform, integrated it with GitHub Actions, and established secure boundaries between staging and production environments.

Key Problems I Solved

1. Docker Environment Setup Issues

Problem: Docker Desktop's WSL2 integration made daemon connectivity unreliable
Solution: Verified the Docker daemon was reachable from the WSL2 shell and resolved container networking issues

docker --version
# Output: Docker version 28.3.2, build 578ccf6

docker ps -a --format "table {{.Names}}\t{{.Image}}\t{{.Status}}"
# Showed running containers: pcvn-erp-backend, pcvn-erp-frontend, pcvn-redis, pcvn-postgres

2. Vault Development Environment Configuration

Problem: Need for isolated secret management in development
Solution: Created Docker Compose configuration for Vault with proper networking

# docker-compose.vault.yml
version: '3.8'
services:
  vault:
    image: hashicorp/vault:1.17
    container_name: pcvn-vault
    cap_add:
      - IPC_LOCK
    environment:
      VAULT_DEV_ROOT_TOKEN_ID: "dev-root-token-change-me"
      VAULT_DEV_LISTEN_ADDRESS: "0.0.0.0:8200"
    ports:
      - "8200:8200"
    networks:
      - pcvn-fullstack_pcvn-network
    command: server -dev

# The network is created by the main application stack, so declare it external
networks:
  pcvn-fullstack_pcvn-network:
    external: true

3. Secret Organization and Policy Implementation

Problem: No access control between environments
Solution: Implemented path-based policies with explicit deny rules

# Storing environment-specific secrets
docker exec pcvn-vault vault kv put secret/staging/database \
  postgres_db=pcvn_erp_staging \
  postgres_user=pcvn_staging \
  postgres_password=staging-temp-password-2024

# Creating staging policy with restricted access
cat > staging-policy.hcl << 'EOF'
path "secret/data/staging/*" {
  capabilities = ["read", "list"]
}
path "secret/data/production/*" {
  capabilities = ["deny"]
}
EOF
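The heredoc above writes the policy file but stops short of registering it with Vault. A minimal sketch of that step, assuming the same pcvn-vault container and the policy name staging-policy used by the token command in section 6; the Docker calls are guarded so the sketch degrades gracefully where the container isn't running:

```shell
# Recreate the policy file from above so this sketch is self-contained
cat > staging-policy.hcl << 'EOF'
path "secret/data/staging/*" {
  capabilities = ["read", "list"]
}
path "secret/data/production/*" {
  capabilities = ["deny"]
}
EOF

# Registering the policy needs the running container, so guard the call
if command -v docker >/dev/null 2>&1 \
   && docker ps --format '{{.Names}}' 2>/dev/null | grep -q '^pcvn-vault$'; then
  docker cp staging-policy.hcl pcvn-vault:/tmp/staging-policy.hcl
  docker exec pcvn-vault vault policy write staging-policy /tmp/staging-policy.hcl
fi
```

Once written, the policy name can be attached to any number of tokens, keeping the access rules in one place.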

4. CI/CD Integration Challenges

Problem: GitHub Actions couldn't access locally running Vault
Solution: Migrated from local ngrok tunnels to cloud-hosted Vault on GCP

# Initial billing setup challenge
gcloud billing projects link project-01-24 --billing-account=015809-2B69EF-EC599E
# Output: billingEnabled: false (the billing account was initially closed)

# After resolving billing
gcloud services enable compute.googleapis.com --project=project-01-24
# Output: Operation finished successfully
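On the GitHub Actions side, one way to consume the cloud-hosted Vault is the official hashicorp/vault-action step. A hedged sketch, assuming the staging token is stored as a repository secret named VAULT_STAGING_TOKEN (a name invented here for illustration):

```yaml
- name: Import secrets from Vault
  uses: hashicorp/vault-action@v3
  with:
    url: http://34.102.18.66:8200
    token: ${{ secrets.VAULT_STAGING_TOKEN }}
    secrets: |
      secret/data/staging/database postgres_password | POSTGRES_PASSWORD
```

Each line under `secrets` maps a KV v2 path and key to an environment variable available to later workflow steps, so nothing is hardcoded in the repository.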

5. Cloud Deployment and Infrastructure as Code

Problem: Manual secret management without version control
Solution: Implemented Terraform configuration for reproducible Vault deployment

resource "google_compute_instance" "vault" {
  name         = "vault-server-us"
  machine_type = "e2-micro"
  zone         = "us-west2-a"
  
  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2404-lts-amd64"
      size  = 30
    }
  }
  
  metadata_startup_script = templatefile("startup-script.sh", {
    vault_version = "1.17.6"
  })
}
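For CI runners to reach the instance, Terraform also needs to open port 8200. The resource below is a hedged sketch to illustrate the shape of that rule; the rule name and the 203.0.113.0/24 source range are placeholders, not values from the actual deployment:

```hcl
resource "google_compute_firewall" "vault" {
  name    = "allow-vault-api"
  network = "default"

  allow {
    protocol = "tcp"
    ports    = ["8200"]
  }

  # Placeholder CIDR: restrict to your CI/CD egress range, never 0.0.0.0/0
  source_ranges = ["203.0.113.0/24"]
}
```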

Deployment Results:

terraform apply
# Output: Apply complete! Resources: 6 added, 0 changed, 0 destroyed
# Vault Public IP: 34.102.18.66
# Vault UI URL: http://34.102.18.66:8200

6. Token Management and Rotation

Problem: Static tokens without expiration or rotation strategy
Solution: Created time-limited tokens with specific policies

# Generating environment-specific tokens
docker exec pcvn-vault vault token create \
  -policy=staging-policy \
  -ttl=720h \
  -format=json

# Testing token permissions
VAULT_TOKEN="hvs.CAESI..." vault kv get secret/staging/database
# Success: Retrieved staging secrets

VAULT_TOKEN="hvs.CAESI..." vault kv get secret/production/database  
# Error: permission denied (as expected)
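A rotation workflow can build on Vault's built-in renew and revoke commands. A hedged sketch, with the server calls guarded since they need a reachable Vault:

```shell
# 720h = 30 days; useful when scripting rotation schedules
TTL_HOURS=720
TTL_SECONDS=$((TTL_HOURS * 3600))

# Renew or revoke the calling token; both need a live server and a token
if command -v vault >/dev/null 2>&1 \
   && [ -n "${VAULT_ADDR:-}" ] && [ -n "${VAULT_TOKEN:-}" ]; then
  vault token renew          # extend the current token within its max TTL
  vault token revoke -self   # retire the old token once a replacement is issued
fi
```

Renewing in-place covers routine extension; issuing a fresh token and revoking the old one covers true rotation after a suspected leak.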

7. Kubernetes Secret Bridge

Problem: No automated way to inject Vault secrets into Kubernetes
Solution: Created bridge script to fetch from Vault and create K8s secrets

./create-k8s-secrets-from-vault.sh staging
# Output:
# ✓ Successfully retrieved database credentials from Vault
# ✓ Generated Redis password (44 characters)
# ✓ Generated Rails secret key base (128 characters)
# ✓ Generated JWT signing secret (44 characters)
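The bridge script itself isn't shown above; here is a simplified sketch of the approach, assuming openssl for generation and a made-up Kubernetes secret name pcvn-app-secrets. The generated lengths match the output above: base64 of 32 random bytes is 44 characters, hex of 64 bytes is 128:

```shell
# Usage: ./create-k8s-secrets-from-vault.sh <environment>
ENVIRONMENT="${1:-staging}"

# base64 of 32 random bytes -> 44 chars; hex of 64 bytes -> 128 chars
gen_secret()   { openssl rand -base64 32; }
gen_key_base() { openssl rand -hex 64; }

REDIS_PASSWORD="$(gen_secret)"
JWT_SECRET="$(gen_secret)"
RAILS_SECRET_KEY_BASE="$(gen_key_base)"

# Fetching from Vault and writing the K8s secret need live infrastructure,
# so guard those steps
if command -v vault >/dev/null 2>&1 && command -v kubectl >/dev/null 2>&1 \
   && [ -n "${VAULT_ADDR:-}" ]; then
  DB_PASSWORD="$(vault kv get -field=postgres_password "secret/${ENVIRONMENT}/database")"
  kubectl create secret generic pcvn-app-secrets \
    --from-literal=postgres-password="$DB_PASSWORD" \
    --from-literal=redis-password="$REDIS_PASSWORD" \
    --from-literal=jwt-secret="$JWT_SECRET" \
    --from-literal=secret-key-base="$RAILS_SECRET_KEY_BASE" \
    --dry-run=client -o yaml | kubectl apply -f -
fi
```

Piping a client-side dry run into kubectl apply makes the script idempotent: re-running it updates the existing secret instead of failing on a duplicate.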

Key Achievements

  • Security Boundaries: Successfully implemented policy-based access control where staging tokens cannot access production secrets and vice versa
  • Audit Trail: Every secret access is now logged and traceable
  • Infrastructure as Code: Entire Vault deployment is reproducible via Terraform
  • CI/CD Integration: GitHub Actions can securely retrieve secrets without hardcoding
  • Cost Optimization: Ran Vault on a single e2-micro instance (GCP's smallest machine type), keeping infrastructure costs minimal

Verification and Testing

I created comprehensive verification scripts to ensure security boundaries were enforced:

./verify-vault-setup.sh
# Output:
# Test 1: Vault Service Health Check ✓
# Test 2: Secret Existence Verification ✓
# Test 3: Staging Token Permission Tests ✓
# Test 4: Production Token Permission Tests ✓
# Tests passed: 10/11
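The permission tests boil down to asserting that a command fails. A small helper in the spirit of the verification script, assuming a POSIX shell:

```shell
# Assert that a command is denied (exits non-zero); used to prove a
# staging token cannot read production paths
expect_denied() {
  if "$@" >/dev/null 2>&1; then
    echo "FAIL: unexpectedly succeeded: $*"
    return 1
  fi
  echo "PASS: denied as expected: $*"
}

# Example (only meaningful against a live Vault):
#   expect_denied env VAULT_TOKEN="$STAGING_TOKEN" \
#     vault kv get secret/production/database
```

Inverting the expectation this way means a policy regression (production suddenly readable) fails the test suite loudly instead of passing silently.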

Migration from Singapore to US Region

Once I confirmed I was operating from California rather than Vietnam as initially assumed, I migrated the infrastructure to reduce latency:

# Destroying Singapore infrastructure
terraform destroy
# Output: Destroy complete! Resources: 6 destroyed

# Redeploying to US West (Los Angeles)
terraform apply
# Output: Apply complete! Resources: 6 added
# New IP: 34.102.18.66 (us-west2)

Lessons Learned

  1. Start Simple, Scale Gradually: Beginning with Docker Compose for local development helped understand Vault concepts before cloud deployment
  2. Policy-First Design: Defining access policies before creating tokens ensures security from the start
  3. Geographic Optimization Matters: The initial Singapore deployment had ~150ms latency from California; moving to us-west2 reduced this to ~30ms
  4. Verification is Critical: Creating test scripts to verify security boundaries caught issues before production deployment

Next Steps

While I successfully implemented Vault for development and staging, production improvements would include:

  • Implementing auto-unseal with Google Cloud KMS
  • Setting up high availability with multiple Vault nodes
  • Adding TLS certificates for HTTPS connections
  • Implementing dynamic database credentials
  • Creating automated token rotation workflows
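For the auto-unseal item, Vault's server configuration supports a gcpckms seal stanza. A sketch with the project name from this post but hypothetical key-ring and key names:

```hcl
# Hypothetical key ring and key; create them in Cloud KMS first
seal "gcpckms" {
  project    = "project-01-24"
  region     = "global"
  key_ring   = "vault-keyring"
  crypto_key = "vault-unseal-key"
}
```

With this in place, Vault unseals itself at startup using Cloud KMS instead of requiring operators to enter unseal key shares by hand.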

This implementation transformed a vulnerable system with hardcoded secrets into a secure, auditable, and scalable secret management infrastructure ready for production workloads.


If you enjoyed this article, you can also find it published on LinkedIn and Medium.