---
name: "env-secrets-manager"
description: "Env & Secrets Manager"
---

# Env & Secrets Manager

**Tier:** POWERFUL  
**Category:** Engineering  
**Domain:** Security / DevOps / Configuration Management

---

## Overview

Complete environment and secrets management workflow: .env file lifecycle across dev/staging/prod, .env.example auto-generation, required-var validation, secret leak detection in git history, and a credential rotation playbook. Integrates with HashiCorp Vault, AWS SSM, 1Password CLI, and Doppler.

---

## Core Capabilities

- **.env lifecycle**: create, validate, sync across environments
- **.env.example generation**: strip values, preserve keys and comments
- **Validation script**: fail-fast on missing required vars at startup
- **Secret leak detection**: regex scan of git history and working tree
- **Rotation workflow**: detect → scope → rotate → deploy → verify
- **Secret manager integrations**: Vault KV v2, AWS SSM, 1Password, Doppler

---

## When to Use

- Setting up a new project: scaffold .env.example and validation
- Before every commit: scan for accidentally staged secrets
- Post-incident response: leaked credential rotation procedure
- Onboarding new developers: they need all vars, not just some
- Environment drift investigation: prod behaving differently from staging

---

## .env File Structure

### Canonical Layout

```bash
# .env.example: committed to git (no values)
# .env.local:   developer machine (gitignored)
# .env.staging: CI/CD or secret manager reference
# .env.prod:    never on disk; pulled from secret manager at runtime

# Application
APP_NAME=
APP_ENV=                    # dev | staging | prod
APP_PORT=3000               # default port if not set
APP_SECRET=                 # REQUIRED: JWT signing secret (min 32 chars)
APP_URL=                    # REQUIRED: public base URL

# Database
DATABASE_URL=               # REQUIRED: full connection string
DATABASE_POOL_MIN=2
DATABASE_POOL_MAX=10

# Auth
AUTH_JWT_SECRET=            # REQUIRED
AUTH_JWT_EXPIRY=3600        # seconds
AUTH_REFRESH_SECRET=        # REQUIRED

# Third-party APIs
STRIPE_SECRET_KEY=          # REQUIRED in prod
STRIPE_WEBHOOK_SECRET=      # REQUIRED in prod
SENDGRID_API_KEY=

# Storage
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_REGION=eu-central-1
AWS_S3_BUCKET=

# Monitoring
SENTRY_DSN=
DD_API_KEY=
```

---

## .gitignore Patterns

Add to your project's `.gitignore`:

```gitignore
# Environment files: NEVER commit these
.env
.env.local
.env.development
.env.development.local
.env.test.local
.env.staging
.env.staging.local
.env.production
.env.production.local
.env.prod
.env.*.local

# Secret files
*.pem
*.key
*.p12
*.pfx
secrets.json
secrets.yaml
secrets.yml
credentials.json
service-account.json

# AWS
.aws/credentials

# Terraform state (may contain secrets)
*.tfstate
*.tfstate.backup
.terraform/

# Kubernetes secrets
*-secret.yaml
*-secrets.yaml
```

---

## .env.example Auto-Generation

```bash
#!/bin/bash
# scripts/gen-env-example.sh
# Strips values from .env, preserves keys, defaults, and comments

INPUT="${1:-.env}"
OUTPUT="${2:-.env.example}"

if [ ! -f "$INPUT" ]; then
  echo "ERROR: $INPUT not found" >&2
  exit 1
fi

python3 - "$INPUT" "$OUTPUT" << 'PYEOF'
import sys, re

input_file = sys.argv[1]
output_file = sys.argv[2]

# Keys whose values are safe to keep as defaults (ports, regions, feature flags)
safe_defaults = re.compile(
    r'^(APP_PORT|APP_ENV|APP_NAME|AWS_REGION|DATABASE_POOL_|LOG_LEVEL|'
    r'FEATURE_|CACHE_TTL|RATE_LIMIT_|PAGINATION_|TIMEOUT_)',
    re.I
)
# Keys that look sensitive are always blanked, even if they match a safe prefix
sensitive = re.compile(
    r'(SECRET|KEY|TOKEN|PASSWORD|PASS|CREDENTIAL|DSN|AUTH|PRIVATE|CERT)',
    re.I
)

lines = []

with open(input_file) as f:
    for line in f:
        stripped = line.rstrip('\n')
        # Keep blank lines and comments as-is
        if stripped == '' or stripped.startswith('#'):
            lines.append(stripped)
            continue

        # Match KEY=VALUE or KEY="VALUE", with an optional leading "export"
        m = re.match(r'^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$', stripped)
        if m:
            key = m.group(1)
            value = m.group(2).strip('"\'')
            if safe_defaults.match(key) and not sensitive.search(key) and value:
                lines.append(f"{key}={value}  # default")
            else:
                lines.append(f"{key}=")
        else:
            lines.append(stripped)

with open(output_file, 'w') as f:
    f.write('\n'.join(lines) + '\n')

print(f"Generated {output_file} from {input_file}")
PYEOF
```

Usage:

```bash
bash scripts/gen-env-example.sh .env .env.example
# Commit .env.example, never .env
git add .env.example
```

---

## Required Variable Validation Script

→ See references/validation-detection-rotation.md for details

---

## Secret Manager Integrations

### HashiCorp Vault KV v2

```bash
# Setup
export VAULT_ADDR="https://vault.internal.company.com"
export VAULT_TOKEN="$(vault login -method=oidc -format=json | jq -r '.auth.client_token')"

# Write secrets
vault kv put secret/myapp/prod \
  DATABASE_URL="postgres://user:pass@host/db" \
  APP_SECRET="$(openssl rand -base64 32)"

# Read secrets into env
# (@sh shell-quotes each value so spaces and special characters survive eval)
eval "$(vault kv get -format=json secret/myapp/prod | \
  jq -r '.data.data | to_entries[] | "export \(.key)=\(.value | @sh)"')"

# In CI/CD (GitHub Actions)
# Use vault-action: hashicorp/vault-action@v2
```

### AWS SSM Parameter Store

```bash
# Write (SecureString = encrypted with KMS)
aws ssm put-parameter \
  --name "/myapp/prod/DATABASE_URL" \
  --value "postgres://..." \
  --type "SecureString" \
  --key-id "alias/myapp-secrets"

# Read all params for an app/env into shell
# Text output is tab-separated, so split on tabs to keep values with spaces intact.
# Values containing double quotes, $ or backticks still need extra escaping.
eval "$(aws ssm get-parameters-by-path \
  --path "/myapp/prod/" \
  --with-decryption \
  --query "Parameters[*].[Name,Value]" \
  --output text | \
  awk -F'\t' '{n=split($1,a,"/"); print "export " a[n] "=\"" $2 "\""}')"

# In Node.js at startup
# Use @aws-sdk/client-ssm to pull params before server starts
```

### 1Password CLI

```bash
# Authenticate
eval $(op signin)

# Get a specific field
op read "op://MyVault/MyApp Prod/STRIPE_SECRET_KEY"

# Export all fields from an item as env vars
# Process substitution keeps the exports in the current shell; piping into
# `source` would run it in a subshell and the variables would be lost.
source <(op item get "MyApp Prod" --format json | \
  jq -r '.fields[] | select(.value != null) | "export \(.label)=\(.value | @sh)"' | \
  grep -E "^export [A-Z_][A-Z0-9_]*=")

# .env injection
op inject -i .env.tpl -o .env
# .env.tpl uses {{ op://Vault/Item/field }} syntax
```

### Doppler

```bash
# Setup
doppler setup  # interactive: select project + config

# Run any command with secrets injected
doppler run -- node server.js
doppler run -- npm run dev

# Export to .env (local dev only, never commit output)
doppler secrets download --no-file --format env > .env.local

# Pull specific secret
doppler secrets get DATABASE_URL --plain

# Sync to another environment (the file is passed as an argument)
doppler secrets upload --project myapp --config staging .env.staging.example
```

---

## Environment Drift Detection

Check if staging and prod have the same set of keys (values may differ):

```bash
#!/bin/bash
# scripts/check-env-drift.sh
set -euo pipefail

# Download both configs and keep only the key names.
# Errors are not suppressed: a failed fetch would otherwise look like "no drift".
STAGING_KEYS=$(doppler secrets download --project myapp --config staging --no-file --format json | \
  jq -r 'keys[]' | sort)
PROD_KEYS=$(doppler secrets download --project myapp --config prod --no-file --format json | \
  jq -r 'keys[]' | sort)

ONLY_IN_STAGING=$(comm -23 <(echo "$STAGING_KEYS") <(echo "$PROD_KEYS"))
ONLY_IN_PROD=$(comm -13 <(echo "$STAGING_KEYS") <(echo "$PROD_KEYS"))

if [ -n "$ONLY_IN_STAGING" ]; then
  echo "Keys in STAGING but NOT in PROD:"
  echo "$ONLY_IN_STAGING" | sed 's/^/  /'
fi

if [ -n "$ONLY_IN_PROD" ]; then
  echo "Keys in PROD but NOT in STAGING:"
  echo "$ONLY_IN_PROD" | sed 's/^/  /'
fi

if [ -z "$ONLY_IN_STAGING" ] && [ -z "$ONLY_IN_PROD" ]; then
  echo "✅ No env drift detected: staging and prod have identical key sets"
else
  exit 1  # non-zero exit lets CI fail on drift
fi
```

---

## Common Pitfalls

- **Committing .env instead of .env.example**: add `.env` to .gitignore on day 1; use pre-commit hooks
- **Leaking secrets into CI/CD logs**: never `echo $SECRET`; mask vars in CI settings
- **Rotating only one place**: secrets often appear in Heroku, Vercel, Docker, K8s, CI; update ALL of them
- **Forgetting that JWT secret rotation invalidates sessions**: all users will be logged out; communicate this in advance
- **Putting real values in .env.example**: example files are public; strip everything sensitive
- **Not monitoring after rotation**: watch audit logs for 24h after rotation to catch unauthorized old-credential use
- **Weak secrets**: `APP_SECRET=mysecret` is not a secret. Use `openssl rand -base64 32`

---

## Best Practices

1. **Secret manager is source of truth**: .env files are for local dev only; never in prod
2. **Rotate on a schedule**, not just after incidents: quarterly minimum for long-lived keys
3. **Principle of least privilege**: each service gets its own API key with minimal permissions
4. **Audit access**: log every secret read in Vault/SSM; alert on anomalous access
5. **Never log secrets**: add log scrubbing middleware that redacts known secret patterns
6. **Use short-lived credentials**: prefer OIDC/instance roles over long-lived access keys
7. **Separate secrets per environment**: never share a key between dev and prod
8. **Document rotation runbooks**: before an incident, not during one
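
The fail-fast validation listed under Core Capabilities is documented in references/validation-detection-rotation.md, which may differ from what follows. As a minimal sketch only, the check could look roughly like this in Python. The names `REQUIRED_ALWAYS`, `REQUIRED_IN_PROD`, `MIN_SECRET_LENGTH`, `validate_env`, and `fail_fast` are hypothetical, and the rules are assumptions read off the `REQUIRED` comments in the canonical layout.

```python
import sys

# Hypothetical rule set, assumed from the "REQUIRED" comments in the canonical layout.
REQUIRED_ALWAYS = [
    "APP_SECRET",
    "APP_URL",
    "DATABASE_URL",
    "AUTH_JWT_SECRET",
    "AUTH_REFRESH_SECRET",
]
REQUIRED_IN_PROD = ["STRIPE_SECRET_KEY", "STRIPE_WEBHOOK_SECRET"]
MIN_SECRET_LENGTH = 32  # APP_SECRET is documented as "min 32 chars"


def validate_env(env):
    """Return a list of problems; an empty list means the environment is valid."""
    problems = []
    required = list(REQUIRED_ALWAYS)
    if env.get("APP_ENV") == "prod":
        required += REQUIRED_IN_PROD

    # Missing and blank values are treated the same way.
    for name in required:
        if not env.get(name, "").strip():
            problems.append(f"{name} is missing or empty")

    # Only check the length when a value is present, to avoid a duplicate report.
    secret = env.get("APP_SECRET", "")
    if secret.strip() and len(secret) < MIN_SECRET_LENGTH:
        problems.append(f"APP_SECRET is shorter than {MIN_SECRET_LENGTH} characters")
    return problems


def fail_fast(env):
    """Print every problem to stderr and exit non-zero if any were found."""
    problems = validate_env(env)
    if problems:
        for problem in problems:
            print(f"ERROR: {problem}", file=sys.stderr)
        sys.exit(1)
```

Calling `fail_fast(os.environ)` as the first statement of an entry point would stop the process before it serves traffic with a broken configuration. Keeping `validate_env` free of side effects is a design choice that lets the same rules run in tests or CI against a plain dictionary.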