Troubleshooting Preview Environments #
This guide helps you diagnose and resolve common issues with Preview Environments.
Quick Diagnostics #
Before diving into specific issues, run these checks:
1. Check GitHub Actions Status #
# Navigate to your PR on GitHub
→ Click "Actions" tab
→ Find "Preview Environment" workflow
→ Check for errors or failures
2. Check Coder Workspace Status #
# Visit Coder Dashboard
→ Go to https://coder.internal.gotofu.com/workspaces
→ Find workspace: coder-preview-{PR-number}
→ Check status: should be "Running" and "Healthy"
3. Verify Preview URL #
# From PR comments, find the preview URL
→ It should match: https://app--dev--{workspace}--{owner}.coder.internal.gotofu.com/
→ Click the link to test access
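To compare the URL in the PR comment against the expected pattern, the pattern above can be expanded by hand or with a small helper. This is a minimal sketch: the function name `preview_url` is illustrative, and it assumes the workspace and owner names are used in the URL exactly as they appear in Coder.

```bash
# Build the expected preview URL from a workspace name and its owner.
# The URL pattern is the one given in this guide; the function name is hypothetical.
preview_url() {
  local workspace="$1" owner="$2"
  printf 'https://app--dev--%s--%s.coder.internal.gotofu.com/\n' "$workspace" "$owner"
}

# Example: preview_url "coder-preview-123" "john"
```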
Common Issues #
Environment Creation Problems #
Issue: “Workflow fails at workspace creation step” #
Symptoms:
- GitHub Actions shows error in “Create workspace” step
- No workspace appears in Coder dashboard
Common Causes:
- Coder service issue
  Check:
  - Is Coder dashboard accessible?
  - Are other workspaces working?
  Solution:
  - Wait a few minutes and retry
  - Contact DevOps if Coder is down
- AWS capacity issues
  Error: "InsufficientInstanceCapacity"
  Solution:
  - GitHub Actions will automatically retry
  - If persists, try different time of day
  - Contact DevOps to check AWS status
- Invalid parameters
  Error: "Parameter validation failed"
  Check:
  - PR URL is correct
  - Tokens are valid (not expired)
  Solution:
  - Verify PR is against 'main' branch
  - Check if workflow parameters are correct
Issue: “Workspace created but build fails” #
Symptoms:
- Workspace appears in Coder
- Build status shows “Failed”
- Health check never passes
Common Causes:
- Compilation errors
  Check build logs for:
  - Rust compilation errors
  - TypeScript type errors
  - Python dependency issues
  Solution:
  - Fix errors in your code
  - Push new commit to trigger rebuild
  - Verify changes work locally first
- Database migration issues
  Error: "Migration failed" or "Could not apply migrations"
  Solution:
  - Check migration files for syntax errors
  - Ensure migrations are sequential
  - Test migrations locally: mise run db-migrate
- Docker image pull failures
  Error: "Failed to pull image" or "Image not found"
  Solution:
  - Usually temporary network issue
  - Wait and retry
  - Check if image exists in registry
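For temporary failures such as image pulls, "wait and retry" can be scripted instead of repeated by hand. The helper below is a generic sketch, not part of the preview tooling: the name `retry` and the doubling delay are assumptions chosen for illustration.

```bash
# Run a command up to <attempts> times, doubling the delay after each failure.
# Usage: retry <attempts> <initial-delay-seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed; retrying in ${delay}s" >&2
    sleep "$delay"
    n=$(( n + 1 ))
    delay=$(( delay * 2 ))
  done
}

# Example (in the workspace terminal):
# retry 3 5 docker compose -f docker-compose.preview.yml pull
```

If the command still fails after the last attempt, the cause is probably not a transient network issue, so check whether the image exists in the registry.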
Issue: “Build takes too long (>60 minutes)” #
Symptoms:
- Workspace building for over an hour
- No error messages but not completing
Diagnosis:
# Access workspace terminal via Coder
→ Open workspace in Coder dashboard
→ Click "Terminal"
→ Check processes: ps aux | grep docker
→ Check logs: docker compose logs -f
Solutions:
- Check build progress
  # In workspace terminal
  cd ~/bonsai
  docker compose -f docker-compose.preview.yml ps
  docker compose -f docker-compose.preview.yml logs --tail=100
- Restart stuck services
  # Restart all services
  docker compose -f docker-compose.preview.yml restart
  # Or restart specific service
  docker compose -f docker-compose.preview.yml restart bonsapi
- Nuclear option: Rebuild workspace
  # Remove preview label from PR
  → Wait 1 minute
  → Re-add preview label
Access Issues #
Issue: “Cannot access preview URL” #
Symptoms:
- Preview URL returns 404 or connection timeout
- “This site can’t be reached”
Diagnosis:
- Verify workspace is running
  Coder Dashboard → Find workspace
  Status should be: "Running" + "Healthy"
  If not:
  - Status "Stopped": Start workspace
  - Status "Unhealthy": Check build logs
- Check URL format
  Correct format: https://app--dev--coder-preview-123--john.coder.internal.gotofu.com/
  Common mistakes:
  - Missing 'https://'
  - Extra spaces
  - Wrong subdomain
- Verify network connectivity
  # Test if you can reach Coder
  ping coder.internal.gotofu.com
  # If fails: Check VPN connection
  # internal.gotofu.com requires VPN
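The URL format mistakes listed above can also be caught with a quick check before digging further. The following is a sketch only: the function name is hypothetical, and the set of characters allowed in workspace and owner names (letters, digits, hyphens) is an assumption.

```bash
# Report the first format problem found in a preview URL, or "ok".
check_preview_url() {
  local url="$1"
  case "$url" in
    https://*) ;;
    *) echo "missing https://"; return 1 ;;
  esac
  case "$url" in
    *[[:space:]]*) echo "contains whitespace"; return 1 ;;
  esac
  # Assumed shape: app--dev--<workspace>--<owner>.coder.internal.gotofu.com
  if ! printf '%s\n' "$url" | grep -Eq '^https://app--dev--[A-Za-z0-9-]+--[A-Za-z0-9-]+\.coder\.internal\.gotofu\.com/?$'; then
    echo "unexpected subdomain or host"
    return 1
  fi
  echo "ok"
}

# Example: check_preview_url "https://app--dev--coder-preview-123--john.coder.internal.gotofu.com/"
```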
Solutions:
- Start workspace
  Coder Dashboard → Workspace → Click "Start"
  Wait 2-3 minutes for services to start
- Restart Coder agent
  # In workspace terminal
  sudo systemctl restart coder
- Verify nginx is running
  # In workspace terminal
  cd ~/bonsai
  docker compose -f docker-compose.preview.yml ps nginx-proxy
  # Should show "Up"
  # If not:
  docker compose -f docker-compose.preview.yml restart nginx-proxy
Issue: “502 Bad Gateway” error #
Symptoms:
- Can reach URL but see “502 Bad Gateway”
- Intermittent access
Common Causes:
- Service not ready
  Services still starting up
  Solution:
  - Wait 5-10 minutes
  - Check service status
- Backend service crashed
  # Check service status
  docker compose -f docker-compose.preview.yml ps
  # Check logs for crashes
  docker compose -f docker-compose.preview.yml logs bonsapi --tail=100
- Nginx configuration issue
  # Test nginx config
  docker exec bonsai-nginx-proxy nginx -t
  # Restart nginx if needed
  docker compose -f docker-compose.preview.yml restart nginx-proxy
Solutions:
- Restart failing service
  # Identify failed service
  docker compose -f docker-compose.preview.yml ps
  # Restart it
  docker compose -f docker-compose.preview.yml restart <service-name>
- Check resource usage
  # In workspace terminal
  docker stats
  # If memory/CPU maxed out:
  # Restart heavy services or rebuild
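When many services are defined, scanning the `ps` output by eye for a failed one is error-prone. One possible approach is to compare the services defined in the compose file with those currently running. The helper name below is hypothetical, and the usage lines assume Docker Compose v2, where `config --services` and `ps --services --status running` are available.

```bash
# Print every service in the "all" list that is absent from the "running" list.
# $1: newline-separated names of all defined services
# $2: newline-separated names of running services
not_running() {
  local all="$1" running="$2" svc
  for svc in $all; do
    printf '%s\n' "$running" | grep -qxF -- "$svc" || echo "$svc"
  done
}

# Example (in the workspace terminal, from ~/bonsai):
# f=docker-compose.preview.yml
# not_running "$(docker compose -f "$f" config --services)" \
#             "$(docker compose -f "$f" ps --services --status running)"
```

Each name printed is a candidate for the restart command shown above.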
Issue: “Authentication loop” or “Cannot sign in” #
Symptoms:
- Redirected to login repeatedly
- “Authentication failed” error
- Stuck on auth page
Common Causes:
- Cookie issues
  Solution:
  - Clear browser cookies for the domain
  - Try incognito/private window
  - Try different browser
- Clerk configuration issue
  Check:
  - Environment variables set correctly
  - CLERK_SECRET_KEY present
  - Domain configuration correct
  Solution:
  - Verify .env file in workspace
  - Restart webapp service
- Session storage issues
  # Check Redis is running
  docker compose -f docker-compose.preview.yml ps redis
  # If not healthy:
  docker compose -f docker-compose.preview.yml restart redis
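Checking that required variables such as CLERK_SECRET_KEY are present in the .env file can be done without printing their values. This is a rough sketch: the function name is illustrative, the location of the .env file is an assumption, and a quoted empty value (`KEY=""`) is not detected as missing.

```bash
# Print each required key that is missing or has an empty value in an env file.
# Returns non-zero if any key is missing. Values are never printed.
missing_env_keys() {
  local env_file="$1" key status=0
  shift
  for key in "$@"; do
    if ! grep -Eq "^(export[[:space:]]+)?${key}=.+" "$env_file"; then
      echo "$key"
      status=1
    fi
  done
  return "$status"
}

# Example (the path is an assumption; use the .env file your workspace actually loads):
# missing_env_keys ~/bonsai/.env CLERK_SECRET_KEY
```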
Update/Sync Issues #
Issue: “Environment not updating after push” #
Symptoms:
- Pushed new commits but changes not reflected
- Old code still running
Diagnosis:
- Check if workflow triggered
  GitHub → Actions → Look for new "Preview Environment" run
  Should trigger on every push to PR branch
  If not triggering:
  - Verify PR has 'preview' label
  - Check workflow file for conditions
- Check rebuild status
  Coder Dashboard → Workspace → Check "Latest Build"
  Should show recent build time
Solutions:
- Manual restart
  Coder Dashboard → Workspace → Click "Restart"
  This forces rebuild with latest code
- Check build logs
  Coder Dashboard → Workspace → "Builds" → Latest
  Look for errors in build process
- Force rebuild via GitHub
  # Remove and re-add label
  PR → Remove 'preview' label
  Wait 1 minute
  PR → Add 'preview' label back
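The remove-and-re-add label cycle can also be run from a terminal. The sketch below assumes the GitHub CLI (`gh`) is installed, authenticated, and run from inside a clone of the repository. The function names and the `DRY_RUN` switch are illustrative and not part of the preview tooling.

```bash
# Run a command, or only print it when DRY_RUN=1.
run_cmd() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Remove the 'preview' label, wait, then add it back to force a rebuild.
# Usage: relabel_preview <pr-number> [wait-seconds, default 60]
relabel_preview() {
  local pr="$1" wait_seconds="${2:-60}"
  run_cmd gh pr edit "$pr" --remove-label preview || return 1
  run_cmd sleep "$wait_seconds"
  run_cmd gh pr edit "$pr" --add-label preview
}

# Example: DRY_RUN=1 relabel_preview 123
```

Running with `DRY_RUN=1` first shows the exact commands without touching the PR.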
Issue: “Rebuild fails after working initially” #
Symptoms:
- Environment worked before
- After update, build fails
- Previously green builds now red
Common Causes:
- New code has errors
  Solution:
  - Check build logs for compile errors
  - Test locally: mise run ci
  - Fix errors and push again
- New dependencies
  If you added new dependencies:
  - Ensure they're in package.json/Cargo.toml
  - Check for platform compatibility
  - Verify versions are correct
- Database migration issues
  Error: "Migration conflict" or "Duplicate migration"
  Solution:
  - Check migration file names
  - Ensure sequential numbering
  - Verify no conflicts with main branch
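A "Duplicate migration" error often means that two files share the same version prefix, for example after merging main into the PR branch. The check below is a sketch under stated assumptions: it assumes migration file names begin with a numeric version (such as `002_...` or a timestamp), and both the function name and the directory in the example are hypothetical.

```bash
# Print every numeric version prefix used by more than one file in a directory.
# Files without a leading number are ignored.
duplicate_migration_versions() {
  local dir="$1" file
  for file in "$dir"/*; do
    [ -e "$file" ] || continue
    basename "$file"
  done | sed -E -n 's/^([0-9]+).*/\1/p' | sort | uniq -d
}

# Example (the directory is a placeholder for wherever your migrations live):
# duplicate_migration_versions ./migrations
```

No output means no two files share a version prefix. It does not prove that the numbering is gap-free.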
Performance Issues #
Issue: “Environment is very slow” #
Symptoms:
- Pages load slowly
- API responses delayed
- Timeouts
Common Causes:
- Resource exhaustion
  # Check resource usage
  docker stats
  # High CPU/Memory usage may indicate:
  - Heavy background job running
  - Memory leak
  - Inefficient code
  Solution:
  - Check background job queues
  - Review code for performance issues
  - Restart services to clear memory
- Database performance
  # Check database size
  docker exec -it database psql -U postgres -c "SELECT pg_size_pretty(pg_database_size('bonsai'));"
  # Check active queries, longest-running first
  docker exec -it database psql -U postgres -c "SELECT pid, now() - query_start AS runtime, query FROM pg_stat_activity WHERE state = 'active' ORDER BY query_start;"
- Too many concurrent processes
  # Check running containers
  docker ps
  # Stop unnecessary services
  docker compose -f docker-compose.preview.yml stop <service-name>
Solutions:
- Restart services
  docker compose -f docker-compose.preview.yml restart
- Clear cache
  # Clear Redis cache
  # Note: FLUSHALL deletes every key in Redis, not only cached data
  docker exec -it bonsai-redis redis-cli FLUSHALL
- Optimize queries
  - Review database query performance
  - Add indexes if needed
  - Optimize N+1 queries
Issue: “Out of disk space” #
Symptoms:
- “No space left on device” errors
- Cannot write files
- Database errors
Diagnosis:
# Check disk usage
df -h
# Check Docker disk usage
docker system df
# Find large directories
du -sh ~/bonsai/* | sort -h
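To spot a nearly full filesystem before it produces "No space left on device", the `df` output can be filtered by usage percentage. This is a minimal sketch: the function name and the default threshold of 90% are assumptions, and it expects the POSIX output format of `df -P` with no spaces in device names or mount points.

```bash
# Read `df -P` output on stdin and print mount points at or above a usage threshold.
# Usage: df -P | full_filesystems [threshold-percent, default 90]
full_filesystems() {
  local threshold="${1:-90}"
  awk -v limit="$threshold" 'NR > 1 {
    use = $5
    sub(/%$/, "", use)              # strip the trailing percent sign
    if (use + 0 >= limit + 0) print $6 " " $5
  }'
}

# Example: df -P | full_filesystems 85
```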
Solutions:
- Clean Docker cache
  # Remove unused images
  docker image prune -a
  # Remove unused volumes (this deletes the data stored in them)
  docker volume prune
  # Remove build cache
  docker builder prune
- Clean build artifacts
  cd ~/bonsai
  # Clean Rust builds
  cargo clean
  # Clean Node modules (will need reinstall)
  rm -rf node_modules apps/webapp/node_modules
- Request larger disk
  Contact DevOps to:
  - Increase EBS volume size
  - Or provision new workspace with larger disk
Integration Issues #
For integration-specific issues, see the Integration Setup guide.
Issue: “OAuth callback fails” #
Quick Fixes:
- Verify redirect URI configured correctly
- Check URL exactly matches (including protocol)
- Contact developer to verify OAuth app settings
Issue: “Integration disconnects frequently” #
Diagnosis:
- Check token expiration times
- Verify webhook URLs if applicable
- Review integration logs
Solutions:
- Reconnect integration
- Check for API rate limits
- Verify credentials haven’t expired
Service-Specific Issues #
BonsAPI (Backend) #
Check if running:
docker compose -f docker-compose.preview.yml ps bonsapi
View logs:
docker compose -f docker-compose.preview.yml logs -f bonsapi
Common errors:
- Database connection failed → Check DATABASE_URL
- Redis connection failed → Restart Redis
- RabbitMQ connection failed → Check RabbitMQ health
Webapp (Frontend) #
Check if running:
docker compose -f docker-compose.preview.yml ps webapp
View logs:
docker compose -f docker-compose.preview.yml logs -f webapp
Common errors:
- API connection failed → Check NEXT_PUBLIC_BONSAPI_HOST
- Build errors → Check TypeScript compilation
- Module not found → Reinstall dependencies
Database (PostgreSQL) #
Check if running:
docker compose -f docker-compose.preview.yml ps database
Access database:
docker exec -it database psql -U postgres -d bonsai
Common issues:
- Connection refused → Database not started
- Authentication failed → Check credentials
- Too many connections → Restart services
Background Workers #
Check status:
# View all workers
docker compose -f docker-compose.preview.yml ps | grep bonsai-
# Check specific worker
docker compose -f docker-compose.preview.yml logs bonsai-invoice
Common issues:
- Not processing jobs → Check RabbitMQ queues
- Crashing repeatedly → Review error logs
- High CPU usage → Check for infinite loops
Advanced Debugging #
Accessing Workspace Terminal #
# Via Coder web terminal
Coder Dashboard → Workspace → Terminal
# Or via SSH
ssh coder.coder-preview-{PR-number}
Useful Commands #
# See all services
docker compose -f docker-compose.preview.yml ps
# Follow logs for all services
docker compose -f docker-compose.preview.yml logs -f
# Check specific service logs
docker compose -f docker-compose.preview.yml logs -f <service-name>
# Restart specific service
docker compose -f docker-compose.preview.yml restart <service-name>
# Check resource usage
docker stats
# Check disk space
df -h
# Check memory
free -h
# Check running processes
ps aux | grep docker
Database Debugging #
# Access database shell
docker exec -it database psql -U postgres -d bonsai
# Check tables
\dt
# Check recent migrations
SELECT * FROM atlas_schema_revisions ORDER BY executed_at DESC LIMIT 5;
# Check database size
SELECT pg_size_pretty(pg_database_size('bonsai'));
# Find long-running queries (longest first)
SELECT pid, now() - query_start AS runtime, state, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY query_start;
RabbitMQ Debugging #
# Access RabbitMQ management UI
# Via Coder workspace apps or:
http://localhost:15672
# Check queues via CLI
docker exec bonsai-rabbitmq rabbitmqctl list_queues
# Check connections
docker exec bonsai-rabbitmq rabbitmqctl list_connections
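When workers are not processing jobs, the queue listing can be filtered down to queues with a backlog. The sketch below assumes the default `list_queues` columns (queue name, then message count). The function name and the default limit of 100 are illustrative.

```bash
# Read `rabbitmqctl list_queues` output on stdin and print queues whose
# message count exceeds the limit. Header and status lines are skipped
# because their last field is not a number.
# Usage: ... | backlogged_queues [limit, default 100]
backlogged_queues() {
  local limit="${1:-100}"
  awk -v limit="$limit" \
    'NF >= 2 && $NF ~ /^[0-9]+$/ && $NF + 0 > limit + 0 { print $1 " " $NF }'
}

# Example:
# docker exec bonsai-rabbitmq rabbitmqctl list_queues | backlogged_queues 100
```

A queue that stays above the limit while its worker container is up suggests the worker is stuck, so the worker logs are the next place to look.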
Getting Help #
Before Asking for Help #
Collect this information:
- Preview URL
  Example: https://app--dev--coder-preview-123--john.coder.internal.gotofu.com/
- PR Number and Link
  Example: #123 - https://github.com/tofu2-limited/bonsai/pull/123
- Error Message
  Copy exact error message from:
  - Browser console (F12)
  - GitHub Actions logs
  - Coder build logs
  - Service logs
- Steps to Reproduce
  1. Go to...
  2. Click...
  3. See error...
- What You’ve Tried
  - Restarted services
  - Cleared cache
  - etc.
Where to Get Help #
For Infrastructure Issues:
- Slack: #devops
- Contact: DevOps team
For Application Issues:
- Slack: #engineering
- Tag: @backend-team or @frontend-team
For Integration Issues:
- Slack: #engineering
- Tag: @integration-team
For Urgent Production Impact:
- Slack: #incidents
- Follow incident response process
Prevention Tips #
Before Creating Preview #
- ✅ Run mise run ci locally
- ✅ Test changes on local environment
- ✅ Ensure migrations work
- ✅ Verify no compilation errors
- ✅ Check PR is against ‘main’ branch
During Testing #
- ✅ Monitor resource usage
- ✅ Check logs regularly
- ✅ Document issues as you find them
- ✅ Clean up test data periodically
- ✅ Report problems early
After Testing #
- ✅ Document any environment-specific issues
- ✅ Close PR when done (auto-deletes environment)
- ✅ Share learnings with team
- ✅ Update this documentation if needed
Next Steps #
- Integration Setup - Configure external services
- Creating Preview Environments - Step-by-step guide
- Accessing Preview Environments - How to use environments