Database Access #
This runbook covers how to safely access production and development databases for troubleshooting, data inspection, and emergency operations.
When to Use This Runbook #
- Investigating data-related issues in production
- Running read-only queries to diagnose problems
- Verifying data integrity after incidents
- Performing emergency data fixes (with proper approvals)
- Analyzing slow queries or database performance
Overview #
BonsAI uses PostgreSQL databases across environments:
| Environment | Database | Access Method |
|---|---|---|
| Local | Coder workspace | CloudBeaver or direct connection |
| Development | RDS PostgreSQL | ? |
| Production | RDS PostgreSQL | Via Tailscale VPN |
Local/Development Database Access #
Using CloudBeaver (Recommended) #
CloudBeaver provides a web-based interface for database management.
Access from Coder Workspace:
- Open your Coder workspace
- Click the “Database” app in the workspace dashboard
- Access it at http://localhost:5432
First-Time Setup:
# If CloudBeaver is not running
mise run dev
Then follow the setup wizard:
- Username: `admin`
- Password: `admin123`
- Add PostgreSQL connection:
  - Host: `database`
  - Port: `5432`
  - Database: `bonsai`
  - Username: `postgres`
  - Password: `postgres`
See: CloudBeaver Documentation for detailed usage.
Direct Database Connection #
For command-line access:
# Connect to local database
psql postgresql://postgres:postgres@localhost:5432/bonsai
# Or using Docker
docker exec -it bonsai-database psql -U postgres -d bonsai
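For quick one-off checks from the shell, the `-c` flag of psql runs a single statement and exits. A minimal sketch against the local connection string above (the query itself is just illustrative):
# Run a single read-only query and exit
psql -c "SELECT COUNT(*) FROM documents;" \
  "postgresql://postgres:postgres@localhost:5432/bonsai"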
Production Database Access #
Important Security Notice #
CRITICAL: Production database access should be:
- Read-only unless absolutely necessary
- Approved by a team lead or senior engineer
- Documented in incident logs
- Used only for troubleshooting and emergency fixes
Prerequisites #
- Tailscale VPN Access
  - Contact the DevOps team for Tailscale setup
  - Requires company authentication
- AWS SSO Configuration (run `aws configure sso`; see the login example after this list)
  - SSO start URL: https://tofu-bonsai.awsapps.com/start
  - Region: eu-central-1
  - Role: `full_access` (production) or `bonsai_developers` (dev only)
- Database Credentials
  - Stored in 1Password
  - Access with appropriate permissions only
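After the SSO profile is configured, log in and confirm your identity before using any database tooling. A minimal sketch, assuming the `bonsai-prod` profile name used later in this runbook:
# Log in via AWS SSO (opens a browser for authentication)
aws sso login --profile bonsai-prod
# Confirm which account and role you are operating as
aws sts get-caller-identity --profile bonsai-prod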
Accessing Production Database #
Step 1: Get Database Connection Details
# Set your AWS profile
export AWS_PROFILE=bonsai-prod
# Get database endpoint from Doppler
doppler secrets get DATABASE_URL --project bonsai --config prod --plain
The DATABASE_URL format:
postgresql://username:password@host:5432/database_name
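Since psql accepts a full connection URI, the value from Doppler can be passed straight through once network access is in place. A sketch, intended for read-only work only:
# Feed the Doppler secret directly into psql
psql "$(doppler secrets get DATABASE_URL --project bonsai --config prod --plain)"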
Step 2: Connect via Tailscale (NOT YET CONFIGURED)
Once Tailscale is set up:
# Ensure Tailscale is running
sudo tailscale up
# Connect to database
psql "postgresql://username:password@internal-host:5432/bonsai"
Step 3: Using CloudBeaver for Production
CloudBeaver can be configured to connect to production:
- Add new connection in CloudBeaver
- Use Tailscale-accessible hostname
- Enter credentials from Doppler
- Enable read-only mode for safety
Safe Query Practices #
Always Start with Read-Only Queries #
-- Check current transaction isolation
SHOW transaction_read_only;
-- Set to read-only (session level)
SET SESSION CHARACTERISTICS AS TRANSACTION READ ONLY;
-- Verify data before changes
SELECT * FROM users WHERE id = 'target-id';
Use Transactions for Safety #
-- Start transaction
BEGIN;
-- Make your changes
UPDATE documents SET status = 'processed' WHERE id = 'doc-123';
-- Verify the change
SELECT * FROM documents WHERE id = 'doc-123';
-- If correct, commit. If not, rollback.
COMMIT; -- or ROLLBACK;
Query Timeouts #
Set statement timeout to prevent long-running queries:
-- Set 30 second timeout
SET statement_timeout = '30s';
-- Your query here
SELECT * FROM large_table WHERE ...;
Common Database Operations #
Investigating User Issues #
-- Find user by email
SELECT id, email, created_at, updated_at
FROM users
WHERE email = 'user@example.com';
-- Check user's documents
SELECT id, name, status, created_at
FROM documents
WHERE user_id = 'user-uuid'
ORDER BY created_at DESC
LIMIT 10;
-- Check user's invoices
SELECT id, invoice_number, status, total_amount
FROM invoices
WHERE organization_id IN (
SELECT organization_id FROM users WHERE id = 'user-uuid'
)
ORDER BY created_at DESC;
Checking Document Processing Status #
-- Find stuck documents
SELECT id, name, status, created_at, updated_at
FROM documents
WHERE status = 'processing'
AND updated_at < NOW() - INTERVAL '1 hour'
ORDER BY created_at DESC;
-- Check processing queue
SELECT
status,
COUNT(*) as count,
AVG(EXTRACT(EPOCH FROM (NOW() - created_at))) as avg_age_seconds
FROM documents
GROUP BY status;
Analyzing Performance Issues #
-- Find slow queries (requires pg_stat_statements extension)
SELECT
query,
calls,
total_exec_time / 1000 as total_time_sec,
mean_exec_time / 1000 as mean_time_sec,
max_exec_time / 1000 as max_time_sec
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Check table sizes
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
-- Check index usage
SELECT
schemaname,
tablename,
indexname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
Checking Database Health #
-- Active connections
SELECT
datname,
count(*) as connections,
state
FROM pg_stat_activity
GROUP BY datname, state
ORDER BY connections DESC;
-- Long-running queries
SELECT
pid,
now() - pg_stat_activity.query_start AS duration,
state,
query
FROM pg_stat_activity
WHERE state != 'idle'
AND query_start < NOW() - INTERVAL '5 minutes'
ORDER BY duration DESC;
-- Lock information
SELECT
l.locktype,
l.mode,
l.granted,
a.query,
a.pid
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.pid
WHERE NOT l.granted;
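When a lock is not granted, pg_blocking_pids() (available since PostgreSQL 9.6) shows which backends are doing the blocking; a small helper query:
-- Show waiting sessions and the PIDs blocking them
SELECT
    pid,
    pg_blocking_pids(pid) AS blocked_by,
    state,
    query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;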
Database Backup and Restore #
Checking Backup Status #
Backups are automated through AWS RDS:
# List recent RDS snapshots
aws rds describe-db-snapshots \
--db-instance-identifier bonsai-prod-db \
--query 'DBSnapshots[*].[DBSnapshotIdentifier,SnapshotCreateTime,Status]' \
--output table
Creating Manual Backup #
# Create manual snapshot
aws rds create-db-snapshot \
--db-instance-identifier bonsai-prod-db \
--db-snapshot-identifier bonsai-manual-$(date +%Y%m%d-%H%M%S)
Exporting Specific Tables #
For data exports or point-in-time backups:
# Export single table
pg_dump -h <host> -U <user> -d bonsai -t users --data-only > users_backup.sql
# Export schema only
pg_dump -h <host> -U <user> -d bonsai --schema-only > schema_backup.sql
# Export with specific format (custom format for pg_restore)
pg_dump -h <host> -U <user> -d bonsai -Fc > bonsai_backup.dump
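To bring a custom-format dump back, pg_restore is the counterpart to `pg_dump -Fc`. A sketch that targets a scratch database (the `bonsai_scratch` name is illustrative and the database must already exist; never restore straight into production):
# Restore the custom-format dump into an existing scratch database
pg_restore -h <host> -U <user> -d bonsai_scratch --no-owner bonsai_backup.dump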
Point-in-Time Recovery (PITR) #
RDS supports Point-in-Time Recovery:
# Restore to specific timestamp
aws rds restore-db-instance-to-point-in-time \
--source-db-instance-identifier bonsai-prod-db \
--target-db-instance-identifier bonsai-restored-$(date +%Y%m%d) \
--restore-time "2025-10-21T10:30:00Z"
IMPORTANT: PITR creates a new database instance. Never restore over production!
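The restored instance gets its own endpoint, which you point your tooling at once it reaches the available state. A sketch for looking it up (use the identifier chosen during the restore):
# Get the endpoint of the newly restored instance
aws rds describe-db-instances \
  --db-instance-identifier bonsai-restored-<date> \
  --query 'DBInstances[0].Endpoint.Address' \
  --output text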
Emergency Data Fixes #
Before Making Changes #
- Get approval - from a team lead or senior engineer
- Take a backup - manual snapshot or table export
- Test in dev first - if possible, reproduce the fix there
- Document everything - commands, timestamps, results
- Have a rollback plan - know how to undo the change
Example: Fixing Stuck Document #
-- 1. Document current state
SELECT id, status, updated_at FROM documents WHERE id = 'doc-123';
-- 2. Take a backup (in a separate query/transaction)
CREATE TABLE documents_backup_20251021 AS
SELECT * FROM documents WHERE id = 'doc-123';
-- 3. Make the change in a transaction
BEGIN;
UPDATE documents
SET status = 'failed',
error_message = 'Manually failed after investigation',
updated_at = NOW()
WHERE id = 'doc-123';
-- 4. Verify
SELECT id, status, updated_at FROM documents WHERE id = 'doc-123';
-- 5. Commit or rollback
COMMIT; -- or ROLLBACK if something looks wrong
Example: Resetting User Password #
-- This is just an example; actual password resets should go through the API
BEGIN;
-- Document the change
INSERT INTO audit_log (action, user_id, timestamp, details)
VALUES ('password_reset', 'user-uuid', NOW(), 'Emergency reset via DBA');
-- Update password (hashed appropriately)
UPDATE users
SET password_hash = <new_hashed_password>,
password_updated_at = NOW()
WHERE id = 'user-uuid';
COMMIT;
Troubleshooting Database Issues #
Cannot Connect to Database #
Symptoms: Connection timeout or refused
Checks:
- Verify Tailscale is connected (for production)
- Check security groups allow your IP
- Verify credentials in Doppler
- Check RDS instance status:
aws rds describe-db-instances \
  --db-instance-identifier bonsai-prod-db \
  --query 'DBInstances[0].DBInstanceStatus'
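Two quick network checks before digging deeper, assuming the Tailscale CLI and the PostgreSQL client tools are installed locally:
# Confirm the VPN is up and shows the expected peers
tailscale status
# Check that PostgreSQL is accepting connections on the endpoint
pg_isready -h <db-host> -p 5432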
High Connection Count #
Symptoms: “Too many connections” error
Investigation:
-- Check connection count by state
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state;
-- Find idle connections
SELECT pid, usename, application_name, state, query
FROM pg_stat_activity
WHERE state = 'idle'
AND query_start < NOW() - INTERVAL '10 minutes';
Solutions:
- Identify and terminate idle connections:
  SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
  WHERE state = 'idle'
    AND query_start < NOW() - INTERVAL '30 minutes';
- Scale up connection pool limits in BonsAPI
- Check for connection leaks in application code
Slow Query Performance #
Symptoms: High query execution time, timeouts
Investigation:
-- Enable query logging (requires superuser)
ALTER DATABASE bonsai SET log_min_duration_statement = 1000; -- 1 second
-- Check for missing indexes
SELECT
schemaname || '.' || tablename as table,
seq_scan,
seq_tup_read,
idx_scan,
seq_tup_read / seq_scan as avg_seq_tup
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 10;
Solutions:
- Add indexes for frequently queried columns (see the sketch after this list)
- Analyze query execution plans: `EXPLAIN ANALYZE <query>`
- Update table statistics: `ANALYZE table_name;`
- Consider query optimization or caching
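A sketch of the index workflow for one of the document queries used earlier in this runbook; the index name and column choice are illustrative, so confirm against the actual plan first:
-- Inspect the plan for a suspect query
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, name, status
FROM documents
WHERE user_id = 'user-uuid' AND status = 'processing';
-- If the plan shows a sequential scan on a large table, add a matching index
-- (CREATE INDEX CONCURRENTLY cannot run inside a transaction block)
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_documents_user_status
    ON documents (user_id, status);
-- Refresh planner statistics afterwards
ANALYZE documents;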
Disk Space Issues #
Symptoms: Database writes failing, “disk full” errors
Investigation:
# Check RDS storage
aws rds describe-db-instances \
--db-instance-identifier bonsai-prod-db \
--query 'DBInstances[0].[AllocatedStorage,StorageType]'
-- Check database size
SELECT
pg_database.datname,
pg_size_pretty(pg_database_size(pg_database.datname)) as size
FROM pg_database
ORDER BY pg_database_size(pg_database.datname) DESC;
Solutions:
- Increase RDS storage (auto-scaling should handle this)
- Archive old data
- VACUUM old tables to reclaim space with `VACUUM FULL table_name;` (note that VACUUM FULL takes an exclusive lock on the table; see the dead-row check below)
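To pick the best VACUUM candidates, pg_stat_user_tables tracks dead row counts per table:
-- Tables with the most dead rows (reclaimable space)
SELECT
    relname,
    n_live_tup,
    n_dead_tup,
    last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;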
Database Migrations #
For schema changes and migrations, see the Database Migration runbook.
Checking Migration Status #
# Via Atlas CLI
mise run db:status
# Or check migrations table
psql -c "SELECT version, applied_at FROM atlas_schema_revisions ORDER BY applied_at DESC LIMIT 10;"
Best Practices #
Query Safety #
- Always use `LIMIT` for SELECT queries on large tables
- Test queries in development first
- Use transactions for any data modifications
- Set statement timeouts to prevent runaway queries
- Never run `DELETE` or `UPDATE` without a `WHERE` clause
Connection Management #
- Close connections when done
- Use connection pooling (built into BonsAPI)
- Don’t leave long-running transactions open
- Monitor connection count regularly
Data Privacy #
- Never export PII without authorization
- Mask sensitive data in logs and outputs
- Follow GDPR and data protection policies
- Audit all production database access
Performance #
- Use appropriate indexes
- Avoid SELECT * in production code
- Regular VACUUM and ANALYZE
- Monitor and optimize slow queries
See Also #
- Database Schema - Complete schema documentation
- Database Migration - Schema change procedures
- CloudBeaver Guide - Web-based database tool
- Kubernetes Debugging - Access pod databases
- Secrets Management - Getting database credentials
- Incident Response - Database-related incidents