📊 Monitoring Guide - Gunei ERP (Enterprise Architecture)
Complete multi-environment monitoring and observability for the Gunei ERP system.
Version 2.1 - Multi-Environment Enterprise (Staging + Production)
📋 Table of Contents
- Overview
- Monitoring Architecture
- Health Checks by Environment
- Monitoring Scripts
- Centralized Logs
- Cron Jobs
- Discord Notifications
- Shared PostgreSQL Monitoring
- Comparing Metrics Across Environments
- Metrics and Alerts
- Consolidated Dashboard
- Troubleshooting
🎯 Overview
Goals
- Detect problems in each environment before they affect users
- Centralize logs from multiple services and environments
- Automate health checks every 5 minutes (staging + production)
- Report failures via Discord, identifying the affected environment
- Keep an event history per environment
- Monitor shared infrastructure (PostgreSQL, Caddy)
Components by Environment
┌─ STAGING ─────────────────────────┐
│ Frontend Staging → Health Check │
│ Backend Staging → Health Check │
│ ↓ │
│ Logs + Metrics │
└────────────────────────────────────┘
↓
┌─ PRODUCTION ───────────────────────┐
│ Frontend Production → Health Check │
│ Backend Production → Health Check │
│ ↓ │
│ Logs + Metrics │
└────────────────────────────────────┘
↓
┌─ INFRASTRUCTURE ───────────────────┐
│ PostgreSQL Shared → Health Check │
│ Caddy Shared → Health Check │
│ ↓ │
│ Logs + Metrics │
└────────────────────────────────────┘
↓
Centralized Logs
↓
Discord Webhooks
Monitoring Philosophy
- Per environment: each environment (staging/production) is monitored independently
- Shared infrastructure: PostgreSQL and Caddy are monitored as critical services that affect both environments
- Contextual alerts: notifications clearly identify the affected environment
- Segregated logs: logs are separated by environment but centralized for analysis
- Isolation: if staging fails, production can keep running (and vice versa)
🏗️ Monitoring Architecture
Monitored Services
| Component | Type | Affects | Health Endpoint |
|---|---|---|---|
| Caddy Shared | Infrastructure | Both environments | N/A (process check) |
| PostgreSQL Shared | Infrastructure | Both environments | pg_isready (host port 5433) |
| Frontend Staging | Application | Staging | https://staging-erpfront.gunei.xyz/health |
| Backend Staging | Application | Staging | https://staging-erpback.gunei.xyz/status |
| Frontend Production | Application | Production | (URL pending) |
| Backend Production | Application | Production | (URL pending) |
Script Locations
/root/scripts/
├── monitor-logs.sh # View multi-environment logs (updated)
├── health-check.sh # Full health verification (updated)
├── alert-check.sh # Health check + alerts (updated)
├── check-staging.sh # Staging-only check (new)
├── check-production.sh # Production-only check (new)
├── check-infrastructure.sh # Shared services check (new)
└── metrics-dashboard.sh # Consolidated dashboard (new)
🥦 Health Checks by Environment
Endpoints by Environment
Staging Environment
Frontend Staging:
GET https://staging-erpfront.gunei.xyz/health
Response:
{
"status": "healthy",
"timestamp": "2026-01-12T12:34:56.789Z",
"service": "gunei-erp-frontend",
"version": "0.0.1",
"runtime": "bun",
"environment": "staging"
}
Backend Staging:
GET https://staging-erpback.gunei.xyz/status
Response:
{
"status": "ok",
"timestamp": "2026-01-12T12:34:56.789Z",
"environment": "staging",
"database": "connected",
"uptime": 123456
}
Production Environment (once active)
Frontend Production:
GET https://erpfront.gunei.xyz/health # URL not yet configured
Response:
{
"status": "healthy",
"timestamp": "2026-01-12T12:34:56.789Z",
"service": "gunei-erp-frontend",
"environment": "production"
}
Backend Production:
GET https://erpback.gunei.xyz/status # URL not yet configured
Response:
{
"status": "ok",
"timestamp": "2026-01-12T12:34:56.789Z",
"environment": "production",
"database": "connected"
}
Infrastructure (Shared)
PostgreSQL Shared:
# Check that the server accepts connections for the staging database
# (pg_isready does not authenticate; use psql with "SELECT 1" to verify credentials)
docker exec postgres-shared pg_isready -U gunei_staging_user -d gunei_erp_staging
# Same check for the production database
docker exec postgres-shared pg_isready -U gunei_prod_user -d gunei_erp_production
# Check the server itself
docker exec postgres-shared pg_isready -U postgres
Caddy Shared:
# Check that the container is running
docker ps | grep caddy-shared
# Check recent logs for errors
docker logs caddy-shared --tail 50 | grep -i error
Port Mapping by Environment
| Environment | Service | Internal Port | Host Port |
|---|---|---|---|
| Shared | PostgreSQL | 5432 | 5433 |
| Staging | Backend | 3000 | 3000 |
| Staging | Frontend | 3001 | 3001 |
| Production | Backend | 3000 | 3100 |
| Production | Frontend | 3001 | 3101 |
Port verification:
# List all ports in use
ss -tlnp | grep -E "3000|3001|3100|3101|5433"
# Check for port conflicts
netstat -tlnp | grep -E ":300[01]|:310[01]|:5433"
Database Connections by Environment
| Environment | Database | User | Host |
|---|---|---|---|
| Staging | gunei_erp_staging | gunei_staging_user | postgres-shared:5432 |
| Production | gunei_erp_production | gunei_prod_user | postgres-shared:5432 |
Verify the connection:
# Staging - must connect to gunei_erp_staging
docker exec postgres-shared psql -U gunei_staging_user -d gunei_erp_staging -c "\conninfo"
# Production - must connect to gunei_erp_production
docker exec postgres-shared psql -U gunei_prod_user -d gunei_erp_production -c "\conninfo"
# Show active connections per database
docker exec postgres-shared psql -U postgres -c "SELECT datname, count(*) FROM pg_stat_activity GROUP BY datname;"
Health Criteria
Per service:
- Status 200: service operational
- Status 5xx: critical failure
- Timeout (5s): service not responding
- Database: connection verified
Per environment:
- Staging healthy: both services (frontend + backend) respond OK
- Production healthy: both services respond OK
- Infrastructure healthy: PostgreSQL and Caddy operational
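The criteria above can be sketched as a small classifier. This is a minimal sketch rather than one of the project's scripts: `classify_status`, `probe`, and `environment_state` are hypothetical names, and mapping any code other than 200 or 5xx to `unknown` is an assumption, since the criteria only define 200, 5xx, and timeout.

```shell
#!/bin/bash
# Map an HTTP status code (as reported by curl's %{http_code}) to a health state.
# curl reports 000 when no response arrived, for example after a timeout.
classify_status() {
  case "$1" in
    200) echo "healthy" ;;
    5[0-9][0-9]) echo "critical" ;;
    000) echo "unresponsive" ;;
    *) echo "unknown" ;;   # assumption: the guide does not define other codes
  esac
}

# Probe a URL with the 5-second timeout and classify the result.
probe() {
  local code
  code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1")
  classify_status "$code"
}

# An environment is healthy only when every service state passed in is "healthy".
environment_state() {
  local state
  for state in "$@"; do
    [ "$state" = "healthy" ] || { echo "unhealthy"; return 1; }
  done
  echo "healthy"
}
```

A possible use would be `environment_state "$(probe https://staging-erpback.gunei.xyz/status)" "$(probe https://staging-erpfront.gunei.xyz/health)"`, which prints `healthy` only when both staging endpoints return 200.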
🔧 Monitoring Scripts
1. monitor-logs.sh (Updated)
Purpose: view logs from all services and environments at once
#!/bin/bash
# View multi-environment logs on a single screen
# Usage
/root/scripts/monitor-logs.sh
# View logs for a specific environment
/root/scripts/monitor-logs.sh staging
/root/scripts/monitor-logs.sh production
# View infrastructure logs
/root/scripts/monitor-logs.sh infrastructure
Updated implementation example:
#!/bin/bash
# /root/scripts/monitor-logs.sh
ENVIRONMENT=${1:-all}
echo "==================================="
echo "📋 Gunei ERP - Monitor de Logs"
echo "==================================="
echo ""
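show_logs() {
  local service=$1
  local title=$2
  echo "📦 === $title ==="
  # Match the exact container name; an unquoted plain grep also matches substrings
  if docker ps --format '{{.Names}}' | grep -qx "$service"; then
    docker logs --tail 10 "$service" 2>&1
  else
    echo "⚠️ Container $service no está corriendo"
  fi
  echo ""
}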
if [ "$ENVIRONMENT" = "all" ] || [ "$ENVIRONMENT" = "infrastructure" ]; then
echo "🏗️ === INFRASTRUCTURE ==="
show_logs "postgres-shared" "PostgreSQL Shared"
show_logs "caddy-shared" "Caddy Shared"
echo ""
fi
if [ "$ENVIRONMENT" = "all" ] || [ "$ENVIRONMENT" = "staging" ]; then
echo "🟢 === STAGING ==="
show_logs "gunei-backend-staging" "Backend Staging"
show_logs "gunei-frontend-staging" "Frontend Staging"
echo ""
fi
if [ "$ENVIRONMENT" = "all" ] || [ "$ENVIRONMENT" = "production" ]; then
if docker ps | grep -q "gunei-backend-production"; then
echo "🔵 === PRODUCTION ==="
show_logs "gunei-backend-production" "Backend Production"
show_logs "gunei-frontend-production" "Frontend Production"
echo ""
fi
fi
echo "📊 === Estado de Containers ==="
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
echo ""
echo "💾 === Uso de Recursos ==="
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
Monitored services:
- Infrastructure: PostgreSQL Shared, Caddy Shared
- Staging: Backend Staging, Frontend Staging
- Production: Backend Production, Frontend Production (once deployed)
Features:
- Color-coded output per environment
- Filtering by a specific environment
- Synchronized timestamps
- Automatic detection of active environments
2. health-check.sh (Updated)
Purpose: check the health of the whole system, per environment
#!/bin/bash
# Full multi-environment health check
# Usage
/root/scripts/health-check.sh
# Check a single environment
/root/scripts/health-check.sh staging
/root/scripts/health-check.sh production
Updated implementation example:
#!/bin/bash
# /root/scripts/health-check.sh
ENVIRONMENT=${1:-all}
echo "🥦 Health Check - Gunei ERP (Multi-Environment)"
echo "================================================"
echo ""
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
check_service() {
  local service=$1
  local url=$2
  # --max-time 5 enforces the 5-second timeout from the health criteria
  if curl -f -s -o /dev/null --max-time 5 "$url"; then
    echo -e "${GREEN}✅ $service: OK${NC}"
    return 0
  else
    echo -e "${RED}❌ $service: FAIL${NC}"
    return 1
  fi
}
# Infrastructure
echo "🏗️ Infraestructura:"
if docker exec postgres-shared pg_isready -U postgres > /dev/null 2>&1; then
echo -e "${GREEN}✅ PostgreSQL Shared: OK${NC}"
else
echo -e "${RED}❌ PostgreSQL Shared: FAIL${NC}"
fi
if docker ps | grep -q caddy-shared; then
echo -e "${GREEN}✅ Caddy Shared: OK${NC}"
else
echo -e "${RED}❌ Caddy Shared: FAIL${NC}"
fi
# Staging
if [ "$ENVIRONMENT" = "all" ] || [ "$ENVIRONMENT" = "staging" ]; then
echo ""
echo "🟢 Staging Environment:"
check_service "Backend Staging /status" "http://localhost:3000/status"
check_service "Frontend Staging /health" "http://localhost:3001/health"
check_service "HTTPS staging-erpback.gunei.xyz" "https://staging-erpback.gunei.xyz/status"
check_service "HTTPS staging-erpfront.gunei.xyz" "https://staging-erpfront.gunei.xyz/health"
fi
# Production (if deployed)
if [ "$ENVIRONMENT" = "all" ] || [ "$ENVIRONMENT" = "production" ]; then
if docker ps | grep -q "gunei-backend-production"; then
echo ""
echo "🔵 Production Environment:"
check_service "Backend Production /status" "http://localhost:3100/status"
check_service "Frontend Production /health" "http://localhost:3101/health"
# Public URLs, once configured:
# check_service "HTTPS erpback.gunei.xyz" "https://erpback.gunei.xyz/status"
# check_service "HTTPS erpfront.gunei.xyz" "https://erpfront.gunei.xyz/health"
fi
fi
# Databases
echo ""
echo "🗄️ Databases:"
if docker exec postgres-shared psql -U gunei_staging_user -d gunei_erp_staging -c "SELECT 1;" > /dev/null 2>&1; then
echo -e "${GREEN}✅ DB Staging: OK${NC}"
else
echo -e "${RED}❌ DB Staging: FAIL${NC}"
fi
if docker exec postgres-shared psql -U gunei_prod_user -d gunei_erp_production -c "SELECT 1;" > /dev/null 2>&1; then
echo -e "${GREEN}✅ DB Production: OK${NC}"
else
echo -e "${YELLOW}⚠️ DB Production: No disponible o no configurada${NC}"
fi
# List running containers
echo ""
echo "📦 Docker Containers:"
docker ps --format "table {{.Names}}\t{{.Status}}"
# Resource usage
echo ""
echo "💾 System Resources:"
df -h / | tail -1 | awk '{print "Disk: "$3" / "$2" ("$5" used)"}'
free -h | grep Mem | awk '{print "RAM: "$3" / "$2" (used/total)"}'
Example output:
🥦 Health Check - Gunei ERP (Multi-Environment)
================================================
🏗️ Infraestructura:
✅ PostgreSQL Shared: OK
✅ Caddy Shared: OK
🟢 Staging Environment:
✅ Backend Staging /status: OK
✅ Frontend Staging /health: OK
✅ HTTPS staging-erpback.gunei.xyz: OK
✅ HTTPS staging-erpfront.gunei.xyz: OK
🔵 Production Environment:
✅ Backend Production /status: OK
✅ Frontend Production /health: OK
🗄️ Databases:
✅ DB Staging: OK
✅ DB Production: OK
📦 Docker Containers:
NAMES STATUS
caddy-shared Up 5 days
postgres-shared Up 5 days
gunei-backend-staging Up 2 days
gunei-frontend-staging Up 2 days
gunei-backend-production Up 1 day
gunei-frontend-production Up 1 day
💾 System Resources:
Disk: 45G / 200G (23% used)
RAM: 4.2G / 8.0G (used/total)
3. alert-check.sh (Updated)
Purpose: automated health check with per-environment notifications
#!/bin/bash
# Run by cron every 5 minutes
# What it does:
# - Runs health checks per environment
# - Detects environment-specific failures
# - Sends Discord alerts that identify the environment
# - Writes to the centralized log with environment tags
Updated alert logic:
- First failure (staging): immediate alert "🟢 Staging Down"
- First failure (production): immediate alert "🔵 Production Down"
- Infrastructure failure: critical alert "🚨 Infrastructure Down (affects all environments)"
- Consecutive failures: repeat alert every 15 minutes
- Recovery: "system restored" notification that includes the downtime
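The "every 15 minutes" rule and the downtime shown on recovery both reduce to arithmetic on epoch timestamps. The sketch below illustrates one way to express them; `should_realert`, `format_downtime`, and `REALERT_INTERVAL` are hypothetical names, not part of the scripts in this guide.

```shell
#!/bin/bash
REALERT_INTERVAL=900   # 15 minutes, in seconds

# Succeeds (exit 0) when a repeat alert is due: at least REALERT_INTERVAL
# seconds have passed since the last alert was sent.
# Arguments: epoch of the last alert, current epoch.
should_realert() {
  local last_alert=$1 now=$2
  [ $((now - last_alert)) -ge "$REALERT_INTERVAL" ]
}

# Render a downtime given in seconds as "Xm Ys" for the recovery notification.
format_downtime() {
  local seconds=$1
  echo "$((seconds / 60))m $((seconds % 60))s"
}
```

With a cron run every 5 minutes, storing the epoch of the last alert next to the failure start time would let a script call `should_realert "$last" "$(date +%s)"` and stay silent on the two runs in between.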
Implementation example:
#!/bin/bash
# /root/scripts/alert-check.sh
LOG_FILE="/var/log/gunei-health.log"
STATE_DIR="/var/tmp/gunei-health-state"
DISCORD_WEBHOOK="${DISCORD_WEBHOOK_URL}"
mkdir -p "$STATE_DIR"
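log() {
  # Append to the log file only: cron already redirects stdout to this same file,
  # so tee would write every line twice
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" >> "$LOG_FILE"
}
# Run the right probe for a target: the two infrastructure sentinels are not URLs,
# so passing them to curl would always fail
probe() {
  case "$1" in
    direct-pg-check) docker exec postgres-shared pg_isready -U postgres > /dev/null 2>&1 ;;
    docker-ps-check) docker ps --format '{{.Names}}' | grep -qx caddy-shared ;;
    *) curl -f -s -o /dev/null --max-time 5 "$1" ;;
  esac
}
check_and_alert() {
  ENV=$1
  SERVICE=$2
  URL=$3
  STATE_FILE="$STATE_DIR/${ENV}_${SERVICE}_state"
  if probe "$URL"; then
    # Service is UP
    if [ -f "$STATE_FILE" ]; then
      # Was down, now recovered: the state file holds the epoch of the first failure
      DOWNTIME=$(( $(date +%s) - $(cat "$STATE_FILE") ))
      log "[INFO] [${ENV^^}] [HEALTH_CHECK] $SERVICE recovered (downtime: ${DOWNTIME}s)"
      send_discord_recovery "$ENV" "$SERVICE" "$DOWNTIME"
      rm "$STATE_FILE"
    fi
  else
    # Service is DOWN
    if [ ! -f "$STATE_FILE" ]; then
      # First failure: record when it started
      date +%s > "$STATE_FILE"
      log "[ERROR] [${ENV^^}] [HEALTH_CHECK] $SERVICE is DOWN"
      send_discord_alert "$ENV" "$SERVICE" "$URL"
    else
      # Still down
      START_TIME=$(cat "$STATE_FILE")
      CURRENT_TIME=$(date +%s)
      DOWNTIME=$((CURRENT_TIME - START_TIME))
      log "[ERROR] [${ENV^^}] [HEALTH_CHECK] $SERVICE still DOWN (${DOWNTIME}s)"
    fi
  fi
}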
send_discord_alert() {
ENV=$1
SERVICE=$2
URL=$3
if [ "$ENV" = "staging" ]; then
COLOR="3066993" # Verde oscuro
EMOJI="🟢"
elif [ "$ENV" = "production" ]; then
COLOR="15548997" # Rojo
EMOJI="🔵"
else
COLOR="15105570" # Naranja
EMOJI="🚨"
fi
curl -X POST "$DISCORD_WEBHOOK" \
-H "Content-Type: application/json" \
-d "{
\"embeds\": [{
\"title\": \"${EMOJI} ${ENV^^} - ${SERVICE} Down\",
\"color\": ${COLOR},
\"description\": \"Service is not responding\",
\"fields\": [
{\"name\": \"Environment\", \"value\": \"$ENV\", \"inline\": true},
{\"name\": \"Service\", \"value\": \"$SERVICE\", \"inline\": true},
{\"name\": \"URL\", \"value\": \"$URL\"},
{\"name\": \"Time\", \"value\": \"$(date '+%Y-%m-%d %H:%M:%S')\"}
]
}]
}"
}
send_discord_recovery() {
ENV=$1
SERVICE=$2
DOWNTIME=$3
curl -X POST "$DISCORD_WEBHOOK" \
-H "Content-Type: application/json" \
-d "{
\"embeds\": [{
\"title\": \"✅ ${ENV^^} - ${SERVICE} Recovered\",
\"color\": 5763719,
\"description\": \"Service is responding normally\",
\"fields\": [
{\"name\": \"Environment\", \"value\": \"$ENV\", \"inline\": true},
{\"name\": \"Downtime\", \"value\": \"${DOWNTIME} seconds\", \"inline\": true}
]
}]
}"
}
# Main - Check all services
log "[INFO] === Health Check Started ==="
# Infrastructure
check_and_alert "infrastructure" "PostgreSQL" "direct-pg-check"
check_and_alert "infrastructure" "Caddy" "docker-ps-check"
# Staging
check_and_alert "staging" "Backend" "https://staging-erpback.gunei.xyz/status"
check_and_alert "staging" "Frontend" "https://staging-erpfront.gunei.xyz/health"
# Production (if active)
if docker ps | grep -q "gunei-backend-production"; then
check_and_alert "production" "Backend" "http://localhost:3100/status"
check_and_alert "production" "Frontend" "http://localhost:3101/health"
fi
log "[INFO] === Health Check Completed ==="
4. check-staging.sh (New)
Purpose: quick health check of the staging environment only
#!/bin/bash
# Quick staging check
/root/scripts/health-check.sh staging
5. check-production.sh (New)
Purpose: quick health check of the production environment only
#!/bin/bash
# Quick production check
/root/scripts/health-check.sh production
6. check-infrastructure.sh (New)
Purpose: health check of critical shared services
#!/bin/bash
# Shared infrastructure check
echo "🏗️ Infrastructure Health Check"
echo "=============================="
echo ""
# PostgreSQL
echo "PostgreSQL Shared:"
if docker exec postgres-shared pg_isready -U postgres > /dev/null 2>&1; then
echo " ✅ Server: OK"
else
echo " ❌ Server: FAIL"
fi
if docker exec postgres-shared psql -U gunei_staging_user -d gunei_erp_staging -c "SELECT 1;" > /dev/null 2>&1; then
echo " ✅ Staging DB: OK"
else
echo " ❌ Staging DB: FAIL"
fi
if docker exec postgres-shared psql -U gunei_prod_user -d gunei_erp_production -c "SELECT 1;" > /dev/null 2>&1; then
echo " ✅ Production DB: OK"
else
echo " ⚠️ Production DB: Not available"
fi
# Caddy
echo ""
echo "Caddy Shared:"
if docker ps | grep -q caddy-shared; then
echo " ✅ Container: Running"
# Check SSL certificates
CERT_COUNT=$(docker exec caddy-shared ls /data/caddy/certificates/ 2>/dev/null | wc -l)
echo " ✅ SSL Certificates: $CERT_COUNT"
# Check logs for errors
ERROR_COUNT=$(docker logs caddy-shared --tail 100 2>&1 | grep -i error | wc -l)
if [ "$ERROR_COUNT" -eq 0 ]; then
echo " ✅ No recent errors"
else
echo " ⚠️ Recent errors: $ERROR_COUNT"
fi
else
echo " ❌ Container: Not running"
fi
# Disk space
echo ""
echo "System Resources:"
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | tr -d '%')
echo " Disk: $DISK_USAGE% used"
if [ "$DISK_USAGE" -gt 80 ]; then
echo " ⚠️ WARNING: High disk usage"
fi
MEM_USAGE=$(free | awk 'NR==2 {printf "%.0f", $3/$2*100}')
echo " Memory: $MEM_USAGE% used"
if [ "$MEM_USAGE" -gt 85 ]; then
echo " ⚠️ WARNING: High memory usage"
fi
7. metrics-dashboard.sh (New)
Purpose: consolidated dashboard of multi-environment metrics
See the Consolidated Dashboard section for the full implementation.
📝 Centralized Logs
Main File
/var/log/gunei-health.log
Updated Log Format
[TIMESTAMP] [LEVEL] [ENVIRONMENT] [COMPONENT] Message
Example:
[2026-01-12 12:34:56] [INFO] [STAGING] [HEALTH_CHECK] Backend health: OK
[2026-01-12 12:39:56] [ERROR] [STAGING] [HEALTH_CHECK] Backend not responding (timeout)
[2026-01-12 12:40:01] [ERROR] [STAGING] [ALERT] Discord notification sent: Backend down
[2026-01-12 12:44:56] [INFO] [STAGING] [HEALTH_CHECK] Backend health: OK
[2026-01-12 12:45:01] [INFO] [STAGING] [ALERT] Discord notification sent: Backend recovered
[2026-01-12 13:15:22] [INFO] [PRODUCTION] [HEALTH_CHECK] Backend health: OK
[2026-01-12 13:15:22] [INFO] [PRODUCTION] [HEALTH_CHECK] Frontend health: OK
[2026-01-12 13:20:30] [ERROR] [INFRASTRUCTURE] [HEALTH_CHECK] PostgreSQL connection slow
[2026-01-12 13:20:30] [WARNING] [INFRASTRUCTURE] [ALERT] PostgreSQL performance degraded
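Because every field sits in its own pair of brackets, the format above is easy to produce and to aggregate. The following sketch shows both directions; `format_log_line` and `count_errors_by_env` are hypothetical helper names, and the timestamp is passed in as an argument so the output is reproducible.

```shell
#!/bin/bash
# Build one line in the "[TIMESTAMP] [LEVEL] [ENVIRONMENT] [COMPONENT] Message" format.
# The environment is upper-cased so that grep "\[STAGING\]" style filters match.
format_log_line() {
  local timestamp=$1 level=$2 environment=$3 component=$4 message=$5
  local env_upper
  env_upper=$(printf '%s' "$environment" | tr '[:lower:]' '[:upper:]')
  printf '[%s] [%s] [%s] [%s] %s\n' "$timestamp" "$level" "$env_upper" "$component" "$message"
}

# Count ERROR lines per environment from log text on stdin.
# Splitting on "[" and "]" puts the level in field 4 and the environment in field 6.
count_errors_by_env() {
  awk -F'[][]' '$4 == "ERROR" { count[$6]++ } END { for (env in count) print env, count[env] }' | sort
}
```

Running `count_errors_by_env < /var/log/gunei-health.log` would print one line per environment that has logged errors, which can help decide where to look first.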
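Log Rotation
# Configuration: /etc/logrotate.d/gunei-health
/var/log/gunei-health.log {
daily
rotate 30
compress
delaycompress
missingok
notifempty
# maxsize rotates early if the file exceeds 100M when logrotate runs;
# a plain "size" would replace the daily schedule with size-only rotation
maxsize 100M
}
View Logs by Environment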
# Staging logs
grep "\[STAGING\]" /var/log/gunei-health.log | tail -n 50
# Production logs
grep "\[PRODUCTION\]" /var/log/gunei-health.log | tail -n 50
# Infrastructure logs
grep "\[INFRASTRUCTURE\]" /var/log/gunei-health.log | tail -n 50
# Staging errors
grep "\[STAGING\]" /var/log/gunei-health.log | grep ERROR
# Production alerts
grep "\[PRODUCTION\]" /var/log/gunei-health.log | grep ALERT
# Live tail per environment
tail -f /var/log/gunei-health.log | grep "\[STAGING\]"
tail -f /var/log/gunei-health.log | grep "\[PRODUCTION\]"
Docker Logging Configuration
All containers use JSON logging with automatic rotation:
logging:
  driver: "json-file"
  options:
    max-size: "10m" # Maximum 10MB per file
    max-file: "3" # Maximum 3 files
    tag: "{{.Name}}/{{.ID}}"
Verification:
# Show a container's logging configuration
docker inspect gunei-backend-staging | grep -A 10 LogConfig
# Logs rotate automatically (they do not fill the disk)
docker logs gunei-backend-staging --tail 100
Benefits:
- Logs cannot fill the disk (at most ~30MB per container)
- Automatic rotation with no manual intervention
- Tags identify the container in centralized logs
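The ~30MB figure follows from max-size multiplied by max-file. As a quick sanity check on total disk budget, the arithmetic can be sketched as below; `log_budget_mb` is a hypothetical helper, and the count of six containers is taken from the container list shown in this guide's example output.

```shell
#!/bin/bash
# Worst-case disk used by container logs, in MB:
# max-size (MB) x max-file x number of containers.
log_budget_mb() {
  local max_size_mb=$1 max_file=$2 containers=$3
  echo $((max_size_mb * max_file * containers))
}

log_budget_mb 10 3 1   # one container: prints 30
log_budget_mb 10 3 6   # six containers: prints 180
```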
Timezone
All containers use the Argentina timezone:
environment:
  - TZ=America/Argentina/Buenos_Aires
Verification:
# Show the current time in each container
docker exec gunei-backend-staging date
docker exec gunei-frontend-staging date
docker exec postgres-shared date
# Expected output: Argentina time (UTC-3)
Implications:
- Log timestamps are in Argentina time
- Cron jobs run in the server's local time
- Backup names use Argentina time
Container Logs by Environment
Staging:
# Backend staging
docker logs gunei-backend-staging -f
docker logs gunei-backend-staging --tail 100
docker logs gunei-backend-staging --since 1h
# Frontend staging
docker logs gunei-frontend-staging -f
docker logs gunei-frontend-staging --tail 100
Production:
# Backend production
docker logs gunei-backend-production -f
docker logs gunei-backend-production --tail 100
# Frontend production
docker logs gunei-frontend-production -f
docker logs gunei-frontend-production --tail 100
Infrastructure:
# PostgreSQL (affects both environments)
docker logs postgres-shared -f
docker logs postgres-shared --tail 100
# Caddy (routing for both environments)
docker logs caddy-shared -f
docker logs caddy-shared --tail 100
# Caddy access logs per environment
docker exec caddy-shared cat /var/log/caddy/staging-erpfront.log | tail -n 50
docker exec caddy-shared cat /var/log/caddy/staging-erpback.log | tail -n 50
docker exec caddy-shared cat /var/log/caddy/production-erpfront.log | tail -n 50
docker exec caddy-shared cat /var/log/caddy/production-erpback.log | tail -n 50
⏰ Cron Jobs
Current Configuration
# View crontab
crontab -l
# Edit crontab
crontab -e
Full Schedule (Updated)
# Health checks every 5 minutes (multi-environment)
*/5 * * * * /root/scripts/alert-check.sh >> /var/log/gunei-health.log 2>&1
# Daily backups at 2 AM (both environments)
0 2 * * * /root/scripts/backup-postgres.sh >> /var/log/gunei-backups.log 2>&1
# Backup verification at 3 AM
0 3 * * * /root/scripts/verify-backup.sh >> /var/log/gunei-backups.log 2>&1
# Weekly cleanup (Sundays at 4 AM)
0 4 * * 0 /root/scripts/cleanup-backups.sh >> /var/log/gunei-backups.log 2>&1
# Daily disk check at 5 AM
0 5 * * * /root/scripts/check-disk-space.sh >> /var/log/gunei-health.log 2>&1
# Metrics dashboard every hour (optional)
0 * * * * /root/scripts/metrics-dashboard.sh >> /var/log/gunei-metrics.log 2>&1
Verify Execution
# Show the last health check run (any environment)
grep "HEALTH_CHECK" /var/log/gunei-health.log | tail -n 1
# Show the last run per environment
grep "\[STAGING\]" /var/log/gunei-health.log | grep "HEALTH_CHECK" | tail -n 1
grep "\[PRODUCTION\]" /var/log/gunei-health.log | grep "HEALTH_CHECK" | tail -n 1
# Check that cron is active
systemctl status cron
# View cron logs
grep CRON /var/log/syslog | tail -n 20
# View recent alert-check runs
grep "alert-check.sh" /var/log/syslog | tail -n 10
📢 Discord Notifications
Configured Webhook
# Environment variable (in GitHub Secrets and on the VPS)
DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...
Updated Notification Types
1. Deployment Success (Staging)
{
"embeds": [{
"title": "✅ Deployment Successful - Staging",
"color": 5763719,
"fields": [
{"name": "Environment", "value": "staging", "inline": true},
{"name": "Branch", "value": "develop", "inline": true},
{"name": "Commit", "value": "abc1234"},
{"name": "Duration", "value": "2m 34s"}
]
}]
}
2. Deployment Success (Production)
{
"embeds": [{
"title": "✅ Deployment Successful - Production",
"color": 3447003,
"fields": [
{"name": "Environment", "value": "production", "inline": true},
{"name": "Branch", "value": "main", "inline": true},
{"name": "Commit", "value": "abc1234"},
{"name": "Duration", "value": "2m 45s"}
]
}]
}
3. Staging Backend Down
{
"embeds": [{
"title": "🟢 Staging - Backend Down",
"color": 3066993,
"description": "Backend staging is not responding",
"fields": [
{"name": "Environment", "value": "staging", "inline": true},
{"name": "Service", "value": "Backend", "inline": true},
{"name": "Endpoint", "value": "https://staging-erpback.gunei.xyz/status"},
{"name": "Response", "value": "Timeout (5s)"},
{"name": "Time", "value": "2026-01-12 14:30:00"}
]
}]
}
4. Production Backend Down (Critical)
{
"embeds": [{
"title": "🔵 Production - Backend Down",
"color": 15548997,
"description": "🚨 CRITICAL: Production backend is not responding",
"fields": [
{"name": "Environment", "value": "production", "inline": true},
{"name": "Service", "value": "Backend", "inline": true},
{"name": "Endpoint", "value": "https://erpback.gunei.xyz/status"},
{"name": "Response", "value": "Timeout (5s)"},
{"name": "Time", "value": "2026-01-12 14:30:00"},
{"name": "Action", "value": "@here Immediate attention required"}
]
}]
}
5. Infrastructure Down (Highly Critical)
{
"embeds": [{
"title": "🚨 Infrastructure Down - Affects All Environments",
"color": 15105570,
"description": "CRITICAL: Shared infrastructure is failing",
"fields": [
{"name": "Component", "value": "PostgreSQL Shared", "inline": true},
{"name": "Impact", "value": "All environments", "inline": true},
{"name": "Time", "value": "2026-01-12 14:30:00"},
{"name": "Action", "value": "@everyone CRITICAL - ALL ENVIRONMENTS AFFECTED"}
]
}]
}
6. Service Recovered
{
"embeds": [{
"title": "✅ Staging - Backend Recovered",
"color": 5763719,
"description": "Backend staging is responding normally",
"fields": [
{"name": "Environment", "value": "staging", "inline": true},
{"name": "Downtime", "value": "15 minutes", "inline": true}
]
}]
}
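Payloads like the ones above are assembled by string interpolation in the alert script, so a service name or message that contains a quote or backslash would produce invalid JSON. One way to guard against that is sketched here; `json_escape` and `build_alert_payload` are hypothetical helpers, and only backslashes, double quotes, and newlines are handled.

```shell
#!/bin/bash
# Escape a string for use inside a JSON string literal.
# Handles backslashes, double quotes, and newlines; other control
# characters are not covered by this sketch.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build a minimal embed payload: title, description, numeric color.
build_alert_payload() {
  local title description
  title=$(json_escape "$1")
  description=$(json_escape "$2")
  printf '{"embeds":[{"title":"%s","description":"%s","color":%s}]}\n' "$title" "$description" "$3"
}
```

The result could then be sent with `curl -X POST "$DISCORD_WEBHOOK_URL" -H "Content-Type: application/json" -d "$(build_alert_payload ...)"`, in the same way the alert script posts its embeds.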
Rate Limits
- Discord: 30 messages / minute
- Our system: ~2-4 messages / 5 minutes (staging + production checks)
- Infrastructure: ~1 message / 5 minutes
Maximum total: ~5-6 messages / 5 minutes in normal operation, well below the Discord limit
🐘 Shared PostgreSQL Monitoring
Server Configuration
# Port exposed to the host
Port: 5433
# Connection from the host
psql -h localhost -p 5433 -U postgres
# Connection from other containers (via the Docker network)
Host: postgres-shared
Port: 5432 (internal port)
PostgreSQL Health Checks
# Basic check: the server responds
docker exec postgres-shared pg_isready -U postgres -p 5432
# Check staging database
docker exec postgres-shared pg_isready -U gunei_staging_user -d gunei_erp_staging
# Check production database
docker exec postgres-shared pg_isready -U gunei_prod_user -d gunei_erp_production
PostgreSQL Metrics
# Active connections per database
docker exec postgres-shared psql -U postgres -c "
SELECT datname as database,
count(*) as connections,
count(*) FILTER (WHERE state = 'active') as active,
count(*) FILTER (WHERE state = 'idle') as idle
FROM pg_stat_activity
WHERE datname IS NOT NULL
GROUP BY datname;"
# Database sizes
docker exec postgres-shared psql -U postgres -c "
SELECT datname as database,
pg_size_pretty(pg_database_size(datname)) as size
FROM pg_database
WHERE datname LIKE 'gunei_%';"
# Longest-running active queries (top 10)
docker exec postgres-shared psql -U postgres -c "
SELECT pid, datname, usename,
now() - query_start as duration,
left(query, 50) as query_preview
FROM pg_stat_activity
WHERE state = 'active' AND query NOT LIKE '%pg_stat_activity%'
ORDER BY duration DESC
LIMIT 10;"
# Cache hit ratio (should be > 99%)
docker exec postgres-shared psql -U postgres -c "
SELECT datname,
round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) as cache_hit_ratio
FROM pg_stat_database
WHERE datname LIKE 'gunei_%';"
PostgreSQL Alerts
| Metric | Warning | Critical |
|---|---|---|
| Total connections | > 80 | > 95 |
| Staging connections | > 40 | > 50 |
| Production connections | > 40 | > 50 |
| Cache hit ratio | < 95% | < 90% |
| Query duration | > 5s | > 30s |
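The thresholds in this table can be turned into small classifiers that a check script might call with values read from the metric queries. This is a sketch under that assumption; `pg_connections_level` and `pg_cache_hit_level` are hypothetical names.

```shell
#!/bin/bash
# Classify total connections: warning above 80, critical above 95.
pg_connections_level() {
  local count=$1
  if [ "$count" -gt 95 ]; then echo "critical"
  elif [ "$count" -gt 80 ]; then echo "warning"
  else echo "ok"
  fi
}

# Classify cache hit ratio (a percentage, possibly fractional):
# warning below 95, critical below 90.
# awk does the comparison because shell arithmetic is integer-only.
pg_cache_hit_level() {
  awk -v ratio="$1" 'BEGIN {
    if (ratio + 0 < 90) print "critical"
    else if (ratio + 0 < 95) print "warning"
    else print "ok"
  }'
}
```

Note that one threshold is an upper bound and the other a lower bound, which is why the two functions compare in opposite directions.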
📊 Comparing Metrics Across Environments
Script: compare-environments.sh
#!/bin/bash
# /root/scripts/compare-environments.sh
echo "📊 Comparación Staging vs Production"
echo "====================================="
echo ""
# Response times
echo "⏱️ Response Times:"
STAGING_TIME=$(curl -o /dev/null -s -w '%{time_total}' https://staging-erpback.gunei.xyz/status)
echo " Staging Backend: ${STAGING_TIME}s"
if docker ps | grep -q "gunei-backend-production"; then
PROD_TIME=$(curl -o /dev/null -s -w '%{time_total}' http://localhost:3100/status)
echo " Production Backend: ${PROD_TIME}s"
fi
echo ""
# Container resources
echo "💾 Container Resources:"
echo " CONTAINER CPU% MEM USAGE"
docker stats --no-stream --format " {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep gunei
echo ""
# Database connections
echo "🗄️ Database Connections:"
docker exec postgres-shared psql -U postgres -t -c "
SELECT ' ' || datname || ': ' || count(*) || ' connections'
FROM pg_stat_activity
WHERE datname LIKE 'gunei_%'
GROUP BY datname;"
echo ""
# Database sizes
echo "📦 Database Sizes:"
docker exec postgres-shared psql -U postgres -t -c "
SELECT ' ' || datname || ': ' || pg_size_pretty(pg_database_size(datname))
FROM pg_database
WHERE datname LIKE 'gunei_%';"
Comparative Metrics
| Metric | Staging | Production | Notes |
|---|---|---|---|
| Response time | < 500ms | < 200ms | Production should be faster |
| Memory usage | < 512MB | < 1GB | Production may use more resources |
| DB connections | < 20 | < 50 | Production carries more load |
| Error rate | < 5% | < 0.1% | Production should be more stable |
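The response-time row can be checked against the `time_total` value that curl reports in seconds. The sketch below converts it to milliseconds and compares it with the per-environment target; `response_target_ms` and `within_target` are hypothetical names.

```shell
#!/bin/bash
# Response-time target in milliseconds for an environment, from the table above.
response_target_ms() {
  case "$1" in
    staging) echo 500 ;;
    production) echo 200 ;;
    *) return 1 ;;
  esac
}

# Succeeds when a curl time_total value (seconds, e.g. 0.183) is under the target.
# Returns 2 for an environment that has no target defined.
within_target() {
  local environment=$1 seconds=$2 target
  target=$(response_target_ms "$environment") || return 2
  awk -v s="$seconds" -v t="$target" 'BEGIN { exit !(s * 1000 < t) }'
}
```

For example, `within_target staging "$STAGING_TIME"` could follow the curl call in compare-environments.sh to flag a slow response.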
📈 Metrics and Alerts
Metrics Monitored by Environment
System (Shared)
- CPU usage: via top/htop
- Memory usage: via free -h
- Disk space: via df -h
- Network: via netstat/ss
Staging
- Response time: health check timing
- HTTP status codes: 200, 5xx
- Database connections: active connections to gunei_erp_staging
- Container health: docker ps status of the staging containers
Production
- Response time: health check timing
- HTTP status codes: 200, 5xx
- Database connections: active connections to gunei_erp_production
- Container health: docker ps status of the production containers
Infrastructure
- PostgreSQL: total connections, queries/second, cache hit ratio
- Caddy: requests/second, 5xx errors, SSL certificate expiry
Alert Thresholds
# Disk space (system)
WARNING: 80%
CRITICAL: 90%
# Memory (system)
WARNING: 85%
CRITICAL: 95%
# Response time (per environment)
WARNING: > 2 seconds
CRITICAL: > 5 seconds (timeout)
# Consecutive failures (per environment)
STAGING: 2 consecutive failures → alert
PRODUCTION: 1 failure → immediate alert
INFRASTRUCTURE: 1 failure → immediate critical alert
# PostgreSQL connections (shared)
WARNING: > 80 connections
CRITICAL: > 95 connections (max 100)
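The per-environment failure counts and the percentage thresholds above could be encoded as follows. This is a sketch only: `failure_threshold`, `should_alert`, and `usage_level` are hypothetical names, and treating the percentage limits as strict "greater than" comparisons is an assumption that mirrors the `-gt` checks in check-infrastructure.sh.

```shell
#!/bin/bash
# Number of consecutive failed checks required before alerting, per environment.
failure_threshold() {
  case "$1" in
    staging) echo 2 ;;
    production|infrastructure) echo 1 ;;
    *) return 1 ;;
  esac
}

# Succeeds when the failure count has reached the environment's threshold.
# Arguments: environment, consecutive failure count.
should_alert() {
  local threshold
  threshold=$(failure_threshold "$1") || return 2
  [ "$2" -ge "$threshold" ]
}

# Classify an integer usage percentage against warning/critical limits.
usage_level() {
  local percent=$1 warning=$2 critical=$3
  if [ "$percent" -gt "$critical" ]; then echo "critical"
  elif [ "$percent" -gt "$warning" ]; then echo "warning"
  else echo "ok"
  fi
}
```

Disk would be classified with `usage_level "$DISK_USAGE" 80 90` and memory with `usage_level "$MEM_USAGE" 85 95`.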
📊 Consolidated Dashboard
Script: metrics-dashboard.sh (New)
Purpose: consolidated visual dashboard of all environments
#!/bin/bash
# /root/scripts/metrics-dashboard.sh
echo "╔═══════════════════════════════════════════════════════════╗"
echo "║ Gunei ERP - Multi-Environment Dashboard ║"
echo "╚═══════════════════════════════════════════════════════════╝"
echo ""
# Function to check service
check_url() {
  # --max-time 5 keeps a hung endpoint from stalling the dashboard
  if curl -f -s -o /dev/null --max-time 5 "$1"; then
    echo "✅"
  else
    echo "❌"
  fi
}
# Infrastructure
echo "┌─ Infrastructure (Shared) ──────────────────────────────────"
# Discard pg_isready's own output so only the icon is captured
POSTGRES_STATUS=$(docker exec postgres-shared pg_isready -U postgres > /dev/null 2>&1 && echo "✅" || echo "❌")
CADDY_STATUS=$(docker ps | grep -q caddy-shared && echo "✅" || echo "❌")
echo "│ PostgreSQL Shared: $POSTGRES_STATUS"
echo "│ Caddy Shared: $CADDY_STATUS"
echo "└────────────────────────────────────────────────────────────"
echo ""
# Staging
echo "┌─ Staging Environment 🟢 ───────────────────────────────────"
STAGING_BACKEND=$(check_url "https://staging-erpback.gunei.xyz/status")
STAGING_FRONTEND=$(check_url "https://staging-erpfront.gunei.xyz/health")
STAGING_DB=$(docker exec postgres-shared psql -U gunei_staging_user -d gunei_erp_staging -c "SELECT 1;" > /dev/null 2>&1 && echo "✅" || echo "❌")
echo "│ Backend: $STAGING_BACKEND https://staging-erpback.gunei.xyz"
echo "│ Frontend: $STAGING_FRONTEND https://staging-erpfront.gunei.xyz"
echo "│ Database: $STAGING_DB gunei_erp_staging"
# Container status (the Status column already starts with "Up ...")
STAGING_BACKEND_CONTAINER=$(docker ps --filter "name=gunei-backend-staging" --format '{{.Status}}')
STAGING_FRONTEND_CONTAINER=$(docker ps --filter "name=gunei-frontend-staging" --format '{{.Status}}')
echo "│ Backend container: $STAGING_BACKEND_CONTAINER"
echo "│ Frontend container: $STAGING_FRONTEND_CONTAINER"
echo "└────────────────────────────────────────────────────────────"
echo ""
# Production (only if it is running)
if docker ps | grep -q "gunei-backend-production"; then
echo "┌─ Production Environment 🔵 ─────────────────────────────────"
PROD_BACKEND=$(check_url "http://localhost:3100/status")
PROD_FRONTEND=$(check_url "http://localhost:3101/health")
PROD_DB=$(docker exec postgres-shared psql -U gunei_prod_user -d gunei_erp_production -c "SELECT 1;" > /dev/null 2>&1 && echo "✅" || echo "❌")
echo "│ Backend: $PROD_BACKEND (port 3100)"
echo "│ Frontend: $PROD_FRONTEND (port 3101)"
echo "│ Database: $PROD_DB gunei_erp_production"
PROD_BACKEND_CONTAINER=$(docker ps --filter "name=gunei-backend-production" --format '{{.Status}}')
PROD_FRONTEND_CONTAINER=$(docker ps --filter "name=gunei-frontend-production" --format '{{.Status}}')
echo "│ Backend container: ${PROD_BACKEND_CONTAINER:-not running}"
echo "│ Frontend container: ${PROD_FRONTEND_CONTAINER:-not running}"
echo "└────────────────────────────────────────────────────────────"
echo ""
fi
# System Resources
echo "┌─ System Resources ──────────────────────────────────────────"
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}')
MEM_USAGE=$(free -h | awk '/^Mem:/ {print $3 "/" $2}')
CPU_LOAD=$(awk '{print $1}' /proc/loadavg)
echo "│ Disk: $DISK_USAGE used"
echo "│ Memory: $MEM_USAGE"
echo "│ Load: $CPU_LOAD"
echo "└────────────────────────────────────────────────────────────"
echo ""
# Recent Events
echo "┌─ Recent Events (Last 5) ────────────────────────────────────"
grep -E "\[ERROR\]|\[WARNING\]" /var/log/gunei-health.log 2>/dev/null | tail -n 5 | while IFS= read -r line; do
echo "│ $line"
done
echo "└────────────────────────────────────────────────────────────"
echo ""
echo "Last updated: $(date '+%Y-%m-%d %H:%M:%S')"
Usage:
# Run the dashboard
/root/scripts/metrics-dashboard.sh
# It also runs automatically every hour (already configured in cron)
# and can be run manually whenever needed
Example output:
╔═══════════════════════════════════════════════════════════╗
║ Gunei ERP - Multi-Environment Dashboard ║
╚═══════════════════════════════════════════════════════════╝
┌─ Infrastructure (Shared) ──────────────────────────────────
│ PostgreSQL Shared: ✅
│ Caddy Shared: ✅
└────────────────────────────────────────────────────────────
┌─ Staging Environment 🟢 ───────────────────────────────────
│ Backend: ✅ https://staging-erpback.gunei.xyz
│ Frontend: ✅ https://staging-erpfront.gunei.xyz
│ Database: ✅ gunei_erp_staging
│ Backend container: Up 2 days
│ Frontend container: Up 2 days
└────────────────────────────────────────────────────────────
┌─ Production Environment 🔵 ─────────────────────────────────
│ Backend: ✅ (port 3100)
│ Frontend: ✅ (port 3101)
│ Database: ✅ gunei_erp_production
│ Backend container: Up 1 day
│ Frontend container: Up 1 day
└────────────────────────────────────────────────────────────
┌─ System Resources ──────────────────────────────────────────
│ Disk: 23% used
│ Memory: 4.2G/8.0G
│ Load: 0.45
└────────────────────────────────────────────────────────────
┌─ Recent Events (Last 5) ────────────────────────────────────
│ [2026-01-12 13:15:00] [WARNING] [STAGING] High response time
└────────────────────────────────────────────────────────────
Last updated: 2026-01-12 14:30:00
🔧 Troubleshooting
Health Check Failing in Staging while Production Is OK
Symptom: Staging alerts fire but production behaves normally
Diagnosis:
# View staging-specific log entries
grep "\[STAGING\]" /var/log/gunei-health.log | tail -n 20
# Manual check of staging
curl -v https://staging-erpback.gunei.xyz/status
curl -v https://staging-erpfront.gunei.xyz/health
# View staging container logs
docker logs gunei-backend-staging --tail 50
docker logs gunei-frontend-staging --tail 50
# Check staging resource usage for contention with production
docker stats --no-stream | grep staging
Solution:
# If staging is slow or crashing
docker restart gunei-backend-staging
docker restart gunei-frontend-staging
# Check the staging database specifically
docker exec postgres-shared psql -U gunei_staging_user -d gunei_erp_staging -c "SELECT 1;"
Production Down while Staging Is OK
Symptom: Production does not respond; staging is operational
Diagnosis:
# View production log entries
grep "\[PRODUCTION\]" /var/log/gunei-health.log | tail -n 20
# Manual check of production
curl -v http://localhost:3100/status
curl -v http://localhost:3101/health
# View production container logs
docker logs gunei-backend-production --tail 50
docker logs gunei-frontend-production --tail 50
Solution:
# Restart production services
cd /opt/apps/gunei-erp/backend/production
docker compose restart backend
cd /opt/apps/gunei-erp/frontend/production
docker compose restart frontend
Infrastructure Down (Affects Both Environments)
Symptom: Staging and production fail at the same time
Diagnosis:
# Check infrastructure
/root/scripts/check-infrastructure.sh
# View PostgreSQL logs
docker logs postgres-shared --tail 100
# View Caddy logs
docker logs caddy-shared --tail 100
# Verify that the containers are running
docker ps | grep -E "postgres-shared|caddy-shared"
Solution:
# Restart PostgreSQL (CAUTION: affects both environments)
docker restart postgres-shared
sleep 10
# Restart Caddy
docker restart caddy-shared
# Verify that everything is back
/root/scripts/health-check.sh
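The fixed `sleep 10` after restarting PostgreSQL can be too short or needlessly long. As an alternative, a small polling helper can wait until the service actually answers. This is a sketch only: wait_until is a hypothetical name, and the 60-second timeout in the commented example is an arbitrary choice.

```bash
#!/bin/bash
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Retries COMMAND once per second until it succeeds or the timeout expires.
# Returns 0 on success, 1 on timeout. (Hypothetical helper, not an existing script.)
wait_until() {
    local timeout="$1"; shift
    local elapsed=0
    until "$@" > /dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example (assumption, commented out): wait for PostgreSQL before restarting Caddy.
# docker restart postgres-shared
# wait_until 60 docker exec postgres-shared pg_isready -U postgres \
#     || echo "PostgreSQL did not become ready within 60s"
# docker restart caddy-shared
```

Polling with pg_isready reuses the same readiness check that the dashboard script already performs.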
Too Many Discord Notifications (Multi-Environment)
Symptom: Alert spam coming from multiple environments
Diagnosis:
# View alert frequency per environment
grep "Discord notification sent" /var/log/gunei-health.log | grep "\[STAGING\]" | tail -n 20
grep "Discord notification sent" /var/log/gunei-health.log | grep "\[PRODUCTION\]" | tail -n 20
# Identify the problematic environment
grep "\[ERROR\]" /var/log/gunei-health.log | tail -n 50
Solution:
# Implement a per-environment cooldown in alert-check.sh
# Or adjust the cron frequency
# Or use different thresholds per environment:
# - Staging: 3 consecutive failures before alerting
# - Production: alert immediately on the first failure
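A per-environment cooldown of the kind suggested above might look like the sketch below. It is not the existing alert-check.sh logic: should_alert, COOLDOWN_DIR and COOLDOWN_SECONDS are hypothetical names, and the 15-minute default is an assumption.

```bash
#!/bin/bash
# Hypothetical cooldown helper for alert-check.sh.
# COOLDOWN_DIR and COOLDOWN_SECONDS are illustrative names, not existing settings.
COOLDOWN_DIR="${COOLDOWN_DIR:-/tmp/gunei-alert-cooldown}"
COOLDOWN_SECONDS="${COOLDOWN_SECONDS:-900}"

# should_alert ENV
# Returns 0 (and records the current time) if an alert for ENV may be sent now;
# returns 1 if the previous alert for ENV is still inside the cooldown window.
should_alert() {
    local env="$1"
    local stamp_file="$COOLDOWN_DIR/$env.last"
    local now last
    now=$(date +%s)
    mkdir -p "$COOLDOWN_DIR"
    if [ -f "$stamp_file" ]; then
        last=$(cat "$stamp_file")
        if [ $((now - last)) -lt "$COOLDOWN_SECONDS" ]; then
            return 1
        fi
    fi
    echo "$now" > "$stamp_file"
    return 0
}

# Example (assumption, commented out): only notify when the cooldown has expired.
# if should_alert "STAGING"; then
#     echo "send Discord notification for STAGING here"
# fi
```

Because each environment has its own timestamp file, a noisy staging environment does not suppress production alerts.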
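Logs Not Rotating / Disk Full
Symptom: Disk above 90%, huge logs from multiple environments
Diagnosis:
# View log sizes
du -sh /var/log/gunei-*.log
# View size per container
du -sh /var/lib/docker/containers/*/
Solution:
# Force rotation
logrotate -f /etc/logrotate.d/gunei-health
# Remove old compressed logs
find /var/log -name "*.gz" -mtime +30 -delete
# Clean up Docker logs: truncate the json-file logs in place (containers keep running)
truncate -s 0 /var/lib/docker/containers/*/*-json.log
# Optional: reclaim space from unused images and stopped containers.
# CAUTION: this does not shrink the logs of running containers, and adding
# --volumes would also delete unused volumes.
docker system prune -a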
Cron Not Running Scripts
See the troubleshooting steps in the previous version of this guide; they are unchanged.
Conflicts Between Environments
Symptom: One environment affects the other (staging slows down when production is under load, or vice versa)
Diagnosis:
# View resources per container
docker stats --no-stream | grep gunei
# View DB connections per environment
docker exec postgres-shared psql -U postgres -c "
SELECT datname, count(*), state
FROM pg_stat_activity
WHERE datname LIKE 'gunei_%'
GROUP BY datname, state
ORDER BY datname;"
# Verify that each environment uses the correct DB
docker logs gunei-backend-staging 2>&1 | grep -i "database\|connection"
docker logs gunei-backend-production 2>&1 | grep -i "database\|connection"
Possible causes:
- Crossed DB connections: staging backend connecting to the production DB
- Wrong ports: production using a staging port
- Saturated shared resources: PostgreSQL/Caddy at their limit
- Wrong environment variables: DATABASE_URL pointing to the wrong environment
Solution:
# Verify that ports do not collide
ss -tlnp | grep -E ":(3000|3001|3100|3101)\b"
# Expected port assignment:
# :3000 - gunei-backend-staging
# :3001 - gunei-frontend-staging
# :3100 - gunei-backend-production
# :3101 - gunei-frontend-production
# If there is a conflict, recreate the container with the correct port mapping
# Staging backend must map 3000:3000
# Production backend must map 3100:3000
# Verify DATABASE_URL in each environment
docker exec gunei-backend-staging printenv | grep DATABASE
# Should show: gunei_erp_staging
docker exec gunei-backend-production printenv | grep DATABASE
# Should show: gunei_erp_production
# If it is misconfigured, update docker-compose and recreate the container
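The manual DATABASE_URL inspection above can be turned into an automated check. The sketch below assumes the variable has the common form postgresql://user:password@host:port/dbname?params and that neither the password nor the query string contains a slash. The function names db_name_from_url and check_env_database are hypothetical.

```bash
#!/bin/bash
# db_name_from_url URL
# Extracts the database name from a URL of the assumed form
# postgresql://user:password@host:port/dbname?params
db_name_from_url() {
    local url="$1"
    local path="${url##*/}"   # everything after the last "/"
    echo "${path%%\?*}"       # drop an optional "?query" suffix
}

# check_env_database CONTAINER EXPECTED_DB
# Compares the database named in the container's DATABASE_URL with EXPECTED_DB.
# Returns 0 on match, 1 on mismatch, 2 if the variable could not be read.
check_env_database() {
    local container="$1" expected="$2" url actual
    url=$(docker exec "$container" printenv DATABASE_URL) || return 2
    actual=$(db_name_from_url "$url")
    if [ "$actual" = "$expected" ]; then
        echo "OK: $container -> $actual"
    else
        echo "MISMATCH: $container -> $actual (expected $expected)"
        return 1
    fi
}

# Example (commented out; requires the running containers):
# check_env_database gunei-backend-staging gunei_erp_staging
# check_env_database gunei-backend-production gunei_erp_production
```

Printing only the database name also avoids echoing the full connection string, which contains credentials.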
Port in Use by Another Process
Symptom: Container does not start, error "port already in use"
Diagnosis:
# See which process is using the port
lsof -i :3000
lsof -i :3001
lsof -i :3100
lsof -i :3101
lsof -i :5433
# List all containers (including stopped ones)
docker ps -a | grep gunei
Solution:
# If it is a stale container
docker rm -f <container_id>
# If it is another process
kill <PID>
# Restart the affected container
docker compose up -d
Database Connections Exhausted
Symptom: "too many connections" error in backend logs
Diagnosis:
# View total connections
docker exec postgres-shared psql -U postgres -c "SELECT count(*) FROM pg_stat_activity;"
# View connections per database
docker exec postgres-shared psql -U postgres -c "
SELECT datname, count(*)
FROM pg_stat_activity
WHERE datname IS NOT NULL
GROUP BY datname;"
# View the longest-idle connections (state_change marks when the session became idle)
docker exec postgres-shared psql -U postgres -c "
SELECT datname, usename, state, state_change, now() - state_change AS idle_for
FROM pg_stat_activity
WHERE state = 'idle'
ORDER BY idle_for DESC
LIMIT 20;"
Solution:
# Terminate staging connections that have been idle for more than 30 minutes
docker exec postgres-shared psql -U postgres -c "
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'gunei_erp_staging'
AND state = 'idle'
AND state_change < now() - interval '30 minutes';"
# Same for production if necessary (with more care)
docker exec postgres-shared psql -U postgres -c "
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'gunei_erp_production'
AND state = 'idle'
AND state_change < now() - interval '1 hour';"
# Restart backends to reset their connection pools
docker restart gunei-backend-staging
docker restart gunei-backend-production
📚 References
Monitoring Scripts
- /root/scripts/monitor-logs.sh: View multi-environment logs
- /root/scripts/health-check.sh: Full check
- /root/scripts/alert-check.sh: Automated alerts
- /root/scripts/check-staging.sh: Quick staging check
- /root/scripts/check-production.sh: Quick production check
- /root/scripts/check-infrastructure.sh: Infrastructure check
- /root/scripts/metrics-dashboard.sh: Consolidated dashboard
- /root/scripts/compare-environments.sh: Compare staging vs production
Monitoring URLs
Staging:
- Frontend: https://staging-erpfront.gunei.xyz/health
- Backend: https://staging-erpback.gunei.xyz/status
Production:
- Frontend: https://erpfront.gunei.xyz/health
- Backend: https://erpback.gunei.xyz/status
Last updated: 14 January 2026
Version: 2.1.0
Changes in v2.1:
- ✅ Detailed documentation of PostgreSQL Shared (port 5433)
- ✅ Port mapping per environment (staging 3000/3001, production 3100/3101)
- ✅ DB connections per environment documented
- ✅ PostgreSQL metrics: connections, sizes, slow queries, cache hit ratio
- ✅ Metric comparison between environments (compare-environments.sh)
- ✅ Docker Logging Configuration (json-file, automatic rotation, max 10m/3 files)
- ✅ Argentina timezone (TZ=America/Argentina/Buenos_Aires)
- ✅ Troubleshooting: conflicts between environments
- ✅ Troubleshooting: ports in use
- ✅ Troubleshooting: database connections exhausted
Changes in v2.0:
- ✅ Multi-environment support (staging + production)
- ✅ Shared infrastructure monitoring (PostgreSQL, Caddy)
- ✅ Scripts updated with automatic environment detection
- ✅ Centralized logs with environment tags
- ✅ Contextual Discord notifications per environment
- ✅ Consolidated multi-environment dashboard
- ✅ Independent health checks per environment
- ✅ Multi-environment-specific troubleshooting
- ✅ New scripts: check-staging, check-production, check-infrastructure, metrics-dashboard