Monitoring and Alerting Drill · 2026-04-26 06:04:00

Public entry points, API health, relay, and recovery steps inspected in a single pass

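This page lets customers and operations staff confirm that monitoring and alerting are ready before launch. The current inspection covers the admin console, the supplier portal, the release center, the database drill, the action list, API health, and the NEXUS relay.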

Drill result: PASS
Checks passed: 8/8
Entry points slower than 3 seconds: 0
Relay anomalies: 0

Inspection details

Check                   HTTP status  Duration  Relay  Result
API health              200          222 ms    pc1    PASS
Project progress board  200          464 ms    pc1    PASS
Admin console           200          136 ms    pc1    PASS
Supplier portal         200          208 ms    pc1    PASS
Release center          200          499 ms    pc1    PASS
Database drill page     200          1068 ms   pc1    PASS
Database drill JSON     200          1796 ms   pc1    PASS
Action list             200          304 ms    pc1    PASS

Alert policy

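  • If any public entry point returns a non-200 status or is missing its key text, raise a P0 alert.
  • If API health does not report success or the service name does not match, raise a P0 alert.
  • If a public response does not pass through the pc1 relay or takes longer than 3 seconds, flag it as a P1 operations check.
  • If the database drill JSON is not PASS, pause the real trial-run release.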
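The first and third rules above can be sketched as a small classifier. This is a minimal illustration, not the drill's actual implementation: `CheckResult` and `classify` are hypothetical names, the 3000 ms limit and the `pc1` relay value are taken from the report thresholds, and the API health and database drill rules are left out because their payload formats are not shown on this page.

```python
from dataclasses import dataclass
from typing import Optional

SLOW_RESPONSE_MS = 3000   # "slowResponseMs" in the report thresholds
EXPECTED_RELAY = "pc1"    # relay value named in the report thresholds


@dataclass
class CheckResult:
    """One probe of a public entry point (hypothetical structure)."""
    name: str
    status_code: int
    duration_ms: int
    relay: Optional[str]
    body: str = ""
    expect: Optional[str] = None  # key text that must appear in the body


def classify(check: CheckResult) -> str:
    """Return "P0", "P1", or "OK" for a single entry point check."""
    # P0: non-200 status, or the expected key text is missing.
    if check.status_code != 200:
        return "P0"
    if check.expect is not None and check.expect not in check.body:
        return "P0"
    # P1: the response bypassed the pc1 relay or took longer than 3 seconds.
    if check.relay != EXPECTED_RELAY or check.duration_ms > SLOW_RESPONSE_MS:
        return "P1"
    return "OK"
```

P0 conditions are tested first, so an entry point that is both failing and slow is reported at the higher severity.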

Recovery steps

  • Confirm the WSL2 API process: ps -ef | grep "node src/main.js"
  • Restart the API: cd /wwwroot/xia/apps/api && setsid env PORT=3300 node src/main.js >/tmp/xia-api.log 2>&1 < /dev/null &
  • Validate and reload Nginx: nginx -t && nginx -s reload
  • Re-run the smoke test: pnpm -C /wwwroot/xia smoke:v0.1
  • If the WSL2 IP has changed, republish the NEXUS relay according to the rules in AGENTS.md.
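Before restarting, it may help to confirm whether anything is actually listening on the API port. The sketch below is one possible check, assuming the API accepts TCP connections on port 3300 (the `PORT=3300` value from the restart command). The helper name `is_port_open` and the host `127.0.0.1` are assumptions; under WSL2 the reachable address may differ.

```python
import socket

API_PORT = 3300  # from PORT=3300 in the restart command


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

Calling `is_port_open("127.0.0.1", API_PORT)` from inside WSL2 only shows that something is listening on the port. It does not prove that the listener is the API or that the health endpoint is healthy, so the smoke test is still needed afterwards.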

Machine-readable report

Open report.json

{
  "title": "Lobster Supply Chain Delivery System Monitoring and Alerting Drill Report",
  "updatedAt": "2026-04-26 06:04:00",
  "status": "PASS",
  "summary": {
    "total": 8,
    "passed": 8,
    "failed": 0,
    "slow": 0,
    "relayMiss": 0
  },
  "thresholds": {
    "httpStatus": 200,
    "publicRelay": "X-NEXUS-Relay: pc1",
    "slowResponseMs": 3000
  },
  "checks": [
    {
      "name": "API health",
      "url": "https://xia.shenliu.cc/xia-api/health",
      "statusCode": 200,
      "durationMs": 222,
      "relay": "pc1",
      "result": "PASS"
    },
    {
      "name": "Project progress board",
      "url": "https://xia.shenliu.cc/xia-board/",
      "expect": "龙虾",
      "statusCode": 200,
      "durationMs": 464,
      "relay": "pc1",
      "result": "PASS"
    },
    {
      "name": "Admin console",
      "url": "https://xia.shenliu.cc/xia-admin/",
      "expect": "root",
      "statusCode": 200,
      "durationMs": 136,
      "relay": "pc1",
      "result": "PASS"
    },
    {
      "name": "Supplier portal",
      "url": "https://xia.shenliu.cc/xia-supplier/",
      "expect": "root",
      "statusCode": 200,
      "durationMs": 208,
      "relay": "pc1",
      "result": "PASS"
    },
    {
      "name": "Release center",
      "url": "https://xia.shenliu.cc/xia-release/",
      "expect": "v0.1",
      "statusCode": 200,
      "durationMs": 499,
      "relay": "pc1",
      "result": "PASS"
    },
    {
      "name": "Database drill page",
      "url": "https://xia.shenliu.cc/xia-db-drill/",
      "expect": "生产数据库演练报告",
      "statusCode": 200,
      "durationMs": 1068,
      "relay": "pc1",
      "result": "PASS"
    },
    {
      "name": "Database drill JSON",
      "url": "https://xia.shenliu.cc/xia-db-drill/report.json",
      "statusCode": 200,
      "durationMs": 1796,
      "relay": "pc1",
      "result": "PASS"
    },
    {
      "name": "Action list",
      "url": "https://xia.shenliu.cc/xia-action-list/",
      "expect": "生产化上线行动清单",
      "statusCode": 200,
      "durationMs": 304,
      "relay": "pc1",
      "result": "PASS"
    }
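  ],
  "alertPolicy": [
    "If any public entry point returns a non-200 status or is missing its key text, raise a P0 alert.",
    "If API health does not report success or the service name does not match, raise a P0 alert.",
    "If a public response does not pass through the pc1 relay or takes longer than 3 seconds, flag it as a P1 operations check.",
    "If the database drill JSON is not PASS, pause the real trial-run release."
  ],
  "recoveryRunbook": [
    "Confirm the WSL2 API process: ps -ef | grep \"node src/main.js\"",
    "Restart the API: cd /wwwroot/xia/apps/api && setsid env PORT=3300 node src/main.js >/tmp/xia-api.log 2>&1 < /dev/null &",
    "Validate and reload Nginx: nginx -t && nginx -s reload",
    "Re-run the smoke test: pnpm -C /wwwroot/xia smoke:v0.1",
    "If the WSL2 IP has changed, republish the NEXUS relay according to the rules in AGENTS.md."
  ]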
}
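
A consumer of report.json might want to cross-check the `summary` counters against the `checks` array. The sketch below shows one way to do that. The report does not define the counters precisely, so it assumes that `slow` counts checks whose `durationMs` exceeds `slowResponseMs`, that `relayMiss` counts checks whose `relay` differs from the value in `publicRelay`, and that `failed` is every check whose `result` is not `PASS`. `recompute_summary` is a hypothetical helper, and `SAMPLE_REPORT` is a two-check excerpt of the report above rather than the full file.

```python
import json

# Two-check excerpt of the report; the summary matches the excerpt, not the full file.
SAMPLE_REPORT = """
{
  "summary": {"total": 2, "passed": 2, "failed": 0, "slow": 0, "relayMiss": 0},
  "thresholds": {
    "httpStatus": 200,
    "publicRelay": "X-NEXUS-Relay: pc1",
    "slowResponseMs": 3000
  },
  "checks": [
    {"url": "https://xia.shenliu.cc/xia-api/health",
     "statusCode": 200, "durationMs": 222, "relay": "pc1", "result": "PASS"},
    {"url": "https://xia.shenliu.cc/xia-db-drill/report.json",
     "statusCode": 200, "durationMs": 1796, "relay": "pc1", "result": "PASS"}
  ]
}
"""


def recompute_summary(report: dict) -> dict:
    """Rebuild the summary counters from the checks and thresholds."""
    thresholds = report["thresholds"]
    # "publicRelay" is stored as "Header-Name: value"; keep only the value.
    expected_relay = thresholds["publicRelay"].split(":", 1)[1].strip()
    checks = report["checks"]
    passed = sum(1 for c in checks if c["result"] == "PASS")
    return {
        "total": len(checks),
        "passed": passed,
        "failed": len(checks) - passed,
        "slow": sum(1 for c in checks
                    if c["durationMs"] > thresholds["slowResponseMs"]),
        "relayMiss": sum(1 for c in checks
                         if c.get("relay") != expected_relay),
    }


report = json.loads(SAMPLE_REPORT)
summary_matches = recompute_summary(report) == report["summary"]
```

If `summary_matches` is false for a real report, either the counters use different definitions from the ones assumed here or the report was generated inconsistently. Both cases are worth investigating before trusting the top-level `status`.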