Monitoring
JamBridge exposes Prometheus metrics via Spring Boot Actuator at /actuator/prometheus.
Key metrics
| Metric | Type | Description |
|---|---|---|
| jambridge_messages_received_total | Counter | Total messages received per port |
| jambridge_pipeline_duration_seconds | Histogram | End-to-end pipeline duration |
| jambridge_stage_duration_seconds | Histogram | Per-stage duration (labels: stage) |
| jambridge_consent_decisions_total | Counter | Consent decisions (labels: result=permit/deny/error) |
| jambridge_mpi_match_grade_total | Counter | MPI match grades (labels: grade) |
| jambridge_circuit_breaker_state | Gauge | HAPI circuit breaker (0=closed, 1=open, 2=half-open) |
| jambridge_queue_depth | Gauge | Secondary retry queue depth |
| jambridge_dedup_hits_total | Counter | Duplicate messages detected |
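
For a quick smoke check outside Prometheus, the text returned by /actuator/prometheus can be parsed directly. The sketch below is a simplified parser, not a full implementation of the exposition format. The sample payload, its values, and the `port` label name are made up for illustration, and it assumes the circuit breaker gauge is exported without labels.

```python
import re

# Hypothetical sample of what /actuator/prometheus might return; the values
# and the "port" label name are assumptions made for this example.
SAMPLE = """\
# HELP jambridge_circuit_breaker_state HAPI circuit breaker
# TYPE jambridge_circuit_breaker_state gauge
jambridge_circuit_breaker_state 1.0
jambridge_queue_depth 42.0
jambridge_messages_received_total{port="2575"} 1200.0
jambridge_messages_received_total{port="2576"} 300.0
"""

# Gauge encoding from the metrics table: 0=closed, 1=open, 2=half-open.
CIRCUIT_STATES = {0: "closed", 1: "open", 2: "half-open"}

_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into {(name, labels): value}.

    Simplified: ignores timestamps and escaped quotes inside label values.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _LINE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = tuple(sorted(_LABEL.findall(raw_labels or "")))
        samples[(name, labels)] = float(value)
    return samples


def circuit_state(samples):
    """Translate the circuit breaker gauge into a readable state name."""
    # Assumes the gauge is exported without labels.
    value = samples[("jambridge_circuit_breaker_state", ())]
    return CIRCUIT_STATES.get(int(value), "unknown")


metrics = parse_metrics(SAMPLE)
print(circuit_state(metrics))  # prints: open
print(metrics[("jambridge_messages_received_total", (("port", "2575"),))])  # prints: 1200.0
```

In a real deployment the gauge may carry extra labels (for example an application tag), in which case the lookup key in `circuit_state` would need to include them.
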
Grafana dashboard
Import the JamBridge dashboard from Grafana ID 21847 (or from /monitoring/grafana-dashboard.json in the repo).
Panels include:
- Message throughput by port (messages/min)
- Pipeline stage latency heatmap
- Consent decision rate (permit vs deny)
- MPI match grade distribution
- Circuit breaker state timeline
- Queue depth over time
- ATNA audit event rate
Alerting rules
# prometheus-rules.yaml
groups:
  - name: jambridge
    rules:
      - alert: JamBridgeCircuitOpen
        expr: jambridge_circuit_breaker_state == 1
        for: 2m
        annotations:
          summary: "JamBridge HAPI circuit breaker is OPEN"
      - alert: JamBridgeQueueDepthHigh
        expr: jambridge_queue_depth > 100
        for: 5m
        annotations:
          summary: "JamBridge retry queue depth > 100"
      - alert: JamBridgeConsentErrorRate
        # Errors as a share of all consent decisions, so that 0.1 means 10%.
        # A bare rate(...) > 0.1 would mean 0.1 errors per second instead.
        expr: sum(rate(jambridge_consent_decisions_total{result="error"}[5m])) / sum(rate(jambridge_consent_decisions_total[5m])) > 0.1
        for: 2m
        annotations:
          summary: "JamBridge consent error rate > 10%"
      - alert: JamBridgeNoMessages
        expr: rate(jambridge_messages_received_total[10m]) == 0
        for: 10m
        annotations:
          summary: "JamBridge: no messages received in 10 minutes"
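
A threshold of 0.1 on the consent error metric can be read as 0.1 errors per second or as a 10% share of all decisions, and these are different quantities. The sketch below works through the arithmetic with hypothetical counter readings. `per_second_rate` is a simplified stand-in for PromQL's `rate()`: it ignores counter resets and extrapolation.

```python
def per_second_rate(first, last, window_seconds):
    """Average per-second increase of a counter between two samples.

    Simplified relative to PromQL rate(): no counter-reset handling and no
    extrapolation to the window edges.
    """
    return (last - first) / window_seconds


# Hypothetical counter readings taken 5 minutes (300 s) apart.
WINDOW = 300
# result="error" decisions: 15 new errors in the window.
errors = per_second_rate(first=10, last=25, window_seconds=WINDOW)
# All decisions (permit + deny + error): 60 new decisions in the window.
total = per_second_rate(first=1000, last=1060, window_seconds=WINDOW)

# Share of decisions that ended in an error; guard against an idle window.
error_ratio = errors / total if total > 0 else 0.0

print(f"errors/s = {errors:.3f}")          # prints: errors/s = 0.050
print(f"error share = {error_ratio:.0%}")  # prints: error share = 25%
```

With these numbers the per-second error rate (0.05) stays below 0.1 even though a quarter of all decisions failed, which is why a percentage-style alert needs the ratio of the two rates.
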