Observability
Prometheus metrics, OpenTelemetry tracing, and structured logging for voicetyped.
voicetyped provides observability through three pillars: metrics (Prometheus), traces (OpenTelemetry), and structured logs. Every service emits telemetry that lets you monitor call quality, debug issues, and track system health.
Metrics (Prometheus)
All services expose Prometheus-compatible metrics on the configured metrics port (default :9100).
Configuration
observability:
  metrics:
    port: 9100
    path: /metrics
    enabled: true
Key Metrics
Call Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| vg_calls_active | Gauge | dialog | Currently active calls |
| vg_calls_total | Counter | dialog, status | Total calls |
| vg_call_duration_seconds | Histogram | dialog | Call duration distribution |
| vg_call_state_transitions_total | Counter | dialog, from, to | State transitions |
ASR Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| vg_asr_latency_seconds | Histogram | model | Time from audio to transcript |
| vg_asr_transcriptions_total | Counter | model, type | Transcriptions (partial/final) |
| vg_asr_workers_active | Gauge | | Active ASR workers |
| vg_asr_queue_depth | Gauge | | Queued audio segments |
| vg_asr_confidence | Histogram | model | Confidence score distribution |
| vg_asr_gpu_utilization | Gauge | device | GPU utilization % |
Media Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| vg_rtp_packets_received_total | Counter | | RTP packets received |
| vg_rtp_packets_lost_total | Counter | | RTP packets lost |
| vg_rtp_jitter_ms | Histogram | | RTP jitter |
| vg_sip_requests_total | Counter | method, status | SIP requests |
| vg_audio_buffer_underruns_total | Counter | | Audio buffer underruns |
Integration Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| vg_integration_requests_total | Counter | service, method, status | Backend requests |
| vg_integration_latency_seconds | Histogram | service, method | Backend latency |
| vg_integration_retries_total | Counter | service | Retry attempts |
| vg_integration_circuit_breaker | Gauge | service | Circuit breaker state |
System Metrics
| Metric | Type | Description |
|---|---|---|
| vg_uptime_seconds | Gauge | Time since startup |
| vg_goroutines | Gauge | Active goroutines |
| vg_memory_alloc_bytes | Gauge | Memory allocated |
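To read these metrics outside Prometheus, for example in a smoke test, the text exposition format served on the metrics port can be parsed directly. The following is a minimal Python sketch and not part of voicetyped: `parse_metrics`, `total_active_calls`, and the `SAMPLE` payload are hypothetical, and the parser only handles simple `name{labels} value` lines (no escaped quotes inside label values, no timestamps).

```python
import re

# Matches `name{label="v",...} value` or `name value`.
_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        match = _LINE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def total_active_calls(text):
    """Sum vg_calls_active across all dialog labels."""
    return sum(v for name, _, v in parse_metrics(text) if name == "vg_calls_active")


# Hypothetical sample of what /metrics might return.
SAMPLE = """\
# HELP vg_calls_active Currently active calls
# TYPE vg_calls_active gauge
vg_calls_active{dialog="helpdesk"} 7
vg_calls_active{dialog="billing"} 3
vg_asr_queue_depth 2
"""
```

Against a running service, the text could be fetched with `urllib.request.urlopen("http://localhost:9100/metrics").read().decode()` and passed to the same functions, assuming the default port and path.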
Prometheus Configuration
# prometheus.yml
scrape_configs:
  - job_name: 'voice-gateway'
    scrape_interval: 15s
    static_configs:
      - targets: ['voice-gateway:9100']
# For Kubernetes with ServiceMonitor
# (handled automatically by Helm chart)
Useful PromQL Queries
# Active calls
vg_calls_active
# Call rate (calls per minute)
rate(vg_calls_total[5m]) * 60
# Average call duration (seconds)
sum(rate(vg_call_duration_seconds_sum[5m])) /
sum(rate(vg_call_duration_seconds_count[5m]))
# P99 ASR latency
histogram_quantile(0.99, rate(vg_asr_latency_seconds_bucket[5m]))
# RTP packet loss rate (%)
rate(vg_rtp_packets_lost_total[5m]) /
rate(vg_rtp_packets_received_total[5m]) * 100
# Integration error rate (%) per service
sum by (service) (rate(vg_integration_requests_total{status!="OK"}[5m])) /
sum by (service) (rate(vg_integration_requests_total[5m])) * 100
# ASR queue depth (backpressure indicator)
vg_asr_queue_depth > 5
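The P99 ASR latency query above relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by linear interpolation inside the bucket that contains the target rank. The Python sketch below reimplements that estimate for illustration only. The bucket boundaries and counts are made-up values, not real voicetyped output, and edge cases such as native histograms are ignored.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    The last bucket must have upper bound +Inf, as in Prometheus.
    """
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("buckets must end with a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == math.inf:
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            if count == prev_count:
                return bound
            # Linear interpolation within the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical vg_asr_latency_seconds buckets: 100 observations in total.
EXAMPLE_BUCKETS = [(0.5, 50), (1.0, 90), (2.0, 99), (math.inf, 100)]
p95 = histogram_quantile(0.95, EXAMPLE_BUCKETS)
```

Because the result is interpolated, its accuracy depends on how the bucket boundaries are chosen: a P99 that lands in a wide bucket is only a rough estimate.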
Alerting Rules
# prometheus-rules.yml
groups:
  - name: voice-gateway
    rules:
      - alert: HighASRLatency
        expr: histogram_quantile(0.99, rate(vg_asr_latency_seconds_bucket[5m])) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "ASR latency P99 exceeds 2 seconds"
      - alert: HighPacketLoss
        expr: |
          rate(vg_rtp_packets_lost_total[5m]) /
          rate(vg_rtp_packets_received_total[5m]) > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "RTP packet loss exceeds 5%"
      - alert: ASRQueueBacklog
        expr: vg_asr_queue_depth > 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "ASR queue depth exceeds 10 segments"
      - alert: IntegrationCircuitOpen
        expr: vg_integration_circuit_breaker == 1
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Circuit breaker is OPEN for {{ $labels.service }}"
      - alert: HighCallVolume
        # Threshold assumes a capacity of 100 concurrent calls;
        # set it to 80% of your own limit.
        expr: sum(vg_calls_active) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Active calls exceed 80 (80% of an assumed 100-call capacity)"
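The `for` clause in each rule means the expression must stay true across consecutive evaluations for that long before the alert fires; until then it is pending. The Python sketch below models that behavior for a simple threshold rule such as ASRQueueBacklog. It is a simplified illustration, not Prometheus code: `alert_states` and the sample values are hypothetical.

```python
def alert_states(samples, threshold, for_seconds):
    """Return the alert state at each evaluation.

    samples: list of (timestamp_seconds, value), one per rule evaluation.
    """
    states = []
    active_since = None  # time at which the condition first became true
    for timestamp, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = timestamp
            held_for = timestamp - active_since
            states.append("firing" if held_for >= for_seconds else "pending")
        else:
            active_since = None  # condition cleared, so the timer resets
            states.append("inactive")
    return states


# Hypothetical vg_asr_queue_depth readings, evaluated every 30 seconds.
QUEUE_DEPTH = [(0, 12), (30, 15), (60, 11), (90, 4), (120, 20)]
STATES = alert_states(QUEUE_DEPTH, threshold=10, for_seconds=60)
```

With `for: 0m`, as in IntegrationCircuitOpen, the alert fires on the first evaluation where the expression is true.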
Tracing (OpenTelemetry)
voicetyped supports distributed tracing via OpenTelemetry (OTLP).
Configuration
observability:
  tracing:
    enabled: true
    otlp_endpoint: "otel-collector:4317"
    otlp_protocol: grpc # grpc or http
    sample_rate: 1.0 # 1.0 = 100%, 0.1 = 10%
    service_name: voice-gateway
    resource_attributes:
      deployment.environment: production
      service.version: "1.0.0"
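The `sample_rate` setting controls what fraction of traces is exported. How voicetyped makes that decision is not documented here. A common approach, similar in spirit to OpenTelemetry's TraceIdRatioBased sampler, derives the decision from the trace ID so that every service handling the same trace decides the same way. The Python sketch below illustrates the idea under that assumption; `should_sample` is a hypothetical helper.

```python
TRACE_ID_MASK = (1 << 64) - 1  # use the low 64 bits of the 128-bit trace ID


def should_sample(trace_id_hex, sample_rate):
    """Decide deterministically whether a trace is kept.

    sample_rate is a fraction between 0.0 (none) and 1.0 (all).
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    bound = round(sample_rate * (TRACE_ID_MASK + 1))
    return (int(trace_id_hex, 16) & TRACE_ID_MASK) < bound


# Trace ID taken from the log example in this document.
EXAMPLE_TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736"
keep_all = should_sample(EXAMPLE_TRACE_ID, 1.0)
keep_tenth = should_sample(EXAMPLE_TRACE_ID, 0.1)
```

Lowering `sample_rate` reduces collector load at the cost of missing some calls, so a rate of 1.0 is usually reserved for low-volume or debugging environments.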
Trace Spans
Each call generates a trace with spans for each processing stage:
Trace: call-abc-123
├── media.sip_invite (2ms)
├── media.rtp_setup (15ms)
├── speech.vad_detect (50ms)
├── speech.asr_transcribe (340ms)
│ ├── speech.whisper_inference (280ms)
│ └── speech.post_process (60ms)
├── runtime.state_transition (1ms)
│ └── runtime.evaluate_conditions (0.5ms)
├── integration.call_hook (234ms)
│ ├── integration.serialize (1ms)
│ ├── integration.grpc_call (230ms)
│ └── integration.deserialize (3ms)
├── speech.tts_synthesize (120ms)
└── media.rtp_playback (2100ms)
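When reading a trace like this, a parent span's duration includes its children, so the time attributable to a stage itself is the parent duration minus the child durations. The Python sketch below computes that "self time" for the example trace to show where the time goes. The span names and durations are copied from the example; the helper names are hypothetical, and the calculation assumes child spans run sequentially without overlapping.

```python
# (name, duration_ms, children) tuples copied from the example trace.
TRACE = [
    ("media.sip_invite", 2, []),
    ("media.rtp_setup", 15, []),
    ("speech.vad_detect", 50, []),
    ("speech.asr_transcribe", 340, [
        ("speech.whisper_inference", 280, []),
        ("speech.post_process", 60, []),
    ]),
    ("runtime.state_transition", 1, [
        ("runtime.evaluate_conditions", 0.5, []),
    ]),
    ("integration.call_hook", 234, [
        ("integration.serialize", 1, []),
        ("integration.grpc_call", 230, []),
        ("integration.deserialize", 3, []),
    ]),
    ("speech.tts_synthesize", 120, []),
    ("media.rtp_playback", 2100, []),
]


def self_times(spans):
    """Yield (name, self_ms): a span's duration minus its direct children."""
    for name, duration, children in spans:
        yield name, duration - sum(child[1] for child in children)
        yield from self_times(children)


def slowest(spans, n=3):
    """Return the n spans with the largest self time."""
    return sorted(self_times(spans), key=lambda item: item[1], reverse=True)[:n]


TOP_SPANS = slowest(TRACE)
```

In this example, audio playback dominates the trace, followed by Whisper inference and the backend gRPC call, while the wrapper spans themselves add almost no time.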
Span Attributes
Each span includes relevant attributes:
speech.asr_transcribe:
  asr.model: whisper-medium
  asr.language: en
  asr.confidence: 0.94
  asr.duration_ms: 2340
  asr.is_final: true
integration.call_hook:
  rpc.service: ticketing
  rpc.method: CreateTicket
  rpc.status_code: OK
  retry.count: 0
Structured Logging
Configuration
observability:
  logging:
    level: info # debug, info, warn, error
    format: json # json, text
    output: stdout # stdout, file
    file_path: /var/log/voice-gateway/vg.log
    include_caller: true # Include source file:line
Log Format
{
  "timestamp": "2024-01-15T10:30:45.123Z",
  "level": "info",
  "message": "call started",
  "session_id": "abc-123-def",
  "caller_id": "+15551234567",
  "dialog": "helpdesk",
  "component": "media-gateway",
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"
}
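Because each JSON log record carries a `trace_id`, logs can be correlated with the trace for the same call. The Python sketch below filters log lines by trace ID. It assumes one JSON object per line, as is typical for JSON log output; `logs_for_trace` is a hypothetical helper, and the second and third sample lines are invented for illustration.

```python
import json


def logs_for_trace(lines, trace_id):
    """Return parsed JSON log records whose trace_id matches."""
    records = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines, e.g. text-format output
        if isinstance(record, dict) and record.get("trace_id") == trace_id:
            records.append(record)
    return records


SAMPLE_LINES = [
    # Abbreviated form of the record shown in the Log Format example.
    '{"level": "info", "message": "call started", '
    '"session_id": "abc-123-def", '
    '"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"}',
    # Hypothetical record from a different call.
    '{"level": "warn", "message": "retrying hook", '
    '"trace_id": "00000000000000000000000000000001"}',
    # Hypothetical non-JSON line.
    "plain text line",
]

MATCHES = logs_for_trace(SAMPLE_LINES, "4bf92f3577b34da6a3ce929d0e0e4736")
```

The same filter could be applied to lines read from the configured log file or from a container's stdout.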
Log Levels
| Level | Use |
|---|---|
| debug | Detailed debugging (audio processing, VAD decisions) |
| info | Normal operations (call start/end, state transitions) |
| warn | Potential issues (high latency, retry attempts) |
| error | Failures (hook errors, codec failures, connection drops) |
Grafana Dashboards
voicetyped provides pre-built Grafana dashboards:
Call Overview Dashboard
Displays:
- Active calls (real-time)
- Call volume over time
- Average call duration
- Call completion rate
- Top dialogs by volume
ASR Performance Dashboard
Displays:
- ASR latency (P50, P95, P99)
- Transcription throughput
- Model confidence distribution
- GPU utilization
- Worker pool usage and queue depth
Media Quality Dashboard
Displays:
- RTP packet loss rate
- Jitter distribution
- Audio buffer underruns
- SIP error rates
- Codec distribution
Integration Health Dashboard
Displays:
- Backend request rate
- Error rate by service
- Latency by service
- Circuit breaker states
- Retry rates
Health Endpoints
# Liveness probe (is the process running?)
curl http://localhost:9100/healthz
# Returns 200 if alive
# Readiness probe (is the service ready to accept calls?)
curl http://localhost:9100/readyz
# Returns 200 if ready, 503 if not
# Detailed health check
curl http://localhost:9100/health
# Returns JSON with component status
{
  "status": "healthy",
  "components": {
    "media_gateway": {"status": "healthy", "sip_port": 5060},
    "speech_gateway": {"status": "healthy", "model": "whisper-medium", "gpu": true},
    "runtime": {"status": "healthy", "dialogs_loaded": 3},
    "integration": {"status": "healthy", "services": 2}
  },
  "uptime_seconds": 86400,
  "version": "1.0.0"
}
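A deployment script or external monitor can use the detailed health response to report which component is failing. The Python sketch below parses a response of the shape shown above and lists every component whose status is not "healthy". `unhealthy_components` is a hypothetical helper, and the "degraded" status in the sample is an assumption: this page documents only the "healthy" value.

```python
import json


def unhealthy_components(health_json):
    """Return sorted names of components whose status is not 'healthy'."""
    report = json.loads(health_json)
    components = report.get("components", {})
    return sorted(
        name
        for name, info in components.items()
        if info.get("status") != "healthy"
    )


# Hypothetical response in which one component reports a problem.
SAMPLE_RESPONSE = """
{
  "status": "degraded",
  "components": {
    "media_gateway": {"status": "healthy", "sip_port": 5060},
    "speech_gateway": {"status": "degraded", "model": "whisper-medium", "gpu": false},
    "runtime": {"status": "healthy", "dialogs_loaded": 3},
    "integration": {"status": "healthy", "services": 2}
  },
  "uptime_seconds": 86400,
  "version": "1.0.0"
}
"""

FAILING = unhealthy_components(SAMPLE_RESPONSE)
```

Against a running service, the body could be fetched with `urllib.request.urlopen("http://localhost:9100/health")`, assuming the default port.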
Next Steps
- Security — audit logging and compliance
- Kubernetes Deployment — ServiceMonitor setup
- Getting Started — verify your installation