Dialog Hooks API

Dialog Hooks are the integration point between voicetyped’s conversation runtime and your business logic. When a dialog state machine executes a call_hook action, the Integration Gateway sends an HTTP POST request with a JSON body to your webhook URL and uses the JSON response to continue the dialog. You implement standard HTTP endpoints that voicetyped calls; this is the primary way to add custom logic to voice flows.

How It Works

voicetyped acts as the HTTP client. You run an HTTP server that exposes one or more webhook endpoints. During dialog execution, voicetyped POSTs a JSON payload to your endpoint and expects a JSON response.

voicetyped                           Your Server
     |                                    |
     |--- POST /on-intent (JSON) -------->|
     |                                    |  (your business logic)
     |<---------- 200 OK (JSON) ----------|
     |                                    |
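The exchange in the diagram can be tried locally without a voicetyped instance. The sketch below is only an illustration under stated assumptions: `HookHandler`, `post_event`, and `demo_round_trip` are hypothetical names, the toy server simply echoes the transcript, and `post_event` stands in for voicetyped as the HTTP client. It uses only the Python standard library.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Bypass any system proxy so the local demo talks directly to 127.0.0.1.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class HookHandler(BaseHTTPRequestHandler):
    """Toy webhook server: answers POST /on-intent by echoing the transcript."""

    def do_POST(self):
        if self.path != "/on-intent":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length))
        body = json.dumps(
            {"response_text": f"You said: {event['transcript']}"}
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def post_event(url, event):
    """Play the voicetyped role: POST a JSON event and return the parsed reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with _OPENER.open(req, timeout=5) as resp:
        return json.loads(resp.read())


def demo_round_trip():
    """Start the toy server on a free local port, send one event, return the reply."""
    server = HTTPServer(("127.0.0.1", 0), HookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return post_event(
            f"http://127.0.0.1:{port}/on-intent",
            {"session_id": "call-abc-123", "transcript": "I need a password reset"},
        )
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo_round_trip()` performs one full request and response cycle. `post_event` can also be pointed at a real hook server during development to replay sample payloads.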

Webhook Endpoints

Your server implements up to three endpoints. Only /on-intent is required.

Endpoint	Method	Description
`{your_url}/on-intent`	POST	Called when the dialog FSM executes a `call_hook` action. Required.
`{your_url}/on-call-start`	POST	Called when a call starts. Optional.
`{your_url}/on-call-end`	POST	Called when a call ends. Optional.

OnIntent

The primary hook. Called when a dialog state machine triggers a call_hook action.

Request (POST from voicetyped)

{
  "session_id": "call-abc-123",
  "caller_id": "+15551234567",
  "called_number": "+18001234567",
  "dialog_name": "helpdesk",
  "current_state": "process_request",
  "transcript": "I need a password reset",
  "confidence": 0.94,
  "language": "en",
  "variables": {"caller_name": "John"},
  "payload": {},
  "timestamp": "2024-01-15T10:30:45Z",
  "sip_headers": {}
}

Field	Type	Description
`session_id`	string	Unique call session ID
`caller_id`	string	Caller phone number or SIP URI
`called_number`	string	Dialed number
`dialog_name`	string	Active dialog name
`current_state`	string	Current FSM state
`transcript`	string	Full transcript of caller’s speech
`confidence`	float	ASR confidence score (0.0 to 1.0)
`language`	string	Detected language code
`variables`	object	Variables set during the dialog
`payload`	object	Custom payload from `call_hook` action
`timestamp`	string	ISO 8601 timestamp
`sip_headers`	object	SIP headers from the call
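A hook server may want to validate the request body before acting on it. The following is a minimal sketch, not part of voicetyped: `IntentEvent` and `parse_intent_event` are hypothetical names, and treating `session_id` and `transcript` as the only required fields is an assumption made for this example.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class IntentEvent:
    """Typed view of the OnIntent request body (field names from the table above)."""
    session_id: str
    transcript: str
    caller_id: str = ""
    called_number: str = ""
    dialog_name: str = ""
    current_state: str = ""
    confidence: float = 0.0
    language: str = ""
    variables: dict = field(default_factory=dict)
    payload: dict = field(default_factory=dict)
    timestamp: str = ""
    sip_headers: dict = field(default_factory=dict)


def parse_intent_event(raw):
    """Parse a JSON request body into an IntentEvent, ignoring unknown keys."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("request body must be a JSON object")
    # Assumption for this sketch: only these two fields are mandatory.
    missing = [k for k in ("session_id", "transcript") if k not in data]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    known = {f.name for f in fields(IntentEvent)}
    event = IntentEvent(**{k: v for k, v in data.items() if k in known})
    if not 0.0 <= float(event.confidence) <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return event
```

Ignoring unknown keys keeps the parser tolerant if new fields appear in the payload later.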

Response (your server returns)

{
  "response_text": "I've created a password reset ticket for you.",
  "next_state": "",
  "variables": {"ticket_id": "TK-45678", "issue_type": "password_reset"},
  "metadata": {},
  "send_dtmf": "",
  "transfer_to": "",
  "hangup": false
}

Field	Type	Description
`response_text`	string	Text played as TTS to the caller
`next_state`	string	Force a state transition. If empty, the dialog FSM uses its normal transitions.
`variables`	object	Variables to set in the dialog context
`metadata`	object	Metadata to attach to the call session
`send_dtmf`	string	DTMF digits to send (optional)
`transfer_to`	string	SIP URI or number to transfer the call to (optional)
`hangup`	bool	End the call (optional)
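The implementation examples in this document send only the fields they need and leave the rest out. A small helper can do the same in Python. This is a sketch under that assumption; `build_action` is a hypothetical name, not a voicetyped API.

```python
import json


def build_action(response_text, next_state="", variables=None, metadata=None,
                 send_dtmf="", transfer_to="", hangup=False):
    """Return a response dict with response_text plus every non-empty option."""
    action = {"response_text": response_text}
    optional = {
        "next_state": next_state,
        "variables": variables,
        "metadata": metadata,
        "send_dtmf": send_dtmf,
        "transfer_to": transfer_to,
        "hangup": hangup,
    }
    # Drop empty strings, None, empty dicts, and False, similar to Go's omitempty.
    action.update({k: v for k, v in optional.items() if v})
    return action


# Example: a reply that stores a variable and ends the call.
body = json.dumps(build_action("Goodbye.", variables={"resolved": "yes"}, hangup=True))
```

An empty `next_state` is omitted, which leaves the dialog FSM on its normal transitions as described in the table above.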

Implementation Examples

Go

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"strings"
)

type IntentEvent struct {
	SessionID    string            `json:"session_id"`
	CallerID     string            `json:"caller_id"`
	CalledNumber string            `json:"called_number"`
	DialogName   string            `json:"dialog_name"`
	CurrentState string            `json:"current_state"`
	Transcript   string            `json:"transcript"`
	Confidence   float64           `json:"confidence"`
	Language     string            `json:"language"`
	Variables    map[string]string `json:"variables"`
	Payload      json.RawMessage   `json:"payload"`
	Timestamp    string            `json:"timestamp"`
	SipHeaders   map[string]string `json:"sip_headers"`
}

type DialogAction struct {
	ResponseText string            `json:"response_text"`
	NextState    string            `json:"next_state,omitempty"`
	Variables    map[string]string `json:"variables,omitempty"`
	Metadata     map[string]string `json:"metadata,omitempty"`
	SendDTMF     string            `json:"send_dtmf,omitempty"`
	TransferTo   string            `json:"transfer_to,omitempty"`
	Hangup       bool              `json:"hangup,omitempty"`
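}

// CallStartEvent is the JSON body voicetyped POSTs to /on-call-start.
type CallStartEvent struct {
	SessionID    string            `json:"session_id"`
	CallerID     string            `json:"caller_id"`
	CalledNumber string            `json:"called_number"`
	DialogName   string            `json:"dialog_name"`
	SipHeaders   map[string]string `json:"sip_headers"`
	Timestamp    string            `json:"timestamp"`
}

// CallStartAction is the JSON body returned from /on-call-start.
type CallStartAction struct {
	Variables      map[string]string `json:"variables,omitempty"`
	OverrideDialog string            `json:"override_dialog,omitempty"`
	RejectCall     bool              `json:"reject_call,omitempty"`
	RejectReason   string            `json:"reject_reason,omitempty"`
}

// CallEndEvent is the JSON body voicetyped POSTs to /on-call-end.
type CallEndEvent struct {
	SessionID        string            `json:"session_id"`
	CallerID         string            `json:"caller_id"`
	DialogName       string            `json:"dialog_name"`
	FinalState       string            `json:"final_state"`
	DurationSeconds  int               `json:"duration_seconds"`
	Reason           string            `json:"reason"`
	Variables        map[string]string `json:"variables"`
	StateTransitions int               `json:"state_transitions"`
}

// createTicket, getTicketStatus, lookupCaller, and logCallRecord are
// placeholder stubs for your own business logic; replace them with real calls.
func createTicket(callerID, issueType string) (string, error) { return "TK-00000", nil }
func getTicketStatus(ticketID string) (string, error) { return "open", nil }
func lookupCaller(callerID string) (string, error) { return "", nil }
func logCallRecord(event CallEndEvent) { log.Printf("call record: %+v", event) }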

func onIntent(w http.ResponseWriter, r *http.Request) {
	var event IntentEvent
	if err := json.NewDecoder(r.Body).Decode(&event); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}

	log.Printf("OnIntent: session=%s state=%s transcript=%q",
		event.SessionID, event.CurrentState, event.Transcript)

	transcript := strings.ToLower(event.Transcript)
	var action DialogAction

	switch {
	case strings.Contains(transcript, "password reset"):
		ticketID, err := createTicket(event.CallerID, "password_reset")
		if err != nil {
			action = DialogAction{
				ResponseText: "I'm sorry, I couldn't create your ticket. " +
					"Let me transfer you to an agent.",
				TransferTo: "sip:[email protected]",
			}
		} else {
			action = DialogAction{
				ResponseText: fmt.Sprintf(
					"I've created a password reset ticket for you. "+
						"Your ticket number is %s. "+
						"You should receive an email shortly.", ticketID),
				Variables: map[string]string{
					"ticket_id":  ticketID,
					"issue_type": "password_reset",
				},
			}
		}

	case strings.Contains(transcript, "check status"):
		ticketID := event.Variables["ticket_id"]
		if ticketID == "" {
			action = DialogAction{
				ResponseText: "I don't have a ticket number. " +
					"Could you please provide your ticket number?",
				NextState: "collect_ticket_number",
			}
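		} else if status, err := getTicketStatus(ticketID); err != nil {
			// Don't read out an empty status if the lookup failed.
			action = DialogAction{
				ResponseText: "I couldn't look up that ticket right now. " +
					"Please try again in a moment.",
			}
		} else {
			action = DialogAction{
				ResponseText: fmt.Sprintf(
					"Your ticket %s is currently %s.", ticketID, status),
			}
		}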

	default:
		action = DialogAction{
			ResponseText: "I understand you need help. " +
				"Could you describe your issue in a few words?",
		}
	}

	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(action); err != nil {
		log.Printf("OnIntent: failed to write response: %v", err)
	}
}

func onCallStart(w http.ResponseWriter, r *http.Request) {
	var event CallStartEvent
	if err := json.NewDecoder(r.Body).Decode(&event); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	log.Printf("Call started: %s from %s", event.SessionID, event.CallerID)

	name, _ := lookupCaller(event.CallerID)

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(CallStartAction{
		Variables: map[string]string{"caller_name": name},
	})
}

func onCallEnd(w http.ResponseWriter, r *http.Request) {
	var event CallEndEvent
	if err := json.NewDecoder(r.Body).Decode(&event); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	log.Printf("Call ended: %s (duration: %ds, reason: %s)",
		event.SessionID, event.DurationSeconds, event.Reason)

	logCallRecord(event)
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/on-intent", onIntent)
	http.HandleFunc("/on-call-start", onCallStart)
	http.HandleFunc("/on-call-end", onCallEnd)

	log.Printf("Dialog hooks server listening on :8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Python

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/on-intent", methods=["POST"])
def on_intent():
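    # Tolerate a missing or malformed body instead of raising a 500.
    event = request.get_json(silent=True) or {}
    transcript = event.get("transcript", "").lower()

    if "password" in transcript:
        # create_ticket is a placeholder for your own ticketing call.
        ticket_id = create_ticket(event.get("caller_id", ""), "password_reset")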
        return jsonify(
            response_text=f"Ticket {ticket_id} created for password reset.",
            variables={"ticket_id": ticket_id},
        )

    if "transfer" in transcript:
        return jsonify(
            response_text="Transferring you now.",
            transfer_to="sip:[email protected]",
        )

    return jsonify(response_text="How can I help you today?")

@app.route("/on-call-start", methods=["POST"])
def on_call_start():
    event = request.get_json()
    caller_name = lookup_caller(event["caller_id"])
    return jsonify(variables={"caller_name": caller_name})

@app.route("/on-call-end", methods=["POST"])
def on_call_end():
    event = request.get_json()
    log_call_record(event)
    return "", 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

OnCallStart

Optional webhook called when a new call is connected. Use it to look up caller information, set initial variables, or log the call start.

Request

{
  "session_id": "call-abc-123",
  "caller_id": "+15551234567",
  "called_number": "+18001234567",
  "dialog_name": "helpdesk",
  "sip_headers": {},
  "timestamp": "2024-01-15T10:30:00Z"
}

Response

{
  "variables": {"caller_name": "John Smith"},
  "override_dialog": "",
  "reject_call": false,
  "reject_reason": ""
}

Field	Type	Description
`variables`	object	Initial variables to set in the dialog context
`override_dialog`	string	Use a different dialog for this call (optional)
`reject_call`	bool	Reject the call (optional)
`reject_reason`	string	Reason for rejection, logged for diagnostics
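The response fields above allow a hook to greet, reroute, or reject a call before the dialog starts. The sketch below shows one way to combine them. `call_start_response`, `blocked_callers`, `caller_names`, and `dialog_by_number` are hypothetical names introduced for illustration; in practice the lookups would come from your own data store.

```python
def call_start_response(event, blocked_callers=(), caller_names=None,
                        dialog_by_number=None):
    """Build an OnCallStart response from simple in-memory lookups."""
    caller = event.get("caller_id", "")

    # Reject early; the reason is only logged for diagnostics.
    if caller in blocked_callers:
        return {"reject_call": True, "reject_reason": "caller is blocklisted"}

    response = {"variables": {}}
    name = (caller_names or {}).get(caller)
    if name:
        response["variables"]["caller_name"] = name

    # Route to a different dialog when the dialed number has its own flow.
    dialog = (dialog_by_number or {}).get(event.get("called_number", ""))
    if dialog and dialog != event.get("dialog_name"):
        response["override_dialog"] = dialog
    return response
```

Optional fields are left out when they do not apply, matching how the examples in this document build their responses.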

OnCallEnd

Optional webhook called when a call terminates. Use it for analytics, logging, or cleanup.

Request

{
  "session_id": "call-abc-123",
  "caller_id": "+15551234567",
  "dialog_name": "helpdesk",
  "final_state": "goodbye",
  "duration_seconds": 142,
  "reason": "caller_hangup",
  "variables": {"ticket_id": "TK-45678", "issue_type": "password_reset"},
  "state_transitions": 5
}

Response

Return an empty 200 OK. No response body is required.
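For analytics, the OnCallEnd payload usually needs to be flattened before it is written to a log or a warehouse table. The following sketch shows one possible shape. `call_record` and `to_json_line` are hypothetical helpers, and the `variables_json` column name is an assumption of this example.

```python
import json


def call_record(event):
    """Flatten an OnCallEnd request into a single row of scalar values."""
    variables = event.get("variables") or {}
    return {
        "session_id": event["session_id"],
        "caller_id": event.get("caller_id", ""),
        "dialog_name": event.get("dialog_name", ""),
        "final_state": event.get("final_state", ""),
        "duration_seconds": int(event.get("duration_seconds", 0)),
        "reason": event.get("reason", ""),
        "state_transitions": int(event.get("state_transitions", 0)),
        # Variables are nested, so store them as a JSON string to keep the row flat.
        "variables_json": json.dumps(variables, sort_keys=True),
    }


def to_json_line(event):
    """Serialize the flattened record as one JSON line, ready to append to a log."""
    return json.dumps(call_record(event), sort_keys=True)
```

The handler can append `to_json_line(event)` to a file or queue and then return the empty 200 response.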

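Registration

Register your hook server as an HTTP service under integration.services in your voicetyped configuration: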

integration:
  services:
    dialog_hooks:
      type: http
      url: https://hook-server.internal:50051/on-intent
      method: POST
      headers:
        Authorization: "Bearer ${WEBHOOK_TOKEN}"
      timeout: 5s
      retry:
        max_attempts: 2
        initial_backoff: 100ms

Then reference it in your dialog YAML:

states:
  process:
    on_enter:
      - action: call_hook
        service: dialog_hooks
        method: OnIntent
        payload:
          transcript: "{{ .Event.Transcript }}"
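The service registration above attaches an Authorization header with a bearer token to every request. On the receiving side, the hook server can check that header before doing any work. This is a minimal sketch: `is_authorized` is a hypothetical helper, and reading the expected token from the same `WEBHOOK_TOKEN` value on the server is an assumption.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True only when the header carries 'Bearer <expected_token>'."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest runs in constant time, so timing does not reveal
    # how much of the token matched.
    return hmac.compare_digest(
        token.strip().encode("utf-8"), expected_token.encode("utf-8")
    )
```

A handler would call `is_authorized(request.headers.get("Authorization"), token)` first and reject the request (for example with a 401) when it returns False.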

Best Practices

Keep hooks fast — target < 200ms response time. The caller is waiting in silence while voicetyped waits for your webhook response.
Return proper status codes — return 200 for success. voicetyped treats 4xx and 5xx responses as errors and will retry or fall back based on your retry configuration.
Handle errors gracefully — always return a JSON DialogAction even on internal errors. Use response_text to inform the caller and transfer_to as a fallback.
Authenticate requests — use the Authorization header (configured in registration) to verify that requests originate from your voicetyped instance. Reject requests without a valid token.
Use variables — store context in dialog variables rather than external state. Variables survive state transitions and are available in templates.
Set Content-Type — always return Content-Type: application/json. voicetyped will reject responses with other content types.
Log everything — use the /on-call-end webhook to record call analytics in your data warehouse.
Test with the Speech API — use the Speech API to simulate calls without SIP infrastructure.
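The first three practices (respond quickly, return a usable status, and degrade gracefully) can be combined in one wrapper around the business logic. The sketch below is one possible approach, not a voicetyped feature: `safe_action` is a hypothetical helper, the 0.2 second default mirrors the 200ms target above, and the fallback wording is illustrative.

```python
import concurrent.futures

# Shared worker pool; handlers run here so the caller can stop waiting on them.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def safe_action(handler, event, budget_seconds=0.2, fallback_transfer=""):
    """Run handler(event) within a time budget.

    Returns the handler's action on success. If the handler raises or
    exceeds the budget, returns a fallback action instead, optionally
    transferring the caller to fallback_transfer.
    """
    future = _POOL.submit(handler, event)
    try:
        return future.result(timeout=budget_seconds)
    except concurrent.futures.TimeoutError:
        # The handler keeps running in its worker thread; only the wait stops.
        pass
    except Exception:
        # Any handler failure falls through to the fallback below.
        pass

    action = {"response_text": "I'm sorry, something went wrong on our side."}
    if fallback_transfer:
        action["response_text"] += " Let me transfer you to an agent."
        action["transfer_to"] = fallback_transfer
    return action
```

Because a timed-out handler is not interrupted, side effects such as ticket creation may still complete after the fallback has been returned. Handlers with side effects should be written with that in mind.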

Next Steps

Integration Gateway — configure service connections
Conversation Runtime — build dialog state machines
Call Event Stream API — observe call events in real-time