This commit is contained in:
Dave Horton
2021-04-21 21:21:54 -04:00
parent 86c1d96b42
commit da9ba94eff
24 changed files with 1102 additions and 23 deletions

View File

@@ -2,6 +2,71 @@ root:
link: /docs/
label: For Developers
navi:
-
path: webhooks
title: Using webhooks
pages:
-
path: overview
title: Overview
-
path: conference
title: conference
-
path: dequeue
title: dequeue
-
path: dial
title: dial
-
path: dialogflow
title: dialogflow
-
path: enqueue
title: enqueue
-
path: gather
title: gather
-
path: hangup
title: hangup
-
path: leave
title: leave
-
path: lex
title: lex
-
path: listen
title: listen
-
path: pause
title: pause
-
path: play
title: play
-
path: redirect
title: redirect
-
path: say
title: say
-
path: tag
title: tag
-
path: transcribe
title: transcribe
-
path: rest
title: API Reference
pages:
-
path: overview
title: Overview
-
path: applications
title: Applications
-
path: getting-started
title: Getting Started
@@ -21,16 +86,6 @@ navi:
-
path: register-sip-client
title: Register SIP clients
-
path: api
title: API Reference
pages:
-
path: webhooks
title: Webhooks
-
path: rest
title: REST APIs
-
path: tutorials
title: Tutorials

View File

@@ -1,10 +1,10 @@
# Webhooks
jambonz uses JSON payloads in HTTP messages to control calls.
jambonz uses JSON payloads that are exchanged in HTTP messages to control calls.
When an incoming call for your account is received, jambonz makes an HTTP request to the URL that you have configured and you return a JSON body in your response indicating how you want the call handled.
When an incoming call for your account is received, jambonz makes an HTTP request to a URL that you have configured and your response contains a JSON body that indicates how you want the call handled.
When you want to launch an outbound call via the [REST API](/docs/api/rest) it works similarly: you make an HTTP request and in it you provide a web callback url that will be invoked once the call is answered. In your response to that request you will then provide your call handling instructions.
When you want to launch an outbound call via the [REST API](/docs/api/rest) it works similarly: you make an HTTP request and in it you provide a web callback url that will be invoked once the call is answered. In your response to that request you then provide your call handling instructions.
## Basic JSON message structure
The JSON payload that you provide in response to a callback must be an array with each item describing a task that the platform shall perform. These tasks are executed sequentially in the order they appear in the array. Each task is identified by a verb (e.g. "dial", "gather", "hangup" etc) with associated detail and these verbs are described in more detail below.
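As a minimal sketch of this structure, a webhook handler could build its response as a plain array of verb objects. The `buildGreetingApp` helper, the `/collect` path, and the prompt text are illustrative assumptions; the `gather` and `hangup` verbs and their properties are the ones documented in these pages.

```js
// Hypothetical helper: builds the JSON array returned from a webhook.
// Tasks run in array order: first the gather, then the hangup.
function buildGreetingApp(actionHook) {
  return [
    {
      verb: 'gather',
      actionHook,
      input: ['digits'],
      numDigits: 1,
      timeout: 8,
      say: {text: 'To speak to Sales press 1. To speak to customer support press 2.'}
    },
    {verb: 'hangup'}
  ];
}

const app = buildGreetingApp('/collect');
console.log(JSON.stringify(app, null, 2));
```

The serialized array is what your HTTP response body would contain.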

View File

@@ -1,13 +1,12 @@
# jambonz
## Welcome, developers!
jambonz is a CPaaS that is designed for communications service providers. As an API-driven platform, you will primarily interface with it using [Webhooks](/docs/api/webhooks) and [REST APIs](/docs/api/rest).
jambonz is an open source CPaaS platform that is primarily designed for use by communications service providers. As an API-driven platform, you will primarily interface with it using [Webhooks](/docs/webhooks/overview/) and [REST APIs](/docs/rest/overview/). Our client SDKs include a [Node.js SDK]() as well as [Node-RED plugins]().
jambonz is available for use both as cloud APIs, or as an open source platform that you can run on your own infrastructure. Either way, your applications are written in the same fashion, so you can start off by using the cloud APIs and later migrate to running your own platform if you like.
There are two ways to get started with jambonz:
- create a free account on our hosted platform, or
- download and install a jambonz system on your own infrastructure.
A good idea is to start by creating a free account on the hosted platform. Then, based on your testing and traffic needs, you can bring up a jambonz cluster on your own infrastructure and migrate your applications seamlessly.
jambonz is also a "Bring Your Own Everything" (BYOE) CPaaS, meaning that you will [plug in your own SIP trunking providers](), and [use your own AWS or Google credentials]() for speech processing.
In these pages you will find information about our SDKs and APIs, along with some useful tips for [Getting Started]() on the hosted platform, [tutorials]() describing how to perform common tasks, as well as some quick [how-to videos]().
Follow the [Getting Started]() pages that follow to get yourself up and running on the cloud platform, or dive into the [API Reference]() or examine [client SDKs]() and [sample applications]() for inspiration.
```javascript
const foo = "bar";
```
As always, if you can't find the information you are looking for, please email us at support@jambonz.org and we'll be glad to help!

View File

@@ -0,0 +1 @@
# Applications

22
docs/rest/overview.md Normal file
View File

@@ -0,0 +1,22 @@
# Overview
The jambonz REST API allows applications to query, create, and manage calls and other resources.
**Base URL**
All calls should use the following base URL:
```
https://{serviceUrl}/v1
```
where serviceUrl is set according to your own installation.
**Authentication**
The REST API uses HTTP Bearer Authentication, which requires that you include an HTTP Authorization header containing a valid API token.
**Dates and Times**
All dates and times are UTC, using RFC 2822 format.
**Phone Numbers**
All phone numbers are in E.164 format, starting with a plus sign ("+") and the country code.
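The conventions above can be sketched as two small helpers. This is only an illustration: the host name, the token value, and the `Applications` resource path are assumptions, and the E.164 check is a simple shape test (plus sign, then up to 15 digits), not a full validator.

```js
// Hypothetical helper: assembles the URL and headers for a REST call.
function buildRequest(serviceUrl, apiToken, resource) {
  return {
    url: `https://${serviceUrl}/v1/${resource}`,
    headers: {
      // HTTP Bearer Authentication: the token goes in the Authorization header.
      Authorization: `Bearer ${apiToken}`,
      'Content-Type': 'application/json'
    }
  };
}

// Hypothetical helper: checks that a number looks like E.164 ("+" then digits).
function isE164(number) {
  return /^\+[1-9]\d{1,14}$/.test(number);
}

const req = buildRequest('jambonz.example.com', 'my-api-token', 'Applications');
console.log(req.url, isE164('+16173331212'));
```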

View File

@@ -0,0 +1,50 @@
# conference
The `conference` verb places a call into a conference.
```json
{
  "verb": "conference",
  "name": "test",
  "beep": true,
  "startConferenceOnEnter": false,
  "waitHook": "/confWait",
  "enterHook": "/confEnter"
}
```
You can use the following attributes in the `conference` command:
| option | description | required |
| ------------- |-------------| -----|
| actionHook | A webhook to call when the conference ends | no |
| beep | if true, play a beep tone to the conference when caller enters (default: false) | no |
| endConferenceOnExit | if true, end the conference when this caller hangs up (default: false) | no |
| enterHook | A webhook to retrieve something to play or say to the caller just before they are put into a conference after waiting for it to start| no |
| maxParticipants | maximum number of participants that will be allowed in the conference | no |
| name | name of the conference | yes |
| startConferenceOnEnter | if true, start the conference only when this caller enters (default: true) | no |
| statusHook | A webhook to call with conference status events | no |
| statusEvents | An array of events for which the statusHook should be called. See below for details. | no |
| waitHook | A webhook to retrieve commands to play or say while the caller is waiting for the conference to start | no |
Conference status events:
- 'start': the conference has started
- 'end': the conference has ended
- 'join': a participant has joined the conference
- 'leave': a participant has left the conference
- 'start-talking': a participant started speaking
- 'end-talking': a participant stopped talking
Conference status webhooks will contain the following additional parameters:
- conferenceSid: a unique identifier for the conference
- friendlyName: the name of the conference as specified in the application
- event: the conference event being reported (e.g. "join")
- time: the time of the event in ISO format (e.g. "2020-04-27T13:44:17.336Z")
- members: the current number of members in the conference
- duration: the current length of the conference in seconds
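A statusHook handler might process these parameters along the following lines. This is a sketch only: the `describeConferenceEvent` name and the sample payload values are assumptions, while the field and event names are the ones listed above.

```js
// Hypothetical statusHook handler: turns a conference status payload
// into a one-line log message.
function describeConferenceEvent(payload) {
  const {event, friendlyName, members, duration, time} = payload;
  switch (event) {
    case 'start':
    case 'end':
      return `[${time}] ${friendlyName}: conference ${event} after ${duration}s`;
    case 'join':
    case 'leave':
      return `[${time}] ${friendlyName}: participant ${event}, ${members} now present`;
    case 'start-talking':
    case 'end-talking':
      return `[${time}] ${friendlyName}: ${event}`;
    default:
      return `[${time}] ${friendlyName}: unhandled event ${event}`;
  }
}

const message = describeConferenceEvent({
  conferenceSid: 'example-sid',
  friendlyName: 'test',
  event: 'join',
  time: '2020-04-27T13:44:17.336Z',
  members: 2,
  duration: 35
});
console.log(message);
```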
<p>
<a href="/docs/webhooks/overview" style="float: left;">Prev: Overview</a>
<a href="/docs/webhooks/dequeue" style="float: right;">Next: dequeue</a>
</p>

33
docs/webhooks/dequeue.md Normal file
View File

@@ -0,0 +1,33 @@
# dequeue
The `dequeue` verb removes a call from the front of a specified queue and bridges that call to the current caller.
```json
{
  "verb": "dequeue",
  "name": "support",
  "beep": true,
  "timeout": 60
}
```
You can use the following options in the `dequeue` command:
| option | description | required |
| ------------- |-------------| -----|
| name | name of the queue | yes |
| actionHook | A webhook to invoke when the call ends. If no webhook is provided, execution will continue with the next verb in the current application. <br/>See below for specified request parameters.| no |
| beep | if true, play a beep tone to this caller only just prior to connecting the queued call; this provides an auditory cue that the call is now connected | no |
| confirmHook | A webhook for an application to run on the callee's end before the call is bridged. This will allow the application to play an informative message to a caller as they leave the queue (e.g. "your call may be recorded") | no |
| timeout | number of seconds to wait on an empty queue before returning (default: wait forever) | no |
The *actionHook* webhook will contain a `dequeueResult` property indicating the completion reason:
- 'hangup' - the bridged call was abandoned while listening to the confirmHook message
- 'complete' - the call was successfully bridged and ended with a caller hangup
- 'timeout' - no call appeared in the named queue during the timeout interval
- 'error' - a system error of some kind occurred
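One way an actionHook handler could branch on `dequeueResult` is sketched below. The handler name and prompt text are assumptions, as is the top-level `say` verb with a `text` property (it mirrors the nested `say` used in the `gather` example).

```js
// Hypothetical actionHook handler for the dequeue verb.
// Returns the next application to run, based on the completion reason.
function handleDequeueAction(payload) {
  switch (payload.dequeueResult) {
    case 'timeout':
      // No call appeared in the queue within the timeout interval.
      return [{verb: 'say', text: 'There are no callers waiting.'}, {verb: 'hangup'}];
    case 'complete':
    case 'hangup':
      // The bridged call ended, or was abandoned during the confirmHook message.
      return [{verb: 'hangup'}];
    default:
      // 'error' or anything unexpected.
      return [{verb: 'say', text: 'Sorry, something went wrong.'}, {verb: 'hangup'}];
  }
}

const next = handleDequeueAction({dequeueResult: 'timeout'});
console.log(JSON.stringify(next));
```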
<p>
<a href="/docs/webhooks/conference" style="float: left;">Prev: conference</a>
<a href="/docs/webhooks/dial" style="float: right;">Next: dial</a>
</p>

113
docs/webhooks/dial.md Normal file
View File

@@ -0,0 +1,113 @@
# dial
The `dial` verb is used to create a new call by dialing out to a telephone number, a registered sip user, a sip uri, or a Microsoft Teams user.
```json
{
  "verb": "dial",
  "actionHook": "/outdial",
  "callerId": "+16173331212",
  "answerOnBridge": true,
  "dtmfCapture": ["*2", "*3"],
  "dtmfHook": {
    "url": "/dtmf",
    "method": "GET"
  },
  "target": [
    {
      "type": "phone",
      "number": "+15083084809"
    },
    {
      "type": "sip",
      "sipUri": "sip:1617333456@sip.trunk1.com",
      "auth": {
        "user": "foo",
        "password": "bar"
      }
    },
    {
      "type": "user",
      "name": "spike@sip.example.com"
    }
  ]
}
```
As the example above illustrates, when you execute the dial command you are making one or more outbound call attempts in an effort to create one new call, which is bridged to a parent call. The `target` property specifies an array of call destinations that will be attempted simultaneously.
If multiple endpoints are specified in the `target` array, all targets are outdialed at the same time (e.g., "simring", or "blast outdial" as some folks call it) and the call will be connected to the first endpoint that answers the call (and, optionally, completes a call screening application as specified in the `confirmHook` property).
There are several types of endpoints:
* a telephone number that can be reached via your Carrier,
* a webrtc or sip client that has registered directly with your application,
* a sip endpoint, identified by a sip uri (and possibly authentication parameters), or
* a Microsoft Teams user.
You can use the following attributes in the `dial` command:
| option | description | required |
| ------------- |-------------| -----|
| actionHook | webhook to invoke when the call ends. | no |
| answerOnBridge | If set to true, the inbound call will ring until the number that was dialed answers the call, and at that point a 200 OK will be sent on the inbound leg. If false, the inbound call will be answered immediately as the outbound call is placed. <br/>Defaults to false. | no |
| callerId | The inbound caller's phone number, which is displayed to the number that was dialed. The caller ID must be a valid E.164 number. <br/>Defaults to caller id on inbound call. | no |
| confirmHook | webhook for an application to run on the callee's end after the dialed number answers but before the call is connected. This allows the application to give the called party information about the call, and the opportunity to decline it, before the calls are connected. Note that if you want to run different applications on specific destinations, you can specify the `confirmHook` property on the nested [target](#target-types) object. | no |
| dialMusic | url that specifies a .wav or .mp3 audio file of custom audio or ringback to play to the caller while the outbound call is ringing. | no |
| dtmfCapture | an array of strings that represent dtmf sequences which, when detected, will trigger a mid-call notification to the application via the configured `dtmfHook` | no |
| dtmfHook | a webhook to call when a dtmfCapture entry is matched. This is a notification only -- no response is expected, and any desired actions must be carried out via the REST updateCall API. | no|
| headers | an object containing arbitrary sip headers to apply to the outbound call attempt(s) | no |
| listen | a nested [listen](#listen) action, which will cause audio from the call to be streamed to a remote server over a websocket connection | no |
| target | array of up to 10 [destinations](#target-types) to simultaneously dial. The first person (or entity) to answer the call will be connected to the caller and the rest of the called numbers will be hung up.| yes |
| timeLimit | max length of call in seconds | no |
| timeout | ring no answer timeout, in seconds. <br/>Defaults to 60. | no |
| transcribe | a nested [transcribe](#transcribe) action, which will cause the call to be transcribed | no |
##### target types
*PSTN number*
| option | description | required |
| ------------- |-------------| -----|
| type | must be "phone" | yes |
| confirmHook | A webhook for an application to run on the callee's end after the dialed number answers but before the call is connected. This will override the confirmHook property set on the parent dial verb, if any.| no |
| number | a telephone number in E.164 format | yes |
*sip endpoint*
| option | description | required |
| ------------- |-------------| -----|
| type | must be "sip" | yes |
| confirmHook | A webhook for an application to run on the callee's end after the dialed number answers but before the call is connected. This will override the confirmHook property set on the parent dial verb, if any.| no |
| sipUri | sip uri to send call to | yes |
| auth | authentication credentials | no |
| auth.user | sip username | no |
| auth.password | sip password | no |
Using this approach, it is possible to send calls out a sip trunk. If the sip trunking provider enforces username/password authentication, supply the credentials in the `auth` property.
*a registered webrtc or sip user*
| option | description | required |
| ------------- |-------------| -----|
| type | must be "user" | yes |
| confirmHook | A webhook for an application to run on the callee's end after the dialed number answers but before the call is connected. This will override the confirmHook property set on the parent dial verb, if any.| no |
| name | registered sip user, including domain (e.g. "joeb@sip.jambonz.org") | yes |
*Microsoft Teams user*
If Microsoft Teams integration has been configured, you can dial out to Teams users.
| option | description | required |
| ------------- |-------------| -----|
| type | must be "teams" | yes |
| tenant | Microsoft Teams customer tenant domain name. Will default to the Microsoft Teams tenant associated with the account of the calling party. | no |
| number | the phone number that has been mapped to the teams user by the Microsoft Teams administrator | yes |
| voicemail | if true, dial directly into user's voicemail to leave a message | no |
The `confirmHook` property that can be optionally specified as part of the target types is a web callback that will be invoked when the outdial call is answered. That callback should return an application that will run on the outbound call before bridging it to the inbound call. If the application completes with the outbound call still in a stable/connected state, then the two calls will be bridged together.
This allows you to easily implement call screening applications (e.g. "You have a call from so-and-so. Press 1 to decline").
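A call screening flow of this kind could be sketched as two handlers: one for the `confirmHook` and one for the `actionHook` of the `gather` it returns. All function names, paths, and prompt text here are assumptions, and the top-level `say` verb with a `text` property is assumed to behave like the nested `say` shown in the `gather` example.

```js
// Hypothetical confirmHook handler: returns the screening prompt that runs
// on the answered outbound leg before the calls are bridged.
function buildScreeningApp(callerName, actionHook) {
  return [
    {
      verb: 'gather',
      actionHook,
      input: ['digits'],
      numDigits: 1,
      timeout: 8,
      say: {text: `You have a call from ${callerName}. Press 1 to decline.`}
    }
  ];
}

// Hypothetical actionHook handler for the gather above.
function handleScreeningResult(payload) {
  if (payload.digits === '1') {
    // Declined: hang up the outbound leg so it is never bridged.
    return [{verb: 'hangup'}];
  }
  // Accepted (or no input): finish with the call still connected,
  // after which the two calls are bridged.
  return [{verb: 'say', text: 'Connecting you now.'}];
}

const screening = buildScreeningApp('so-and-so', '/screen-result');
console.log(JSON.stringify(screening), JSON.stringify(handleScreeningResult({digits: '1'})));
```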
<p>
<a href="/docs/webhooks/dequeue" style="float: left;">Prev: dequeue</a>
<a href="/docs/webhooks/dialogflow" style="float: right;">Next: dialogflow</a>
</p>

View File

@@ -0,0 +1,88 @@
# dialogflow
The `dialogflow` verb is used to connect a call to a [Google Dialogflow](https://cloud.Google.com/dialogflow) bot.
```json
{
  "verb": "dialogflow",
  "project": "ai-in-rtc-drachtio-tsjjpn",
  "lang": "en-US",
  "credentials": "{\"type\": \"service_account\",\"project_id\": \"prj..",
  "welcomeEvent": "welcome",
  "eventHook": "/dialogflow-event",
  "actionHook": "/dialogflow-action"
}
```
You can use the following options in the `dialogflow` verb:
| option | description | required |
| ------------- |-------------| -----|
| project | the Google dialogflow project id | yes |
| lang | language to use for speech recognition | yes |
| credentials | the service account key in JSON string form that is used to authenticate to dialogflow | yes |
| welcomeEvent | An event to send to dialogflow when first connecting; e.g. to trigger a welcome prompt | no |
| welcomeEventParams | An object containing parameters to send with the welcome event | no |
| noInputTimeout | Number of seconds of no speech detected after which to reprompt | no |
| noInputEvent | Name of dialogflow event to send in query when no input timeout expires | no |
| passDtmfAsTextInput | If true, pass user dtmf entries as text inputs to the dialogflow bot | no |
| thinkingMusic | A url to a .wav or .mp3 file to play as filler music while the dialogflow back-end is executing | no |
| actionHook | A webhook to invoke when the operation completes.<br/>See below for specified request parameters.| no |
| eventHook | A webhook to invoke when a dialogflow event occurs, such as an intent being detected or a speech transcription being returned. <br/>The response to the event hook may contain a new jambonz application to execute| no|
| tts | if provided, audio prompts will be played using text-to-speech rather than the dialogflow-provided audio clips | no |
| tts.vendor | speech vendor to use: Google, aws (alias: polly), or default (for application default) | no |
| bargein | if true, kill playback immediately when user begins speaking | no|
| tts.language | language code to use. | yes |
| tts.gender | (Google only) MALE, FEMALE, or NEUTRAL. | no |
| tts.voice | voice to use. Note that the voice list differs whether you are using aws or Google. Defaults to application setting, if provided. | no |
The *actionHook* webhook will contain the following additional parameters:
- `dialogflowResult`: the completion reason:
    - `redirect` - a new application was returned from an event webhook
    - `completed` - an intent with `end interaction` set to true was received from dialogflow

The *eventHook* webhook will contain two parameters: `event` and `data`. The `event` parameter identifies the specific event and the `data` parameter is an object containing event data associated with the event. The following events are supported:
- `intent`: dialogflow detected an intent
- `transcription`: a speech transcription was returned from dialogflow
- `dtmf`: a dtmf key was pressed by the caller
- `start-play`: an audio segment returned from dialogflow started to play
- `stop-play`: an audio segment returned from dialogflow finished playing
- `no-input`: the no input timer elapsed with no input detected from the caller
Please refer to [this tutorial](/tutorials/#building-voicebots-using-jambonz-and-dialogflow) for a detailed example.
### call transfer in Dialogflow
Call transfer from a dialogflow bot is achieved by responding to an eventHook with event `intent` by returning a new jambonz application containing a [dial](#dial) verb. Of course, this should only be done if the intent is signaling a request for a call transfer.
Indicating a desire to transfer the call to a live agent can be done in a couple of different ways in the dialogflow editor:
1. By adding a Dialogflow Phone Gateway Response to the intent, with a Transfer Call action.
1. By adding a custom payload in a response to the intent, with arbitrary JSON content that you define and which should include the telephone number (or registered user, or sip endpoint) to transfer to.
> Note: option 1 only works when transferring to a US number, because the dialogflow editor only accepts US destinations. To transfer to non-US destinations, use option 2.
In either case, your application is responsible for having an eventHook that parses the intent (found in the `data` property of the webhook content) in order to check if call transfer is being requested, and if so responding with a new jambonz application.
For instance, when the Dialogflow Phone Gateway Response is used (option 1 above), the code snippet below shows where to find the transfer number in the intent data provided in the eventHook.
```js
const evt = req.body;
if (evt.event === 'intent') {
  const qo = evt.data.query_result;
  const transfer = qo.fulfillment_messages.find((fm) => {
    return fm.platform === 'TELEPHONY' && fm.telephony_transfer_call;
  });
  if (transfer) {
    // a transfer has been requested
    // transfer.telephony_transfer_call.phone_number has the phone number to transfer to
  }
}
```
Please refer to [this tutorial](/tutorials/#dialogflow-part-2-adding-call-transfer-functionality) for a detailed example.
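To round out the snippet above, the sketch below shows one way the eventHook could turn a detected transfer request into a new application containing a `dial` verb. The function names and the `/transfer-done` path are assumptions, and returning `null` to mean "no new application" is a convention of this sketch only.

```js
// Hypothetical helper: builds a dial application for the transfer target.
function buildTransferApp(phoneNumber, actionHook) {
  return [
    {
      verb: 'dial',
      actionHook,
      answerOnBridge: true,
      target: [{type: 'phone', number: phoneNumber}]
    }
  ];
}

// Hypothetical eventHook handler: returns a new application when the intent
// carries a Dialogflow Phone Gateway transfer, otherwise null.
function handleDialogflowEvent(evt) {
  if (evt.event !== 'intent') return null;
  const qo = evt.data.query_result || {};
  const messages = qo.fulfillment_messages || [];
  const transfer = messages.find((fm) => fm.platform === 'TELEPHONY' && fm.telephony_transfer_call);
  if (!transfer) return null;
  return buildTransferApp(transfer.telephony_transfer_call.phone_number, '/transfer-done');
}

const transferApp = handleDialogflowEvent({
  event: 'intent',
  data: {
    query_result: {
      fulfillment_messages: [
        {platform: 'TELEPHONY', telephony_transfer_call: {phone_number: '+15083084809'}}
      ]
    }
  }
});
console.log(JSON.stringify(transferApp));
```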
<p>
<a href="/docs/webhooks/dial" style="float: left;">Prev: dial</a>
<a href="/docs/webhooks/enqueue" style="float: right;">Next: enqueue</a>
</p>

41
docs/webhooks/enqueue.md Normal file
View File

@@ -0,0 +1,41 @@
# enqueue
The `enqueue` command is used to place a caller in a queue.
```json
{
  "verb": "enqueue",
  "name": "support",
  "actionHook": "/queue-action",
  "waitHook": "/queue-wait"
}
```
You can use the following options in the `enqueue` command:
| option | description | required |
| ------------- |-------------| -----|
| name | name of the queue | yes |
| actionHook | A webhook to invoke when the operation completes. <br/>If a call is dequeued through the `leave` verb, the webhook is immediately invoked. <br/>If the call has been bridged to another party via the `dequeue` verb, then the webhook is invoked after both parties have disconnected. <br/>If no webhook is provided, execution will continue with the next verb in the current application. <br/>See below for specified request parameters.| no |
| waitHook | A webhook to invoke while the caller is in queue. The only allowed verbs in the application returned from this webhook are `say`, `play`, `pause`, and `leave`. <br/>See below for additional request parameters.| no|
The *actionHook* webhook will contain the following additional parameters:
- `queueSid`: the unique identifier for the queue
- `queueResult`: the completion reason:
    - 'hangup' - the call was abandoned while in queue
    - 'leave' - a `leave` verb caused the call to exit the queue
    - 'bridged' - a `dequeue` verb caused the call to be bridged to another call
    - 'error' - a system error of some kind occurred
- `queueTime`: the number of seconds the call spent in queue
The *waitHook* webhook will contain the following additional parameters:
- `queueSid`: the unique identifier for the queue
- `queuePosition`: the current zero-based position in the queue
- `queueTime`: the current number of seconds the call has spent in queue
- `queueSize`: the current number of calls in the queue
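A waitHook handler could use these parameters to tell callers where they stand, as in the sketch below. The handler name, the five-minute limit, and the prompt text are assumptions; the top-level `say` verb with a `text` property is also assumed. Only `say` and `leave` are returned, both of which are on the allowed list.

```js
// Hypothetical waitHook handler: announces queue position, or removes the
// caller from the queue once they have waited too long.
function handleQueueWait(payload, maxWaitSeconds = 300) {
  const {queuePosition, queueSize, queueTime} = payload;
  if (queueTime > maxWaitSeconds) {
    return [{verb: 'leave'}];
  }
  // queuePosition is zero-based, so it equals the number of callers ahead.
  const text = queuePosition === 0
    ? 'You are next in line.'
    : `There are ${queuePosition} callers ahead of you out of ${queueSize} waiting.`;
  return [{verb: 'say', text}];
}

const waitApp = handleQueueWait({queueSid: 'example-sid', queuePosition: 2, queueSize: 5, queueTime: 40});
console.log(JSON.stringify(waitApp));
```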
<p>
<a href="/docs/webhooks/dialogflow" style="float: left;">Prev: dialogflow</a>
<a href="/docs/webhooks/gather" style="float: right;">Next: gather</a>
</p>

68
docs/webhooks/gather.md Normal file
View File

@@ -0,0 +1,68 @@
# gather
The `gather` command is used to collect dtmf or speech input.
```json
{
  "verb": "gather",
  "actionHook": "http://example.com/collect",
  "input": ["digits", "speech"],
  "finishOnKey": "#",
  "numDigits": 5,
  "timeout": 8,
  "recognizer": {
    "vendor": "Google",
    "language": "en-US"
  },
  "say": {
    "text": "To speak to Sales press 1. To speak to customer support press 2.",
    "synthesizer": {
      "vendor": "Google",
      "language": "en-US"
    }
  }
}
```
You can use the following options in the `gather` command:
| option | description | required |
| ------------- |-------------| -----|
| actionHook | webhook POST to invoke with the collected digits or speech. The payload will include a 'speech' or 'digits' property along with the standard attributes. See below for more detail.| yes |
| finishOnKey | dtmf key that signals the end of input | no |
| input | array, specifying allowed types of input: ['digits'], ['speech'], or ['digits', 'speech']. Default: ['digits'] | no |
| numDigits | number of dtmf digits expected to gather | no |
| partialResultHook | webhook to send interim transcription results to. Partial transcriptions are only generated if this property is set. | no |
| play | nested [play](#play) command that can be used to prompt the user | no |
| recognizer.hints | array of words or phrases to assist speech detection | no |
| recognizer.language | language code to use for speech detection. Defaults to the application level setting, or 'en-US' if not set | no |
| recognizer.profanityFilter | if true, filter profanity from speech transcription. Default: false | no |
| recognizer.vendor | speech vendor to use (currently only Google supported) | no |
| say | nested [say](#say) command that can be used to prompt the user | no |
| timeout | The number of seconds of silence or inaction that denote the end of caller input. The timeout timer will begin after any nested play or say command completes. Defaults to 5 | no |
In the case of speech input, the actionHook payload will include a `speech` object with the response from Google speech:
```json
"speech": {
  "stability": 0,
  "is_final": true,
  "alternatives": [{
    "confidence": 0.858155,
    "transcript": "sales please"
  }]
}
```
In the case of digits input, the payload will simply include a `digits` property indicating the dtmf keys pressed:
```json
"digits": "0276"
```
**Note**: an HTTP POST will be used for both the `actionHook` and the `partialResultHook` since the body may need to contain nested JSON objects for speech details.
**Note**: the `partialResultHook` webhook should not return content; any returned content will be discarded.
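An actionHook handler has to cope with either payload shape. A minimal sketch, assuming a hypothetical `extractGatherInput` helper and the `speech` and `digits` properties shown in the examples above:

```js
// Hypothetical helper: normalizes the gather result into one shape,
// whether the caller pressed keys or spoke.
function extractGatherInput(payload) {
  if (typeof payload.digits === 'string') {
    return {type: 'digits', value: payload.digits};
  }
  const alternatives = (payload.speech && payload.speech.alternatives) || [];
  if (alternatives.length > 0) {
    // Use the first alternative returned by the recognizer.
    const {transcript, confidence} = alternatives[0];
    return {type: 'speech', value: transcript, confidence};
  }
  return {type: 'none', value: null};
}

const spoken = extractGatherInput({
  speech: {
    stability: 0,
    is_final: true,
    alternatives: [{confidence: 0.858155, transcript: 'sales please'}]
  }
});
const keyed = extractGatherInput({digits: '0276'});
console.log(spoken, keyed);
```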
<p>
<a href="/docs/webhooks/enqueue" style="float: left;">Prev: enqueue</a>
<a href="/docs/webhooks/hangup" style="float: right;">Next: hangup</a>
</p>

22
docs/webhooks/hangup.md Normal file
View File

@@ -0,0 +1,22 @@
# hangup
The `hangup` command terminates the call and ends the application.
```json
{
  "verb": "hangup",
  "headers": {
    "X-Reason" : "maximum call duration exceeded"
  }
}
```
You can use the following options in the `hangup` action:
| option | description | required |
| ------------- |-------------| -----|
| headers | an object containing SIP headers to include in the BYE request | no |
<p>
<a href="/docs/webhooks/gather" style="float: left;">Prev: gather</a>
<a href="/docs/webhooks/leave" style="float: right;">Next: leave</a>
</p>

16
docs/webhooks/leave.md Normal file
View File

@@ -0,0 +1,16 @@
# leave
The `leave` verb transfers a call out of a queue. The call then returns to the flow of execution following the [enqueue](#enqueue) verb that parked the call, or the document returned by that verb's *actionHook* property, if provided.
```json
{
  "verb": "leave"
}
```
There are no options for the `leave` verb.
<p>
<a href="/docs/webhooks/hangup" style="float: left;">Prev: hangup</a>
<a href="/docs/webhooks/lex" style="float: right;">Next: lex</a>
</p>

81
docs/webhooks/lex.md Normal file
View File

@@ -0,0 +1,81 @@
# lex
The `lex` verb connects a call to an [Amazon Lex](https://aws.amazon.com/lex/) V2 bot.
```json
{
  "verb": "lex",
  "botId": "MTLNerCD9L",
  "botAliasId": "z5yY1iYykE",
  "region": "us-east-1",
  "locale": "en_US",
  "credentials": {
    "accessKey": "XXXX",
    "secretAccessKey": "YYYY"
  },
  "passDtmf": true,
  "intent": "BookHotel",
  "metadata": {
    "slots": {
      "Location": "Los Angeles"
    },
    "context": {
      "callerId": "+15083084909",
      "customerName": "abc company"
    }
  },
  "tts": {
    "vendor": "Google",
    "language": "en-US",
    "voice": "en-US-Wavenet-C"
  },
  "eventHook": "/lex-events"
}
```
The following features are supported:
- optionally specify an initial, or "welcome" intent,
- pre-fill slot values for the initial intent,
- provide text for a spoken welcome message at the start of the conversation,
- play lex-generated audio, or use text-to-speech with either AWS/Polly or Google voices,
- receive real-time notifications of intents and transcriptions as the conversation progresses, and
- provide arbitrary context data to the lex backend to help guide the flow.
You can use the following options in the `lex` verb:
| option | description | required |
| ------------- |-------------| -----|
| botId | Lex bot ID | yes |
| botAliasId | Lex bot alias ID | yes |
| credentials | AWS credentials | yes |
| credentials.accessKey | AWS access key id | yes |
| credentials.secretAccessKey | AWS secret access key | yes |
| region | AWS region bot is running in | yes |
| locale | language code of speaker (currently supported languages:<br/> en_AU, en_GB, en_US, fr_CA, fr_FR, es_ES, es_US, it_IT)| yes |
| eventHook | A webhook to invoke when a Lex event occurs (e.g. intent detected, transcription, etc.) | no |
| intent | initial intent to trigger (i.e. "welcome intent") | no |
| welcomeMessage | text for a welcome message to play to the caller | no |
| noInputTimeout | timeout in milliseconds that Lex will wait for a response before triggering the fallback intent | no |
| tts | if provided, playback will use text-to-speech rather than Lex-generated audio | no |
| tts.vendor | 'aws', 'Google', or 'default' | no |
| tts.language | language identifier or 'default' | no |
| tts.voice | voice identifier or 'default' | no |
| metadata | initial slot values and context data | no |
| metadata.slots | key-value pairs for slot names and initial values to be pre-filled | no |
| metadata.context | key-value pairs for context data to pass to Lex bot | no |
| metadata['x-amz-lex:channels:platform'] | name of voice platform | no |
The *eventHook* webhook will contain two parameters: `event` and `data`. The `event` parameter identifies the specific event and the `data` parameter is an object containing event data associated with the event. The following events are supported:
- `intent`: Lex detected an intent
- `transcription`: a speech transcription was returned from Lex
- `response-text`: Lex returned a response in the form of text
- `dtmf`: a dtmf key was pressed by the caller
- `start-play`: an audio segment returned from Lex or TTS started to play
- `stop-play`: an audio segment returned from Lex or TTS finished playing
- `play-interrupted`: an audio segment was interrupted
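An eventHook handler will typically dispatch on the `event` parameter. The sketch below groups a few of the events listed above into broad categories; the `routeLexEvent` name and the category labels are assumptions, and the shape of `data` is left opaque because it is not described here.

```js
// Hypothetical eventHook dispatcher: tags each Lex event with a category
// so that downstream code can log or act on it.
function routeLexEvent({event, data}) {
  switch (event) {
    case 'intent':
    case 'transcription':
    case 'response-text':
      return {category: 'conversation', event, data};
    case 'start-play':
    case 'stop-play':
    case 'play-interrupted':
      return {category: 'playback', event, data};
    default:
      // Any event not handled explicitly above.
      return {category: 'other', event, data};
  }
}

const routed = routeLexEvent({event: 'intent', data: {}});
console.log(routed.category);
```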
<p>
<a href="/docs/webhooks/leave" style="float: left;">Prev: leave</a>
<a href="/docs/webhooks/listen" style="float: right;">Next: listen</a>
</p>

47
docs/webhooks/listen.md Normal file
View File

@@ -0,0 +1,47 @@
# listen
jambonz does not have a 'record' verb. This is by design, for data privacy reasons:
>Recordings can contain sensitive and confidential information about your customers, and such data is never stored at rest in the jambonz core.
Instead, jambonz provides the **listen** verb, where one or more audio streams can be forked and sent in real-time to your application for processing.
The listen verb can also be nested in a [dial](#dial) verb, which allows the audio for a call between two parties to be sent to a remote websocket server.
To utilize the listen verb, the customer must implement a websocket server to receive and process the audio. The endpoint should be prepared to accept websocket connections with a subprotocol name of 'audio.jambonz.org'.
The listen verb includes a **url** property which is the url of the remote websocket server to send the audio to. The url may be an absolute or relative URL. HTTP Basic Authentication can optionally be used to protect the websocket endpoint by using the **wsAuth** property.
The format of the audio data sent over the websocket is 16-bit PCM encoding, with a user-specified sample rate. The audio is sent in binary frames over the websocket connection.
Additionally, one text frame is sent immediately after the websocket connection is established. This text frame contains a JSON string with all of the call attributes normally sent on an HTTP request (e.g. callSid, etc), plus **sampleRate** and **mixType** properties describing the audio sample rate and stream(s). Additional metadata can also be added to this payload using the **metadata** property as described in the table below. Once the initial text frame containing the metadata has been sent, the remote side should expect to receive only binary frames, containing audio. The remote side is not expected to send any data back over the websocket.
```json
{
"verb": "listen",
"url": "wss://myrecorder.example.com/calls",
"mixType" : "stereo"
}
```
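The framing described above (one JSON text frame, then binary audio frames) could be handled on the receiving side roughly as follows. This is a sketch that is independent of any particular websocket library: `createListenSession` and its methods are hypothetical names, and the duration math assumes that "stereo" means two interleaved 16-bit channels.

```js
// Hypothetical per-connection state for a listen websocket.
function createListenSession() {
  let metadata = null;   // parsed from the initial text frame
  let audioBytes = 0;    // running total of binary audio received

  return {
    // Call once per received frame; `isBinary` is whatever your websocket
    // library reports for the frame type.
    onFrame(data, isBinary) {
      if (!isBinary) {
        // Only the first text frame is expected; it carries the call attributes.
        if (metadata === null) metadata = JSON.parse(data.toString());
        return;
      }
      audioBytes += data.length;
    },
    // Seconds of audio so far: bytes / (2 bytes per sample * channels * sample rate).
    durationSecs() {
      if (metadata === null) return 0;
      const channels = metadata.mixType === 'stereo' ? 2 : 1; // assumption, see above
      return audioBytes / (2 * channels * metadata.sampleRate);
    },
    get metadata() {
      return metadata;
    }
  };
}
```

In a real server you would create one session per accepted connection and feed it every message the connection delivers.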
You can use the following options in the `listen` action:
| option | description | required |
| ------------- |-------------| -----|
| actionHook | webhook to invoke when listen operation ends. The information will include the duration of the audio stream, and also a 'digits' property if the recording was terminated by a dtmf key. | yes |
| finishOnKey | The set of digits that can end the listen action | no |
| maxLength | the maximum length of the listened audio stream, in secs | no |
| metadata | arbitrary data to add to the JSON payload sent to the remote server when websocket connection is first connected | no |
| mixType | "mono" (send single channel), "stereo" (send dual channel of both calls in a bridge), or "mixed" (send audio from both calls in a bridge in a single mixed audio stream) Default: mono | no |
| playBeep | true or false: whether to play a beep at the start of the listen operation. Default: false | no |
| sampleRate | sample rate of audio to send (allowable values: 8000, 16000, 24000, 48000, or 64000). Default: 8000 | no |
| timeout | the number of seconds of silence that terminates the listen operation.| no |
| transcribe | a nested [transcribe](/docs/webhooks/transcribe) verb | no |
| url | url of remote server to connect to | yes |
| wsAuth.username | HTTP basic auth username to use on websocket connection | no |
| wsAuth.password | HTTP basic auth password to use on websocket connection | no |
<p>
<a href="/docs/webhooks/lex" style="float: left;">Prev: lex</a>
<a href="/docs/webhooks/pause" style="float: right;">Next: pause</a>
</p>

214
docs/webhooks/overview.md Normal file
View File

@@ -0,0 +1,214 @@
# Webhooks
jambonz uses JSON payloads that are exchanged in HTTP messages to control calls.
When an incoming call for your account is received, jambonz makes an HTTP request to a URL that you have configured and your response will contain a JSON body that indicates how you want the call handled.
When you want to launch an outbound call it works similarly: you will make an HTTP request using the [REST API](/docs/api/rest) and in it you will specify a URL or application identifier that will be invoked once the call is answered. Once again, your response to that HTTP request will contain a JSON body that indicates how you want the call handled.
Simple enough, right?
## Basic JSON message structure
The JSON payload that you provide in response to an HTTP request must be an array with each item describing a task that the platform shall perform. These tasks are executed sequentially in the order they appear in the array. Each task is identified by a verb (e.g. "dial", "gather", "hangup" etc) with associated detail to configure how the action should be carried out.
If the caller hangs up during the execution of an application for that call, the current task is allowed to complete and any remaining tasks in the application are ignored.
```json
[
{
"verb": "say",
"text": "Hi there! Please leave a message at the tone.",
"synthesizer": {
"vendor": "Google",
"language": "en-US",
"gender": "FEMALE"
}
},
{
/* ..next verb */
}
]
```
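To make the request/response flow concrete, here is a minimal sketch of a webhook endpoint built on Node's standard `http` module. The function names and the port are illustrative assumptions; the only behavior taken from the text above is that the response body is a JSON array of tasks that run in order.

```js
const http = require('http');

// Build the application: an array of tasks, executed in the order listed.
function buildGreetingApp() {
  return [
    {verb: 'say', text: 'Hi there! Thanks for calling.'},
    {verb: 'pause', length: 1},
    {verb: 'hangup'}
  ];
}

// Request handler that returns the application as the JSON response body.
function onWebhook(req, res) {
  const body = JSON.stringify(buildGreetingApp());
  res.writeHead(200, {'Content-Type': 'application/json'});
  res.end(body);
}

const server = http.createServer(onWebhook);
// server.listen(3000); // uncomment to serve; port 3000 is an arbitrary choice
```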
Some verbs allow other verbs to be nested; e.g. "gather" can have a nested "say" command in order to play a prompt and collect a response in one command:
```json
{
"verb": "gather",
"actionHook": "/gatherCardNumber",
"input": ["speech", "dtmf"],
"timeout": 16,
"numDigits": 6,
"recognizer": {
"vendor": "Google",
"language": "en-US"
},
"say": {
"text": "Please say or enter your six digit card number now",
"synthesizer": {
"vendor": "Google",
"language": "en-US",
"gender": "FEMALE"
}
}
}
```
Altogether then, a simple voicemail application could look like this:
```json
[
{
"verb": "say",
"text": "Hi there! Please leave a message at the tone and we will get back to you shortly."
},
{
"verb": "listen",
"actionHook": "http://example.com/voicemail",
"url": "wss://example.com/my-recorder",
"finishOnKey": "#",
"metadata": {
"topic": "voicemail"
},
"playBeep": true,
"timeout": 20
},
{
"verb": "say",
"text": "Thanks for your message. We'll get back to you"
}
]
```
## HTTP connection details
Each HTTP request that jambonz makes to one of your callbacks will include (at least) the following information either as query arguments (in a GET request) or in the body of the request as a JSON payload (in a POST request):
- call_sid: a unique identifier for the call.
- application_sid: a unique identifier for the jambonz application controlling this call
- account_sid: a unique identifier for the jambonz account associated with the application
- direction: the direction of the call: inbound or outbound
- from: the calling party number
- to: the called party number
- caller_id: the caller name, if known
- call_status: current status of the call, see table below
- sip_status: the most recent sip status code received or generated for the call
Additionally, the request **MAY** include
- parent_call_sid: the call_sid of a parent call to this call, if this call is a child call
And the initial webhook for a new incoming call will have:
- originating_sip_trunk_name: name of the SIP trunk that originated the call to the platform
- originating_sip_ip: the ip address and port of the sip gateway that originated the call
Finally, if you specify to use a POST method for the initial webhook for an incoming call, the JSON payload in that POST will also contain the entire incoming SIP INVITE request details in a 'sip' property (this is not provided if a GET request is used). This can be useful if you need a detailed look at all of the SIP headers, or the Session Description Protocol being offered.
> Note also that you can add arbitrary information of your own into the payloads that jambonz sends you by using the [tag](/docs/webhooks/tag/) verb early in your application flow. Data elements that you provide in that verb will then come back to you in further webhook callbacks for that call. This can be useful for managing stateful information during a call that you may want to drive decision logic later in the call.
| call_status value | description |
| ------------- |-------------|
| trying | a new incoming call has arrived or an outbound call has just been sent|
| ringing | a 180 Ringing response has been sent or received |
| early-media | an early media connection has been established prior to answering the call (183 Session Progress) |
| in-progress | call has been answered |
| completed | an answered call has ended |
| failed | a call attempt failed |
| busy | a call attempt failed because the called party returned a busy status |
| no-answer | a call attempt failed because it was not answered in time |
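Because the same attributes arrive either as query arguments or as a JSON body, a small helper can hide the difference from the rest of your code. The sketch below is one way to do that; the helper names are hypothetical, and treating `completed`, `failed`, `busy`, and `no-answer` as final states is a reading of the table above rather than a documented guarantee.

```js
// Hypothetical helper: returns the call attributes regardless of HTTP method.
function parseCallAttributes(method, requestUrl, rawBody) {
  if (method === 'GET') {
    // The base is only needed so that a path-only URL can be parsed.
    const {searchParams} = new URL(requestUrl, 'http://placeholder.invalid');
    return Object.fromEntries(searchParams);
  }
  return JSON.parse(rawBody);
}

// Statuses that, per the table above, describe a call that has ended or failed.
const FINAL_STATUSES = new Set(['completed', 'failed', 'busy', 'no-answer']);

function isCallOver(attrs) {
  return FINAL_STATUSES.has(attrs.call_status);
}
```

Note that values parsed from a GET request are always strings, so a field such as `sip_status` needs converting before numeric comparison.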
## Securing your HTTP Endpoints
Before we go any further, let's talk about how to properly secure your endpoints.
This is important because your response to HTTP webhook requests will contain information that must be kept private between you and the jambonz platform. We recommend that you use HTTPS connections secured with TLS certificates for your endpoints, and that you additionally take steps to verify that the incoming request was actually sent by jambonz, and not an imposter.
For the latter, you have two options:
- You can use HTTP basic authentication to secure your endpoint with a username and password.
- On the hosted platform, you can verify the signature of the HTTP request to know that it was sent by jambonz.
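For the first option, checking the `Authorization` header on your side might look like the sketch below. It uses only Node's standard `crypto` module; the function names and the idea of comparing hashes to keep the comparison constant-time are choices made for this example, not something jambonz prescribes.

```js
const crypto = require('crypto');

// Compare two strings without leaking where they first differ.
// Hashing first gives timingSafeEqual the equal-length buffers it requires.
function safeEqual(a, b) {
  const ha = crypto.createHash('sha256').update(a).digest();
  const hb = crypto.createHash('sha256').update(b).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Hypothetical check for an "Authorization: Basic <base64(user:pass)>" header.
function isAuthorized(authorizationHeader, expectedUser, expectedPass) {
  if (typeof authorizationHeader !== 'string') return false;
  const [scheme, encoded] = authorizationHeader.split(' ');
  if (!scheme || scheme.toLowerCase() !== 'basic' || !encoded) return false;

  const decoded = Buffer.from(encoded, 'base64').toString('utf8');
  const idx = decoded.indexOf(':');
  if (idx === -1) return false;

  // Evaluate both comparisons so timing does not reveal which one failed.
  const userOk = safeEqual(decoded.slice(0, idx), expectedUser);
  const passOk = safeEqual(decoded.slice(idx + 1), expectedPass);
  return userOk && passOk;
}
```

A request that fails this check would typically be answered with a 401 status before any call-handling logic runs.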
#### Verifying a signed request
The HTTP requests sent to you from the hosted platform will include a Jambonz-Signature header, which is a hash of the request payload signed with your webhook secret, which you can view (and when desired, change) in the self-service portal. Using that secret, you can verify that the request was actually sent by jambonz.
When using the Node.js SDK, this is done with HTTP middleware.
```js
const express = require('express');
const app = express();
const {WebhookResponse} = require('@jambonz/node-client');
app.use(WebhookResponse.verifyJambonzSignature('<your-webhook-secret>'));
app.use('/', routes); /* only requests with valid signatures will get here */
```
## Initial state of incoming calls
When the jambonz platform receives a new incoming call, it responds 100 Trying to the INVITE but does not automatically answer the call. It is up to your application to decide how to finally respond to the INVITE. You have some choices here.
Your application can:
- answer the call, which connects the call to a media endpoint that can perform IVR functions on the call,
- outdial a new call, and bridge the two calls together (i.e use the dial verb),
- reject the call, with a specified SIP status code and reason,
- establish an early media connection and play audio to the caller without answering the call.
The last is interesting and worthy of further comment. The intent is to let you play audio to callers without necessarily answering the call. You signal this by including an "earlyMedia" property with a value of true in the application. When receiving this, jambonz will create an early media connection (using 183 Session Progress), as shown in the example below.
> Note: an early media connection will not be possible if the call has already been answered by an earlier verb in the application. In such a scenario, the earlyMedia property is ignored.
```json
[
  {
    "verb": "say",
    "earlyMedia": true,
    "text": "Please call back later, we are currently at lunch",
    "synthesizer": {
      "vendor": "aws",
      "language": "en-US",
      "voice": "Amy"
    }
  },
  {
    "verb": "sip:decline",
    "status": 480,
    "headers": {
      "Retry-After": 1800
    }
  }
]
```
Please note:
- The say, play, gather, listen, and transcribe verbs all support the "earlyMedia" property.
- The dial verb supports a similar feature of not answering the inbound call unless/until the dialed call is answered via the "answerOnBridge" property.
## Speech integration
The platform makes use of text-to-speech as well as real-time speech recognition. Both Google and AWS are supported for text to speech (TTS) as well as speech to text (STT).
Synthesized audio is cached for up to 24 hours, so that if the same {text, language, voice} combination is requested more than once in that period it will be served from cache, thus reducing your TTS bill from Google or AWS.
When you configure your applications in the portal, you can set defaults for the language and voice to use for speech synthesis as well as the language to use for speech recognition. These can then be overridden by verbs in the application, by using the 'synthesizer' and 'recognizer' properties.
## Webhook URL specifiers
Many of the verbs specify a webhook that will be called when the verb completes or has some information to deliver to your application. These verbs contain a property that allows you to configure that webhook. By convention, the property name will always end in "Hook"; e.g. "actionHook", "dtmfHook", and so on.
You can specify the webhook as a simple string containing either an absolute or relative url:
```json
"actionHook": "https://my.appserver.com/results"
```
```json
"actionHook": "/results"
```
In the latter case, the base url of the application will be applied.
Alternatively, you can provide an object containing a url and optional method and basic authentication parameters, e.g.:
```json
"actionHook": {
"url": "https://my.appserver.com/results",
"method": "GET",
"username": "foo",
"password": "bar"
}
```
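If you generate verbs programmatically, it can help to normalize both forms into one shape. The sketch below does this with the standard `URL` class; `resolveHook` is a hypothetical helper, and using WHATWG URL resolution to stand in for "the base url of the application will be applied" is an assumption about how relative urls are resolved.

```js
// Hypothetical helper: accepts a string or object hook and returns an object
// whose url is absolute, resolved against the application's base url.
function resolveHook(hook, baseUrl) {
  const spec = typeof hook === 'string' ? {url: hook} : {...hook};
  spec.url = new URL(spec.url, baseUrl).toString();
  return spec;
}

console.log(resolveHook('/results', 'https://my.appserver.com/app'));
console.log(resolveHook({url: 'https://other.example.com/x', method: 'GET'}, 'https://my.appserver.com/app'));
```

An absolute url is returned unchanged, so the helper is safe to apply to every hook.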
<p>
<a href="/docs/webhooks/conference" style="float: right;">Next: conference</a>
</p>

20
docs/webhooks/pause.md Normal file
View File

@@ -0,0 +1,20 @@
# pause
The pause command waits silently for a specified number of seconds.
```json
{
"verb": "pause",
"length": 3
}
```
You can use the following options in the `pause` action:
| option | description | required |
| ------------- |-------------| -----|
| length | number of seconds to wait before continuing the app | yes |
<p>
<a href="/docs/webhooks/listen" style="float: left;">Prev: listen</a>
<a href="/docs/webhooks/play" style="float: right;">Next: play</a>
</p>

22
docs/webhooks/play.md Normal file
View File

@@ -0,0 +1,22 @@
# play
The play command is used to stream recorded audio to a call.
```json
{
"verb": "play",
"url": "https://example.com/example.mp3"
}
```
You can use the following options in the `play` action:
| option | description | required |
| ------------- |-------------| -----|
| url | a single url or array of urls (will play in sequence) to a wav or mp3 file | yes |
| loop | number of times to play the url(s) | no (default: 1) |
| earlyMedia | if true and the call has not yet been answered, play the audio without answering call. Defaults to false | no |
<p>
<a href="/docs/webhooks/pause" style="float: left;">Prev: pause</a>
<a href="/docs/webhooks/redirect" style="float: right;">Next: redirect</a>
</p>

21
docs/webhooks/redirect.md Normal file
View File

@@ -0,0 +1,21 @@
# redirect
The redirect action is used to transfer control to another JSON document that is retrieved from the specified url. All actions after `redirect` are unreachable and ignored.
```json
{
"verb": "redirect",
"actionHook": "/connectToSales"
}
```
You can use the following options in the `redirect` action:
| option | description | required |
| ------------- |-------------| -----|
| actionHook | URL of webhook to retrieve new application from. | yes |
<p>
<a href="/docs/webhooks/play" style="float: left;">Prev: play</a>
<a href="/docs/webhooks/say" style="float: right;">Next: say</a>
</p>

31
docs/webhooks/say.md Normal file
View File

@@ -0,0 +1,31 @@
# say
The say command is used to send synthesized speech to the remote party. The text provided may be either plain text or may use SSML tags.
```json
{
"verb": "say",
"text": "hi there!",
"synthesizer" : {
"vendor": "Google",
"language": "en-US"
}
}
```
You can use the following options in the `say` action:
| option | description | required |
| ------------- |-------------| -----|
| text | text to speak; may contain SSML tags | yes |
| synthesizer.vendor | speech vendor to use: Google or aws (polly is also an alias for aws)| no |
| synthesizer.language | language code to use. | yes |
| synthesizer.gender | (Google only) MALE, FEMALE, or NEUTRAL. | no |
| synthesizer.voice | voice to use. Note that the list of available voices differs depending on whether you are using aws or Google. Defaults to application setting, if provided. | no |
| loop | the number of times a text is to be repeated; 0 means repeat forever. Defaults to 1. | no |
| earlyMedia | if true and the call has not yet been answered, play the audio without answering call. Defaults to false | no |
<p>
<a href="/docs/webhooks/redirect" style="float: left;">Prev: redirect</a>
<a href="/docs/webhooks/sip-decline" style="float: right;">Next: sip:decline</a>
</p>

View File

@@ -0,0 +1,31 @@
# sip:decline
The sip:decline action is used to reject an incoming call with a specific status and, optionally, a reason and SIP headers to include on the response.
This action must be the first and only action returned in the JSON payload for an incoming call.
The sip:decline action is a non-blocking action and the session ends immediately after the action is executed.
```json
{
"verb": "sip:decline",
"status": 480,
"reason": "Gone Fishing",
"headers" : {
"Retry-After": 1800
}
}
```
You can use the following options in the `sip:decline` action:
| option | description | required |
| ------------- |-------------| -----|
| status | a valid SIP status code in the range 4XX - 6XX | yes |
| reason | a brief description | no (default: the well-known SIP reasons associated with the specified status code) |
| headers | SIP headers to include in the response | no |
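A small builder can enforce the status range from the table before the verb is ever sent. This is a sketch with a hypothetical function name; it returns a one-element array because the text above says this action must be the only one in the payload.

```js
// Hypothetical builder for a sip:decline application.
// Throws if the status is outside the 4XX - 6XX range required by the table.
function buildDecline(status, reason, headers) {
  if (!Number.isInteger(status) || status < 400 || status > 699) {
    throw new RangeError(`sip:decline status must be in 400-699, got ${status}`);
  }
  const verb = {verb: 'sip:decline', status};
  if (reason) verb.reason = reason;     // optional, defaults to the standard reason
  if (headers) verb.headers = headers;  // optional SIP headers
  return [verb];
}

console.log(JSON.stringify(buildDecline(480, 'Gone Fishing', {'Retry-After': 1800})));
```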
<p>
<a href="/docs/webhooks/say" style="float: left;">Prev: say</a>
<a href="/docs/webhooks/tag" style="float: right;">Next: tag</a>
</p>

56
docs/webhooks/tag.md Normal file
View File

@@ -0,0 +1,56 @@
# tag
The tag verb is used to add properties to the standard call attributes that jambonz includes on every action or call status HTTP POST request.
> Note: because of the possible richness of the data, only subsequent POST requests will include this data. It will not be included in HTTP GET requests.
The purpose is to simplify applications by eliminating the need to store state information if it can simply be echoed back to the application on each HTTP request for the call.
For example, consider an application that wishes to apply some privacy settings on outdials based on attributes in the initial incoming call. The application could parse information from the SIP INVITE provided in the web callback when the call arrives, and rather than having to store that information for later use it could simply use the 'tag' verb to associate that information with the call. Later, when an action or call status triggers the need for the application to outdial it can simply access the information from the HTTP POST body, rather than having to retrieve it from a cache of some sort.
Note that every time the tag verb is used, the collection of customer data is completely replaced with the new data provided. This information will be provided back in all action or status notifications if the POST method is used. It will appear in a property named 'customerData' in the JSON payload.
```json
{
"verb": "tag",
"data": {
"foo": "bar",
"counter": 100,
"list": [1, 2, "three"]
}
}
```
After the above 'tag' verb has executed, web callbacks using POST would have a payload similar to this:
```json
{
"call_sid": "df09e8d4-7ffd-492b-94d9-51a60318552c",
"direction": "inbound",
"from": "+15083084809",
"to": "+15083728299",
"call_id": "f0414693-bdb6-1238-6185-06d91d68c9b0",
"sip_status": 200,
"call_status": "in-progress",
"caller_id": "f0414693-bdb6-1238-6185-06d91d68c9b0",
"account_sid": "fef61e75-cec3-496c-a7bc-8368e4d02a04",
"application_sid": "0e0681b0-d49f-4fb8-b973-b5a3c6758de1",
"originating_sip_ip": "54.172.60.1:5060",
"originating_sip_trunk_name": "twilio",
"customerData": {
"foo": "bar",
"counter": 100,
"list": [1, 2, "three"]
}
}
```
You can use the following options in the `tag` command:
| option | description | required |
| ------------- |-------------| -----|
| data | a JSON object containing values to be saved and included in future action or call status notifications (HTTP POST only) for this call | yes |
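Since each use of the tag verb replaces the stored data entirely, an application that wants to change one value has to send the others back as well. The sketch below shows one way to do that merge from an incoming POST body; `buildTagUpdate` is a hypothetical helper name.

```js
// Hypothetical helper: merges changes into the customerData echoed back by
// jambonz and returns a tag verb carrying the full, updated collection.
function buildTagUpdate(postBody, changes) {
  const previous = postBody.customerData || {};
  return {verb: 'tag', data: {...previous, ...changes}};
}

// Example: bump the counter while keeping the other tagged values.
const incoming = {call_sid: 'abc', customerData: {foo: 'bar', counter: 100}};
console.log(buildTagUpdate(incoming, {counter: incoming.customerData.counter + 1}));
```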
<p>
<a href="/docs/webhooks/sip-decline" style="float: left;">Prev: sip:decline</a>
<a href="/docs/webhooks/transcribe" style="float: right;">Next: transcribe</a>
</p>

View File

@@ -0,0 +1,34 @@
# transcribe
The transcribe verb is used to send real time transcriptions of speech to a web callback.
The transcribe command is only allowed as a nested verb within a dial or listen verb. Using transcribe in a dial command allows a long-running transcription of a phone call to be made, while nesting within a listen verb allows transcriptions of recorded messages (e.g. voicemail).
```json
{
"verb": "transcribe",
"transcriptionHook": "http://example.com/transcribe",
"recognizer": {
"vendor": "Google",
"language" : "en-US",
"interim": true
}
}
```
You can use the following options in the `transcribe` command:
| option | description | required |
| ------------- |-------------| -----|
| recognizer.dualChannel | if true, transcribe the parent call as well as the child call | no |
| recognizer.interim | if true interim transcriptions are sent | no (default: false) |
| recognizer.language | language to use for speech transcription | yes |
| recognizer.profanityFilter | if true, filter profanity from speech transcription. Default: no| no |
| recognizer.vendor | speech vendor to use (currently only Google supported) | no |
| transcriptionHook | webhook to call when a transcription is received. Due to the richness of information in the transcription an HTTP POST will always be sent. | yes |
> **Note**: the `dualChannel` property is not currently implemented.
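To illustrate the nesting rule, the sketch below builds a listen verb that carries a transcribe configuration. The function name and urls are hypothetical, and placing the transcribe options under a `transcribe` property without a `verb` field is an assumption modeled on how "say" is nested inside "gather" in the overview.

```js
// Hypothetical builder: a listen verb with a nested transcribe configuration.
function buildListenWithTranscript(wsUrl, actionHook, transcriptionHook) {
  return {
    verb: 'listen',
    url: wsUrl,
    actionHook,
    transcribe: {
      transcriptionHook,
      recognizer: {
        vendor: 'Google',
        language: 'en-US', // language is marked as required in the table above
        interim: false     // only final transcriptions
      }
    }
  };
}

console.log(JSON.stringify(
  buildListenWithTranscript('wss://example.com/my-recorder', '/voicemail', '/transcript'),
  null, 2));
```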
<p>
<a href="/docs/webhooks/tag" style="float: left;">Prev: tag</a>
</p>

View File

@@ -20,7 +20,7 @@ $font-mono: Consolas, Monaco, 'Andale Mono', 'Ubuntu Mono', monospace;
&__navi {
max-width: 224px;
width: 100%;
margin-right: 32px;
margin-right: 12px;
@media (max-width: $width-tablet-1) {
max-width: 100%;
@@ -37,6 +37,19 @@ $font-mono: Consolas, Monaco, 'Andale Mono', 'Ubuntu Mono', monospace;
max-width: 100%;
}
h1 {
font-size: 42px;
}
h2 {
font-size: 32px;
}
blockquote > p {
border-left: inset;
padding-left: 10px;
font-size: 16px;
line-height: 2;
}
> div {
> p {
@include ms();
@@ -56,6 +69,7 @@ $font-mono: Consolas, Monaco, 'Andale Mono', 'Ubuntu Mono', monospace;
li {
list-style-type: disc;
@include p();
font-size: 14px;
}
}