mirror of
https://github.com/jambonz/next-static-site.git
synced 2026-01-25 02:08:03 +00:00
@@ -22,6 +22,7 @@ You can use the following options in the `play` action:
|
|||||||
| seekOffset | how many samples to seek into the url | no |
|
| seekOffset | how many samples to seek into the url | no |
|
||||||
| actionHook | webhook that is called when the play verb completes | no |
|
| actionHook | webhook that is called when the play verb completes | no |
|
||||||
|
|
||||||
|
|
||||||
<h5 id="message-action-properties">actionHook properties</h5>
|
<h5 id="message-action-properties">actionHook properties</h5>
|
||||||
|
|
||||||
The actionHook that is invoked when the `play` command completes will include the following properties:
|
The actionHook that is invoked when the `play` command completes will include the following properties:
|
||||||
|
|||||||
@@ -1,9 +1,12 @@
|
|||||||
|
# recognizer
|
||||||
|
|
||||||
The `recognizer` property is used in multiple verbs ([gather](/docs/webhooks/gather), [transcribe](/docs/webhooks/transcribe), [dial](/docs/webhooks/dial)). It selects and configures the speech recognizer. It is an object containing the following properties:
|
The `recognizer` property is used in multiple verbs ([gather](/docs/webhooks/gather), [transcribe](/docs/webhooks/transcribe), [dial](/docs/webhooks/dial)). It selects and configures the speech recognizer.
|
||||||
|
|
||||||
|
It is an object containing the following properties:
|
||||||
|
|
||||||
| option | description | required |
|
| option | description | required |
|
||||||
| ------------- |-------------| -----|
|
| ------------- |-------------| -----|
|
||||||
| vendor | Speech vendor to use (google, aws, microsoft, deepgram, nuance, nvidia, soniox, and ibm are supported, along with any others you add via the [custom speech API](/docs/speech-api/overview/)) | no |
|
| vendor | Speech vendor to use (see list below, along with any others you add via the [custom speech API](/docs/speech-api/overview/)) | no |
|
||||||
| language | Language code to use for speech detection. Defaults to the application level setting | no |
|
| language | Language code to use for speech detection. Defaults to the application level setting | no |
|
||||||
| interim | If true, interim transcriptions are sent | no (default: false) |
|
| interim | If true, interim transcriptions are sent | no (default: false) |
|
||||||
| hints | (google, microsoft, deepgram, nvidia, soniox) Array of words or phrases to assist speech detection. See [examples](#hints) below. | no |
|
| hints | (google, microsoft, deepgram, nvidia, soniox) Array of words or phrases to assist speech detection. See [examples](#hints) below. | no |
|
||||||
@@ -43,6 +46,18 @@ The `recognizer` property is used in multiple verbs ([gather](/docs/webhooks/gat
|
|||||||
| [nvidiaOptions](#nvidiaOptions) (added in 0.8.0)|Nvidia-specific speech recognition options (see below)| no |
|
| [nvidiaOptions](#nvidiaOptions) (added in 0.8.0)|Nvidia-specific speech recognition options (see below)| no |
|
||||||
| [sonioxOptions](#sonioxOptions) (added in 0.8.2)|Soniox-specific speech recognition options (see below)| no |
|
| [sonioxOptions](#sonioxOptions) (added in 0.8.2)|Soniox-specific speech recognition options (see below)| no |
|
||||||
|
|
||||||
|
## Speech-to-text vendors
|
||||||
|
jambonz natively supports the following speech-to-text services:
|
||||||
|
- assemblyai
|
||||||
|
- aws
|
||||||
|
- azure
|
||||||
|
- cobalt
|
||||||
|
- deepgram
|
||||||
|
- google
|
||||||
|
- ibm
|
||||||
|
- nuance
|
||||||
|
- nvidia
|
||||||
|
- soniox
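
As a rough illustration of how the `recognizer` properties in the table above fit together, here is a minimal Python sketch that assembles a recognizer object and attaches it to a `gather` verb. The helper `build_recognizer`, the `input` value, and the `/gather-result` hook path are hypothetical. The vendor set is copied from the list above, and the exact identifier strings jambonz accepts are an assumption.

```python
import json

# Vendor names copied from the list above. The exact identifier strings
# jambonz accepts are an assumption (the options table mentions
# "microsoft" where the list says "azure").
STT_VENDORS = {
    "assemblyai", "aws", "azure", "cobalt", "deepgram",
    "google", "ibm", "nuance", "nvidia", "soniox",
}


def build_recognizer(vendor, language=None, interim=False, hints=None):
    """Return a recognizer object using only properties from the table above."""
    if vendor not in STT_VENDORS:
        raise ValueError(f"unsupported speech-to-text vendor: {vendor}")
    recognizer = {"vendor": vendor, "interim": interim}
    if language is not None:
        # When omitted, the application level setting applies.
        recognizer["language"] = language
    if hints:
        recognizer["hints"] = list(hints)
    return recognizer


# Hypothetical gather verb; "input" and the hook path are illustrative only.
gather = {
    "verb": "gather",
    "input": ["speech"],
    "actionHook": "/gather-result",
    "recognizer": build_recognizer(
        "google", language="en-US", hints=["sales", "support"]
    ),
}

# A webhook response is an array of verbs.
print(json.dumps([gather], indent=2))
```

Vendors added through the custom speech API would not pass this check as written; the set would need to be extended with their names.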
|
||||||
|
|
||||||
<h2 id="hints">Providing speech hints</h2>
|
<h2 id="hints">Providing speech hints</h2>
|
||||||
|
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# say
|
# say
|
||||||
|
|
||||||
The say command is used to send synthesized speech to the remote party. The text provided may be either plain text or may use SSML tags. The following vendors are supported: google, microsoft, aws, nuance, nvidia, ibm, and wellsaid; along with any others you add via the [custom speech API](/docs/supporting-articles/custom-speech-tts).
|
The say command is used to send synthesized speech to the remote party. The text provided may be either plain text or may use SSML tags. jambonz supports a large number of speech vendors out of the box (see list below), and you may add others via the [custom speech API](/docs/supporting-articles/custom-speech-tts).
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
@@ -18,13 +18,26 @@ You can use the following options in the `say` action:
|
|||||||
| option | description | required |
|
| option | description | required |
|
||||||
| ------------- |-------------| -----|
|
| ------------- |-------------| -----|
|
||||||
| text | text to speak; may contain SSML tags | yes |
|
| text | text to speak; may contain SSML tags | yes |
|
||||||
| synthesizer.vendor | speech vendor to use (google, aws, microsoft, nuance, nvidia, and ibm are supported, along with any others you add via the [custom speech API](/docs/speech-api/overview/))| no |
|
| synthesizer.vendor | speech vendor to use (see list below, along with any others you add via the [custom speech API](/docs/speech-api/overview/))| no |
|
||||||
| synthesizer.language | language code to use. | no |
|
| synthesizer.language | language code to use. | no |
|
||||||
| synthesizer.gender | (Google only) MALE, FEMALE, or NEUTRAL. | no |
|
| synthesizer.gender | (Google only) MALE, FEMALE, or NEUTRAL. | no |
|
||||||
| synthesizer.voice | voice to use. Note that the voice list differs whether you are using aws or Google. Defaults to application setting, if provided. | no |
|
| synthesizer.voice | voice to use. Note that the voice list differs whether you are using aws or Google. Defaults to application setting, if provided. | no |
|
||||||
| loop | the number of times a text is to be repeated; 0 means repeat forever. Defaults to 1. | no |
|
| loop | the number of times a text is to be repeated; 0 means repeat forever. Defaults to 1. | no |
|
||||||
| earlyMedia | if true and the call has not yet been answered, play the audio without answering call. Defaults to false | no |
|
| earlyMedia | if true and the call has not yet been answered, play the audio without answering call. Defaults to false | no |
|
||||||
|
|
||||||
|
## Text-to-speech vendors
|
||||||
|
jambonz natively supports the following text-to-speech services:
|
||||||
|
- google
|
||||||
|
- aws
|
||||||
|
- azure
|
||||||
|
- deepgram
|
||||||
|
- elevenlabs
|
||||||
|
- ibm
|
||||||
|
- nuance
|
||||||
|
- nvidia
|
||||||
|
- wellsaid
|
||||||
|
- whisper
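
To make the `say` options table more concrete, the following Python sketch builds a `say` verb with a nested `synthesizer` object. The helper `build_say` is hypothetical, the nesting is inferred from the dotted option names (`synthesizer.vendor`, `synthesizer.language`), and the vendor identifiers are taken from the list above on the assumption that they match what jambonz expects.

```python
import json

# Vendor names copied from the list above; treating them as the exact
# identifiers jambonz accepts is an assumption.
TTS_VENDORS = {
    "google", "aws", "azure", "deepgram", "elevenlabs",
    "ibm", "nuance", "nvidia", "wellsaid", "whisper",
}


def build_say(text, vendor=None, language=None, voice=None,
              loop=1, early_media=False):
    """Return a say verb using only the options documented in the table above."""
    if not text:
        raise ValueError("text is required")
    if loop < 0:
        raise ValueError("loop must be 0 (repeat forever) or a positive count")
    if vendor is not None and vendor not in TTS_VENDORS:
        raise ValueError(f"unsupported text-to-speech vendor: {vendor}")

    verb = {"verb": "say", "text": text, "loop": loop, "earlyMedia": early_media}

    # Only include synthesizer settings that were given, so that
    # application-level defaults can apply to the rest.
    synthesizer = {}
    if vendor is not None:
        synthesizer["vendor"] = vendor
    if language is not None:
        synthesizer["language"] = language
    if voice is not None:
        synthesizer["voice"] = voice
    if synthesizer:
        verb["synthesizer"] = synthesizer
    return verb


say = build_say("Hello, thanks for calling.", vendor="google", language="en-US")
print(json.dumps([say], indent=2))
```

Leaving `vendor`, `language`, and `voice` unset produces a verb with no `synthesizer` object at all, so the application settings are used.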
|
||||||
|
|
||||||
<p class="flex">
|
<p class="flex">
|
||||||
<a href="/docs/webhooks/redirect">Prev: redirect</a>
|
<a href="/docs/webhooks/redirect">Prev: redirect</a>
|
||||||
<a href="/docs/webhooks/sip-decline">Next: sip:decline</a>
|
<a href="/docs/webhooks/sip-decline">Next: sip:decline</a>
|
||||||
|
|||||||