mirror of
https://github.com/jambonz/next-static-site.git
synced 2026-01-25 02:08:03 +00:00
Release notes/0.9.0 (#88)
* wip * initial changes * release notes * wip * wip * wip * wip * wip
This commit is contained in:
@@ -165,27 +165,30 @@ navi:
     path: release-notes
     title: Release Notes
     pages:
+      -
+        path: 0.9.0
+        title: 0.9.0
       -
         path: v0.8.5
-        title: v0.8.5
+        title: 0.8.5
       -
         path: v0.8.4
-        title: v0.8.4
+        title: 0.8.4
       -
         path: v0.8.3
-        title: v0.8.3
+        title: 0.8.3
       -
         path: v0.8.2
-        title: v0.8.2
+        title: 0.8.2
       -
         path: v0.8.1
-        title: v0.8.1
+        title: 0.8.1
       -
         path: v0.8.0
-        title: v0.8.0
+        title: 0.8.0
       -
         path: v0.7.9
-        title: v0.7.9
+        title: 0.7.9
       -
         path: jambonz-ui
         title: Jambonz UI
markdown/docs/release-notes/0.9.0.md (new file, 39 lines)
@@ -0,0 +1,39 @@
# Release 0.9.0

#### Info

- Release Date: April 20, 2024

#### New Features

- Add support for the Google v2 STT API
- Add support for additional TTS vendors: [PlayHT](https://play.ht/), [RimeLabs](https://rime.ai/), and [Deepgram](https://deepgram.com/product/text-to-speech)
- Add support for streaming TTS (reduces latency) for Deepgram, ElevenLabs, Microsoft, PlayHT, RimeLabs, and Whisper
- Add support for [bidirectional audio](/docs/supporting-articles/bidirectional-audio) in the [listen](/docs/webhooks/listen) verb
- Add a new verb, [dub](/docs/webhooks/dub/), to insert additional audio tracks into the conversation; see [here](/docs/supporting-articles/using-dub-tracks/) for example usage
- Add `boostAudioSignal` to the [config](/docs/webhooks/config) verb, allowing the volume of a conversation to be raised or lowered
- Add support for "filler" audio to the [gather](/docs/webhooks/gather) verb, allowing brief audio to be played to a caller while the application processes a user utterance or DTMF collection; this is useful when an AI bot is expected to take a long time to process a request
- Add support for sending outbound OPTIONS pings to configured SIP trunks
- If Deepgram endpointing is enabled, default `utterance_end_ms` to 1000 if none is specified by the application (per Deepgram's recommendation)
- Various improvements and enhancements to [node-client-ws](https://github.com/jambonz/node-client-ws)
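As a rough sketch, the new `boostAudioSignal` option might be applied from a config verb like this (hypothetical payload; the `'+6 dB'` value format is an assumption, not confirmed by this release note — consult the [config](/docs/webhooks/config) verb documentation for accepted values):

```js
// hypothetical config verb payload; the '+6 dB' format is an assumption
{
  verb: 'config',
  boostAudioSignal: '+6 dB'  // raise the volume of the conversation
}
```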

#### Bug fixes

- Various fixes for Deepgram STT
- [714](https://github.com/jambonz/jambonz-feature-server/issues/714) "sticky" bargein only works twice
- [710](https://github.com/jambonz/jambonz-feature-server/issues/710) fix for actionHookDelay action
- [671](https://github.com/jambonz/jambonz-feature-server/issues/671) handling of siprec invite failure
- [666](https://github.com/jambonz/jambonz-feature-server/issues/666) transcribe on dial verb does not transcribe B leg by default
- Fix for precaching of TTS
- Check whether a SIP gateway is in the blacklist before sending an outbound call

#### SQL changes

```sql
ALTER TABLE sip_gateways ADD COLUMN send_options_ping BOOLEAN NOT NULL DEFAULT 0;
ALTER TABLE applications MODIFY COLUMN speech_synthesis_voice VARCHAR(256);
ALTER TABLE applications MODIFY COLUMN fallback_speech_synthesis_voice VARCHAR(256);
```

#### Availability

- Available now on jambonz.cloud
- Devops scripts (packer, cloudformation, helm) available now for subscription customers

**Questions?** Contact us at <a href="mailto:support@jambonz.org">support@jambonz.org</a>
markdown/docs/supporting-articles/bidirectional-audio.md (new file, 20 lines)
@@ -0,0 +1,20 @@
# Bidirectional (streaming) audio

As of release 0.9.0, the jambonz [listen](/docs/webhooks/listen) verb supports streaming bidirectional audio.

> Prior to release 0.9.0, bidirectional audio was supported but streaming was one-way: from jambonz to your application. Any audio you provided back had to be supplied as a base64-encoded file, which was received and then played in its entirety.

To enable bidirectional audio, you must explicitly enable it in the listen verb with the `streaming` property as shown below:

```js
{
  verb: 'listen',
  bidirectionalAudio: {
    enabled: true,
    streaming: true,
    sampleRate: 8000
  }
}
```

Your application should then send binary frames of linear-16 pcm raw data with the specified sample rate over the websocket connection.
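For illustration, here is one way an application might frame that audio before sending it (a sketch assuming Node.js and an already-open websocket connection `ws`; the 20 ms frame size is an arbitrary choice for this example, not a jambonz requirement):

```javascript
// Sketch: frame linear-16 PCM for streaming to jambonz over the websocket.
// Assumes Node.js; `ws` stands for your open websocket connection.
const SAMPLE_RATE = 8000;       // must match bidirectionalAudio.sampleRate
const BYTES_PER_SAMPLE = 2;     // linear-16 = 2 bytes per sample
const FRAME_MS = 20;            // arbitrary: send 20ms of audio per frame
const FRAME_BYTES = (SAMPLE_RATE * BYTES_PER_SAMPLE * FRAME_MS) / 1000;

/* split a PCM buffer into fixed-size binary frames */
function toFrames(pcm) {
  const frames = [];
  for (let off = 0; off + FRAME_BYTES <= pcm.length; off += FRAME_BYTES) {
    frames.push(pcm.subarray(off, off + FRAME_BYTES));
  }
  return frames;
}

/* usage, once you have raw PCM audio to play to the caller:
   for (const frame of toFrames(pcmBuffer)) ws.send(frame, {binary: true});
*/
```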
@@ -2,7 +2,7 @@

 Sometimes in conversational AI scenarios there may be significant latency while the remote application processes a user response and determines the next action to take. In these scenarios it is common to play a typing sound or other audio cue to let the caller know that the system is processing the response, that the agent is thinking or retrieving information, etc.

-Support for "filler noise" can be enabled either at the session level using the `config.fillerNoise` property or at the individual `gather` level using the same property. In the example below, we set a session-wide setting for filler noise (in the form of a typing sound) to kick in after waiting 2 seconds for the remote app to respond to user input.
+Support for "filler noise" can be enabled either at the session level using the [config.fillerNoise](/docs/webhooks/config) property or at the individual [gather](/docs/webhooks/gather) level using the same property. In the example below, we set a session-wide setting for filler noise (in the form of a typing sound) to kick in after waiting 2 seconds for the remote app to respond to user input.

 ```js
 /* websocket application */
@@ -29,6 +29,10 @@ You can use the following options in the `listen` action:

 | option | description | required |
 | ------------- |-------------| -----|
 | actionHook | webhook to invoke when the listen operation ends. The information will include the duration of the audio stream, and also a 'digits' property if the recording was terminated by a dtmf key. | yes |
+|bidirectionalAudio.enabled|if true, enable bidirectional audio | no (default: true)|
+|bidirectionalAudio.streaming|if true, enable streaming of audio from your application to jambonz (and the remote caller)|no (default: false)|
+|bidirectionalAudio.sampleRate|sample rate of the audio your application will stream|no (required if streaming)|
+|disableBidirectionalAudio| (deprecated) if true, disable bidirectional audio (same as setting bidirectionalAudio.enabled = false)|no|
 | finishOnKey | The set of digits that can end the listen action | no |
 | maxLength | the maximum length of the listened audio stream, in secs | no |
 | metadata | arbitrary data to add to the JSON payload sent to the remote server when the websocket connection is first established | no |
@@ -58,6 +62,12 @@ Any DTMF digits entered by the far end party on the call can optionally be passe

Audio can also be sent back over the websocket to jambonz. This audio, if supplied, will be played out to the caller. (Note: bidirectional audio is not supported when the `listen` is nested in the context of a `dial` verb.)

There are two separate modes for bidirectional audio:
- non-streaming, where you provide a full base64-encoded audio file as JSON text frames
- streaming, where you stream audio as L16 pcm raw audio as binary frames

<h5 id="bidirectional_audio_non_streaming">non-streaming</h5>

The far-end websocket server supplies bidirectional audio by sending a JSON text frame over the websocket connection:
```json
{
@@ -92,6 +102,22 @@ And finally, if the websocket connection wishes to end the `listen`, it can send
}
```

<h5 id="bidirectional_audio_streaming">streaming</h5>

To enable streaming bidirectional audio, you must explicitly enable it as shown below:
```js
{
  verb: 'listen',
  bidirectionalAudio: {
    enabled: true,
    streaming: true,
    sampleRate: 8000
  }
}
```

Your application should then send binary frames of linear-16 pcm raw data with the specified sample rate over the websocket connection.

<p class="flex">
<a href="/docs/webhooks/lex">Prev: lex</a>
<a href="/docs/webhooks/message">Next: message</a>
</p>
@@ -22,8 +22,10 @@ The `command` property must be one of the values shown below.

|conf:mute-status|mute or unmute all non-moderator conference legs|data must include a `conf_mute_status` property with a value of either 'mute' or 'unmute'|
|conf:hold-status|place a conference leg on hold or take it off hold|data must include a `conf_hold_status` property with a value of either 'hold' or 'unhold'|
|listen:status|change the status of a listen stream|data must include a `listen_status` property with a value of 'pause' or 'resume'|
|record|manage call recording that is done via SIPREC to a remote recording server|data must include an `action` with one of "startCallRecording", "stopCallRecording", "pauseCallRecording", or "resumeCallRecording". When starting a recording you must also supply "recordingID" and "siprecServerURL". You may optionally supply a `headers` object with custom headers to be sent to the remote SIPREC recording server.|
|whisper|play a whisper prompt to the caller (i.e. only one party hears the prompt)|data must include a `whisper` property that can be an array of say or play verbs|
|sip:request|send a SIP INFO, NOTIFY, or MESSAGE request to the far-end party|data must include a 'method' property (allowed values: 'INFO', 'NOTIFY', 'MESSAGE') and can include 'content_type', 'content', and 'headers' properties|
|dub|add, remove, or operate on a dub track|data must include the properties defined for the [dub](/docs/webhooks/dub) verb|

> Note: In the data payload when `redirect` is used, each jambonz verb in the `data` array may optionally include an `id` property. If present, jambonz will provide `verb:status` notifications when the verb starts and ends execution.
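For example, the data for a `whisper` command might look like the following (a sketch; the prompt text is purely illustrative):

```js
// hypothetical data payload for the whisper command described above
{
  whisper: [
    {verb: 'say', text: 'You have one minute of call time remaining'}
  ]
}
```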