Compare commits

..

122 Commits
v0.9.4 ... main

Author SHA1 Message Date
Hoan Luu Huu
325af42946 speechmatics support end_of_utterance_silence_trigger (#1499)
* speechmatics support end_of_utterance_silence_trigger

* wip
2026-01-23 10:11:58 -05:00
Hoan Luu Huu
9848152d5b support google gemini tts (#1491)
* support google gemini tts

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* update speech utils version

* wip
2026-01-22 10:12:06 -05:00
Sam Machin
2468557aef add statusHook to redirect verb (#1500)
* add statusHook to redirect

* fix url import

* Update redirect.js

* logging

* constructor statusHook

* lint n logging

* debug

* update call_status_hook

* use notifier to test url

* remove require url as its global since node 10

* update verb specs dep

* update verb specs
2026-01-21 09:20:47 -05:00
Dave Horton
3c3dfa81d3 fix issue where call hangs up and actionhook delay triggered (#1497) 2026-01-19 16:42:43 -05:00
Vinod Dharashive
961c2589ac freeswitch capture sip error and propagate the same error (#1489)
* fix: propagate SIP 488 error to SBC on endpoint allocation failure

When FreeSWITCH returns a SIP 488 'Not Acceptable Here' error during
endpoint allocation (e.g., codec incompatibility), this error was not
being propagated back to the SBC/client. Instead, the call would wait
indefinitely for websocket commands or return a generic 603 response.

Implementation:
- In _evalEndpointPrecondition(), detect SipError by checking
  err.type === 'SipError' or err.name === 'SipError'
- Extract the SIP status code (e.g., 488), reason, and the Reason
  header from the error response (e.g., Q.850;cause=88;text=INCOMPATIBLE_DESTINATION)
- Send the SIP error response immediately to the SBC with:
  - X-Reason header: endpoint allocation failure details
  - Reason header: original Q.850 cause from FreeSWITCH
- Notify call status change as Failed with proper SIP status
- Release the call immediately instead of waiting for commands

Also added fallback handling in InboundCallSession._onTasksDone() to
propagate the stored error if immediate send was not possible.

* wip

* Simplify SipError check to only use err.name
2026-01-13 08:58:37 -05:00
Hoan Luu Huu
e4ec0025c3 Fix/gladia multi sessions (#1487)
* support gladia transcribe multi channels

* wip
2026-01-07 10:46:33 -05:00
Ed Robbins
ba275ef547 #1485 remove deprecated call to URL.parse (#1488) 2026-01-05 15:53:32 -05:00
Dave Horton
83a8cf6d25 SIGUSR1 should cause fs to commence drying up calls but do not exit when call count reaches zero (#1486) 2026-01-04 12:23:26 -05:00
Sam Machin
09220872ae Update recording (#1483)
* refactor recording

removed the test of `(this.cs.accountInfo.account.record_all_calls || this.cs.application.record_all_calls` from backround-task-manager.jsL138 as this check is already done in call-session.js at Line 3007, also allows us to start the record from update or config verbs

* handle start recording for a call that is not yet answered

* return false if not changing recording state

* different check for status

* set hasRecording flag on callInfo when starting

* update redis on recording start

* lint

* update dependency
2026-01-02 11:05:38 -05:00
Dave Horton
fdce05fa40 add handler for SIGUSR1 to start drying up calls, useful as a generic mechanism on non-AWS deployments (#1482) 2025-12-30 13:31:42 -05:00
Sam Machin
3bd1dd6323 put removeListner in a try/catch (#1479)
* put removeListner in a try/catch

* typo
2025-12-19 13:31:06 -05:00
Ed Robbins
54dc172ebd Allow defining an ENV for specific webhook error return SIP code (#1476) 2025-12-16 17:14:42 -05:00
Hoan Luu Huu
e007e0e2d3 fixed callsession cannot close tts streaming (#1472) 2025-12-16 07:58:54 -05:00
Hoan Luu Huu
c5cd488fdf fixed gather should ignore transcription if task is killed/resolved. (#1465)
* fixed gather should ignore transcription if task is killed/resolved.

* wip
2025-12-12 09:03:08 -05:00
Sam Machin
57982335e0 add label to STT/TTS alerts (#1468)
* add label to STT/TTS alerts

* update time-series
2025-12-11 11:07:24 -05:00
Hoan Luu Huu
5cea91e18a add support for sending DTMF to ultravox (#1471) 2025-12-11 07:53:59 -05:00
Dave Horton
e396b6aa98 fix #1466: (#1467)
* fix #1466:

* do not send tts streaming events when we are not doing tts streaming
2025-12-09 09:43:53 -05:00
Vinod Dharashive
9104ebb603 Add configurable say chunk size (#1461) 2025-12-08 10:54:27 -05:00
Vinod Dharashive
1ad0261336 Enhance TTS sentence boundary detection for Arabic and Japanese (#1464)
Update sentenceEndRegex to treat the following as sentence boundaries: ASCII .!? followed by whitespace or end-of-text; Arabic question mark (؟) and full stop (۔) with the same rule; Japanese 。, !, ? treated as boundaries regardless of following character; and double newlines (\n\n). This improves streaming chunking for mixed-language content.
2025-12-08 10:44:20 -05:00
Hoan Luu Huu
7802822773 fixed dial verb cannot bridge 2 leg endpoints due to transcoding (#1457)
* fixed dial verb cannot bridge 2 leg endpoints due to transcoding

* wip
2025-12-03 07:16:25 -05:00
Hoan Luu Huu
edb4d21ce1 fixed undefine issue when setting tts streaming channel vars (#1456) 2025-12-02 19:46:28 -05:00
Dave Horton
8048e9cf88 when dialing the B leg we check to see if we are using opus on the A leg, and if so we outdial B with opus first; however we were incorrectly checking the SDP on the A leg invite not the 200 OK we send back (#1455) 2025-12-02 19:22:20 -05:00
Sam Machin
451feafed4 use timeout on HTTP requests (#1453) 2025-12-02 07:41:47 -05:00
Ed Robbins
7f1543a0f3 Add ability to enable/disable Azure audio logging via azureOptions (#1432) 2025-11-30 11:56:56 -05:00
Hoan Luu Huu
83955ba972 SoundHound support audio endpoint from speech credential (#1446)
* SoundHound support audio endpoint from speech credential

* add requestInfo and sampleRate to houndify channel variable

* add requestInfo and sampleRate to houndify channel variable

* wip

* wip

* wip

* wip

* wip

* wip

* wip
2025-11-30 11:55:20 -05:00
Hoan Luu Huu
a5fa5fce5b Fixed transcribe 2 legs cannot fallback (#1451)
* fixed transcribe cannot fallback for specific endpoint

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip
2025-11-28 21:43:05 -05:00
Dave Horton
cc1751f500 fix race condition where gather resolves with speech transcript but t… (#1449)
* fix race condition where gather resolves with speech transcript but timeout timer gets set after the resolve and is left running after gather completes

* remove unneeded line of code
2025-11-27 11:44:49 -06:00
Ed Robbins
1a1f53aede Compare sdp to determine if transcoding is being used. (#1444)
* compare sdp for transcoding

* refactor sdp check for leading codec

* fix reference to epOther

* minor changes

* minor

* fix #1447

* fix security issue

* use convenience getter appIsUsingWebsockets in CallSession

---------

Co-authored-by: Dave Horton <daveh@beachdognet.com>
2025-11-24 10:50:41 -06:00
Hoan Luu Huu
1984b6d3ea allow say verb failed as NonFatalTaskError for File Not Found (#1443)
* allow say verb failed as NonFatalTaskError for File Not Found

* wip
2025-11-20 07:22:28 -05:00
Hoan Luu Huu
769b66f57e fixed playbackIds is not in correct order compare with say.text array (#1439)
* fixed playbackIds is not in correct order compare with say.text array

* wip

* wip
2025-11-19 19:00:44 -05:00
Hoan Luu Huu
98b845f489 fix say verb does not close streaming when finish say (#1412)
* fix say verb does not close streaming when finish say

* wip

* wip

* ttsStreamingBuffer reset eventHandlerCount after remove listeners

* only send tokens to module if connected

* wip

* sent stream_open when successfully connected to vendor
2025-11-17 08:56:09 -05:00
Ed Robbins
f92b1dbc97 Add ability to override certain tts streaming options via the config … (#1429)
* Add ability to override certain tts streaming options via the config verb.

* Update to null operator(??), support parameter override via config
2025-11-12 13:54:01 -05:00
Dave Horton
0442144793 fix bug escaping backspace character 2025-11-03 15:33:59 -05:00
Hoan Luu Huu
2de24af169 fixed gather does not start timeout on bargin (#1421)
* fixed gather does not start timeout on bargin

* with previous change, no need to emit playDone since no where in the code are we listening for it

---------

Co-authored-by: Dave Horton <daveh@beachdognet.com>
2025-11-03 13:11:59 -05:00
Dave Horton
a884880321 fix for #1422 (#1423)
* fix for #1422

* fix prev commit
2025-11-03 12:53:43 -05:00
Dave Horton
b307df79d0 update deps (#1417) 2025-10-31 07:31:32 -04:00
Hoan Luu Huu
77bd11dd47 update speech util 0.2.26 (#1416) 2025-10-31 07:14:38 -04:00
Hoan Luu Huu
46d56fe546 fd_1574: should not send only whitespace to streaming tts engine (#1415) 2025-10-30 20:59:25 -04:00
Hoan Luu Huu
30ab281ea2 support disableTtsCache from config verb (#1410) 2025-10-28 08:19:03 -04:00
Sam Machin
0869a73052 add distributeDtmf to conference (#1401)
* add distributeDtmf to conference

* lint

* bump verb specs
2025-10-21 11:20:12 -04:00
Sam Machin
a0a579ccee escape json special chars in metadata (#1399) 2025-10-20 10:30:03 -04:00
Sam Machin
4218653852 add customerData on transferred calls (#1391)
* add customerData on transferred calls

* change to if statement
2025-10-20 09:20:12 -04:00
Hoan Luu Huu
89cc39f726 support gladia stt (#1397)
* support gladia stt

* wip

* wip

* update verb specification
2025-10-20 04:56:39 -04:00
Sam Machin
b231593bff bump dbhelpers for cache change (#1396) 2025-10-15 11:38:43 -04:00
Sam Machin
4309d25376 don't encode querystring if its the filename (#1395)
* don't encode querystring if its the filename

* lint

* update link to issue

u
2025-10-14 10:48:50 -04:00
Hoan Luu Huu
a00703a067 support houndify stt (#1364)
* support houndify stt

* wip

* wip

* wip

* update houndify stt parameters

* wip

* wip
2025-10-14 00:55:21 -04:00
Hoan Luu Huu
89c985b564 fixed does not send final status call back if call canceled quickly (#1393)
* fixed callsession should cleanup resource if call was canceled while fetching app

* wip

* wip

* wip

* wip

* wip
2025-10-11 03:44:42 -04:00
Dave Horton
b4ed4c8c46 #1385: Gather - dont start the continuous asr timer when we first start listening if this is a background gather (#1386) 2025-10-09 08:47:51 -04:00
Hoan Luu Huu
581d309f36 support elevenlabs different endpoint (#1387)
* support elevenlabs different endpoint

* wip

* wip
2025-10-09 08:19:40 -04:00
Sam Machin
d1baf2fe37 if call is transferred from another FS then always answer (#1383)
Currently if the call being transferred was originally an outbound call then the direction thats retrieved from redis is outbound and the invite of the refer from the other FS is never answered,
However a transferredCall will always need to be answered regardless of CallDirection
2025-10-07 07:19:11 -04:00
Dave Horton
28bf0d3477 send eager_eot events (#1382) 2025-10-06 16:50:20 -04:00
Dave Horton
d2d3b4583e Fix/flux cleanup (#1379)
* for deepgram flux include the turn taking events in the transcription payload

* for deepgram flux, including turn_taking_event in the speech payload

* fix prev commit which used wrong field
2025-10-04 20:06:38 -04:00
Hoan Luu Huu
854c26db11 support deepgramflux (#1373)
* support deepgramflux

* wip

* wip

* wip

* wip

* update verb scpecification
2025-10-03 10:38:39 -04:00
Dave Horton
e77666a1a7 update speech-utils and drachtio-srf (#1377) 2025-10-03 10:06:37 -04:00
Ed Robbins
5acb19225b If an error occurs during initial TTS request, propagate the error (#1369)
* If an error occurs during initial TTS request, propagate the error

* fix missing semicolon

* fix jslint error

* add null check to r.playbackMilliseconds
2025-10-01 00:02:51 -04:00
Dave Horton
1d6f84c2d7 add event handler for when deepgram closes with an error (#1372) 2025-09-28 14:18:56 -04:00
Vinod Dharashive
de9b970a93 Update example-voicemail-greetings.json (#1367)
Add JP greetin
2025-09-23 09:30:40 -04:00
rammohan-y
ec786ef1dd Fix for sending synthesized-audio verb:status event when using TTS streaming (#1366)
https://github.com/jambonz/jambonz-feature-server/issues/1365
2025-09-23 09:30:05 -04:00
Sam Machin
a95a6d1683 clear main timout when interdigit timeout is started, (#1351)
* clear main timout when interdigit timeout is started,

also clear the asrTimeout when dtmf has taken priority

* lint

* lint
2025-09-12 09:01:51 -04:00
Dave Horton
65b3066866 catch exceptions from req.cancel() (#1359)
* catch exceptions from req.cancel()

* catch other instances of req.cancel

* fix prev commit
2025-09-11 12:25:36 -04:00
Hoan Luu Huu
057f52e56c speech_util v0.2.23 (#1358) 2025-09-11 01:30:31 -04:00
Hoan Luu Huu
b46be57eba singleDialer should create ConfirmCallSession with correct tmpFiles (#1357) 2025-09-10 22:37:41 -04:00
Hoan Luu Huu
f950d19d1c fix ConfirmCallSession in placeCall does not have access to tmpFiles for removing tmp file later (#1356) 2025-09-10 19:39:43 -04:00
pk32495
859132bb1c Fixed token missing log line. (#1354) 2025-09-10 15:03:56 -04:00
Dave Horton
acaadceaa2 fix exception when receiving REFER but dial task ended (#1353) 2025-09-10 12:23:32 -04:00
Hoan Luu Huu
add8d63e8e tts stream should not print speech credential (#1352) 2025-09-09 19:18:49 -04:00
Sam Machin
a05b72a420 Fix/1345 (#1349)
* don't try and guess carrier if LCR is set

* lint

* update dbHelpers dep
2025-09-06 14:15:16 -04:00
rammohan-y
28ff85225f Fixed issue for punctuation (#1344)
https://github.com/jambonz/jambonz-feature-server/issues/1343
2025-09-03 13:33:38 -04:00
Dave Horton
f2fe7c4d24 Fix/playback race by fs generates playback (#1331)
* update to speech-utils that generates playback id

* modify tts and say task to track current playback id and match against start and stop events

* bump speech utils

* wip

* wip

* fix race condition where say with playbackId gets stop event from previous play from cache file

* logging

* wip

* fix comparison when playing cached files

* logging
2025-08-26 09:39:25 -04:00
Dave Horton
97408c7d3b fix uncaught exception with llm streaming 2025-08-22 13:24:59 -04:00
Dave Horton
db5f0a0dce Feat/startup logging (#1333)
* turn down some logging

* add startup logging

* wip

* lint
2025-08-21 14:09:20 -04:00
Sam Machin
654ccd9d9d start timeout on bargein digits (#1312) 2025-08-19 08:34:33 -04:00
Dave Horton
ea27b20ac5 update speech-utils (#1324) 2025-08-17 10:09:55 -04:00
Hoan Luu Huu
96aa705378 update speech util verbsion (#1323) 2025-08-14 08:39:12 -04:00
rammohan-y
5e51849839 Sending synthesized-audio notification for servedFromCache as false (#1320)
* Sending synthesized-audio notification for servedFromCache as well
https://github.com/jambonz/jambonz-feature-server/issues/1319

* Sending back the id that was set, to track the synthesized-audio
e.g if we send a say verb having 100, it's synthesized-audio event will return 100 in the data to correleate the say verb and synthesized-audio event
2025-08-13 20:56:56 -04:00
Hoan Luu Huu
44f69fa76d Support resemble tts (#1322)
* support resemble tts

* update speech utils version
2025-08-13 08:15:29 -04:00
Dave Horton
73c77bea71 add support for deepgram entity_prompt 2025-08-11 20:58:14 -04:00
rammohan-y
babc0d0dbb Fix for issue https://github.com/jambonz/jambonz-feature-server/issues/1317. (#1318)
Unable to use mod_aws_transcribe module due to security error as sessionId is not populated
2025-08-11 09:17:32 -04:00
Hoan Luu Huu
6b8d0fe1a8 update db-helper 0.9.16 (#1316) 2025-08-08 10:43:59 -04:00
Dave Horton
66bb466297 fix bug where task.kill is not passed cs (#1315) 2025-08-07 16:01:45 -04:00
Dave Horton
1933f4ec0b Feat/freeswitch logging (#1309)
* include callSid on INVITEs to freeeswitch

* remove unnecessary warning
2025-08-04 09:19:47 -04:00
Sam Machin
b1089a1ae9 pass recogniser opts in amd to stt (#1308) 2025-08-01 22:26:51 -04:00
sathish kumar pasham
93e06d887e Fix security vulnerabilities by upgrading @jambonz/realtimedb-helpers (#1305) 2025-07-30 13:24:19 -04:00
Sam Machin
b478e0ecd2 fix assert, and force methods to upper case (#1304)
* fix assert, and force methods to upper case

* add alert for updateCall errors

* lint

* handle missing method
2025-07-30 08:32:15 -04:00
Sam Machin
94d43d4b70 use tmpFiles list of parent call-session (#1301)
fixes #1299
2025-07-29 22:08:00 -04:00
Hoan Luu Huu
eb449e9169 support deepgram river (#1273)
* support deepgram river

* wip

* rebase

* fix review comment

---------

Co-authored-by: Dave Horton <daveh@beachdognet.com>
2025-07-29 13:49:43 -04:00
Hoan Luu Huu
158d9d7d25 support stt latency metrics (#1252)
* support stt latency metrics

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* enable stt latency calculator by config verb

* wip

* wip

* wip

* fix jslint

* fixed gather timeout does not have latency calculation

* upadte verb specification to use notifySttLatency

* move stt latency metric from call session to stt-latency calculator

* wip
2025-07-29 09:56:37 -04:00
Hoan Luu Huu
5886d1d945 allow pause/resume background listen with silence/blank (#1300)
* allow pause/resume background listen with silence/blank

* wip

* wip

* wip

* wip

* update drachtio-fsmrf version
2025-07-28 07:58:30 -04:00
Hoan Luu Huu
352106ec0c support referTo with tel: prefix (#1297)
* support referTo with tel: prefix

* fix review comment
2025-07-23 10:52:10 -04:00
Sam Machin
05a6bf51a7 corrected typos (#1295)
🧙‍♀️
2025-07-21 07:22:51 -04:00
Sam Machin
bd1c763e72 urlencode the . in a url querystring on play (#1293)
* urlencode the . in a url querystring on play

* lint

* fix for non querystring urls too

* lint
2025-07-20 14:03:21 -04:00
Sam Machin
d831a4ca7f don't fetch if whisper is an object with a single verb in it (#1290)
* don't fetch if whisper is an object with a single verb in it

* disable URL fetching of verbs on whisper

* fixes for lint

* lint
2025-07-20 13:47:31 -04:00
Sam Machin
0cc6ea987f Fix/1277 (#1278)
* wip

* forwardPAI control in verb

* update deps

* package-loxck

* Update package-lock.json
2025-07-20 13:26:33 -04:00
rammohan-y
6e7521c91c Fix for issue where we should not delay from gather when JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS is set (#1283)
* Fix for issue https://github.com/jambonz/jambonz-feature-server/issues/1281

* Commented the reason for the change
2025-07-17 10:36:22 -04:00
Hoan Luu Huu
e0e2ade289 fixed gather cannot timeout if listenDuringPrompt is true (#1276)
* fixed gather cannot timeout if listenDuringPrompt is true

* wip

* wip

* wip
2025-07-15 22:53:22 -04:00
Dave Horton
ee895b4046 bump version 2025-07-15 11:40:14 -04:00
Dave Horton
2a42ccb0e1 fix for #1285 (#1286)
* fix for #1285

* typo
2025-07-15 11:38:48 -04:00
Dave Horton
62b6a814b7 fixes for LCC dial where a relative url is given as actionhook (#1282)
* fixes for LCC dial where a relative url is given as actionhook

* update to speech-utils 0.2.15 with configurable tmp folder location
2025-07-13 11:08:15 -04:00
Dave Horton
e415420150 Fix/cached audio race (#1279)
* fix for race condition when play-stop event from earlier command received

* wip

* say verb should not cache if disableTtsCache = true

* logging

* modify race condition logic to validate playback id in playback-stopped matches that from playback-start

* logging

---------

Co-authored-by: Quan HL <quan.luuhoang8@gmail.com>
2025-07-13 10:56:41 -04:00
Sam Machin
e6e039e0f2 add alert verb (#1270)
* add alert verb

* update dependencies

* Update package-lock.json

* remove await taskDone
2025-07-10 07:39:42 -04:00
Dave Horton
657e2d4a49 update speech-utils (#1275) 2025-07-09 16:10:08 -04:00
Hoan Luu Huu
337c1cded0 fixed transcription is not received when call is terminated (#1259)
* fixed transcription is not received when call is terminated

* wip

* fixed failing testcases

* wip

* wip

* wip

* wip

* should not do gracefulshutdown on stopAmd
2025-07-09 10:20:09 -04:00
Dave Horton
444abcb036 (fd 1056) if no payload returned from referHook in dial then dont do anything (#1271) 2025-07-07 12:50:46 -04:00
Hoan Luu Huu
c82a835e70 FD_1079: tts modules response_code = 0 should make say fail to exec. (#1269)
* FD_1079: tts modules response_code = 0 should make say fail to exec.

* fixed tts azure does not have response code

* fixed tts error does not raise alarm

* wip

* fixed
2025-07-07 07:59:29 -04:00
Hoan Luu Huu
3c185d4bd2 consistent assemblyAiOptions name (#1267)
* consistent assemblyAiOptions  name

* update verb specification version
2025-07-03 08:05:56 -04:00
Hoan Luu Huu
ba2049b705 support assemblyai v3 (#1265)
* support assemblyai v3

* wip

* wip

* wip

* wip

* wip

* wip
2025-07-01 15:46:19 -04:00
Dave Horton
7691af30de Fix/dial refer (#1264)
* Revert "Update dial.js (#1243)"

This reverts commit 259dedcded.

* add to .gitignore

* when we receive a REFER on the parent leg, after adulting the child the dial task in the parent session should end
2025-06-28 15:01:09 -04:00
Hoan Luu Huu
ab83b21979 support inworld tts (#1262)
* support inworld tts

* wip

* wip
2025-06-27 10:05:18 -04:00
Hoan Luu Huu
f18b62e165 support ultravox agent id (#1254)
* support ultravox agent id

* wip
2025-06-23 09:37:10 -04:00
Dave Horton
f98bf2a1f8 update to drachtio-fsmrf@4.0.4 (#1255) 2025-06-22 15:18:30 +02:00
Hoan Luu Huu
8c67c05d87 fixed bargin task loop forever (#1253) 2025-06-22 06:20:24 +02:00
Hoan Luu Huu
3f11ee58a7 fixed dub playOnTrack loop forever (#1247) 2025-06-18 21:37:32 +02:00
rammohan-y
c8d94026ff Removing video sdp when making an outbound call (#1242) 2025-06-18 21:31:00 +02:00
Hoan Luu Huu
5be6c54339 support mod_cartesia_transcribe (#1245) 2025-06-17 20:54:26 +02:00
Sam Machin
259dedcded Update dial.js (#1243) 2025-06-13 18:23:40 +02:00
Dave Horton
b70fea69cc Fix/dial unhandled rejection (#1239)
* fix bug with race condition in dial which could spike cpu due to unhandled exception

* update to 0.2.12 speech-utils with azure ssml fix

* minor
2025-06-11 11:40:51 +02:00
leedia-tech
2bea7e83e1 Update example-voicemail-greetings.json: included Italian (it-IT) voicemail response patterns based on common operator messages (#1226) 2025-06-11 11:03:51 +02:00
Rohan Sreerama
812076d4fe fix(create-call): fix url parsing (#1235) 2025-06-11 11:01:44 +02:00
Hoan Luu Huu
b0b74871e7 support say stream with text (#1227)
* support say stream with text

* wip

* wip

* wip

* wip

* update verb  specification
2025-06-10 16:56:44 +02:00
Hoan Luu Huu
29708a1f7c clear log from ws-requestor (#1238)
* clear log from ws-requestor

* wip

* wip
2025-06-10 10:34:33 +02:00
Rohan Sreerama
e686a11808 fix(play): fixes ep null issue (#1233)
* fix(play): fixes ep null issue

* chore(play): fixing formatting

* chore(play): update syntax

* chore(play): updates

* fix(play): updated fix

* fix(play): updated fix

---------

Co-authored-by: rsreerama3 <rsreerama3@gatech.edu>
2025-06-05 14:57:35 -04:00
Dave Horton
25f58d2e43 fixes #1230 (#1231) 2025-06-04 13:42:49 -04:00
55 changed files with 7012 additions and 2640 deletions

3
.gitignore vendored
View File

@@ -2,6 +2,9 @@
logs
*.log
.claude/
CLAUDE.md
# Runtime data
pids
*.pid

6
app.js
View File

@@ -29,6 +29,12 @@ const {LifeCycleEvents, FS_UUID_SET_NAME, SystemState, FEATURE_SERVER} = require
const installSrfLocals = require('./lib/utils/install-srf-locals');
const createHttpListener = require('./lib/utils/http-listener');
const healthCheck = require('@jambonz/http-health-check');
const ProcessMonitor = require('./lib/utils/process-monitor');
const monitor = new ProcessMonitor(logger);
// Log startup
monitor.logStartup();
monitor.setupSignalHandlers();
logger.on('level-change', (lvl, _val, prevLvl, _prevVal, instance) => {
if (logger !== instance) {

View File

@@ -163,5 +163,72 @@
"wird sich bei Ihnen melden",
"ich melde mich bei dir",
"wir können nicht"
],
"it-IT": [
"segreteria telefonica",
"risponde la segreteria telefonica",
"lascia un messaggio",
"puoi lasciare un messaggio dopo il segnale",
"dopo il segnale acustico",
"il numero chiamato non è raggiungibile",
"non è raggiungibile",
"lascia pure un messaggio",
"puoi lasciare un messaggio"
],
"ja-JP": [
"この通話は留守番電話に転送されました",
"発信先は現在電話に出ることができません",
"発信音の後でメッセージを録音してください",
"録音を完了したら電話を切ることができます",
"只今電話に出ることができません",
"ただ今電話に出ることができません",
"ただいま電話に出ることができません",
"ピーという発信音の後にお名前とご用件をお話しください",
"ファックスを送られる方はスタートボタンを押してください",
"FAXを送られる方はスタートボタンを押してください",
"おかけになった電話をお呼びしましたが",
"お出になりません",
"おでになりません",
"お掛けになった電話番号は",
"おかけになった電話番号は",
"お掛けになった電話は",
"おかけになった電話は",
"現在使われておりません",
"番号をお確かめになって",
"お掛け直し下さい",
"おかけ直し下さい",
"おかけ直しください",
"こちらはNTTドコモです",
"こちらはエーユーです",
"こちらはソフトバンクです",
"電波の届かない",
"電源が入っていない",
"掛かりません",
"かかりません",
"お繋ぎすることが出来ません",
"お繋ぎ出来ません",
"お繋ぎすることができません",
"お繋ぎできません",
"おつなぎすることができません",
"おつなぎできません",
"メッセージを録音",
"留守番電話",
"お留守番サービス",
"留守番",
"留守電",
"留守",
"接続します",
"合図の音",
"ピーと",
"発信音",
"ご用件",
"伝言",
"お話しください",
"ファックス",
"FAX",
"終了",
"終了しました",
"終了いたしました",
"営業時間"
]
}

View File

@@ -119,7 +119,7 @@ const ENCRYPTION_SECRET = process.env.ENCRYPTION_SECRET;
const HTTP_POOL = process.env.HTTP_POOL && parseInt(process.env.HTTP_POOL);
const HTTP_POOLSIZE = parseInt(process.env.HTTP_POOLSIZE, 10) || 10;
const HTTP_PIPELINING = parseInt(process.env.HTTP_PIPELINING, 10) || 1;
const HTTP_TIMEOUT = 10000;
const HTTP_TIMEOUT = parseInt(process.env.JAMBONES_HTTP_TIMEOUT, 10) || 10000;
const HTTP_PROXY_IP = process.env.JAMBONES_HTTP_PROXY_IP;
const HTTP_PROXY_PORT = process.env.JAMBONES_HTTP_PROXY_PORT;
const HTTP_PROXY_PROTOCOL = process.env.JAMBONES_HTTP_PROXY_PROTOCOL || 'http';
@@ -130,7 +130,7 @@ const OPTIONS_PING_INTERVAL = parseInt(process.env.OPTIONS_PING_INTERVAL, 10) ||
const JAMBONZ_RECORD_WS_BASE_URL = process.env.JAMBONZ_RECORD_WS_BASE_URL || process.env.JAMBONES_RECORD_WS_BASE_URL;
const JAMBONZ_RECORD_WS_USERNAME = process.env.JAMBONZ_RECORD_WS_USERNAME || process.env.JAMBONES_RECORD_WS_USERNAME;
const JAMBONZ_RECORD_WS_PASSWORD = process.env.JAMBONZ_RECORD_WS_PASSWORD || process.env.JAMBONES_RECORD_WS_PASSWORD;
const JAMBONZ_DISABLE_DIAL_PAI_HEADER = process.env.JAMBONZ_DISABLE_DIAL_PAI_HEADER || false;
const JAMBONZ_DIAL_PAI_HEADER = process.env.JAMBONZ_DIAL_PAI_HEADER || false;
const JAMBONES_DISABLE_DIRECT_P2P_CALL = process.env.JAMBONES_DISABLE_DIRECT_P2P_CALL || false;
const JAMBONES_EAGERLY_PRE_CACHE_AUDIO = parseInt(process.env.JAMBONES_EAGERLY_PRE_CACHE_AUDIO, 10) || 0;
@@ -139,6 +139,11 @@ const JAMBONES_USE_FREESWITCH_TIMER_FD = process.env.JAMBONES_USE_FREESWITCH_TIM
const JAMBONES_DIAL_SBC_FOR_REGISTERED_USER = process.env.JAMBONES_DIAL_SBC_FOR_REGISTERED_USER || false;
const JAMBONES_MEDIA_TIMEOUT_MS = process.env.JAMBONES_MEDIA_TIMEOUT_MS || 0;
const JAMBONES_MEDIA_HOLD_TIMEOUT_MS = process.env.JAMBONES_MEDIA_HOLD_TIMEOUT_MS || 0;
const JAMBONES_WEBHOOK_ERROR_RETURN = parseInt(process.env.JAMBONES_WEBHOOK_ERROR_RETURN, 10) || 480;
/* say / tts */
const JAMBONES_SAY_CHUNK_SIZE = parseInt(process.env.JAMBONES_SAY_CHUNK_SIZE, 10) || 900;
// jambonz
const JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS =
process.env.JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS;
@@ -225,11 +230,13 @@ module.exports = {
JAMBONZ_RECORD_WS_BASE_URL,
JAMBONZ_RECORD_WS_USERNAME,
JAMBONZ_RECORD_WS_PASSWORD,
JAMBONZ_DISABLE_DIAL_PAI_HEADER,
JAMBONZ_DIAL_PAI_HEADER,
JAMBONES_DISABLE_DIRECT_P2P_CALL,
JAMBONES_USE_FREESWITCH_TIMER_FD,
JAMBONES_DIAL_SBC_FOR_REGISTERED_USER,
JAMBONES_MEDIA_TIMEOUT_MS,
JAMBONES_MEDIA_HOLD_TIMEOUT_MS,
JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS
JAMBONES_SAY_CHUNK_SIZE,
JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS,
JAMBONES_WEBHOOK_ERROR_RETURN
};

View File

@@ -13,10 +13,11 @@ const WsRequestor = require('../../utils/ws-requestor');
const RootSpan = require('../../utils/call-tracer');
const dbUtils = require('../../utils/db-utils');
const { decrypt } = require('../../utils/encrypt-decrypt');
const { mergeSdpMedia, extractSdpMedia } = require('../../utils/sdp-utils');
const { mergeSdpMedia, extractSdpMedia, removeVideoSdp } = require('../../utils/sdp-utils');
const { createCallSchema, customSanitizeFunction } = require('../schemas/create-call');
const { selectHostPort } = require('../../utils/network');
const { JAMBONES_DIAL_SBC_FOR_REGISTERED_USER } = require('../../config');
const { createMediaEndpoint } = require('../../utils/media-endpoint');
const removeNullProperties = (obj) => (Object.keys(obj).forEach((key) => obj[key] === null && delete obj[key]), obj);
const removeNulls = (req, res, next) => {
@@ -67,7 +68,7 @@ router.post('/',
const {
lookupAppBySid
} = srf.locals.dbHelpers;
const {getSBC, getFreeswitch} = srf.locals;
const {getSBC} = srf.locals;
let sbcAddress = getSBC();
if (!sbcAddress) throw new Error('no available SBCs for outbound call creation');
const target = restDial.to;
@@ -146,7 +147,7 @@ router.post('/',
// find handling sbc sip for called user
if (JAMBONES_DIAL_SBC_FOR_REGISTERED_USER && target.type === 'user') {
const { registrar } = srf.locals.dbHelpers;
const { registrar} = srf.locals.dbHelpers;
const reg = await registrar.query(target.name);
if (reg) {
sbcAddress = selectHostPort(logger, reg.sbcAddress, 'tcp')[1];
@@ -158,7 +159,9 @@ router.post('/',
* trunk isn't specified,
* check if from-number matches any existing numbers on Jambonz
* */
if (target.type === 'phone' && !target.trunk) {
const { lookupLcrByAccount} = srf.locals.dbHelpers;
const lcrs = await lookupLcrByAccount(req.body.account_sid);
if (target.type === 'phone' && !target.trunk && lcrs.length == 0) {
const str = restDial.from || '';
const callingNumber = str.startsWith('+') ? str.substring(1) : str;
const voip_carrier_sid = await lookupCarrierByPhoneNumber(req.body.account_sid, callingNumber);
@@ -170,9 +173,7 @@ router.post('/',
}
/* create endpoint for outdial */
const ms = getFreeswitch();
if (!ms) throw new Error('no available Freeswitch for outbound call creation');
const ep = await ms.createEndpoint();
const ep = await createMediaEndpoint(srf, logger);
logger.debug(`createCall: successfully allocated endpoint, sending INVITE to ${sbcAddress}`);
/* launch outdial */
@@ -181,10 +182,14 @@ router.post('/',
let localSdp = ep.local.sdp;
if (req.body.dual_streams) {
dualEp = await ms.createEndpoint();
dualEp = await createMediaEndpoint(srf, logger);
localSdp = mergeSdpMedia(localSdp, dualEp.local.sdp);
}
if (process.env.JAMBONES_VIDEO_CALLS_ENABLED_IN_FS) {
logger.debug('createCall: removing video sdp');
localSdp = removeVideoSdp(localSdp);
ep.modify(localSdp);
}
const connectStream = async(remoteSdp) => {
if (remoteSdp !== sdp) {
sdp = remoteSdp;
@@ -235,6 +240,7 @@ router.post('/',
if (app.call_hook.url === app.call_status_hook?.url || !app.call_status_hook?.url) {
logger.debug('reusing websocket for call status hook');
app.notifier = app.requestor;
app.call_status_hook = app.call_hook;
}
}
else {
@@ -285,7 +291,7 @@ router.post('/',
}, {
...(account.enable_debug_log && {level: 'debug'})
});
app.requestor.logger = app.notifier.logger = sipLogger;
app.requestor.logger = app.notifier.logger = restDial.logger = sipLogger;
const callInfo = new CallInfo({
direction: CallDirection.Outbound,
req: inviteReq,

View File

@@ -116,8 +116,8 @@ const customSanitizeFunction = (value) => {
/* trims characters at the beginning and at the end of a string */
value = value.trim();
/* Verify strings including 'http' via new URL */
if (value.includes('http')) {
// Only attempt to parse if the whole string is a URL
if (/^https?:\/\/\S+$/.test(value)) {
value = new URL(value).toString();
}
}

View File

@@ -12,7 +12,8 @@ const RootSpan = require('./utils/call-tracer');
const listTaskNames = require('./utils/summarize-tasks');
const {
JAMBONES_MYSQL_REFRESH_TTL,
JAMBONES_DISABLE_DIRECT_P2P_CALL
JAMBONES_DISABLE_DIRECT_P2P_CALL,
JAMBONES_WEBHOOK_ERROR_RETURN
} = require('./config');
const { createJambonzApp } = require('./dynamic-apps');
const { decrypt } = require('./utils/encrypt-decrypt');
@@ -112,6 +113,14 @@ module.exports = function(srf, logger) {
req.locals.callingNumber = sipURIs[1];
}
}
// Feature server INVITE request pipelines taking time to finish,
// while connecting and fetch application from db and invoking webhook.
// call can be canceled without any handling, so we add a listener here
req.once('cancel', (sipMsg) => {
logger.info(`${callId} got CANCEL request`);
req.locals.canceled = true;
});
next();
}
@@ -332,7 +341,7 @@ module.exports = function(srf, logger) {
}
// Resolve application.speech_synthesis_voice if it's custom voice
if (app2.speech_synthesis_vendor === 'google' && app2.speech_synthesis_voice.startsWith('custom_')) {
if (app2.speech_synthesis_vendor === 'google' && app2.speech_synthesis_voice?.startsWith('custom_')) {
const arr = /custom_(.*)/.exec(app2.speech_synthesis_voice);
if (arr) {
const google_custom_voice_sid = arr[1];
@@ -362,13 +371,14 @@ module.exports = function(srf, logger) {
});
// if transferred call contains callInfo, let update original data to newly created callInfo in this instance.
if (app.transferredCall && app.callInfo) {
const {direction, callerName, from, to, originatingSipIp, originatingSipTrunkName} = app.callInfo;
const {direction, callerName, from, to, originatingSipIp, originatingSipTrunkName, customerData} = app.callInfo;
req.locals.callInfo.direction = direction;
req.locals.callInfo.callerName = callerName;
req.locals.callInfo.from = from;
req.locals.callInfo.to = to;
req.locals.callInfo.originatingSipIp = originatingSipIp;
req.locals.callInfo.originatingSipTrunkName = originatingSipTrunkName;
if (customerData) req.locals.callInfo.customerData = customerData;
delete app.callInfo;
}
next();
@@ -471,7 +481,7 @@ module.exports = function(srf, logger) {
message: `${err?.message}`.trim()
}).catch((err) => this.logger.info({err}, 'Error generating alert for parsing application'));
logger.info({err}, `Error retrieving or parsing application: ${err?.message}`);
res.send(480, {headers: {'X-Reason': err?.message || 'unknown'}});
res.send(JAMBONES_WEBHOOK_ERROR_RETURN, {headers: {'X-Reason': err?.message || 'unknown'}});
app.requestor.close(WS_CLOSE_CODES.GoingAway);
}
}

View File

@@ -12,6 +12,7 @@ class CallInfo {
let srf;
this.direction = opts.direction;
this.traceId = opts.traceId;
this.hasRecording = false;
this.callTerminationBy = undefined;
if (opts.req) {
const u = opts.req.getParsedHeader('from');

View File

@@ -10,8 +10,9 @@ const {
RecordState,
AllowedSipRecVerbs,
AllowedConfirmSessionVerbs,
TtsStreamingEvents
} = require('../utils/constants');
TtsStreamingEvents,
ListenStatus
} = require('../utils/constants.json');
const moment = require('moment');
const assert = require('assert');
const sessionTracker = require('./session-tracker');
@@ -29,10 +30,6 @@ const {
JAMBONES_INJECT_CONTENT,
JAMBONES_EAGERLY_PRE_CACHE_AUDIO,
AWS_REGION,
JAMBONES_USE_FREESWITCH_TIMER_FD,
JAMBONES_MEDIA_TIMEOUT_MS,
JAMBONES_MEDIA_HOLD_TIMEOUT_MS,
JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS
} = require('../config');
const bent = require('bent');
const BackgroundTaskManager = require('../utils/background-task-manager');
@@ -40,7 +37,8 @@ const dbUtils = require('../utils/db-utils');
const BADPRECONDITIONS = 'preconditions not met';
const CALLER_CANCELLED_ERR_MSG = 'Response not sent due to unknown transaction';
const { NonFatalTaskError} = require('../utils/error');
const { sleepFor } = require('../utils/helpers');
const { createMediaEndpoint } = require('../utils/media-endpoint');
const SttLatencyCalculator = require('../utils/stt-latency-calculator');
const sqlRetrieveQueueEventHook = `SELECT * FROM webhooks
WHERE webhook_sid =
(
@@ -150,6 +148,30 @@ class CallSession extends Emitter {
this.conversationTurns = [];
this.on('userSaid', this._onUserSaid.bind(this));
this.on('botSaid', this._onBotSaid.bind(this));
/**
* Support STT latency
*/
this.sttLatencyCalculator = new SttLatencyCalculator({
logger,
cs: this
});
this.on('transcribe-start', () => {
this.sttLatencyCalculator.resetTime();
});
this.on('on-transcription', () => {
this.sttLatencyCalculator.onTranscriptionReceived();
});
this.on('transcribe-stop', () => {
this.sttLatencyCalculator.onTranscribeStop();
});
}
get notifySttLatencyEnabled() {
return this._notifySttLatencyEnabled || false;
}
set notifySttLatencyEnabled(enabled) {
this._notifySttLatencyEnabled = enabled;
}
/**
@@ -220,6 +242,18 @@ class CallSession extends Emitter {
this._synthesizer = synth;
}
/**
* Say stream enabled
*/
get autoStreamTts() {
return this._autoStreamTts || false;
}
set autoStreamTts(i) {
this._autoStreamTts = i;
}
/**
* ASR TTS fallback
*/
@@ -470,7 +504,12 @@ class CallSession extends Emitter {
}
get isTtsStreamEnabled() {
return this.backgroundTaskManager.isTaskRunning('ttsStream');
// 1st background tts stream
return this.backgroundTaskManager.isTaskRunning('ttsStream') ||
// 2nd current task streaming tts
TaskName.Say === this.currentTask?.name && this.currentTask?.isStreamingTts ||
// 3rd nested verb is streaming tts
TaskName.Gather === this.currentTask?.name && this.currentTask.sayTask?.isStreamingTts;
}
get isListenEnabled() {
@@ -624,6 +663,15 @@ class CallSession extends Emitter {
}
}
// disableTtsCache
get disableTtsCache() {
return this._disableTtsCache || false;
}
set disableTtsCache(d) {
this._disableTtsCache = d;
}
getTsStreamingVendor() {
let v;
if (this.currentTask?.isStreamingTts) {
@@ -676,7 +724,7 @@ class CallSession extends Emitter {
}
hasGlobalSttPunctuation() {
get hasGlobalSttPunctuation() {
return this._globalSttPunctuation !== undefined;
}
@@ -708,54 +756,101 @@ class CallSession extends Emitter {
return this._fillerNoise;
}
async notifyRecordOptions(opts) {
const {action} = opts;
const {action, silence = false, type = 'siprec'} = opts;
this.logger.debug({opts}, 'CallSession:notifyRecordOptions');
/* if we have not answered yet, just save the details for later */
if (!this.dlg) {
if (action === 'startCallRecording') {
this.recordOptions = opts;
return true;
if (type == 'cloud') {
switch (action) {
case 'pauseCallRecording':
if (this.backgroundTaskManager.isTaskRunning('record')) {
this.logger.debug({action, silence, type}, 'CallSession:cloudRecording');
const backgroundListenTask = this.backgroundTaskManager.getTask('record');
backgroundListenTask.updateListen(
ListenStatus.Pause,
silence
);
return true;
} else { return false; }
case 'resumeCallRecording':
if (this.backgroundTaskManager.isTaskRunning('record')) {
this.logger.debug({action, silence, type}, 'CallSession:cloudRecording');
const backgroundListenTask = this.backgroundTaskManager.getTask('record');
backgroundListenTask.updateListen(
ListenStatus.Resume,
silence
);
return true;
} else { return false; }
case 'startCallRecording':
if (!this.backgroundTaskManager.isTaskRunning('record')) {
this.logger.debug({action, silence, type}, 'CallSession:cloudRecording');
this.callInfo.hasRecording = true;
this.updateCallStatus(Object.assign({}, this.callInfo.toJSON()), this.serviceUrl)
.catch((err) => this.logger.error(err, 'redis error'));
if (!this.dlg) {
// Call not yet answered so set flag to record on status change
this.application.record_all_calls = true;
} else {
this.backgroundTaskManager.newTask('record');
}
return true;
} else { return false; }
case 'stopCallRecording':
if (this.backgroundTaskManager.isTaskRunning('record')) {
this.logger.debug({action, silence, type}, 'CallSession:cloudRecording');
this.backgroundTaskManager.stop('record');
return true;
} else { return false; }
}
} else {
// SIPREC
/* if we have not answered yet, just save the details for later */
if (!this.dlg) {
if (action === 'startCallRecording') {
this.recordOptions = opts;
return true;
}
return false;
}
return false;
}
/* check validity of request */
if (action == 'startCallRecording' && this.recordState !== RecordState.RecordingOff) {
this.logger.info({recordState: this.recordState},
'CallSession:notifyRecordOptions: recording is already started, ignoring request');
return false;
}
if (action == 'stopCallRecording' && this.recordState === RecordState.RecordingOff) {
this.logger.info({recordState: this.recordState},
'CallSession:notifyRecordOptions: recording is already stopped, ignoring request');
return false;
}
if (action == 'pauseCallRecording' && this.recordState !== RecordState.RecordingOn) {
this.logger.info({recordState: this.recordState},
'CallSession:notifyRecordOptions: cannot pause recording, ignoring request ');
return false;
}
if (action == 'resumeCallRecording' && this.recordState !== RecordState.RecordingPaused) {
this.logger.info({recordState: this.recordState},
'CallSession:notifyRecordOptions: cannot resume recording, ignoring request ');
return false;
}
/* check validity of request */
if (action == 'startCallRecording' && this.recordState !== RecordState.RecordingOff) {
this.logger.info({recordState: this.recordState},
'CallSession:notifyRecordOptions: recording is already started, ignoring request');
return false;
}
if (action == 'stopCallRecording' && this.recordState === RecordState.RecordingOff) {
this.logger.info({recordState: this.recordState},
'CallSession:notifyRecordOptions: recording is already stopped, ignoring request');
return false;
}
if (action == 'pauseCallRecording' && this.recordState !== RecordState.RecordingOn) {
this.logger.info({recordState: this.recordState},
'CallSession:notifyRecordOptions: cannot pause recording, ignoring request ');
return false;
}
if (action == 'resumeCallRecording' && this.recordState !== RecordState.RecordingPaused) {
this.logger.info({recordState: this.recordState},
'CallSession:notifyRecordOptions: cannot resume recording, ignoring request ');
return false;
}
this.recordOptions = opts;
this.recordOptions = opts;
switch (action) {
case 'startCallRecording':
return await this.startRecording();
case 'stopCallRecording':
return await this.stopRecording();
case 'pauseCallRecording':
return await this.pauseRecording();
case 'resumeCallRecording':
return await this.resumeRecording();
default:
throw new Error(`invalid record action ${action}`);
switch (action) {
case 'startCallRecording':
return await this.startRecording();
case 'stopCallRecording':
return await this.stopRecording();
case 'pauseCallRecording':
return await this.pauseRecording();
case 'resumeCallRecording':
return await this.resumeRecording();
default:
throw new Error(`invalid record action ${action}`);
}
}
}
@@ -869,7 +964,7 @@ class CallSession extends Emitter {
this.logger.debug('CallSession:enableBackgroundTtsStream - ttsStream enabled');
} else {
this.logger.debug(
'CallSession:enableBackgroundTtsStream - ignoring request as call does not have required conditions');
'CallSession:enableBackgroundTtsStream - ignoring request; conditions not met (probably not using ws api)');
}
} catch (err) {
this.logger.info({err, say}, 'CallSession:enableBackgroundTtsStream - Error creating background tts stream task');
@@ -883,15 +978,25 @@ class CallSession extends Emitter {
}
}
clearTtsStream() {
this.requestor?.request('tts:streaming-event', '/streaming-event', {event_type: 'user_interruption'})
.catch((err) => this.logger.info({err}, 'CallSession:clearTtsStream - Error sending user_interruption'));
this.ttsStreamingBuffer?.clear();
if (this.isTtsStreamEnabled) {
this.requestor?.request('tts:streaming-event', '/streaming-event', {event_type: 'user_interruption'})
.catch((err) => this.logger.info({err}, 'CallSession:clearTtsStream - Error sending user_interruption'));
this.ttsStreamingBuffer?.clear();
}
}
startTtsStream() {
this.ttsStreamingBuffer?.start();
}
stopTtsStream() {
if (this.isTtsStreamEnabled) {
this.requestor?.request('tts:streaming-event', '/streaming-event', {event_type: 'stream_closed'})
.catch((err) => this.logger.info({err}, 'CallSession:clearTtsStream - Error sending user_interruption'));
this.ttsStreamingBuffer?.stop();
}
}
async enableBotMode(gather, autoEnable) {
try {
let task;
@@ -915,7 +1020,7 @@ class CallSession extends Emitter {
task.sticky = autoEnable;
// listen to the bargein-done from background manager
this.backgroundTaskManager.on('bargeIn-done', () => {
if (this.requestor instanceof WsRequestor) {
if (this.appIsUsingWebsockets) {
try {
this.kill(true);
} catch (err) {}
@@ -968,8 +1073,6 @@ class CallSession extends Emitter {
(type === 'tts' && credential.use_for_tts) ||
(type === 'stt' && credential.use_for_stt)
)) {
this.logger.debug(
`${type}: ${credential.vendor} ${credential.label ? `, label: ${credential.label}` : ''} `);
if ('google' === vendor) {
if (type === 'tts' && !credential.tts_tested_ok ||
type === 'stt' && !credential.stt_tested_ok) {
@@ -979,7 +1082,7 @@ class CallSession extends Emitter {
const cred = JSON.parse(credential.service_key.replace(/\n/g, '\\n'));
return {
speech_credential_sid: credential.speech_credential_sid,
credentials: cred
credentials: cred,
};
} catch (err) {
const sid = this.accountInfo.account.account_sid;
@@ -1039,6 +1142,13 @@ class CallSession extends Emitter {
deepgram_stt_use_tls: credential.deepgram_stt_use_tls
};
}
else if ('gladia' === vendor) {
return {
speech_credential_sid: credential.speech_credential_sid,
api_key: credential.api_key,
region: credential.region,
};
}
else if ('soniox' === vendor) {
return {
speech_credential_sid: credential.speech_credential_sid,
@@ -1070,6 +1180,7 @@ class CallSession extends Emitter {
return {
api_key: credential.api_key,
model_id: credential.model_id,
api_uri: credential.api_uri,
options: credential.options
};
}
@@ -1085,6 +1196,7 @@ class CallSession extends Emitter {
return {
api_key: credential.api_key,
model_id: credential.model_id,
stt_model_id: credential.stt_model_id,
embedding: credential.embedding,
options: credential.options
};
@@ -1096,10 +1208,40 @@ class CallSession extends Emitter {
options: credential.options
};
}
else if ('resemble' === vendor) {
return {
api_key: credential.api_key,
resemble_tts_use_tls: credential.resemble_tts_use_tls,
resemble_tts_uri: credential.resemble_tts_uri,
};
}
else if ('inworld' === vendor) {
return {
api_key: credential.api_key,
model_id: credential.model_id,
options: credential.options
};
}
else if ('assemblyai' === vendor) {
return {
speech_credential_sid: credential.speech_credential_sid,
api_key: credential.api_key
api_key: credential.api_key,
service_version: credential.service_version
};
}
else if ('houndify' === vendor) {
return {
speech_credential_sid: credential.speech_credential_sid,
client_id: credential.client_id,
client_key: credential.client_key,
user_id: credential.user_id,
houndify_server_uri: credential.houndify_server_uri
};
}
else if ('deepgramflux' === vendor) {
return {
speech_credential_sid: credential.speech_credential_sid,
api_key: credential.api_key,
};
}
else if ('voxist' === vendor) {
@@ -1145,9 +1287,10 @@ class CallSession extends Emitter {
}
else {
writeAlerts({
alert_type: AlertType.STT_NOT_PROVISIONED,
alert_type: type === 'tts' ? AlertType.TTS_NOT_PROVISIONED : AlertType.STT_NOT_PROVISIONED,
account_sid: this.accountSid,
vendor,
label,
target_sid: this.callSid
}).catch((err) => this.logger.error({err}, 'Error writing tts alert'));
}
@@ -1178,6 +1321,7 @@ class CallSession extends Emitter {
this.ttsStreamingBuffer.on(TtsStreamingEvents.Pause, this._onTtsStreamingPause.bind(this));
this.ttsStreamingBuffer.on(TtsStreamingEvents.Resume, this._onTtsStreamingResume.bind(this));
this.ttsStreamingBuffer.on(TtsStreamingEvents.ConnectFailure, this._onTtsStreamingConnectFailure.bind(this));
this.ttsStreamingBuffer.on(TtsStreamingEvents.Connected, this._onTtsStreamingConnected.bind(this));
}
else {
this.logger.info(`CallSession:exec - not a normal call session: ${this.constructor.name}`);
@@ -1236,7 +1380,7 @@ class CallSession extends Emitter {
}
if (0 === this.tasks.length &&
this.requestor instanceof WsRequestor &&
this.appIsUsingWebsockets &&
!this.requestor.closedGracefully &&
!this.callGone &&
!this.isConfirmCallSession
@@ -1346,7 +1490,11 @@ class CallSession extends Emitter {
}
else {
if (this.req && !this.dlg) {
this.req.cancel();
try {
this.req.cancel();
} catch (err) {
this.logger.error({err}, 'CallSession:_lccCallStatus error cancelling request');
}
this._callReleased();
}
}
@@ -1409,6 +1557,14 @@ class CallSession extends Emitter {
method: 'POST'
};
}
/* if given a relative url then send to same base url as current session */
if (this.requestor._isRelativeUrl(opts.call_hook?.url)) {
opts.call_hook.url = `${this.requestor.baseUrl}${opts.call_hook?.url}`;
}
if (this.requestor._isRelativeUrl(opts.call_status_hook?.url)) {
opts.call_status_hook.url = `${this.requestor.baseUrl}${opts.call_status_hook?.url}`;
}
}
/**
@@ -1636,8 +1792,8 @@ Duration=${duration} `
async _lccWhisper(opts, callSid) {
const {whisper} = opts;
let tasks;
const b3 = this.b3;
const httpHeaders = b3 && {b3};
//const b3 = this.b3;
//const httpHeaders = b3 && {b3};
// this whole thing requires us to be in a Dial verb
const task = this.currentTask;
@@ -1646,12 +1802,15 @@ Duration=${duration} `
}
// allow user to provide a url object, a url string, an array of tasks, or a single task
if (typeof whisper === 'string' || (typeof whisper === 'object' && whisper.url)) {
// retrieve a url
const json = await this.requestor(opts.call_hook, this.callInfo.toJSON(), httpHeaders);
tasks = normalizeJambones(this.logger, json).map((tdata) => makeTask(this.logger, tdata));
}
else if (Array.isArray(whisper)) {
// Disable passing a URL as not functional Sam Machin - 17/07/2025
//if (typeof whisper === 'string' || (typeof whisper === 'object' && whisper.url && !whisper.verb)) {
// // retrieve a url
// const json = await this.requestor(opts.call_hook, this.callInfo.toJSON(), httpHeaders);
// tasks = normalizeJambones(this.logger, json).map((tdata) => makeTask(this.logger, tdata));
//}
//else if (Array.isArray(whisper)) {
if (Array.isArray(whisper)) {
// an inline array of tasks
tasks = normalizeJambones(this.logger, whisper).map((tdata) => makeTask(this.logger, tdata));
}
@@ -1782,7 +1941,7 @@ Duration=${duration} `
return;
}
else if (tokens === undefined) {
this.logger.info({opts}, 'CallSession:_lccTtsTokens - invalid command since id is missing');
this.logger.info({opts}, 'CallSession:_lccTtsTokens - invalid command since tokens is missing');
return this.requestor.request('tts:tokens-result', '/tokens-result', {
id,
status: 'failed',
@@ -1799,6 +1958,10 @@ Duration=${duration} `
.catch((err) => this.logger.debug({err}, 'CallSession:_notifyTaskStatus - Error sending'));
}
async _internalTtsStreamingBufferTokens(tokens) {
return await this.ttsStreamingBuffer?.bufferTokens(tokens) || {status: 'failed', reason: 'no tts streaming buffer'};
}
_lccTtsFlush(opts) {
this.ttsStreamingBuffer?.flush(opts);
}
@@ -1884,6 +2047,17 @@ Duration=${duration} `
}
} catch (err) {
this.logger.info({err, opts, callSid}, 'CallSession:updateCall - error updating call');
const {writeAlerts} = this.srf.locals;
try {
writeAlerts({
alert_type: 'error-updating-call',
account_sid: this.accountSid,
message: err.message,
target_sid: callSid
});
} catch (err) {
this.logger.error({err}, 'Error writing error-updating-call alert');
}
}
}
@@ -2285,23 +2459,22 @@ Duration=${duration} `
// need to allocate an endpoint
try {
if (!this.ms) this.ms = this.getMS();
const ep = await this.ms.createEndpoint({
const ep = await this._createMediaEndpoint({
headers: {
'X-Jambones-Call-ID': this.callId,
'X-Call-Sid': this.callSid,
},
remoteSdp: this.req.body
});
//ep.cs = this;
this.ep = ep;
this.logger.info(`allocated endpoint ${ep.uuid}`);
this._configMsEndpoint();
this.ep.on('destroy', () => {
this.logger.debug(`endpoint was destroyed!! ${this.ep.uuid}`);
});
if (this.direction === CallDirection.Inbound) {
if (this.direction === CallDirection.Inbound || this.application?.transferredCall) {
if (task.earlyMedia && !this.req.finalResponseSent) {
this.res.send(183, {body: ep.local.sdp});
return {ep};
@@ -2327,6 +2500,36 @@ Duration=${duration} `
}
else {
this.logger.error(err, `Error attempting to allocate endpoint for for task ${task.name}`);
// Check for SipError type (e.g., 488 codec incompatibility)
const isSipError = err.name === 'SipError';
if (isSipError && err.status) {
// Extract Reason header from SIP response if available (e.g., Q.850;cause=88;text="INCOMPATIBLE_DESTINATION")
const sipReasonHeader = err.res?.msg?.headers?.reason;
this._endpointAllocationError = {
status: err.status,
reason: err.reason || 'Endpoint Allocation Failed',
sipReasonHeader
};
this.logger.info({endpointAllocationError: this._endpointAllocationError},
'Captured SipError for propagation to SBC');
// Send SIP error response immediately for inbound calls
if (this.res && !this.res.finalResponseSent) {
this.logger.info(`Sending ${err.status} response to SBC due to SipError`);
this.res.send(err.status, {
headers: {
'X-Reason': `endpoint allocation failure: ${err.reason || 'Endpoint Allocation Failed'}`,
...(sipReasonHeader && {'Reason': sipReasonHeader})
}
});
this._notifyCallStatusChange({
callStatus: CallStatus.Failed,
sipStatus: err.status,
sipReason: err.reason || 'Endpoint Allocation Failed'
});
this._callReleased();
}
}
throw new Error(`${BADPRECONDITIONS}: unable to allocate endpoint`);
}
}
@@ -2366,9 +2569,6 @@ Duration=${duration} `
this.logger.error('CallSession:replaceEndpoint cannot be called without stable dlg');
return;
}
// When this call kicked out from conference, session need to replace endpoint
// but this.ms might be undefined/null at this case.
this.ms = this.ms || this.getMS();
// Destroy previous ep if it's still running.
if (this.ep?.connected) this.ep.destroy();
@@ -2393,8 +2593,7 @@ Duration=${duration} `
* This prevents call failures during media renegotiation.
*/
this.ep = await this.ms.createEndpoint();
this._configMsEndpoint();
this.ep = await this._createMediaEndpoint();
const sdp = await this.dlg.modify(this.ep.local.sdp);
await this.ep.modify(sdp);
@@ -2437,7 +2636,9 @@ Duration=${duration} `
this.backgroundTaskManager.stopAll();
this.clearOrRestoreActionHookDelayProcessor().catch((err) => {});
this.ttsStreamingBuffer?.stop();
this.stopTtsStream();
this.sttLatencyCalculator?.stop();
}
/**
@@ -2589,7 +2790,7 @@ Duration=${duration} `
*/
_onRefer(req, res) {
const task = this.currentTask;
const sd = task.sd;
const sd = task?.sd;
if (task && TaskName.Dial === task.name && sd && task.referHook) {
task.handleRefer(this, req, res);
}
@@ -2654,15 +2855,8 @@ Duration=${duration} `
async createOrRetrieveEpAndMs() {
if (this.ms && this.ep) return {ms: this.ms, ep: this.ep};
// get a media server
if (!this.ms) {
const ms = this.srf.locals.getFreeswitch();
if (!ms) throw new Error('no available freeswitch');
this.ms = ms;
}
if (!this.ep) {
this.ep = await this.ms.createEndpoint({remoteSdp: this.req.body});
this._configMsEndpoint();
this.ep = await this._createMediaEndpoint({remoteSdp: this.req.body});
}
return {ms: this.ms, ep: this.ep};
}
@@ -2815,8 +3009,7 @@ Duration=${duration} `
async reAnchorMedia(currentMediaRoute = MediaPath.PartialMedia) {
assert(this.dlg && this.dlg.connected && !this.ep);
this.ep = await this.ms.createEndpoint({remoteSdp: this.dlg.remote.sdp});
this._configMsEndpoint();
this.ep = await this._createMediaEndpoint({remoteSdp: this.dlg.remote.sdp});
await this.dlg.modify(this.ep.local.sdp, {
headers: {
'X-Reason': 'anchor-media'
@@ -2849,8 +3042,7 @@ Duration=${duration} `
// manage record all call.
if (callStatus === CallStatus.InProgress) {
if (this.accountInfo.account.record_all_calls ||
this.application.record_all_calls) {
if (this.accountInfo.account.record_all_calls || this.application.record_all_calls) {
this.backgroundTaskManager.newTask('record');
}
} else if (callStatus == CallStatus.Completed) {
@@ -2891,72 +3083,26 @@ Duration=${duration} `
}
}
_configMsEndpoint() {
this._enableInbandDtmfIfRequired(this.ep);
this.ep.once('destroy', this._handleMediaTimeout.bind(this));
const opts = {
...(this.onHoldMusic && {holdMusic: `shout://${this.onHoldMusic.replace(/^https?:\/\//, '')}`}),
...(JAMBONES_USE_FREESWITCH_TIMER_FD && {timer_name: 'timerfd'}),
...(JAMBONES_MEDIA_TIMEOUT_MS && {media_timeout: JAMBONES_MEDIA_TIMEOUT_MS}),
...(JAMBONES_MEDIA_HOLD_TIMEOUT_MS && {media_hold_timeout: JAMBONES_MEDIA_HOLD_TIMEOUT_MS})
};
if (Object.keys(opts).length > 0) {
this.ep.set(opts);
}
const origDestroy = this.ep.destroy.bind(this.ep);
this.ep.destroy = async() => {
try {
if (this.currentTask?.name === TaskName.Transcribe && JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS) {
// transcribe task is being used, wait for some time before destroy
// if final transcription is received but endpoint is already closed,
// freeswitch module will not be able to send the transcription
this.logger.debug('callSession:_configMsEndpoint -' +
' transcribe task, wait for some time before destroy');
await sleepFor(JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS);
}
await origDestroy();
} catch (err) {
this.logger.error(err, 'callSession:_configMsEndpoint - error destroying endpoint');
}
};
}
async _handleMediaTimeout(evt) {
if (evt.reason === 'MEDIA_TIMEOUT' && !this.callGone) {
if (evt?.reason === 'MEDIA_TIMEOUT' && !this.callGone) {
this.logger.info('CallSession:_handleMediaTimeout: received MEDIA_TIMEOUT, hangup the call');
this._jambonzHangup('Media Timeout');
}
}
async _enableInbandDtmfIfRequired(ep) {
if (ep.inbandDtmfEnabled) return;
// only enable inband dtmf detection if voip carrier dtmf_type === tones
if (this.inbandDtmfEnabled) {
// https://developer.signalwire.com/freeswitch/FreeSWITCH-Explained/Modules/mod-dptools/6587132/#0-about
try {
ep.execute('start_dtmf');
ep.inbandDtmfEnabled = true;
} catch (err) {
this.logger.info(err, 'CallSession:_enableInbandDtmf - error enable inband DTMF');
}
}
}
/**
* notifyTaskError - only used when websocket connection is used instead of webhooks
*/
_notifyTaskError(obj) {
if (this.requestor instanceof WsRequestor) {
if (this.appIsUsingWebsockets) {
this.requestor.request('jambonz:error', '/error', obj)
.catch((err) => this.logger.debug({err}, 'CallSession:_notifyTaskError - Error sending'));
}
}
_notifyTaskStatus(task, evt) {
if (this.notifyEvents && this.requestor instanceof WsRequestor) {
if (this.notifyEvents && this.appIsUsingWebsockets) {
const obj = {...evt, id: task.id, name: task.name};
this.requestor.request('verb:status', '/status', obj)
.catch((err) => this.logger.debug({err}, 'CallSession:_notifyTaskStatus - Error sending'));
@@ -2995,8 +3141,20 @@ Duration=${duration} `
});
}
async _enableInbandDtmfIfRequired(ep) {
if (ep.inbandDtmfEnabled) return;
// only enable inband dtmf detection if voip carrier dtmf_type === tones
if (this.inbandDtmfEnabled) {
// https://developer.signalwire.com/freeswitch/FreeSWITCH-Explained/Modules/mod-dptools/6587132/#0-about
ep.execute('start_dtmf').catch((err) => {
this.logger.info({err}, 'CallSession:_enableInbandDtmfIfRequired - error starting DTMF');
});
ep.inbandDtmfEnabled = true;
}
}
_clearTasks(backgroundGather, evt) {
if (this.requestor instanceof WsRequestor && !backgroundGather.cleared) {
if (this.appIsUsingWebsockets && !backgroundGather.cleared) {
this.logger.debug({evt}, 'CallSession:_clearTasks on event from background gather');
try {
backgroundGather.cleared = true;
@@ -3024,10 +3182,20 @@ Duration=${duration} `
}
}
_onTtsStreamingConnected() {
this.requestor?.request('tts:streaming-event', '/streaming-event', {event_type: 'stream_open'})
.catch((err) => this.logger.info({err}, 'CallSession:_onTtsStreamingConnected - Error sending'));
}
_onTtsStreamingEmpty() {
const task = this.currentTask;
if (task && TaskName.Say === task.name) {
task.notifyTtsStreamIsEmpty();
} else if (
// If Gather nested say task is streaming
task && TaskName.Gather === task.name && task.sayTask && task.sayTask.isStreamingTts) {
const sayTask = task.sayTask;
sayTask.notifyTtsStreamIsEmpty();
}
}
@@ -3102,6 +3270,16 @@ Duration=${duration} `
}
}
async _createMediaEndpoint(drachtioFsmrfOptions = {}) {
return await createMediaEndpoint(this.srf, this.logger, {
activeMs: this.getMS(),
drachtioFsmrfOptions,
onHoldMusic: this.onHoldMusic,
inbandDtmfEnabled: this.inbandDtmfEnabled,
mediaTimeoutHandler: this._handleMediaTimeout.bind(this),
});
}
getFormattedConversation(numTurns) {
const turns = this.conversationTurns.slice(-numTurns);
if (turns.length === 0) return null;
@@ -3112,6 +3290,20 @@ Duration=${duration} `
return `assistant: ${t.text}`;
}).join('\n');
}
startSttLatencyVad() {
if (this.notifySttLatencyEnabled) {
this.sttLatencyCalculator.start();
}
}
stopSttLatencyVad() {
this.sttLatencyCalculator.stop();
}
calculateSttLatency() {
return this.sttLatencyCalculator.calculateLatency();
}
}
module.exports = CallSession;

View File

@@ -8,7 +8,8 @@ const CallSession = require('./call-session');
*/
class ConfirmCallSession extends CallSession {
constructor({logger, application, dlg, ep, tasks, callInfo, accountInfo, memberId, confName, rootSpan, req}) {
// eslint-disable-next-line max-len
constructor({logger, application, dlg, ep, tasks, callInfo, accountInfo, memberId, confName, rootSpan, req, tmpFiles}) {
super({
logger,
application,
@@ -24,6 +25,7 @@ class ConfirmCallSession extends CallSession {
this.dlg = dlg;
this.ep = ep;
this.req = req;
this.tmpFiles = tmpFiles;
}
/**

View File

@@ -22,6 +22,12 @@ class InboundCallSession extends CallSession {
this.req = req;
this.res = res;
// if the call was canceled before we got here, handle it
if (this.req.locals.canceled) {
req.locals.logger.info('InboundCallSession: constructor - call was already canceled');
this._onCancel();
}
req.once('cancel', this._onCancel.bind(this));
this.on('callStatusChange', this._notifyCallStatusChange.bind(this));
@@ -54,6 +60,19 @@ class InboundCallSession extends CallSession {
}
});
}
else if (this._endpointAllocationError) {
// Propagate SIP error from endpoint allocation failure back to the client
const {status, reason, sipReasonHeader} = this._endpointAllocationError;
this.rootSpan.setAttributes({'call.termination': `endpoint allocation SIP error ${status}`});
this.logger.info({endpointAllocationError: this._endpointAllocationError},
`InboundCallSession:_onTasksDone generating ${status} due to endpoint allocation failure`);
this.res.send(status, {
headers: {
'X-Reason': `endpoint allocation failure: ${reason}`,
...(sipReasonHeader && {'Reason': sipReasonHeader})
}
});
}
else {
this.rootSpan.setAttributes({'call.termination': 'tasks completed without answering call'});
this.logger.info('InboundCallSession:_onTasksDone auto-generating non-success response to invite');

View File

@@ -27,10 +27,13 @@ class RestCallSession extends CallSession {
}
this.on('callStatusChange', this._notifyCallStatusChange.bind(this));
this._notifyCallStatusChange({
callStatus: CallStatus.Trying,
sipStatus: 100,
sipReason: 'Trying'
setImmediate(() => {
this._notifyCallStatusChange({
callStatus: CallStatus.Trying,
sipStatus: 100,
sipReason: 'Trying'
});
});
}

View File

@@ -45,12 +45,11 @@ class SipRecCallSession extends InboundCallSession {
async answerSipRecCall() {
try {
this.ms = this.getMS();
let remoteSdp = this.sdp1.replace(/sendonly/, 'sendrecv');
this.ep = await this.ms.createEndpoint({remoteSdp});
this.ep = await this._createMediaEndpoint({remoteSdp});
//this.logger.debug({remoteSdp, localSdp: this.ep.local.sdp}, 'SipRecCallSession - allocated first endpoint');
remoteSdp = this.sdp2.replace(/sendonly/, 'sendrecv');
this.ep2 = await this.ms.createEndpoint({remoteSdp});
this.ep2 = await this._createMediaEndpoint({remoteSdp});
//this.logger.debug({remoteSdp, localSdp: this.ep2.local.sdp}, 'SipRecCallSession - allocated second endpoint');
await this.ep.bridge(this.ep2);
const combinedSdp = await createSipRecPayload(this.ep.local.sdp, this.ep2.local.sdp, this.logger);

31
lib/tasks/alert.js Normal file
View File

@@ -0,0 +1,31 @@
const Task = require('./task');
const {TaskName} = require('../utils/constants');
class TaskAlert extends Task {
constructor(logger, opts, parentTask) {
super(logger, opts);
this.message = this.data.message;
}
get name() { return TaskName.Alert; }
async exec(cs) {
const {srf, accountSid:account_sid, callSid:target_sid, applicationSid:application_sid} = cs;
const {writeAlerts, AlertType} = srf.locals;
await super.exec(cs);
writeAlerts({
account_sid,
alert_type: AlertType.APPLICATION,
detail: `Application SID ${application_sid}`,
message: this.message,
target_sid
}).catch((err) => this.logger.info({err}, 'Error generating alert application'));
}
async kill(cs) {
super.kill(cs);
this.notifyTaskDone();
}
}
module.exports = TaskAlert;

View File

@@ -49,7 +49,8 @@ class Conference extends Task {
this.confName = this.data.name;
[
'beep', 'startConferenceOnEnter', 'endConferenceOnExit', 'joinMuted',
'maxParticipants', 'waitHook', 'statusHook', 'endHook', 'enterHook', 'endConferenceDuration'
'maxParticipants', 'waitHook', 'statusHook', 'endHook', 'enterHook',
'endConferenceDuration', 'distributeDtmf'
].forEach((attr) => this[attr] = this.data[attr]);
this.record = this.data.record || {};
this.statusEvents = [];
@@ -356,6 +357,7 @@ class Conference extends Task {
//https://developer.signalwire.com/freeswitch/FreeSWITCH-Explained/Modules/mod_conference_3965534/
// mute | Enter conference muted
...((this.joinMuted || this.speakOnlyTo) && {mute: true}),
...(this.distributeDtmf && {'dist-dtmf': true})
}});
/**
@@ -673,7 +675,8 @@ class Conference extends Task {
confName: this.confName,
tasks,
rootSpan: cs.rootSpan,
req: cs.req
req: cs.req,
tmpFiles: cs.tmpFiles,
});
await this._playSession.exec();
this._playSession = null;

View File

@@ -17,12 +17,17 @@ class TaskConfig extends Task {
'actionHookDelayAction',
'boostAudioSignal',
'vad',
'ttsStream'
'ttsStream',
'autoStreamTts',
'disableTtsCache'
].forEach((k) => this[k] = this.data[k] || {});
if ('notifyEvents' in this.data) {
this.notifyEvents = !!this.data.notifyEvents;
}
if (this.hasNotifySttLatency) {
this.notifySttLatency = !!this.data.notifySttLatency;
}
if (this.bargeIn.enable) {
this.gatherOpts = {
@@ -82,7 +87,9 @@ class TaskConfig extends Task {
get hasVad() { return Object.keys(this.vad).length; }
get hasFillerNoise() { return Object.keys(this.fillerNoise).length; }
get hasReferHook() { return Object.keys(this.data).includes('referHook'); }
get hasNotifySttLatency() { return Object.keys(this.data).includes('notifySttLatency'); }
get hasTtsStream() { return Object.keys(this.ttsStream).length; }
get hasDisableTtsCache() { return Object.keys(this.data).includes('disableTtsCache'); }
get summary() {
const phrase = [];
@@ -111,12 +118,16 @@ class TaskConfig extends Task {
if (this.hasFillerNoise) phrase.push(`fillerNoise ${this.fillerNoise.enable ? 'on' : 'off'}`);
if (this.data.amd) phrase.push('enable amd');
if (this.notifyEvents) phrase.push(`event notification ${this.notifyEvents ? 'on' : 'off'}`);
if (this.hasNotifySttLatency) phrase.push(
`notifySttLatency ${this.notifySttLatency ? 'on' : 'off'}`);
if (this.onHoldMusic) phrase.push(`onHoldMusic: ${this.onHoldMusic}`);
if ('boostAudioSignal' in this.data) phrase.push(`setGain ${this.data.boostAudioSignal}`);
if (this.hasReferHook) phrase.push('set referHook');
if (this.hasTtsStream) {
phrase.push(`${this.ttsStream.enable ? 'enable' : 'disable'} ttsStream`);
}
if ('autoStreamTts' in this.data) phrase.push(`enable Say.stream value ${this.data.autoStreamTts ? 'on' : 'off'}`);
if (this.hasDisableTtsCache) phrase.push(`disableTtsCache ${this.data.disableTtsCache ? 'on' : 'off'}`);
return `${this.name}{${phrase.join(',')}}`;
}
@@ -128,6 +139,11 @@ class TaskConfig extends Task {
cs.notifyEvents = !!this.data.notifyEvents;
}
if (this.hasNotifySttLatency) {
this.logger.debug(`turning notifySttLatency ${this.notifySttLatency ? 'on' : 'off'}`);
cs.notifySttLatencyEnabled = this.notifySttLatency;
}
if (this.onHoldMusic) {
cs.onHoldMusic = this.onHoldMusic;
}
@@ -296,6 +312,11 @@ class TaskConfig extends Task {
});
}
if ('autoStreamTts' in this.data) {
this.logger.info(`Config: autoStreamTts set to ${this.data.autoStreamTts}`);
cs.autoStreamTts = this.data.autoStreamTts;
}
if (this.hasFillerNoise) {
const {enable, ...opts} = this.fillerNoise;
this.logger.info({fillerNoise: this.fillerNoise}, 'Config: fillerNoise');
@@ -311,7 +332,10 @@ class TaskConfig extends Task {
voiceMs: this.vad.voiceMs || 250,
silenceMs: this.vad.silenceMs || 150,
strategy: this.vad.strategy || 'one-shot',
mode: (this.vad.mode !== undefined && this.vad.mode !== null) ? this.vad.mode : 2
mode: (this.vad.mode !== undefined && this.vad.mode !== null) ? this.vad.mode : 2,
vendor: this.vad.vendor || 'silero',
threshold: this.vad.threshold || 0.5,
speechPadMs: this.vad.speechPadMs || 30,
};
}
@@ -330,10 +354,17 @@ class TaskConfig extends Task {
};
this.logger.info({opts: this.gatherOpts}, 'Config: enabling ttsStream');
cs.enableBackgroundTtsStream(this.sayOpts);
} else if (!this.ttsStream.enable) {
}
// only disable ttsStream if it specifically set to false
else if (this.ttsStream.enable === false) {
this.logger.info('Config: disabling ttsStream');
cs.disableTtsStream();
}
if (this.hasDisableTtsCache) {
this.logger.info(`set disableTtsCache = ${this.disableTtsCache}`);
cs.disableTtsCache = this.data.disableTtsCache;
}
}
async kill(cs) {

View File

@@ -19,9 +19,9 @@ const parseDecibels = require('../utils/parse-decibels');
const debug = require('debug')('jambonz:feature-server');
const {parseUri} = require('drachtio-srf');
const {ANCHOR_MEDIA_ALWAYS,
JAMBONZ_DISABLE_DIAL_PAI_HEADER,
JAMBONZ_DIAL_PAI_HEADER,
JAMBONES_DIAL_SBC_FOR_REGISTERED_USER} = require('../config');
const { isOnhold, isOpusFirst } = require('../utils/sdp-utils');
const { isOnhold, isOpusFirst, getLeadingCodec } = require('../utils/sdp-utils');
const { normalizeJambones } = require('@jambonz/verb-specifications');
const { selectHostPort } = require('../utils/network');
const { sleepFor } = require('../utils/helpers');
@@ -109,6 +109,7 @@ class TaskDial extends Task {
this.tag = this.data.tag;
this.boostAudioSignal = this.data.boostAudioSignal;
this._mediaPath = MediaPath.FullMedia;
this.forwardPAI = this.data.forwardPAI;
if (this.dtmfHook) {
const {parentDtmfCollector, childDtmfCollector} = parseDtmfOptions(logger, this.data.dtmfCapture || {});
@@ -157,6 +158,7 @@ class TaskDial extends Task {
get canReleaseMedia() {
const keepAnchor = this.data.anchorMedia ||
this.isTranscoding ||
this.cs.isBackGroundListen ||
this.cs.onHoldMusic ||
ANCHOR_MEDIA_ALWAYS ||
@@ -271,7 +273,12 @@ class TaskDial extends Task {
}
this._removeDtmfDetection(cs.dlg);
this._removeDtmfDetection(this.dlg);
await this._killOutdials();
try {
await this._killOutdials();
}
catch (err) {
this.logger.info({err}, 'Dial:kill - error killing outdials');
}
if (this.sd) {
const byeReasonHeader = this.killReason === KillReason.MediaTimeout ? 'Media Timeout' : undefined;
this.sd.kill(byeReasonHeader);
@@ -281,13 +288,22 @@ class TaskDial extends Task {
}
if (this.callSid) sessionTracker.remove(this.callSid);
if (this.listenTask) {
await this.listenTask.kill(cs);
this.listenTask.span.end();
try {
await this.listenTask.kill(cs);
this.listenTask?.span?.end();
}
catch (err) {
this.logger.error({err}, 'Dial:kill - error killing listen task');
}
this.listenTask = null;
}
if (this.transcribeTask) {
await this.transcribeTask.kill(cs);
this.transcribeTask.span.end();
try {
await this.transcribeTask.kill(cs);
this.transcribeTask?.span?.end();
} catch (err) {
this.logger.error({err}, 'Dial:kill - error killing transcribe task');
}
this.transcribeTask = null;
}
this.notifyTaskDone();
@@ -383,7 +399,11 @@ class TaskDial extends Task {
...customHeaders
}
}, httpHeaders);
if (json && Array.isArray(json)) {
res.send(202);
this.logger.info('DialTask:handleRefer - sent 202 Accepted');
const returnedInstructions = !!json && Array.isArray(json);
if (returnedInstructions) {
try {
const logger = isChild ? this.logger : this.sd.logger;
const tasks = normalizeJambones(logger, json).map((tdata) => makeTask(this.logger, tdata));
@@ -401,16 +421,21 @@ class TaskDial extends Task {
/* need to update the callSid of the child with its own (new) AdultingCallSession */
sessionTracker.add(adultingSession.callSid, adultingSession);
}
if (this.ep) this.ep.unbridge();
/* if we got the REFER on the parent leg, end the dial task after completing the refer */
if (!isChild) {
logger.info('DialTask:handleRefer - killing dial task after processing REFER on parent leg');
cs.currentTask?.kill(cs, KillReason.ReferComplete);
}
}
} catch (err) {
this.logger.info(err, 'Dial:handleRefer - error setting new application after receiving REFER');
}
}
//caller and callee legs are briged together, accept refer with 202 will release callee leg endpoint
//that makes freeswitch release endpoint for caller leg.
if (this.ep) this.ep.unbridge();
res.send(202);
this.logger.info('DialTask:handleRefer - sent 202 Accepted');
else {
this.logger.info('DialTask:handleRefer - no tasks returned from referHook, not setting new application');
}
} catch (err) {
this.logger.info({err}, 'DialTask:handleRefer - error processing incoming REFER');
res.send(err.statusCode || 501);
@@ -525,13 +550,14 @@ class TaskDial extends Task {
let sbcAddress = this.proxy || getSBC();
const teamsInfo = {};
let fqdn;
const forwardPAI = this.forwardPAI ?? JAMBONZ_DIAL_PAI_HEADER; // dial verb overides env var
this.logger.debug(forwardPAI, 'forwardPAI value');
if (!sbcAddress) throw new Error('no SBC found for outbound call');
this.headers = {
'X-Account-Sid': cs.accountSid,
...(req && req.has('X-CID') && {'X-CID': req.get('X-CID')}),
...(direction === 'outbound' && callInfo.sbcCallid && {'X-CID': callInfo.sbcCallid}),
...(!JAMBONZ_DISABLE_DIAL_PAI_HEADER && req && {
...(req && forwardPAI && {
...(req.has('P-Asserted-Identity') && {'P-Asserted-Identity': req.get('P-Asserted-Identity')}),
...(req.has('Privacy') && {'Privacy': req.get('Privacy')}),
}),
@@ -550,7 +576,7 @@ class TaskDial extends Task {
proxy: `sip:${sbcAddress}`,
callingNumber: this.callerId || fromUri.user,
...(this.callerName && {callingName: this.callerName}),
opusFirst: isOpusFirst(this.cs.ep.remote.sdp),
opusFirst: isOpusFirst(this.cs.ep.local.sdp),
isVideoCall: this.cs.ep.remote.sdp.includes('m=video')
};
@@ -616,7 +642,9 @@ class TaskDial extends Task {
* trunk isn't specified,
* check if number matches any existing numbers
* */
if (t.type === 'phone' && !t.trunk) {
const { lookupLcrByAccount} = srf.locals.dbHelpers;
const lcrs = await lookupLcrByAccount(cs.accountSid);
if (t.type === 'phone' && !t.trunk && lcrs.length == 0) {
const str = this.callerId || req.callingNumber || '';
const callingNumber = str.startsWith('+') ? str.substring(1) : str;
const voip_carrier_sid = await lookupCarrierByPhoneNumber(cs.accountSid, callingNumber);
@@ -649,7 +677,8 @@ class TaskDial extends Task {
rootSpan: cs.rootSpan,
startSpan: this.startSpan.bind(this),
dialTask: this,
onHoldMusic: this.cs.onHoldMusic
onHoldMusic: this.cs.onHoldMusic,
tmpFiles: this.cs.tmpFiles,
});
this.dials.set(sd.callSid, sd);
@@ -744,12 +773,24 @@ class TaskDial extends Task {
}
async _connectSingleDial(cs, sd) {
// start connect with dialed leg, this is the soonest we can identify transcoding
if (this.epOther && sd.ep) {
const codecA = getLeadingCodec(this.epOther.local.sdp);
const codecB = getLeadingCodec(sd.ep.remote.sdp);
this.isTranscoding = (codecA !== codecB);
if (this.isTranscoding) {
this.logger.info(`Dial:_connectSingleDial - transcoding from ${codecA} (A leg) to ${codecB} (B leg)`);
}
}
if (!this.bridged && !this.canReleaseMedia) {
this.logger.debug('Dial:_connectSingleDial bridging endpoints');
if (this.epOther) {
this.epOther.api('uuid_break', this.epOther.uuid);
this.epOther.bridge(sd.ep);
}
else {
this.logger.error('Dial:_connectSingleDial - no other endpoint to bridge!');
}
this.bridged = true;
}
@@ -898,7 +939,6 @@ class TaskDial extends Task {
this.logger.info({err}, 'Dial:_selectSingleDial - Error boosting audio signal');
}
}
/* if we can release the media back to the SBC, do so now */
if (this.canReleaseMedia || this.shouldExitMediaPathEntirely) {
setTimeout(this._releaseMedia.bind(this, cs, sd, this.shouldExitMediaPathEntirely), 200);
@@ -972,7 +1012,7 @@ class TaskDial extends Task {
this.epOther = null;
this._mediaPath = releaseEntirely ? MediaPath.NoMedia : MediaPath.PartialMedia;
this.logger.info(
`Dial:_releaseMedia - successfully released media from freewitch, media path is now ${this._mediaPath}`);
`Dial:_releaseMedia - successfully released media from freeswitch, media path is now ${this._mediaPath}`);
} catch (err) {
this.logger.info({err}, 'Dial:_releaseMedia error');
}
@@ -981,7 +1021,7 @@ class TaskDial extends Task {
async reAnchorMedia(cs, sd) {
if (cs.ep && sd.ep) return;
this.logger.info('Dial:reAnchorMedia - re-anchoring media to freewitch');
this.logger.info('Dial:reAnchorMedia - re-anchoring media to freeswitch');
await Promise.all([sd.reAnchorMedia(this._mediaPath), cs.reAnchorMedia(this._mediaPath)]);
this.epOther = cs.ep;
@@ -989,7 +1029,7 @@ class TaskDial extends Task {
this._mediaPath = MediaPath.FullMedia;
this.logger.info(
`Dial:_releaseMedia - successfully re-anchored media to freewitch, media path is now ${this._mediaPath}`);
`Dial:_releaseMedia - successfully re-anchored media to freeswitch, media path is now ${this._mediaPath}`);
}
// Handle RE-INVITE hold from caller leg.
@@ -1073,7 +1113,8 @@ class TaskDial extends Task {
accountInfo: this.cs.accountInfo,
tasks,
rootSpan: this.cs.rootSpan,
req: this.cs.req
req: this.cs.req,
tmpFiles: this.cs.tmpFiles,
});
await this._onHoldSession.exec();
this._onHoldSession = null;

View File

@@ -83,7 +83,8 @@ class TaskDub extends TtsTask {
action: 'playOnTrack',
track: this.track,
play: this.play,
loop: this.loop ? 'loop' : 'once',
// drachtio-fsmrf will convert loop from boolean to 'loop' or 'once'
loop: this.loop,
gain: this.gain
});
}

View File

@@ -370,7 +370,8 @@ class TaskEnqueue extends Task {
accountInfo: cs.accountInfo,
tasks: tasksToRun,
rootSpan: cs.rootSpan,
req: cs.req
req: cs.req,
tmpFiles: cs.tmpFiles,
});
await this._playSession.exec();
this._playSession = null;

View File

@@ -5,13 +5,17 @@ const {
AwsTranscriptionEvents,
AzureTranscriptionEvents,
DeepgramTranscriptionEvents,
GladiaTranscriptionEvents,
SonioxTranscriptionEvents,
CobaltTranscriptionEvents,
IbmTranscriptionEvents,
NvidiaTranscriptionEvents,
JambonzTranscriptionEvents,
AssemblyAiTranscriptionEvents,
HoundifyTranscriptionEvents,
DeepgramfluxTranscriptionEvents,
VoxistTranscriptionEvents,
CartesiaTranscriptionEvents,
OpenAITranscriptionEvents,
VadDetection,
VerbioTranscriptionEvents,
@@ -91,6 +95,8 @@ class TaskGather extends SttTask {
get needsStt() { return this.input.includes('speech'); }
get isBackgroundGather() { return this.bugname_prefix === 'background_bargeIn_'; }
get wantsSingleUtterance() {
return this.data.recognizer?.singleUtterance === true;
}
@@ -138,7 +144,11 @@ class TaskGather extends SttTask {
try {
await this.handling(cs, obj);
} catch (error) {
if (error instanceof SpeechCredentialError) {
if (
// avoid bargein task with sticky will restart forever
// throw exception to stop the loop.
!this.sticky &&
error instanceof SpeechCredentialError) {
this.logger.info('Gather failed due to SpeechCredentialError, finished!');
this.notifyTaskDone();
return;
@@ -221,7 +231,9 @@ class TaskGather extends SttTask {
const startListening = async(cs, ep) => {
this._startTimer();
if (this.isContinuousAsr && 0 === this.timeout) this._startAsrTimer();
if (this.isContinuousAsr && 0 === this.timeout && !this.isBackgroundGather) {
this._startAsrTimer();
}
if (this.input.includes('speech') && !this.listenDuringPrompt) {
try {
await this._setSpeechHandlers(cs, ep);
@@ -246,7 +258,7 @@ class TaskGather extends SttTask {
startDtmfListener();
}
this._stopVad();
if (!this.killed) {
if (!this.killed && !this.resolved) {
startListening(cs, ep);
if (this.input.includes('speech') && this.vendor === 'nuance' && this.listenDuringPrompt) {
this.logger.debug('Gather:exec - starting transcription timers after say completes');
@@ -258,15 +270,22 @@ class TaskGather extends SttTask {
};
this.sayTask.span = span;
this.sayTask.ctx = ctx;
this.sayTask.exec(cs, {ep}) // kicked off, _not_ waiting for it to complete
this.sayTask
.exec(cs, {ep}) // kicked off, _not_ waiting for it to complete
.then(() => {
if (this.sayTask.isStreamingTts) return;
this.logger.debug('Gather:exec - nested say task completed');
span.end();
process();
return;
})
.catch((err) => {
process();
});
this.sayTask.on('playDone', (err) => {
span.end();
if (err) this.logger.error({err}, 'Gather:exec Error playing tts');
if (this.sayTask.isStreamingTts && !this.sayTask.closeOnStreamEmpty) {
// if streaming tts, we do not wait for it to complete if it is not closing the stream automatically
process();
});
}
}
else if (this.playTask) {
const {span, ctx} = this.startChildSpan(`nested:${this.playTask.summary}`);
@@ -277,7 +296,7 @@ class TaskGather extends SttTask {
startDtmfListener();
}
this._stopVad();
if (!this.killed) {
if (!this.killed && !this.resolved) {
startListening(cs, ep);
if (this.input.includes('speech') && this.vendor === 'nuance' && this.listenDuringPrompt) {
this.logger.debug('Gather:exec - starting transcription timers after play completes');
@@ -289,15 +308,17 @@ class TaskGather extends SttTask {
};
this.playTask.span = span;
this.playTask.ctx = ctx;
this.playTask.exec(cs, {ep}) // kicked off, _not_ waiting for it to complete
this.playTask
.exec(cs, {ep}) // kicked off, _not_ waiting for it to complete
.then(() => {
this.logger.debug('Gather:exec - nested play task completed');
span.end();
process();
return;
})
.catch((err) => {
process();
});
this.playTask.on('playDone', (err) => {
span.end();
if (err) this.logger.error({err}, 'Gather:exec Error playing url');
process();
});
}
else {
if (this.killed) {
@@ -341,6 +362,7 @@ class TaskGather extends SttTask {
this.sayTask?.span.end();
this._stopVad();
this._resolve('killed');
cs.stopSttLatencyVad();
}
updateTaskInProgress(opts) {
@@ -356,6 +378,9 @@ class TaskGather extends SttTask {
_onDtmf(cs, ep, evt) {
this.logger.debug(evt, 'TaskGather:_onDtmf');
if (!this._timeoutTimer && this.timeout > 0) {
this._startTimer();
}
clearTimeout(this.interDigitTimer);
let resolved = false;
if (this.dtmfBargein) {
@@ -380,12 +405,8 @@ class TaskGather extends SttTask {
if (this.digitBuffer.length === 0 && this.needsStt) {
// DTMF is higher priority than STT.
this.removeCustomEventListeners();
ep.stopTranscription({
vendor: this.vendor,
bugname: this.bugname,
})
.catch((err) => this.logger.error({err},
` Received DTMF, Error stopping transcription for vendor ${this.vendor}`));
this._clearAsrTimer(); //clear ASR timer as we're now using dtmf
this._stopTranscribing(ep);
}
this.digitBuffer += evt.dtmf;
const len = this.digitBuffer.length;
@@ -399,6 +420,7 @@ class TaskGather extends SttTask {
const ms = this.interDigitTimeout * 1000;
this.logger.debug(`starting interdigit timer of ${ms}`);
this.interDigitTimer = setTimeout(() => this._resolve('dtmf-interdigit-timeout'), ms);
this._clearTimer(); //clear main timer as we're now using interdigit dtmf timer
}
}
@@ -456,6 +478,32 @@ class TaskGather extends SttTask {
this.addCustomEventListener(ep, DeepgramTranscriptionEvents.Connect, this._onVendorConnect.bind(this, cs, ep));
this.addCustomEventListener(ep, DeepgramTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep));
this.addCustomEventListener(ep, DeepgramTranscriptionEvents.Error, this._onVendorError.bind(this, cs, ep));
break;
case 'deepgramflux':
this.bugname = `${this.bugname_prefix}deepgramflux_transcribe`;
this.addCustomEventListener(
ep, DeepgramfluxTranscriptionEvents.Transcription, this._onTranscription.bind(this, cs, ep));
this.addCustomEventListener(
ep, DeepgramfluxTranscriptionEvents.Connect, this._onVendorConnect.bind(this, cs, ep));
this.addCustomEventListener(ep, DeepgramfluxTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep));
this.addCustomEventListener(ep, DeepgramfluxTranscriptionEvents.Error, this._onVendorError.bind(this, cs, ep));
break;
case 'gladia':
this.bugname = `${this.bugname_prefix}gladia_transcribe`;
this.addCustomEventListener(
ep, GladiaTranscriptionEvents.Transcription, this._onTranscription.bind(this, cs, ep));
this.addCustomEventListener(ep, GladiaTranscriptionEvents.Connect, this._onVendorConnect.bind(this, cs, ep));
this.addCustomEventListener(ep, GladiaTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep));
this.addCustomEventListener(ep, GladiaTranscriptionEvents.Error, this._onVendorError.bind(this, cs, ep));
// gladia require unique url for each session
const {host, path} = await this.createGladiaLiveSession();
opts.GLADIA_SPEECH_HOST = host;
opts.GLADIA_SPEECH_PATH = path;
break;
case 'soniox':
@@ -535,6 +583,18 @@ class TaskGather extends SttTask {
this._onVendorConnectFailure.bind(this, cs, ep));
break;
case 'houndify':
this.bugname = `${this.bugname_prefix}houndify_transcribe`;
this.addCustomEventListener(ep, HoundifyTranscriptionEvents.Transcription,
this._onTranscription.bind(this, cs, ep));
this.addCustomEventListener(ep, HoundifyTranscriptionEvents.Error,
this._onVendorError.bind(this, cs, ep));
this.addCustomEventListener(ep, HoundifyTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep));
this.addCustomEventListener(ep, HoundifyTranscriptionEvents.Connect,
this._onVendorConnect.bind(this, cs, ep));
break;
case 'voxist':
this.bugname = `${this.bugname_prefix}voxist_transcribe`;
this.addCustomEventListener(ep, VoxistTranscriptionEvents.Transcription,
@@ -546,6 +606,17 @@ class TaskGather extends SttTask {
this._onVendorConnectFailure.bind(this, cs, ep));
break;
case 'cartesia':
this.bugname = `${this.bugname_prefix}cartesia_transcribe`;
this.addCustomEventListener(ep, CartesiaTranscriptionEvents.Transcription,
this._onTranscription.bind(this, cs, ep));
this.addCustomEventListener(
ep, CartesiaTranscriptionEvents.Connect, this._onVendorConnect.bind(this, cs, ep));
this.addCustomEventListener(ep, CartesiaTranscriptionEvents.Error, this._onVendorError.bind(this, cs, ep));
this.addCustomEventListener(ep, CartesiaTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep));
break;
case 'speechmatics':
this.bugname = `${this.bugname_prefix}speechmatics_transcribe`;
this.addCustomEventListener(
@@ -661,14 +732,18 @@ class TaskGather extends SttTask {
target_sid: this.cs.callSid
});
}).catch((err) => this.logger.info({err}, 'Error generating alert for tts failure'));
// Some vendor use single connection, that we cannot use onConnect event to track transcription start
this.cs.emit('transcribe-start');
}
_startTimer() {
if (0 === this.timeout) return;
this.logger.debug(`Starting timoutTimer of ${this.timeout}ms`);
this._clearTimer();
this._timeoutTimer = setTimeout(() => {
// If continuousASR in use then extend by the asr window for more transcripts.
if (this.isContinuousAsr) this._startAsrTimer();
if (this.interDigitTimer) return; // let the inter-digit timer complete
else {
this._resolve(this.digitBuffer.length >= this.minDigits ? 'dtmf-num-digits' : 'timeout');
}
@@ -810,17 +885,15 @@ class TaskGather extends SttTask {
this._fillerNoiseOn = false; // in a race, if we just started audio it may sneak through here
this.ep.api('uuid_break', this.ep.uuid)
.catch((err) => this.logger.info(err, 'Error killing audio'));
cs.clearTtsStream();
if (cs.isTtsStreamEnabled) cs.clearTtsStream();
}
return;
}
if (this.sayTask && !this.sayTask.killed) {
this.sayTask.removeAllListeners('playDone');
this.sayTask.kill(cs);
this.sayTask = null;
}
if (this.playTask && !this.playTask.killed) {
this.playTask.removeAllListeners('playDone');
this.playTask.kill(cs);
this.playTask = null;
}
@@ -828,6 +901,10 @@ class TaskGather extends SttTask {
}
_onTranscription(cs, ep, evt, fsEvent) {
// check if we are in graceful shutdown mode
if (ep.gracefulShutdownResolver) {
ep.gracefulShutdownResolver();
}
// make sure this is not a transcript from answering machine detection
const bugname = fsEvent.getHeader('media-bugname');
const finished = fsEvent.getHeader('transcription-session-finished');
@@ -840,6 +917,10 @@ class TaskGather extends SttTask {
if (finished === 'true') return;
if (this.vendor === 'ibm' && evt?.state === 'listening') return;
// emit an event to the call session to track the time transcription is received
cs.emit('on-transcription');
if (this.vendor === 'deepgram' && evt.type === 'UtteranceEnd') {
/* we will only get this when we have set utterance_end_ms */
if (this._bufferedTranscripts.length === 0) {
@@ -867,7 +948,7 @@ class TaskGather extends SttTask {
evt = this.normalizeTranscription(evt, this.vendor, 1, this.language,
this.shortUtterance, this.data.recognizer.punctuation);
//this.logger.debug({evt, bugname, finished, vendor: this.vendor}, 'Gather:_onTranscription normalized transcript');
this.logger.debug({evt, bugname, finished, vendor: this.vendor}, 'Gather:_onTranscription normalized transcript');
if (evt.alternatives.length === 0) {
this.logger.info({evt}, 'TaskGather:_onTranscription - got empty transcript, continue listening');
@@ -923,6 +1004,12 @@ class TaskGather extends SttTask {
}
}
// receive a final transcript, calculate the stt latency for this transcript
const sttLatency = this.cs.calculateSttLatency();
if (!emptyTranscript && sttLatency) {
this.stt_latency_ms += `${sttLatency.stt_latency_ms},`;
}
if (this.isContinuousAsr) {
/* append the transcript and start listening again for asrTimeout */
const t = evt.alternatives[0].transcript;
@@ -1027,6 +1114,11 @@ class TaskGather extends SttTask {
this.cs.requestor.request('verb:hook', this.partialResultHook, Object.assign({speech: evt},
this.cs.callInfo, httpHeaders));
}
else if (this.vendor === 'deepgramflux' &&
['EagerEndOfTurn', 'TurnResumed'].includes(evt.vendor.evt?.event)) {
this.logger.debug(`Gather:_onTranscription - deepgramflux event detected: ${evt.event}`);
this.performAction({speech: evt, reason: 'speechDetected'}, false);
}
if (this.vendor === 'soniox') {
if (evt.vendor.finalWords.length) {
this.logger.debug({evt}, 'TaskGather:_onTranscription - buffering soniox transcript');
@@ -1073,12 +1165,8 @@ class TaskGather extends SttTask {
}
async _startFallback(cs, ep, evt) {
if (this.canFallback) {
ep.stopTranscription({
vendor: this.vendor,
bugname: this.bugname
})
.catch((err) => this.logger.error({err}, `Error stopping transcription for primary vendor ${this.vendor}`));
if (this.canFallback()) {
this._stopTranscribing(ep);
try {
this.logger.debug('gather:_startFallback');
this.notifyError({ msg: 'ASR error',
@@ -1207,17 +1295,26 @@ class TaskGather extends SttTask {
}
}
async _stopTranscribing(ep) {
// Fix for https://github.com/jambonz/jambonz-feature-server/issues/1281
// We should immediately call stop transcription from gather
// so that next gather can start transcription immediately
ep.stopTranscription({
vendor: this.vendor,
bugname: this.bugname,
gracefulShutdown: false
})
.catch((err) => {
if (this.resolved) return;
this.logger.error({err}, 'Error stopping transcription');
});
this.cs.emit('transcribe-stop');
}
async _resolve(reason, evt) {
this.logger.info({evt}, `TaskGather:resolve with reason ${reason}`);
if (this.needsStt && this.ep && this.ep.connected) {
this.ep.stopTranscription({
vendor: this.vendor,
bugname: this.bugname
})
.catch((err) => {
if (this.resolved) return;
this.logger.error({err}, 'Error stopping transcription');
});
this._stopTranscribing(this.ep);
}
if (this.resolved) {
this.logger.debug('TaskGather:_resolve - already resolved');
@@ -1225,6 +1322,8 @@ class TaskGather extends SttTask {
}
this.resolved = true;
// gather is resolved, prevent any further transcription events while resolve in progress
this.removeCustomEventListeners();
// If bargin is false and ws application return ack to verb:hook
// the gather should not play any audio
this._killAudio(this.cs);
@@ -1236,11 +1335,28 @@ class TaskGather extends SttTask {
this._clearAsrTimer();
this._clearFinalAsrTimer();
let sttLatencyMetrics = {};
if (this.needsStt) {
const sttLatency = this.cs.calculateSttLatency();
if (sttLatency) {
this.stt_latency_ms = this.stt_latency_ms.endsWith(',') ?
this.stt_latency_ms.slice(0, -1) : this.stt_latency_ms;
sttLatencyMetrics = {
'stt.latency_ms': this.stt_latency_ms,
'stt.talkspurts': JSON.stringify(sttLatency.talkspurts),
'stt.start_time': sttLatency.stt_start_time,
'stt.stop_time': sttLatency.stt_stop_time,
'stt.usage': sttLatency.stt_usage,
};
}
}
this.span.setAttributes({
channel: 1,
'stt.label': this.label || 'None',
'stt.resolve': reason,
'stt.result': JSON.stringify(evt)
'stt.result': JSON.stringify(evt),
...sttLatencyMetrics
});
if (this.callSession && this.callSession.callGone) {
@@ -1268,6 +1384,9 @@ class TaskGather extends SttTask {
let returnedVerbs = false;
try {
const latencies = Object.fromEntries(
Object.entries(sttLatencyMetrics).map(([key, value]) => [key.replace('stt.', 'stt_'), value])
);
if (reason.startsWith('dtmf')) {
if (this.parentTask) this.parentTask.emit('dtmf', evt);
else {
@@ -1281,7 +1400,7 @@ class TaskGather extends SttTask {
else {
this.emit('transcription', evt);
this.logger.debug('TaskGather:_resolve - invoking performAction');
returnedVerbs = await this.performAction({speech: evt, reason: 'speechDetected'});
returnedVerbs = await this.performAction({speech: evt, reason: 'speechDetected', ...latencies});
this.logger.debug({returnedVerbs}, 'TaskGather:_resolve - back from performAction');
}
}
@@ -1289,20 +1408,20 @@ class TaskGather extends SttTask {
if (this.parentTask) this.parentTask.emit('timeout', evt);
else {
this.emit('timeout', evt);
returnedVerbs = await this.performAction({reason: 'timeout'});
returnedVerbs = await this.performAction({reason: 'timeout', ...latencies});
}
}
else if (reason.startsWith('stt-error')) {
if (this.parentTask) this.parentTask.emit('stt-error', evt);
else {
this.emit('stt-error', evt);
returnedVerbs = await this.performAction({reason: 'error', details: evt.error});
returnedVerbs = await this.performAction({reason: 'error', details: evt.error, ...latencies});
}
} else if (reason.startsWith('stt-low-confidence')) {
if (this.parentTask) this.parentTask.emit('stt-low-confidence', evt);
else {
this.emit('stt-low-confidence', evt);
returnedVerbs = await this.performAction({speech:evt, reason: 'stt-low-confidence'});
returnedVerbs = await this.performAction({speech:evt, reason: 'stt-low-confidence', ...latencies});
}
}
} catch (err) { /*already logged error*/ }

View File

@@ -1,10 +1,21 @@
const Task = require('./task');
const {TaskName, TaskPreconditions, ListenEvents, ListenStatus} = require('../utils/constants');
const {TaskName, TaskPreconditions, ListenEvents, ListenStatus} = require('../utils/constants.json');
const makeTask = require('./make_task');
const moment = require('moment');
const MAX_PLAY_AUDIO_QUEUE_SIZE = 10;
const DTMF_SPAN_NAME = 'dtmf';
function escapeString(str) {
return str
.replace(/\\/g, '\\\\') // Escape backslashes
.replace(/"/g, '\\"') // Escape double quotes
.replace(/[\b]/g, '\\b') // Escape backspace (NOTE: [\b] not \b)
.replace(/\f/g, '\\f') // Escape formfeed
.replace(/\n/g, '\\n') // Escape newlines
.replace(/\r/g, '\\r') // Escape carriage returns
.replace(/\t/g, '\\t'); // Escape tabs
}
class TaskListen extends Task {
constructor(logger, opts, parentTask) {
super(logger, opts);
@@ -16,10 +27,21 @@ class TaskListen extends Task {
this.preconditions = TaskPreconditions.Endpoint;
[
'action', 'auth', 'method', 'url', 'finishOnKey', 'maxLength', 'metadata', 'mixType', 'passDtmf', 'playBeep',
'action', 'auth', 'method', 'url', 'finishOnKey', 'maxLength', 'mixType', 'passDtmf', 'playBeep',
'sampleRate', 'timeout', 'transcribe', 'wsAuth', 'disableBidirectionalAudio', 'channel'
].forEach((k) => this[k] = this.data[k]);
//Escape JSON special characters in metadata
if (this.data.metadata) {
this.metadata = {};
for (const key in this.data.metadata) {
if (this.data.metadata.hasOwnProperty(key)) {
const value = this.data.metadata[key];
this.metadata[key] = typeof value === 'string' ? escapeString(value) : value;
}
}
}
this.mixType = this.mixType || 'mono';
this.sampleRate = this.sampleRate || 8000;
this.earlyMedia = this.data.earlyMedia === true;
@@ -72,7 +94,7 @@ class TaskListen extends Task {
} catch (err) {
this.logger.info(err, `TaskListen:exec - error ${this.url}`);
}
if (this.transcribeTask) this.transcribeTask.kill();
if (this.transcribeTask) this.transcribeTask.kill(cs);
this._removeListeners(ep);
}
@@ -103,9 +125,12 @@ class TaskListen extends Task {
this.notifyTaskDone();
}
async updateListen(status) {
async updateListen(status, silence = false) {
if (!this.killed && this.ep && this.ep.connected) {
const args = this._bugname ? [this._bugname] : [];
const args = [
...(this._bugname ? [this._bugname] : []),
...(status === ListenStatus.Pause ? ([silence]) : []),
];
this.logger.info(`TaskListen:updateListen status ${status}`);
switch (status) {
case ListenStatus.Pause:

View File

@@ -32,12 +32,14 @@ class TaskLlmUltravox_S2S extends Task {
this.auth = this.parent.auth;
this.connectionOptions = this.parent.connectOptions;
const {apiKey} = this.auth || {};
const {apiKey, agent_id} = this.auth || {};
if (!apiKey) throw new Error('auth.apiKey is required for Vendor: Ultravox');
this.apiKey = apiKey;
this.agentId = agent_id;
this.actionHook = this.data.actionHook;
this.eventHook = this.data.eventHook;
this.toolHook = this.data.toolHook;
this.llmOptions = this.data.llmOptions || {};
this.results = {
completionReason: 'normal conversation end'
@@ -105,24 +107,29 @@ class TaskLlmUltravox_S2S extends Task {
}
);
// merge with any existing tools
this.data.llmOptions.selectedTools = [
this.llmOptions.selectedTools = [
...convertedTools,
...(this.data.llmOptions.selectedTools || [])
...(this.llmOptions.selectedTools || [])
];
}
const payload = {
...this.data.llmOptions,
model: this.model,
...this.llmOptions,
...(!this.agentId && {
model: this.model,
}),
medium: {
...(this.data.llmOptions.medium || {}),
...(this.llmOptions.medium || {}),
serverWebSocket: {
inputSampleRate: 8000,
outputSampleRate: 8000,
}
}
};
const {statusCode, body} = await request('https://api.ultravox.ai/api/calls', {
const baseUrl = 'https://api.ultravox.ai';
const url = this.agentId ?
`${baseUrl}/api/agents/${this.agentId}/calls` : `${baseUrl}/api/calls`;
const {statusCode, body} = await request(url, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
@@ -139,8 +146,9 @@ class TaskLlmUltravox_S2S extends Task {
return data;
}
_unregisterHandlers() {
_unregisterHandlers(ep) {
this.removeCustomEventListeners();
ep.removeAllListeners('dtmf');
}
_registerHandlers(ep) {
@@ -148,6 +156,7 @@ class TaskLlmUltravox_S2S extends Task {
this.addCustomEventListener(ep, LlmEvents_Ultravox.ConnectFailure, this._onConnectFailure.bind(this, ep));
this.addCustomEventListener(ep, LlmEvents_Ultravox.Disconnect, this._onDisconnect.bind(this, ep));
this.addCustomEventListener(ep, LlmEvents_Ultravox.ServerEvent, this._onServerEvent.bind(this, ep));
ep.on('dtmf', this._onDtmf.bind(this, ep));
}
async _startListening(cs, ep) {
@@ -182,7 +191,7 @@ class TaskLlmUltravox_S2S extends Task {
/* note: the parent llm verb started the span, which is why this is necessary */
await this.parent.performAction(this.results);
this._unregisterHandlers();
this._unregisterHandlers(ep);
}
async kill(cs) {
@@ -211,7 +220,7 @@ class TaskLlmUltravox_S2S extends Task {
async _onServerEvent(_ep, evt) {
let endConversation = false;
const type = evt.type;
this.logger.debug({evt}, 'TaskLlmUltravox_S2S:_onServerEvent');
//this.logger.debug({evt}, 'TaskLlmUltravox_S2S:_onServerEvent');
/* server errors of some sort */
if (type === 'error') {
@@ -339,6 +348,18 @@ class TaskLlmUltravox_S2S extends Task {
excludeEvents: this.excludeEvents
}, 'TaskLlmUltravox_S2S:_populateEvents');
}
_onDtmf(ep, evt) {
this.logger.info({evt}, 'TaskLlmUltravox_S2S:_onDtmf - DTMF received');
const {dtmf} = evt;
const data = {
type: 'user_text_message',
text: `DTMF received: ${dtmf}`,
urgency: 'immediate'
};
this._api(ep, [ep.uuid, ClientEvent, JSON.stringify(data)])
.catch((err) => this.logger.info({err, evt}, 'TaskLlmUltravox_S2S:_onDtmf - Error sending DTMF as text message'));
}
}
module.exports = TaskLlmUltravox_S2S;

View File

@@ -96,6 +96,9 @@ function makeTask(logger, obj, parent) {
case TaskName.Tag:
const TaskTag = require('./tag');
return new TaskTag(logger, data, parent);
case TaskName.Alert:
const TaskAlert = require('./alert');
return new TaskAlert(logger, data, parent);
}
// should never reach

View File

@@ -6,7 +6,21 @@ class TaskPlay extends Task {
super(logger, opts);
this.preconditions = TaskPreconditions.Endpoint;
this.url = this.data.url;
//Cleanup URLs that contain a querystring with a . unless that querystring is the filename
// see https://github.com/jambonz/jambonz-feature-server/pull/1293
// and https://github.com/jambonz/jambonz-feature-server/issues/1394 for background
if (this.data.url.includes('?')) {
if (['.mp3', '.wav'].includes(this.data.url.slice(-4))) {
this.url = this.data.url;
}
else {
this.url = this.data.url.split('?')[0] + '?' + this.data.url.split('?')[1].replaceAll('.', '%2E');
}
}
else {
this.url = this.data.url;
}
this.seekOffset = this.data.seekOffset || -1;
this.timeoutSecs = this.data.timeoutSecs || -1;
this.loop = this.data.loop || 1;
@@ -48,7 +62,7 @@ class TaskPlay extends Task {
*/
ep.once('playback-start', (evt) => {
this.logger.debug({evt}, 'Play got playback-start');
this.cs.stickyEventEmitter.once('uuid_break', (t) => {
this.cs.stickyEventEmitter?.once('uuid_break', (t) => {
if (t?.taskId === this.taskId) {
this.logger.debug(`Play got kill-playback, executing uuid_break, taskId: ${t?.taskId}`);
this.ep.api('uuid_break', this.ep.uuid).catch((err) => this.logger.info(err, 'Error killing audio'));

View File

@@ -1,7 +1,6 @@
const Task = require('./task');
const {TaskName} = require('../utils/constants');
const WsRequestor = require('../utils/ws-requestor');
const URL = require('url');
const HttpRequestor = require('../utils/http-requestor');
/**
@@ -10,6 +9,7 @@ const HttpRequestor = require('../utils/http-requestor');
class TaskRedirect extends Task {
constructor(logger, opts) {
super(logger, opts);
this.statusHook = opts.statusHook || false;
}
get name() { return TaskName.Redirect; }
@@ -17,31 +17,60 @@ class TaskRedirect extends Task {
async exec(cs) {
await super.exec(cs);
if (cs.requestor instanceof WsRequestor && cs.application.requestor._isAbsoluteUrl(this.actionHook)) {
this.logger.info(`Task:performAction redirecting to ${this.actionHook}, requires new ws connection`);
try {
this.cs.requestor.close();
const requestor = new WsRequestor(this.logger, cs.accountSid, {url: this.actionHook}, this.webhook_secret) ;
this.cs.application.requestor = requestor;
} catch (err) {
this.logger.info(err, `Task:performAction error redirecting to ${this.actionHook}`);
}
} else if (cs.application.requestor._isAbsoluteUrl(this.actionHook)) {
const baseUrl = this.cs.application.requestor.baseUrl;
const newUrl = URL.parse(this.actionHook);
const newBaseUrl = newUrl.protocol + '//' + newUrl.host;
if (baseUrl != newBaseUrl) {
const isAbsoluteUrl = cs.application?.requestor?._isAbsoluteUrl(this.actionHook);
if (isAbsoluteUrl) {
this.logger.info(`TaskRedirect redirecting to new absolute URL ${this.actionHook}, requires new requestor`);
if (cs.requestor instanceof WsRequestor) {
try {
this.logger.info(`Task:redirect updating base url to ${newBaseUrl}`);
const newRequestor = new HttpRequestor(this.logger, cs.accountSid, {url: this.actionHook},
cs.accountInfo.account.webhook_secret);
this.cs.requestor.removeAllListeners();
this.cs.application.requestor = newRequestor;
const requestor = new WsRequestor(this.logger, cs.accountSid, {url: this.actionHook},
cs.accountInfo.account.webhook_secret) ;
cs.requestor.emit('handover', requestor);
} catch (err) {
this.logger.info(err, `Task:redirect error updating base url to ${this.actionHook}`);
this.logger.info(err, `TaskRedirect error redirecting to ${this.actionHook}`);
}
}
else {
const baseUrl = this.cs.application.requestor.baseUrl;
const newUrl = new URL(this.actionHook);
const newBaseUrl = newUrl.protocol + '//' + newUrl.host;
if (baseUrl != newBaseUrl) {
try {
this.logger.info(`Task:redirect updating base url to ${newBaseUrl}`);
const newRequestor = new HttpRequestor(this.logger, cs.accountSid, {url: this.actionHook},
cs.accountInfo.account.webhook_secret);
cs.requestor.emit('handover', newRequestor);
} catch (err) {
this.logger.info(err, `TaskRedirect error updating base url to ${this.actionHook}`);
}
}
}
}
/* update the notifier if a new statusHook was provided */
if (this.statusHook) {
this.logger.info(`TaskRedirect updating statusHook to ${this.statusHook}`);
try {
const oldNotifier = cs.application.notifier;
const isStatusHookAbsolute = cs.notifier?._isAbsoluteUrl(this.statusHook);
if (isStatusHookAbsolute) {
if (cs.notifier instanceof WsRequestor) {
cs.application.notifier = new WsRequestor(this.logger, cs.accountSid, {url: this.statusHook},
cs.accountInfo.account.webhook_secret);
} else {
cs.application.notifier = new HttpRequestor(this.logger, cs.accountSid, {url: this.statusHook},
cs.accountInfo.account.webhook_secret);
}
if (oldNotifier?.close) oldNotifier.close();
}
/* update the call_status_hook URL that gets passed to the notifier */
cs.application.call_status_hook = this.statusHook;
} catch (err) {
this.logger.info(err, `TaskRedirect error updating statusHook to ${this.statusHook}`);
}
}
await this.performAction();
}
}

View File

@@ -19,6 +19,7 @@ class TaskRestDial extends Task {
this.timeout = this.data.timeout || 60;
this.sipRequestWithinDialogHook = this.data.sipRequestWithinDialogHook;
this.referHook = this.data.referHook;
this.recentCallStatus = 0;
this.on('connect', this._onConnect.bind(this));
this.on('callStatus', this._onCallStatus.bind(this));
@@ -57,7 +58,11 @@ class TaskRestDial extends Task {
this._clearCallTimer();
if (this.canCancel) {
this.canCancel = false;
cs?.req?.cancel();
try {
cs?.req?.cancel();
} catch (err) {
this.logger.error({err}, 'TaskRestDial: error cancelling call');
}
}
this.notifyTaskDone();
}
@@ -118,7 +123,8 @@ class TaskRestDial extends Task {
}
_onCallStatus(status) {
this.logger.debug(`CallStatus: ${status}`);
this.logger.debug(`RestDial CallStatus: ${status}`);
this.recentCallStatus = status;
if (status >= 200) {
this.canCancel = false;
this._clearCallTimer();
@@ -136,11 +142,16 @@ class TaskRestDial extends Task {
}
_onCallTimeout() {
this.logger.debug('TaskRestDial: timeout expired without answer, killing task');
this.logger.debug(`TaskRestDial: timeout expired without answer, last status ${this.recentCallStatus}`);
this.timer = null;
if (this.canCancel) {
if (this.canCancel && this.recentCallStatus < 200) {
this.logger.debug('TaskRestDial: cancelling call attempt');
this.canCancel = false;
this.cs?.req?.cancel();
try {
this.cs?.req?.cancel();
} catch (err) {
this.logger.error({err}, 'TaskRestDial: error cancelling call');
}
}
}

View File

@@ -1,26 +1,58 @@
const assert = require('assert');
const TtsTask = require('./tts-task');
const {TaskName, TaskPreconditions} = require('../utils/constants');
const {JAMBONES_SAY_CHUNK_SIZE} = require('../config');
const pollySSMLSplit = require('polly-ssml-split');
const { SpeechCredentialError } = require('../utils/error');
const { SpeechCredentialError, NonFatalTaskError } = require('../utils/error');
const { sleepFor } = require('../utils/helpers');
const { NON_FANTAL_ERRORS } = require('../utils/constants.json');
const breakLengthyTextIfNeeded = (logger, text) => {
const chunkSize = 1000;
/**
* Discard unmatching responses:
* (1) I sent a playback id but get a response with a different playback id
* (2) I sent a playback id but get a response with no playback id
* (3) I did not send a playback id but get a response with a playback id
* (4) I sent a cache file but get a response with a different cache file
*/
const isMatchingEvent = (logger, filename, playbackId, evt) => {
if (!!playbackId && !!evt.variable_tts_playback_id && evt.variable_tts_playback_id === playbackId) {
//logger.debug({filename, playbackId, evt}, 'Say:isMatchingEvent - playbackId matched');
return true;
}
if (!!filename && !!evt.file && evt.file === filename) {
//logger.debug({filename, playbackId, evt}, 'Say:isMatchingEvent - filename matched');
return true;
}
logger.info({filename, playbackId, evt}, 'Say:isMatchingEvent - no match');
return false;
};
const breakLengthyTextIfNeeded = (logger, text) => {
// As The text can be used for tts streaming, we need to break lengthy text into smaller chunks
// HIGH_WATER_BUFFER_SIZE defined in tts-streaming-buffer.js
const chunkSize = JAMBONES_SAY_CHUNK_SIZE;
const isSSML = text.startsWith('<speak>');
if (text.length <= chunkSize || !isSSML) return [text];
const options = {
// MIN length
softLimit: 100,
// MAX length, exclude 15 characters <speak></speak>
hardLimit: chunkSize - 15,
// Set of extra split characters (Optional property)
extraSplitChars: ',;!?',
};
pollySSMLSplit.configure(options);
try {
return pollySSMLSplit.split(text);
if (text.length <= chunkSize) return [text];
if (isSSML) {
return pollySSMLSplit.split(text);
} else {
// Wrap with <speak> and split
const wrapped = `<speak>${text}</speak>`;
const splitArr = pollySSMLSplit.split(wrapped);
// Remove <speak> and </speak> from each chunk
return splitArr.map((str) => str.replace(/^<speak>/, '').replace(/<\/speak>$/, ''));
}
} catch (err) {
logger.info({err}, 'Error spliting SSML long text');
logger.info({err}, 'Error splitting SSML long text');
return [text];
}
};
@@ -39,6 +71,9 @@ class TaskSay extends TtsTask {
assert.ok((typeof this.data.text === 'string' || Array.isArray(this.data.text)) || this.data.stream === true,
'Say: either text or stream:true is required');
this.text = this.data.text ? (Array.isArray(this.data.text) ? this.data.text : [this.data.text])
.map((t) => breakLengthyTextIfNeeded(this.logger, t))
.flat() : [];
if (this.data.stream === true) {
this._isStreamingTts = true;
@@ -46,10 +81,6 @@ class TaskSay extends TtsTask {
}
else {
this._isStreamingTts = false;
this.text = (Array.isArray(this.data.text) ? this.data.text : [this.data.text])
.map((t) => breakLengthyTextIfNeeded(this.logger, t))
.flat();
this.loop = this.data.loop || 1;
this.isHandledByPrimaryProvider = true;
}
@@ -85,15 +116,17 @@ class TaskSay extends TtsTask {
}
try {
this._isStreamingTts = this._isStreamingTts || cs.autoStreamTts;
if (this.isStreamingTts) {
this.closeOnStreamEmpty = this.closeOnStreamEmpty || this.text.length !== 0;
}
if (this.isStreamingTts) await this.handlingStreaming(cs, obj);
else await this.handling(cs, obj);
this.emit('playDone');
} catch (error) {
if (error instanceof SpeechCredentialError) {
// if say failed due to speech credentials, alarm is writtern and error notification is sent
// finished this say to move to next task.
this.logger.info({error}, 'Say failed due to SpeechCredentialError, finished!');
this.emit('playDone');
return;
}
throw error;
@@ -114,8 +147,53 @@ class TaskSay extends TtsTask {
await cs.startTtsStream();
cs.requestor?.request('tts:streaming-event', '/streaming-event', {event_type: 'stream_open'})
.catch((err) => this.logger.info({err}, 'TaskSay:handlingStreaming - Error sending'));
if (this.text.length !== 0) {
this.logger.info('TaskSay:handlingStreaming - sending text to TTS stream');
for (const t of this.text) {
const result = await cs._internalTtsStreamingBufferTokens(t);
if (result?.status === 'failed') {
if (result.reason === 'full') {
// Retry logic for full buffer
const maxRetries = 5;
let backoffMs = 1000;
for (let retryCount = 0; retryCount < maxRetries && !this.killed; retryCount++) {
this.logger.info(
`TaskSay:handlingStreaming - retry ${retryCount + 1}/${maxRetries} after ${backoffMs}ms`);
await sleepFor(backoffMs);
const retryResult = await cs._internalTtsStreamingBufferTokens(t);
// Exit retry loop on success
if (retryResult?.status !== 'failed') {
break;
}
// Handle failure for reason other than full buffer
if (retryResult.reason !== 'full') {
this.logger.info(
{result: retryResult}, 'TaskSay:handlingStreaming - TTS stream failed to buffer tokens');
throw new Error(`TTS stream failed to buffer tokens: ${retryResult.reason}`);
}
// Last retry attempt failed
if (retryCount === maxRetries - 1) {
this.logger.info('TaskSay:handlingStreaming - Maximum retries exceeded for full buffer');
throw new Error('TTS stream buffer full - maximum retries exceeded');
}
// Increase backoff for next retry
backoffMs = Math.min(backoffMs * 1.5, 10000);
}
} else {
// Immediate failure for non-full buffer issues
this.logger.info({result}, 'TaskSay:handlingStreaming - TTS stream failed to buffer tokens');
throw new Error(`TTS stream failed to buffer tokens: ${result.reason}`);
}
} else {
await cs._lccTtsFlush();
}
}
}
} catch (err) {
this.logger.info({err}, 'TaskSay:handlingStreaming - Error setting channel vars');
cs.requestor?.request('tts:streaming-event', '/streaming-event', {event_type: 'stream_closed'})
@@ -188,6 +266,7 @@ class TaskSay extends TtsTask {
throw new SpeechCredentialError(error.message);
}
};
let filepath;
try {
filepath = await this._synthesizeWithSpecificVendor(cs, ep, {vendor, language, voice, label});
@@ -199,77 +278,145 @@ class TaskSay extends TtsTask {
while (!this.killed && (this.loop === 'forever' || this.loop--) && ep?.connected) {
let segment = 0;
while (!this.killed && segment < filepath.length) {
const filename = filepath[segment];
if (cs.isInConference) {
const {memberId, confName, confUuid} = cs;
await this.playToConfMember(ep, memberId, confName, confUuid, filepath[segment]);
await this.playToConfMember(ep, memberId, confName, confUuid, filename);
}
else {
const isStreaming = filepath[segment].startsWith('say:{');
const isStreaming = filename.startsWith('say:{');
if (isStreaming) {
const arr = /^say:\{.*\}\s*(.*)$/.exec(filepath[segment]);
if (arr) this.logger.debug(`Say:exec sending streaming tts request: ${arr[1].substring(0, 64)}..`);
const arr = /^say:\{.*\}\s*(.*)$/.exec(filename);
if (arr) this.logger.debug(`Say:exec sending streaming tts request ${arr[1].substring(0, 64)}..`);
else this.logger.debug(`Say:exec sending ${filename.substring(0, 64)}`);
}
else this.logger.debug(`Say:exec sending ${filepath[segment].substring(0, 64)}`);
ep.once('playback-start', (evt) => {
this.logger.debug({evt}, 'Say got playback-start');
if (this.otelSpan) {
this._addStreamingTtsAttributes(this.otelSpan, evt, vendor);
this.otelSpan.end();
this.otelSpan = null;
if (evt.variable_tts_cache_filename) {
cs.trackTmpFile(evt.variable_tts_cache_filename);
}
}
});
ep.once('playback-stop', (evt) => {
this.logger.debug({evt}, 'Say got playback-stop');
this.notifyStatus({event: 'stop-playback'});
this.notifiedPlayBackStop = true;
const tts_error = evt.variable_tts_error;
let response_code = 200;
// Check if any property ends with _response_code
for (const [key, value] of Object.entries(evt)) {
if (key.endsWith('_response_code')) {
response_code = parseInt(value, 10) || 200;
break;
}
}
if (tts_error) {
writeAlerts({
account_sid,
alert_type: AlertType.TTS_FAILURE,
vendor,
detail: evt.variable_tts_error,
target_sid
}).catch((err) => this.logger.info({err}, 'Error generating alert for no tts'));
}
if (!tts_error && response_code < 300 && evt.variable_tts_cache_filename && !this.killed) {
const text = parseTextFromSayString(this.text[segment]);
addFileToCache(evt.variable_tts_cache_filename, {
account_sid,
vendor,
language,
voice,
engine,
model: this.model || this.model_id,
text,
instructions: this.instructions
}).catch((err) => this.logger.info({err}, 'Error adding file to cache'));
}
const onPlaybackStop = (evt) => {
try {
const playbackId = this.getPlaybackId(segment);
const isMatch = isMatchingEvent(this.logger, filename, playbackId, evt);
if (!isMatch) {
this.logger.info({currentPlaybackId: playbackId, stopPlaybackId: evt.variable_tts_playback_id},
'Say:exec discarding playback-stop for earlier play');
ep.once('playback-stop', this._boundOnPlaybackStop);
if (this._playResolve) {
(tts_error || response_code >= 300) ? this._playReject(new Error(evt.variable_tts_error)) :
this._playResolve();
return;
}
this.logger.debug({evt},
`Say got playback-stop ${evt.variable_tts_playback_id ? evt.variable_tts_playback_id : ''}`);
this.notifyStatus({event: 'stop-playback'});
this.notifiedPlayBackStop = true;
const tts_error = evt.variable_tts_error;
// some tts vendor may not provide response code, so we default to 200
let response_code = 200;
// Check if any property ends with _response_code
for (const [key, value] of Object.entries(evt)) {
if (key.endsWith('_response_code')) {
response_code = parseInt(value, 10);
if (isNaN(response_code)) {
this.logger.info(`Say:exec playback-stop - Invalid response code: ${value}`);
response_code = 0;
}
break;
}
}
if (tts_error ||
// error response codes indicate failure
response_code <= 199 || response_code >= 300) {
writeAlerts({
account_sid,
alert_type: AlertType.TTS_FAILURE,
vendor,
detail: evt.variable_tts_error || `TTS playback failed with response code ${response_code}`,
target_sid
}).catch((err) => this.logger.info({err}, 'Error generating alert for no tts'));
}
if (
!tts_error &&
//2xx response codes indicate success
199 < response_code && response_code < 300 &&
evt.variable_tts_cache_filename &&
!this.killed &&
// if tts cache is not disabled, add the file to cache
!this.disableTtsCache
) {
const text = parseTextFromSayString(this.text[segment]);
this.logger.debug({text, cacheFile: evt.variable_tts_cache_filename}, 'Say:exec cache tts');
addFileToCache(evt.variable_tts_cache_filename, {
account_sid,
vendor,
language,
voice,
engine,
model: this.model || this.model_id,
text,
instructions: this.instructions
}).catch((err) => this.logger.info({err}, 'Error adding file to cache'));
}
if (this._playResolve) {
(tts_error ||
// error response codes indicate failure
response_code <= 199 || response_code >= 300
) ?
this._playReject(
new Error(evt.variable_tts_error || `TTS playback failed with response code ${response_code}`)
) : this._playResolve();
}
} catch (err) {
this.logger.info({err}, 'Error handling playback-stop event');
}
});
};
this._boundOnPlaybackStop = onPlaybackStop.bind(this);
const onPlaybackStart = (evt) => {
try {
const playbackId = this.getPlaybackId(segment);
const isMatch = isMatchingEvent(this.logger, filename, playbackId, evt);
if (!isMatch) {
this.logger.info({currentPlaybackId: playbackId, startPlaybackId: evt.variable_tts_playback_id},
'Say:exec playback-start - unmatched playback_id');
ep.once('playback-start', this._boundOnPlaybackStart);
return;
}
ep.once('playback-stop', this._boundOnPlaybackStop);
this.logger.debug({evt},
`Say got playback-start ${evt.variable_tts_playback_id ? evt.variable_tts_playback_id : ''}`);
if (this.otelSpan) {
this._addStreamingTtsAttributes(this.otelSpan, evt, vendor);
this.otelSpan.end();
this.otelSpan = null;
if (evt.variable_tts_cache_filename) {
cs.trackTmpFile(evt.variable_tts_cache_filename);
}
}
} catch (err) {
this.logger.info({err}, 'Error handling playback-start event');
}
};
this._boundOnPlaybackStart = onPlaybackStart.bind(this);
ep.once('playback-start', this._boundOnPlaybackStart);
// wait for playback-stop event received to confirm if the playback is successful
this._playPromise = new Promise((resolve, reject) => {
this._playResolve = resolve;
this._playReject = reject;
});
const r = await ep.play(filepath[segment]);
this.logger.debug({r}, 'Say:exec play result');
try {
const r = await ep.play(filename);
this.logger.debug({r}, 'Say:exec play result');
if (r.playbackSeconds == null && r.playbackMilliseconds == null && r.playbackLastOffsetPos == null) {
this._playReject(new Error('Playback failed to start'));
}
} catch (err) {
if (NON_FANTAL_ERRORS.includes(err.message)) {
throw new NonFatalTaskError(err.message);
}
throw err;
}
try {
// wait for playback-stop event received to confirm if the playback is successful
await this._playPromise;
@@ -286,12 +433,12 @@ class TaskSay extends TtsTask {
this._playResolve = null;
this._playReject = null;
}
if (filepath[segment].startsWith('say:{')) {
const arr = /^say:\{.*\}\s*(.*)$/.exec(filepath[segment]);
if (filename.startsWith('say:{')) {
const arr = /^say:\{.*\}\s*(.*)$/.exec(filename);
if (arr) this.logger.debug(`Say:exec complete playing streaming tts request: ${arr[1].substring(0, 64)}..`);
} else {
// This log will print spech credentials in say command for tts stream mode
this.logger.debug(`Say:exec completed play file ${filepath[segment]}`);
this.logger.debug(`Say:exec completed play file ${filename}`);
}
}
segment++;
@@ -307,8 +454,8 @@ class TaskSay extends TtsTask {
const {memberId, confName} = cs;
this.killPlayToConfMember(this.ep, memberId, confName);
} else if (this.isStreamingTts) {
this.logger.debug('TaskSay:kill - clearing TTS stream for streaming audio');
cs.clearTtsStream();
this.logger.debug('TaskSay:kill - stopping TTS stream for streaming audio');
cs.stopTtsStream();
} else {
if (!this.notifiedPlayBackStop) {
this.notifyStatus({event: 'stop-playback'});
@@ -338,6 +485,8 @@ class TaskSay extends TtsTask {
.replace('playht_', 'playht.')
.replace('cartesia_', 'cartesia.')
.replace('rimelabs_', 'rimelabs.')
.replace('resemble_', 'resemble.')
.replace('inworld_', 'inworld.')
.replace('verbio_', 'verbio.')
.replace('elevenlabs_', 'elevenlabs.');
if (spanMapping[newKey]) newKey = spanMapping[newKey];
@@ -402,6 +551,14 @@ const spanMapping = {
'rimelabs.name_lookup_time_ms': 'name_lookup_ms',
'rimelabs.connect_time_ms': 'connect_ms',
'rimelabs.final_response_time_ms': 'final_response_ms',
// Resemble
'resemble.connect_time_ms': 'connect_ms',
'resemble.final_response_time_ms': 'final_response_ms',
// inworld
'inworld.name_lookup_time_ms': 'name_lookup_ms',
'inworld.connect_time_ms': 'connect_ms',
'inworld.final_response_time_ms': 'final_response_ms',
'inworld.x_envoy_upstream_service_time': 'upstream_service_time',
// verbio
'verbio.name_lookup_time_ms': 'name_lookup_ms',
'verbio.connect_time_ms': 'connect_ms',

View File

@@ -111,7 +111,12 @@ class TaskSipRefer extends Task {
/* get IP address of the SBC to use as hostname if needed */
const {host} = parseUri(dlg.remote.uri);
if (!referTo.startsWith('<') && !referTo.startsWith('sip') && !referTo.startsWith('"')) {
if (
!referTo.startsWith('<') &&
!referTo.startsWith('sip:') &&
!referTo.startsWith('"') &&
!referTo.startsWith('tel:')
) {
/* they may have only provided a phone number/user */
referTo = `sip:${referTo}@${host}`;
}
@@ -124,7 +129,12 @@ class TaskSipRefer extends Task {
if (!referredByDisplayName) {
referredByDisplayName = cs.req?.callingName;
}
if (!referredBy.startsWith('<') && !referredBy.startsWith('sip') && !referredBy.startsWith('"')) {
if (
!referredBy.startsWith('<') &&
!referredBy.startsWith('sip:') &&
!referredBy.startsWith('"') &&
!referredBy.startsWith('tel:')
) {
/* they may have only provided a phone number/user */
referredBy = `${referredByDisplayName ? `"${referredByDisplayName}"` : ''}<sip:${referredBy}@${host}>`;
}

View File

@@ -4,6 +4,7 @@ const crypto = require('crypto');
const { TaskPreconditions, CobaltTranscriptionEvents } = require('../utils/constants');
const { SpeechCredentialError } = require('../utils/error');
const {JAMBONES_AWS_TRANSCRIBE_USE_GRPC} = require('../config');
const {TaskName} = require('../utils/constants.json');
/**
* "Please insert turns here: {{turns:4}}"
@@ -84,6 +85,9 @@ class SttTask extends Task {
/*bug name prefix */
this.bugname_prefix = '';
// stt latency calculator
this.stt_latency_ms = '';
}
async exec(cs, {ep, ep2}) {
@@ -91,6 +95,12 @@ class SttTask extends Task {
this.ep = ep;
this.ep2 = ep2;
// start vad from stt latency calculator
if (this.name !== TaskName.Gather ||
this.name === TaskName.Gather && this.needsStt) {
cs.startSttLatencyVad();
}
// use session preferences if we don't have specific verb-level settings.
if (cs.recognizer) {
for (const k in cs.recognizer) {
@@ -161,7 +171,7 @@ class SttTask extends Task {
try {
this.sttCredentials = await this._initSpeechCredentials(this.cs, this.vendor, this.label);
} catch (error) {
if (this.canFallback) {
if (this.canFallback()) {
this.notifyError(
{
msg: 'ASR error', details:`Invalid vendor ${this.vendor}, Error: ${error}`,
@@ -195,13 +205,64 @@ class SttTask extends Task {
}
}
async createGladiaLiveSession() {
const { api_key, region = 'us-west' } = this.sttCredentials;
const model = this.data.recognizer.model || 'solaria-1';
const options = this.data.recognizer.gladiaOptions || {};
const url = `https://api.gladia.io/v2/live?region=${region}`;
const response = await fetch(url, {
method: 'POST',
headers: {
'x-gladia-key': api_key,
'Content-Type': 'application/json'
},
body: JSON.stringify({
encoding: 'wav/pcm',
bit_depth: 16,
sample_rate: 8000,
channels: 1,
model,
...options,
messages_config: {
receive_final_transcripts: true,
receive_speech_events: true,
receive_errors: true,
}
})
});
if (!response.ok) {
const error = await response.text();
this.logger.error({url, status: response.status, error}, 'Error creating Gladia live session');
throw new Error(`Error creating Gladia live session: ${response.status} ${error}`);
}
const data = await response.json();
this.logger.debug({url: data.url}, 'Gladia Call registered');
const {host, pathname, search} = new URL(data.url);
return {host, path: `${pathname}${search}`};
}
addCustomEventListener(ep, event, handler) {
this.eventHandlers.push({ep, event, handler});
ep.addCustomEventListener(event, handler);
}
removeCustomEventListeners() {
this.eventHandlers.forEach((h) => h.ep.removeCustomEventListener(h.event, h.handler));
removeCustomEventListeners(ep) {
if (ep) {
// for specific endpoint
this.eventHandlers.filter((h) => h.ep === ep).forEach((h) => {
h.ep.removeCustomEventListener(h.event, h.handler);
});
this.eventHandlers = this.eventHandlers.filter((h) => h.ep !== ep);
return;
} else {
// for all endpoints
this.eventHandlers.forEach((h) => h.ep.removeCustomEventListener(h.event, h.handler));
this.eventHandlers = [];
}
}
async _initSpeechCredentials(cs, vendor, label) {
@@ -215,6 +276,7 @@ class SttTask extends Task {
account_sid: cs.accountSid,
alert_type: AlertType.STT_NOT_PROVISIONED,
vendor,
label,
target_sid: cs.callSid
}).catch((err) => this.logger.info({err}, 'Error generating alert for no stt'));
// the ASR might have fallback configuration, should not done task here.
@@ -269,11 +331,13 @@ class SttTask extends Task {
return credentials;
}
get canFallback() {
canFallback() {
return this.fallbackVendor && this.isHandledByPrimaryProvider && !this.cs.hasFallbackAsr;
}
async _initFallback() {
// ep is optional for gather or any verb that have single ep,
// but transcribe does need as it might has 2 eps
async _initFallback(ep) {
assert(this.fallbackVendor, 'fallback failed without fallbackVendor configuration');
this.logger.info(`Failed to use primary STT provider, fallback to ${this.fallbackVendor}`);
this.isHandledByPrimaryProvider = false;
@@ -286,7 +350,7 @@ class SttTask extends Task {
this.data.recognizer.label = this.label;
this.sttCredentials = await this._initSpeechCredentials(this.cs, this.vendor, this.label);
// cleanup previous listener from previous vendor
this.removeCustomEventListeners();
this.removeCustomEventListeners(ep);
}
async compileHintsForCobalt(ep, hostport, model, token, hints) {
@@ -400,7 +464,7 @@ class SttTask extends Task {
dgOptions.utteranceEndMs = dgOptions.utteranceEndMs || asrTimeout;
}
_onVendorConnect(_cs, _ep) {
_onVendorConnect(cs, _ep) {
this.logger.debug(`TaskGather:_on${this.vendor}Connect`);
}
@@ -413,6 +477,7 @@ class SttTask extends Task {
message: 'STT failure reported by vendor',
detail: evt.error,
vendor: this.vendor,
label: this.label,
target_sid: cs.callSid
}).catch((err) => this.logger.info({err}, `Error generating alert for ${this.vendor} connection failure`));
}
@@ -426,6 +491,7 @@ class SttTask extends Task {
alert_type: AlertType.STT_FAILURE,
message: `Failed connecting to ${this.vendor} speech recognizer: ${reason}`,
vendor: this.vendor,
label: this.label,
target_sid: cs.callSid
}).catch((err) => this.logger.info({err}, `Error generating alert for ${this.vendor} connection failure`));
}

View File

@@ -6,6 +6,8 @@ const {
AwsTranscriptionEvents,
AzureTranscriptionEvents,
DeepgramTranscriptionEvents,
GladiaTranscriptionEvents,
DeepgramfluxTranscriptionEvents,
SonioxTranscriptionEvents,
CobaltTranscriptionEvents,
IbmTranscriptionEvents,
@@ -13,7 +15,9 @@ const {
JambonzTranscriptionEvents,
TranscribeStatus,
AssemblyAiTranscriptionEvents,
HoundifyTranscriptionEvents,
VoxistTranscriptionEvents,
CartesiaTranscriptionEvents,
OpenAITranscriptionEvents,
VerbioTranscriptionEvents,
SpeechmaticsTranscriptionEvents
@@ -66,6 +70,9 @@ class TaskTranscribe extends SttTask {
this._bufferedTranscripts = [ [], [] ]; // for channel 1 and 2
this.bugname_prefix = 'transcribe_';
this.paused = false;
// fallback flags
this.isHandledByPrimaryProviderForEp1 = true;
this.isHandledByPrimaryProviderForEp2 = true;
}
get name() { return TaskName.Transcribe; }
@@ -136,22 +143,30 @@ class TaskTranscribe extends SttTask {
stopTranscription = true;
this.ep.stopTranscription({
vendor: this.vendor,
bugname: this.bugname
bugname: this.bugname,
gracefulShutdown: this.paused ? false : true
})
.catch((err) => this.logger.info(err, 'Error TaskTranscribe:kill'));
}
if (this.transcribing2 && this.ep2?.connected) {
stopTranscription = true;
this.ep2.stopTranscription({vendor: this.vendor, bugname: this.bugname})
this.ep2.stopTranscription({
vendor: this.vendor,
bugname: this.bugname,
gracefulShutdown: this.paused ? false : true
})
.catch((err) => this.logger.info(err, 'Error TaskTranscribe:kill'));
}
this.cs.emit('transcribe-stop');
return stopTranscription;
}
async kill(cs) {
super.kill(cs);
const stopTranscription = this._stopTranscription();
cs.stopSttLatencyVad();
// hangup after 1 sec if we don't get a final transcription
if (stopTranscription) this._timer = setTimeout(() => this.notifyTaskDone(), 1500);
else this.notifyTaskDone();
@@ -227,10 +242,35 @@ class TaskTranscribe extends SttTask {
this._onVendorConnect.bind(this, cs, ep));
this.addCustomEventListener(ep, DeepgramTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep, channel));
this.addCustomEventListener(ep, DeepgramTranscriptionEvents.Error, this._onVendorError.bind(this, cs, ep));
/* if app sets deepgramOptions.utteranceEndMs they essentially want continuous asr */
//if (opts.DEEPGRAM_SPEECH_UTTERANCE_END_MS) this.isContinuousAsr = true;
break;
case 'deepgramflux':
this.bugname = `${this.bugname_prefix}deepgramflux_transcribe`;
this.addCustomEventListener(ep, DeepgramfluxTranscriptionEvents.Transcription,
this._onTranscription.bind(this, cs, ep, channel));
this.addCustomEventListener(ep, DeepgramfluxTranscriptionEvents.Connect,
this._onVendorConnect.bind(this, cs, ep));
this.addCustomEventListener(ep, DeepgramfluxTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep, channel));
this.addCustomEventListener(ep, DeepgramfluxTranscriptionEvents.Error, this._onVendorError.bind(this, cs, ep));
break;
case 'gladia':
this.bugname = `${this.bugname_prefix}gladia_transcribe`;
this.addCustomEventListener(ep, GladiaTranscriptionEvents.Transcription,
this._onTranscription.bind(this, cs, ep, channel));
this.addCustomEventListener(ep, GladiaTranscriptionEvents.Connect,
this._onVendorConnect.bind(this, cs, ep));
this.addCustomEventListener(ep, GladiaTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep, channel));
this.addCustomEventListener(ep, GladiaTranscriptionEvents.Error, this._onVendorError.bind(this, cs, ep));
break;
case 'soniox':
this.bugname = `${this.bugname_prefix}soniox_transcribe`;
@@ -301,6 +341,18 @@ class TaskTranscribe extends SttTask {
this._onVendorConnectFailure.bind(this, cs, ep, channel));
break;
case 'houndify':
this.bugname = `${this.bugname_prefix}houndify_transcribe`;
this.addCustomEventListener(ep, HoundifyTranscriptionEvents.Transcription,
this._onTranscription.bind(this, cs, ep, channel));
this.addCustomEventListener(ep, HoundifyTranscriptionEvents.Error,
this._onVendorError.bind(this, cs, ep));
this.addCustomEventListener(ep, HoundifyTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep, channel));
this.addCustomEventListener(ep, HoundifyTranscriptionEvents.Connect,
this._onVendorConnect.bind(this, cs, ep));
break;
case 'voxist':
this.bugname = `${this.bugname_prefix}voxist_transcribe`;
this.addCustomEventListener(ep, VoxistTranscriptionEvents.Transcription,
@@ -312,6 +364,17 @@ class TaskTranscribe extends SttTask {
this._onVendorConnectFailure.bind(this, cs, ep, channel));
break;
case 'cartesia':
this.bugname = `${this.bugname_prefix}cartesia_transcribe`;
this.addCustomEventListener(ep, CartesiaTranscriptionEvents.Transcription,
this._onTranscription.bind(this, cs, ep, channel));
this.addCustomEventListener(ep,
CartesiaTranscriptionEvents.Connect, this._onVendorConnect.bind(this, cs, ep));
this.addCustomEventListener(ep, CartesiaTranscriptionEvents.Error, this._onVendorError.bind(this, cs, ep));
this.addCustomEventListener(ep, CartesiaTranscriptionEvents.ConnectFailure,
this._onVendorConnectFailure.bind(this, cs, ep, channel));
break;
case 'speechmatics':
this.bugname = `${this.bugname_prefix}speechmatics_transcribe`;
this.addCustomEventListener(
@@ -396,6 +459,14 @@ class TaskTranscribe extends SttTask {
else if (this.data.recognizer?.hints?.length > 0) {
prompt = this.data.recognizer?.hints.join(', ');
}
} else if (this.vendor === 'gladia') {
// gladia require unique url for each session
const {host, path} = await this.createGladiaLiveSession();
await ep.set({
GLADIA_SPEECH_HOST: host,
GLADIA_SPEECH_PATH: path,
})
.catch((err) => this.logger.info(err, 'Error setting channel variables'));
}
await ep.startTranscription({
@@ -406,9 +477,16 @@ class TaskTranscribe extends SttTask {
bugname: this.bugname,
hostport: this.hostport
});
// Some vendor use single connection, that we cannot use onConnect event to track transcription start
this.cs.emit('transcribe-start');
}
async _onTranscription(cs, ep, channel, evt, fsEvent) {
// check if we are in graceful shutdown mode
if (ep.gracefulShutdownResolver) {
ep.gracefulShutdownResolver();
}
// make sure this is not a transcript from answering machine detection
const bugname = fsEvent.getHeader('media-bugname');
const finished = fsEvent.getHeader('transcription-session-finished');
@@ -420,6 +498,9 @@ class TaskTranscribe extends SttTask {
if (this.vendor === 'ibm' && evt?.state === 'listening') return;
// emit an event to the call session to track the time transcription is received
cs.emit('on-transcription');
if (this.vendor === 'deepgram' && evt.type === 'UtteranceEnd') {
/* we will only get this when we have set utterance_end_ms */
@@ -581,14 +662,28 @@ class TaskTranscribe extends SttTask {
}
async _resolve(channel, evt) {
let sttLatencyMetrics = {};
if (evt.is_final) {
const sttLatency = this.cs.calculateSttLatency();
if (sttLatency) {
sttLatencyMetrics = {
'stt.latency_ms': `${sttLatency.stt_latency_ms}`,
'stt.talkspurts': JSON.stringify(sttLatency.talkspurts),
'stt.start_time': sttLatency.stt_start_time,
'stt.stop_time': sttLatency.stt_stop_time,
'stt.usage': sttLatency.stt_usage,
};
}
// time to reset the stt latency
this.cs.emit('transcribe-start');
/* we've got a final transcript, so end the otel child span for this channel */
if (this.childSpan[channel - 1] && this.childSpan[channel - 1].span) {
this.childSpan[channel - 1].span.setAttributes({
channel,
'stt.label': this.label || 'None',
'stt.resolve': 'transcript',
'stt.result': JSON.stringify(evt)
'stt.result': JSON.stringify(evt),
...sttLatencyMetrics
});
this.childSpan[channel - 1].span.end();
}
@@ -597,9 +692,13 @@ class TaskTranscribe extends SttTask {
if (this.transcriptionHook) {
const b3 = this.getTracingPropagation();
const httpHeaders = b3 && {b3};
const latencies = Object.fromEntries(
Object.entries(sttLatencyMetrics).map(([key, value]) => [key.replace('stt.', 'stt_'), value])
);
const payload = {
...this.cs.callInfo,
...httpHeaders,
...latencies,
...(evt.alternatives && {speech: evt}),
...(evt.type && {speechEvent: evt})
};
@@ -688,16 +787,17 @@ class TaskTranscribe extends SttTask {
}
async _startFallback(cs, _ep, evt) {
if (this.canFallback) {
if (this.canFallback(_ep)) {
_ep.stopTranscription({
vendor: this.vendor,
bugname: this.bugname
bugname: this.bugname,
gracefulShutdown: false
})
.catch((err) => this.logger.error({err}, `Error stopping transcription for primary vendor ${this.vendor}`));
try {
this.notifyError({ msg: 'ASR error',
details:`STT Vendor ${this.vendor} error: ${evt.error || evt.reason}`, failover: 'in progress'});
await this._initFallback();
await this._initFallback(_ep);
let channel = 1;
if (this.ep !== _ep) {
channel = 2;
@@ -806,6 +906,41 @@ class TaskTranscribe extends SttTask {
if (this._asrTimer) clearTimeout(this._asrTimer);
this._asrTimer = null;
}
// We need to keep track the fallback is happened for each endpoint
// override the canFallback and _initFallback methods to make sure that
// we only fallback once per endpoint
// we want to keep track this on task level instead of endpoint level
// because the endpoint instance is used across multiple tasks.
canFallback(ep) {
let isHandledByPrimaryProvider = this.isHandledByPrimaryProvider;
if (ep === this.ep) {
isHandledByPrimaryProvider = this.isHandledByPrimaryProviderForEp1;
} else if (ep === this.ep2) {
isHandledByPrimaryProvider = this.isHandledByPrimaryProviderForEp2;
}
const isOneOfEndpointAlreadyFallenBack = !!this.ep && !!this.ep2 &&
this.isHandledByPrimaryProviderForEp1 !== this.isHandledByPrimaryProviderForEp2;
// fallback is configured
return this.fallbackVendor &&
// has this endpoint already fallen back
isHandledByPrimaryProvider &&
// in global level, is there any fallback is already happened
// one fallen endpoint will mark cs.hasFallbackAsr to true,
// so if one endpoint was fallen, the other endpoint would be able to fallback.
(isOneOfEndpointAlreadyFallenBack || !this.cs.hasFallbackAsr);
}
_initFallback(ep) {
if (ep === this.ep) {
this.isHandledByPrimaryProviderForEp1 = false;
} else if (ep === this.ep2) {
this.isHandledByPrimaryProviderForEp2 = false;
}
return super._initFallback(ep);
}
}
module.exports = TaskTranscribe;

View File

@@ -3,6 +3,16 @@ const { TaskPreconditions } = require('../utils/constants');
const { SpeechCredentialError } = require('../utils/error');
const dbUtils = require('../utils/db-utils');
const extractPlaybackId = (str) => {
// Match say:{...} and capture the content inside braces
const match = str.match(/say:\{([^}]*)\}/);
if (!match) return null;
// Look for playback_id=value within the captured content
const playbackMatch = match[1].match(/playback_id=([^,]*)/);
return playbackMatch ? playbackMatch[1] : null;
};
class TtsTask extends Task {
constructor(logger, data, parentTask) {
@@ -21,11 +31,21 @@ class TtsTask extends Task {
this.synthesizer = this.data.synthesizer || {};
this.disableTtsCache = this.data.disableTtsCache;
this.options = this.synthesizer.options || {};
this.instructions = this.data.instructions;
this.instructions = this.data.instructions || this.options.instructions;
this.playbackIds = [];
this.useGeminiTts = this.options.useGeminiTts;
}
getPlaybackId(offset) {
return this.playbackIds[offset];
}
async exec(cs) {
super.exec(cs);
// update disableTtsCache from call session if not set in task
if (this.data.disableTtsCache == null) {
this.disableTtsCache = cs.disableTtsCache;
}
if (cs.synthesizer) {
this.options = {...cs.synthesizer.options, ...this.options};
this.data.synthesizer = this.data.synthesizer || {};
@@ -66,55 +86,67 @@ class TtsTask extends Task {
}
async setTtsStreamingChannelVars(vendor, language, voice, credentials, ep) {
const {api_key, model_id, custom_tts_streaming_url, auth_token} = credentials;
let obj;
const {api_key, model_id, api_uri, custom_tts_streaming_url, auth_token, options} = credentials;
// api_key, model_id, api_uri, custom_tts_streaming_url, and auth_token are encoded in the credentials
// allow them to be overriden via config, using options
// give preference to options passed in via config
const parsed_options = options ? JSON.parse(options) : {};
const local_options = {...parsed_options, ...this.options};
const local_voice_settings = {...(parsed_options.voice_settings || {}), ...(this.options.voice_settings || {})};
const local_api_key = local_options.api_key ?? api_key;
const local_model_id = local_options.model_id ?? model_id;
const local_api_uri = local_options.api_uri ?? api_uri;
const local_custom_tts_streaming_url = local_options.custom_tts_streaming_url ?? custom_tts_streaming_url;
const local_auth_token = local_options.auth_token ?? auth_token;
this.logger.debug({credentials},
`setTtsStreamingChannelVars: vendor: ${vendor}, language: ${language}, voice: ${voice}`);
let obj;
switch (vendor) {
case 'deepgram':
obj = {
DEEPGRAM_API_KEY: api_key,
DEEPGRAM_API_KEY: local_api_key,
DEEPGRAM_TTS_STREAMING_MODEL: voice
};
break;
case 'cartesia':
obj = {
CARTESIA_API_KEY: api_key,
CARTESIA_TTS_STREAMING_MODEL_ID: model_id,
CARTESIA_API_KEY: local_api_key,
CARTESIA_TTS_STREAMING_MODEL_ID: local_model_id,
CARTESIA_TTS_STREAMING_VOICE_ID: voice,
CARTESIA_TTS_STREAMING_LANGUAGE: language || 'en',
};
break;
case 'elevenlabs':
const {stability, similarity_boost, use_speaker_boost, style, speed} = this.options.voice_settings || {};
// eslint-disable-next-line max-len
const {stability, similarity_boost, use_speaker_boost, style, speed} = local_voice_settings || {};
obj = {
ELEVENLABS_API_KEY: api_key,
ELEVENLABS_TTS_STREAMING_MODEL_ID: model_id,
ELEVENLABS_API_KEY: local_api_key,
...(api_uri && {ELEVENLABS_API_URI: local_api_uri}),
ELEVENLABS_TTS_STREAMING_MODEL_ID: local_model_id,
ELEVENLABS_TTS_STREAMING_VOICE_ID: voice,
// 20/12/2024 - only eleven_turbo_v2_5 support multiple language
...(['eleven_turbo_v2_5'].includes(model_id) && {ELEVENLABS_TTS_STREAMING_LANGUAGE: language}),
...(['eleven_turbo_v2_5'].includes(local_model_id) && {ELEVENLABS_TTS_STREAMING_LANGUAGE: language}),
...(stability && {ELEVENLABS_TTS_STREAMING_VOICE_SETTINGS_STABILITY: stability}),
...(similarity_boost && {ELEVENLABS_TTS_STREAMING_VOICE_SETTINGS_SIMILARITY_BOOST: similarity_boost}),
...(use_speaker_boost && {ELEVENLABS_TTS_STREAMING_VOICE_SETTINGS_USE_SPEAKER_BOOST: use_speaker_boost}),
...(style && {ELEVENLABS_TTS_STREAMING_VOICE_SETTINGS_STYLE: style}),
// speed has value 0.7 to 1.2, 1.0 is default, make sure we send the value event it's 0
...(speed !== null && speed !== undefined && {ELEVENLABS_TTS_STREAMING_VOICE_SETTINGS_SPEED: `${speed}`}),
...(this.options.pronunciation_dictionary_locators &&
Array.isArray(this.options.pronunciation_dictionary_locators) && {
...(local_options.pronunciation_dictionary_locators &&
Array.isArray(local_options.pronunciation_dictionary_locators) && {
ELEVENLABS_TTS_STREAMING_PRONUNCIATION_DICTIONARY_LOCATORS:
JSON.stringify(this.options.pronunciation_dictionary_locators)
JSON.stringify(local_options.pronunciation_dictionary_locators)
}),
};
break;
case 'rimelabs':
const {
pauseBetweenBrackets, phonemizeBetweenBrackets, inlineSpeedAlpha, speedAlpha, reduceLatency
} = this.options;
} = local_options;
obj = {
RIMELABS_API_KEY: api_key,
RIMELABS_TTS_STREAMING_MODEL_ID: model_id,
RIMELABS_API_KEY: local_api_key,
RIMELABS_TTS_STREAMING_MODEL_ID: local_model_id,
RIMELABS_TTS_STREAMING_VOICE_ID: voice,
RIMELABS_TTS_STREAMING_LANGUAGE: language || 'en',
...(pauseBetweenBrackets && {RIMELABS_TTS_STREAMING_PAUSE_BETWEEN_BRACKETS: pauseBetweenBrackets}),
@@ -129,8 +161,8 @@ class TtsTask extends Task {
if (vendor.startsWith('custom:')) {
const use_tls = custom_tts_streaming_url.startsWith('wss://');
obj = {
CUSTOM_TTS_STREAMING_HOST: custom_tts_streaming_url.replace(/^(ws|wss):\/\//, ''),
CUSTOM_TTS_STREAMING_API_KEY: auth_token,
CUSTOM_TTS_STREAMING_HOST: local_custom_tts_streaming_url.replace(/^(ws|wss):\/\//, ''),
CUSTOM_TTS_STREAMING_API_KEY: local_auth_token,
CUSTOM_TTS_STREAMING_VOICE_ID: voice,
CUSTOM_TTS_STREAMING_LANGUAGE: language || 'en',
CUSTOM_TTS_STREAMING_USE_TLS: use_tls
@@ -185,6 +217,9 @@ class TtsTask extends Task {
} else if (vendor === 'rimelabs') {
credentials = credentials || {};
credentials.model_id = this.options.model_id || credentials.model_id;
} else if (vendor === 'inworld') {
credentials = credentials || {};
credentials.model_id = this.options.model_id || credentials.model_id;
} else if (vendor === 'whisper') {
credentials = credentials || {};
credentials.model_id = this.options.model_id || credentials.model_id;
@@ -208,6 +243,8 @@ class TtsTask extends Task {
}
} else if (vendor === 'cartesia') {
credentials.model_id = this.options.model_id || credentials.model_id;
} else if (vendor === 'google') {
this.model = this.options.model || credentials.credentials.model_id;
}
this.model_id = credentials.model_id;
@@ -240,15 +277,16 @@ class TtsTask extends Task {
account_sid,
alert_type: AlertType.TTS_NOT_PROVISIONED,
vendor,
label,
target_sid: cs.callSid
}).catch((err) => this.logger.info({err}, 'Error generating alert for no tts'));
throw new SpeechCredentialError('no provisioned speech credentials for TTS');
}
/* produce an audio segment from the provided text */
const generateAudio = async(text) => {
if (this.killed) return;
if (text.startsWith('silence_stream://')) return text;
const generateAudio = async(text, index) => {
if (this.killed) return {index, filePath: null};
if (text.startsWith('silence_stream://')) return {index, filePath: text};
/* otel: trace time for tts */
if (!preCache && !this._disableTracing) {
@@ -290,16 +328,35 @@ class TtsTask extends Task {
vendor,
language,
characters: text.length,
elapsedTime: rtt
elapsedTime: rtt,
servedFromCache,
'id': this.id
});
}
if (servedFromCache) {
this.notifyStatus({
event: 'synthesized-audio',
vendor,
language,
servedFromCache,
'id': this.id
});
}
return {index, filePath, playbackId: null};
}
else {
const playbackId = extractPlaybackId(filePath);
this.logger.debug('Say: a streaming tts api will be used');
const modifiedPath = filePath.replace('say:{', `say:{session-uuid=${ep.uuid},`);
return modifiedPath;
this.notifyStatus({
event: 'synthesized-audio',
vendor,
language,
servedFromCache,
'id': this.id
});
return {index, filePath: modifiedPath, playbackId};
}
return filePath;
} catch (err) {
this.logger.info({err}, 'Error synthesizing tts');
if (this.otelSpan) this.otelSpan.end();
@@ -307,6 +364,7 @@ class TtsTask extends Task {
account_sid: cs.accountSid,
alert_type: AlertType.TTS_FAILURE,
vendor,
label,
detail: err.message,
target_sid: cs.callSid
}).catch((err) => this.logger.info({err}, 'Error generating alert for tts failure'));
@@ -314,8 +372,20 @@ class TtsTask extends Task {
}
};
const arr = this.text.map((t) => (this._validateURL(t) ? t : generateAudio(t)));
return (await Promise.all(arr)).filter((fp) => fp && fp.length);
// process all text segments in parallel will cause ordering issue
// so we attach index to each promise result and sort them later
const arr = this.text.map((t, index) => (this._validateURL(t) ?
Promise.resolve({index, filePath: t, playbackId: null}) : generateAudio(t, index)));
const results = await Promise.all(arr);
const sorted = results.sort((a, b) => a.index - b.index);
return sorted
.filter((fp) => fp.filePath && fp.filePath.length)
.map((r) => {
this.playbackIds.push(r.playbackId);
return r.filePath;
});
} catch (err) {
this.logger.info(err, 'TaskSay:exec error');
throw err;

View File

@@ -118,6 +118,13 @@ class ActionHookDelayProcessor extends Emitter {
this.logger.debug('ActionHookDelayProcessor#_onNoResponseTimer');
this._noResponseTimer = null;
/* check if endpoint is still available (call may have ended) */
if (!this.ep) {
this.logger.debug('ActionHookDelayProcessor#_onNoResponseTimer: endpoint is null, call may have ended');
this._active = false;
return;
}
/* get the next play or say action */
const verb = this.actions[this._retryCount % this.actions.length];
@@ -129,8 +136,8 @@ class ActionHookDelayProcessor extends Emitter {
this._taskInProgress.exec(this.cs, {ep: this.ep}).catch((err) => {
this.logger.info(`ActionHookDelayProcessor#_onNoResponseTimer: error playing file: ${err.message}`);
this._taskInProgress = null;
this.ep.removeAllListeners('playback-start');
this.ep.removeAllListeners('playback-stop');
this.ep?.removeAllListeners('playback-start');
this.ep?.removeAllListeners('playback-stop');
});
} catch (err) {
this.logger.info(err, 'ActionHookDelayProcessor#_onNoResponseTimer: error starting action');

View File

@@ -281,13 +281,17 @@ module.exports = (logger) => {
/* set stt options */
logger.info(`starting amd for vendor ${vendor} and language ${language}`);
const sttOpts = amd.setChannelVarsForStt({name: TaskName.Gather}, sttCredentials, language, {
vendor,
hints,
enhancedModel: true,
altLanguages: opts.recognizer?.altLanguages || [],
initialSpeechTimeoutMs: opts.resolveTimeoutMs,
});
/* if opts contains recognizer object use that config for stt, otherwise use defaults */
const rOpts = opts.recognizer ?
opts.recognizer :
{
vendor,
hints,
enhancedModel: true,
altLanguages: opts.recognizer?.altLanguages || [],
initialSpeechTimeoutMs: opts.resolveTimeoutMs,
};
const sttOpts = amd.setChannelVarsForStt({name: TaskName.Gather}, sttCredentials, language, rOpts);
await ep.set(sttOpts).catch((err) => logger.info(err, 'Error setting channel variables'));
@@ -401,24 +405,30 @@ module.exports = (logger) => {
if (ep.amd) {
vendor = ep.amd.vendor;
ep.amd.stopAllTimers();
ep.removeListener(GoogleTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(GoogleTranscriptionEvents.EndOfUtterance, ep.amd.EndOfUtteranceHandler);
ep.removeListener(AwsTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(AzureTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(AzureTranscriptionEvents.NoSpeechDetected, ep.amd.noSpeechHandler);
ep.removeListener(NuanceTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(DeepgramTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(SonioxTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(IbmTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(NvidiaTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(JambonzTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
try {
ep.removeListener(GoogleTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(GoogleTranscriptionEvents.EndOfUtterance, ep.amd.EndOfUtteranceHandler);
ep.removeListener(AwsTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(AzureTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(AzureTranscriptionEvents.NoSpeechDetected, ep.amd.noSpeechHandler);
ep.removeListener(NuanceTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(DeepgramTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(SonioxTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(IbmTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(NvidiaTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
ep.removeListener(JambonzTranscriptionEvents.Transcription, ep.amd.transcriptionHandler);
} catch (error) {
logger.error('Unable to Remove AMD Listener', error);
}
ep.amd = null;
}
if (ep.connected) {
ep.stopTranscription({vendor, bugname})
ep.stopTranscription({
vendor,
bugname,
gracefulShutdown: false
})
.catch((err) => logger.info(err, 'stopAmd: Error stopping transcription'));
task.emit('amd', {type: AmdEvents.Stopped});
ep.execute('avmd_stop').catch((err) => this.logger.info(err, 'Error stopping avmd'));

View File

@@ -65,7 +65,7 @@ class BackgroundTaskManager extends Emitter {
this.logger.info(`stopping background task: ${type}`);
task.removeAllListeners();
task.span.end();
task.kill();
task.kill(this.cs);
// Remove task from managed List
this.tasks.delete(type);
}
@@ -135,26 +135,24 @@ class BackgroundTaskManager extends Emitter {
// Initiate Record
async _initRecord() {
if (this.cs.accountInfo.account.record_all_calls || this.cs.application.record_all_calls) {
if (!JAMBONZ_RECORD_WS_BASE_URL || !this.cs.accountInfo.account.bucket_credential) {
this.logger.error('_initRecord: invalid cfg - missing JAMBONZ_RECORD_WS_BASE_URL or bucket config');
return undefined;
}
const listenOpts = {
url: `${JAMBONZ_RECORD_WS_BASE_URL}/record/${this.cs.accountInfo.account.bucket_credential.vendor}`,
disableBidirectionalAudio: true,
mixType : 'stereo',
passDtmf: true
};
if (JAMBONZ_RECORD_WS_USERNAME && JAMBONZ_RECORD_WS_PASSWORD) {
listenOpts.wsAuth = {
username: JAMBONZ_RECORD_WS_USERNAME,
password: JAMBONZ_RECORD_WS_PASSWORD
};
}
this.logger.debug({listenOpts}, '_initRecord: enabling listen');
return await this._initListen({verb: 'listen', ...listenOpts}, 'jambonz-session-record', true, 'record');
if (!JAMBONZ_RECORD_WS_BASE_URL || !this.cs.accountInfo.account.bucket_credential) {
this.logger.error('_initRecord: invalid cfg - missing JAMBONZ_RECORD_WS_BASE_URL or bucket config');
return undefined;
}
const listenOpts = {
url: `${JAMBONZ_RECORD_WS_BASE_URL}/record/${this.cs.accountInfo.account.bucket_credential.vendor}`,
disableBidirectionalAudio: true,
mixType : 'stereo',
passDtmf: true
};
if (JAMBONZ_RECORD_WS_USERNAME && JAMBONZ_RECORD_WS_PASSWORD) {
listenOpts.wsAuth = {
username: JAMBONZ_RECORD_WS_USERNAME,
password: JAMBONZ_RECORD_WS_PASSWORD
};
}
this.logger.debug({listenOpts}, '_initRecord: enabling listen');
return await this._initListen({verb: 'listen', ...listenOpts}, 'jambonz-session-record', true, 'record');
}
// Initiate Transcribe

View File

@@ -1,5 +1,6 @@
{
"TaskName": {
"Alert": "alert",
"Answer": "answer",
"Conference": "conference",
"Config": "config",
@@ -93,7 +94,20 @@
"DeepgramTranscriptionEvents": {
"Transcription": "deepgram_transcribe::transcription",
"ConnectFailure": "deepgram_transcribe::connect_failed",
"Connect": "deepgram_transcribe::connect"
"Connect": "deepgram_transcribe::connect",
"Error": "deepgram_transcribe::error"
},
"DeepgramfluxTranscriptionEvents": {
"Transcription": "deepgramflux_transcribe::transcription",
"ConnectFailure": "deepgramflux_transcribe::connect_failed",
"Connect": "deepgramflux_transcribe::connect",
"Error": "deepgramflux_transcribe::error"
},
"GladiaTranscriptionEvents": {
"Transcription": "gladia_transcribe::transcription",
"ConnectFailure": "gladia_transcribe::connect_failed",
"Connect": "gladia_transcribe::connect",
"Error": "gladia_transcribe::error"
},
"SonioxTranscriptionEvents": {
"Transcription": "soniox_transcribe::transcription",
@@ -161,15 +175,30 @@
"ConnectFailure": "assemblyai_transcribe::connect_failed",
"Connect": "assemblyai_transcribe::connect"
},
"HoundifyTranscriptionEvents": {
"Transcription": "houndify_transcribe::transcription",
"Error": "houndify_transcribe::error",
"ConnectFailure": "houndify_transcribe::connect_failed",
"Connect": "houndify_transcribe::connect"
},
"VoxistTranscriptionEvents": {
"Transcription": "voxist_transcribe::transcription",
"Error": "voxist_transcribe::error",
"ConnectFailure": "voxist_transcribe::connect_failed",
"Connect": "voxist_transcribe::connect"
},
"CartesiaTranscriptionEvents": {
"Transcription": "cartesia_transcribe::transcription",
"Error": "cartesia_transcribe::error",
"ConnectFailure": "cartesia_transcribe::connect_failed",
"Connect": "cartesia_transcribe::connect"
},
"VadDetection": {
"Detection": "vad_detect:detection"
},
"SileroVadDetection": {
"Detection": "vad_silero:detect"
},
"ListenEvents": {
"Connect": "mod_audio_fork::connect",
"ConnectFailure": "mod_audio_fork::connect_failed",
@@ -237,6 +266,7 @@
"KillReason": {
"Hangup": "hangup",
"Replaced": "replaced",
"ReferComplete": "refer-complete",
"MediaTimeout": "media_timeout"
},
"HookMsgTypes": [
@@ -305,7 +335,8 @@
"Empty": "tts_streaming::empty",
"Pause": "tts_streaming::pause",
"Resume": "tts_streaming::resume",
"ConnectFailure": "tts_streaming::connect_failed"
"ConnectFailure": "tts_streaming::connect_failed",
"Connected": "tts_streaming::connected"
},
"TtsStreamingConnectionStatus": {
"NotConnected": "not_connected",
@@ -325,5 +356,8 @@
"WS_CLOSE_CODES": {
"NormalClosure": 1000,
"GoingAway": 1001
}
},
"NON_FANTAL_ERRORS": [
"File Not Found"
]
}

View File

@@ -81,6 +81,15 @@ const speechMapper = (cred) => {
obj.deepgram_tts_uri = o.deepgram_tts_uri;
obj.deepgram_stt_use_tls = o.deepgram_stt_use_tls;
}
else if ('gladia' === obj.vendor) {
const o = JSON.parse(decrypt(credential));
obj.api_key = o.api_key;
obj.region = o.region;
}
else if ('deepgramflux' === obj.vendor) {
const o = JSON.parse(decrypt(credential));
obj.api_key = o.api_key;
}
else if ('soniox' === obj.vendor) {
const o = JSON.parse(decrypt(credential));
obj.api_key = o.api_key;
@@ -97,6 +106,7 @@ const speechMapper = (cred) => {
const o = JSON.parse(decrypt(credential));
obj.api_key = o.api_key;
obj.model_id = o.model_id;
obj.api_uri = o.api_uri;
obj.options = o.options;
}
else if ('playht' === obj.vendor) {
@@ -110,6 +120,7 @@ const speechMapper = (cred) => {
const o = JSON.parse(decrypt(credential));
obj.api_key = o.api_key;
obj.model_id = o.model_id;
obj.stt_model_id = o.stt_model_id;
obj.embedding = o.embedding;
obj.options = o.options;
}
@@ -119,9 +130,29 @@ const speechMapper = (cred) => {
obj.model_id = o.model_id;
obj.options = o.options;
}
else if ('resemble' === obj.vendor) {
const o = JSON.parse(decrypt(credential));
obj.api_key = o.api_key;
obj.resemble_tts_use_tls = o.resemble_tts_use_tls;
obj.resemble_tts_uri = o.resemble_tts_uri;
}
else if ('inworld' === obj.vendor) {
const o = JSON.parse(decrypt(credential));
obj.api_key = o.api_key;
obj.model_id = o.model_id;
obj.options = o.options;
}
else if ('assemblyai' === obj.vendor) {
const o = JSON.parse(decrypt(credential));
obj.api_key = o.api_key;
obj.service_version = o.service_version;
}
else if ('houndify' === obj.vendor) {
const o = JSON.parse(decrypt(credential));
obj.client_id = o.client_id;
obj.client_key = o.client_key;
obj.user_id = o.user_id;
obj.houndify_server_uri = o.houndify_server_uri;
}
else if ('voxist' === obj.vendor) {
const o = JSON.parse(decrypt(credential));

View File

@@ -41,7 +41,7 @@ class HttpRequestor extends BaseRequestor {
constructor(logger, account_sid, hook, secret) {
super(logger, account_sid, hook, secret);
this.method = hook.method || 'POST';
this.method = hook.method?.toUpperCase() || 'POST';
this.authHeader = basicAuth(hook.username, hook.password);
this.backoffMs = 500;
@@ -111,7 +111,7 @@ class HttpRequestor extends BaseRequestor {
const payload = params ? snakeCaseKeys(params, ['customerData', 'sip', 'env_vars', 'args']) : null;
const url = hook.url || hook;
const method = hook.method || 'POST';
const method = hook.method?.toUpperCase() || 'POST';
let buf = '';
httpHeaders = {
...httpHeaders,
@@ -119,7 +119,7 @@ class HttpRequestor extends BaseRequestor {
};
assert.ok(url, 'HttpRequestor:request url was not provided');
assert.ok, (['GET', 'POST'].includes(method), `HttpRequestor:request method must be 'GET' or 'POST' not ${method}`);
assert.ok(['GET', 'POST'].includes(method), `HttpRequestor:request method must be 'GET' or 'POST' not ${method}`);
const startAt = process.hrtime();
/* if we have an absolute url, and it is ws then do a websocket connection */
@@ -191,7 +191,7 @@ class HttpRequestor extends BaseRequestor {
method,
headers: hdrs,
...('POST' === method && {body: JSON.stringify(payload)}),
timeout: HTTP_TIMEOUT,
headersTimeout: HTTP_TIMEOUT,
followRedirects: false
};

View File

@@ -173,7 +173,8 @@ function installSrfLocals(srf, logger, {
lookupAccountCapacitiesBySid,
lookupSmppGateways,
lookupClientByAccountAndUsername,
lookupSystemInformation
lookupSystemInformation,
lookupLcrByAccount
} = require('@jambonz/db-helpers')({
host: JAMBONES_MYSQL_HOST,
user: JAMBONES_MYSQL_USER,
@@ -279,7 +280,8 @@ function installSrfLocals(srf, logger, {
retrieveByPatternSortedSet,
sortedSetLength,
sortedSetPositionByPattern,
getVerbioAccessToken
getVerbioAccessToken,
lookupLcrByAccount
},
parentLogger: logger,
getSBC,

115
lib/utils/media-endpoint.js Normal file
View File

@@ -0,0 +1,115 @@
const {
JAMBONES_USE_FREESWITCH_TIMER_FD,
JAMBONES_MEDIA_TIMEOUT_MS,
JAMBONES_MEDIA_HOLD_TIMEOUT_MS,
JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS,
} = require('../config');
const { sleepFor } = require('./helpers');
const createMediaEndpoint = async(srf, logger, {
activeMs,
drachtioFsmrfOptions = {},
onHoldMusic,
inbandDtmfEnabled,
mediaTimeoutHandler,
} = {}) => {
const { getFreeswitch } = srf.locals;
const ms = activeMs || getFreeswitch();
if (!ms)
throw new Error('no available Freeswitch for creating media endpoint');
const ep = await ms.createEndpoint(drachtioFsmrfOptions);
// Configure the endpoint
const opts = {
...(onHoldMusic && {holdMusic: `shout://${onHoldMusic.replace(/^https?:\/\//, '')}`}),
...(JAMBONES_USE_FREESWITCH_TIMER_FD && {timer_name: 'timerfd'}),
...(JAMBONES_MEDIA_TIMEOUT_MS && {media_timeout: JAMBONES_MEDIA_TIMEOUT_MS}),
...(JAMBONES_MEDIA_HOLD_TIMEOUT_MS && {media_hold_timeout: JAMBONES_MEDIA_HOLD_TIMEOUT_MS})
};
if (Object.keys(opts).length > 0) {
ep.set(opts);
}
// inbandDtmfEnabled
if (inbandDtmfEnabled) {
// https://developer.signalwire.com/freeswitch/FreeSWITCH-Explained/Modules/mod-dptools/6587132/#0-about
ep.execute('start_dtmf').catch((err) => {
logger.error('Error starting inband DTMF', { error: err });
});
ep.inbandDtmfEnabled = true;
}
// Handle Media Timeout
if (mediaTimeoutHandler) {
ep.once('destroy', (evt) => {
mediaTimeoutHandler(evt, ep);
});
}
// Handle graceful shutdown for endpoint if required
if (JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS > 0) {
const getEpGracefulShutdownPromise = () => {
if (!ep.gracefulShutdownPromise) {
ep.gracefulShutdownPromise = new Promise((resolve) => {
// this resolver will be called when stt task received transcription.
ep.gracefulShutdownResolver = () => {
resolve();
ep.gracefulShutdownPromise = null;
};
});
}
return ep.gracefulShutdownPromise;
};
const gracefulShutdownHandler = async() => {
// resolve when one of the following happens:
// 1. stt task received transcription
// 2. JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS passed
await Promise.race([
getEpGracefulShutdownPromise(),
sleepFor(JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS)
]);
};
const origStartTranscription = ep.startTranscription.bind(ep);
ep.startTranscription = async(...args) => {
try {
const result = await origStartTranscription(...args);
ep.isTranscribeActive = true;
return result;
} catch (err) {
ep.isTranscribeActive = false;
throw err;
}
};
const origStopTranscription = ep.stopTranscription.bind(ep);
ep.stopTranscription = async(opts = {}, ...args) => {
const { gracefulShutdown = true, ...others } = opts;
if (ep.isTranscribeActive && gracefulShutdown) {
// only wait for graceful shutdown if transcription is active
await gracefulShutdownHandler();
}
try {
const result = await origStopTranscription({...others}, ...args);
ep.isTranscribeActive = false;
return result;
} catch (err) {
ep.isTranscribeActive = false;
throw err;
}
};
const origDestroy = ep.destroy.bind(ep);
ep.destroy = async() => {
if (ep.isTranscribeActive) {
await gracefulShutdownHandler();
}
return await origDestroy();
};
}
return ep;
};
module.exports = {
createMediaEndpoint,
};

View File

@@ -16,17 +16,11 @@ const crypto = require('crypto');
const HttpRequestor = require('./http-requestor');
const WsRequestor = require('./ws-requestor');
const {makeOpusFirst, removeVideoSdp} = require('./sdp-utils');
const {
JAMBONES_USE_FREESWITCH_TIMER_FD,
JAMBONES_MEDIA_TIMEOUT_MS,
JAMBONES_MEDIA_HOLD_TIMEOUT_MS,
JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS
} = require('../config');
const { sleepFor } = require('./helpers');
const { createMediaEndpoint } = require('./media-endpoint');
class SingleDialer extends Emitter {
constructor({logger, sbcAddress, target, opts, application, callInfo, accountInfo, rootSpan, startSpan, dialTask,
onHoldMusic}) {
onHoldMusic, tmpFiles}) {
super();
assert(target.type);
@@ -50,6 +44,7 @@ class SingleDialer extends Emitter {
this.callSid = crypto.randomUUID();
this.dialTask = dialTask;
this.onHoldMusic = onHoldMusic;
this.tmpFiles = tmpFiles;
this.on('callStatusChange', this._notifyCallStatusChange.bind(this));
}
@@ -98,6 +93,7 @@ class SingleDialer extends Emitter {
};
}
this.ms = ms;
this.srf = srf;
let uri, to, inviteSpan;
try {
switch (this.target.type) {
@@ -139,8 +135,7 @@ class SingleDialer extends Emitter {
this.updateCallStatus = srf.locals.dbHelpers.updateCallStatus;
this.serviceUrl = srf.locals.serviceUrl;
this.ep = await ms.createEndpoint();
this._configMsEndpoint();
this.ep = await this._createMediaEndpoint();
this.logger.debug(`SingleDialer:exec - created endpoint ${this.ep.uuid}`);
/**
@@ -334,7 +329,13 @@ class SingleDialer extends Emitter {
*/
async kill(Reason) {
this.killed = true;
if (this.inviteInProgress) await this.inviteInProgress.cancel();
if (this.inviteInProgress) {
try {
await this.inviteInProgress.cancel();
} catch (err) {
this.logger.error({err}, 'SingleDialer:kill error cancelling invite');
}
}
else if (this.dlg && this.dlg.connected) {
const duration = moment().diff(this.dlg.connectTime, 'seconds');
this.logger.debug('SingleDialer:kill hanging up called party');
@@ -352,43 +353,19 @@ class SingleDialer extends Emitter {
}
}
_configMsEndpoint() {
const opts = {
...(this.onHoldMusic && {holdMusic: `shout://${this.onHoldMusic.replace(/^https?:\/\//, '')}`}),
...(JAMBONES_USE_FREESWITCH_TIMER_FD && {timer_name: 'timerfd'}),
...(JAMBONES_MEDIA_TIMEOUT_MS && {media_timeout: JAMBONES_MEDIA_TIMEOUT_MS}),
...(JAMBONES_MEDIA_HOLD_TIMEOUT_MS && {media_hold_timeout: JAMBONES_MEDIA_HOLD_TIMEOUT_MS})
};
if (Object.keys(opts).length > 0) {
this.ep.set(opts);
}
if (this.dialTask?.inbandDtmfEnabled && !this.ep.inbandDtmfEnabled) {
// https://developer.signalwire.com/freeswitch/FreeSWITCH-Explained/Modules/mod-dptools/6587132/#0-about
try {
this.ep.execute('start_dtmf');
this.ep.inbandDtmfEnabled = true;
} catch (err) {
this.logger.info(err, 'place-outdial:_configMsEndpoint - error enable inband DTMF');
}
}
async _handleMediaTimeout(evt, ep) {
this.logger.info({evt}, 'SingleDialer:_handleMediaTimeout - media timeout event received');
this.dialTask.kill(this.dialTask.cs, 'media-timeout');
}
const origDestroy = this.ep.destroy.bind(this.ep);
this.ep.destroy = async() => {
try {
if (this.dialTask.transcribeTask && JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS) {
// transcribe task is being used, wait for some time before destroy
// if final transcription is received but endpoint is already closed,
// freeswitch module will not be able to send the transcription
this.logger.info('SingleDialer:_configMsEndpoint -' +
' Dial with transcribe task, wait for some time before destroy');
await sleepFor(JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS);
}
await origDestroy();
} catch (err) {
this.logger.error(err, 'SingleDialer:_configMsEndpoint - error destroying endpoint');
}
};
async _createMediaEndpoint(drachtioFsmrfOptions = {}) {
return await createMediaEndpoint(this.srf, this.logger, {
acactiveMs: this.ms,
drachtioFsmrfOptions,
onHoldMusic: this.onHoldMusic,
inbandDtmfEnabled: this.dialTask?.inbandDtmfEnabled,
mediaTimeoutHandler: this._handleMediaTimeout.bind(this),
});
}
/**
@@ -431,7 +408,8 @@ class SingleDialer extends Emitter {
accountInfo: this.accountInfo,
tasks,
rootSpan: this.rootSpan,
req: this.req
req: this.req,
tmpFiles: this.tmpFiles,
});
await cs.exec();
@@ -528,8 +506,7 @@ class SingleDialer extends Emitter {
assert(this.dlg && this.dlg.connected && !this.ep);
this.logger.debug('SingleDialer:reAnchorMedia: re-anchoring media after partial media');
this.ep = await this.ms.createEndpoint({remoteSdp: this.dlg.remote.sdp});
this._configMsEndpoint();
this.ep = await this._createMediaEndpoint({remoteSdp: this.dlg.remote.sdp});
await this.dlg.modify(this.ep.local.sdp, {
headers: {
'X-Reason': 'anchor-media'
@@ -566,12 +543,12 @@ class SingleDialer extends Emitter {
function placeOutdial({
logger, srf, ms, sbcAddress, target, opts, application, callInfo, accountInfo, rootSpan, startSpan, dialTask,
onHoldMusic
onHoldMusic, tmpFiles
}) {
const myOpts = deepcopy(opts);
const sd = new SingleDialer({
logger, sbcAddress, target, opts: myOpts, application, callInfo,
accountInfo, rootSpan, startSpan, dialTask, onHoldMusic
accountInfo, rootSpan, startSpan, dialTask, onHoldMusic, tmpFiles
});
sd.exec(srf, ms, myOpts);
return sd;

View File

@@ -0,0 +1,91 @@
// lib/utils/process-monitor.js
const fs = require('fs');
const path = require('path');
class ProcessMonitor {
constructor(logger) {
this.logger = logger;
this.packageInfo = this.getPackageInfo();
this.processName = this.packageInfo.name || 'unknown-app';
}
getPackageInfo() {
try {
const packagePath = path.join(process.cwd(), 'package.json');
return JSON.parse(fs.readFileSync(packagePath, 'utf8'));
} catch (e) {
return { name: 'unknown', version: 'unknown' };
}
}
logStartup(additionalInfo = {}) {
const startupInfo = {
msg: `${this.processName} started`,
app_name: this.processName,
app_version: this.packageInfo.version,
pid: process.pid,
ppid: process.ppid,
pm2_instance_id: process.env.NODE_APP_INSTANCE || 'not_pm2',
pm2_id: process.env.pm_id,
is_pm2: !!process.env.PM2,
node_version: process.version,
uptime: process.uptime(),
timestamp: new Date().toISOString(),
...additionalInfo
};
this.logger.info(startupInfo);
return startupInfo;
}
setupSignalHandlers() {
// Log when we receive signals that would cause restart
process.on('SIGINT', () => {
this.logger.info({
msg: 'SIGINT received',
app_name: this.processName,
pid: process.pid,
ppid: process.ppid,
uptime: process.uptime(),
timestamp: new Date().toISOString()
});
process.exit(0);
});
process.on('SIGTERM', () => {
this.logger.info({
msg: 'SIGTERM received',
app_name: this.processName,
pid: process.pid,
ppid: process.ppid,
uptime: process.uptime(),
timestamp: new Date().toISOString()
});
process.exit(0);
});
process.on('uncaughtException', (error) => {
this.logger.error({
msg: 'Uncaught exception - process will restart',
app_name: this.processName,
error: error.message,
stack: error.stack,
pid: process.pid,
timestamp: new Date().toISOString()
});
process.exit(1);
});
process.on('unhandledRejection', (reason, promise) => {
this.logger.error({
msg: 'Unhandled rejection',
app_name: this.processName,
reason,
pid: process.pid,
timestamp: new Date().toISOString()
});
});
}
}
module.exports = ProcessMonitor;

View File

@@ -100,6 +100,30 @@ module.exports = (logger) => {
else if (K8S) {
lifecycleEmitter.scaleIn = () => process.exit(0);
}
else {
process.on('SIGUSR1', () => {
logger.info('received SIGUSR1: begin drying up calls for scale-in');
dryUpCalls = true;
const {srf} = require('../..');
const {writeSystemAlerts} = srf.locals;
if (writeSystemAlerts) {
const {SystemState, FEATURE_SERVER} = require('./constants');
writeSystemAlerts({
system_component: FEATURE_SERVER,
state : SystemState.GracefulShutdownInProgress,
fields : {
detail: `feature-server with process_id ${process.pid} shutdown in progress`,
host: srf.locals?.ipv4
}
});
}
pingProxies(srf);
// Note: in response to SIGUSR1 we start drying up but do not exit when calls reach zero.
// This is to allow external scripts that sent the signal to manage the lifecycle.
});
}
async function pingProxies(srf) {

View File

@@ -55,11 +55,28 @@ const extractSdpMedia = (sdp) => {
}
};
const getLeadingCodec = (sdp) => {
if (!sdp) {
return null;
}
const parsed = sdpTransform.parse(sdp);
const audio = parsed.media?.find((m) => m.type === 'audio');
if (!audio) {
return null;
}
return audio.rtp?.[0]?.codec || null;
};
module.exports = {
isOnhold,
mergeSdpMedia,
extractSdpMedia,
isOpusFirst,
makeOpusFirst,
removeVideoSdp
removeVideoSdp,
getLeadingCodec
};

View File

@@ -0,0 +1,196 @@
const { assert } = require('console');
const Emitter = require('events');
const {
VadDetection,
SileroVadDetection
} = require('../utils/constants.json');
class SttLatencyCalculator extends Emitter {
constructor({ logger, cs}) {
super();
this.logger = logger;
this.cs = cs;
this.isRunning = false;
this.isInTalkSpurt = false;
this.start_talking_time = 0;
this.talkspurts = [];
this.vendor = this.cs.vad?.vendor || 'silero';
this.stt_start_time = 0;
this.stt_stop_time = 0;
this.stt_on_transcription_time = 0;
}
set sttStartTime(time) {
this.stt_start_time = time;
}
get sttStartTime() {
return this.stt_start_time || 0;
}
set sttStopTime(time) {
this.stt_stop_time = time;
}
get sttStopTime() {
return this.stt_stop_time || 0;
}
set sttOnTranscriptionTime(time) {
this.stt_on_transcription_time = time;
}
get sttOnTranscriptionTime() {
return this.stt_on_transcription_time || 0;
}
_onVadDetected(_ep, _evt, fsEvent) {
if (fsEvent.getHeader('detected-event') === 'stop_talking') {
if (this.isInTalkSpurt) {
this.talkspurts.push({
start: this.start_talking_time,
stop: Date.now()
});
}
this.start_talking_time = 0;
this.isInTalkSpurt = false;
} else if (fsEvent.getHeader('detected-event') === 'start_talking') {
this.start_talking_time = Date.now();
this.isInTalkSpurt = true;
}
}
_startVad() {
assert(!this.isRunning, 'Latency calculator is already running');
assert(this.cs.ep, 'Callsession has no endpoint to start the latency calculator');
const ep = this.cs.ep;
if (!ep.sttLatencyVadHandler) {
ep.sttLatencyVadHandler = this._onVadDetected.bind(this, ep);
if (this.vendor === 'silero') {
ep.addCustomEventListener(SileroVadDetection.Detection, ep.sttLatencyVadHandler);
} else {
ep.addCustomEventListener(VadDetection.Detection, ep.sttLatencyVadHandler);
}
}
this.stop_talking_time = 0;
this.start_talking_time = 0;
this.vad = {
...(this.cs.vad || {}),
strategy: 'continuous',
bugname: 'stt-latency-calculator-vad',
vendor: this.vendor
};
ep.startVadDetection(this.vad);
this.isRunning = true;
}
_stopVad() {
if (this.isRunning) {
this.logger.warn('Latency calculator is still running, stopping VAD detection');
const ep = this.cs.ep;
ep.stopVadDetection(this.vad);
if (ep.sttLatencyVadHandler) {
if (this.vendor === 'silero') {
this.ep?.removeCustomEventListener(SileroVadDetection.Detection, ep.sttLatencyVadHandler);
} else {
this.ep?.removeCustomEventListener(VadDetection.Detection, ep.sttLatencyVadHandler);
}
ep.sttLatencyVadHandler = null;
}
this.isRunning = false;
this.logger.info('STT Latency Calculator stopped');
}
}
start() {
if (this.isRunning) {
this.logger.warn('Latency calculator is already running');
return;
}
if (!this.cs.ep) {
this.logger.error('Callsession has no endpoint to start the latency calculator');
return;
}
this._startVad();
this.logger.debug('STT Latency Calculator started');
}
stop() {
this._stopVad();
}
toUnixTimestamp(date) {
return Math.floor(date / 1000);
}
calculateLatency() {
if (!this.isRunning) {
return null;
}
const stt_stop_time = this.stt_stop_time || Date.now();
if (this.isInTalkSpurt) {
this.talkspurts.push({
start: this.start_talking_time,
stop: stt_stop_time
});
this.isInTalkSpurt = false;
this.start_talking_time = 0;
}
const stt_on_transcription_time = this.stt_on_transcription_time || stt_stop_time;
const start_talking_time = this.talkspurts[0]?.start;
let lastIdx = this.talkspurts.length - 1;
lastIdx = lastIdx < 0 ? 0 : lastIdx;
const stop_talking_time = this.talkspurts[lastIdx]?.stop || stt_stop_time;
return {
stt_start_time: this.toUnixTimestamp(this.stt_start_time),
stt_stop_time: this.toUnixTimestamp(stt_stop_time),
start_talking_time: this.toUnixTimestamp(start_talking_time),
stop_talking_time: this.toUnixTimestamp(stop_talking_time),
stt_latency: parseFloat((Math.abs(stt_on_transcription_time - stop_talking_time)) / 1000).toFixed(2),
stt_latency_ms: Math.abs(stt_on_transcription_time - stop_talking_time),
stt_usage: parseFloat((stt_stop_time - this.stt_start_time) / 1000).toFixed(2),
talkspurts: this.talkspurts.map((ts) =>
([this.toUnixTimestamp(ts.start || 0), this.toUnixTimestamp(ts.stop || 0)]))
};
}
resetTime() {
if (!this.isRunning) {
return;
}
this.stt_start_time = Date.now();
this.stt_stop_time = 0;
this.stt_on_transcription_time = 0;
this.clearTalkspurts();
this.logger.info('STT Latency Calculator reset');
}
onTranscriptionReceived() {
if (!this.isRunning) {
return;
}
this.stt_on_transcription_time = Date.now();
this.logger.debug(`CallSession:on-transcription set to ${this.stt_on_transcription_time}`);
}
onTranscribeStop() {
if (!this.isRunning) {
return;
}
this.stt_stop_time = Date.now();
this.logger.debug(`CallSession:transcribe-stop set to ${this.stt_stop_time}`);
}
clearTalkspurts() {
this.talkspurts = [];
if (!this.isInTalkSpurt) {
this.start_talking_time = 0;
}
}
}
module.exports = SttLatencyCalculator;

View File

@@ -110,6 +110,10 @@ const stickyVars = {
voxist: [
'VOXIST_API_KEY',
],
cartesia: [
'CARTESIA_API_KEY',
'CARTESIA_MODEL_ID'
],
speechmatics: [
'SPEECHMATICS_API_KEY',
'SPEECHMATICS_HOST',
@@ -127,6 +131,43 @@ const stickyVars = {
'OPENAI_TURN_DETECTION_PREFIX_PADDING_MS',
'OPENAI_TURN_DETECTION_SILENCE_DURATION_MS',
],
houndify: [
'HOUNDIFY_CLIENT_ID',
'HOUNDIFY_CLIENT_KEY',
'HOUNDIFY_USER_ID',
'HOUNDIFY_MAX_SILENCE_SECONDS',
'HOUNDIFY_MAX_SILENCE_AFTER_FULL_QUERY_SECONDS',
'HOUNDIFY_MAX_SILENCE_AFTER_PARTIAL_QUERY_SECONDS',
'HOUNDIFY_VAD_SENSITIVITY',
'HOUNDIFY_VAD_TIMEOUT',
'HOUNDIFY_VAD_MODE',
'HOUNDIFY_VAD_VOICE_MS',
'HOUNDIFY_VAD_SILENCE_MS',
'HOUNDIFY_VAD_DEBUG',
'HOUNDIFY_AUDIO_FORMAT',
'HOUNDIFY_ENABLE_NOISE_REDUCTION',
'HOUNDIFY_AUDIO_ENDPOINT',
'HOUNDIFY_ENABLE_PROFANITY_FILTER',
'HOUNDIFY_ENABLE_PUNCTUATION',
'HOUNDIFY_ENABLE_CAPITALIZATION',
'HOUNDIFY_CONFIDENCE_THRESHOLD',
'HOUNDIFY_ENABLE_DISFLUENCY_FILTER',
'HOUNDIFY_MAX_RESULTS',
'HOUNDIFY_ENABLE_WORD_TIMESTAMPS',
'HOUNDIFY_MAX_ALTERNATIVES',
'HOUNDIFY_PARTIAL_TRANSCRIPT_INTERVAL',
'HOUNDIFY_SESSION_TIMEOUT',
'HOUNDIFY_CONNECTION_TIMEOUT',
'HOUNDIFY_LATITUDE',
'HOUNDIFY_LONGITUDE',
'HOUNDIFY_CITY',
'HOUNDIFY_STATE',
'HOUNDIFY_COUNTRY',
'HOUNDIFY_TIMEZONE',
'HOUNDIFY_DOMAIN',
'HOUNDIFY_CUSTOM_VOCABULARY',
'HOUNDIFY_LANGUAGE_MODEL'
],
};
/**
@@ -335,6 +376,61 @@ const normalizeDeepgram = (evt, channel, language, shortUtterance) => {
};
};
const normalizeGladia = (evt, channel, language, shortUtterance) => {
const copy = JSON.parse(JSON.stringify(evt));
// Handle Gladia transcript format
if (evt.type === 'transcript' && evt.data && evt.data.utterance) {
const utterance = evt.data.utterance;
const alternatives = [{
confidence: utterance.confidence || 0,
transcript: utterance.text || '',
}];
return {
language_code: utterance.language || language,
channel_tag: channel,
is_final: evt.data.is_final || false,
alternatives,
vendor: {
name: 'gladia',
evt: copy
}
};
}
};
const normalizeDeepgramFlux = (evt, channel, language) => {
const copy = JSON.parse(JSON.stringify(evt));
let turnTakingEvent;
if (['StartOfTurn', 'EagerEndOfTurn', 'TurnResumed', 'EndOfTurn'].includes(evt.event)) {
turnTakingEvent = evt.event;
}
/* calculate total confidence based on word-level confidence */
const realWords = (evt.words || [])
.filter((w) => ![',.!?;'].includes(w.word));
const confidence = realWords.length > 0 ? realWords.reduce((acc, w) => acc + w.confidence, 0) / realWords.length : 0;
return {
language_code: language,
channel_tag: channel,
is_final: evt.event === 'EndOfTurn',
alternatives: [
{
confidence,
end_of_turn_confidence: evt.end_of_turn_confidence,
transcript: evt.transcript,
...(turnTakingEvent && {turn_taking_event: turnTakingEvent})
}
],
vendor: {
name: 'deepgramflux',
evt: copy
}
};
};
const normalizeNvidia = (evt, channel, language) => {
const copy = JSON.parse(JSON.stringify(evt));
const alternatives = (evt.alternatives || [])
@@ -519,16 +615,27 @@ const normalizeAws = (evt, channel, language) => {
const normalizeAssemblyAi = (evt, channel, language) => {
const copy = JSON.parse(JSON.stringify(evt));
const alternatives = [];
let is_final = false;
if (evt.type && evt.type === 'Turn') {
// v3 is here
alternatives.push({
confidence: evt.end_of_turn_confidence,
transcript: evt.transcript,
});
is_final = evt.end_of_turn;
} else {
alternatives.push({
confidence: evt.confidence,
transcript: evt.text,
});
is_final = evt.message_type === 'FinalTranscript';
}
return {
language_code: language,
channel_tag: channel,
is_final: evt.message_type === 'FinalTranscript',
alternatives: [
{
confidence: evt.confidence,
transcript: evt.text,
}
],
is_final,
alternatives,
vendor: {
name: 'assemblyai',
evt: copy
@@ -536,6 +643,30 @@ const normalizeAssemblyAi = (evt, channel, language) => {
};
};
const normalizeHoundify = (evt, channel, language) => {
const copy = JSON.parse(JSON.stringify(evt));
const alternatives = [];
const is_final = evt.ResultsAreFinal && evt.ResultsAreFinal[0] === true;
if (evt.Disambiguation && evt.Disambiguation.ChoiceData && evt.Disambiguation.ChoiceData.length > 0) {
// Handle Houndify Voice Search Result format
const choiceData = evt.Disambiguation.ChoiceData[0];
alternatives.push({
confidence: choiceData.ConfidenceScore || choiceData.ASRConfidence || 0.0,
transcript: choiceData.FormattedTranscription || choiceData.Transcription || '',
});
}
return {
language_code: language,
channel_tag: channel,
is_final,
alternatives,
vendor: {
name: 'houndify',
evt: copy
}
};
};
const normalizeVoxist = (evt, channel, language) => {
const copy = JSON.parse(JSON.stringify(evt));
return {
@@ -555,6 +686,25 @@ const normalizeVoxist = (evt, channel, language) => {
};
};
const normalizeCartesia = (evt, channel, language) => {
const copy = JSON.parse(JSON.stringify(evt));
return {
language_code: language,
channel_tag: channel,
is_final: evt.is_final,
alternatives: [
{
confidence: 1.00,
transcript: evt.text,
}
],
vendor: {
name: 'cartesia',
evt: copy
}
};
};
const normalizeSpeechmatics = (evt, channel, language) => {
const copy = JSON.parse(JSON.stringify(evt));
const is_final = evt.message === 'AddTranscript';
@@ -616,6 +766,10 @@ module.exports = (logger) => {
switch (vendor) {
case 'deepgram':
return normalizeDeepgram(evt, channel, language, shortUtterance);
case 'gladia':
return normalizeGladia(evt, channel, language, shortUtterance);
case 'deepgramflux':
return normalizeDeepgramFlux(evt, channel, language, shortUtterance);
case 'microsoft':
return normalizeMicrosoft(evt, channel, language, punctuation);
case 'google':
@@ -634,8 +788,12 @@ module.exports = (logger) => {
return normalizeCobalt(evt, channel, language);
case 'assemblyai':
return normalizeAssemblyAi(evt, channel, language, shortUtterance);
case 'houndify':
return normalizeHoundify(evt, channel, language, shortUtterance);
case 'voxist':
return normalizeVoxist(evt, channel, language);
case 'cartesia':
return normalizeCartesia(evt, channel, language);
case 'verbio':
return normalizeVerbio(evt, channel, language);
case 'speechmatics':
@@ -728,12 +886,15 @@ module.exports = (logger) => {
AWS_ACCESS_KEY_ID: sttCredentials.accessKeyId,
AWS_SECRET_ACCESS_KEY: sttCredentials.secretAccessKey,
AWS_REGION: sttCredentials.region,
AWS_SECURITY_TOKEN: sttCredentials.securityToken
AWS_SECURITY_TOKEN: sttCredentials.securityToken,
AWS_SESSION_TOKEN: sttCredentials.sessionToken ? sttCredentials.sessionToken : sttCredentials.securityToken
}),
...(awsOptions.accessKey && {AWS_ACCESS_KEY_ID: awsOptions.accessKey}),
...(awsOptions.secretKey && {AWS_SECRET_ACCESS_KEY: awsOptions.secretKey}),
...(awsOptions.region && {AWS_REGION: awsOptions.region}),
...(awsOptions.securityToken && {AWS_SECURITY_TOKEN: awsOptions.securityToken}),
...(awsOptions.sessionToken && {AWS_SESSION_TOKEN: awsOptions.sessionToken ?
awsOptions.sessionToken : awsOptions.securityToken}),
...(awsOptions.languageModelName && {AWS_LANGUAGE_MODEL_NAME: awsOptions.languageModelName}),
...(awsOptions.piiEntityTypes?.length && {AWS_PII_ENTITY_TYPES: awsOptions.piiEntityTypes.join(',')}),
...(awsOptions.piiIdentifyEntities && {AWS_PII_IDENTIFY_ENTITIES: true}),
@@ -759,7 +920,7 @@ module.exports = (logger) => {
...(rOpts.initialSpeechTimeoutMs > 0 &&
{AZURE_INITIAL_SPEECH_TIMEOUT_MS: rOpts.initialSpeechTimeoutMs}),
...(rOpts.requestSnr && {AZURE_REQUEST_SNR: 1}),
...(rOpts.audioLogging && {AZURE_AUDIO_LOGGING: 1}),
...(azureOptions.audioLogging && {AZURE_AUDIO_LOGGING: 1}),
...{AZURE_USE_OUTPUT_FORMAT_DETAILED: 1},
...(azureOptions.speechSegmentationSilenceTimeoutMs &&
{AZURE_SPEECH_SEGMENTATION_SILENCE_TIMEOUT_MS: azureOptions.speechSegmentationSilenceTimeoutMs}),
@@ -834,6 +995,14 @@ module.exports = (logger) => {
const deepgramUri = deepgramOptions.deepgramSttUri || sttCredentials.deepgram_stt_uri;
const useTls = deepgramOptions.deepgramSttUseTls || sttCredentials.deepgram_stt_use_tls;
// DH (2025-08-11) entity_prompt is currently limited to 100 words
const entityPrompt = deepgramOptions.entityPrompt ?
deepgramOptions.entityPrompt
.split(/\s+/)
.slice(0, 100)
.join(' ')
: undefined;
/* default to a sensible model if not supplied */
if (!model) {
model = selectDefaultDeepgramModel(task, language);
@@ -892,7 +1061,28 @@ module.exports = (logger) => {
...(deepgramOptions.fillerWords) &&
{DEEPGRAM_SPEECH_FILLER_WORDS: deepgramOptions.fillerWords},
...((Array.isArray(deepgramOptions.keyterms) && deepgramOptions.keyterms.length > 0) &&
{DEEPGRAM_SPEECH_KEYTERMS: deepgramOptions.keyterms.join(',')})
{DEEPGRAM_SPEECH_KEYTERMS: deepgramOptions.keyterms.join(',')}),
...(deepgramOptions.mipOptOut && {DEEPGRAM_SPEECH_MIP_OPT_OUT: deepgramOptions.mipOptOut}),
...(entityPrompt && {DEEPGRAM_SPEECH_ENTITY_PROMPT: entityPrompt}),
};
}
else if ('deepgramflux' === vendor) {
const {
eotThreshold,
eotTimeoutMs,
mipOptOut,
model,
eagerEotThreshold,
keyterms
} = rOpts.deepgramOptions || {};
opts = {
DEEPGRAMFLUX_API_KEY: sttCredentials.api_key,
DEEPGRAMFLUX_SPEECH_MODEL: model || 'flux-general-en',
...(eotThreshold && {DEEPGRAMFLUX_SPEECH_EOT_THRESHOLD: eotThreshold}),
...(eotTimeoutMs && {DEEPGRAMFLUX_SPEECH_EOT_TIMEOUT_MS: eotTimeoutMs}),
...(mipOptOut && {DEEPGRAMFLUX_SPEECH_MIP_OPT_OUT: mipOptOut}),
...(eagerEotThreshold && {DEEPGRAMFLUX_SPEECH_EAGER_EOT_THRESHOLD: eagerEotThreshold}),
...(keyterms && keyterms.length > 0 && {DEEPGRAMFLUX_SPEECH_KEYTERMS: keyterms.join(',')}),
};
}
else if ('soniox' === vendor) {
@@ -993,14 +1183,89 @@ module.exports = (logger) => {
};
}
else if ('assemblyai' === vendor) {
const serviceVersion = rOpts.assemblyAiOptions?.serviceVersion || sttCredentials.service_version || 'v2';
const {
formatTurns,
endOfTurnConfidenceThreshold,
minEndOfTurnSilenceWhenConfident,
maxTurnSilence
} = rOpts.assemblyAiOptions || {};
opts = {
...opts,
ASSEMBLYAI_API_VERSION: serviceVersion,
...(serviceVersion === 'v3' && {
...(formatTurns && {
ASSEMBLYAI_FORMAT_TURNS: formatTurns
}),
...(endOfTurnConfidenceThreshold && {
ASSEMBLYAI_END_OF_TURN_CONFIDENCE_THRESHOLD: endOfTurnConfidenceThreshold
}),
ASSEMBLYAI_MIN_END_OF_TURN_SILENCE_WHEN_CONFIDENT: minEndOfTurnSilenceWhenConfident || 500,
...(maxTurnSilence && {
ASSEMBLYAI_MAX_TURN_SILENCE: maxTurnSilence
}),
}),
...(sttCredentials.api_key) &&
{ASSEMBLYAI_API_KEY: sttCredentials.api_key},
...(rOpts.hints?.length > 0 &&
{ASSEMBLYAI_WORD_BOOST: JSON.stringify(rOpts.hints)})
};
}
else if ('houndify' === vendor) {
const {
latitude, longitude, city, state, country, timeZone, domain, audioEndpoint,
maxSilenceSeconds, maxSilenceAfterFullQuerySeconds, maxSilenceAfterPartialQuerySeconds,
vadSensitivity, vadTimeout, vadMode, vadVoiceMs, vadSilenceMs, vadDebug,
audioFormat, enableNoiseReduction, enableProfanityFilter, enablePunctuation,
enableCapitalization, confidenceThreshold, enableDisfluencyFilter,
maxResults, enableWordTimestamps, maxAlternatives, partialTranscriptInterval,
sessionTimeout, connectionTimeout, customVocabulary, languageModel,
requestInfo, sampleRate
} = rOpts.houndifyOptions || {};
const audioEndpointUri = audioEndpoint || sttCredentials.houndify_server_uri;
opts = {
...opts,
HOUNDIFY_CLIENT_ID: sttCredentials.client_id,
HOUNDIFY_CLIENT_KEY: sttCredentials.client_key,
HOUNDIFY_USER_ID: sttCredentials.user_id,
HOUNDIFY_MAX_SILENCE_SECONDS: maxSilenceSeconds || 5,
HOUNDIFY_MAX_SILENCE_AFTER_FULL_QUERY_SECONDS: maxSilenceAfterFullQuerySeconds || 1,
HOUNDIFY_MAX_SILENCE_AFTER_PARTIAL_QUERY_SECONDS: maxSilenceAfterPartialQuerySeconds || 1.5,
...(vadSensitivity && {HOUNDIFY_VAD_SENSITIVITY: vadSensitivity}),
...(vadTimeout && {HOUNDIFY_VAD_TIMEOUT: vadTimeout}),
...(vadMode && {HOUNDIFY_VAD_MODE: vadMode}),
...(vadVoiceMs && {HOUNDIFY_VAD_VOICE_MS: vadVoiceMs}),
...(vadSilenceMs && {HOUNDIFY_VAD_SILENCE_MS: vadSilenceMs}),
...(vadDebug && {HOUNDIFY_VAD_DEBUG: vadDebug}),
...(audioFormat && {HOUNDIFY_AUDIO_FORMAT: audioFormat}),
...(enableNoiseReduction && {HOUNDIFY_ENABLE_NOISE_REDUCTION: enableNoiseReduction}),
...(enableProfanityFilter && {HOUNDIFY_ENABLE_PROFANITY_FILTER: enableProfanityFilter}),
...(enablePunctuation && {HOUNDIFY_ENABLE_PUNCTUATION: enablePunctuation}),
...(enableCapitalization && {HOUNDIFY_ENABLE_CAPITALIZATION: enableCapitalization}),
...(confidenceThreshold && {HOUNDIFY_CONFIDENCE_THRESHOLD: confidenceThreshold}),
...(enableDisfluencyFilter && {HOUNDIFY_ENABLE_DISFLUENCY_FILTER: enableDisfluencyFilter}),
...(maxResults && {HOUNDIFY_MAX_RESULTS: maxResults}),
...(enableWordTimestamps && {HOUNDIFY_ENABLE_WORD_TIMESTAMPS: enableWordTimestamps}),
...(maxAlternatives && {HOUNDIFY_MAX_ALTERNATIVES: maxAlternatives}),
...(partialTranscriptInterval && {HOUNDIFY_PARTIAL_TRANSCRIPT_INTERVAL: partialTranscriptInterval}),
...(sessionTimeout && {HOUNDIFY_SESSION_TIMEOUT: sessionTimeout}),
...(connectionTimeout && {HOUNDIFY_CONNECTION_TIMEOUT: connectionTimeout}),
...(latitude && {HOUNDIFY_LATITUDE: latitude}),
...(longitude && {HOUNDIFY_LONGITUDE: longitude}),
...(city && {HOUNDIFY_CITY: city}),
...(state && {HOUNDIFY_STATE: state}),
...(country && {HOUNDIFY_COUNTRY: country}),
...(timeZone && {HOUNDIFY_TIMEZONE: timeZone}),
...(domain && {HOUNDIFY_DOMAIN: domain}),
...(audioEndpointUri && {HOUNDIFY_AUDIO_ENDPOINT: audioEndpointUri}),
...(customVocabulary && {HOUNDIFY_CUSTOM_VOCABULARY:
Array.isArray(customVocabulary) ? customVocabulary.join(',') : customVocabulary}),
...(languageModel && {HOUNDIFY_LANGUAGE_MODEL: languageModel}),
...(requestInfo && {HOUNDIFY_REQUEST_INFO: JSON.stringify(requestInfo)}),
...(sampleRate && {HOUNDIFY_SAMPLING_RATE: sampleRate}),
};
}
else if ('voxist' === vendor) {
opts = {
...opts,
@@ -1008,6 +1273,16 @@ module.exports = (logger) => {
{VOXIST_API_KEY: sttCredentials.api_key},
};
}
else if ('cartesia' === vendor) {
opts = {
...opts,
...(sttCredentials.api_key &&
{CARTESIA_API_KEY: sttCredentials.api_key}),
...(sttCredentials.stt_model_id && {
CARTESIA_MODEL_ID: sttCredentials.stt_model_id
})
};
}
else if ('openai' === vendor) {
const {openaiOptions = {}} = rOpts;
const model = openaiOptions.model || rOpts.model || sttCredentials.model_id || 'whisper-1';
@@ -1100,7 +1375,9 @@ module.exports = (logger) => {
speechmaticsOptions.transcription_config.audio_filtering_config.volume_threshold}),
...(speechmaticsOptions.transcription_config?.transcript_filtering_config?.remove_disfluencies &&
{SPEECHMATICS_REMOVE_DISFLUENCIES:
speechmaticsOptions.transcription_config.transcript_filtering_config.remove_disfluencies})
speechmaticsOptions.transcription_config.transcript_filtering_config.remove_disfluencies}),
SPEECHMATICS_END_OF_UTTERANCE_SILENCE_TRIGGER:
speechmaticsOptions.transcription_config?.conversation_config?.end_of_utterance_silence_trigger || 0.5
};
}
else if (vendor.startsWith('custom:')) {

View File

@@ -80,7 +80,7 @@ class TtsStreamingBuffer extends Emitter {
clearTimeout(this.timer);
this.removeCustomEventListeners();
if (this.ep) {
this._api(this.ep, [this.ep.uuid, 'close'])
this._api(this.ep, [this.ep.uuid, 'stop'])
.catch((err) =>
this.logger.info({ err }, 'TtsStreamingBuffer:stop Error closing TTS streaming')
);
@@ -163,7 +163,6 @@ class TtsStreamingBuffer extends Emitter {
}
clear() {
this.logger.debug('TtsStreamingBuffer:clear');
if (this._connectionStatus !== TtsStreamingConnectionStatus.Connected) return;
clearTimeout(this.timer);
this._api(this.ep, [this.ep.uuid, 'clear']).catch((err) =>
@@ -193,10 +192,7 @@ class TtsStreamingBuffer extends Emitter {
this.logger.debug('TtsStreamingBuffer:_feedQueue TTS stream is not open or no endpoint available');
return;
}
if (
this._connectionStatus === TtsStreamingConnectionStatus.NotConnected ||
this._connectionStatus === TtsStreamingConnectionStatus.Failed
) {
if (this._connectionStatus !== TtsStreamingConnectionStatus.Connected) {
this.logger.debug('TtsStreamingBuffer:_feedQueue TTS stream is not connected');
return;
}
@@ -278,6 +274,14 @@ class TtsStreamingBuffer extends Emitter {
}
const chunk = combinedText.slice(0, chunkEnd);
// Check if the chunk is only whitespace before processing the queue
// If so, wait for more meaningful text
if (isWhitespace(chunk)) {
this.logger.debug('TtsStreamingBuffer:_feedQueue chunk is only whitespace, waiting for more text');
this._setTimerIfNeeded();
return;
}
// Now we iterate over the queue items
// and deduct their lengths until we've accounted for chunkEnd characters.
let remaining = chunkEnd;
@@ -301,6 +305,14 @@ class TtsStreamingBuffer extends Emitter {
this.bufferedLength -= chunkEnd;
const modifiedChunk = chunk.replace(/\n\n/g, '\n \n');
if (isWhitespace(modifiedChunk)) {
this.logger.debug('TtsStreamingBuffer:_feedQueue modified chunk is only whitespace, restoring queue');
this.queue.unshift({ type: 'text', value: chunk });
this.bufferedLength += chunkEnd;
this._setTimerIfNeeded();
return;
}
this.logger.debug(`TtsStreamingBuffer:_feedQueue sending chunk to tts: ${modifiedChunk}`);
try {
@@ -349,6 +361,7 @@ class TtsStreamingBuffer extends Emitter {
if (this.queue.length > 0) {
await this._feedQueue();
}
this.emit(TtsStreamingEvents.Connected, { vendor });
}
_onConnectFailure(vendor) {
@@ -399,6 +412,7 @@ class TtsStreamingBuffer extends Emitter {
removeCustomEventListeners() {
this.eventHandlers.forEach((h) => h.ep.removeCustomEventListener(h.event, h.handler));
this.eventHandlers.length = 0;
}
_initHandlers(ep) {
@@ -422,7 +436,15 @@ class TtsStreamingBuffer extends Emitter {
const findSentenceBoundary = (text, limit) => {
// Look for punctuation or double newline that signals sentence end.
const sentenceEndRegex = /[.!?](?=\s|$)|\n\n/g;
// Includes:
// - ASCII: . ! ?
// - Arabic: ؟ (question mark), ۔ (full stop)
// - Japanese: 。 (full stop), , (full-width exclamation/question)
//
// For languages that use spaces between sentences, we still require
// whitespace or end-of-string after the mark. For Japanese (no spaces),
// we treat the punctuation itself as a boundary regardless of following char.
const sentenceEndRegex = /[.!?؟۔](?=\s|$)|[。!?]|\n\n/g;
let lastSentenceBoundary = -1;
let match;
while ((match = sentenceEndRegex.exec(text)) && match.index < limit) {

View File

@@ -132,8 +132,8 @@ class WsRequestor extends BaseRequestor {
while (retryCount <= this.maxReconnects) {
try {
this.logger.error({retryCount, maxReconnects: this.maxReconnects},
'WsRequestor:request - attempting connection');
this.logger.debug({retryCount, maxReconnects: this.maxReconnects},
'WsRequestor:request - attempting connection retry');
// Ensure clean state before each connection attempt
if (this.ws) {
@@ -141,38 +141,29 @@ class WsRequestor extends BaseRequestor {
this.ws = null;
}
this.logger.error({retryCount}, 'WsRequestor:request - calling _connect()');
const startAt = process.hrtime();
await this._connect();
const rtt = this._roundTrip(startAt);
this.stats.histogram('app.hook.connect_time', rtt, ['hook_type:app']);
this.logger.error({retryCount}, 'WsRequestor:request - connection successful, exiting retry loop');
lastError = null;
break;
} catch (error) {
lastError = error;
retryCount++;
this.logger.error({error: error.message, retryCount, maxReconnects: this.maxReconnects},
'WsRequestor:request - connection attempt failed');
if (retryCount <= this.maxReconnects &&
this.retryPolicyValues?.length &&
this._shouldRetry(error, this.retryPolicyValues)) {
this.logger.error(
{url, error, retryCount, maxRetries: this.maxReconnects},
`WsRequestor:request - connection failed, retrying (${retryCount}/${this.maxReconnects})`
);
const delay = this.backoffMs;
this.backoffMs = this.backoffMs < 2000 ? this.backoffMs * 2 : (this.backoffMs + 2000);
this.logger.error({delay}, 'WsRequestor:request - waiting before retry');
this.logger.debug({delay}, 'WsRequestor:request - waiting before retry');
await new Promise((resolve) => setTimeout(resolve, delay));
this.logger.error('WsRequestor:request - retry delay complete, attempting retry');
continue;
}
this.logger.error({lastError: lastError.message, retryCount, maxReconnects: this.maxReconnects},
'WsRequestor:request - throwing last error');
this.logger.error({error: error.message, retryCount, maxReconnects: this.maxReconnects},
'WsRequestor:request - all connection attempts failed');
throw lastError;
}
}
@@ -302,7 +293,7 @@ class WsRequestor extends BaseRequestor {
/* send the message */
this.ws.send(JSON.stringify(obj), async() => {
this.logger.debug({obj}, `WsRequestor:request websocket: sent (${url})`);
if (obj.type !== 'llm:event') this.logger.debug({obj}, `WsRequestor:request websocket: sent (${url})`);
// If session:reconnect is waiting for ack, hold here until ack to send queuedMsgs
if (this._reconnectPromise) {
try {
@@ -370,7 +361,7 @@ class WsRequestor extends BaseRequestor {
this
.once('ready', (ws) => {
this.logger.error({retryCount: 'unknown'}, 'WsRequestor:_connect - ready event fired, resolving Promise');
this.logger.debug('WsRequestor:_connect - ready event fired, resolving Promise');
this.removeAllListeners('not-ready');
if (this.connections > 1) this.request('session:reconnect', this.url);
resolve();

6386
package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
{
"name": "jambonz-feature-server",
"version": "0.9.4",
"version": "0.9.5",
"main": "app.js",
"engines": {
"node": ">= 18.x"
@@ -27,14 +27,14 @@
"dependencies": {
"@aws-sdk/client-auto-scaling": "^3.549.0",
"@aws-sdk/client-sns": "^3.549.0",
"@jambonz/db-helpers": "^0.9.12",
"@jambonz/db-helpers": "^0.9.18",
"@jambonz/http-health-check": "^0.0.1",
"@jambonz/mw-registrar": "^0.2.7",
"@jambonz/realtimedb-helpers": "^0.8.13",
"@jambonz/speech-utils": "^0.2.11",
"@jambonz/realtimedb-helpers": "^0.8.15",
"@jambonz/speech-utils": "^0.2.30",
"@jambonz/stats-collector": "^0.1.10",
"@jambonz/time-series": "^0.2.13",
"@jambonz/verb-specifications": "^0.0.104",
"@jambonz/time-series": "^0.2.15",
"@jambonz/verb-specifications": "^0.0.125",
"@modelcontextprotocol/sdk": "^1.9.0",
"@opentelemetry/api": "^1.8.0",
"@opentelemetry/exporter-jaeger": "^1.23.0",
@@ -48,13 +48,13 @@
"bent": "^7.3.12",
"debug": "^4.3.4",
"deepcopy": "^2.1.0",
"drachtio-fsmrf": "^4.0.3",
"drachtio-srf": "^5.0.5",
"drachtio-fsmrf": "^4.1.2",
"drachtio-srf": "^5.0.14",
"express": "^4.19.2",
"express-validator": "^7.0.1",
"moment": "^2.30.1",
"parse-url": "^9.2.0",
"pino": "^8.20.0",
"pino": "^10.1.0",
"polly-ssml-split": "^0.1.0",
"sdp-transform": "^2.15.0",
"short-uuid": "^5.1.0",

View File

@@ -16,6 +16,7 @@ require('./sip-request-tests');
require('./create-call-test');
require('./play-tests');
require('./sip-refer-tests');
require('./sip-refer-handler-tests');
require('./listen-tests');
require('./config-test');
require('./queue-test');

View File

@@ -0,0 +1,117 @@
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE scenario SYSTEM "sipp.dtd">
<scenario name="UAS that accepts call and sends REFER">
<!-- Receive incoming INVITE -->
<recv request="INVITE" crlf="true">
<action>
<ereg regexp=".*" search_in="hdr" header="Subject:" assign_to="1" />
<ereg regexp=".*" search_in="hdr" header="From:" assign_to="2" />
</action>
</recv>
<!-- Send 180 Ringing -->
<send>
<![CDATA[
SIP/2.0 180 Ringing
[last_Via:]
[last_From:]
[last_To:];tag=[pid]SIPpTag01[call_number]
[last_Call-ID:]
[last_CSeq:]
[last_Record-Route:]
Subject:[$1]
Content-Length: 0
]]>
</send>
<!-- Send 200 OK with SDP -->
<send>
<![CDATA[
SIP/2.0 200 OK
[last_Via:]
[last_From:]
[last_To:];tag=[pid]SIPpTag01[call_number]
[last_Call-ID:]
[last_CSeq:]
[last_Record-Route:]
Subject:[$1]
Contact: <sip:[local_ip]:[local_port];transport=[transport]>
Content-Type: application/sdp
Content-Length: [len]
v=0
o=user1 53655765 2353687637 IN IP[local_ip_type] [local_ip]
s=-
c=IN IP[media_ip_type] [media_ip]
t=0 0
m=audio [media_port] RTP/AVP 0
a=rtpmap:0 PCMU/8000
]]>
</send>
<recv request="ACK" rtd="true" crlf="true">
<action>
<!-- Check if this is NOT the first call (tag ends with 012 or higher) -->
<ereg regexp="tag=1SIPpTag01[2-9]" search_in="hdr" header="To:" assign_to="3" />
<log message="Not first call check result: [$3]"/>
</action>
</recv>
<!-- Skip REFER if we found a non-first call tag -->
<nop next="skip_refer" test="3" value="" compare="not_equal">
<action>
<log message="Found non-first call tag [$3], skipping REFER"/>
</action>
</nop>
<!-- Wait a moment, then send REFER (only on first call) -->
<pause milliseconds="1000"/>
<nop>
<action>
<log message="Sending REFER for first call"/>
</action>
</nop>
<!-- Send REFER (only on first iteration) -->
<send retrans="500">
<![CDATA[
REFER sip:service@[remote_ip]:[remote_port] SIP/2.0
Via: SIP/2.0/[transport] [local_ip]:[local_port];branch=[branch]
From: <sip:[local_ip]:[local_port]>;tag=[pid]SIPpTag01[call_number]
To: [$2]
[last_Call-ID:]
CSeq: 2 REFER
Contact: <sip:[local_ip]:[local_port];transport=[transport]>
Max-Forwards: 70
X-Call-Number: [call_number]
Refer-To: <sip:+15551234567@example.com>
Referred-By: <sip:[local_ip]:[local_port]>
Content-Length: 0
]]>
</send>
<!-- Expect 202 Accepted (only on first iteration) -->
<recv response="202"/>
<label id="skip_refer"/>
<!-- Wait for BYE from feature server -->
<recv request="BYE"/>
<!-- Send 200 OK to BYE -->
<send>
<![CDATA[
SIP/2.0 200 OK
[last_Via:]
[last_From:]
[last_To:]
[last_Call-ID:]
[last_CSeq:]
Contact: <sip:[local_ip]:[local_port];transport=[transport]>
Content-Length: 0
]]>
</send>
</scenario>

View File

@@ -0,0 +1,90 @@
const test = require('tape');
const { sippUac } = require('./sipp')('test_fs');
const bent = require('bent');
const getJSON = bent('json')
const clearModule = require('clear-module');
const {provisionCallHook} = require('./utils');
const { sleepFor } = require('../lib/utils/helpers');
process.on('unhandledRejection', (reason, p) => {
console.log('Unhandled Rejection at: Promise', p, 'reason:', reason);
});
function connect(connectable) {
return new Promise((resolve, reject) => {
connectable.on('connect', () => {
return resolve();
});
});
}
test('when parent leg recvs REFER it should end the dial after adulting child leg', async(t) => {
clearModule.all();
const {srf, disconnect} = require('../app');
try {
await connect(srf);
// wait for fs connected to drachtio server.
await sleepFor(1000);
// GIVEN
const from = "dial_refer_handler";
let verbs = [
{
"verb": "dial",
"callerId": from,
"actionHook": "/actionHook",
"referHook": "/referHook",
"anchorMedia": true,
"target": [
{
"type": "phone",
"number": "15083084809"
}
]
}
];
await provisionCallHook(from, verbs);
// THEN
//const p = sippUac('uas-dial.xml', '172.38.0.10', undefined, undefined, 2);
const p = sippUac('uas-dial-refer.xml', '172.38.0.10', undefined, undefined, 2);
await sleepFor(1000);
let account_sid = '622f62e4-303a-49f2-bbe0-eb1e1714e37a';
let post = bent('http://127.0.0.1:3000/', 'POST', 'json', 201);
post('v1/createCall', {
'account_sid':account_sid,
"call_hook": {
"url": "http://127.0.0.1:3100/",
"method": "POST",
},
"from": from,
"to": {
"type": "phone",
"number": "15583084808"
}});
await p;
// Verify that the referHook was called
const obj = await getJSON(`http://127.0.0.1:3100/lastRequest/${from}_referHook`);
t.ok(obj.body.from === from,
'dial-refer-handler: referHook was called with correct from');
t.ok(obj.body.refer_details && obj.body.refer_details.sip_refer_to,
'dial-refer-handler: refer_details included in referHook');
t.ok(obj.body.refer_details.refer_to_user === '+15551234567',
'dial-refer-handler: refer_to_user correctly parsed');
t.ok(obj.body.refer_details.referring_call_sid,
'dial-refer-handler: referring_call_sid included');
t.ok(obj.body.refer_details.referred_call_sid,
'dial-refer-handler: referred_call_sid included');
disconnect();
} catch (err) {
console.log(`error received: ${err}`);
disconnect();
t.error(err);
}
});

View File

@@ -58,6 +58,46 @@ test('\'refer\' tests w/202 and NOTIFY', {timeout: 25000}, async(t) => {
}
});
test('\'refer\' tests tel:', {timeout: 25000}, async(t) => {
clearModule.all();
const {srf, disconnect} = require('../app');
try {
await connect(srf);
// GIVEN
const verbs = [
{
verb: 'say',
text: 'silence_stream://100'
},
{
verb: 'sip:refer',
referTo: 'tel:+1234567890',
actionHook: '/actionHook'
}
];
const noVerbs = [];
const from = 'refer_with_tel';
await provisionCallHook(from, verbs);
await provisionActionHook(from, noVerbs)
// THEN
await sippUac('uac-refer-with-notify.xml', '172.38.0.10', from);
t.pass('refer: successfully received 202 Accepted');
await sleepFor(1000);
const obj = await getJSON(`http:127.0.0.1:3100/lastRequest/${from}_actionHook`);
t.ok(obj.body.final_referred_call_status === 200, 'refer: successfully received NOTIFY with 200 OK');
// console.log(`obj: ${JSON.stringify(obj)}`);
disconnect();
} catch (err) {
console.log(`error received: ${err}`);
disconnect();
t.error(err);
}
});
test('\'refer\' tests w/202 but no NOTIFY', {timeout: 25000}, async(t) => {
clearModule.all();
const {srf, disconnect} = require('../app');

View File

@@ -99,6 +99,24 @@ app.post('/actionHook', (req, res) => {
return res.sendStatus(200);
});
/*
* referHook
*/
app.post('/referHook', (req, res) => {
console.log({payload: req.body}, 'POST /referHook');
let key = req.body.from + "_referHook"
addRequestToMap(key, req, hook_mapping);
return res.json([{"verb": "pause", "length": 2}]);
});
/*
* adultingHook
*/
app.post('/adulting', (req, res) => {
console.log({payload: req.body}, 'POST /adulting');
return res.sendStatus(200);
});
/*
* customHook
* For the hook to return

View File

@@ -83,7 +83,8 @@ test('invalid jambonz json create alert tests', async(t) => {
{account_sid: 'bb845d4b-83a9-4cde-a6e9-50f3743bab3f', page: 1, page_size: 25, days: 7});
let checked = false;
for (let i = 0; i < data.total; i++) {
checked = data.data[i].message === 'malformed jambonz payload: must be array'
checked = data.data[i].message === 'malformed jambonz payload: must be array';
if (checked) break;
}
t.ok(checked, 'alert is raised as expected');
disconnect();