Compare commits

...

15 Commits

Author SHA1 Message Date
Hoan Luu Huu
d7beaa1b7b Feat/dialogflow cx (#7) (#1516)
* wip

* wip

* wip

* wip

* logging

* wip

* wip

* wip

* update docker env to latest freeswitch

* wip

* lint

* wip

* support dialogflowcx

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

---------

Co-authored-by: Dave Horton <daveh@beachdognet.com>
2026-02-12 07:46:36 -05:00
Dave Horton
45d0ca87af update drachtio-srf (#1515) 2026-02-09 10:31:43 -05:00
Sam Machin
bd435dfff9 add deep copy (#1511)
* escape tag data in listen

* deep copy call data for listen
2026-02-03 10:44:26 -05:00
Sam Machin
b598cd94ae escape tag data in listen (#1510) 2026-02-03 07:35:15 -05:00
Matt Hertogs
ceb9a7a3bd Fix boostAudioSignal parameter in Update Call REST API (#1490)
Corrects the parameter passed to _lccBoostAudioSignal to use
opts.boostAudioSignal instead of the entire opts object, ensuring
the boostAudioSignal option works correctly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-29 13:58:14 -05:00
Dave Horton
ff5f9acaf8 on dial do not reinvite A leg on answer if already answered and we are anchoring media (#1508) 2026-01-29 13:47:53 -05:00
Sam Machin
96cdc2936b invert default (#1507) 2026-01-29 09:22:00 -05:00
Hoan Luu Huu
6120dcbe96 support openai transcribe turn_detection.eagerness (#1496) 2026-01-28 08:09:01 -05:00
Hoan Luu Huu
96d72216e2 support google s2s host, version, sessionResumption (#1498) 2026-01-28 08:01:53 -05:00
Hoan Luu Huu
faee30278b support mod_google_tts_streaming (#1409)
* support mod_google_tts_streaming

* wip

* wip
2026-01-27 08:18:47 -05:00
Hoan Luu Huu
325af42946 speechmatics support end_of_utterance_silence_trigger (#1499)
* speechmatics support end_of_utterance_silence_trigger

* wip
2026-01-23 10:11:58 -05:00
Hoan Luu Huu
9848152d5b support google gemini tts (#1491)
* support google gemini tts

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* update speech utils version

* wip
2026-01-22 10:12:06 -05:00
Sam Machin
2468557aef add statusHook to redirect verb (#1500)
* add statusHook to redirect

* fix url import

* Update redirect.js

* logging

* constructor statusHook

* lint n logging

* debug

* update call_status_hook

* use notifier to test url

* remove require url as its global since node 10

* update verb specs dep

* update verb specs
2026-01-21 09:20:47 -05:00
Dave Horton
3c3dfa81d3 fix issue where call hangs up and actionhook delay triggered (#1497) 2026-01-19 16:42:43 -05:00
Vinod Dharashive
961c2589ac freeswitch capture sip error and propagate the same error (#1489)
* fix: propagate SIP 488 error to SBC on endpoint allocation failure

When FreeSWITCH returns a SIP 488 'Not Acceptable Here' error during
endpoint allocation (e.g., codec incompatibility), this error was not
being propagated back to the SBC/client. Instead, the call would wait
indefinitely for websocket commands or return a generic 603 response.

Implementation:
- In _evalEndpointPrecondition(), detect SipError by checking
  err.type === 'SipError' or err.name === 'SipError'
- Extract the SIP status code (e.g., 488), reason, and the Reason
  header from the error response (e.g., Q.850;cause=88;text=INCOMPATIBLE_DESTINATION)
- Send the SIP error response immediately to the SBC with:
  - X-Reason header: endpoint allocation failure details
  - Reason header: original Q.850 cause from FreeSWITCH
- Notify call status change as Failed with proper SIP status
- Release the call immediately instead of waiting for commands

Also added fallback handling in InboundCallSession._onTasksDone() to
propagate the stored error if immediate send was not possible.

* wip

* Simplify SipError check to only use err.name
2026-01-13 08:58:37 -05:00
15 changed files with 1671 additions and 1026 deletions

View File

@@ -1082,7 +1082,7 @@ class CallSession extends Emitter {
const cred = JSON.parse(credential.service_key.replace(/\n/g, '\\n'));
return {
speech_credential_sid: credential.speech_credential_sid,
credentials: cred
credentials: cred,
};
} catch (err) {
const sid = this.accountInfo.account.account_sid;
@@ -2028,7 +2028,7 @@ Duration=${duration} `
return this._lccDub(opts.dub, callSid);
}
else if (opts.boostAudioSignal) {
return this._lccBoostAudioSignal(opts, callSid);
return this._lccBoostAudioSignal(opts.boostAudioSignal, callSid);
}
else if (opts.media_path) {
return this._lccMediaPath(opts.media_path, callSid);
@@ -2500,6 +2500,36 @@ Duration=${duration} `
}
else {
this.logger.error(err, `Error attempting to allocate endpoint for for task ${task.name}`);
// Check for SipError type (e.g., 488 codec incompatibility)
const isSipError = err.name === 'SipError';
if (isSipError && err.status) {
// Extract Reason header from SIP response if available (e.g., Q.850;cause=88;text="INCOMPATIBLE_DESTINATION")
const sipReasonHeader = err.res?.msg?.headers?.reason;
this._endpointAllocationError = {
status: err.status,
reason: err.reason || 'Endpoint Allocation Failed',
sipReasonHeader
};
this.logger.info({endpointAllocationError: this._endpointAllocationError},
'Captured SipError for propagation to SBC');
// Send SIP error response immediately for inbound calls
if (this.res && !this.res.finalResponseSent) {
this.logger.info(`Sending ${err.status} response to SBC due to SipError`);
this.res.send(err.status, {
headers: {
'X-Reason': `endpoint allocation failure: ${err.reason || 'Endpoint Allocation Failed'}`,
...(sipReasonHeader && {'Reason': sipReasonHeader})
}
});
this._notifyCallStatusChange({
callStatus: CallStatus.Failed,
sipStatus: err.status,
sipReason: err.reason || 'Endpoint Allocation Failed'
});
this._callReleased();
}
}
throw new Error(`${BADPRECONDITIONS}: unable to allocate endpoint`);
}
}

View File

@@ -60,6 +60,19 @@ class InboundCallSession extends CallSession {
}
});
}
else if (this._endpointAllocationError) {
// Propagate SIP error from endpoint allocation failure back to the client
const {status, reason, sipReasonHeader} = this._endpointAllocationError;
this.rootSpan.setAttributes({'call.termination': `endpoint allocation SIP error ${status}`});
this.logger.info({endpointAllocationError: this._endpointAllocationError},
`InboundCallSession:_onTasksDone generating ${status} due to endpoint allocation failure`);
this.res.send(status, {
headers: {
'X-Reason': `endpoint allocation failure: ${reason}`,
...(sipReasonHeader && {'Reason': sipReasonHeader})
}
});
}
else {
this.rootSpan.setAttributes({'call.termination': 'tasks completed without answering call'});
this.logger.info('InboundCallSession:_onTasksDone auto-generating non-success response to invite');

View File

@@ -195,6 +195,9 @@ class TaskDial extends Task {
async exec(cs) {
await super.exec(cs);
/* capture whether A leg was already answered before this dial task started */
this._aLegAlreadyAnswered = !!cs.dlg;
if (this.data.anchorMedia && this.data.exitMediaPath) {
this.logger.info('Dial:exec - incompatible anchorMedia and exitMediaPath are both set, will obey anchorMedia');
delete this.data.exitMediaPath;
@@ -550,7 +553,7 @@ class TaskDial extends Task {
let sbcAddress = this.proxy || getSBC();
const teamsInfo = {};
let fqdn;
const forwardPAI = this.forwardPAI ?? JAMBONZ_DIAL_PAI_HEADER; // dial verb overides env var
const forwardPAI = this.forwardPAI ?? !JAMBONZ_DIAL_PAI_HEADER; // dial verb overides env var
this.logger.debug(forwardPAI, 'forwardPAI value');
if (!sbcAddress) throw new Error('no SBC found for outbound call');
this.headers = {
@@ -872,8 +875,12 @@ class TaskDial extends Task {
this.sd = sd;
this.callSid = sd.callSid;
if (this.earlyMedia) {
debug('Dial:_selectSingleDial propagating answer supervision on A leg now that B is connected');
await cs.propagateAnswer();
if (this._aLegAlreadyAnswered) {
debug('Dial:_selectSingleDial A leg was already answered, skipping propagateAnswer');
} else {
debug('Dial:_selectSingleDial propagating answer supervision on A leg now that B is connected');
await cs.propagateAnswer();
}
}
if (this.timeLimit) {
this.timerMaxCallDuration = setTimeout(this._onMaxCallDuration.bind(this, cs), this.timeLimit * 1000);

View File

@@ -1,3 +1,4 @@
const assert = require('assert');
const Task = require('../task');
const {TaskName, TaskPreconditions} = require('../../utils/constants');
const Intent = require('./intent');
@@ -10,19 +11,27 @@ class Dialogflow extends Task {
super(logger, opts);
this.preconditions = TaskPreconditions.Endpoint;
this.credentials = this.data.credentials;
this.project = this.data.project;
this.agent = this.data.agent;
this.region = this.data.region || 'us-central1';
this.model = this.data.model || 'es';
/* set project id with environment and region (optionally) */
if (this.data.environment && this.data.region) {
this.project = `${this.data.project}:${this.data.environment}:${this.data.region}`;
}
else if (this.data.environment) {
this.project = `${this.data.project}:${this.data.environment}`;
}
else if (this.data.region) {
this.project = `${this.data.project}::${this.data.region}`;
assert(this.agent || !this.isCX, 'agent is required for dialogflow cx');
assert(this.credentials, 'dialogflow credentials are required');
if (this.isCX) {
this.environment = this.data.environment || 'none';
}
else {
this.project = this.data.project;
if (this.data.environment && this.data.region) {
this.project = `${this.data.project}:${this.data.environment}:${this.data.region}`;
}
else if (this.data.environment) {
this.project = `${this.data.project}:${this.data.environment}`;
}
else if (this.data.region) {
this.project = `${this.data.project}::${this.data.region}`;
}
}
this.lang = this.data.lang || 'en-US';
@@ -39,7 +48,6 @@ class Dialogflow extends Task {
this.events = this.data.events;
}
else if (this.eventHook) {
// send all events by default - except interim transcripts
this.events = [
'intent',
'transcription',
@@ -60,38 +68,33 @@ class Dialogflow extends Task {
this.voice = this.data.tts.voice || 'default';
this.speechSynthesisLabel = this.data.tts.label;
// fallback tts
this.fallbackVendor = this.data.tts.fallbackVendor || 'default';
this.fallbackLanguage = this.data.tts.fallbackLanguage || 'default';
this.fallbackVoice = this.data.tts.fallbackLanguage || 'default';
this.fallbackVoice = this.data.tts.fallbackVoice || 'default';
this.fallbackLabel = this.data.tts.fallbackLabel;
}
this.bargein = this.data.bargein;
this.cmd = this.isCX ? 'dialogflow_cx_start' : 'dialogflow_start';
this.cmdStop = this.isCX ? 'dialogflow_cx_stop' : 'dialogflow_stop';
// CX-specific state
this._suppressNextCXAudio = false;
this._cxAudioHandled = false;
}
get name() { return TaskName.Dialogflow; }
get isCX() { return this.model === 'cx'; }
get isES() { return !this.isCX; }
async exec(cs, {ep}) {
await super.exec(cs);
try {
await this.init(cs, ep);
this.logger.debug(`starting dialogflow bot ${this.project}`);
// kick it off
const baseArgs = `${this.ep.uuid} ${this.project} ${this.lang} ${this.welcomeEvent}`;
if (this.welcomeEventParams) {
this.ep.api('dialogflow_start', `${baseArgs} '${JSON.stringify(this.welcomeEventParams)}'`);
}
else if (this.welcomeEvent.length) {
this.ep.api('dialogflow_start', baseArgs);
}
else {
this.ep.api('dialogflow_start', `${this.ep.uuid} ${this.project} ${this.lang}`);
}
this.logger.debug(`started dialogflow bot ${this.project}`);
await this.startBot('default');
await this.awaitTaskDone();
} catch (err) {
this.logger.error({err}, 'Dialogflow:exec error');
@@ -108,6 +111,12 @@ class Dialogflow extends Task {
this.ep.removeCustomEventListener('dialogflow::end_of_utterance');
this.ep.removeCustomEventListener('dialogflow::error');
this.ep.removeCustomEventListener('dialogflow_cx::intent');
this.ep.removeCustomEventListener('dialogflow_cx::transcription');
this.ep.removeCustomEventListener('dialogflow_cx::audio_provided');
this.ep.removeCustomEventListener('dialogflow_cx::end_of_utterance');
this.ep.removeCustomEventListener('dialogflow_cx::error');
this._clearNoinputTimer();
if (!this.reportedFinalAction) this.performAction({dialogflowResult: 'caller hungup'})
@@ -141,6 +150,12 @@ class Dialogflow extends Task {
this.ep.addCustomEventListener('dialogflow::end_of_utterance', this._onEndOfUtterance.bind(this, ep, cs));
this.ep.addCustomEventListener('dialogflow::error', this._onError.bind(this, ep, cs));
this.ep.addCustomEventListener('dialogflow_cx::intent', this._onIntent.bind(this, ep, cs));
this.ep.addCustomEventListener('dialogflow_cx::transcription', this._onTranscription.bind(this, ep, cs));
this.ep.addCustomEventListener('dialogflow_cx::audio_provided', this._onAudioProvided.bind(this, ep, cs));
this.ep.addCustomEventListener('dialogflow_cx::end_of_utterance', this._onEndOfUtterance.bind(this, ep, cs));
this.ep.addCustomEventListener('dialogflow_cx::error', this._onError.bind(this, ep, cs));
const obj = typeof this.credentials === 'string' ? JSON.parse(this.credentials) : this.credentials;
const creds = JSON.stringify(obj);
await this.ep.set('GOOGLE_APPLICATION_CREDENTIALS', creds);
@@ -151,56 +166,113 @@ class Dialogflow extends Task {
}
}
async startBot(intent) {
if (this.isCX) {
const event = this.welcomeEvent || intent;
const args = this._buildStartArgs({
event: event && event !== 'default' ? event : undefined
});
this.logger.info({args}, 'starting dialogflow CX bot');
await this.ep.api(this.cmd, args);
}
else {
await this._startBotES();
}
}
async _startBotES() {
this.logger.info('starting dialogflow ES bot');
const baseArgs = `${this.ep.uuid} ${this.project} ${this.lang} ${this.welcomeEvent}`;
if (this.welcomeEventParams) {
await this.ep.api(this.cmd, `${baseArgs} '${JSON.stringify(this.welcomeEventParams)}'`);
}
else if (this.welcomeEvent.length) {
await this.ep.api(this.cmd, baseArgs);
}
else {
await this.ep.api(this.cmd, `${this.ep.uuid} ${this.project} ${this.lang}`);
}
}
/**
* Build the start command args string for either ES or CX.
* @param {object} opts - options
* @param {string} opts.event - optional event to send
* @param {string} opts.text - optional text to send
* @param {number} opts.singleUtterance - 1 or 0 (CX only, default 1)
* @returns {string} command args string
*/
_buildStartArgs({event, text, singleUtterance = 1} = {}) {
if (this.isCX) {
const args = [
this.ep.uuid,
this.project,
this.region,
this.agent,
this.environment || 'none',
this.lang,
event || 'none',
text ? `'${text}'` : 'none',
singleUtterance ? '1' : '0',
];
return args.join(' ');
}
// ES
const args = [this.ep.uuid, this.project, this.lang];
if (event) {
args.push(event);
}
if (text) {
if (!event) args.push('none');
args.push(`'${text}'`);
}
return args.join(' ');
}
/**
* An intent has been returned. Since we are using SINGLE_UTTERANCE on the dialogflow side,
* we may get an empty intent, signified by the lack of a 'response_id' attribute.
* In such a case, we just start another StreamingIntentDetectionRequest.
* @param {*} ep - media server endpoint
* @param {*} cs - call session
* @param {*} evt - event data
*/
async _onIntent(ep, cs, evt) {
const intent = new Intent(this.logger, evt);
if (intent.isEmpty) {
/**
* An empty intent is returned in 3 conditions:
* 1. Our no-input timer fired
* 2. We collected dtmf that needs to be fed to dialogflow
* 3. A normal dialogflow timeout
*/
if (this.noinput && this.greetingPlayed) {
this.logger.info('no input timer fired, reprompting..');
this.noinput = false;
ep.api('dialogflow_start', `${ep.uuid} ${this.project} ${this.lang} ${this.noInputEvent}`);
ep.api(this.cmd, this._buildStartArgs({event: this.noInputEvent}));
}
else if (this.dtmfEntry && this.greetingPlayed) {
this.logger.info('dtmf detected, reprompting..');
ep.api('dialogflow_start', `${ep.uuid} ${this.project} ${this.lang} none \'${this.dtmfEntry}\'`);
ep.api(this.cmd, this._buildStartArgs({text: this.dtmfEntry}));
this.dtmfEntry = null;
}
else if (this.greetingPlayed) {
this.logger.info('starting another intent');
ep.api('dialogflow_start', `${ep.uuid} ${this.project} ${this.lang}`);
}
else {
this.logger.info('got empty intent');
ep.api('dialogflow_start', `${ep.uuid} ${this.project} ${this.lang}`);
this.logger.info('got empty intent, restarting');
ep.api(this.cmd, this._buildStartArgs());
}
return;
}
// For CX: suppress NO_INPUT "I didn't get that" audio and silently restart
if (this.isCX && intent.isNoInput && this.greetingPlayed) {
this.logger.info('CX returned NO_INPUT after greeting, suppressing and restarting');
this._suppressNextCXAudio = true;
return;
}
if (this.events.includes('intent')) {
this._performHook(cs, this.eventHook, {event: 'intent', data: evt});
}
// clear the no-input timer and the digit buffer
this._clearNoinputTimer();
if (this.digitBuffer) this.digitBuffer.flush();
/* hang up (or tranfer call) after playing next audio file? */
if (intent.saysEndInteraction) {
// if 'end_interaction' is true, end the dialog after playing the final prompt
// (or in 1 second if there is no final prompt)
this.hangupAfterPlayDone = true;
this.waitingForPlayStart = true;
setTimeout(() => {
@@ -211,8 +283,6 @@ class Dialogflow extends Task {
}
}, 1000);
}
/* collect digits? */
else if (intent.saysCollectDtmf || this.enableDtmfAlways) {
const opts = Object.assign({
idt: this.opts.interDigitTimeout
@@ -221,68 +291,44 @@ class Dialogflow extends Task {
this.digitBuffer.once('fulfilled', this._onDtmfEntryComplete.bind(this, ep));
}
/* if we are using tts and a message was provided, play it out */
// If we have a TTS vendor and fulfillment text, synthesize and play
if (this.vendor && intent.fulfillmentText && intent.fulfillmentText.length > 0) {
const {srf} = cs;
const {stats} = srf.locals;
const {synthAudio} = srf.locals.dbHelpers;
this.waitingForPlayStart = false;
// start a new intent, (we want to continue to listen during the audio playback)
// _unless_ we are transferring or ending the session
if (!this.hangupAfterPlayDone) {
ep.api('dialogflow_start', `${ep.uuid} ${this.project} ${this.lang}`);
// ES: start a new intent during playback so we continue to listen
if (!this.hangupAfterPlayDone && this.isES) {
ep.api(this.cmd, this._buildStartArgs());
}
try {
const {srf} = cs;
const {stats} = srf.locals;
const {synthAudio} = srf.locals.dbHelpers;
const {filePath} = await this._fallbackSynthAudio(cs, intent, stats, synthAudio);
if (filePath) cs.trackTmpFile(filePath);
if (this.playInProgress) {
await ep.api('uuid_break', ep.uuid).catch((err) => this.logger.info(err, 'Error killing audio'));
}
this.playInProgress = true;
this.curentAudioFile = filePath;
this.logger.debug(`starting to play tts ${filePath}`);
if (this.events.includes('start-play')) {
this._performHook(cs, this.eventHook, {event: 'start-play', data: {path: filePath}});
}
await ep.play(filePath);
if (this.events.includes('stop-play')) {
this._performHook(cs, this.eventHook, {event: 'stop-play', data: {path: filePath}});
}
this.logger.debug(`finished ${filePath}`);
if (this.curentAudioFile === filePath) {
this.playInProgress = false;
if (this.queuedTasks) {
this.logger.debug('finished playing audio and we have queued tasks');
this._redirect(cs, this.queuedTasks);
return;
}
}
this.greetingPlayed = true;
if (this.hangupAfterPlayDone) {
this.logger.info('hanging up since intent was marked end interaction and we completed final prompt');
this.performAction({dialogflowResult: 'completed'});
this.notifyTaskDone();
}
else {
// every time we finish playing a prompt, start the no-input timer
this._startNoinputTimer(ep, cs);
}
await this._playAndHandlePostPlay(ep, cs, filePath);
} catch (err) {
this.logger.error({err}, 'Dialogflow:_onIntent - error playing tts');
}
}
else if (this.isCX && !this.hangupAfterPlayDone) {
// CX intent with no TTS — _onAudioProvided may handle playback.
// If not, restart CX after a short delay.
this.greetingPlayed = true;
this._cxAudioHandled = false;
setTimeout(() => {
if (!this._cxAudioHandled && !this.playInProgress) {
this.logger.info('CX: no TTS and no audio provided, restarting to listen');
ep.api(this.cmd, this._buildStartArgs());
this._startNoinputTimer(ep, cs);
}
}, 500);
}
}
async _fallbackSynthAudio(cs, intent, stats, synthAudio) {
try {
const obj = {
return await synthAudio(stats, {
account_sid: cs.accountSid,
text: intent.fulfillmentText,
vendor: this.vendor,
@@ -290,17 +336,13 @@ class Dialogflow extends Task {
voice: this.voice,
salt: cs.callSid,
credentials: this.ttsCredentials
};
this.logger.debug({obj}, 'Dialogflow:_onIntent - playing message via tts');
return await synthAudio(stats, obj);
});
} catch (error) {
this.logger.info({error}, 'Failed to synthesize audio from primary vendor');
try {
if (this.fallbackVendor) {
if (this.fallbackVendor) {
try {
const credentials = cs.getSpeechCredentials(this.fallbackVendor, 'tts', this.fallbackLabel);
const obj = {
return await synthAudio(stats, {
account_sid: cs.accountSid,
text: intent.fulfillmentText,
vendor: this.fallbackVendor,
@@ -308,24 +350,20 @@ class Dialogflow extends Task {
voice: this.fallbackVoice,
salt: cs.callSid,
credentials
};
this.logger.debug({obj}, 'Dialogflow:_onIntent - playing message via fallback tts');
return await synthAudio(stats, obj);
});
} catch (err) {
this.logger.info({err}, 'Failed to synthesize audio from fallback vendor');
throw err;
}
} catch (err) {
this.logger.info({err}, 'Failed to synthesize audio from falllback vendor');
throw err;
}
throw error;
}
}
/**
* A transcription - either interim or final - has been returned.
* If we are doing barge-in based on hotword detection, check for the hotword or phrase.
* If we are playing a filler sound, like typing, during the fullfillment phase, start that
* if this is a final transcript.
* @param {*} ep - media server endpoint
* A transcription has been returned.
* @param {*} ep - media server endpoint
* @param {*} cs - call session
* @param {*} evt - event data
*/
async _onTranscription(ep, cs, evt) {
@@ -338,13 +376,11 @@ class Dialogflow extends Task {
this._performHook(cs, this.eventHook, {event: 'transcription', data: evt});
}
// if a final transcription, start a typing sound
if (this.thinkingMusic && !transcription.isEmpty && transcription.isFinal &&
transcription.confidence > 0.8) {
ep.play(this.data.thinkingMusic).catch((err) => this.logger.info(err, 'Error playing typing sound'));
}
// interrupt playback on speaking if bargein = true
if (this.bargein && this.playInProgress) {
this.logger.debug('terminating playback due to speech bargein');
this.playInProgress = false;
@@ -353,17 +389,21 @@ class Dialogflow extends Task {
}
/**
* The caller has just finished speaking. No action currently taken.
* The caller has just finished speaking.
* @param {*} ep - media server endpoint
* @param {*} cs - call session
* @param {*} evt - event data
*/
_onEndOfUtterance(cs, evt) {
_onEndOfUtterance(ep, cs, evt) {
if (this.events.includes('end-utterance')) {
this._performHook(cs, this.eventHook, {event: 'end-utterance'});
}
}
/**
* Dialogflow has returned an error of some kind.
* Dialogflow has returned an error.
* @param {*} ep - media server endpoint
* @param {*} cs - call session
* @param {*} evt - event data
*/
_onError(ep, cs, evt) {
@@ -372,70 +412,87 @@ class Dialogflow extends Task {
/**
* Audio has been received from dialogflow and written to a temporary disk file.
* Start playing the audio, after killing any filler sound that might be playing.
* When the audio completes, start the no-input timer.
* Play the audio, then restart or hang up as appropriate.
* @param {*} ep - media server endpoint
* @param {*} cs - call session
* @param {*} evt - event data
*/
async _onAudioProvided(ep, cs, evt) {
if (this.vendor) return;
this.waitingForPlayStart = false;
// kill filler audio
await ep.api('uuid_break', ep.uuid);
// start a new intent, (we want to continue to listen during the audio playback)
// _unless_ we are transferring or ending the session
if (/*this.greetingPlayed &&*/ !this.hangupAfterPlayDone) {
ep.api('dialogflow_start', `${ep.uuid} ${this.project} ${this.lang}`);
// For CX: suppress NO_INPUT reprompt audio and silently restart
if (this._suppressNextCXAudio) {
this._suppressNextCXAudio = false;
ep.api(this.cmd, this._buildStartArgs());
return;
}
this.playInProgress = true;
this.curentAudioFile = evt.path;
this.logger.info(`starting to play ${evt.path}`);
if (this.events.includes('start-play')) {
this._performHook(cs, this.eventHook, {event: 'start-play', data: {path: evt.path}});
}
await ep.play(evt.path);
if (this.events.includes('stop-play')) {
this._performHook(cs, this.eventHook, {event: 'stop-play', data: {path: evt.path}});
}
this.logger.info(`finished ${evt.path}, queued tasks: ${(this.queuedTasks || []).length}`);
if (this.curentAudioFile === evt.path) {
this.playInProgress = false;
if (this.queuedTasks) {
this.logger.debug('finished playing audio and we have queued tasks');
this._redirect(cs, this.queuedTasks);
this.queuedTasks.length = 0;
if (this.vendor) {
if (this.isCX && !this.playInProgress) {
// CX audio arrived but TTS didn't play — fall through to use CX audio
this.logger.info('CX audio provided, TTS vendor did not play - using CX audio');
} else {
return;
}
}
/*
if (!this.inbound && !this.greetingPlayed) {
this.logger.info('finished greeting on outbound call, starting new intent');
this.ep.api('dialogflow_start', `${ep.uuid} ${this.project} ${this.lang}`);
this._cxAudioHandled = true;
this.waitingForPlayStart = false;
await ep.api('uuid_break', ep.uuid);
// ES: start a new intent during playback so we continue to listen
if (!this.hangupAfterPlayDone && this.isES) {
ep.api(this.cmd, this._buildStartArgs());
}
*/
await this._playAndHandlePostPlay(ep, cs, evt.path);
}
/**
* Shared post-play logic for both TTS (_onIntent) and CX audio (_onAudioProvided).
* Plays audio, then either hangs up, redirects, or restarts the dialog.
*/
async _playAndHandlePostPlay(ep, cs, filePath) {
if (this.playInProgress) {
await ep.api('uuid_break', ep.uuid).catch((err) => this.logger.info(err, 'Error killing audio'));
}
this.playInProgress = true;
this.curentAudioFile = filePath;
if (this.events.includes('start-play')) {
this._performHook(cs, this.eventHook, {event: 'start-play', data: {path: filePath}});
}
await ep.play(filePath);
if (this.events.includes('stop-play')) {
this._performHook(cs, this.eventHook, {event: 'stop-play', data: {path: filePath}});
}
if (this.curentAudioFile === filePath) {
this.playInProgress = false;
if (this.queuedTasks) {
this._redirect(cs, this.queuedTasks);
this.queuedTasks = null;
return;
}
}
this.greetingPlayed = true;
if (this.hangupAfterPlayDone) {
this.logger.info('hanging up since intent was marked end interaction and we completed final prompt');
this.logger.info('hanging up after end interaction prompt');
this.performAction({dialogflowResult: 'completed'});
this.notifyTaskDone();
}
else {
// every time we finish playing a prompt, start the no-input timer
// CX: restart to listen for the next utterance
if (this.isCX) {
ep.api(this.cmd, this._buildStartArgs());
}
this._startNoinputTimer(ep, cs);
}
}
/**
* receive a dmtf entry from the caller.
* If we have active dtmf instructions, collect and process accordingly.
* Receive a DTMF entry from the caller.
*/
_onDtmf(ep, cs, evt) {
if (this.digitBuffer) this.digitBuffer.process(evt.dtmf);
@@ -444,41 +501,48 @@ class Dialogflow extends Task {
}
}
_onDtmfEntryComplete(ep, dtmfEntry) {
async _onDtmfEntryComplete(ep, dtmfEntry) {
this.logger.info(`collected dtmf entry: ${dtmfEntry}`);
this.dtmfEntry = dtmfEntry;
this.digitBuffer = null;
// if a final transcription, start a typing sound
if (this.thinkingMusic) {
ep.play(this.thinkingMusic).catch((err) => this.logger.info(err, 'Error playing typing sound'));
}
// kill the current dialogflow, which will result in us getting an immediate intent
ep.api('dialogflow_stop', `${ep.uuid}`)
.catch((err) => this.logger.info(`dialogflow_stop failed: ${err.message}`));
if (this.isCX) {
try {
await ep.api(this.cmdStop, ep.uuid);
} catch (err) {
this.logger.info(err, 'dialogflow_cx_stop failed');
}
ep.api(this.cmd, this._buildStartArgs({text: dtmfEntry}));
} else {
this.dtmfEntry = dtmfEntry;
ep.api(this.cmdStop, `${ep.uuid}`)
.catch((err) => this.logger.info(`dialogflow_stop failed: ${err.message}`));
}
}
/**
* The user has not provided any input for some time.
* Set the 'noinput' member to true and kill the current dialogflow.
* This will result in us re-prompting with an event indicating no input.
* @param {*} ep
*/
_onNoInput(ep, cs) {
this.noinput = true;
async _onNoInput(ep, cs) {
this.logger.info('no-input timer fired');
if (this.events.includes('no-input')) {
this._performHook(cs, this.eventHook, {event: 'no-input'});
this._performHook(cs, this.eventHook, {event: 'no-input'});
}
// kill the current dialogflow, which will result in us getting an immediate intent
ep.api('dialogflow_stop', `${ep.uuid}`)
.catch((err) => this.logger.info(`dialogflow_stop failed: ${err.message}`));
if (this.isCX) {
try {
await ep.api(this.cmdStop, ep.uuid);
} catch (err) {
this.logger.info(err, 'dialogflow_cx_stop failed');
}
ep.api(this.cmd, this._buildStartArgs({event: this.noInputEvent}));
} else {
this.noinput = true;
ep.api(this.cmdStop, `${ep.uuid}`)
.catch((err) => this.logger.info(`dialogflow_stop failed: ${err.message}`));
}
}
/**
* Stop the no-input timer, if it is running
*/
_clearNoinputTimer() {
if (this.noinputTimer) {
clearTimeout(this.noinputTimer);
@@ -486,10 +550,6 @@ class Dialogflow extends Task {
}
}
/**
* Start the no-input timer. The duration is set in the configuration file.
* @param {*} ep
*/
_startNoinputTimer(ep, cs) {
if (!this.noInputTimeout) return;
this._clearNoinputTimer();
@@ -507,7 +567,7 @@ class Dialogflow extends Task {
if (tasks && tasks.length > 0) {
if (this.playInProgress) {
this.queuedTasks = tasks;
this.logger.info({tasks: tasks},
this.logger.info({tasks},
`${this.name} replacing application with ${tasks.length} tasks after play completes`);
return;
}
@@ -517,7 +577,7 @@ class Dialogflow extends Task {
}
_redirect(cs, tasks) {
this.logger.info({tasks: tasks}, `${this.name} replacing application with ${tasks.length} tasks`);
this.logger.info({tasks}, `${this.name} replacing application with ${tasks.length} tasks`);
this.performAction({dialogflowResult: 'redirect'}, false);
this.reportedFinalAction = true;
cs.replaceApplication(tasks);

View File

@@ -3,20 +3,44 @@ class Intent {
this.logger = logger;
this.evt = evt;
this.logger.debug({evt}, 'intent');
this.dtmfRequest = checkIntentForDtmfEntry(logger, evt);
this.qr = this.isCX ? evt.detect_intent_response.query_result : evt.query_result;
this.dtmfRequest = this._checkIntentForDtmfEntry();
}
get response_id() {
return this.isCX ? this.evt.detect_intent_response.response_id : this.evt.response_id;
}
get isEmpty() {
return this.evt.response_id.length === 0;
return !(this.response_id?.length > 0);
}
get fulfillmentText() {
return this.evt.query_result.fulfillment_text;
if (this.isCX) {
if (this.qr && this.qr.response_messages) {
for (const msg of this.qr.response_messages) {
if (msg.text && msg.text.text && msg.text.text.length > 0) {
return msg.text.text.join('\n');
}
if (msg.output_audio_text) {
if (msg.output_audio_text.text) return msg.output_audio_text.text;
if (msg.output_audio_text.ssml) return msg.output_audio_text.ssml;
}
}
}
return undefined;
}
return this.qr.fulfillment_text;
}
get saysEndInteraction() {
return this.evt.query_result.intent.end_interaction ;
if (this.isCX) {
if (!this.qr || !this.qr.response_messages) return false;
const end_interaction = this.qr.response_messages
.find((m) => typeof m === 'object' && 'end_interaction' in m)?.end_interaction;
return end_interaction && Object.keys(end_interaction).length > 0;
}
return this.qr.intent.end_interaction;
}
get saysCollectDtmf() {
@@ -28,7 +52,23 @@ class Intent {
}
get name() {
if (!this.isEmpty) return this.evt.query_result.intent.display_name;
if (!this.isEmpty) {
if (this.isCX) {
return this.qr.match?.intent?.display_name;
}
return this.qr.intent.display_name;
}
}
get isCX() {
return typeof this.evt.detect_intent_response === 'object';
}
get isNoInput() {
if (this.isCX && this.qr && this.qr.match) {
return this.qr.match.match_type === 'NO_INPUT';
}
return false;
}
toJSON() {
@@ -38,52 +78,48 @@ class Intent {
};
}
/**
* Parse a returned intent for DTMF entry information (ES only).
* CX does not use fulfillment_messages or output_contexts.
*
* allow-dtmf-x-y-z
* x = min number of digits
* y = optional, max number of digits
* z = optional, terminating character
*/
_checkIntentForDtmfEntry() {
if (this.isCX) return;
const qr = this.qr;
if (!qr || !qr.fulfillment_messages || !qr.output_contexts) {
return;
}
// check for custom payloads with a gather verb
const custom = qr.fulfillment_messages.find((f) => f.payload && f.payload.verb === 'gather');
if (custom) {
this.logger.info({custom}, 'found dtmf custom payload');
return {
max: custom.payload.numDigits,
term: custom.payload.finishOnKey,
template: custom.payload.responseTemplate
};
}
// check for an output context with a specific naming convention
const context = qr.output_contexts.find((oc) => oc.name.includes('/contexts/allow-dtmf-'));
if (context) {
const arr = /allow-dtmf-(\d+)(?:-(\d+))?(?:-(.*))?/.exec(context.name);
if (arr) {
this.logger.info('found dtmf output context');
return {
min: parseInt(arr[1]),
max: arr.length > 2 ? parseInt(arr[2]) : null,
term: arr.length > 3 ? arr[3] : null
};
}
}
}
}
module.exports = Intent;
/**
* Parse a returned intent for DTMF entry information
* i.e.
* allow-dtmf-x-y-z
* x = min number of digits
* y = optional, max number of digits
* z = optional, terminating character
* e.g.
* allow-dtmf-5 : collect 5 digits
* allow-dtmf-1-4 : collect between 1 to 4 (inclusive) digits
* allow-dtmf-1-4-# : collect 1-4 digits, terminating if '#' is entered
* @param {*} intent - dialogflow intent
*/
const checkIntentForDtmfEntry = (logger, intent) => {
const qr = intent.query_result;
if (!qr || !qr.fulfillment_messages || !qr.output_contexts) {
logger.info({f: qr.fulfillment_messages, o: qr.output_contexts}, 'no dtmfs');
return;
}
// check for custom payloads with a gather verb
const custom = qr.fulfillment_messages.find((f) => f.payload && f.payload.verb === 'gather');
if (custom && custom.payload && custom.payload.verb === 'gather') {
logger.info({custom}, 'found dtmf custom payload');
return {
max: custom.payload.numDigits,
term: custom.payload.finishOnKey,
template: custom.payload.responseTemplate
};
}
// check for an output context with a specific naming convention
const context = qr.output_contexts.find((oc) => oc.name.includes('/contexts/allow-dtmf-'));
if (context) {
const arr = /allow-dtmf-(\d+)(?:-(\d+))?(?:-(.*))?/.exec(context.name);
if (arr) {
logger.info({custom}, 'found dtmf output context');
return {
min: parseInt(arr[1]),
max: arr.length > 2 ? parseInt(arr[2]) : null,
term: arr.length > 3 ? arr[3] : null
};
}
}
};

View File

@@ -152,9 +152,17 @@ class TaskListen extends Task {
async _startListening(cs, ep) {
this._initListeners(ep);
const ci = this.nested ? this.parentTask.sd.callInfo : cs.callInfo.toJSON();
const tempci = this.nested ? this.parentTask.sd.callInfo : cs.callInfo.toJSON();
const ci = structuredClone(tempci);
if (this._ignoreCustomerData) {
delete ci.customerData;
} else {
for (const key in ci.customerData) {
if (ci.customerData.hasOwnProperty(key)) {
const value = ci.customerData[key];
ci.customerData[key] = typeof value === 'string' ? escapeString(value) : value;
}
}
}
const metadata = Object.assign(
{sampleRate: this.sampleRate, mixType: this.mixType},

View File

@@ -36,6 +36,9 @@ class TaskLlmGoogle_S2S extends Task {
this.model = this.parent.model || 'models/gemini-2.0-flash-live-001';
this.auth = this.parent.auth;
this.connectionOptions = this.parent.connectOptions;
const {host, version} = this.connectionOptions || {};
this.host = host;
this.version = version;
const {apiKey} = this.auth || {};
if (!apiKey) throw new Error('auth.apiKey is required for Google S2S');
@@ -46,7 +49,7 @@ class TaskLlmGoogle_S2S extends Task {
this.eventHook = this.data.eventHook;
this.toolHook = this.data.toolHook;
const {setup} = this.data.llmOptions;
const {setup, sessionResumption} = this.data.llmOptions;
if (typeof setup !== 'object') {
throw new Error('llmOptions with an initial setup is required for Google S2S');
@@ -54,6 +57,7 @@ class TaskLlmGoogle_S2S extends Task {
this.setup = {
...setup,
model: this.model,
...(sessionResumption && {sessionResumption}),
// make sure output is always audio
generationConfig: {
...(setup.generationConfig || {}),
@@ -138,6 +142,10 @@ class TaskLlmGoogle_S2S extends Task {
try {
const args = [ep.uuid, 'session.create', this.apiKey];
if (this.host) {
args.push(this.host);
if (this.version) args.push(this.version);
}
await this._api(ep, args);
} catch (err) {
this.logger.error({err}, 'TaskLlmGoogle_S2S:_startListening');

View File

@@ -1,7 +1,6 @@
const Task = require('./task');
const {TaskName} = require('../utils/constants');
const WsRequestor = require('../utils/ws-requestor');
const URL = require('url');
const HttpRequestor = require('../utils/http-requestor');
/**
@@ -10,6 +9,7 @@ const HttpRequestor = require('../utils/http-requestor');
class TaskRedirect extends Task {
constructor(logger, opts) {
super(logger, opts);
this.statusHook = opts.statusHook || false;
}
get name() { return TaskName.Redirect; }
@@ -47,6 +47,30 @@ class TaskRedirect extends Task {
}
}
}
/* update the notifier if a new statusHook was provided */
if (this.statusHook) {
this.logger.info(`TaskRedirect updating statusHook to ${this.statusHook}`);
try {
const oldNotifier = cs.application.notifier;
const isStatusHookAbsolute = cs.notifier?._isAbsoluteUrl(this.statusHook);
if (isStatusHookAbsolute) {
if (cs.notifier instanceof WsRequestor) {
cs.application.notifier = new WsRequestor(this.logger, cs.accountSid, {url: this.statusHook},
cs.accountInfo.account.webhook_secret);
} else {
cs.application.notifier = new HttpRequestor(this.logger, cs.accountSid, {url: this.statusHook},
cs.accountInfo.account.webhook_secret);
}
if (oldNotifier?.close) oldNotifier.close();
}
/* update the call_status_hook URL that gets passed to the notifier */
cs.application.call_status_hook = this.statusHook;
} catch (err) {
this.logger.info(err, `TaskRedirect error updating statusHook to ${this.statusHook}`);
}
}
await this.performAction();
}
}

View File

@@ -31,8 +31,9 @@ class TtsTask extends Task {
this.synthesizer = this.data.synthesizer || {};
this.disableTtsCache = this.data.disableTtsCache;
this.options = this.synthesizer.options || {};
this.instructions = this.data.instructions;
this.instructions = this.data.instructions || this.options.instructions;
this.playbackIds = [];
this.useGeminiTts = this.options.useGeminiTts;
}
getPlaybackId(offset) {
@@ -156,6 +157,13 @@ class TtsTask extends Task {
...(reduceLatency && {RIMELABS_TTS_STREAMING_REDUCE_LATENCY: reduceLatency})
};
break;
case 'google':
obj = {
GOOGLE_TTS_LANGUAGE_CODE: language,
GOOGLE_TTS_VOICE_NAME: voice,
GOOGLE_APPLICATION_CREDENTIALS: JSON.stringify(credentials.credentials)
};
break;
default:
if (vendor.startsWith('custom:')) {
const use_tls = custom_tts_streaming_url.startsWith('wss://');
@@ -242,6 +250,8 @@ class TtsTask extends Task {
}
} else if (vendor === 'cartesia') {
credentials.model_id = this.options.model_id || credentials.model_id;
} else if (vendor === 'google') {
this.model = this.options.model || credentials.credentials.model_id;
}
this.model_id = credentials.model_id;

View File

@@ -118,6 +118,13 @@ class ActionHookDelayProcessor extends Emitter {
this.logger.debug('ActionHookDelayProcessor#_onNoResponseTimer');
this._noResponseTimer = null;
/* check if endpoint is still available (call may have ended) */
if (!this.ep) {
this.logger.debug('ActionHookDelayProcessor#_onNoResponseTimer: endpoint is null, call may have ended');
this._active = false;
return;
}
/* get the next play or say action */
const verb = this.actions[this._retryCount % this.actions.length];
@@ -129,8 +136,8 @@ class ActionHookDelayProcessor extends Emitter {
this._taskInProgress.exec(this.cs, {ep: this.ep}).catch((err) => {
this.logger.info(`ActionHookDelayProcessor#_onNoResponseTimer: error playing file: ${err.message}`);
this._taskInProgress = null;
this.ep.removeAllListeners('playback-start');
this.ep.removeAllListeners('playback-stop');
this.ep?.removeAllListeners('playback-start');
this.ep?.removeAllListeners('playback-stop');
});
} catch (err) {
this.logger.info(err, 'ActionHookDelayProcessor#_onNoResponseTimer: error starting action');

View File

@@ -311,6 +311,11 @@
"ConnectFailure": "deepgram_tts_streaming::connect_failed",
"Connect": "deepgram_tts_streaming::connect"
},
"GoogleTtsStreamingEvents": {
"Empty": "google_tts_streaming::empty",
"ConnectFailure": "google_tts_streaming::connect_failed",
"Connect": "google_tts_streaming::connect"
},
"CartesiaTtsStreamingEvents": {
"Empty": "cartesia_tts_streaming::empty",
"ConnectFailure": "cartesia_tts_streaming::connect_failed",

View File

@@ -1310,6 +1310,9 @@ module.exports = (logger) => {
...(openaiOptions.turn_detection.silence_duration_ms && {
OPENAI_TURN_DETECTION_SILENCE_DURATION_MS: openaiOptions.turn_detection.silence_duration_ms
}),
...(openaiOptions.turn_detection.eagerness && {
OPENAI_TURN_DETECTION_EAGERNESS: openaiOptions.turn_detection.eagerness
})
};
}
}
@@ -1375,7 +1378,9 @@ module.exports = (logger) => {
speechmaticsOptions.transcription_config.audio_filtering_config.volume_threshold}),
...(speechmaticsOptions.transcription_config?.transcript_filtering_config?.remove_disfluencies &&
{SPEECHMATICS_REMOVE_DISFLUENCIES:
speechmaticsOptions.transcription_config.transcript_filtering_config.remove_disfluencies})
speechmaticsOptions.transcription_config.transcript_filtering_config.remove_disfluencies}),
SPEECHMATICS_END_OF_UTTERANCE_SILENCE_TRIGGER:
speechmaticsOptions.transcription_config?.conversation_config?.end_of_utterance_silence_trigger || 0.5
};
}
else if (vendor.startsWith('custom:')) {

View File

@@ -421,6 +421,7 @@ class TtsStreamingBuffer extends Emitter {
'cartesia',
'elevenlabs',
'rimelabs',
'google',
'custom'
].forEach((vendor) => {
const eventClassName = `${vendor.charAt(0).toUpperCase() + vendor.slice(1)}TtsStreamingEvents`;

1973
package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@@ -31,10 +31,10 @@
"@jambonz/http-health-check": "^0.0.1",
"@jambonz/mw-registrar": "^0.2.7",
"@jambonz/realtimedb-helpers": "^0.8.15",
"@jambonz/speech-utils": "^0.2.26",
"@jambonz/speech-utils": "^0.2.30",
"@jambonz/stats-collector": "^0.1.10",
"@jambonz/time-series": "^0.2.15",
"@jambonz/verb-specifications": "^0.0.123",
"@jambonz/verb-specifications": "^0.0.125",
"@modelcontextprotocol/sdk": "^1.9.0",
"@opentelemetry/api": "^1.8.0",
"@opentelemetry/exporter-jaeger": "^1.23.0",
@@ -49,7 +49,7 @@
"debug": "^4.3.4",
"deepcopy": "^2.1.0",
"drachtio-fsmrf": "^4.1.2",
"drachtio-srf": "^5.0.14",
"drachtio-srf": "^5.0.18",
"express": "^4.19.2",
"express-validator": "^7.0.1",
"moment": "^2.30.1",