Compare commits

..

5 Commits

Author SHA1 Message Date
Dave Horton
c3e5ffa52d bugfix: transcribe of a dialed call can now occur on both legs 2022-05-15 13:45:55 -04:00
Dave Horton
0ee13fb794 minor docs 2022-05-13 21:34:35 -04:00
Dave Horton
4e84098036 Docs folder (#108)
* add how-to for developers

* fix links

* minor docs cleanup
2022-05-13 21:30:04 -04:00
Dave Horton
6d34850dc6 bugfix: transcribe Azure interim transcripts were missing 2022-05-11 19:22:14 -04:00
Dave Horton
76ff1835a6 background gather listen only once for vad and other interrupt events 2022-05-11 09:21:54 -04:00
5 changed files with 159 additions and 24 deletions

View File

@@ -2,6 +2,8 @@
This application implements the core feature server of the jambones platform.
> Note: If you are a developer looking to work on the code please read our [how-to for that](./docs/contributing.md).
## Configuration
Configuration is provided via environment variables:

121
docs/contributing.md Normal file
View File

@@ -0,0 +1,121 @@
# Contributors are welcome!
So, you want to hack on jambonz? Maybe add some features, maybe help fix some bugs? Awesome, welcome aboard!
This brief document should get you started. Here you will find instructions showing how to set up your laptop to run the regression test suite (which you should always run before committing any changes), as well as some basic info on the structure of the code.
## Getting oriented
First of all, you are in the right place to begin hacking on jambonz. The jambonz-feature-server app is kinda the center of the universe for jambonz. Most of the core logic in jambonz is implemented here: things like the [webhook verbs](../lib/tasks), [session management](../lib/session), and the [client-side webhook implementation](../lib/utils/http-requestor.js). A common thing you might want to do, for instance, is to add support for an all-new verb, and this code base is where would do that.
This jambonz-feature-server app works together quite closely with a [drachtio server](https://github.com/drachtio/drachtio-server) and a Freeswitch. In fact, these three components are bundled together into a single VM/instance (or a Deployment, in Kubernetes) that we more generally refer to as "Feature Server". The Feature Server is a horizontally-scalable unit that is deployed behind the public-facing SBC elements of a jambonz cluster (the SBC is itself a separately scalable unit). The drachtio-server handles the SIP signaling, the Freeswitch handles media operations and speech vendor integration, and the jambonz-feature-server app orchestrates all of it via the use of [drachtio-srf](https://github.com/drachtio/drachtio-srf) and [drachtio-fsmrf](https://github.com/drachtio/drachtio-fsmrf).
## How to do things
First of all, please join our [slack channel](https://joinslack.jambonz.org) in order to coordinate with us on the work, i.e. to notify us of what you are doing and make sure that no one else is already working on the same thing.
To prepare to make changes, please fork the repo to your own Github account, make changes, test them on your own running jambonz cluster, then run the regression test suite and lint check before giving us a PR.
### lint
We have some opinionated conventions that you must follow - see our [eslintrc.json](../.eslintrc.json) for details. Make sure your code passes by running:
```bash
npm run jslint
```
### test suite
#### Generate speech credentials and create run-tests.sh
The test suite also requires you to provide speech credentials for both GCP and AWS. You will want to create a new file named `run-tests.sh` in the project folder. Make the file executable and then copy in the text below, substituting your speech credentials where indicated:
```bash
#!/bin/bash
GCP_JSON_KEY='{"type":"service_account","project_id":"...etc"}' \
AWS_ACCESS_KEY_ID='your-aws-access-key-id' \
AWS_SECRET_ACCESS_KEY='your-aws-secret-access-key' \
AWS_REGION='us-east-1' \
JWT_SECRET='foobar' \
npm test
```
>> Note: The project's .gitignore file prevents this file from being sent to Github, so you do not need to worry about exposing your credentials. Just make sure you name if run-tests.sh and create it in the project folder
The GCP credential is the JSON service key in stringified format.
#### Install Docker
The test suite ralso equires [Docker](https://www.docker.com/) and docker-compose to be installed on your laptop. Docker is used to set up a network with all of the elements required to test the jambonz-feature-server in a black-box type of fashion.
Once you have docker installed, you can optionally make sure everything Docker-wise is working properly by running this command from the project folder:
```bash
docker-compose -f test/docker-compose-testbed.yaml up -d
```
This may take several minutes to complete, mainly because the mysql schema needs to be installed and seeded, but if successful the output should look like this:
```bash
$ docker-compose -f test/docker-compose-testbed.yaml up -d
Creating network "test_fs" with driver "bridge"
Creating test_webhook-transcribe_1 ... done
Creating test_webhook-decline_1 ... done
Creating test_mysql_1 ... done
Creating test_docker-host_1 ... done
Creating test_webhook-gather_1 ... done
Creating test_webhook-say_1 ... done
Creating test_freeswitch_1 ... done
Creating test_influxdb_1 ... done
Creating test_redis_1 ... done
Creating test_drachtio_1 ... done
```
At that point, you can run `docker ps` to see all of the containers running
```bash
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
abbc3594f390 drachtio/drachtio-server:latest "/entrypoint.sh drac…" About a minute ago Up About a minute 0.0.0.0:9060->9022/tcp test_drachtio_1
1f384a274f87 redis:5-alpine "docker-entrypoint.s…" 2 minutes ago Up 2 minutes 0.0.0.0:16379->6379/tcp test_redis_1
78d0bb6ec9b1 influxdb:1.8 "/entrypoint.sh infl…" 2 minutes ago Up 2 minutes 0.0.0.0:8086->8086/tcp test_influxdb_1
9616ff790709 jambonz/webhook-test-scaffold:latest "/entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:3102->3000/tcp test_webhook-gather_1
7323ab273ff4 drachtio/drachtio-freeswitch-mrf:v1.10.1-full "/entrypoint.sh free…" 2 minutes ago Up 2 minutes (healthy) 0.0.0.0:8022->8021/tcp test_freeswitch_1
e45e7d28dbc7 mysql:5.7 "docker-entrypoint.s…" 2 minutes ago Up 2 minutes (healthy) 33060/tcp, 0.0.0.0:3360->3306/tcp test_mysql_1
b626e5f3067e qoomon/docker-host "/entrypoint.sh" 2 minutes ago Up 2 minutes test_docker-host_1
b0a94b5e8941 jambonz/webhook-test-scaffold:latest "/entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:3101->3000/tcp test_webhook-say_1
f80adda48eb5 jambonz/webhook-test-scaffold:latest "/entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:3103->3000/tcp test_webhook-transcribe_1
223db4a9c670 jambonz/webhook-test-scaffold:latest "/entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:3100->3000/tcp test_webhook-decline_1
```
#### Run the regression test suite
At this point you should be able to run the tests:
```bash
./run-tests.sh
```
If the docker network has not been started (as described above) it will start now, and this will take a minute or two. Otherwise, the test suite will start running immediately.
In evaluating the test results, be advised that the output is fairly verbose, and also in the process of shutting down once the tests are complete you will see a bunch of errors from redis (`@jambonz/realtimedb-helpers - redis error`). You can ignore these errors, they are just spit out by jambonz-feature-server as the test environment is torn down and it tries and fails to reconnect to redis.
The final output will indicate the number of tests run and passed:
```bash
1..28
# tests 28
# pass 28
# ok
```
#### Adding your own tests
Running a successful regression test means you haven't broken anything - Great!
It doesn't, of course, mean that your shiny new feature or bugfix works. Adding a new test case to the suite is (unfortunately) non-trivial. We will add more documentation in the future with a how-to guide on that, but be advised it does require knowledge of the SIP protocol and the [SIPp](http://sipp.sourceforge.net/doc/reference.html) tool.
For now, if you are unable to add tests to the regression suite, please do test your feature as thoroughly as you can on your own jambonz cluster before giving us a pull request.

View File

@@ -239,10 +239,10 @@ class CallSession extends Emitter {
const t = normalizeJambones(this.logger, [gather]);
this.backgroundGatherTask = makeTask(this.logger, t[0]);
this.backgroundGatherTask
.on('dtmf', this._clearTasks.bind(this))
.on('vad', this._clearTasks.bind(this))
.on('transcription', this._clearTasks.bind(this))
.on('timeout', this._clearTasks.bind(this));
.once('dtmf', this._clearTasks.bind(this))
.once('vad', this._clearTasks.bind(this))
.once('transcription', this._clearTasks.bind(this))
.once('timeout', this._clearTasks.bind(this));
this.logger.info({gather}, 'CallSession:enableBotMode - starting background gather');
const resources = await this._evaluatePreconditions(this.backgroundGatherTask);
const {span, ctx} = this.rootSpan.startChildSpan(`background-gather:${this.backgroundGatherTask.summary}`);

View File

@@ -606,7 +606,7 @@ class TaskDial extends Task {
if (this.parentDtmfCollector) this._installDtmfDetection(cs, cs.dlg);
if (this.childDtmfCollector) this._installDtmfDetection(cs, this.dlg);
if (this.transcribeTask) this.transcribeTask.exec(cs, this.epOther);
if (this.transcribeTask) this.transcribeTask.exec(cs, this.epOther, this.ep);
if (this.listenTask) this.listenTask.exec(cs, this.epOther);
/* if we can release the media back to the SBC, do so now */

View File

@@ -58,11 +58,12 @@ class TaskTranscribe extends Task {
get name() { return TaskName.Transcribe; }
async exec(cs, ep, parentTask) {
async exec(cs, ep, ep2) {
super.exec(cs);
const {updateSpeechCredentialLastUsed} = require('../utils/db-utils')(this.logger, cs.srf);
this.ep = ep;
this.ep2 = ep2;
if ('default' === this.vendor || !this.vendor) this.vendor = cs.speechRecognizerVendor;
if ('default' === this.language || !this.language) this.language = cs.speechRecognizerLanguage;
this.sttCredentials = cs.getSpeechCredentials(this.vendor, 'stt');
@@ -78,7 +79,9 @@ class TaskTranscribe extends Task {
}).catch((err) => this.logger.info({err}, 'Error generating alert for no stt'));
throw new Error('no provisioned speech credentials for TTS');
}
await this._startTranscribing(cs, ep);
await this._startTranscribing(cs, ep, 1);
if (this.separateRecognitionPerChannel && ep2) await this._startTranscribing(cs, ep2, 2);
updateSpeechCredentialLastUsed(this.sttCredentials.speech_credential_sid)
.catch(() => {/*already logged error */});
@@ -106,11 +109,15 @@ class TaskTranscribe extends Task {
// hangup after 1 sec if we don't get a final transcription
this._timer = setTimeout(() => this.notifyTaskDone(), 1000);
}
if (this.separateRecognitionPerChannel && this.ep2 && this.ep2.connected) {
this.ep2.stopTranscription({vendor: this.vendor})
.catch((err) => this.logger.info(err, 'Error TaskTranscribe:kill'));
}
else this.notifyTaskDone();
await this.awaitTaskDone();
}
async _startTranscribing(cs, ep) {
async _startTranscribing(cs, ep, channel) {
const opts = {};
if (this.vad.enable) {
@@ -119,22 +126,24 @@ class TaskTranscribe extends Task {
if (this.vad.mode >= 0 && this.vad.mode <= 3) opts.RECOGNIZER_VAD_MODE = this.vad.mode;
}
ep.addCustomEventListener(GoogleTranscriptionEvents.Transcription, this._onTranscription.bind(this, cs, ep));
ep.addCustomEventListener(GoogleTranscriptionEvents.NoAudioDetected, this._onNoAudio.bind(this, cs, ep));
ep.addCustomEventListener(GoogleTranscriptionEvents.Transcription,
this._onTranscription.bind(this, cs, ep, channel));
ep.addCustomEventListener(GoogleTranscriptionEvents.NoAudioDetected, this._onNoAudio.bind(this, cs, ep, channel));
ep.addCustomEventListener(GoogleTranscriptionEvents.MaxDurationExceeded,
this._onMaxDurationExceeded.bind(this, cs, ep));
ep.addCustomEventListener(AwsTranscriptionEvents.Transcription, this._onTranscription.bind(this, cs, ep));
ep.addCustomEventListener(AwsTranscriptionEvents.NoAudioDetected, this._onNoAudio.bind(this, cs, ep));
this._onMaxDurationExceeded.bind(this, cs, ep, channel));
ep.addCustomEventListener(AwsTranscriptionEvents.Transcription, this._onTranscription.bind(this, cs, ep, channel));
ep.addCustomEventListener(AwsTranscriptionEvents.NoAudioDetected, this._onNoAudio.bind(this, cs, ep, channel));
ep.addCustomEventListener(AwsTranscriptionEvents.MaxDurationExceeded,
this._onMaxDurationExceeded.bind(this, cs, ep));
ep.addCustomEventListener(AzureTranscriptionEvents.Transcription, this._onTranscription.bind(this, cs, ep));
ep.addCustomEventListener(AzureTranscriptionEvents.NoSpeechDetected, this._onNoAudio.bind(this, cs, ep));
this._onMaxDurationExceeded.bind(this, cs, ep, channel));
ep.addCustomEventListener(AzureTranscriptionEvents.Transcription,
this._onTranscription.bind(this, cs, ep, channel));
ep.addCustomEventListener(AzureTranscriptionEvents.NoSpeechDetected, this._onNoAudio.bind(this, cs, ep, channel));
if (this.vendor === 'google') {
if (this.sttCredentials) opts.GOOGLE_APPLICATION_CREDENTIALS = JSON.stringify(this.sttCredentials.credentials);
[
['enhancedModel', 'GOOGLE_SPEECH_USE_ENHANCED'],
['separateRecognitionPerChannel', 'GOOGLE_SPEECH_SEPARATE_RECOGNITION_PER_CHANNEL'],
//['separateRecognitionPerChannel', 'GOOGLE_SPEECH_SEPARATE_RECOGNITION_PER_CHANNEL'],
['profanityFilter', 'GOOGLE_SPEECH_PROFANITY_FILTER'],
['punctuation', 'GOOGLE_SPEECH_ENABLE_AUTOMATIC_PUNCTUATION'],
['words', 'GOOGLE_SPEECH_ENABLE_WORD_TIME_OFFSETS'],
@@ -222,12 +231,12 @@ class TaskTranscribe extends Task {
vendor: this.vendor,
interim: this.interim ? true : false,
locale: this.language,
channels: this.separateRecognitionPerChannel ? 2 : 1
channels: /*this.separateRecognitionPerChannel ? 2 : */ 1
});
}
_onTranscription(cs, ep, evt) {
this.logger.debug({evt}, 'TaskTranscribe:_onTranscription');
_onTranscription(cs, ep, channel, evt) {
this.logger.debug({evt, channel}, 'TaskTranscribe:_onTranscription');
if ('aws' === this.vendor && Array.isArray(evt) && evt.length > 0) evt = evt[0];
if ('microsoft' === this.vendor) {
const nbest = evt.NBest;
@@ -246,6 +255,7 @@ class TaskTranscribe extends Task {
const newEvent = {
is_final: evt.RecognitionStatus === 'Success',
channel,
language_code,
alternatives
};
@@ -257,6 +267,8 @@ class TaskTranscribe extends Task {
return this._transcribe(ep);
}
evt.channel_tag = channel;
if (this.transcriptionHook) {
const b3 = this.getTracingPropagation();
const httpHeaders = b3 && {b3};
@@ -274,13 +286,13 @@ class TaskTranscribe extends Task {
}
}
_onNoAudio(cs, ep) {
this.logger.debug('TaskTranscribe:_onNoAudio restarting transcription');
_onNoAudio(cs, ep, channel) {
this.logger.debug(`TaskTranscribe:_onNoAudio restarting transcription on channel ${channel}`);
this._transcribe(ep);
}
_onMaxDurationExceeded(cs, ep) {
this.logger.debug('TaskTranscribe:_onMaxDurationExceeded restarting transcription');
_onMaxDurationExceeded(cs, ep, channel) {
this.logger.debug(`TaskTranscribe:_onMaxDurationExceeded restarting transcription on channel ${channel}`);
this._transcribe(ep);
}