mod_cobalt_transcribe

A FreeSWITCH module that generates real-time transcriptions on a FreeSWITCH channel using the streaming transcription API from Cobalt Speech. Cobalt Speech provides a speech recognition product that can run on-premises on a Linux server.

API

Commands

The module exposes the following API commands:

uuid_cobalt_get_version <uuid> <hostport>

Returns version information about the Cobalt server listening at the specified IP address and port.

uuid_cobalt_list_models <uuid> <hostport>

Lists the models available on a Cobalt speech server.

uuid_cobalt_compile_context <uuid> <hostport> <model> <token> <phrases>

Compiles a list of hint phrases into a context string that can later be used in a transcribe command. The context string is returned base64-encoded. Hints are compiled against a single model, so the model name is required. Hints must also be associated with a "token"; the default token that you may generally use is "unk:default". See here for more details.

uuid_cobalt_transcribe <uuid> <hostport> start <model> [interim|full] [stereo|mono] [bug-name]

Attaches a media bug to the channel and performs a streaming recognition request.

uuid_cobalt_transcribe <uuid> <hostport> stop <model>

Stops transcription on a channel.
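As an illustration of the argument order in the start and stop commands above, here is a minimal Python sketch that assembles the command strings. The function names, the host/port, and the model name used in the usage note are hypothetical, and the "interim" and "mono" defaults are choices made by this sketch, not documented defaults of the module.

```python
def transcribe_start_command(uuid, hostport, model,
                             results="interim", channels="mono",
                             bug_name=None):
    """Build a uuid_cobalt_transcribe start command string.

    The defaults for `results` and `channels` are assumptions of this
    sketch; the optional arguments are positional, so both are always
    emitted to keep bug_name in the right slot.
    """
    if results not in ("interim", "full"):
        raise ValueError("results must be 'interim' or 'full'")
    if channels not in ("stereo", "mono"):
        raise ValueError("channels must be 'stereo' or 'mono'")
    parts = ["uuid_cobalt_transcribe", uuid, hostport, "start",
             model, results, channels]
    if bug_name:
        parts.append(bug_name)
    return " ".join(parts)


def transcribe_stop_command(uuid, hostport, model):
    """Build the matching uuid_cobalt_transcribe stop command string."""
    return " ".join(["uuid_cobalt_transcribe", uuid, hostport, "stop", model])
```

For example, `transcribe_start_command(call_uuid, "127.0.0.1:2727", "en-US", "full", "stereo")` would produce a string that could be sent through whatever FreeSWITCH API client you already use; the address and model name here are placeholders.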

Channel Variables

| Variable | Description |
| --- | --- |
| COBALT_ENABLE_CONFUSION_NETWORK | If true, enable the confusion network |
| COBALT_METADATA | Custom metadata to send with a transcribe request |
| COBALT_COMPILED_CONTEXT_DATA | Base64-encoded compiled context hints to include with the transcribe request |
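One way to set these variables on a live channel is FreeSWITCH's standard `uuid_setvar` API command. The sketch below builds those command strings in Python and checks that the compiled context is valid base64 before using it. The helper name is hypothetical, and it assumes the variables should be set before the start command is issued, since they are described as being sent with the transcribe request.

```python
import base64
import binascii


def channel_variable_commands(uuid, compiled_context_b64=None,
                              metadata=None, confusion_network=False):
    """Return uuid_setvar command strings for the Cobalt channel variables.

    Hypothetical helper: it only builds strings and does not talk to
    FreeSWITCH itself.
    """
    commands = []
    if confusion_network:
        commands.append(
            f"uuid_setvar {uuid} COBALT_ENABLE_CONFUSION_NETWORK true")
    if metadata is not None:
        commands.append(f"uuid_setvar {uuid} COBALT_METADATA {metadata}")
    if compiled_context_b64 is not None:
        try:
            # validate=True rejects characters outside the base64 alphabet
            base64.b64decode(compiled_context_b64, validate=True)
        except binascii.Error as exc:
            raise ValueError("compiled context is not valid base64") from exc
        commands.append(
            f"uuid_setvar {uuid} COBALT_COMPILED_CONTEXT_DATA "
            f"{compiled_context_b64}")
    return commands
```

The validation step is a convenience of this sketch: it catches a truncated or mangled context string before it reaches the channel.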

Events

cobalt_speech::transcription - returns an interim or final transcription. The event contains a JSON body describing the transcription result.

cobalt_speech::version_response - returns the response to a uuid_cobalt_get_version request. The event contains a JSON body describing the version.

cobalt_speech::model_list_response - returns the response to a uuid_cobalt_list_models request. The event contains a JSON body describing the available models.

cobalt_speech::compile_context_response - returns the response to a uuid_cobalt_compile_context request. The event contains a JSON body containing the base64-encoded context.
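Since each of these events carries a JSON body, a consumer can route on the event name and decode the body. The Python sketch below shows that step only; how the event name and body are obtained depends on your event socket client and is assumed here, and the function name is hypothetical. The JSON schema of each body is not described above, so the decoded object is returned as-is.

```python
import json

# Event names listed in this section.
COBALT_EVENTS = {
    "cobalt_speech::transcription",
    "cobalt_speech::version_response",
    "cobalt_speech::model_list_response",
    "cobalt_speech::compile_context_response",
}


def parse_cobalt_event(event_name, body):
    """Return (kind, payload) for a Cobalt event, or None for other events.

    `kind` is the part of the event name after "cobalt_speech::";
    `payload` is the decoded JSON body.
    """
    if event_name not in COBALT_EVENTS:
        return None
    kind = event_name.split("::", 1)[1]
    return kind, json.loads(body)
```

A caller would typically branch on `kind`, for example storing the payload of a `compile_context_response` so it can later be supplied through COBALT_COMPILED_CONTEXT_DATA.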