Dialogic Voice Cards - CSP ec_stream does not return: use ec_rearm with caution

Symptom

A Dialogic JCT-Series Media Board Continuous Speech Processing (CSP) streaming resource dies unexpectedly. “ec_stream” remains in a stuck state and never returns a completion event (TEC_STREAM).

Reason for the issue

This issue may occur if an application calls “ec_rearm” on a CSP resource that is NOT streaming. The next call to ec_stream on that CSP resource will put the resource into a stuck state and never return a completion event (TEC_STREAM).

Solution

The solution for this issue depends on the type of Dialogic product you are using.

For JCT-series products

  1. The application should only call “ec_rearm” on a CSP resource that is streaming.

  2. The state of the CSP resource should be maintained within the host application. The state of the CSP resource cannot be obtained by ATDX_STATE() as this function call only returns the state of the voice resource and NOT the state of the CSP resource.

  3. The termination condition for ec_stream (max silence) should not be too short. The user should take into consideration the delta between when NON-SPEECH ends and SPEECH begins. This silence period could trigger the ec_stream termination condition for max silence (see Discussion below for details).

  4. Should the CSP resource fall into a “stuck state”, the Dialogic drivers will need to be restarted in order to recover.
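Points 1 and 2 above can be sketched as a small host-side guard. This is only an illustrative sketch, not Dialogic code: the names csp_channel_t, csp_on_stream_started, csp_on_stream_done, csp_safe_rearm and demo_rearm are hypothetical, and the rearm call is passed in as a function pointer on the assumption that the real call takes a channel device handle and returns 0 on success or -1 on failure. In a real application the pointer would wrap “ec_rearm”, csp_on_stream_started would be called once ec_stream has been started successfully, and csp_on_stream_done would be called when TEC_STREAM is received.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side view of one CSP resource (hypothetical type, not a Dialogic type). */
typedef enum { CSP_IDLE, CSP_STREAMING } csp_state_t;

typedef struct {
    int         chdev;  /* channel device handle owned by the application */
    csp_state_t state;  /* streaming state tracked by the application itself */
} csp_channel_t;

/* Assumed shape of the rearm call: handle in, 0 on success, -1 on failure. */
typedef int (*rearm_fn)(int chdev);

/* Returned by csp_safe_rearm() when the rearm request was not issued. */
#define CSP_REARM_SKIPPED 1

/* Call after streaming has been started successfully on this channel. */
void csp_on_stream_started(csp_channel_t *ch)
{
    assert(ch != NULL);
    ch->state = CSP_STREAMING;
}

/* Call when the streaming completion event (TEC_STREAM) is received. */
void csp_on_stream_done(csp_channel_t *ch)
{
    assert(ch != NULL);
    ch->state = CSP_IDLE;
}

/*
 * Issue the rearm only while the tracked state says the resource is streaming.
 * Returns the rearm call's own result, or CSP_REARM_SKIPPED if it was not called.
 */
int csp_safe_rearm(csp_channel_t *ch, rearm_fn rearm)
{
    assert(ch != NULL && rearm != NULL);
    if (ch->state != CSP_STREAMING) {
        return CSP_REARM_SKIPPED;
    }
    return rearm(ch->chdev);
}

/* Stand-in for the real rearm call so the sketch builds without the board SDK. */
int demo_rearm_calls = 0;

int demo_rearm(int chdev)
{
    (void)chdev;
    demo_rearm_calls++;
    return 0;
}
```

The guard is only as reliable as the state updates: the check in csp_safe_rearm and the TEC_STREAM handling should run on the same thread, or be serialized by the application, so that a completion event cannot be processed between the check and the rearm call.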

Discussion

The “ec_rearm( )” function is intended to be used with VAD enabled and barge-in disabled. “ec_rearm( )” temporarily stops streaming of echo-cancelled data from the board and rearms/re-enables the “Voice Activity Detector” (VAD). The playing voice prompt is not affected by this function.

If a VAD event (TEC_VAD) is received because of loud background noise or a cough (that is, non-speech) and the host Application Speech Recognizer (ASR) determines that the energy was non-speech, the application can call “ec_rearm” to re-activate the VAD for the next burst of energy, as illustrated below in Figure 1 (Rearming the VAD).

Figure 1 - Rearming the Voice Activity Detector (VAD)

Figure 1 (Rearming the VAD) shows the general use case of “ec_rearm”. In this case, “ec_rearm” is called while the CSP resource is streaming (blue region). The driver internally stops streaming temporarily and then restarts it. During this temporary stop, the application does not receive, and should not expect, a TEC_STREAM event.

However, if the ec_stream termination condition for silence is too short, a race condition is possible between the moment the ASR determines that a VAD trigger was false and the moment the application issues “ec_rearm”, as shown below in Figure 2 (Rearming the VAD caution).

Figure 2 - Rearming the Voice Activity Detector (VAD) caution

In Figure 2 (Rearming the VAD caution), “ec_rearm” is called after streaming on the CSP resource has stopped (red region). In this example, the ec_stream termination condition was reached (tpt max_sil=1000), causing the driver to stop streaming on the CSP resource and return a TEC_STREAM event to the application. Unfortunately, within the same millisecond that the application received the TEC_STREAM event, “ec_rearm” was issued on the CSP resource, which had already stopped streaming.

Product list

  • Dialogic JCT-series Media Board

GLOSSARY OF ACRONYMS / TERMS

  • ASR – Application/Automatic Speech Recognizer

  • VAD – Voice Activity Detector

  • CSP – Continuous Speech Processing