dc.contributor.author: Montenegro, C.
dc.contributor.author: Santana, R.
dc.contributor.author: Lozano, J.A.
dc.description.abstract: An End-Of-Turn Detection Module (EOTD-M) is an essential component of automatic Spoken Dialogue Systems. The capability of correctly detecting whether a user’s utterance has ended or not improves the accuracy in interpreting the meaning of the message and decreases the latency in the answer. Usually, in dialogue systems, an EOTD-M is coupled with an Automatic Speech Recognition Module (ASR-M) to transmit complete utterances to the Natural Language Understanding unit. Mistakes in the ASR-M transcription can have a strong effect on the performance of the EOTD-M. The actual extent of this effect depends on the particular combination of ASR-M transcription errors and the sentence featurization techniques implemented as part of the EOTD-M. In this paper we investigate this important relationship for an EOTD-M based on semantic information and particular characteristics of the speakers (speech profiles). We introduce an Automatic Speech Recognition Simulator (ASR-SIM) that models different types of semantic mistakes in the ASR-M transcription as well as different speech profiles. We use the simulator to evaluate the sensitivity to ASR-M mistakes of a Long Short-Term Memory network classifier trained in EOTD with different featurization techniques. Our experiments reveal the different ways in which the performance of the model is influenced by the ASR-M errors. We corroborate that not only is the ASR-SIM useful to estimate the performance of an EOTD-M in customized noisy scenarios, but it can also be used to generate training datasets with the expected error rates of real working conditions, which leads to better performance. [en_US]
dc.description.sponsorship: EMPATHIC IT1244-19 TIN2016-78365-R PID2019-104966GB-I00. [en_US]
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 Spain [en_US]
dc.subject: Spoken Dialogue Systems [en_US]
dc.subject: Automatic speech recognition [en_US]
dc.subject: End of turn detection [en_US]
dc.subject: Natural language processing [en_US]
dc.subject: Neural networks [en_US]
dc.title: Analysis of the sensitivity of the End-Of-Turn Detection task to errors generated by the Automatic Speech Recognition process. [en_US]
dc.journal.title: Engineering Applications of Artificial Intelligence [en_US]


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 Spain