Show simple item record

dc.contributor.author	Montenegro, C.
dc.contributor.author	Santana, R.
dc.contributor.author	Lozano, J.A.
dc.date.accessioned	2021-02-05T17:58:12Z
dc.date.available	2021-02-05T17:58:12Z
dc.date.issued	2021
dc.identifier.issn	0952-1976
dc.identifier.uri	http://hdl.handle.net/20.500.11824/1249
dc.description.abstract	An End-Of-Turn Detection Module (EOTD-M) is an essential component of automatic Spoken Dialogue Systems. The capability of correctly detecting whether a user’s utterance has ended or not improves the accuracy in interpreting the meaning of the message and decreases the latency in the answer. Usually, in dialogue systems, an EOTD-M is coupled with an Automatic Speech Recognition Module (ASR-M) to transmit complete utterances to the Natural Language Understanding unit. Mistakes in the ASR-M transcription can have a strong effect on the performance of the EOTD-M. The actual extent of this effect depends on the particular combination of ASR-M transcription errors and the sentence featurization techniques implemented as part of the EOTD-M. In this paper we investigate this important relationship for an EOTD-M based on semantic information and particular characteristics of the speakers (speech profiles). We introduce an Automatic Speech Recognition Simulator (ASR-SIM) that models different types of semantic mistakes in the ASR-M transcription as well as different speech profiles. We use the simulator to evaluate the sensitivity to ASR-M mistakes of a Long Short-Term Memory network classifier trained in EOTD with different featurization techniques. Our experiments reveal the different ways in which the performance of the model is influenced by the ASR-M errors. We corroborate that not only is the ASR-SIM useful to estimate the performance of an EOTD-M in customized noisy scenarios, but it can also be used to generate training datasets with the expected error rates of real working conditions, which leads to better performance.	en_US
dc.description.sponsorship	EMPATHIC; IT1244-19; TIN2016-78365-R; PID2019-104966GB-I00	en_US
dc.format	application/pdf	en_US
dc.language.iso	eng	en_US
dc.rights	Reconocimiento-NoComercial-CompartirIgual 3.0 España	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/es/	en_US
dc.subject	Spoken Dialogue Systems	en_US
dc.subject	Automatic speech recognition	en_US
dc.subject	End of turn detection	en_US
dc.subject	Natural language processing	en_US
dc.subject	Neural networks	en_US
dc.title	Analysis of the sensitivity of the End-Of-Turn Detection task to errors generated by the Automatic Speech Recognition process	en_US
dc.type	info:eu-repo/semantics/article	en_US
dc.relation.projectID	ES/1PE/SEV-2017-0718	en_US
dc.relation.projectID	EUS/BERC/BERC.2018-2021	en_US
dc.relation.projectID	EUS/ELKARTEK	en_US
dc.rights.accessRights	info:eu-repo/semantics/openAccess	en_US
dc.type.hasVersion	info:eu-repo/semantics/acceptedVersion	en_US
dc.journal.title	Engineering Applications of Artificial Intelligence	en_US


