Abstract
While question answering (QA) systems have achieved major breakthroughs in reading comprehension (RC) tasks, spoken question answering (SQA) remains a much less investigated area. Previous work shows that existing SQA systems are limited by the catastrophic impact of automatic speech recognition (ASR) errors [1] and the lack of large-scale real SQA datasets [2]. In this paper, we propose using contextualized word representations to mitigate the effects of ASR errors, and pre-training on existing textual QA datasets to mitigate the data scarcity issue. Contextualized word representations achieve new state-of-the-art results on both the artificially synthesised and the real SQA benchmark datasets, with an improvement of 21.5 EM/18.96 F1 over the sub-word unit based baseline on the Spoken-SQuAD data [1] and 13.11 EM/10.99 F1 on the ODSQA data [2]. By further fine-tuning pre-trained models with existing large-scale textual QA data, we obtained an improvement of 38.12 EM/34.1 F1 over the baseline fine-tuned only on the small-sized real SQA data.
| Original language | English |
|---|---|
| Title of host publication | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 8004-8008 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781509066315 |
| Publication status | Published - May 2020 |
| Event | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain; Duration: 4 May 2020 → 8 May 2020 |
Publication series
| Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
|---|---|
| Volume | 2020-May |
| ISSN (Print) | 1520-6149 |
Conference
| Conference | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 |
|---|---|
| Country/Territory | Spain |
| City | Barcelona |
| Period | 4/05/20 → 8/05/20 |
Bibliographical note
Publisher Copyright: © 2020 IEEE.
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 4 Quality Education
Keywords
- BERT
- contextual word representation
- spoken question answering
- supervised pre-training
Fingerprint
Dive into the research topics of 'Improving Spoken Question Answering Using Contextualized Word Representation'. Together they form a unique fingerprint.