Improving Spoken Question Answering Using Contextualized Word Representation

Dan Su, Pascale Fung

Research output: Chapter in Book/Conference Proceeding/Report; Conference Paper published in a book; peer-review

Abstract

While question answering (QA) systems have achieved major breakthroughs in reading comprehension (RC) tasks, spoken question answering (SQA) remains a much less investigated area. Previous work shows that existing SQA systems are limited by the catastrophic impact of automatic speech recognition (ASR) errors [1] and by the lack of large-scale real SQA datasets [2]. In this paper, we propose using contextualized word representations to mitigate the effects of ASR errors, and pre-training on existing textual QA datasets to mitigate the data scarcity issue. Using contextualized word representations, we achieve new state-of-the-art results on both the artificially synthesised and the real SQA benchmark datasets, with improvements of 21.5 EM/18.96 F1 over the sub-word-unit-based baseline on the Spoken-SQuAD data [1] and 13.11 EM/10.99 F1 on the ODSQA data [2]. By additionally fine-tuning the pre-trained models on existing large-scale textual QA data, we obtained an improvement of 38.12 EM/34.1 F1 over a baseline fine-tuned only on the small real SQA data.
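The EM (exact match) and F1 figures quoted above are the usual extractive-QA metrics. The following is a minimal sketch of how such scores are commonly computed per answer, assuming SQuAD-style normalization (lowercasing, stripping punctuation and English articles). The function names `normalize`, `exact_match`, and `f1_score` are illustrative, and the paper's actual evaluation script is not shown in this record, so details may differ.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip punctuation and English articles, split on whitespace.

    Assumption: SQuAD-style normalization; not taken from the paper itself.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction, reference):
    """Return 1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer span."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # An ASR-style partial match: extra tokens lower precision, not recall.
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(f1_score("eiffel tower in paris", "eiffel tower"))
```

Corpus-level EM and F1 are then typically reported as the mean of these per-question scores, multiplied by 100.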

Original language: English
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8004-8008
Number of pages: 5
ISBN (Electronic): 9781509066315
Publication status: Published - May 2020
Event: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain
Duration: 4 May 2020 to 8 May 2020

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2020-May
ISSN (Print): 1520-6149

Conference

Conference: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
Country/Territory: Spain
City: Barcelona
Period: 4/05/20 to 8/05/20

Bibliographical note

Publisher Copyright:
© 2020 IEEE.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 4 - Quality Education

Keywords

  • BERT
  • contextual word representation
  • spoken question answering
  • supervised pre-training
