Abstract
Online misogyny is a pernicious social problem that risks making online platforms toxic and unwelcoming to women. We present a new hierarchical taxonomy for online misogyny, as well as an expert-labelled dataset to enable automatic classification of misogynistic content. The dataset consists of 6,567 labels for Reddit posts and comments. As previous research has found that untrained crowdsourced annotators struggle to identify misogyny, we hired and trained annotators and provided them with robust annotation guidelines. We report baseline classification performance on the binary classification task, achieving an accuracy of 0.93 and an F1 score of 0.43. The codebook and datasets are made freely available for future researchers.
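
The gap between the reported accuracy and F1 is typical of a binary task where the positive class is rare. The sketch below illustrates this with confusion-matrix counts that are purely hypothetical: they are not taken from the paper or its dataset, and were chosen only so that the resulting scores land near the reported figures. The function names are likewise illustrative.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all items that were labelled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# ASSUMPTION: hypothetical counts for 1,000 items, 70 of them positive.
# These numbers are NOT from the paper; they only illustrate the effect.
classifier = {"tp": 25, "fp": 20, "fn": 45, "tn": 910}

# A trivial baseline that labels every item as negative.
always_negative = {"tp": 0, "fp": 0, "fn": 70, "tn": 930}

for name, c in [("classifier", classifier), ("always negative", always_negative)]:
    acc = accuracy(c["tp"], c["fp"], c["fn"], c["tn"])
    f1 = f1_score(c["tp"], c["fp"], c["fn"])
    print(f"{name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

Under these assumed counts, a model that never predicts the positive class still scores high on accuracy while its F1 is zero, which is why F1 on the positive class is the more informative of the two figures for a task like this.
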
| Original language | English |
|---|---|
| Title of host publication | EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 1336-1350 |
| Number of pages | 15 |
| ISBN (Electronic) | 9781954085022 |
| Publication status | Published - 2021 |
| Externally published | Yes |
| Event | 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Virtual, Online. Duration: 19 Apr 2021 → 23 Apr 2021 |
Publication series
| Name | EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference |
|---|---|
Conference
| Conference | 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 |
|---|---|
| City | Virtual, Online |
| Period | 19/04/21 → 23/04/21 |
Bibliographical note
Publisher Copyright: © 2021 Association for Computational Linguistics