Detecting Sexism in German Online Newspaper Comments with Open-Source Text Embeddings (Team GDA, GermEval2024 Shared Task 1: GerMS-Detect, Subtasks 1 and 2, Closed Track)

Sexism in online media comments is a pervasive challenge that often manifests subtly, complicating moderation efforts because interpretations of what constitutes sexism can vary among individuals. We study monolingual and multilingual open-source text embeddings to reliably detect sexism and misogyny in German-language online comments from an Austrian newspaper. We observed that classifiers trained on text embeddings closely mimic the individual judgements of human annotators. Our method showed robust performance in the GermEval 2024 GerMS-Detect Subtask 1 challenge, achieving an average macro F1 score of 0.597 (4th place, as reported on Codabench). It also accurately predicted the distribution of human annotations in GerMS-Detect Subtask 2, with an average Jensen-Shannon distance of 0.301 (2nd place). The computational efficiency of our approach suggests potential for scalable applications across various languages and linguistic contexts.
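The Jensen-Shannon distance named in the abstract compares a predicted label distribution with the distribution of human annotations. The following is a minimal sketch of that metric, not the shared task's official scoring code: the function name, the base-2 logarithm, and the five-way example distributions are assumptions made for illustration.

```python
import math


def jensen_shannon_distance(p, q, base=2.0):
    """Jensen-Shannon distance between two discrete distributions.

    Assumption: base-2 logarithm, which bounds the distance to [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")

    # Normalize so that each input sums to 1.
    sum_p, sum_q = sum(p), sum(q)
    p = [x / sum_p for x in p]
    q = [x / sum_q for x in q]

    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a zero weight contribute 0.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # The distance is the square root of the divergence.
    return math.sqrt(max(divergence, 0.0))


# Hypothetical example: share of annotators per label vs. a model prediction.
annotated = [0.6, 0.2, 0.1, 0.1, 0.0]
predicted = [0.5, 0.3, 0.1, 0.05, 0.05]
print(jensen_shannon_distance(annotated, predicted))
```

A value of 0 means the two distributions are identical, and with base 2 a value of 1 means they do not overlap at all. SciPy offers the same metric as scipy.spatial.distance.jensenshannon, which uses the natural logarithm unless a base is passed.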

Metadata
Author:Florian Bremm, Patrick Gustav Blaneck, Tobias Bornheim, Niklas Grieger, Stephan Bialonski
DOI:https://doi.org/10.48550/arXiv.2403.08592
Parent Title (German):Proceedings of GermEval 2024 Task 1 GerMS-Detect Workshop on Sexism Detection in German Online News Fora (GerMS-Detect 2024)
Publisher:ACL
Place of publication:Kerrville
Document Type:Conference Proceeding
Language:English
Year of Completion:2024
Date of the Publication (Server):2024/09/20
First Page:33
Last Page:38
Note:GermEval 2024 Task 1 GerMS-Detect Workshop on Sexism Detection in German Online News Fora (GerMS-Detect 2024), September 2024, Vienna, Austria
Peer Review:Yes
Link:https://aclanthology.org/2024.germeval-2.5
Access type:worldwide
Institutes:FH Aachen / Fachbereich Medizintechnik und Technomathematik
open_access (DINI-Set):open_access
Open Access / Gold
Licence (German): Creative Commons - Attribution