Existing benchmarks often fail to account for linguistic diversity, like language variants of English. In this paper, we share our experiences from our ongoing project of building a sentiment classification benchmark for three variants of English: …
This paper investigates data sampling strategies to create a benchmark for dialectal sentiment classification of Google Places reviews written in English. Based on location-based filtering, we collect a self-supervised dataset of reviews in …