Invited virtual talk for the November Wikimedia Research Showcase (Upcoming).
Description: There are over 7,000 languages in the world, yet only a few hundred have been studied and represented in the field of Natural Language Processing (NLP). As the NLP community increasingly includes the majority of the world’s languages, the term “low-resource” is used to describe languages that have been understudied. However, there is little consensus in the field about what qualifies a language as low-resource. In this talk, we will explore the different aspects of resourcefulness that the NLP literature has used to label a language as low-resource. We will also explore how those aspects interact to shape the realities of languages and their representation in research. Finally, we will discuss how we, as the NLP community, can engage with language speakers to build language technologies that are useful to them.