January 2023 -- December 2026
Dr. Agnieszka Faleńska
The immense influence of NLP systems on human lives raises increasing concerns about the possible harm these tools can cause. Harmful behaviors of such systems are regarded as symptoms of their bias, i.e., the systematic preference or discrimination against certain groups of users. NLP tools are commonly trained on textual corpora that display such biases already at the level of their authors. For example, Wikipedia, which is one of the most commonly used sources of training data, is created by a predominantly white and male group of editors. Such a lack of diversity among authors can lessen the impact of data from minorities and, as a consequence, result in NLP models that reflect the underlying demographic imbalances. DANIS contributes to the discourse of fairness in AI by facilitating the design of NLP intelligent systems that can recognize inputs from underrepresented groups of users and strengthen their role in the training processes.