SAGT: Computational Structural Analysis of German-Turkish Code-Switching
July 2018 - June 2021
- Short description
Code-switching, alternating between two or more languages in conversations, is commonly observed among bilinguals. The SAGT project aims to analyse Turkish-German code-switching (CS) from a computational perspective. The main goals are to develop representations and methods for analysis, to create resources and computational models based on them, and to provide input to other research areas via these analyses.
German Research Foundation (DFG)
- Long description
The SAGT project aims to analyse Turkish-German code-switching (CS) from a computational perspective. Previous research on Turkish-German CS has examined its sociolinguistic and linguistic aspects, but no computationally oriented research has yet enabled a systematic and comprehensive structural analysis.
The project's goals are to develop representations and methods for analysing Turkish-German code-switching, to create resources and computational models based on them, and to provide input to other research areas via these analyses.
The main tasks to achieve these goals are to build tools for core natural language processing (NLP) tasks - namely normalisation, language identification, POS tagging, morphological analysis, and parsing - and to test them on CS corpora created from conversations and social media. The project will employ data-driven approaches, which make it possible to apply these tools to other language pairs and to gain insights into computational approaches to CS in general. The structural analyses will also support linguistic research and NLP-oriented downstream tasks.
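To make one of these tasks concrete, the following is a minimal sketch of token-level language identification followed by switch-point detection. It is not the project's method: the word lists, the tag names (`TR`, `DE`, `OTHER`), and the functions `tag_languages` and `switch_points` are hypothetical and chosen only for illustration. A data-driven system would learn its labels from annotated CS corpora rather than use fixed lexicons.

```python
from typing import List

# Hypothetical toy lexicons (assumption); a real system would be trained on annotated data.
TURKISH_WORDS = {"ben", "bugün", "gidiyorum", "ve", "çok", "güzel"}
GERMAN_WORDS = {"ich", "heute", "gehe", "und", "sehr", "schön", "bahnhof"}


def tag_languages(tokens: List[str]) -> List[str]:
    """Assign a language tag to each token by lexicon lookup."""
    tags = []
    for token in tokens:
        word = token.lower()
        if word in TURKISH_WORDS:
            tags.append("TR")
        elif word in GERMAN_WORDS:
            tags.append("DE")
        else:
            # Punctuation, names, and unknown words fall through to OTHER.
            tags.append("OTHER")
    return tags


def switch_points(tags: List[str]) -> List[int]:
    """Return indices of tokens whose language differs from the previous TR/DE token."""
    points = []
    previous = None
    for index, tag in enumerate(tags):
        if tag == "OTHER":
            continue  # Tokens without a language tag do not count as a switch.
        if previous is not None and tag != previous:
            points.append(index)
        previous = tag
    return points


tokens = ["Ben", "bugün", "Bahnhof", "und", "gidiyorum", "!"]
tags = tag_languages(tokens)
print(list(zip(tokens, tags)))
print(switch_points(tags))
```

The sketch treats every token as belonging to exactly one language, so it cannot represent words that mix both languages, and it ignores context entirely. It is meant only to show the kind of representation on which later steps such as POS tagging and parsing could build.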