2.9.2023: How do the Common European Framework levels differ in terms of linguistic features? Analysing English language learners’ written corpora by using Natural Language Processing tools (Khushik)

The CEFR is a significant language education policy, serving as the basis for European syllabuses, curriculum guidelines, exams, and textbooks. It is relevant to both second language acquisition (SLA) research, which seeks to understand how language proficiency develops, and language testing research, which is concerned with the valid assessment of language skills. However, the CEFR still needs empirical evidence from learner language, particularly concerning the linguistic features of EFL learners. The CEFR descriptors emphasize communicative ability rather than specific linguistic features, and they are meant to apply across various languages.
This study examines the syntactic complexity of texts written by Finnish EFL learners in Finland and Sindhi EFL learners in Pakistan across CEFR levels A1 to B2. Trained evaluators assessed the scripts, and automated NLP tools were used to extract syntactic complexity features from the texts. The study provides quantitative data on the syntactic complexity of EFL learners' texts. The statistical analysis reveals differences and similarities in syntactic complexity indices across CEFR levels in EFL learners' writing in both countries. The study draws attention to linguistic similarities between communicatively equivalent CEFR levels when the target language is the same but the learners' first languages differ. It also uncovers differences in syntactic complexity linked to learners' age or educational level. The findings can be used in systems that analyse the linguistic features of EFL learners' texts to predict CEFR levels, and they contribute to describing the linguistic basis of the CEFR levels.
Keywords: syntactic complexity, SFL writing, EFL learner corpus, the CEFR, automated NLP tools.
The researcher is interested in combining the fields of SLA, language testing, and corpus linguistics. He explores linguistic features in learner corpora using automated NLP tools.
MA Ghulam Khushik defends his doctoral dissertation in Applied Linguistics (Centre for Applied Language Studies, CALS), "How do the Common European Framework levels differ in terms of linguistic features? Analysing English language learners’ written corpora by using Natural Language Processing tools". Location: Mattilanniemi, Agora Auditorio 2, at 10:00. The opponent is Professor Xiaofei Lu (The Pennsylvania State University) and the custos is Professor Ari Huhta (University of Jyväskylä).
The event is held in English.
Further information:
Ghulam Abbas Khushik, ghabkhus@jyu.fi, 0449840532