Assessing Writing 47 (2021) 100511
Available online 11 January 2021
1075-2935/© 2020 Elsevier Inc. All rights reserved.
Lexical density and diversity in dissertation abstracts: Revisiting
English L1 vs. L2 text differences
Maryam Nasseri *, Paul Thompson
Department of English Language and Linguistics, University of Birmingham, Birmingham, B15 2TT, UK
ARTICLE INFO
Keywords:
Writing assessment
Academic writing
Lexical proficiency
Lexical density
Lexical diversity
Dissertation abstracts
ABSTRACT
This study investigated lexical density and diversity differences in English as L1 vs L2 academic
writing of EFL, ESL, and English L1 postgraduate students to compare their lexical proficiency in
EFL vs. English L1 academic settings. A corpus of 210 dissertation abstracts was analysed using
three natural language processing tools [LCA, TAALED, and Coh-Metrix] where the effects of text
length and topic were controlled. In doing so, we examined the relationship between 15 lexical
indices and the construct-distinctiveness of lexical density and diversity. The measure-testing
process also assesses the effectiveness of each measure in a pair/group of closely-related measures
(in terms of the quantification methods) in capturing lexical diversity differences of these
texts. This is to obtain a small number of unique measures that capture lexical diversity as an
indicator of lexical proficiency and to assist future writing researchers in the measure-selection
process in the face of a multitude of available measures. The findings have important implications
for writing assessment and research on lexical indicators of writing proficiency, materials
development in EFL academic settings especially for thesis/dissertation writing modules, and a
possible contribution of ESL academic immersion programmes in approximating English L1 and
L2 proficiency.
1. Introduction
Lexical density and diversity, as two dimensions of lexical complexity and aspects of productive lexical knowledge, remain two of
the most reliable indicators of lexical and linguistic proficiency and development of language users in the first and second language as
well as in writing and academic studies (see e.g., Bulté & Housen, 2012; Lu, 2012). Lexical density is the proportion of lexical/content
words to all words/tokens; lexical density, especially a dense use of nouns, is regarded as an indicator of condensed academic writing
and advanced informational prose (e.g., in Biber, 2006; Biber & Gray, 2016; Pietilä, 2015) and as a strong predictor of academic
writing proficiency (e.g., Kim, 2014). Lexical diversity is the use of a range of diverse words (also known as unique word types) to
convey meaning and is regarded as an indicator and predictor of lexical proficiency and development (Gonzalez, 2013; Mazgutova &
Kormos, 2015; Yoon, 2017). Lexical density and diversity, although interrelated, can be differentiated in that lexical density seeks to
Abbreviations: CEFR, Common European Framework of Reference; EAP, English for Academic Purposes; EFL, English as a Foreign Language;
ESL, English as a Second Language; English L1, English as the first language; L1, first language; L2, second language (also subsequent languages); MA,
master’s; NLP, Natural Language Processing; NS, Native Speakers of English; SLA, Second Language Acquisition; LCA, Lexical Complexity Analyzer;
TAALED, Tool for Automatic Analysis of Lexical Diversity; TTR, type-token ratio.
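As a rough illustration of the two constructs defined in the introduction, the sketch below computes lexical density (content words divided by all tokens) and a plain type-token ratio (unique types divided by all tokens). It is not the procedure used by LCA, TAALED, or Coh-Metrix: the tokenizer, the function names, and the small `FUNCTION_WORDS` list are assumptions made for this example, and a real analysis would normally identify content words with a part-of-speech tagger.

```python
import re

# Hypothetical, deliberately tiny function-word list used as a stand-in for
# part-of-speech tagging; it is an assumption of this sketch only.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "by", "from", "as", "is", "are", "was", "were", "be",
    "been", "this", "that", "these", "those", "it", "its", "we", "their",
}


def tokenize(text):
    """Lowercase the text and return alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_density(tokens):
    """Proportion of content (non-function) words among all tokens."""
    if not tokens:
        return 0.0
    content_words = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content_words) / len(tokens)


def type_token_ratio(tokens):
    """Proportion of unique word types among all tokens (simple TTR)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    sample = tokenize("The cat sat on the mat")
    print(lexical_density(sample), type_token_ratio(sample))
```

A simple ratio such as this one tends to fall as texts get longer, which is one reason to control for text length when comparing texts.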
* Corresponding author.
E-mail address: mxn309@bham.ac.uk (M. Nasseri).
https://doi.org/10.1016/j.asw.2020.100511
Received 2 June 2020; Received in revised form 9 November 2020; Accepted 13 November 2020