Lexical richness in research articles: Corpus-based
comparative study among advanced Chinese learners of
English, English native beginner students and experts
Siyu Lei, Ruiying Yang*
School of Foreign Studies, Xi’an Jiaotong University. No. 28 Xianning West Road, Xi’an, 710049, Shaanxi, PR China
article info
Article history:
Received 15 February 2020
Received in revised form 28 June 2020
Accepted 29 June 2020
Keywords:
Lexical richness
Research articles
Chinese PhD candidates
Nativeness
Expertise
abstract
The current study compares lexical richness, i.e. lexical diversity, density and
sophistication, in research article manuscripts by Chinese PhD candidates (CPhD),
unpublished research papers by native final-year undergraduates and master-level students
(Native Beginner Students, NBS) and published research articles (RAs) by native experts
(NE). It aims to profile CPhD’s vocabulary use on the three measures of lexical richness
in comparison with the NBS and the NE. Our data consisted of 142 RA manuscripts by CPhD,
71 unpublished research papers by NBS, and 128 published RAs by NE in the field of science
and engineering. The results showed that CPhD’s lexical richness levels fall between those
of NBS and NE. Moreover, CPhD’s three measures are not balanced: lexical diversity is the
lowest, similar to that of NBS; lexical sophistication is in the middle; and lexical
density is similar to that of NE. Comparison of the three groups indicates that academic
expertise may play a more important role than nativeness in the writing of RAs.
Integrating EAP instruction with discipline-related research activities would be an
important way to develop students’ vocabulary use in RA writing.
© 2020 Elsevier Ltd. All rights reserved.
1. Introduction
Vocabulary knowledge has been considered a significant indicator of the quality of L2 academic writing (Nation, 2013),
and it is traditionally operationalized as lexical richness, which comprises lexical diversity, lexical density, and lexical
sophistication (Read, 2000). Lexical diversity generally refers to the ratio of different word types to the total number of
tokens in a text or in samples of standardized length, i.e. the Type-Token Ratio. Lexical density refers to the ratio of
content words, namely nouns, adjectives, verbs, and adverbs, to the total number of words, and lexical sophistication is the
proportion of relatively unusual or advanced words in a text (Read, 2000). Each of these three measures has been shown to be
positively related to writing proficiency or quality: lexical diversity by Gebril and Plakans (2016), lexical density by
Gregori-Signes and Clavel-Arroitia (2015), and lexical sophistication by Zheng (2016) and Higginbotham and Reid (2019).
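To make the three definitions concrete, the following is a minimal Python sketch of how each ratio could be computed for a single text. It is an illustration under stated assumptions, not the procedure used in this study: the tokenizer is a simple regular expression, content words are approximated as "anything not in a small function-word list" rather than identified by part-of-speech tagging, and "advanced" words are approximated as "anything not in a small common-word list". The names `FUNCTION_WORDS`, `COMMON_WORDS` and all function names are hypothetical, and the word lists are deliberately tiny.

```python
import re

# Hypothetical, deliberately tiny word lists, for illustration only.
# A real analysis would use a POS tagger and a frequency-based reference list.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "in", "on", "to", "and", "or", "is", "are",
    "was", "were", "by", "with", "that", "this", "it", "for", "as",
}
COMMON_WORDS = FUNCTION_WORDS | {"cat", "sat", "mat"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_ratio(tokens):
    """Lexical diversity: distinct word types divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def standardized_ttr(tokens, window):
    """Mean TTR over consecutive, non-overlapping windows of equal length.

    Leftover tokens that do not fill a whole window are ignored, so texts
    of different lengths are compared on samples of the same size.
    """
    starts = range(0, len(tokens) - window + 1, window)
    ratios = [type_token_ratio(tokens[i:i + window]) for i in starts]
    return sum(ratios) / len(ratios) if ratios else 0.0


def lexical_density(tokens):
    """Share of tokens treated as content words (assumption: not a function word)."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


def lexical_sophistication(tokens):
    """Share of tokens treated as advanced (assumption: not in COMMON_WORDS)."""
    if not tokens:
        return 0.0
    advanced = [t for t in tokens if t not in COMMON_WORDS]
    return len(advanced) / len(tokens)


sample_tokens = tokenize("The cat sat on the mat and the cat slept")
print(type_token_ratio(sample_tokens))        # 7 types / 10 tokens
print(standardized_ttr(sample_tokens, 5))     # mean of two 5-token windows
print(lexical_density(sample_tokens))         # 5 content tokens / 10 tokens
print(lexical_sophistication(sample_tokens))  # 1 token outside COMMON_WORDS / 10
```

Because the plain Type-Token Ratio tends to fall as a text gets longer, the windowed variant is included to show what computing the ratio over samples of standardized length could look like. The window size is a free parameter in this sketch.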
According to a large-scale survey study among Hong Kong Chinese academics concerning their publication in international
refereed journals in English (Flowerdew, 1999), the biggest difficulty that Hong Kong Chinese academics encountered is the
* Corresponding author.
E-mail address: yangryd@xjtu.edu.cn (R. Yang).
Journal of English for Academic Purposes
https://doi.org/10.1016/j.jeap.2020.100894
1475-1585/© 2020 Elsevier Ltd. All rights reserved.
Journal of English for Academic Purposes 47 (2020) 100894