International Journal of Applied Engineering Research ISSN 0973-4562 Volume 12, Number 13 (2017) pp. 3887-3893
© Research India Publications. http://www.ripublication.com
Evaluating the Reliability and Quality of Examination Paper for
Multi-tier Application Development Course using Rasch
Measurement Model
Zuhaira Muhammad Zain
Information Systems Department, College of Computer and Information Sciences,
Princess Nourah Bint Abdulrahman University, Riyadh, KSA.
ORCID: 0000-0002-5973-387X
Abstract
The final examination has been used extensively as an assessment
tool to measure students' academic performance in most
higher-education institutions in the Kingdom of Saudi Arabia. A
well-constructed set of final examination items/questions should
be able to measure both students' academic performance and their
cognitive skills. The Rasch Measurement Model was used to
evaluate the reliability and quality of the final examination
questions for the Multi-tier Application Development course. The
analysis indicated that the reliability and quality of the
constructed questions were relatively good and that the questions
were well calibrated to the students' learned ability.
Keywords: Bloom's Taxonomy; Information systems; item
construction; quality; Rasch Model; reliability; students'
performance
INTRODUCTION
Nowadays, universities in Saudi Arabia need to comply with the
program accreditation requirements of the Accreditation Board for
Engineering and Technology (ABET). One of the ABET general
criteria concerns students: student performance must be evaluated
to monitor student progress, in order to foster success in
attaining student outcomes and thereby enable graduates to attain
the program's educational objectives [1]. Student performance
measurement has normally depended on students' performance in
tasks such as quizzes, assignments, midterm examinations,
projects, and final examinations. A quality task should assess
all students at the same level of cognitive thinking on what they
have learned. To improve the quality of performance measurement,
tasks should be well organized and constructed based on Bloom's
cognitive thinking skills and the level of the students' ability.
Reliable, high-quality assessment tools are required in the
teaching and learning process to measure students' understanding
and ability.
Multi-tier Application Development (IS333D) is one of the new
courses introduced in the Information Systems (IS) Department at
the College of Computer and Information Sciences (CCIS) at
Princess Nourah Bint Abdulrahman University (PNU). It is one of
the core courses that IS students must complete before they can
graduate. The main objective of the course is to introduce
students to the concept of multi-tier architecture and to have
them apply it in web application development.
In this paper, the final examination questions for IS333D in
Semester 1 of Session 2015/2016 serve as the assessment tool. In
constructing such examination questions, it is vital that they be
fairly distributed across Bloom's cognitive thinking skills, the
level of students' ability, and the level of item difficulty.
According to Morales, a discussion of reliability is essential in
evaluating the quality of the questions [2]. Reliability is the
degree to which an instrument consistently measures the ability
of an individual or group. To the best of the author's knowledge,
no statistical measurement of the reliability of examination
questions has been carried out in CCIS; questions are only
checked by the course specialist for format, spelling, and
relevance. Consequently, there is no statistical evidence to
verify that a set of examination questions is reliable.
The Rasch Measurement Model has been used to assess the
reliability and quality of examination papers for several
engineering courses in Malaysia [3, 4, 5, 6, 7]; nevertheless, to
the best of the researcher's knowledge, it has not been applied
to Information Systems courses, especially in Saudi Arabia. The
model fulfills the guidelines emphasized by Wright and Mok [8]:
a measurement model must produce linear measures, overcome
missing data, provide estimates of precision, detect misfits, and
distinguish the parameters of the object being measured from
those of the measuring instrument. Thus, it can generate
meaningful inferences by transforming an ordinal score into a
linear, interval-level variable through estimating the fit of the
data to the Rasch model's expectations.
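The transformation described above rests on the dichotomous Rasch model, in which the probability of a correct response depends only on the gap between a person's ability and an item's difficulty, both expressed in logits. The following minimal Python sketch illustrates this relationship (the function name and sample logit values are illustrative, not taken from this paper):

```python
import math

def rasch_probability(ability, difficulty):
    """Dichotomous Rasch model: probability of a correct response,
    P = exp(B - D) / (1 + exp(B - D)), where B is the person's
    ability and D is the item's difficulty, both in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# When ability equals difficulty, the success probability is exactly 0.5;
# lowering an item's difficulty raises every student's probability.
p_matched = rasch_probability(0.0, 0.0)   # 0.5
p_easy    = rasch_probability(0.0, -2.0)  # ~0.88
p_hard    = rasch_probability(0.0, 2.0)   # ~0.12
```

Note that only the difference B - D enters the formula, which is what allows person and item parameters to be estimated separately, as Wright and Mok's criteria require.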
The basic principle underlying the Rasch Model is that the
probability of a respondent/student answering a particular
item/question correctly is governed by the difference between the
item/question's difficulty and the respondent/student's ability
[9, 10, 11]. The logic underlying this principle is that all
respondents/students have a higher probability of answering
easier items/questions correctly and a lower probability of
answering more difficult items/questions correctly [9]. Moreover,
the Rasch Model is one of the reliable and appropriate methods in