International Journal of Evaluation and Research in Education (IJERE)
Vol. 14, No. 3, June 2025, pp. 1590~1596
ISSN: 2252-8822, DOI: 10.11591/ijere.v14i3.30351
Journal homepage: http://ijere.iaescore.com

Examining the 'hawk-dove effects' in portfolio assessment using the multi-facet Rasch model

Andrews Cobbinah 1, Jephtar Adu-Mensah 2
1 Department of Education and Psychology, Faculty of Educational Foundations, University of Cape Coast, Cape Coast, Ghana
2 National Council for Curriculum and Assessment, Accra, Ghana

Article history: Received Feb 20, 2024; Revised Oct 14, 2024; Accepted Oct 30, 2024

ABSTRACT
Concerns among students have increased due to the use of test scores in decision-making, leading them to question whether their results accurately reflect their abilities, especially when they perceive subjectivity in rater scoring. This study explores the effects of rater bias on portfolio assessment scores among student teachers in the colleges of education in Ghana. A sample of 207 student portfolios, scored by tutors, was analyzed using a three-facet design model and the FACET software. The findings revealed that tutors exhibited varying rating behaviors, including severity, leniency, and halo effects. These differing rating patterns were found to impact the students' portfolio scores, suggesting that the subjectivity of raters plays a crucial role in the assessment process.

Keywords: 'Hawk and dove effects'; Many-facet Rasch; Portfolios; Rater behavior; Validity

This is an open access article under the CC BY-SA license.

Corresponding Author:
Andrews Cobbinah
Department of Education and Psychology, Faculty of Educational Foundations, University of Cape Coast, Cape Coast, Ghana
Email: andrews.cobbinah@ucc.edu.gh

1. INTRODUCTION
Portfolio assessment has gained significant popularity in the educational field.
Alongside this increase in usage, educators, policymakers, and researchers have raised concerns about the potential flaws associated with using human scorers. Although portfolios are used to gather evidence to inform decisions on the professional development of pre-service and in-service teachers [1], the assessment is predominantly carried out by human raters, which introduces subjectivity into the grading of students' portfolio work. In this study, rater subjectivity refers to any actions of a rater that systematically influence test or assessment scores [2]. Such actions may include individual biases, differing understandings of the scoring rubrics, and the influence of extraneous factors such as student behavior, handwriting, or the rater's relationship with the student. Raters might unintentionally let their personal beliefs or experiences influence their evaluations, resulting in inconsistent grading practices [3].

The emphasis on test results as the primary measure of student achievement or ability has heightened student concerns [4]. Studies have reported that students frequently express concerns about the assessment process, particularly their grades or test scores [5]. These concerns often stem from the belief that teachers or raters exhibit subjectivity when grading [6]–[8]. In contrast to multiple-choice questions, the scoring of portfolio tasks is subject to various human-related factors that can influence the consistency and reliability of scoring [9]–[11]. These factors include, but are not limited to, the scoring methods employed by raters [12], [13], the gender and professional backgrounds of raters [14], [15], understanding of the scoring criteria [15], the number of raters involved in the scoring process [16]–[18], and the extent of rater training [19].
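The 'hawk-dove' effect the study examines, i.e. systematically severe versus lenient raters shifting the scores of otherwise identical students, can be illustrated with a minimal simulation under a rating-scale formulation of the Rasch model. All parameter values below (ability, severities, thresholds) are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
import math

def rating_probs(theta, severity, thresholds):
    """Category probabilities for one student-rater pairing under a
    simple rating-scale Rasch model: the log-odds of category k over
    k-1 is (ability - rater severity - threshold k)."""
    logits = [0.0]          # cumulative logit for category 0
    total = 0.0
    for tau in thresholds:
        total += theta - severity - tau
        logits.append(total)
    exps = [math.exp(l) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def expected(probs):
    """Expected (model-implied) rating on the 0..K scale."""
    return sum(k * p for k, p in enumerate(probs))

# Same student (theta = 1.0), same rubric thresholds, two raters
# differing only in severity: a lenient "dove" and a severe "hawk".
thresholds = [-1.0, 0.0, 1.0]           # 4-category scale (0-3)
dove = rating_probs(1.0, -0.5, thresholds)
hawk = rating_probs(1.0,  0.5, thresholds)

print(f"dove expected score: {expected(dove):.2f}")
print(f"hawk expected score: {expected(hawk):.2f}")
```

With these illustrative values the lenient rater's expected score is noticeably higher than the severe rater's for the very same student, which is the kind of score distortion the many-facet analysis is designed to detect and adjust for.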
Considering the extensive amount of written work generated by students and the inconsistencies in grading practices among raters, there is general concern regarding the fairness, reliability, and validity of portfolio scores [20].
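The three-facet design used here to disentangle such effects is conventionally analyzed with the many-facet Rasch model. As a sketch (using the standard notation of that literature, not symbols defined in this paper), the model for a student n rated on task i by rater j is:

\[
\log\frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
\]

where \(P_{nijk}\) is the probability of receiving category \(k\), \(B_n\) is the student's ability, \(D_i\) the difficulty of the task, \(C_j\) the severity of the rater, and \(F_k\) the difficulty of category \(k\) relative to \(k-1\). Because rater severity \(C_j\) enters the model as its own facet, 'hawk' (severe) and 'dove' (lenient) raters can be identified and students' measures adjusted for them.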