Student Behavioral Embeddings and Their Relationship to
Outcomes in a Collaborative Online Course
Renzhe Yu
UC Irvine
renzhey@uci.edu
Zachary Pardos
UC Berkeley
pardos@berkeley.edu
John Scott
UC Berkeley
jmscott212@berkeley.edu
ABSTRACT
In online collaborative learning environments, prior work has
found moderate success in correlating behaviors to learning
after passing them through the lens of human knowledge
(e.g., hand-labeled content taxonomies). However, these
manual approaches may not be cost-effective for triggering
timely support, especially given the complexity of
interpersonal and temporal behavioral patterns under rich
interactions. In this paper, we test the hypothesis that a
neural embedding of students that synthesizes their event-level
course behaviors, without hand labels or knowledge about
the specific course design, can be used to make predictions
of desired outcomes and thus inform intelligent support at
scale. While our student representations predicted student
interactivity (i.e., sociality) measures, they did not predict
course grades or grade improvement better than a naive
baseline. We reflect on this result as a data point added to
the nascent trend of raw student behaviors (e.g., clickstream)
proving difficult to correlate directly with learning
outcomes, and we discuss the implications for big education
data modeling.
Keywords
Collaborative learning environment, neural embedding, skip-
gram, online course, higher education, behavior, predictive
modeling
1. INTRODUCTION
Representation of collaborative learning behaviors in their
raw formats has been challenging due to their complicated
internal dependencies. Theory-driven approaches can extract
some conceptually important measures of these learning
processes but may not scale to real-time learner support
because of the human effort they require. In this paper, we
examine an aggregate, unsupervised representation of these
collaborative learning behaviors in the context of a formal
course that features sharing, remixing and interacting with
student artifacts. We use a connectionist, neural network
approach to representing a student as a function of a co-
interaction network temporally formed by peers interacting
in different ways in different weeks of the course. In light
of prior empirical work, we test the correspondence of
these representations to learning outcomes. First, we
investigate whether the sociality of a student, or how much she is
involved in the collaborative community, can be predicted
from these low-level behavioral representations, as this is a
direct goal of the special course design we analyze. Second,
given the moderate relationship between interpersonal
connections and learning performance in the literature, we test
whether these vector representations are indicative of students’
final course performance. This exploration has strong
pedagogical implications because an unsupervised student-level
representation that captures signals of effective learning can
be further deployed in intelligent systems to give just-in-time
feedback/interventions in the face of interconnected
behavioral streams.
1.1 Collaborative learning behavior and outcomes
Generations of learning theories and pedagogies have high-
lighted the benefits of social processes for effective learning
[15, 13]. Accordingly, there has been a multitude of stud-
ies that characterize these processes and examine how they
relate to learning outcomes from granular learning behavior
data [2]. One typical context of these studies is collabora-
tive learning environments where students are required to
work together in one way or another. As the interpersonal
and temporal dependencies complicate the social processes,
multiple methodological paradigms have been adopted to
represent students’ collaborative learning behavior.
To model the structures of interpersonal connections, social
network analysis (SNA) conceptualizes learners as nodes
and their various forms of interaction as edges, and typically
identifies global or local structures. Some studies
concentrate on the discovery of global structures such as
core-periphery structures [6] and cohesive groups [3], while
a number of others take more local perspectives and find that
network positions have predictive power for learning outcomes
[1, 5]. An alternative paradigm is the extension of psychometric
or knowledge tracing models to collaborative settings,
where collaboration status or group membership information
is used to construct additional terms in the original
functions [16, 9]. These adapted models have shown improved
predictive power for students’ learning performance.
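
As a minimal sketch of the local SNA perspective described above, the following Python snippet treats learners as nodes and their interactions as undirected edges, then computes a normalized degree centrality as one simple measure of network position. The interaction log, the student names, and the function names are hypothetical illustrations; they are not data or methods taken from the studies cited.

```python
from collections import defaultdict

# Hypothetical interaction log: each pair means two students interacted
# at least once (e.g., one commented on the other's artifact).
interactions = [
    ("ana", "ben"),
    ("ana", "cho"),
    ("ana", "dev"),
    ("ben", "cho"),
    ("dev", "eli"),
]


def build_graph(pairs):
    """Build an undirected graph as a dict mapping node -> set of neighbors."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


graph = build_graph(interactions)
centrality = degree_centrality(graph)

# List students from most to least connected.
for student, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{student}: {score:.2f}")
```

A score like this could serve as one hand-crafted feature in a model of learning outcomes. Degree is only the simplest choice here; other position measures (for example, betweenness or closeness) would follow the same pattern of deriving a per-student value from the graph.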
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).