Data-Driven Oracle Bone Rejoining: A Dataset and Practical
Self-Supervised Learning Scheme
Chongsheng Zhang
Henan University
Kaifeng, China
cszhang@ieee.org
Bin Wang
Henan University
Kaifeng, China
bin.wang@henu.edu.cn
Ke Chen∗
South China University of Technology
Guangzhou, China
chenk@scut.edu.cn
Ruixing Zong
Henan University
Kaifeng, China
rxzong@henu.edu.cn
Bo-feng Mo
Capital Normal University
Beijing, China
mbf2001@163.com
Yi Men
Henan University
Kaifeng, China
yi.men@henu.edu.cn
George Almpanidis
Henan University
Kaifeng, China
almpanidis@acm.org
Shanxiong Chen
Southwest University
Chongqing, China
csxpml@163.com
Xiangliang Zhang
University of Notre Dame
Notre Dame, USA
xzhang33@nd.edu
ABSTRACT
Oracle Bone Inscriptions (OBI) is one of the oldest scripts in the
world. The rejoining of Oracle Bone (OB) fragments is of vital
importance to research on ancient scripts and history. Although
significant progress has been achieved in the past decades, rejoining
still relies heavily on domain knowledge and manual work, and thus
remains an inefficient and time-consuming process.
Therefore, an automatic and practical algorithm/system for OB
rejoining is of great value to the OBI community. To this end, we
collect a real-world dataset for rejoining Oracle Bone fragments,
namely OB-Rejoin, which consists of 998 OB rubbing images that
suffer from low image quality, due to intrinsic underground
erosion over time and extrinsic imaging conditions in the past.
Moreover, a practical Self-Supervised Splicing Network, S³-Net, is
proposed to rejoin the OB fragments based on the shape similarity of
their borderlines. Specifically, we first transform the manually
annotated borderline strokes of OB images into time-series-style shape
representations, which are fed as input to a Generative Adversarial
Network to augment positive pairs of rejoinable OBs for each
OB fragment that does not have rejoinable counterparts. A Siamese
network is trained on this augmented data in a contrastive learning
manner to retrieve the matching OB fragments of an unseen
query from an OB fragment gallery. Experiments on the OB-Rejoin
benchmark show that our data-driven approach outperforms two
recent methods for time-series analysis. In order to demonstrate its
practical potential, we deploy the proposed S³-Net method in real
tests and ultimately discover dozens of new rejoinings missed by
domain experts for decades.
∗ indicates corresponding author. Also affiliated with Peng Cheng Laboratory.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
KDD ’22, August 14–18, 2022, Washington, DC, USA.
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9385-0/22/08...$15.00
https://doi.org/10.1145/3534678.3539050
CCS CONCEPTS
· Information systems → Similarity measures; · Computing
methodologies → Shape representations; Matching; Neural
networks.
KEYWORDS
Oracle Bone Rejoining, Data Augmentation, Contrastive Learning
ACM Reference Format:
Chongsheng Zhang, Bin Wang, Ke Chen∗, Ruixing Zong, Bo-feng Mo, Yi
Men, George Almpanidis, Shanxiong Chen, and Xiangliang Zhang. 2022.
Data-Driven Oracle Bone Rejoining: A Dataset and Practical Self-Supervised
Learning Scheme. In Proceedings of the 28th ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD ’22), August 14–18, 2022,
Washington, DC, USA. ACM, New York, NY, USA, 11 pages.
https://doi.org/10.1145/3534678.3539050
1 INTRODUCTION
Written language has been the main carrier of human history and
civilization for thousands of years. Oracle Bone Inscriptions (OBI), which
was used in the Shang dynasty more than 3,600 years ago, is one of
the oldest writing systems in the world. It was first discovered in
1899. To date, over 160,000 pieces of Oracle Bones (OB) have been
unearthed, and new Oracle Bones are continuously being
excavated. OBs were used by the religious specialists
(shamans) of that time to practice a specific form of divination:
foretelling the future based on the cracks that appeared in the animal
bones and turtle shells (carved with OBIs) after they were burned. OBI
research is very important for both history and literature. Due to
historical reasons and traditions in OBI research, the main
presentation form of OBI materials is the rubbing; colored OB images are
rare, expensive, and largely unavailable, because Oracle Bones are
protected as antiquities in different museums and organizations all
over the world.