GraphStorm an Easy-to-use and Scalable Graph Neural Network
Framework: From Beginners to Heroes
Jian Zhang
AWS AI
Santa Clara, USA
jamezhan@amazon.com
Da Zheng
AWS AI
Santa Clara, USA
dzzhen@amazon.com
Xiang Song
AWS AI
Santa Clara, USA
xiangsx@amazon.com
Theodore Vasiloudis
AWS AI
Seattle, USA
thvasilo@amazon.com
Israt Nisa
AWS AI
New York, USA
nisisrat@amazon.com
Jim Lu
AWS AI
Seattle, USA
luzj@amazon.com
ABSTRACT
Applying Graph Neural Networks (GNNs) to real-world problems
is challenging for machine learning (ML) practitioners due to two
major obstacles. The first hurdle is the high barrier to learning to program
GNNs from scratch. The second challenge lies in overcoming
engineering difficulties when scaling GNN models to large graphs
at an industry level. To address these challenges, GraphStorm, an
open-source framework, offers a solution by providing an easy-to-use
user interface and an end-to-end GNN training/inference
pipeline that seamlessly handles extremely large graphs in a distributed
manner. This tutorial aims to provide participants with a
comprehensive understanding of GraphStorm, including its design
principles, target users, and use cases, through presentations. The
hands-on sections will enable attendees to walk through four prac-
tical GraphStorm use cases that can assist them in leveraging GNNs
to address real-world business problems.
KEYWORDS
Graph Neural Networks, Distributed Training, GraphStorm
ACM Reference Format:
Jian Zhang, Da Zheng, Xiang Song, Theodore Vasiloudis, Israt Nisa, and Jim
Lu. 2023. GraphStorm an Easy-to-use and Scalable Graph Neural Network
Framework: From Beginners to Heroes. In Proceedings of the 29th ACM
SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’23),
August 6–10, 2023, Long Beach, CA, USA. ACM, New York, NY, USA, 2 pages.
https://doi.org/10.1145/3580305.3599179
1 TARGET AUDIENCE AND PREREQUISITES
FOR THE TUTORIAL
Intended audience: This tutorial targets machine learning practitioners
who are interested in or already working on graph machine
learning tasks and want to leverage easy-to-use and scalable tools
to accelerate GNN adoption for their own business problems,
and researchers who are interested in experimenting with their novel
GNN models on large graphs.
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
KDD ’23, August 6–10, 2023, Long Beach, CA, USA
© 2023 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-0103-0/23/08.
https://doi.org/10.1145/3580305.3599179
Prerequisites: Attendees should have some knowledge of
deep learning on graphs and have used a deep learning framework,
e.g., PyTorch. Knowledge of graph neural networks and DGL is
helpful but not required.
Takeaways from the tutorial: We expect that
the attendees will have an understanding of GraphStorm’s basic
information and application use cases. They will also know how to
use GraphStorm in standalone mode to train GNN models for their
own extensive graph data.
2 TUTORS
1. Jian Zhang, AWS AI, jamezhan@amazon.com
2. Da Zheng, AWS AI, dzzhen@amazon.com
3. Xiang Song, AWS AI, xiangsx@amazon.com
3 TUTORS’ SHORT BIO
3.1 List of in-person presenters
1. Jian Zhang: Jian is a senior applied scientist at AWS AI, using
ML techniques to help customers solve various problems,
such as fraud detection and image generation. He has successfully
developed and deployed GNN solutions for customers
worldwide.
2. Da Zheng: Da is a senior applied scientist at AWS AI, leading
the effort of building frameworks and algorithms to bring
graph machine learning technologies into production. This
includes DGL for GNNs, DGL-KE for knowledge graph embeddings,
DistDGL for scaling GNN training to billion-scale
graphs, TGL for temporal GNNs, and more.
3. Xiang Song: Xiang is a senior applied scientist at AWS AI,
leading the effort of building frameworks and services for
industrial applications. This includes DGL and DistDGL for
scaling GNNs to large-scale graphs, and Neptune ML, a graph
ML service designed for the Amazon Neptune graph database.
3.2 List of contributors
1. Theodore Vasiloudis: Theodore is an applied scientist who
works in distributed machine learning and data processing.
2. Israt Nisa: Israt is an applied scientist who specializes in
developing scalable and high-performing modules for GNNs.