Representation Learning for Predicting Customer Orders
Tongwen Wu¹, Yu Yang¹, Yanzhi Li¹, Huiqiang Mao³∗, Liming Li², Xiaoqing Wang², Yuming Deng²
¹ City University of Hong Kong, Hong Kong, China
² Alibaba Group, Hangzhou, China
³ Tencent, Shenzhen, China
tongwenwu2-c@my.cityu.edu.hk,{yuyang,yanzhili}@cityu.edu.hk,huiqiangmao@gmail.com,
{liming.l,robin.wxq,yuming.dym}@alibaba-inc.com
ABSTRACT
The ability to predict future customer orders is of significant value
to retailers in making many crucial operational decisions. Different
from next-basket prediction or temporal set prediction, which focus
on predicting a subset of items for a single user, this paper
aims for the distributional information of future orders, i.e., the
possible subsets of items and their frequencies (probabilities), which
is required for decisions such as assortment selection for front-end
warehouses and capacity evaluation for fulfillment centers. Based
on key statistics of a real order dataset from Tmall supermarket, we
show the challenges of order prediction. Motivated by our analysis
that biased models of order distribution can still help improve the
quality of order prediction, we design a generative model to capture
the order distribution for customer order prediction. Our model
utilizes representation learning to embed items into a Euclidean
space, and we design a highly efficient SGD algorithm to learn the item
embeddings. Future order prediction is done by calibrating orders
obtained by random walks over the embedding graph. The experiments
show that our model outperforms all the existing methods.
The benefit of our model is also illustrated with an application to
assortment selection for front-end warehouses.
CCS CONCEPTS
· Applied computing → Electronic commerce.
KEYWORDS
Choice Model; Representation Learning; Random Walk; E-commerce
ACM Reference Format:
Tongwen Wu¹, Yu Yang¹, Yanzhi Li¹, Huiqiang Mao³∗, Liming Li², Xiaoqing
Wang², Yuming Deng². 2021. Representation Learning for Predicting
Customer Orders. In Proceedings of the 27th ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD ’21), August 14–18, 2021,
Virtual Event, Singapore. ACM, New York, NY, USA, 10 pages.
https://doi.org/10.1145/3447548.3467170
∗This work was done while Huiqiang Mao was at Alibaba Group.
KDD ’21, August 14–18, 2021, Virtual Event, Singapore
© 2021 Association for Computing Machinery.
ACM ISBN 978-1-4503-8332-5/21/08. . . $15.00
https://doi.org/10.1145/3447548.3467170
1 INTRODUCTION
In mining e-commerce orders, one of the most important problems
is to learn the distribution of all possible order types from data,
where an order type is a specific combination/set of items. Many
business applications require such distributional information. For
example, e-commerce firms often conduct simulation studies in
preparing or evaluating the handling capacity of fulfillment
centers [12]. A critical input to the simulation study is the order
composition during a period of time, say, the next day or the next couple
of days. The current practice is that firms directly sample the
historical order data as the input, which is indicative of the future but
may not be sufficiently representative because, simply put, many
future orders have never appeared before. Moreover, to guarantee
speedy delivery, e-commerce firms often set up front-end
warehouses in close proximity to customers. Optimizing assortments
for such warehouses (i.e., the products carried by the warehouse)
to maximize the number of orders that can be directly satisfied
(i.e., to avoid order splits, which will mean higher cost and lower
service level or even losing orders) also requires distributional
information of orders [24]. Note here that knowing the demand for
individual items is insufficient, since an order cannot be directly
satisfied from the front-end warehouse unless all of its items are
available. Other examples include designing product bundle
promotions, cross-selling, and pattern mining [18]. For all such business
applications, the distributional information of orders is required.
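To make the last point concrete, the following is a minimal sketch, not the method proposed in this paper, of why item-level demand is insufficient for assortment decisions. It assumes orders are given as sets of item names; the helper names `order_distribution` and `fill_rate`, and the toy orders, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def order_distribution(orders):
    """Empirical distribution over order types (each type is a frozenset of items)."""
    counts = Counter(frozenset(order) for order in orders)
    total = sum(counts.values())
    return {order_type: count / total for order_type, count in counts.items()}


def fill_rate(distribution, assortment):
    """Probability mass of order types whose items are ALL in the assortment."""
    stocked = set(assortment)
    return sum(p for order_type, p in distribution.items() if order_type <= stocked)


# Hypothetical toy order log (illustrative data, not from the paper).
orders = [
    {"milk", "bread"},
    {"milk", "eggs"},
    {"milk", "tea"},
    {"diapers", "wipes"},
    {"diapers", "wipes"},
]

dist = order_distribution(orders)

# Item-level demand: milk appears in 3 orders, diapers and wipes in 2 each.
item_demand = Counter(item for order in orders for item in order)

# Stocking the most demanded item (milk) plus diapers covers no complete order,
# while stocking diapers and wipes covers every order of that type.
by_item_demand = fill_rate(dist, {"milk", "diapers"})
by_order_type = fill_rate(dist, {"diapers", "wipes"})
print(by_item_demand, by_order_type)
```

In this toy log, a two-item assortment built around the most demanded item fully satisfies no order, whereas the pair that co-occurs as an order type fully satisfies two of the five orders. The comparison requires probabilities of item sets, which is the distributional information discussed above.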
It is worth noting that learning distributional information of
orders is quite different from next-basket prediction [9, 23, 25],
temporal set prediction [3, 16, 21, 26], or frequent set mining [7, 14].
We aim to characterize the full picture of the aggregated behavior
of the market over a specific time period. In contrast, next-basket
prediction and temporal set prediction focus on the behavior of a
specific customer on the next shopping occasion, irrespective of the time
of purchase. Applied to our problem, these methods would
perform poorly since they were not originally designed for such a
purpose; likewise, our method does not apply to their problems
either. Frequent set mining yields only the high-frequency sets,
with no exact probabilistic information about the mined sets, and so it does not
address the need of our business applications.
Due to the combinatorial explosion of possible order types, learning
the order distribution from data faces a number of major challenges.
First, the order data for learning the distribution is usually
sparse: the number of observed orders is much smaller than
the number of possible order types. Many possible order types do
not appear in the dataset or appear only once. Thus, directly counting
the order dataset to estimate each order type’s probability does