SERA: A Scheduling Framework for M2M Transmission in Cellular Networks

UmaMaheswari Devi, Munish Goyal, Mukundan Madhavan, Ravi Kokku, Dilip Krishnaswamy
IBM Research, India

Abstract: Trends show that the number of machine-to-machine (M2M) devices is going to grow by orders of magnitude, far surpassing the number of mobile devices. This unprecedented scale and the fact that M2M traffic typically consists of many small-sized transmissions make the data and signaling overhead of introducing M2M traffic into cellular networks a big concern. Fortunately, it is possible to exploit certain unique characteristics of M2M traffic, such as periodicity and delay tolerance, in its scheduling to alleviate these concerns. In this paper, we propose SERA, a two-level Scheduled Randomization framework, which does precisely this and efficiently integrates M2M traffic into cellular networks. Broadly, SERA consists of (i) a central controller that defines certain coarse-level transmission parameters to govern M2M traffic in the next scheduling period and (ii) a simple distributed randomized algorithm at each M2M device that governs fine-grained transmission decisions within the period. Using experiments and analyses, we show that, compared to existing techniques for M2M traffic management, SERA can lower peak traffic load by 30-40% and bring down the total time spent under congestion by 30-40%, and that these gains are robust to errors in traffic prediction.

I. INTRODUCTION

Machine-to-machine (M2M) devices are smart end-devices that acquire parameters of interest in systems such as health-care, surveillance, and traffic-control, and transport the acquired values to back-end servers for improved decision making and control. With the increasing instrumentation of our surroundings, the number of such M2M devices has been growing rapidly, and the number of wireless M2M devices is expected to hit 50 billion by 2020 [1].
Cellular networks already carry a significant fraction of wireless M2M traffic, mainly due to their ease of deployment and accessibility [2]. Studies have shown that M2M transmissions exhibit certain distinct characteristics. (1) They typically consist of a larger number of short, distinct transactions than user traffic, thus exacerbating signaling overhead [3][4]. (2) Due to the largely time-triggered nature of their operations, they tend to be time-synchronized, typically at 1/4-, 1/2-, and 1-hr boundaries [2], resulting in bursty traffic patterns. (3) The data they carry is often utilized in batch, and hence the level of urgency in their transmission is diverse. In contrast, user traffic is often (but not always) urgent, with users actively waiting on it. A key aspect of embracing M2M devices in cellular networks is to recognize the differences between M2M and user traffic and, accordingly, to manage them in different ways.

While congestion alleviation in cellular networks is a classical problem, the distinct characteristics of M2M traffic enable newer techniques for resource management. Traditional, largely centralized approaches tend to create management bottlenecks with increasing scale, which is the case in M2M. Purely distributed scheduling algorithms, on the other hand, can scale with the number of devices but cannot easily know the network load and the requirements of other applications, and hence might not sufficiently alleviate congestion. Moreover, traditional congestion management is a reactive process due to the nature of user data; a better approach for M2M would be to leverage the periodicity and delay tolerance inherent in a significant fraction of these devices to appropriately defer traffic generation at the source and prevent congestion in the first place. Congestion due to M2M devices in cellular access links, especially in the signaling and control plane, is already receiving considerable attention [5], [6], [7], [8].
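The intuition that deferring delay-tolerant traffic at the source can flatten time-synchronized bursts may be illustrated with a small simulation. The following is only a sketch under stated assumptions and is not the algorithm proposed in this paper: the device count, the number of slots, and the uniform random choice of a slot are all hypothetical values chosen for illustration.

```python
import random
from collections import Counter


def peak_load(num_devices, window_slots, seed=0):
    """Largest number of devices transmitting in any single slot.

    Assumption (illustrative only): each device sends exactly one message
    and picks its slot uniformly at random from [0, window_slots).
    """
    rng = random.Random(seed)
    slots = Counter(rng.randrange(window_slots) for _ in range(num_devices))
    return max(slots.values())


# All devices fire at the same boundary: a window of one slot.
synchronized = peak_load(num_devices=1000, window_slots=1)

# Each device defers by a random amount within a 60-slot tolerance window.
deferred = peak_load(num_devices=1000, window_slots=60)

print("peak load, synchronized:", synchronized)
print("peak load, deferred:    ", deferred)
```

With a single slot, every device collides at the boundary, so the peak equals the number of devices; spreading the same transmissions over a longer window lowers the peak at the cost of added delay, which only delay-tolerant traffic can absorb.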
In this paper, we explore a hybrid solution to achieve the benefits of both centralized and distributed approaches. To be both scalable and capable of using centralized knowledge of resource constraints and application requirements, we follow a design philosophy where (i) a central controller provides a coarse-level transmission plan (i.e., a schedule) for devices, detailing a broad strategy for how to attempt to transmit in a given interval, and (ii) individual devices execute a local randomized scheduling algorithm that determines exactly when to transmit within the interval. In doing this, our solution, called SERA (for Scheduled Randomization framework), takes into account estimates of non-M2M traffic, requirements of periodic M2M traffic, and the QoS requirements of M2M devices. Our design intrinsically leverages the fact that M2M QoS requirements are diverse, ranging from real-time (fleet tracking and emergency detector applications) to delay tolerances of up to a few hours [6]; further, within the delay tolerance, our design allows the data collected to be stored and aggregated to amortize signaling overhead, and thus provides significant flexibility in scheduling control. We also note that our solution is equally applicable to signaling and data overload management. Given that M2M traffic is uplink-heavy [2], we focus on managing uplink transmissions.

Contributions: (1) In Sec. III-A, we present SERA, a scalable application- and network-aware traffic management framework for a diverse set of M2M devices. We detail its end-to-end design and discuss its ease of deployment at various points in the network. (2) In Sec. IV, we formulate the problem of coarse-level M2M traffic management at the central controller as one of optimizing upload transmission probabilities over time for heterogeneous classes of devices.
We then propose efficient solutions to this problem under diverse realistic settings using a combination of algorithmic and analytical approaches. (3) In Sec. VI, we use an MQTT-