SERA: A Scheduling Framework for M2M Transmission in Cellular Networks
UmaMaheswari Devi, Munish Goyal, Mukundan Madhavan, Ravi Kokku, Dilip Krishnaswamy
IBM Research, India
Abstract: Trends show that machine-to-machine (M2M) devices are going to grow by orders of magnitude, far surpassing the number of mobile devices. This unprecedented scale and the fact that M2M traffic typically consists of many small-sized transmissions make the data and signaling overhead of introducing M2M traffic into cellular networks a big concern. Fortunately, it is possible to exploit certain unique characteristics of M2M traffic, like periodicity and delay tolerance, in its scheduling to alleviate these concerns. In this paper, we propose SERA, a two-level Scheduled Randomization framework, which does precisely this and efficiently integrates M2M traffic into cellular networks. Broadly, SERA consists of (i) a central controller that defines certain coarse-level transmission parameters to govern M2M traffic in the next scheduling period and (ii) a simple distributed randomized algorithm at each M2M device that governs fine-grained transmission decisions within the period. Using experiments and analyses, we show that compared to existing techniques for M2M traffic management, SERA can lower peak traffic load by 30-40%, bring down the total time spent under congestion by 30-40%, and that these gains are robust to errors in traffic prediction.
I. INTRODUCTION
Machine-to-machine (M2M) devices refer to smart end-devices that acquire parameters of interest in systems such as health-care, surveillance, and traffic-control, and transport the acquired values to back-end servers for improved decision making and control. With the increasing instrumentation of our surroundings, the number of such M2M devices has been rapidly growing, and the number of wireless M2M devices is expected to hit 50 billion by 2020 [1].
Cellular networks already carry a significant fraction of wireless M2M traffic, mainly due to their ease of deployment and accessibility [2]. Studies have shown that M2M transmissions exhibit certain distinct characteristics. (1) They typically consist of a larger number of short, distinct transactions than user traffic, thus exacerbating signaling overhead [3][4]. (2) Due to the largely time-triggered nature of their operations, they tend to be time-synchronized, typically at 1/4-, 1/2-, and 1-hr boundaries [2], resulting in bursty traffic patterns. (3) The data they carry is often utilized in batch, and hence, the level of urgency in their transmission is diverse. In contrast, user traffic is often (but not always) urgent, with users actively waiting on it. A key aspect of embracing M2M devices in cellular networks is to recognize the differences between M2M and user traffic, and accordingly, manage them in different ways.
While congestion alleviation in cellular networks is a classical problem, the distinct characteristics of M2M traffic call for newer techniques for resource management. Traditional, largely centralized approaches tend to create management bottlenecks with increasing scale, which is the case in M2M. Purely distributed scheduling algorithms, on the other hand, can scale with the number of devices but cannot easily know network load and requirements of other applications, and hence might not sufficiently alleviate congestion. Moreover, traditional congestion management is a reactive process due to the nature of user data; a better approach for M2M would be to leverage the periodicity and delay tolerance inherent in a significant fraction of these devices to appropriately defer traffic generation at the source and prevent congestion in the first place. Congestion due to M2M devices in cellular access links, especially in the signaling and control plane, is already receiving considerable attention [5], [6], [7], [8].
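The benefit of deferring delay-tolerant traffic at the source can be illustrated with a toy calculation. The sketch below is not SERA's algorithm; it only compares the peak per-slot load when all devices transmit at a common period boundary against the peak when each device independently picks a uniformly random slot within its delay tolerance. The device count, the slot granularity, the tolerance of 60 slots, and all function names are assumptions made purely for illustration.

```python
import random
from collections import Counter


def peak_load(send_slots):
    """Return the largest number of transmissions falling in any single slot."""
    return max(Counter(send_slots).values())


def synchronized(num_devices):
    """All devices transmit exactly at the period boundary (slot 0)."""
    return [0] * num_devices


def randomized_deferral(num_devices, tolerance_slots, rng):
    """Each device independently defers to a uniform slot in [0, tolerance_slots).

    Uniform deferral is an illustrative assumption, not the policy of the paper.
    """
    return [rng.randrange(tolerance_slots) for _ in range(num_devices)]


# Hypothetical parameters: 1000 devices, delay tolerance of 60 slots.
rng = random.Random(0)
num_devices = 1000
tolerance_slots = 60

sync_peak = peak_load(synchronized(num_devices))
rand_peak = peak_load(randomized_deferral(num_devices, tolerance_slots, rng))

print("peak load, synchronized:", sync_peak)
print("peak load, randomized deferral:", rand_peak)
```

With synchronized transmissions the peak equals the number of devices, whereas random deferral spreads the same transmissions over the tolerance window, so the peak cannot fall below the average of about 17 devices per slot but stays far below the synchronized case. A purely local rule of this kind ignores non-M2M load and heterogeneous delay requirements, which is the gap a central coarse-level plan is meant to fill.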
In this paper, we explore a hybrid solution to achieve the benefits of both centralized and distributed approaches. To be both scalable and capable of using centralized knowledge on resource constraints and application requirements, we follow a design philosophy where (i) a central controller provides a coarse-level transmission plan (i.e., a schedule) for devices, detailing a broad strategy for how to attempt transmission in a given interval, and (ii) individual devices execute a local randomized scheduling algorithm that determines exactly when to transmit within the interval. In doing this, our solution, called SERA (for Scheduled Randomization framework), takes into account estimates of non-M2M traffic, requirements of periodic M2M traffic, and the QoS requirements of M2M devices. Our design intrinsically leverages the fact that M2M QoS requirements are diverse, ranging from real-time (fleet tracking and emergency detector applications) to delay tolerances of up to a few hours [6]; further, within the delay tolerance, our design allows collected data to be stored and aggregated for amortizing signaling overhead, and thus provides significant flexibility in scheduling control. We also note that our solution is equally applicable to signaling and data overload management. Given that M2M traffic is uplink-heavy [2], we focus on managing uplink transmissions.
Contributions: (1) In Sec. III-A, we present SERA, a scalable application- and network-aware traffic management framework for a diverse set of M2M devices. We detail its end-to-end design and discuss its ease of deployment at various points in the network. (2) In Sec. IV, we formulate the problem of coarse-level M2M traffic management at the central controller as one of optimizing upload transmission probabilities over time for heterogeneous classes of devices.
We then propose efficient solutions to this problem under diverse realistic settings using a combination of algorithmic and analytical approaches. (3) In Sec. VI, we use a MQTT-