arXiv:1501.04038v1 [cs.DB] 16 Dec 2014

A Data Driven Framework for Real-Time Power System Event Detection and Visualization

Ben McCamish, Student Member, IEEE, Rich Meier, Student Member, IEEE, Jordan Landford, Student Member, IEEE, Robert B. Bass, Member, IEEE, Eduardo Cotilla-Sanchez, Member, IEEE, and David Chiu, Member, IEEE

Keywords: PMU, data management, bitmap index, electrical distance, correlation, power system contingency.

Abstract—Increased adoption and deployment of phasor measurement units (PMUs) has provided valuable fine-grained data over the grid. Analysis over these data can provide real-time insight into the health of the grid, thereby improving control over operations. Realizing this data-driven control, however, requires validating, processing, and storing massive amounts of PMU data. This paper describes a PMU data management system that supports input from multiple PMU data streams, features an event-detection algorithm, and provides an efficient method for retrieving archival data. The event-detection algorithm rapidly correlates multiple PMU data streams, providing details on events occurring within the power system in real time. The event-detection algorithm feeds into a visualization component, allowing operators to recognize events as they occur. The indexing and data retrieval mechanism facilitates fast access to archived PMU data. Using this method, we achieved a speedup of over 30× for queries with high selectivity. Together, these two components form a system that allows efficient analysis of multiple time-aligned PMU data streams.

I. INTRODUCTION

Recently, power grid operations have been complicated by increased penetration of non-dispatchable generation, load congestion, demand for quality electric power, environmental concerns, and threats to cyber-security and physical infrastructure.
Pressure from these issues compels engineers to create tools that leverage modern communications, signal processing, and analytics to provide operators with insight into the operational state of power systems. As Horowitz et al. explained, there are multiple aspects to achieving the level of knowledge and control necessary to keep one of the world's greatest engineering feats stable and operational [1].

To this end, utilities have been deploying phasor measurement units (PMUs), also known as synchrophasors, across the grid. At a high level, PMUs are sensors that measure electrical waveforms at short fixed intervals [2]. A unique feature of PMUs is that they are equipped with global positioning systems (GPS), allowing multiple PMUs distributed in space to be synchronized across time. With a proper set of analytics put in place, the mass deployment of PMUs can offer utility operators a holistic and real-time sense of grid status.

B. McCamish, R. Meier, and E. Cotilla-Sanchez are with the School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA. R. B. Bass and J. Landford are with the Maseeh College of Engineering and Computer Science, Portland State University, Portland, OR, USA. D. Chiu is with the University of Puget Sound, Tacoma, WA, USA.

With the recent deployment of PMUs on a large scale, their applications are growing. PMUs provide visibility over the grid at increasing speeds, allowing for real-time monitoring of grid conditions [3]–[5]. PMU placement is also being optimized to provide accurate information about the grid while minimizing the number of units required to achieve observability [6]. Furthermore, this space has seen a significant increase in algorithms that aid in control and mitigation of grid operational issues.
For example, efforts have emphasized using PMU data to monitor critical power paths [7], identify transmission line fault locations [8], isolate and mitigate low-frequency zonal oscillations [9], and predict critical slowing down of the network [10].

Despite the increase in PMU use, there is still a lack of verification of the data generated by PMUs. Many algorithms assume input data streams to be robust, reliable, and available at all times. However, this is not the case in a real PMU network. Not only do corrupt data streams cause false positives during normal operation, but they also reduce confidence in data generated during transient events. The standard for PMU measurements (IEEE C37.118.1-2011) provides some testing and error measurement specifications for these types of situations, but it does not state how a PMU should behave in them [11]. Some recent works, namely [12]–[14], have taken initial steps toward verifying the output of PMU devices before it informs the operation of higher-level power system control algorithms. They have specifically stressed the importance of data integrity during transient situations. These efforts, however, have not sufficiently solved the event-detection problem.

A second issue not addressed in many of the above works is a result of the sophisticated nature of sensing and data gathering in today's PMUs. In the field, each PMU data stream is collected and coalesced by a device known as a phasor data concentrator (PDC) before being written to large, but slow, non-volatile storage, e.g., hard disks or tape. When data streams from many PMUs are combined, they can amount to massive volumes of data each year (on the order of hundreds of terabytes). Unfortunately, common data processing tasks, such as real-time event detection, ad hoc querying, data retrieval for analysis, and visualization, require scanning or randomly accessing large amounts of PMU data on disk. These tasks can require prohibitive amounts of time.
Therefore, in addition to the identification problem stated above, there is also a