Jitter Measurement on Deep Waveforms with Constant Memory

Yanzhou Liu*, Lee Barford†, Shuvra S. Bhattacharyya*‡

* Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742, USA
† Keysight Laboratories, Keysight Technologies, Inc., Reno, NV, USA
‡ Department of Pervasive Computing, Tampere University of Technology, Tampere, Finland

Abstract: The time required for jitter measurement in digital communications waveforms can be dominated by computation time, which increases with waveform depth. Previous work on decreasing this computation time includes the use of parallel resources on microprocessors and graphics processing units (GPUs). However, the waveform depth and computation speed were limited by the need to hold the entire waveform, and the intermediate results derived from it, in memory all at once. We present a new dataflow-based method for clock recovery and for computation of time interval error (TIE) and TIE standard deviation. Memory usage does not grow with waveform depth, so waveform depth is not limited by memory size. We describe an implementation in LIDE-OCL, a tool for simplifying the implementation of dataflow signal processing on multicore processors and GPUs. The resulting measurement accuracy is compared with that of prior methods on actual measured waveforms.

I. INTRODUCTION

In the design of advanced digital communication systems, timing jitter measurement must often be performed on deep waveforms, i.e., on signals that are of relatively long duration. By jitter measurement we mean summary statistics of time interval error (TIE), the difference (in units of time) between when a feature should be found in the waveform and when it actually occurs.
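To make the TIE definition concrete, the following Python sketch computes TIE for a stream of measured edge times and accumulates its mean and standard deviation in a fixed amount of memory. It is only an illustration under stated assumptions, not the method developed in this paper: the clock period and the ideal time of the first edge are assumed to be known in advance (no clock recovery is performed), the class name `RunningTieStats` and its parameters are hypothetical, and Welford's online algorithm is used as a generic stand-in for a constant-memory standard deviation computation.

```python
import math


class RunningTieStats:
    """Constant-memory accumulator for TIE mean and standard deviation.

    Hypothetical helper for illustration only. Uses Welford's online
    algorithm, so storage does not depend on how many edges are processed.
    """

    def __init__(self, period, first_edge=0.0):
        self.period = period          # assumed known, constant clock period (s)
        self.first_edge = first_edge  # assumed ideal time of edge index 0 (s)
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0                # running sum of squared deviations

    def tie(self, edge_index, measured_time):
        """TIE = measured edge time minus ideal edge time."""
        ideal_time = self.first_edge + edge_index * self.period
        return measured_time - ideal_time

    def update(self, edge_index, measured_time):
        """Fold one measured edge into the running statistics."""
        error = self.tie(edge_index, measured_time)
        self.count += 1
        delta = error - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (error - self.mean)
        return error

    def std(self):
        """Population standard deviation of the TIE values seen so far."""
        return math.sqrt(self._m2 / self.count) if self.count else 0.0


# Usage: four edges of a nominal 1-second clock, two of them displaced.
stats = RunningTieStats(period=1.0)
for index, time in enumerate([0.0, 1.1, 1.9, 3.0]):
    stats.update(index, time)
```

Here `edge_index` is the index of the clock period in which the edge falls, so periods without a transition can simply be skipped. Only three scalars are updated per edge, which is the property that lets a summary statistic be maintained over an arbitrarily deep waveform.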
There are two main purposes for capturing and measuring deep waveforms in this context: (1) to increase the likelihood of capturing rare events that can cause communication errors [1], and (2) to enable estimation of tails in jitter probability distributions, as a replacement for distribution extrapolation or as a means of improving its accuracy [2].

Implementations of timing jitter measurement are available in instruments such as digital oscilloscopes. However, the computation time and memory requirements increase with waveform depth, and so it is desirable to seek methods for faster yet still cost-effective jitter computation from deep waveforms.

To address this problem and help accelerate jitter measurement, researchers have introduced parallel algorithms for constant clock period computation. For example, [3] exploits multi-core processors such as Intel central processing units (CPUs), together with their streaming single instruction multiple data extensions (SSE) [4] instruction sets, to enable fast and accurate jitter measurement. However, this design suffers from large memory requirements and high latency due to its “swallow and wallow” characteristic, whereby the computation is started only after all input data has arrived and has been stored in memory. This limits the amount of signal data that can be measured, and results in a long response time before engineers start seeing measurement results.

Another jitter measurement algorithm was demonstrated in [5] that significantly improves measurement response time by partitioning the overall data set into windows and allowing jitter measurement results to be reported for earlier windows before later windows are received. This reformulation of jitter measurement eliminates the swallow and wallow characteristic, and provides improved speed. However, a memory requirement limitation still remains: the memory required (like that of [3]) is unbounded.
In other words, the memory requirement grows without bound as the size of the data set is increased. This characteristic again limits the amount of signal data that can be measured, which is problematic, for example, when measuring relatively long signals or signals with high sample rates using limited memory resources.

In this paper, we improve the algorithm in [5] to overcome its limitation of having unbounded memory requirements. In the jitter measurement approach proposed in this paper, the memory requirements are fixed for a given system design configuration; in particular, they are independent of the amount of data that is processed when the system operates. This allows processing of unbounded signal streams: the measurement system can process as much data as it receives during a given execution of the system. At the same time, the method proposed in this paper provides significantly