Authors' Pre-review Version. Published Version at: http://dx.doi.org/10.1109/35.989782

High Precision Traffic Measurement by the WAND Research Group

John Cleary*, Ian Graham*, Tony McGregor*, Murray Pearson*, Ilze Ziedins**, James Curtis*, Stephen Donnelly*, Jed Martens*, Stele Martin*

*Department of Computer Science, University of Waikato, Hamilton, New Zealand
**Department of Statistics, University of Auckland, Auckland, New Zealand

1. Introduction

Over recent years the size and capacity of the Internet have continued their exponential growth, driven by new applications and improving network technology. These changes are particularly significant in the New Zealand context, where the high cost of trans-Pacific traffic has mandated that traffic be charged for by volume. This has also led to a significant focus within the New Zealand Internet community on issues of caching and of careful capacity planning.

Approximately three years ago the WAND research group began a program to measure ATM traffic. Sharply constrained by cost, we decided to start by reprogramming some ATM NIC cards. This paper is largely based on our experience as we have broadened this work to include IP-based non-ATM networks and the construction of our own hardware. We have learned a number of lessons in this work, rediscovering along the way some of the hard discipline that all observational scientists must submit to.

Our work continues its emphasis on cheap, reliable equipment built with microprocessor technology. Over time our main concerns have shifted from getting any measurements at all to the reliability and accuracy of those measurements, and to techniques for disseminating and analysing our results. In the main we gather traces of network traffic, store them to disk, and then archive and process them. We have not been directly concerned with other possibilities such as real-time processing and display.
Clearly though, the techniques we have been working with are directly applicable to that problem. The rest of this paper follows the lifetime of data captured from the net, right through to processing and analysing it. At each step of this process there are recurring themes that need to be addressed. The first of these is capacity. The bandwidth of modern networks and the need to capture data over long periods impose stringent demands on both bandwidth and storage capacity at every level. This can be ameliorated by careful use of specialised hardware at critical points and by filtering data so that only what is essential is handed to the next stage.
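As an illustration of how much such filtering can help, the sketch below (not the authors' code; the snap length of 64 bytes and the traffic mix are assumptions for the example) models one common filter, truncating each captured packet to a fixed "snap length" so that only the headers, rather than full payloads, are handed to the storage stage:

```python
# Illustrative sketch: reduce trace volume by keeping only the first
# snap_len bytes of each packet (enough for IP/TCP headers), a common
# filtering step before data is passed to storage and analysis.

SNAP_LEN = 64  # assumed bytes kept per packet; enough for typical headers


def filter_trace(packet_sizes, snap_len=SNAP_LEN):
    """Return (bytes seen on the wire, bytes actually stored)."""
    wire_bytes = sum(packet_sizes)
    stored_bytes = sum(min(size, snap_len) for size in packet_sizes)
    return wire_bytes, stored_bytes


if __name__ == "__main__":
    # Hypothetical mix: 500 small ACK-sized packets and 500 full-size frames.
    sizes = [40] * 500 + [1500] * 500
    wire, stored = filter_trace(sizes)
    print(f"on the wire: {wire} B, stored: {stored} B "
          f"({100 * stored / wire:.1f}% of wire volume)")
```

For this assumed traffic mix the filter stores under 7% of the bytes seen on the wire, which is the kind of reduction that makes long-duration capture on high-bandwidth links feasible with modest disk capacity.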