J2.1 SURFACE DATA INTEGRATION AT NOAA’s NATIONAL CLIMATIC DATA CENTER: DATA FORMAT, PROCESSING, QC, AND PRODUCT GENERATION

Stephen A. Del Greco *, Neal Lott *, Kathy Hawkins, Rich Baldwin, Dee Dee Anders, Ron Ray, Dan Dellinger
NOAA National Climatic Data Center, Asheville, North Carolina

Pete Jones, Fred Smith
TMC Technologies, Fairmont, West Virginia

ABSTRACT

The National Oceanic & Atmospheric Administration (NOAA) National Climatic Data Center (NCDC) acquires, quality controls, archives and provides dissemination services for global meteorological and climatological data. These data are processed with automated Quality Assurance/Control (QA/QC) software, with some of the software being network-specific. For example, the NOAA Automated Surface Observing System (ASOS) Network, National Cooperative Observers Network (COOP), and Climate Reference Network (CRN) are processed using systems unique to their network. Additionally, some of the data undergo additional interactive QC, which involves “visual/manual” inspection of the data. While similar, QA/QC rules and algorithms for like parameters from different observing networks are sometimes not standardized. NCDC’s goal is to integrate surface data into a standard format, and process the data through standardized QA/QC algorithms and procedures. To that end, NCDC has developed a new integrated surface database, called Integrated Surface Data (ISD). To date, numerous historical datasets have been integrated into ISD, with others to follow. Also, NCDC is designing a new QA/QC system, the Integrated Surface Data Processing System (ISDPS), as an end-to-end system for processing in-situ data, where QA/QC is standardized, network independent and based on reporting frequency (hourly, daily, etc.). Together, ISDPS and ISD integrate the QA/QC algorithms into a unified system, use ISD as the input/output format, and provide integrated online products to customers.
This paper briefly describes the ISDPS/ISD methodologies, data format, QC/validation techniques, and the various products available to customers.

1. INTRODUCTION

The development of ISD (previously called ISH, Integrated Surface Hourly) and ISDPS has been an iterative process. This process has included the development of the integrated format, collection of datasets to include in the initial ISD database, development of a data model to use in a relational database for customer servicing, quality control of the historical ISD, development of the end-to-end ISDPS process, and development of online products.

* Corresponding authors' address: Stephen Del Greco, Neal Lott, National Climatic Data Center, 151 Patton Avenue, Asheville, NC 28801; e-mail: Stephen.A.Delgreco@noaa.gov, Neal.Lott@noaa.gov.

2. HISTORICAL DATABASE AND BACKGROUND

The National Climatic Data Center (NCDC), in conjunction with Federal Climate Complex (FCC) partners (US Air Force and Navy), developed the global ISD database to address a pressing need for an integrated global database of surface climatological data. The database of approximately 20,000 stations includes data from as early as 1901 (with many stations beginning in the 1948-1973 timeframe), is operationally updated with the latest data, and is now being used by numerous customers in many varied applications. This effort was made possible by funding from the Environmental Services Data and Information Management (ESDIM) office, the Office of Global Programs (OGP), and extensive contributions from member agencies in the FCC.

The development of ISD Version 1 and Version 2 is now complete:

1) ISD Version 1 -- the “database build” phase produced ISD Version 1 by integrating various data sources into one set of data. The new database combines NCDC U.S. hourly data, U.S. Navy hourly data, NCDC U.S. hourly precipitation data, and Air Force global hourly and synoptic data into one global database.
These data sources included over 100 original “tapedecks” (as they were called many years ago) and formats, each having already been quality-controlled to various degrees. The building of the database involved extensive research, data format conversions, time-of-observation conversions, and development of extensive metadata to drive the processing and merging. This included the complex handling of input data stored in three different station-numbering/ID systems.

2) ISD Version 2 -- two phases of quality control produced ISD Version 2. Phase one involved the correction of errors identified after the “database build” phase (e.g., due to input data file problems). Phase two involved the research, development, and programming of algorithms to correct random and systematic errors in the data, to improve the overall quality of the database;