Data Quality: Developments and Directions

Bhavani Thuraisingham* and Eric Hughes
The MITRE Corporation, Bedford MA, USA
*On Leave at the National Science Foundation, Arlington VA, USA

Abstract: This paper first examines various issues in data quality and provides an overview of current research in the area. It then focuses on research at the MITRE Corporation on using annotations to manage data quality. Next, some emerging directions in data quality are discussed, including managing quality for the semantic web and the relationships between data quality and data mining. Finally, some directions for data quality are provided.

Key words: Data Quality, Annotations, Semantic Web, Data Mining, Security

1. INTRODUCTION

There has been considerable research and development on data quality over the past two decades or so. Data quality is about understanding whether data is good enough for a given use. Quality parameters include the accuracy, the timeliness, and the precision of the data.

Data quality has received increased attention since the Internet and E-Commerce revolution. Organizations now have to share data that comes from all kinds of sources and is intended for various uses, so it is critical that they have some idea of the quality of that data. Furthermore, heterogeneous databases have to be integrated within and across organizations, which makes data quality even more important.

Another reason for the increased interest in data quality is warehousing. Many organizations are developing data warehouses, and data is brought into a warehouse from multiple sources. The sources are often inconsistent, so the data in the warehouse could appear to be useless to users.

M. Gertz et al. (eds.), Integrity, Internal Control and Security in Information Systems © Springer Science+Business Media New York 2002