Locating some types of random errors in Digital Terrain Models [1]

Carlos López
Environmental and Natural Resources Information Systems
Royal Institute of Technology
Stockholm 100 44, Sweden [2]

Abstract: The increasing use of Geographic Information System applications has generated strong interest in the assessment of data quality. As an example of quantitative raster data, we analyzed errors in Digital Terrain Models (DTM). Errors may be classified as systematic (strongly dependent on the production methodology) or random. The present work attempts to locate some types of randomly distributed, weakly spatially correlated errors by applying a new methodology based on Principal Components Analysis. The Principal Components approach presented here is very different from the typical scheme used in image processing. A prototype was implemented in MATLAB, and the overall procedure was tested numerically using a Monte Carlo approach. A DTM of Stockholm, with integer-valued heights ranging from 0 to 59 m, was used as a testbed. The model was contaminated by adding randomly located errors, distributed uniformly between -4 m and +4 m. The procedure was applied to both spike-shaped (isolated) errors and pyramid-like errors. The preliminary results show that, for the former, roughly half of the errors are located by checking 1 per cent of the dataset, with a type I error probability of 4.6% on average. The associated type II error for the larger errors (of exactly +4 m or -4 m) drops from an initial value of 1.21% to 0.63%. Checking another 1 per cent of the dataset lowers this error to 0.34%, implying that about 71% of the ±4 m errors have been located; the type I error was below 11.27%. The results for pyramid-like errors are slightly worse, with a type I error of 25.80% on average for the first 1 per cent of effort, and a type II error that drops from an initial value of 0.81% to 0.65%.
The procedure can be applied both for error detection during DTM generation and by end users, and it might be of use for other kinds of quantitative raster data.

[1] Published in International Journal of Geographical Information Science, 11, 7, 677-698, 1997
[2] Permanent address: Centro de Cálculo, Facultad de Ingeniería, Universidad de la República, CC 30, Montevideo, Uruguay

I Introduction:

Data quality has become an important aspect of Geographic Information Systems (GIS) applications. John (1993) stated that "...very wrong answers can be derived using perfectly logical GIS analysis techniques, if the user is not aware of the particular peculiarities of their data..." Although this statement holds for any kind of data, we will concentrate here on the case of Digital Terrain Models (DTM). We will not consider errors in the intermediate steps of the DTM generation process; we will concentrate instead on the errors in the final product. Östman (1987) pointed out that there exists no unique criterion or single measure for the "quality" of a DTM. He suggested that, at least, one should consider accuracy in height,