Abstract: Categorical data derived from descriptions of the agricultural landscape impose mathematical and analytical limitations. These limitations can be overcome by transforming the data through a coding scheme and applying a non-parametric multivariate approach. The present study describes the transformation of qualitative data into numerical descriptors. From 103 random soil samples collected over a 60-hectare field, categorical data were obtained for the following variables: levels of nitrogen, phosphorus and potassium; pH; hue, chroma and value; topography; vegetation type; and the presence of rocks. The categorical data were coded, and Spearman's rho correlations were then calculated using PAST software ver. 1.78; Principal Component Analysis was based on these correlations. Results revealed successful data transformation, generating 1030 quantitative descriptors. Visualization based on the new set of descriptors showed clear differences among sites, and the amount of variation was successfully measured. Possible applications of data transformation are discussed.

Keywords: data transformation, numerical descriptors, principal component analysis

I. INTRODUCTION

Qualitative descriptions are common during rapid appraisal of the landscape. A qualitative variable such as topography has categories, e.g. flat, slightly rolling, hilly and steep. A particular spot in the landscape can be described as low, medium or high in phosphorus. Biochemical and physical variables in nature can thus be described qualitatively, but problems often arise when such data are used in classification and in delineating boundaries. This is not a problem when the landscape is classified on a single variable; complications appear only when all variables are integrated, which is often the case in landscape evaluation. There are a number of limitations identified from using qualitative descriptions.
First, the information does not lend itself to statistical testing; second, patterns of variation cannot be deciphered and measured objectively; third, it is difficult to identify the factors contributing to large spatial variation; and lastly, hyper-variation may result if several qualitative variables are included in landscape classification, since every variable added contributes to the variation.

With the application of a numerical coding technique, categorical data obtained from qualitative measurements can be processed statistically using non-parametric tests. The concepts of non-parametric testing and numerical taxonomy are popular in the field of biology, and their application was extended to soil science in the second half of the 20th century. Sarkar enumerated the characteristics for inclusion in the numerical taxonomy of soils [1]; Rayner and Grigal used numerical classification of soils in forest areas [2, 3]. Goodall, in his ecological studies, pioneered the use of factor analysis, a non-parametric and multivariate technique [4].

D. A. Apuan is with the Department of Agricultural Sciences, College of Agriculture, Xavier University, Cagayan de Oro City, Philippines (telephone: (088) 858 3116 loc. 3100; e-mail: dennis_apuan@yahoo.com).

The current study deals with the transformation of categorical data generated from qualitative measurements into quantitative descriptors. The technique involves numerical coding of categorical data, similar to the dummy variables described by Field [5]. The quantitative descriptors are a new set of data generated on the basis of Wilcoxon's ranking and the least squares method. Through a non-parametric multivariate technique known as Principal Component Analysis (PCA), patterns of variation can be observed and measured using the new numerical descriptors. The study specifically shows the numerical coding scheme applied and the export of the coded data to PAST (Paleontological Statistics) software version 1.78.
Extraction of the new descriptors and the implementation of PCA, including estimates of variation and its pattern, are carried out with this software. The applications of the technique in rapid appraisal of the ecosystem landscape and in agriculture are emphasized, especially its possible use for site-specific intervention.

II. MATERIALS AND METHODS

The 60-hectare field of Manresa Research Station in Cagayan de Oro, Philippines was chosen as the sampling site because of the naturally contrasting variation of its landscape and the variation caused by agricultural treatments and land use. Eleven different sites were sampled, yielding a total of 103 random soil samples. A soil test kit was used to analyze the samples and obtain qualitative measurements of the following variables: nitrogen, phosphorus, potassium and pH. The kit has limitations and gives only categorical readings such as low, medium and high. The color variables (chroma, value and hue) were measured using the Munsell soil color chart. Other variables measured were topography, presence or absence of rocks, and type of vegetation. Categories within each of these variables were recorded during the field visits.

Landscape Data Transformation: Categorical Descriptions to Numerical Descriptors. Dennis A. Apuan. World Academy of Science, Engineering and Technology, International Journal of Biological, Biomolecular, Agricultural, Food and Biotechnological Engineering, Vol:5, No:9, 2011, p. 512. International Scholarly and Scientific Research & Innovation 5(9) 2011, scholar.waset.org/1999.1/1076; waset.org/Publication/1076
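
The coding and correlation steps described in the Materials and Methods can be illustrated with a minimal Python sketch. It is not the procedure implemented in PAST, which the study actually used. The integer codes in `CODES`, the three variables shown and the five example samples are hypothetical, since the paper's coding tables are not reproduced here. The sketch codes the categories as ordered integers, ranks each coded column with ties averaged, computes Spearman's rho as the Pearson correlation of the ranks, and estimates the first principal component of the rho matrix by power iteration, a stand-in chosen only to keep the example free of external libraries.

```python
from math import sqrt

# Hypothetical coding scheme: these ordinal codes are illustrative assumptions,
# not the codes used in the study.
CODES = {
    "nitrogen":   {"low": 1, "medium": 2, "high": 3},
    "phosphorus": {"low": 1, "medium": 2, "high": 3},
    "topography": {"flat": 1, "slightly rolling": 2, "hilly": 3, "steep": 4},
}

def encode(samples):
    """Turn a list of {variable: category} records into columns of integer codes."""
    return {var: [table[s[var]] for s in samples] for var, table in CODES.items()}

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j get one averaged rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for a constant column."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_matrix(columns):
    """Spearman's rho: Pearson correlation of the rank-transformed columns."""
    ranked = [average_ranks(col) for col in columns.values()]
    return [[pearson(a, b) for b in ranked] for a in ranked]

def first_component(matrix, iterations=200):
    """Leading eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    vec = [1.0] * len(matrix)
    value = 0.0
    for _ in range(iterations):
        prod = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        value = sqrt(sum(p * p for p in prod))
        vec = [p / value for p in prod]
    return value, vec

# Five made-up samples, for illustration only.
samples = [
    {"nitrogen": "low",    "phosphorus": "low",    "topography": "flat"},
    {"nitrogen": "medium", "phosphorus": "low",    "topography": "slightly rolling"},
    {"nitrogen": "high",   "phosphorus": "medium", "topography": "hilly"},
    {"nitrogen": "high",   "phosphorus": "high",   "topography": "steep"},
    {"nitrogen": "low",    "phosphorus": "medium", "topography": "flat"},
]

columns = encode(samples)
rho = spearman_matrix(columns)
eigenvalue, loadings = first_component(rho)
# The trace of a correlation matrix equals the number of variables, so this
# ratio is the share of total variation carried by the first component.
share = eigenvalue / len(rho)
print(columns)
print(share, loadings)
```

With the study's full data, `columns` would hold ten coded variables of 103 values each, which matches the 1030 descriptors reported in the abstract. A dedicated statistics package would normally replace the hand-written ranking and power iteration used here.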