Available online at www.scholarsresearchlibrary.com
Scholars Research Library
Annals of Biological Research, 2014, 5 (12):16-20
(http://scholarsresearchlibrary.com/archive.html)
ISSN 0976-1233
CODEN (USA): ABRNBW

Analysis strategy for comparison of skewed outcomes from biological data: A recent development

Devika Shanmugasundaram 1, L. Jeyaseelan 1*, Sebastian George 2 and Geevar Zachariah 3

1 Department of Biostatistics, Christian Medical College, Vellore, India
2 Department of Statistics, St. Thomas College, Palai, Kerala, India
3 Chief Cardiologist, Mother Hospital, Thrissur, Kerala, India
_____________________________________________________________________________________________

ABSTRACT

When comparing positively skewed outcome values, data transformations such as the log and square root are often used. However, this approach suffers from difficulty of interpretation and loss of accuracy: the mean can be back-transformed to the original scale, but the standard deviation cannot. This paper presents the analysis of positively skewed outcome data using the generalized pivotal approach and the log transformation approach for lognormally distributed data. A simulation experiment was conducted to examine the characteristics of the generalized pivotal approach for small sample sizes and for large standard deviations. For comparing positively skewed biological data between two groups, the generalized p value and confidence interval approach for the lognormal distribution is considered efficient, as it provides direct statistical inference in the form of estimates, 95% CIs and p values.

Key words: Positively skewed distribution, normal distribution, lognormal distribution, log transformation, generalized p value, generalized confidence intervals.
_____________________________________________________________________________________________

INTRODUCTION

In many biomedical studies, researchers are interested in estimating the difference between two sample means. One way to test whether the two means differ is the independent-samples t-test. However, the two-sample t-test is appropriate only if the observations are normally distributed. Many biological variables encountered in medical research, such as triglyceride levels, skinfold thickness and serum bilirubin levels, are positively skewed.

Data transformations are frequently used to normalize such data. Bland and Altman noted that the logarithmic transformation is frequently used for skewed outcomes because it gives a nearly normal distribution [1]. The analysis is performed on the transformed scale, and the results are then back-transformed to the original scale. However, this does not lead to a reasonable estimate of the mean on the original scale, because back transformation yields the geometric mean of the original data rather than the arithmetic mean [2]. In vaccine and immunogenicity studies, antibody titre values are log transformed and the results are summarized in terms of the geometric mean titre or the geometric mean ratio [3]. The antilog of the arithmetic mean computed on the log scale (the geometric mean) is therefore readily interpretable, but no straightforward interpretation is available for the antilog of the standard deviation of the logged values [4].

Consider the example given in [5]: the mean (SD) of the triglyceride values in the original data was 0.51 (0.22) mmol/l, and the mean (SD) of the log transformed data was -0.33 (0.17). Back-transforming the mean on the log scale gives 0.47 mmol/l, which is the geometric mean, but the standard deviation on the log scale cannot be back-transformed. Likewise, the confidence interval for the mean of the original data cannot be recovered from the confidence interval for the mean of the logged data [6].
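The back-transformation point above can be illustrated with a minimal sketch. The data values and the function name `back_transformed_summary` are hypothetical, chosen only for illustration. Base-10 logarithms are assumed because the figures quoted from [5] are consistent with them (the antilog of -0.33 in base 10 is about 0.47). The sketch suggests that the antilog of the mean of the logged values reproduces the geometric mean, which for skewed data lies below the arithmetic mean.

```python
import math
import statistics


def back_transformed_summary(values):
    """Summarize positive data on the original and log10 scales.

    Hypothetical helper for illustration; not taken from the paper.
    """
    logs = [math.log10(v) for v in values]
    mean_log = statistics.mean(logs)
    return {
        "arithmetic_mean": statistics.mean(values),
        # Antilog of the mean of the logs is the geometric mean.
        "geometric_mean": 10 ** mean_log,
        "mean_log10": mean_log,
        "sd_log10": statistics.stdev(logs),
    }


# Hypothetical positively skewed measurements (not data from the paper).
sample = [0.25, 0.31, 0.38, 0.44, 0.52, 0.61, 0.75, 1.10]
summary = back_transformed_summary(sample)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")

# Back-transforming the log-scale mean quoted in the text, assuming base 10.
print(round(10 ** -0.33, 2))  # 0.47, the geometric mean reported in the text
```

The value of `sd_log10` is left on the log scale on purpose: taking its antilog would give a multiplicative factor, not a standard deviation in mmol/l.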
To avoid these