another missing value mechanism, missing at random (MAR). For
quantitative responses, statistical methods, including linear and
nonlinear models, are established for correlated data. However, for
partially correlated data there are concerns that need to be addressed
because of the complexity of the analysis, in particular for small
sample sizes and when the normality assumption for the underlying
populations is not valid.
As an example of partially correlated data for the MCAR design,
consider the case where the researcher compares two different
treatment regimens for eye redness or allergy and randomly assigns
one treatment to each eye for each experimental subject. Some patients
may drop out after the first treatment, while other patients may drop
out before the first treatment and come back for the second treatment.
In this situation, we may have two groups of patients: the first group,
who received both treatments, one in each eye, and are considered
paired (matched) data; and the second group, who received only one of
the treatments in one of the eyes, and are considered unmatched data.
Additional examples of partially correlated data can be found in the
literature [4–6].
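To make the structure of such data concrete, the following minimal Python sketch separates subjects into matched pairs and unmatched observations. The `Subject` record, its field names, and the scores are hypothetical and serve only as an illustration; a missing response is encoded as `None`.

```python
from typing import List, NamedTuple, Optional, Tuple


class Subject(NamedTuple):
    # Hypothetical record: one response per treatment, None if missing.
    treatment_a: Optional[float]
    treatment_b: Optional[float]


def split_partially_paired(
    subjects: List[Subject],
) -> Tuple[List[Tuple[float, float]], List[float], List[float]]:
    """Split subjects into matched pairs and the two unmatched groups."""
    pairs, only_a, only_b = [], [], []
    for s in subjects:
        if s.treatment_a is not None and s.treatment_b is not None:
            pairs.append((s.treatment_a, s.treatment_b))  # matched data
        elif s.treatment_a is not None:
            only_a.append(s.treatment_a)  # unmatched, treatment A only
        elif s.treatment_b is not None:
            only_b.append(s.treatment_b)  # unmatched, treatment B only
        # Subjects with neither response carry no information; skip them.
    return pairs, only_a, only_b


# Illustrative (made-up) scores for four subjects.
subjects = [
    Subject(2.0, 1.5),
    Subject(3.0, None),
    Subject(None, 2.5),
    Subject(1.0, 1.0),
]
pairs, only_a, only_b = split_partially_paired(subjects)
```

The three lists returned here correspond to the matched group and the two unmatched groups described above.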
Several authors have presented various tests
considering the problem of estimating the difference of means of a
bivariate normal distribution when some observations corresponding
to both variables are missing. Under the assumption of bivariate
normality and MCAR, Ekbohm [7] summarized five procedures for
testing the equality of two means. Using Monte Carlo results, Ekbohm [7]
indicated that the two tests based on a modified maximum likelihood
estimator are preferred: one due to Lin and Stivers [8] when the number
of complete pairs is large, and the other proposed in Ekbohm’s
paper otherwise, provided the variances of the two responses do not
differ substantially. When the correlation coefficient between the
two responses is small, two other tests may be used: a test proposed
by Ekbohm when the homoscedasticity assumption is not strongly
violated, and otherwise a Welch-type statistic suggested by Lin and
Stivers [8] (for further discussion, see Ekbohm [7]).
Alternatively, researchers tend to ignore some of the data, either
the correlated or the uncorrelated data, depending on the size of each
subset. However, when the data are not missing completely at random
(MCAR), Looney and Jones [9] argued that ignoring some of the
correlated observations would bias the estimation of the variance of
the difference in treatment means and would dramatically affect the
performance of the statistical test in terms of controlling type I error
rates and statistical power [10]. They proposed a corrected z-test method
to overcome the challenges created by ignoring some of the correlated
observations. However, our preliminary investigation shows that the
method of Looney and Jones [9] pertains to large samples and is not
the most powerful test procedure. Furthermore, Rempala & Looney [11]
studied asymptotic properties of a two-sample randomized test for
partially dependent data. They indicated that a linear combination of
randomized t-tests is asymptotically valid and can be used for
non-normal data. However, the large-sample permutation tests are
difficult to perform and only have some optimal asymptotic properties
in the Gaussian family of distributions when the correlation between
the paired observations is positive. Other researchers, such as
Xu & Harrar [12] and Konietschke et al. [13], also discuss the problem
for continuous variables, including the normal distribution, by using
weighted statistics. However, the procedure suggested by Xu & Harrar [12]
is a functional smoothing of the Looney & Jones [9] procedure. As such,
the Xu & Harrar procedure is not a practical alternative for the
non-statistician researcher. The procedure suggested by Konietschke et
al. [13] is a nonparametric procedure based on ranking.
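As an illustration of how a z-type statistic can use all of the observations, suppose n complete pairs are supplemented by unmatched observations, giving n_x values of the first response and n_y values of the second. The variance of the difference between the two overall sample means is then σ_x²/n_x + σ_y²/n_y − 2nσ_xy/(n_x n_y), because only the n paired observations contribute to the covariance. The sketch below plugs sample estimates into this expression. It is written in the spirit of a corrected z-test and is not necessarily the exact statistic of Looney and Jones; the function name, inputs, and data are assumptions, and the normal reference distribution is a large-sample approximation.

```python
import math
from statistics import NormalDist, mean, variance


def corrected_z(pairs, only_x, only_y):
    """Z-type test of equal means using paired and unpaired data.

    pairs  : list of (x, y) tuples from subjects with both responses
    only_x : x values from subjects missing y
    only_y : y values from subjects missing x
    Returns (z, two-sided p-value) under a normal approximation.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs for the covariance")
    px = [x for x, _ in pairs]
    py = [y for _, y in pairs]
    xs = px + list(only_x)  # all available x values
    ys = py + list(only_y)  # all available y values
    nx, ny = len(xs), len(ys)

    # Sample covariance estimated from the complete pairs only.
    mx, my = mean(px), mean(py)
    s_xy = sum((x - mx) * (y - my) for x, y in pairs) / (n - 1)

    # Estimated variance of mean(xs) - mean(ys).
    var_diff = variance(xs) / nx + variance(ys) / ny - 2 * n * s_xy / (nx * ny)
    if var_diff <= 0:
        raise ValueError("non-positive variance estimate")

    z = (mean(xs) - mean(ys)) / math.sqrt(var_diff)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Illustrative (made-up) data.
pairs = [(2.0, 1.5), (3.0, 2.0), (4.0, 3.5), (5.0, 3.0)]
z, p = corrected_z(pairs, only_x=[3.5, 4.5, 2.5], only_y=[2.0, 1.0, 3.0])
```

When there are no unmatched observations, the statistic reduces to the usual paired z-statistic based on the within-pair differences.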
Samawi & Vogel [14] presented a weighted test procedure to combine
the correlated and non-correlated data. The aforementioned methods
cannot be used for non-normal data with small or moderate sample sizes,
or for categorical data. Samawi & Vogel [15] introduced several weighted
tests for the case where the variables of interest are categorical. They
showed that their test procedures compete with other tests in the
literature. Moreover, there have been several attempts to provide
nonparametric test procedures under MCAR and MAR designs [1–3,16,17].
However, there is still a need for intensive investigation to develop
more powerful nonparametric testing procedures for MCAR and MAR designs.
Samawi et al. [18] discussed and proposed some nonparametric testing
procedures to handle partially correlated data without ignoring the
cases with missing responses. They introduced a more powerful testing
procedure that combines all cases in the study. All of the procedures
suggested above will be of special importance in meta-analysis, where
partially correlated data is a concern when combining results of
various studies.
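The idea of weighting, which underlies several of the procedures cited above, can be illustrated by combining a paired-data statistic with an unpaired-data statistic. Because the matched and unmatched subjects are independent, sqrt(γ)·Z_paired + sqrt(1 − γ)·Z_unpaired is approximately standard normal under the null hypothesis in large samples for any fixed γ in [0, 1]. The sketch below is a generic illustration under these assumptions and is not the specific statistic of any of the cited papers; the function names, the data, and the default choice of γ (the fraction of subjects that are matched) are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def paired_z(pairs):
    """Z-statistic from within-pair differences (matched subjects)."""
    d = [x - y for x, y in pairs]
    return mean(d) / math.sqrt(variance(d) / len(d))


def unpaired_z(only_x, only_y):
    """Welch-type z-statistic from the two unmatched groups."""
    se2 = variance(only_x) / len(only_x) + variance(only_y) / len(only_y)
    return (mean(only_x) - mean(only_y)) / math.sqrt(se2)


def weighted_z(pairs, only_x, only_y, gamma=None):
    """Weighted combination of the paired and unpaired statistics.

    gamma is the weight given to the paired statistic; by default
    (an assumption made here) it is the fraction of matched subjects.
    Returns (z, two-sided p-value) under a normal approximation.
    """
    if gamma is None:
        gamma = len(pairs) / (len(pairs) + len(only_x) + len(only_y))
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    z = (math.sqrt(gamma) * paired_z(pairs)
         + math.sqrt(1.0 - gamma) * unpaired_z(only_x, only_y))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Illustrative (made-up) data.
pairs = [(2.0, 1.5), (3.0, 2.0), (4.0, 3.5), (5.0, 3.0)]
only_x = [3.5, 4.5, 2.5]
only_y = [2.0, 1.0, 3.0]
z, p = weighted_z(pairs, only_x, only_y)
```

Setting `gamma=1.0` recovers the paired analysis that ignores the unmatched subjects, and `gamma=0.0` recovers the unpaired analysis that ignores the matched subjects.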
Acknowledgments
None.
Conflicts of interest
The authors declare that there are no conflicts of interest.
Funding
None.
Biom Biostat Int J. 2015;2(1):5‒6.
©2015 Samawi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which
permits unrestricted use, distribution, and build upon your work non-commercially.
On inference of partially correlated data
Volume 2 Issue 1 - 2015
Hani Samawi, Robert Vogel
Department of Biostatistics, Georgia Southern University, USA
Correspondence: Hani Samawi, Department of Biostatistics,
JPHCOPH, Georgia Southern University, Statesboro, GA 30460,
USA, Tel 912-478-1345, Fax 912-478-5811
Received: January 22, 2015 | Published: January 26, 2015
Biometrics & Biostatistics International Journal
Editorial
Open Access
Editorial
The need for statistical inferential methods in the social, behavioral,
economic, biological, medical, epidemiologic, health, public health,
and drug developmental sciences has grown exponentially in
the last few decades. Study designs in the aforementioned applied
sciences give rise to correlated and partially correlated data due to
missing responses. For instance, correlated data arise when subjects
are matched to controls because of confounding factors and there are
missing values in either or both groups. Other situations arise when
subjects are repeatedly measured over time, as in repeated measures
designs. One assumption to consider is that observations are missing
completely at random (MCAR) [1,2]. However, Akritas et al. [3] consider