P.K. Rai et al, International Journal of Computer Science and Mobile Computing, Vol. 3, Issue 10, October 2014, pg. 595-604

© 2014, IJCSMC All Rights Reserved

Available Online at www.ijcsmc.com

International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320-088X

RESEARCH ARTICLE

Unsupervised Learning on Cosmic Ray Daily Harmonic Variations

Roopesh K. Dwivedi, P.K. Rai*
A.P.S. University, Rewa (M.P.), India
* pkrapsu@gmail.com

Abstract: Clustering is the division of data into groups of similar objects. From a machine learning perspective, clusters correspond to hidden patterns. In unsupervised learning, we find clusters that represent a data concept. Since scientific organizations also generate large volumes of data, the challenge is to analyze the data using recent data mining techniques so as to arrive at meaningful conclusions. As a real-life application, we have used the hourly cosmic ray intensity data from 1965 to 2006 to first derive, for each day, the amplitude and phase of the first and second harmonics of the daily variation (r₁, φ₁ and r₂, φ₂). We have applied the k-means partitioning algorithm, the agglomerative hierarchical clustering algorithm BIRCH, and the density-based partitioning algorithm DBSCAN to this set of daily data containing r₁, φ₁, r₂ and φ₂ for each day. Many interesting clusters have been identified. The cluster analysis indicates that a very clear-cut 10-11 year periodicity is observed in the harmonics dataset even when all four attributes are considered together. Moreover, similar characteristics are repeated after a gap of 10-11 years, and many years occurring in pairs in two of the four sets (each set spanning about 10-11 years) are outlier years. The years 1996 and 1997 are particularly emphasized as outliers.
These results are similar to those reported in the literature, although those were obtained by statistical methods and by considering only r₁ and φ₁ rather than all four attributes taken together. As such, the superiority of the mining technique in real-life situations is revealed.

Key Words: Clustering, Data mining, K-means, BIRCH, DBSCAN, Cosmic ray harmonic

1. Introduction

The process of grouping a set of physical or abstract objects is called clustering [JMF99]. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters [D93] [E93]. As a branch of statistics, cluster analysis has been studied extensively for many years. In cluster analysis, the main focus is on distance-based cluster analysis [M96]. Many statistical analysis software packages or systems have built-in features for cluster analysis and are used as cluster analysis tools. These