International Journal of Computer Applications (0975-8887), Volume 27, No. 9, August 2011

Using Symmetric Multiprocessor Architectures for High Performance Computing Environments

Mohsan Tanveer, Dept. of SE, Foundation University, Institute of Engineering and Management Sciences (FUIEMS), Rawalpindi, Pakistan
M. Aqeel Iqbal, Dept. of SE, Foundation University, Institute of Engineering and Management Sciences (FUIEMS), Rawalpindi, Pakistan
Farooque Azam, Dept. of CE, College of Electrical & Mechanical Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan

ABSTRACT
Performance enhancement for high-speed computing can be achieved with many techniques and architectures at the software and hardware levels. Hardware techniques may include the use of multiple computing nodes or of a single node consisting of multiple processors. The symmetric multiprocessor is one of the modern architectures used to perform extensive computations. Symmetric multiprocessors have many configuration modes for carrying out these heavy computations. The performance of symmetric multiprocessors is analyzed and compared with high-fidelity models. Processor models are used to design and construct the architectures of symmetric multiprocessors. In this research paper, such critical design aspects of symmetric multiprocessors are analyzed with a view to further enhancing the existing technology.

1. SYMMETRIC MULTI-PROCESSORS
The demand for processing power is growing day by day. Execution speed and efficiency can be increased in several ways: enhancing the CPU programming by inserting new programming features, adding new registers to the microprocessor model, and grouping CPUs together [9]. The first two options require chip improvements, whereas the third can substantially increase the processing power without them.
The "CPU grouping" approach is also the more affordable one, for two reasons. First, enhancing the CPU programming would require more effort to integrate the new programs and registers. Second, with multiple processors the computer can keep working when one processor is faulty, which extends its useful life. Hence, in designing a new commercial-scale supercomputer, we have a choice: rely on internal changes to the CPU, or combine multiple processors (CPUs).

Symmetric multiprocessing is a case of parallel multiprocessing [7], [8]. In a symmetric multiprocessing system all processors behave identically, and the kernel of the operating system can assign any process to any processor. A single instance of the operating system manages all processors. Applications have uniform access to memory and I/O. Such operating systems are more specialized and more complex than typical operating systems.

Figure-1: Heterogeneous, Asymmetric Multi-processing (AMP), Symmetric Multi-processing (SMP)

To gain the maximum advantage from symmetric multiprocessing, we require additional synchronization code for shared data structures, both to maintain their consistency and to balance the workload between the multiple threads running on multiple processors [8], [9]. On a multiprocessor, scheduling is multi-dimensional: the scheduler must allocate each process to a CPU for execution. This complicates the processing paths and signals of multiprocessors. Efficient multiprogramming is therefore required to exploit the full processing capacity.

Each processor in a symmetric multiprocessor has its own front-side bus, which gives it an advantage over cores. The scalability of symmetric multiprocessors can be increased by using a mesh architecture. SMP is one of the earliest types of computer architecture and is mostly used with up to 8 processors. These processors share a common main memory and I/O. One or more microcontrollers control the data flow between the processors and main memory [6].
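The synchronization and load-balancing requirement described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `run_workers`, the squaring workload, and the choice of four workers are all hypothetical. Threads stand in for processors, a shared queue balances the work dynamically (an idle worker simply takes the next item), and a lock keeps the shared total consistent.

```python
import queue
import threading


def run_workers(tasks, num_workers=4):
    """Process `tasks` on `num_workers` threads.

    Returns (total, counts), where `total` is the sum of all results and
    counts[i] is the number of tasks that worker i handled.
    """
    work = queue.Queue()          # thread-safe queue shared by all workers
    for item in tasks:
        work.put(item)

    total = 0                     # shared data, protected by `lock`
    counts = [0] * num_workers    # each worker writes only its own slot
    lock = threading.Lock()

    def worker(wid):
        nonlocal total
        while True:
            try:
                item = work.get_nowait()   # take the next available task
            except queue.Empty:
                return                     # no work left for this worker
            result = item * item           # stand-in for a real computation
            with lock:                     # synchronized update of shared data
                total += result
            counts[wid] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return total, counts


if __name__ == "__main__":
    total, counts = run_workers(range(100))
    print(total, counts)
```

The total is the same on every run, while the per-worker counts may differ from run to run because the distribution of work depends on how the threads happen to be scheduled.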
Each processor has a dedicated cache to reduce latency, so data brought into a processor's registers can be transferred from its cache rather than from main memory. This raises a question: the same data may be cached by multiple processors at once. To handle this situation there is a mechanism called cache coherence, which ensures that each processor works on the most recent copy of the data. The basic architecture we use for coherence is the snoopy bus architecture (discussed later).
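To make the coherence problem concrete, the following Python sketch models a simple write-through, write-invalidate snooping scheme. This is an illustrative assumption, not the protocol the paper goes on to describe; the class names `Bus` and `Cache` and their methods are hypothetical. Every write is announced on the shared bus, and each other cache that "snoops" the announcement drops its own copy, so its next read fetches the recent value from main memory.

```python
class Bus:
    """Shared bus connecting all caches to main memory (simplified model)."""

    def __init__(self):
        self.memory = {}   # main memory: address -> value
        self.caches = []   # every cache attached to (snooping on) the bus

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_invalidate(self, addr, source):
        # Every cache except the writer sees the write and reacts to it.
        for cache in self.caches:
            if cache is not source:
                cache.snoop_invalidate(addr)


class Cache:
    """Private per-processor cache; only valid lines are stored."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}    # address -> cached value
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:
            # Miss: fetch from main memory (unwritten addresses read as 0).
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_invalidate(addr, self)  # invalidate other copies
        self.lines[addr] = value                   # update the local copy
        self.bus.memory[addr] = value              # write through to memory

    def snoop_invalidate(self, addr):
        self.lines.pop(addr, None)                 # drop the stale copy


if __name__ == "__main__":
    bus = Bus()
    cpu0, cpu1 = Cache(bus), Cache(bus)
    cpu0.write(0x10, 1)
    print(cpu1.read(0x10))   # cpu1 misses and loads the value written by cpu0
    cpu1.write(0x10, 2)      # invalidates the copy held by cpu0
    print(cpu0.read(0x10))   # cpu0 misses again and sees the new value
```

Write-through keeps the sketch short because main memory always holds the latest value. Write-back designs avoid that memory traffic but need extra per-line state to track which cache holds a modified copy.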