Establishing a Data Warehouse for patients on ART in Botswana Charles S. Snyman a , Philippe Boucher, b , Suzanne Cloutier c , John A Puvimanasinghe a , Ndwapi Ndwapi a a Botswana Harvard Partnership PEPFAR program, seconded to Ministry of Health, Botswana b Knowledge Information and Sharing, World Health Organisation, Switzerland c Informatics and IT Section, BOTUSA Project, Botswana Abstract: The Botswana government launched a program to make Anti-RetroViral Therapy available to all eligible HIV/AIDS patients. In order to monitor this program, it was decided to establish a data warehouse. The development of this data warehouse follows the Kimball Methodology. Using this methodology, a business matrix and dimensional model was defined. In parallel to this a fast-track development was initiated to extract and load data from the 4 largest hospitals and produce reports even before the formal design was completed. This has resulted in a number of benefits. The patients’ data are fragmented over any number of source data systems all over the country. To integrate the data, patient identifying information needs to be transported to and stored in the data warehouse. This raised the issue of privacy, confidentiality and security and special attention had to be given in the design process to ensure compliance. The situation regarding HIV/AIDS in Botswana is complex with many stake-holders and expatriates involved. This requires patience and sensitivity but can lengthen the development process. Development plans should compensate for this, and also provision for the buying in of skills for system administration / system engineering and data base administration. BACKGROUND AND REQUIREMENT The Botswana government recognised the impact of HIV/AIDS on their country and launched the MASA (New Dawn) program to roll-out Anti- Retroviral Treatment (ART) to eligible patients. Botswana has a small population (1.7m) but a high HIV infection rate estimated at around 17% for the total population. More significant is the estimate that 33% in the age group 15 - 49 (adult group) are HIV positive [1]. Based on the above figures, the target population for ART are estimated to range from 120,000 to 150,000 patients. Currently around 60,000 patients are on ART in the public sector. To set-up a data warehouse (DW) to cater for only 120,000 patients might look like an overkill. Assume 120,000 on ART and a DW lifespan of 10 years, then simple multiplication quickly tell a different story as Figure 1 illustrates. Number of patients on ARV 120,000 Number of visits per yr/pt 4 Total visit records per yr 480,000 Data warehouse life cycle (yrs) 10 Total number of visit records 4,800,000 Number of Pharmacy visits/yr/pt 12 Number of drugs/visit/pt 3 Total number of drug records/yr 4,320,000 Data warehouse life cycle (yrs) 10 Total number of drug records 43,200,000 Number of Lab test/pt/yr 50 Total Lab records per yr 6,000,000 Data warehouse life cycle (yrs) 10 Total number of lab records 60,000,000 Figure 1. DW records over 10 yr period The source data for patients on ART are collected in two very different systems. The one system is a