System Administrators are Users, Too: Designing Workspaces for Managing Internet-Scale Systems

Rob Barrett
IBM Almaden Research Center
650 Harry Rd
San Jose, CA 95120 USA
+1 214 252 0818
barrett@almaden.ibm.com

Yen-Yang Michael Chen
445 Soda Hall
Computer Science Division
Univ. of California at Berkeley
Berkeley, CA 94720 USA
+1 510 643 9435
mikechen@cs.berkeley.edu

Paul Maglio
IBM Almaden Research Center
650 Harry Rd
San Jose, CA 95120 USA
+1 408 927 2857
pmaglio@almaden.ibm.com

THE TOPIC

Administrators as Users

Most human-computer interaction work has focused on the end users of computing systems, those who use computers to accomplish their work. However, another important class of computer users is the cohort of administrators who design, build, maintain, and troubleshoot computer systems as their main work. As computers have become more ubiquitous, and particularly as networked services have grown through the use of email, the web, and instant messaging, the importance and complexity of computational infrastructure have increased. End users have become dependent upon the availability of services such as network-attached storage, web servers, and email gateways. These Internet-scale services often have thousands of hardware and software components [3,6] and require considerable human effort to plan, configure, install, upgrade, monitor, troubleshoot, and sunset. The complexity of managing these services is alarming: a recent survey of three Internet sites showed that 51% of all failures are caused by operator errors [7]. The goal of this workshop is to bring together researchers, product designers, and system administrators to increase interest within the CHI community in improving the environment in which system administrators work.

Costs, Risks and Benefits

The importance of improving the system administration environment rests on three related ideas: cost, risk, and benefit.
First, there is a cost associated with the difficulty of administering complex computational systems. As the cost of computational technology has decreased, the human cost of operating computers has become more noticeable and significant. For example, ten years ago a data storage facility spent approximately two thirds of its money on technology for storing information and one third on the human operators of the system. Now the fractions have reversed, so the relative cost of human operators has doubled [1,2,5]. This increasing relative cost of administration suggests that significant savings could be gained by improving the human factors of the administration environment.

Second, as technical complexity increases, so does the risk of costly breakdowns. Furthermore, as individuals and society become more dependent on information systems for critical tasks, these risks are not just monetary. An outage at a world-scale auction web site may cost $225,000 per hour [8], but, more importantly, loss of communications and computer services can mean the difference between life and death in emergency situations. The growing threats of viruses and e-terrorism compound the urgency of the situation. Just as power plant and aircraft controls have been greatly improved through human factors work, similar detailed attention to computational systems will help to minimize these risks.

Third, there are tremendous benefits to be gained by improving system administration user interfaces. Beyond the decrease in cost and risk, there is an opportunity to increase the rate of deployment of beneficial computer services as the process of taking systems from development to operation is simplified. Furthermore, more reliable base services allow the construction of more complex and capable superstructures. In many cases, it is the human component that acts as the bottleneck in getting useful services to end users.