DRUG DISCOVERY TODAY: TECHNOLOGIES

Optimization algorithms and natural computing in drug discovery

Tom Solmajer 1,2, Jure Zupan 3,*

1 Laboratory for Molecular Modeling and NMR, National Institute of Chemistry, POB 660, Hajdrihova 19, 1001 Ljubljana, Slovenia
2 Drug Discovery Unit, Lek Pharmaceuticals d.d., Verovškova 57, 1526 Ljubljana, Slovenia
3 Laboratory for Chemometrics, National Institute of Chemistry, POB 660, Hajdrihova 19, 1001 Ljubljana, Slovenia

Recent efforts in structural biology have led to considerable growth in the number of available structures that are potential drug targets. Considerable progress in docking algorithms has made in silico screening an attractive alternative to traditional screening for drug leads and their optimization, because in vitro high-throughput screening of compounds is costly and relatively inefficient. In molecular modeling one is often confronted with hard problems, such as highly complicated energy landscapes in many dimensions and a combinatorial explosion of the number of possible solutions. Recently, however, several algorithms modeled on processes found in nature have appeared, and in this review we illustrate their strengths and weaknesses.

Section Editors: Paul Lewi, Frits Daeyaert – Center for Molecular Design, Janssen Pharmaceutica N.V., Vosselaar, Belgium

Natural computing algorithms can give answers to outstanding hard problems encountered in structure-based drug design. In this context, Solmajer and Zupan describe the application of genetic algorithms and simulated annealing to optimization problems, of artificial neural networks to data modeling, and of the ant algorithm to building regression trees. The research interests of Professor Solmajer include the study of structure–activity relationships, whereas Professor Zupan is known for his work on artificial neural networks and in chemometrics.
Drug Discovery Today: Technologies, Vol. 1, No. 3, 2004
Editors-in-Chief: Kelvin Lam – Pfizer, Inc., USA; Henk Timmerman – Vrije Universiteit, The Netherlands
Lead optimization
*Corresponding author: (J. Zupan) jure.zupan@ki.si
URL: http://www.ki.si
1740-6749/$ © 2004 Elsevier Ltd. All rights reserved. DOI: 10.1016/j.ddtec.2004.11.011
www.drugdiscoverytoday.com

Introduction

Whenever scientists have to deal with large amounts of data, the use of sophisticated numerical methods is mandatory. The evolution of new computational methods is a direct consequence of ideas based on mimicking natural phenomena and of the recent growth in computational power. The most frequently cited examples of these new methods are artificial neural networks (ANNs) [1] and genetic algorithms (GAs) [2]. Work with the new, nature-inspired procedures can be compared to meticulous experimental work in the laboratory. Hundreds of small details and pieces of information, together with good chemical knowledge, must be applied at each stage in an iterative fashion before, after a number of failures and unsuccessful experiments, the effort is rewarded by reasonably good results. Newcomers to the field will often ask which method should be selected for a given problem. The answer is simple: try several and, if possible, all of the available ones.

Key technologies

In studies in which a chemical structure, a sequence of amino acids in a protein, or a similar object is encoded as an m-dimensional vector (x_1, x_2, ..., x_m), the result of each specific method depends on the choice of the variables x_i. Hence, the outcome of each procedure depends both on the proper representation and on the choice of method. It is advisable to test various methods; nevertheless, the selection of the representation should remain the paramount concern. Multivariate data handling is always an iterative two-loop optimization process. The inner loop contains the main procedure (classification, modeling, optimization, etc.), whereas the outer loop selects the data representation via optimization.
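The two-loop process described under Key technologies can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the article: the inner loop is a 1-nearest-neighbour classifier scored by leave-one-out error, the outer loop is an exhaustive search over subsets of the variables x_i (standing in for whatever optimizer, for example a GA, would drive the selection on realistic data), and the function names and toy data are hypothetical.

```python
from itertools import combinations


def loo_1nn_error(X, y, columns):
    """Inner loop (assumed main procedure): leave-one-out error of a
    1-nearest-neighbour classifier that sees only the variables in `columns`."""
    errors = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for j, xj in enumerate(X):
            if i == j:
                continue  # leave object i out of its own neighbour search
            dist = sum((xi[c] - xj[c]) ** 2 for c in columns)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        errors += best_label != y[i]
    return errors / len(X)


def select_representation(X, y):
    """Outer loop: try every non-empty subset of the m variables and keep the
    subset with the lowest inner-loop error (ties go to the first one found,
    which is the smallest subset because sizes are tried in increasing order)."""
    m = len(X[0])
    best_cols, best_err = None, float("inf")
    for size in range(1, m + 1):
        for cols in combinations(range(m), size):
            err = loo_1nn_error(X, y, cols)
            if err < best_err:
                best_cols, best_err = cols, err
    return best_cols, best_err


# Toy data invented for illustration: the first variable separates the two
# classes, the second and third vary independently of the class.
X = [
    [0.1, 5.0, 2.0],
    [0.2, 1.0, 9.0],
    [0.3, 9.0, 4.0],
    [0.9, 5.1, 2.1],
    [1.0, 1.1, 9.1],
    [1.1, 9.1, 4.1],
]
y = [0, 0, 0, 1, 1, 1]

best_cols, best_err = select_representation(X, y)
print(best_cols, best_err)
```

On this toy data the outer loop keeps only the first variable, with a leave-one-out error of zero, whereas the same classifier given all three variables misclassifies every object. The method is identical in both cases; only the representation changes, which is the point the text makes about the choice of variables. Exhaustive search is feasible here only because m = 3; the number of subsets grows as 2^m - 1.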