ECEASST

Code Analysis: Past and Present

Daniela da Cruz 1, Pedro Rangel Henriques 2 and Jorge Sousa Pinto 3

1 danieladacruz@di.uminho.pt, http://www.di.uminho.pt/ danieladacruz
2 prh@di.uminho.pt, http://www.di.uminho.pt/ prh
3 jsp@di.uminho.pt, http://www.di.uminho.pt/ jsp

Department of Computer Science
University of Minho, Braga, Portugal

Abstract: The integration of software components within complex industrial applications subject to strict security standards requires strict quality assessment of each integrated component. That is, it requires a guarantee that each component complies with good software development practices and with all the standards in use. While full certification is easy to obtain for proprietary modules, it is particularly hard to achieve for open-source software, which demands rigorous methods and techniques to implement its certification process.

In this context, code analysis plays an important role as the basis for automating the quality assessment of open-source software projects: it provides the techniques and tools to implement the necessary validation process. Although source code is still the most explored artifact (the main support for analysis), nowadays this assessment process should be able to deal with code at different compilation levels.

Due to its relevance for the open-source software certification task, this paper reviews the code analysis area (stages of the analysis process, traditional approaches, and future trends), aiming to identify what is available and what deserves further research.

Keywords: Code Analysis, Data Extraction, Information Representation

1 Introduction

The increasing amount of software developed in the last few years has produced a growing demand for programmers, and for programmer productivity, to keep that software working over the years. During maintenance, the most reliable and accurate description of the behavior of a software system is its source code.
Even nowadays, when modern software projects start with the construction of models (e.g. using the UML) that can be “compiled” to traditional source code, source code is still considered “the truth” and “the system” (because the generated code is incomplete and requires programmers to complete it by hand). So Source Code Analysis, which David Binkley [Bin07] defines as the process of extracting information about a program from its source code, or from artifacts generated from the source code, using automatic tools, is crucial to support maintenance. However, given the complexity of modern software, the manual analysis of code (source, intermediate, or machine code) is costly and ineffective. A more viable solution is to resort to tool support. Such tools provide

1 / 10 Volume X (2009)