Copyright © 2010, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

Chapter 7.12
Search Engine-Based Web Information Extraction

Gijs Geleijnse
Philips Research, The Netherlands

Jan Korst
Philips Research, The Netherlands

ABSTRACT

In this chapter we discuss approaches to find, extract, and structure information from natural language texts on the Web. Such structured information can be expressed and shared using the standard Semantic Web languages and hence be machine interpreted. In this chapter we focus on two tasks in Web information extraction. The first part focuses on mining facts from the Web, while in the second part, we present an approach to collect community-based meta-data. A search engine is used to retrieve potentially relevant texts. From these texts, instances and relations are extracted. The proposed approaches are illustrated using various case-studies, showing that we can reliably extract information from the Web using simple techniques.

INTRODUCTION

Suppose we are interested in ‘the countries where Burger King can be found’, ‘the Dutch cities with a university of technology’ or perhaps ‘the genre of the music of Miles Davis’. For such diverse factual information needs, the World Wide Web in general and a search engine in particular can provide a solution. Experienced users of search engines are able to construct queries that are likely to access documents containing the desired information.

However, current search engines retrieve Web pages, not the information itself¹. We have to search within the search results in order to acquire the information. Moreover, we make implicit use of our knowledge (e.g. of the language and the domain), to interpret the Web pages.

DOI: 10.4018/978-1-60566-112-4.ch009