Hypertext Semantic Search Based Dexter Model on Distributed Heterogeneous Systems

Rabie A. Ramadan
Systems and Computers Department, Al-Azhar University, Cairo, Egypt
rabieramadan@yahoo.com

Mohamed Z. Abd El-mageed
Systems and Computers Department, Al-Azhar University, Cairo, Egypt
azhar@mailer.scu.eun.eg

Samir I. Shaheen
Computer Engineering Department, Cairo University, Cairo, Egypt
sshaheen@frcu.eun.eg

Abstract

Building hypertext out of a given text is not an easy task: the hypertext designer has to be aware of the whole structure of the text. Several attempts have been made to automate this process, and different hypertext models have been proposed. The best known of these is the Dexter model, which has several advantages over other models, such as separating the storage layer from the presentation layer and using the anchoring concept in the representation of the hypertext. However, Dexter is limited to a single hypertext, and no extension exists that allows it to be used with multiple hypertexts or open systems. In this paper, we extend the Dexter hypertext model to support multiple hypertexts. We use this extension to solve our semantic search problem, in which multiple Islamic books are distributed over heterogeneous machines. The heterogeneity of these machines comes from their use of different operating systems and text formats. We introduce a novel idea for searching these books semantically by converting their contents to hypertext that fits the extended Dexter model. Grammar-based techniques and Latent Semantic Indexing (LSI) are used to construct the hypertext nodes, and word frequency is used to extract keywords from the text. Our semantic search system appears to outperform blind search in terms of search time and accuracy.

1. Introduction

“Hypertext” is not a new term; it is as old as the Internet. It consists of two words: “hyper”, which means “over” or “above”, and “text”.
The word “hyper” is used in physics to describe a new kind of space, named “hyperspace”, that was defined by Einstein’s theory of relativity. Hypertext has also been defined as an extension of a text [14], as text plus an abstraction [12], and as non-sequential writing [7]. The concept of hypertext was described in 1945 by Bush in an article titled “As We May Think” [8][1]. This description was followed in 1960 by an augmentation system, a hypertext system developed by Douglas Engelbart’s group [2]. The system was based on sophisticated browsing components. Its development coincided with the invention of the mouse, which helps in navigating through different files on the system. The term “hypertext” itself was coined by Ted Nelson in the 1960s; his visionary ideas covered a diverse range of stored topics that can be folded and unfolded. Nelson published an electronic book in which blocks of text are linked to other parts of the book [14].

During the 1980s, there was an explosion of interest in hypertext, and a large number of hypertext systems were produced. These systems are based on the same concept, which relates pieces of information to one another; however, their implementations differed. Hypertext systems can be classified into two generations [14][8][11]. The first generation includes systems such as Memex, Augment, ZOG, FRESS, and Dynabook. The second generation includes systems such as NoteCards, HyperCard, KMS, Guide, and HyperTIES. The main difference between the two generations is that the second generation supports media other than text, such as graphics and animation, as well as more advanced user interfaces.

With the increasing number of hypertext systems, many electronic documents have been converted to a hypertext format. This leads to the following problems:
1) Disorientation: the user gets lost in the information space because of the huge number of links that can be followed.
2) Construction of a hypertext network: the author of the document must discover all of the links and their relations.
3) Search: there is a lack of semantic search that can understand what the user wants and act accordingly.

Semantic search tries to increase the relevance of the search results. There are many proposals for increasing the semantics of text search. The reader is referred to [3] and [10] for more details on semantic search techniques. The main problem with semantic search algorithms is that most of them do not fit the nature of hypertext. In addition, in a language like Arabic, the semantic search