Towards Ontology Extraction from Data-Intensive Web Sites: An HTML Forms-Based Reverse Engineering Approach
Sidi Benslimane1, Mimoun Malki1, Mustapha Rahmouni2, and Adellatif Rahmoun3
1Evolutionary Engineering and Distributed Information Systems Laboratory, Computer Science Department, University of Sidi Bel Abbes, Algeria
2Computer Science Department, University of Es-senia Oran, Algeria
3King Faisal University, CCS&IT, Hasa, KSA
Abstract: The growth of the Web has rapidly and significantly changed how information is organized, shared, and distributed. However, most of the available information must still be interpreted by humans; machine support is rather limited. The next generation of the Web, the Semantic Web, seeks to make information more usable by machines by introducing a more rigorous structure based on ontologies. In this context, we propose a novel, integrated approach for migrating data-intensive web sites to the ontology-based Semantic Web, thereby making web content machine-understandable. Our approach is based on the idea that semantics can be extracted from the structures and instances of HTML forms, which are the most convenient interface for communicating with relational databases on the current Web. These semantics are then exploited to help build the ontology.
Keywords: information search and retrieval, online information services, applications and expert systems, semantic web, data-intensive web, reverse engineering.
Received August 4, 2006; Accepted November 20, 2006