An Ontology-based Semantic Extraction
Approach for B2C eCommerce
Ali Ghobadi1 and Maseud Rahgozar2
1Database Research Group, University of Tehran, Iran
2Control and Intelligent Processing Center of Excellence, University of Tehran, Iran
Abstract: Although human semantic interaction with Web resources has been widely investigated, little substantial progress has been achieved. Comparative shopping systems can be regarded as the latest generation of B2C eCommerce systems: they connect to multiple online stores and collect the information requested by the user. In some cases, the information is extracted from the online store sites through keyword search and other means of textual analysis. These processes rely on assumptions about the proximity of certain pieces of information. Such heuristic approaches are error-prone and are not guaranteed to work. In this paper, we propose an ontology-based approach to extract product information and vendor prices from the pages of vendors’ public Web sites. Although most vendors on the Web present their product information in HTML documents, which are not a semantic format, our approach is based on understanding the semantics of HTML documents and extracting the information automatically.
Keywords: Semantic correspondence, ontology, and schema.
Received November 28, 2008; accepted May 17, 2009