A Novel Architecture for Search Engine using Domain Based Web Log Data
Prem Sharma Computer Science and Engineering, Veer Madho Singh Bhandari Uttarakhand Technical University, India
Divakar Yadav School of Computer and Information Sciences, Indira Gandhi National Open University, India
Abstract: Search engines, as information retrieval tools, are nowadays the main source users turn to for their information needs. For every query, the search engine explores its repository and/or indexer to find the documents/URLs relevant to that query. Page ranking algorithms rank the Uniform Resource Locators (URLs) according to their relevance to the user's query. Analysis shows that many of the queries submitted to search engines are duplicates, so there is scope to improve search engine performance by reducing the effort spent on duplicate queries. In this paper, a proxy server is created that stores the search results of user queries in a web log. The proposed proxy server uses this web log to return results faster when a duplicate query is submitted again. The proposed architecture has been tested on ten duplicate user queries and found to be promising. For a duplicate user query (if the query is found in the web log at the proxy server), it returns all relevant web pages from a particular domain instead of from the entire database. It reduces the perceived latency for duplicate queries and also improves precision and accuracy, up to 81.8% and 99% respectively, for all duplicate user queries.
Keywords: Search engine, information retrieval, web usage mining, content mining.
Received February 14, 2021; accepted March 16, 2022
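The caching idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name ProxyWebLog, the normalize helper, and the backend callable are hypothetical names chosen for illustration, and the assumption that duplicate queries are detected by case- and whitespace-insensitive matching is ours. The domain-based partitioning of the web log described in the paper is omitted here, since the abstract does not specify how domains are assigned.

```python
from typing import Callable, Dict, List, Tuple


def normalize(query: str) -> str:
    # Assumption: two queries are duplicates if they match after
    # lowercasing and collapsing runs of whitespace.
    return " ".join(query.lower().split())


class ProxyWebLog:
    """Hypothetical proxy that keeps search results in an in-memory web log."""

    def __init__(self, backend: Callable[[str], List[str]]) -> None:
        self._backend = backend              # stands in for the real search engine
        self._log: Dict[str, List[str]] = {} # normalized query -> result URLs
        self.backend_calls = 0               # how often the search engine was hit

    def search(self, query: str) -> Tuple[List[str], bool]:
        """Return (result URLs, served_from_log)."""
        key = normalize(query)
        if key in self._log:
            # Duplicate query: answer from the web log, skip the search engine.
            return list(self._log[key]), True
        # First occurrence: forward to the search engine and record the result.
        self.backend_calls += 1
        results = self._backend(query)
        self._log[key] = list(results)
        return results, False
```

In this sketch the first occurrence of a query is forwarded to the backend and logged, while any later duplicate is answered from the log alone, which is where the reduction in perceived latency would come from.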