A Corpus based Approach to Find Similar Keywords for Search Engine Marketing

Muazzam Siddiqui1, Mohammad Fayoumi2, and Nidal Yusuf3
1Faculty of Computing and Information Technology, King Abdulaziz University, KSA
2Faculty of Computers and Information Systems, Umm Al-Qura University, KSA
3Faculty of Information Technology, Al Isra University, Jordan

 

Abstract:
Automatic thesaurus generation is used by search engines for query expansion. Search engine marketing companies use the same concept to suggest keyword terms to their clients in order to improve the clients' ratings on different search engines. This paper presents and evaluates a corpus-based method for finding similar terms. The corpus is generated by scraping websites in different categories. A feature selection method is developed that rewards category-specific terms and penalizes terms shared by two or more categories. The similarity measure is decomposed into three distinct components: contextual, functional, and lexical similarity. Contextual similarity identifies terms that occur in the same context. Functional similarity identifies terms on the basis of co-occurrence, while lexically similar terms share one or more words. An overall similarity measure combines the evidence from these three measures.


Keywords: Information retrieval, text mining, term similarity, search engine marketing.
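
The abstract names three similarity components and a combined measure but does not give their formulas. The following is a minimal illustrative sketch only, not the authors' method. It assumes Jaccard overlap of words for lexical similarity, Jaccard overlap of document occurrences for functional (co-occurrence) similarity, cosine similarity of neighbouring-word counts for contextual similarity, and an equally weighted average for the overall score. All function names, data layouts, and weights are hypothetical.

```python
import math
from collections import Counter


def lexical_similarity(term_a, term_b):
    """Assumed form: Jaccard overlap of the words making up each term."""
    words_a = set(term_a.lower().split())
    words_b = set(term_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def functional_similarity(term_a, term_b, documents):
    """Assumed form: Jaccard overlap of the documents containing each term.

    `documents` is a list of sets, one set of terms per document.
    """
    with_a = sum(1 for doc in documents if term_a in doc)
    with_b = sum(1 for doc in documents if term_b in doc)
    with_both = sum(1 for doc in documents if term_a in doc and term_b in doc)
    with_either = with_a + with_b - with_both
    return with_both / with_either if with_either else 0.0


def contextual_similarity(term_a, term_b, contexts):
    """Assumed form: cosine similarity of neighbouring-word counts.

    `contexts` maps each term to a Counter of the words seen around it.
    """
    vec_a = contexts.get(term_a, Counter())
    vec_b = contexts.get(term_b, Counter())
    dot = sum(count * vec_b[word] for word, count in vec_a.items() if word in vec_b)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def overall_similarity(term_a, term_b, documents, contexts,
                       weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine the three scores; equal weights are an assumption."""
    scores = (
        contextual_similarity(term_a, term_b, contexts),
        functional_similarity(term_a, term_b, documents),
        lexical_similarity(term_a, term_b),
    )
    return sum(w * s for w, s in zip(weights, scores))
```

With these assumed definitions, `lexical_similarity("cheap flights", "cheap hotels")` scores the one shared word out of three distinct words, while two terms that never share a word can still score well on the contextual or functional component. A weighted average is only one way to combine the evidence; the weights would need tuning against whatever evaluation the application uses.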
 
Received July 3, 2011; accepted May 22, 2012