Role of References in Similarity Estimation of Publications

Muhammad Shoaib1, Ali Daud1, and Malik Khiyal2

1Department of Computer Science and Software Engineering, International Islamic University, Pakistan

2Faculty of Computer Sciences, Preston University, Pakistan

Abstract: Similarity estimation among publications is central to classification and clustering techniques used for grouping, indexing, citation matching and Author Name Disambiguation (AND). Publication attributes are the basic sources of information and play an important role in similarity estimation. Most work in AND uses the title, co-author and venue attributes to estimate similarity among publications. Many other sources of information, such as self-citations, shared citations and references, publication topics and abstracts, have also been employed to improve similarity estimates. Recently, in the field of Academic Document Clustering (ADC), reference marker contexts have been utilized for this purpose. However, the use of citations and references is less common, since only a few databases include this information. In this paper, we propose to use two components of references (the co-authors and the titles of references) as sources of information, and we investigate the importance of these components in similarity estimation. To the best of our knowledge, this is the first endeavour to exploit components of references as sources of information. Experiments conducted on real publication datasets reveal that these components of references are a significant source of information for similarity estimation among publications.

Keywords: AND, references, vector space model, cosine similarity, citation matching. 

Received May 16, 2014; accepted September 11, 2014
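
As an illustration of the keywords above, the following is a minimal sketch of how cosine similarity in a vector space model could be applied to the titles and co-authors of references. It is not the authors' implementation: the dictionary keys `ref_titles` and `ref_authors`, the whitespace tokenization, the raw term-frequency weighting and the `title_weight` parameter are all assumptions made for this example.

```python
import math
from collections import Counter


def term_vector(strings):
    """Build a bag-of-words term-frequency vector from a list of strings."""
    counts = Counter()
    for s in strings:
        counts.update(s.lower().split())  # assumption: simple whitespace tokens
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector shares nothing with any other vector
    return dot / (norm_a * norm_b)


def reference_similarity(pub_a, pub_b, title_weight=0.5):
    """Combine title-based and co-author-based similarity of two reference lists.

    Hypothetical scheme: a weighted average of the two cosine scores.
    """
    title_sim = cosine_similarity(
        term_vector(pub_a["ref_titles"]), term_vector(pub_b["ref_titles"])
    )
    # Each author name is treated as one indivisible token.
    author_sim = cosine_similarity(
        Counter(pub_a["ref_authors"]), Counter(pub_b["ref_authors"])
    )
    return title_weight * title_sim + (1 - title_weight) * author_sim
```

Two publications whose reference lists cite the same works by the same authors would score close to 1 under this sketch, while publications with no overlapping reference titles or authors would score 0.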

 
