Thursday, 15 January 2015 04:38

Adaptive Semantic Indexing of Documents for Locating Relevant Information in P2P Networks

Anupriya Elumalai1, Sriman Narayana Iyengar2

1Department of Information Technology, Ibri College of Technology, Oman

2School of Computing Science & Engineering, VIT University, India

Abstract: Locating relevant information in Peer-to-Peer (P2P) systems is a challenging problem. Conventional approaches use flooding to locate content, but flooding no longer scales given the massive volume of information available in P2P systems. Sometimes it may not even be possible to return a small percentage of the relevant content for a search if the content is unpopular. In this paper, we present an Adaptive Semantic P2P Content Indexed System. Content indices are generated from the topical semantics of documents, derived using the WordNet ontology, and similarities between document hierarchies are computed using an information-theoretic approach. This enables content location and retrieval with minimal document movement, search space and number of nodes to be searched. Results illustrate that our work outperforms the Content Addressable Network (CAN) semantic P2P Information Retrieval (IR) system. Contrary to the CAN semantic P2P IR system, we use content-aware and node-aware bootstrapping of the search process instead of random bootstrapping.
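The information-theoretic similarity mentioned above can be illustrated with Lin's measure, which scores two concepts by the information content (IC) of their least common subsumer relative to their own IC. The corpus frequencies in this sketch are invented for a four-concept toy hierarchy; a real system would derive IC values from WordNet and a large corpus:

```python
import math

# Toy corpus frequencies for a small concept hierarchy (assumed values,
# for illustration only).
freq = {"entity": 1000, "animal": 300, "dog": 60, "cat": 50}
total = freq["entity"]

def ic(concept):
    """Information content: -log p(concept)."""
    return -math.log(freq[concept] / total)

def lin_similarity(c1, c2, lcs):
    """Lin's measure: 2*IC(lcs) / (IC(c1) + IC(c2))."""
    return 2 * ic(lcs) / (ic(c1) + ic(c2))

# "dog" and "cat" share the least common subsumer "animal".
print(round(lin_similarity("dog", "cat", "animal"), 3))
```

Identical concepts score 1.0, while concepts whose only shared ancestor is very general score near 0.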

 

Keywords: Information Retrieval, Semantic Indexing, Peer-to-Peer systems, Chord, Concept Clustering, Lexical ontology, WordNet, Semantic Overlay Network

Received October 14, 2013; accepted July 24, 2014

Full Text

Wednesday, 03 December 2014 13:34

Lessons Learned: The Complexity of Accurate Identification of in-Text Citations

Abdul Shahid, Muhammad Afzal and Muhammad Abdul Qadir

Department of Computer Science, Mohammad Ali Jinnah University, Pakistan

Abstract: The importance of citations is widely recognized by the scientific community. Citations are used in making a number of vital decisions, such as calculating the impact factor of journals, calculating the impact of a researcher (H-index), and ranking universities and research organizations. Furthermore, citation indexes, along with other criteria, employ citation counts to retrieve and rank relevant research papers. However, citing patterns and in-text citation frequency are not yet used for such important decisions. The identification of in-text citations in a scientific document is therefore an important problem; however, it is a difficult task due to the ambiguity between citation tags and content. This research focuses on in-text citation analysis and makes the following specific contributions: it provides a detailed in-text citation analysis of 16,000 citations from an online journal, reports different patterns of citation tags and their in-text citations, and highlights the problems (mathematical ambiguities, wrong allotments, commonality in content, and string variation) in identifying in-text citations in scientific documents. The accurate identification of in-text citations will help information retrieval systems, digital libraries and citation indexes.
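The citation-tag ambiguity described above can be made concrete with a small sketch. The regular expression and the bracketed-number convention below are illustrative assumptions, not the authors' actual extraction rules; note how a mathematical subscript like `x[i]` is ignored while numeric ranges such as `[7-9]` are expanded:

```python
import re

# Numeric in-text citation tags: [3], [5, 7-9], etc.
CITATION = re.compile(r"\[(\d{1,3}(?:\s*[,-]\s*\d{1,3})*)\]")

def find_citations(sentence):
    """Return the numeric tags cited in a sentence, expanding ranges like [7-9]."""
    tags = []
    for match in CITATION.finditer(sentence):
        for part in re.split(r"\s*,\s*", match.group(1)):
            if "-" in part:
                lo, hi = (int(x) for x in re.split(r"\s*-\s*", part))
                tags.extend(range(lo, hi + 1))
            else:
                tags.append(int(part))
    return tags

text = "Prior work [3] and surveys [5, 7-9] discuss this; x[i] is not a citation."
print(find_citations(text))  # [3, 5, 7, 8, 9]
```

Author-year styles such as "(Smith, 2013)" would need separate patterns, which is part of what makes accurate identification hard.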

Keywords: In-text citation analysis, citation frequency, citation tag, in-text citation patterns, digital library

 Received June 27, 2013; accepted March 19, 2014

 

 

Thursday, 04 September 2014 07:37

Comparison of Segmentation Algorithms by a Mathematical Model for Resolving Islands and Gulfs in Nuclei of Cervical Cell Images

Mohideen Fatima Alias Niraimathi M1 and Seenivasagam2
1Department of Information Technology, National College of Engineering, India
2Department of Computer Science and Engineering, National Engineering College, India

Abstract: Cell segmentation from microscopic images is the first stage of automatic biomedical image processing and plays a crucial role in the study of cell behaviour. It is a difficult and tedious task because of the variations in illumination and dye concentration of the cells caused by the staining procedure. This paper proposes a new method for segmenting cervical cell nuclei, based on a simple mathematical model, to eliminate and resolve the islands and gulfs that appear in the segmented output of conventional thresholding and region-growing segmentation methods. The proposed model first detects the borders of these structures and, if a structure lies within an associated region, places it within that region. The performance was evaluated and compared with the above-mentioned methods. In this way a simple mathematical vision system model to segment and analyse cytological image nuclei is proposed.

 Keywords: Cervical cancer screening, pap smear images, islands and gulfs, mathematical model.

Received February 26, 2013; accepted December 24, 2013

Full Text

 

 

Thursday, 04 September 2014 07:34

AES Based Multimodal Biometric Authentication using Cryptographic Level Fusion with Fingerprint and Finger Knuckle Print

Muthukumar Arunachalam1 and Kannan Subramanian2

1Department of Electronics and Communication Engineering, Kalasalingam University, Krishnankoil

2Department of Electrical and Electronics Engineering, Kalasalingam University, Krishnankoil

Abstract: In general, identification and verification are done by passwords, PIN numbers, etc., which are easily cracked. To overcome this issue, biometrics offers a unique tool for authenticating an individual. A biometric is a measure of an individual's physical characteristics, such as fingerprint, Finger Knuckle Print (FKP), iris and face, which are not easily cracked by others. Nevertheless, unimodal biometrics suffers from noise, intra-class variations, spoof attacks, non-universality and other attacks. To avoid these attacks, multimodal biometrics, i.e., a combination of several modalities, is adopted and combined with cryptography, which gives more security to the physical characteristics of the biometrics. A bio-crypto system provides both authentication and confidentiality of the data. This paper proposes to improve the security of multimodal systems by generating a biometric key from Fingerprint and FKP biometrics, with feature extraction using the K-means algorithm. The secret value is encrypted with the biometric key using the symmetric Advanced Encryption Standard (AES) algorithm. The paper also discusses the integration of Fingerprint and FKP using package-model cryptographic-level fusion in order to improve the overall performance of the system. The encryption process gives stronger authentication and security, while the Cyclic Redundancy Check (CRC) function protects the biometric data from malicious tampering and also provides error-checking functionality.
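The CRC protection step can be sketched in a few lines. The mock template bytes and the choice of CRC-32 are illustrative assumptions; the point is only that any tampering with the stored biometric data changes the checksum:

```python
import zlib

def protect(template: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to a (mock) biometric template."""
    return template + zlib.crc32(template).to_bytes(4, "big")

def verify(blob: bytes) -> bool:
    """Check that the stored CRC matches the template contents."""
    template, stored = blob[:-4], int.from_bytes(blob[-4:], "big")
    return zlib.crc32(template) == stored

blob = protect(b"\x12\x9a\x03 mock-fingerprint-features")
assert verify(blob)                    # untampered data passes
tampered = b"\x00" + blob[1:]
assert not verify(tampered)            # a flipped byte is detected
```

CRC detects accidental or naive tampering but is not cryptographically secure; in the paper it complements, rather than replaces, the AES encryption layer.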

Keywords: AES algorithm, Biometric crypto-systems, CRC, Cryptographic level fusion methodology, K-Means algorithm, Multimodal biometrics.

Received May 17, 2013; accepted September 19, 2013

Full Text

 

Thursday, 04 September 2014 07:31

Automated Retinal Vessel Segmentation Using Entropic Thresholding Based Spatial Correlation Histogram of Gray Level Images

Belhadi Soumia and Benblidia Nadjia

Faculty of Science, Saad Dahlab University, Algeria

Abstract: After vessel-like structures are highlighted by an appropriate filter in the matched filter technique, a thresholding strategy is needed for the automated detection of blood vessels in retinal images. For this purpose, we propose a new entropic thresholding technique based on the Gray Level Spatial Correlation (GLSC) histogram, which takes the local properties of the image into account. The results obtained show robust and highly accurate detection of the retinal vessel tree; an appropriate thresholding technique allows significant improvement of the retinal vessel detection method.
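For intuition, the classical 1-D maximum-entropy (Kapur) criterion that GLSC thresholding builds on can be sketched as follows. The GLSC histogram additionally weights each grey level by local spatial correlation; this simplified sketch omits that term and uses a toy bimodal histogram:

```python
import math

def kapur_threshold(hist):
    """Return the grey level maximising the sum of background and
    foreground entropies over a 1-D histogram."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_h = 0, float("-inf")
    for t in range(1, len(hist)):
        w0 = sum(p[:t])           # background probability mass
        w1 = 1.0 - w0             # foreground probability mass
        if w0 <= 0 or w1 <= 0:
            continue
        h0 = -sum(q / w0 * math.log(q / w0) for q in p[:t] if q > 0)
        h1 = -sum(q / w1 * math.log(q / w1) for q in p[t:] if q > 0)
        if h0 + h1 > best_h:
            best_h, best_t = h0 + h1, t
    return best_t

# Bimodal toy histogram: dark background around level 1, vessels around level 6.
hist = [10, 40, 12, 1, 2, 14, 35, 8]
print(kapur_threshold(hist))
```

On a real matched-filter response image, `hist` would have 256 bins and the GLSC variant would replace each `p[i]` with a spatially weighted count.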

 Keywords: Automated screening, retinal vessel segmentation, matched filtering, thresholding.

Received July 20, 2012; accepted January 29, 2013

Full Text

 

Thursday, 04 September 2014 07:27

A Vision Approach for Expiry Date Recognition using Stretched Gabor Features

Ahmed Zaafouri, Mounir Sayadi and Farhat Fnaiech

Department of Electrical Engineering, University of Tunis, Tunisia

Abstract: The product expiry date is important information for product consumption and must be clearly printed on the label. Recognizing the expiry date stamped on a product cover is challenging because it is often written in pencil and the characters are distorted. In this paper, an automated vision approach for recognizing the expiry date numerals of industrial products is presented. The system consists of four stages: numeral string pre-processing, numeral string segmentation, feature extraction and numeral recognition. In the pre-processing module, the image is converted to a binary image based on a threshold. In the segmentation module, a vertical projection process is adopted to isolate the numerals. In the feature extraction module, the Fourier Magnitude (FM), Local Energy (LE) and Complex Moments (CM) derived from the outputs of Stretched Gabor (S-Gabor) filters are extracted at various filter orientations, together with the mean and variance of each feature map. Recognition is achieved by classifying the extracted features, which represent the numeral image, with a Multilayer Neural Network (MNN) trained using a k-fold cross-validation procedure. Experiments demonstrate the information richness of the S-Gabor features; consequently, the feature set proves useful for practical usage.

Keywords: Computer vision, FM, complex moments, LE, numeral recognition, neural network, S-Gabor filters.

Received March 21, 2013; accepted December 24, 2013

Full Text

 

Thursday, 04 September 2014 07:23

Optimum Threshold Parameter Estimation of Bidimensional Empirical Mode Decomposition Using Fisher Discriminant Analysis for Speckle Noise Reduction

Mohammad Motiur Rahman1, Mithun Kumar PK1, and Mohammad Shorif Uddin2
1Department of Computer Science and Engineering, Mawlana Bhashani Science and Technology University, Bangladesh
2Department of Computer Science and Engineering, Jahangirnagar University, Bangladesh

Abstract: Nowadays, Empirical Mode Decomposition (EMD) is an important tool for image analysis. Optimizing the threshold value of the Bidimensional Intrinsic Mode Function (BIMF) is one of the important tasks in speckle noise reduction in the Bidimensional Empirical Mode Decomposition (BEMD) domain; without proper selection of the threshold value, image information may be lost. In this paper, we propose optimum threshold parameter estimation using Fisher Discriminant Analysis (FDA) to determine the optimum threshold value of the Intrinsic Mode Functions (IMFs) for the best speckle noise reduction. We then use the optimal threshold value to separate the higher-frequency signal from the BIMF and calculate the mean of the separated signals to alleviate speckle noise. The method also preserves edges without loss of important image information. It is compared with several other classical thresholding methods on a variety of images, and the experimental results confirm a significant improvement over existing methods.
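The Fisher-criterion idea behind the threshold search can be sketched on 1-D coefficient magnitudes: pick the candidate split that maximises between-class scatter over within-class scatter. The coefficients and candidate thresholds below are toy stand-ins, not actual BIMF values:

```python
def fisher_threshold(values, candidates):
    """Pick the candidate threshold maximising the Fisher-style ratio
    (m0 - m1)^2 / within-class variance over a 1-D split."""
    def stats(group):
        n = len(group)
        mean = sum(group) / n
        var = sum((v - mean) ** 2 for v in group) / n
        return n, mean, var

    best_t, best_score = None, float("-inf")
    for t in candidates:
        low = [v for v in values if v < t]
        high = [v for v in values if v >= t]
        if not low or not high:
            continue
        n0, m0, v0 = stats(low)
        n1, m1, v1 = stats(high)
        within = (n0 * v0 + n1 * v1) / len(values)
        score = (m0 - m1) ** 2 / within if within else float("inf")
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Toy coefficient magnitudes: small noise-like values vs. large signal values.
coeffs = [0.1, 0.2, 0.15, 0.12, 2.0, 2.2, 1.9, 2.1]
print(fisher_threshold(coeffs, [0.5, 1.0, 1.5]))
```

In the paper's setting the separated high-frequency part would then be averaged to suppress speckle while keeping edges.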

 Keywords: EMD, BEMD, FDA, IMF, BIMF, optimum threshold, speckle noise, ultrasound image.

 Received August 25, 2013; accepted March 20, 2014

 

Full Text

 

 

Thursday, 04 September 2014 07:04

Kernel Logistic Regression Algorithm for Large-Scale Data Classification

Murtada Elbashir2 and Jianxin Wang1
1School of Information Science and Engineering, Central South University, China
2Faculty of Mathematical and Computer Sciences, University of Gezira, Sudan

Abstract: Kernel Logistic Regression (KLR) is a powerful classification technique that has been applied successfully to many classification problems. However, it is rarely used in large-scale data classification, mainly because it is computationally expensive. In this paper, we present a new KLR algorithm based on the Truncated Regularized Iteratively Re-weighted Least Squares (TR-IRLS) algorithm to obtain sparse large-scale data classification in a short training time. The new algorithm is called Nystrom Truncated Kernel Logistic Regression (NTR-KLR). The performance achieved using the NTR-KLR algorithm is comparable to that of Support Vector Machine (SVM) methods, with the advantages that NTR-KLR yields probabilistic outputs and extends naturally to the multi-class case. In addition, its computational complexity is lower than that of SVM methods and it is easy to implement.
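The Nystrom step at the heart of NTR-KLR can be sketched as follows (an illustrative sketch, not the authors' implementation): approximate the RBF kernel matrix K by low-rank features Z = C W^{-1/2}, where C holds kernel evaluations against m landmark points and W is the landmark-landmark block, so that Z Zᵀ ≈ K; regularized logistic regression on Z then approximates KLR at much lower cost. The sizes, gamma and landmark count below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

def rbf(A, B, gamma=0.5):
    """Pairwise RBF kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

idx = rng.choice(len(X), size=40, replace=False)   # landmark indices
landmarks = X[idx]
C = rbf(X, landmarks)                              # n x m kernel columns
W = rbf(landmarks, landmarks)                      # m x m landmark block

# W^{-1/2} via eigendecomposition (clipping tiny eigenvalues for stability).
vals, vecs = np.linalg.eigh(W)
inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(np.clip(vals, 1e-12, None))) @ vecs.T
Z = C @ inv_sqrt                                   # n x m Nystrom features

err = np.abs(Z @ Z.T - rbf(X, X)).max()
print(f"max kernel approximation error: {err:.3f}")
```

Fitting an ordinary logistic regression (e.g. by IRLS/Newton steps) on the m-dimensional features Z replaces the n x n kernel solve, which is where the scalability gain comes from.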

Keywords: KLR, iteratively reweighted least squares, Nystrom method, Newton's method.

 Received November 18, 2012; accepted July 2, 2013

 

Thursday, 04 September 2014 04:07

 

A Qualitative Approach to the Identification, Visualisation and Interpretation of Repetitive Motion Patterns in Groups of Moving Point Objects

  Seyed Chavoshi1, Bernard De Baets2, Yi Qiang3, Guy De Tré4, Tijs Neutens1, and Nico Van de Weghe1

1Department of Geography, Ghent University, Belgium

2Department of Mathematical Modelling, Ghent University, Belgium

3Department of Environmental Sciences, Louisiana State University, United States

4Department of Telecommunications and Information Processing, Ghent University, Belgium

Abstract: Discovering repetitive patterns is important in a wide range of research areas, such as bioinformatics and human movement analysis. This study puts forward a new methodology to identify, visualise and interpret repetitive motion patterns in groups of Moving Point Objects (MPOs). The methodology consists of three steps. First, motion patterns are qualitatively described using the Qualitative Trajectory Calculus (QTC). Second, a similarity analysis is conducted to compare motion patterns and identify repetitive patterns. Third, repetitive motion patterns are represented and interpreted in a continuous triangular model. As an illustration of the usefulness of combining these hitherto separated methods, a specific movement case is examined: Samba dance, a rhythmical dance with many repetitive movements. The results show that the presented methodology is able to successfully identify, visualise and interpret the repetitive motions contained in the dance.
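The qualitative description step can be illustrated with a minimal QTC-style relation: each object receives '-', '0' or '+' according to whether it moves towards, keeps its distance from, or moves away from the other object's previous position. This is a deliberate simplification of the full Qualitative Trajectory Calculus, shown here only to convey the idea:

```python
import math

def sign(delta, eps=1e-9):
    """Qualitative sign of a distance change."""
    return "-" if delta < -eps else "+" if delta > eps else "0"

def qtc_b(p_prev, p_now, q_prev, q_now):
    """Simplified QTC-Basic relation for two objects P and Q between
    two consecutive time stamps."""
    d = math.dist
    return (sign(d(p_now, q_prev) - d(p_prev, q_prev)),   # P w.r.t. Q
            sign(d(q_now, p_prev) - d(q_prev, p_prev)))   # Q w.r.t. P

# P moves towards Q's old position; Q moves away from P's old position.
print(qtc_b((0, 0), (1, 0), (5, 0), (6, 0)))  # ('-', '+')
```

A trajectory then becomes a string of such symbols, and repetitive motion shows up as repeated substrings amenable to similarity analysis.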

Keywords: MPO, QTC, similarity analysis, repetitive motion patterns, continuous triangular model (CTM).

Received August 22, 2013; accepted May 11, 2014

 

Thursday, 04 September 2014 04:03

A WK-means Approach for Clustering

Fatemeh Boobord1, Zalinda Othman2, and Azuraliza Abu Bakar3
1,2,3Data Mining and Optimization Research Group, Centre for Artificial Intelligence Technology, Universiti Kebangsaan Malaysia, Malaysia

Abstract: Clustering is an unsupervised learning method used to group similar objects. K-means is one of the most popular and efficient clustering methods, as it has linear time complexity and is simple to implement. However, it suffers from getting trapped in local optima. Therefore, many methods have been produced by hybridizing K-means with other methods. In this paper, we propose a method that hybridizes Invasive Weed Optimization and K-means. Invasive Weed Optimization is a recent population-based algorithm that iteratively improves a population of candidate solutions. In this study, the algorithm is used in the initial stage to generate good-quality solutions, which then serve as the initial solutions for the K-means algorithm in the second stage. The proposed hybrid method is evaluated on several real-world instances and the results are compared with well-known clustering methods from the literature. Results show that the proposed method is promising compared to the other methods.
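The second stage can be sketched as plain K-means seeded with externally supplied centroids; in the paper these would come from Invasive Weed Optimization, whereas the seed below is hard-coded for illustration:

```python
def kmeans(points, centroids, iters=20):
    """Lloyd's K-means on 2-D points, refining the given initial centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            best = min(range(len(centroids)),
                       key=lambda i: (p[0] - centroids[i][0]) ** 2 +
                                     (p[1] - centroids[i][1]) ** 2)
            clusters[best].append(p)
        # Update step: move each centroid to its cluster mean.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [(1, 1), (1.5, 2), (1, 0), (8, 8), (9, 9), (8, 9)]
seed = [(0.0, 0.0), (10.0, 10.0)]   # stand-in for IWO output
centroids, clusters = kmeans(points, seed)
print(centroids)
```

Because K-means only refines locally, the quality of the seed (here, the IWO output) largely determines which local optimum it converges to.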

 Keywords: Data clustering, K-means algorithm, Invasive Weed Optimization, Hybrid evolutionary optimization algorithm, unsupervised learning

Received November 28, 2012; accepted August 12, 2013

Full Text

Thursday, 04 September 2014 03:55

Event Extraction from Classical Arabic Texts

Razieh Baradaran1 and Behrouz Minaei-Bidgoli2

1Department of Information Technology, University of Qom, Iran

2Department of Computer Engineering, Iran University of Science and Technology, Iran

 

Abstract: Event extraction is one of the most useful and challenging information extraction tasks and can be used in many natural language processing applications, particularly semantic search systems. Most systems developed in this field extract events from English texts; therefore, many other languages, Arabic in particular, need research in this area. In this paper, we develop a system for extracting person-related events and their participants from classical Arabic texts with complex linguistic structure. The first and most important step in event extraction is correctly identifying event mentions and determining which sentences describe events. Implementing various methods and comparing their performance can help researchers choose an appropriate event extraction method for their own conditions and limitations. In this research, we have implemented three methods to automatically classify sentences as on-event or off-event: a knowledge-oriented method (based on a set of keywords and rules), a data-oriented method (based on a support vector machine) and a semantic-oriented method (based on lexical chains). The results indicate that the knowledge-oriented and machine learning methods achieve high precision and recall in the event extraction process. The semantic-oriented method, with acceptable precision, minimizes the linguistic knowledge requirements of the knowledge-oriented method and the preprocessing requirements of the data-oriented method, and also improves automatic event extraction from raw text. The next step is to develop a modular rule-based approach for extracting event arguments, such as time, place and other participants, in independent subtasks.
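The knowledge-oriented method can be caricatured in a few lines: a sentence is on-event if it contains both an event keyword and a person mention. The English keywords and rules below are invented stand-ins for illustration only; the actual system works on classical Arabic text with a much richer keyword and rule set:

```python
# Toy keyword lists (assumed, English stand-ins for the Arabic originals).
EVENT_KEYWORDS = {"born", "died", "married", "travelled", "appointed"}
PERSON_MARKERS = {"he", "she", "imam", "caliph"}

def is_on_event(sentence):
    """Rule: a sentence is on-event if it mentions both an event keyword
    and a person marker."""
    words = set(sentence.lower().strip(".").split())
    return bool(words & EVENT_KEYWORDS) and bool(words & PERSON_MARKERS)

sentences = [
    "The caliph died in the year 86 AH.",
    "The city lies between two rivers.",
]
print([is_on_event(s) for s in sentences])  # [True, False]
```

The data-oriented alternative would replace these hand-written rules with an SVM trained on labelled sentences, trading linguistic knowledge for annotation effort.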

 Keywords: Event Extraction, Support Vector Machine, Lexical Chain, Rule Based Method, Classical Arabic Texts

 Received February 18, 2013; accepted September 19, 2013

 

Thursday, 04 September 2014 03:41

Using Textual Case-based Reasoning in Intelligent Fatawa QA System

Islam Elhalwany1, Ammar Mohammed1, Khaled Wassif2, Hesham Hefny1
1Institute of Statistical Studies & Researches, Cairo University, Egypt
2Faculty of Computers and Information, Cairo University, Egypt

Abstract: Textual Case-Based Reasoning (TCBR) is an artificial intelligence approach to problem solving and learning in which textual expertise is collected in a library of past cases. One critical application domain is Islamic Fatawa (religious verdicts), which concerns the legal rulings on religious issues that Muslims all over the globe seek on a daily basis. Official religious organizations such as Egypt's Dar al-Ifta are responsible for receiving and answering people's religious inquiries. Due to the enormous number of inquiries Dar al-Ifta receives every day, they cannot all be handled promptly; this task requires a smart system that can help fulfil people's need for answers. However, applying TCBR in the domain of issuing Fatawa faces several challenges related to language syntax and semantics. The contribution of this paper is an intelligent Fatwa question answering system that overcomes these challenges and responds to a user's inquiry by providing the semantically closest inquiries that were previously answered. Moreover, the paper shows how the proposed system learns when a new inquiry arrives. Finally, the results are discussed.
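The retrieval step can be sketched as term-frequency cosine similarity against the case base. The two-case library and English preprocessing below are toy assumptions; the real system handles Arabic text and the full TCBR retrieve-reuse-retain cycle:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two texts under raw term-frequency vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values())) *
            math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

# Toy case base: past inquiry -> stored answer.
case_base = {
    "is fasting required while travelling": "Answer A",
    "how is zakat calculated on savings": "Answer B",
}
query = "must i keep fasting when travelling"
best = max(case_base, key=lambda q: cosine(q, query))
print(best)
```

The "learning" step in the paper corresponds to retaining a newly answered inquiry as a fresh case, so the case base grows with use.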

 Keywords: Case based Reasoning, Textual Case Based Reasoning, Questions Answering Systems, Artificial Intelligence, Information Retrieval, and Knowledge-based Systems.

 Received August 29, 2013; accepted March 10, 2014

 

Thursday, 04 September 2014 03:11

Enhancing Generic Pipeline Model for Code Clone Detection Using Divide and Conquer Approach

Al-Fahim Mubarak-Ali1, Sharifah Mashita Syed-Mohamad1, and Shahida Sulaiman2
1School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia
2Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Skudai, Johor, Malaysia

Abstract: Code clones are identical copies of the same instances or fragments of source code in software. Current code clone research focuses on the detection and analysis of code clones in order to help software developers identify clones in source code and reuse the code so as to decrease maintenance cost. Many approaches, such as textual comparison, token-based comparison and tree-based comparison, have been used to detect code clones. As software grows and becomes a legacy system, the complexity of these approaches in detecting code clones increases, making clones more difficult to detect. The generic pipeline model is the most recent code clone detection approach; it comprises five processes to detect code clones: parsing, pre-processing, pooling, comparing and filtering. This research presents an enhancement of the generic pipeline model using a divide and conquer approach that involves a concatenation process. The aim of this approach is to produce better input for the generic pipeline model by processing smaller parts of the source code files before handling the large chunk of source code in a single pipeline. We implement the proposed approach with the support of a tool called Java Code Clone Detector. The results show an improvement in the rate of code clone detection and in overall runtime performance compared to the existing generic pipeline model.
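Token-based comparison, one of the approaches mentioned above, can be sketched as follows: identifiers and literals are normalised so that fragments differing only in naming hash to the same key. The fragment granularity and normalisation rules here are illustrative assumptions, not those of the Java Code Clone Detector:

```python
import re
from collections import defaultdict

KEYWORDS = {"int", "for", "return", "if", "else", "while"}

def normalise(fragment):
    """Tokenise a code fragment and abstract identifiers/literals,
    so renamed clones produce identical token tuples."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", fragment)
    out = []
    for t in tokens:
        if t in KEYWORDS:
            out.append(t)
        elif t[0].isalpha() or t[0] == "_":
            out.append("ID")
        elif t.isdigit():
            out.append("NUM")
        else:
            out.append(t)
    return tuple(out)

def find_clones(fragments):
    """Group fragment names whose normalised token sequences collide."""
    buckets = defaultdict(list)
    for name, body in fragments.items():
        buckets[normalise(body)].append(name)
    return [group for group in buckets.values() if len(group) > 1]

fragments = {
    "f1": "int sum = a + b; return sum;",
    "f2": "int total = x + y; return total;",   # a clone of f1 up to renaming
    "f3": "if (a > 0) return a;",
}
print(find_clones(fragments))  # [['f1', 'f2']]
```

Hashing small normalised fragments first, then concatenating and comparing larger units, mirrors the divide and conquer idea of feeding the pipeline smaller inputs before the whole file.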

  Keywords: Code clone detection, divide and conquer approach, generic pipeline model.

Received August 31, 2012; accepted February 23, 2014

Full Text

 
