Thursday, 30 April 2020 10:02

Normalization-based Neighborhood Model for Cold Start Problem in Recommendation System

Aafaq Zahid1, Nurfadhlina Mohd Sharef1, and Aida Mustapha2

1Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Malaysia

2Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Malaysia

Abstract: Existing approaches for Recommendation Systems (RS) are mainly based on users’ past knowledge, and the more popular techniques such as the neighborhood models focus on finding similar users when making recommendations. The cold start problem arises when recommendations given to new users are inaccurate because of the lack of past data related to those users. To deal with such cases, where prior information on the new user is not available, this paper proposes a normalization technique that models user involvement or user likings for the cold start problem based on the details of items used in the neighborhood models. The proposed normalization technique was evaluated using two datasets, namely MovieLens and GroupLens. The results showed that the proposed technique is able to improve the accuracy of the neighborhood model, which in turn increases the accuracy of an RS.
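The normalization idea can be sketched as follows — a minimal, hypothetical neighborhood model in which a new user's item-detail profile is min-max normalized before computing similarity to existing users. The profile features, neighbor structure, and weighting below are illustrative assumptions, not the paper's exact formulation:

```python
import math

def normalize(vec):
    """Min-max normalize a feature vector to [0, 1]."""
    lo, hi = min(vec), max(vec)
    if hi == lo:
        return [0.0] * len(vec)
    return [(v - lo) / (hi - lo) for v in vec]

def cosine(a, b):
    """Cosine similarity between two vectors (0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def predict_rating(new_user_profile, neighbors, item):
    """Weighted average of neighbor ratings for `item`; weights are
    cosine similarities between normalized item-detail profiles.
    neighbors: list of (profile vector, {item: rating}) pairs."""
    p = normalize(new_user_profile)
    num = den = 0.0
    for profile, ratings in neighbors:
        if item not in ratings:
            continue
        w = cosine(p, normalize(profile))
        num += w * ratings[item]
        den += abs(w)
    return num / den if den else None
```

Because the profiles are normalized first, neighbors measured on different feature scales still contribute comparable similarity weights.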


Keywords: Recommender system, cold start, collaborative filtering, normalization.

Received May 3, 2017; accepted December 17, 2017
https://doi.org/10.34028/iajit/17/3/1
 

Incorporating Reverse Search for Friend Recommendation with Random Walk

Qing Yang1, Haiyang Wang1, Mengyang Bian1, Yuming Lin2, and Jingwei Zhang2

1Guangxi Key Laboratory of Automatic Measurement Technology and Instrument, Guilin University of Electronic Technology, China

2Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, China

Abstract: Recommending friends is an important mechanism for social networks to enhance their vitality and attractiveness to users. The huge user base as well as the sparse user relationships pose great challenges for recommending friends on social networks. Random walk is a classic strategy for recommendation that provides a feasible solution to the above challenges. However, most of the existing recommendation methods based on random walk weigh only the forward search and ignore the significance of reverse social relationships. In this paper, we propose a method to recommend friends by integrating reverse search into random walk. First, we introduce the FP-Growth algorithm to construct both web graphs of social networks and their corresponding transition probability matrix. Second, we define a reverse search strategy to include reverse social influences and to collaborate with random walk for recommending friends. The proposed model both optimizes the transition probability matrix and improves the search mode to provide better recommendation performance. Experimental results on real datasets showed that the proposed method performs better than the naive random walk method, which considers only the forward search mode.
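A rough sketch of combining forward and reverse search in a random walk with restart: the reverse direction is modeled here by row-normalizing the transposed adjacency matrix, and `beta` mixes the two directions. Both choices are illustrative assumptions — the paper builds its graph and transition matrix via FP-Growth, which is not reproduced:

```python
import numpy as np

def transition_matrix(adj):
    """Row-normalize an adjacency matrix into transition probabilities."""
    adj = np.asarray(adj, dtype=float)
    rows = adj.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1.0  # avoid division by zero for dangling nodes
    return adj / rows

def random_walk_scores(adj, source, restart=0.15, beta=0.5, iters=100):
    """Random walk with restart where the transition matrix blends the
    forward direction with its reverse (transpose-normalized)."""
    P_fwd = transition_matrix(adj)
    P_rev = transition_matrix(np.asarray(adj).T)
    P = beta * P_fwd + (1 - beta) * P_rev
    n = P.shape[0]
    r = np.zeros(n)
    r[source] = 1.0
    p = r.copy()
    for _ in range(iters):
        p = (1 - restart) * p @ P + restart * r
    return p  # higher score = stronger recommendation candidate
```

Non-neighbors of `source` with high scores would then be proposed as friends.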

Keywords: Social networks, friend recommendation, reverse search.

Received September 2, 2017; accepted April 25, 2018
https://doi.org/10.34028/iajit/17/3/2



A Deep Learning based Arabic Script Recognition System: Benchmark on KHAT

Riaz Ahmad1, Saeeda Naz2, Muhammad Afzal3, Sheikh Rashid4, Marcus Liwicki5, and Andreas Dengel6

1Shaheed Benazir Bhutto University, Sheringal, Pakistan

2Computer Science Department, GGPGC No.1 Abbottabad, Pakistan 

3Mindgarage, University of Kaiserslautern, Germany 

 4Al Khwarizmi Institute of Computer Science, UET Lahore, Pakistan

5Department of Computer Science, Luleå University of Technology, Sweden

6German Research Center for Artificial Intelligence (DFKI) in Kaiserslautern, Germany

Abstract: This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. This paper contributes mainly in three aspects, i.e., (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step includes pruning of extra white spaces and de-skewing the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes, and fine inflections. Data augmentation combined with the deep learning approach achieves a promising improvement in results, gaining an 80.02% Character Recognition (CR) rate over the 75.08% baseline.
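The white-space pruning step of the pre-processing can be illustrated with a minimal sketch: crop away leading and trailing rows and columns that contain only background. Background is assumed to be 0 here, and the paper's actual thresholds are not reproduced:

```python
import numpy as np

def prune_whitespace(img, thresh=0):
    """Crop all-background border rows/columns from a text-line image.
    `img` is a 2D array where background pixels are 0 (ink > 0)."""
    rows = np.where(img.sum(axis=1) > thresh)[0]
    cols = np.where(img.sum(axis=0) > thresh)[0]
    if rows.size == 0 or cols.size == 0:
        return img  # blank image: nothing to crop
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

De-skewing would follow as a separate step on the cropped line.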

Keywords: Handwritten Arabic text recognition, deep learning, data augmentation.

Received July 11, 2017; accepted April 25, 2018
https://doi.org/10.34028/iajit/17/3/3
 

A Fog Computing-based Framework for Privacy Preserving IoT Environments

Dhiah el Diehn Abou-Tair1, Simon Büchsenstein2, and Ala Khalifeh1

1School of Electrical Engineering and Information Technology, German Jordanian University, Jordan

2Embedded Systems Engineering, University of Freiburg, Germany

Abstract: Privacy is becoming an indispensable component in the emerging Internet of Things (IoT) context. However, IoT-based devices and tools are exposed to several security and privacy threats, especially as these devices are mainly used to gather data about users’ habits, vital signs, surrounding environment, etc., which makes them a lucrative target for intruders. To date, conventional security and privacy mechanisms are not well optimized for IoT devices due to their limited energy, storage capacity, communication functionality, and computing power, which has influenced researchers to propose new solutions and algorithms to handle these limitations. Fog and cloud computing have recently been integrated into IoT environments to overcome these resource limitations, thus facilitating new scenario-oriented applications. In this paper, a security and privacy preserving framework is proposed that utilizes fog and cloud computing in conjunction with IoT devices and aims at securing users’ data and protecting their privacy. The framework has been implemented and tested using available technologies. Furthermore, a security analysis has been verified by simulating several hypothetical attack scenarios, which showed the effectiveness of the proposed framework and its capability of protecting users’ information.

Keywords: Internet of Things, cloud computing, fog computing, privacy, security.

Received July 10, 2019; accepted November 24, 2019
https://doi.org/10.34028/iajit/17/3/4

Self-Organizing Map vs Initial Centroid Selection Optimization to Enhance K-Means with Genetic Algorithm to Cluster Transcribed Broadcast News Documents

Ahmed Maghawry1, Yasser Omar1, and Amr Badr2

1Department of Computer Science, Arab Academy for Science and Technology, Egypt

2Department of Computer Science, Cairo University, Egypt

Abstract: A compilation of artificial intelligence techniques is employed in this research to enhance the process of clustering transcribed text documents obtained from audio sources. Many clustering techniques suffer from drawbacks that may cause the algorithm to tend towards suboptimal solutions; handling these drawbacks is essential to get better clustering results and avoid suboptimal solutions. The main target of our research is to enhance automatic topic clustering of transcribed speech documents, and to examine the difference between implementing the K-means algorithm using our Initial Centroid Selection Optimization (ICSO) [16] with genetic algorithm optimization and a Chi-square similarity measure to cluster a dataset, and using a self-organizing map to enhance the clustering of the same dataset; both techniques are compared in terms of accuracy. The evaluation showed that using K-means with ICSO and a genetic algorithm achieved the highest average accuracy.
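The role of optimized initial centroids can be illustrated with a bare-bones K-means that accepts caller-supplied seeds; the ICSO/genetic-algorithm seeding itself and the Chi-square similarity measure are not reproduced — plain Euclidean distance stands in:

```python
import numpy as np

def kmeans(X, centroids, iters=50):
    """K-means with caller-supplied initial centroids (standing in for
    the ICSO/GA seeding described above). Returns (labels, centroids)."""
    C = np.asarray(centroids, dtype=float)
    for _ in range(iters):
        # distances from every point to every centroid, shape (n, k)
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for k in range(len(C)):
            pts = X[labels == k]
            if len(pts):
                C[k] = pts.mean(axis=0)
    return labels, C
```

Better seeds make the assignment step start near the true cluster structure, which is exactly what the centroid-selection optimization targets.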

Keywords: Clustering, k-means, self-organizing maps, genetic algorithm, speech transcripts, centroid selection.

Received May 21, 2017; accepted July 10, 2018
https://doi.org/10.34028/iajit/17/3/5

A Semantic Framework for Extracting Taxonomic Relations from Text Corpus

Phuoc Thi Hong Doan, Ngamnij Arch-int, and Somjit Arch-int

 Department of Computer Science, Khon Kaen University, Thailand

Abstract: Nowadays, ontologies are exploited in many applications due to their ability to represent knowledge and infer new knowledge. However, the manual construction of ontologies is tedious and time-consuming, so automated ontology construction from text has been investigated. The extraction of taxonomic relations between concepts is a crucial step in constructing domain ontologies. To obtain taxonomic relations from a text corpus, especially when the data is deficient, the approach of using the web as a source of collective knowledge (a.k.a. the web-based approach) is usually applied. The important challenge of this approach is how to collect relevant knowledge from a large number of web pages. To overcome this issue, we propose a framework that combines Word Sense Disambiguation (WSD) and the web-based approach to extract taxonomic relations from a domain-text corpus. This framework consists of two main stages: concept extraction and taxonomic-relation extraction. Concepts acquired from the concept-extraction stage are disambiguated through the WSD module and passed to the taxonomic-relation extraction stage afterward. To evaluate the efficiency of the proposed framework, we conduct experiments on datasets in two domains, tourism and sport. The obtained results show that the proposed method is efficient on corpora which are insufficient or have no training data. Besides, the proposed method outperforms the state-of-the-art method on corpora having high WSD results.
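The taxonomic-relation extraction stage can be illustrated with classic Hearst-style lexico-syntactic patterns ("X such as Y, Z") — a simplified, hypothetical stand-in for the framework's web-based extraction, not the method it actually uses:

```python
import re

# Two classic Hearst-style patterns; real systems use many more,
# plus parsing. These are illustrative only.
PATTERNS = [
    re.compile(r"(\w+)s? such as ((?:\w+(?:, )?)+)"),
    re.compile(r"(\w+)s?,? including ((?:\w+(?:, )?)+)"),
]

def extract_taxonomic(text):
    """Return (hypernym, hyponym) pairs matched by the patterns."""
    pairs = []
    for pat in PATTERNS:
        for m in pat.finditer(text):
            hyper = m.group(1).lower()
            for hypo in re.split(r",\s*", m.group(2)):
                if hypo:
                    pairs.append((hyper, hypo.strip().lower()))
    return pairs
```

In a web-based setting, such patterns would be run over retrieved snippets, and the WSD stage would have already fixed each concept's sense.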

Keywords: Taxonomic relation, ontology construction, word sense disambiguation, knowledge acquisition.

Received September 22, 2017; accepted October 28, 2018
https://doi.org/10.34028/iajit/17/3/6

A Contrivance to Encapsulate Virtual Scaffold with Comments and Notes

Nagarajan Balasubramanaian1, Suguna Jayapal2, and Satheeshkumar Janakiraman3

1Department of Computer Applications, Arunai Engineering College, India

2Department of Computer Science, Vellalar College for Women, India

3Department of Computer Science, Bharathiar University, India

Abstract: CLOUD is an acronym for Common Location-independent Online Utility available on-Demand and is based on Service Oriented Architecture (SOA). Today, many researchers are working towards contrivances based on multi-tenant-aware Software as a Service (SaaS) application development, and a precise, pragmatic solution still remains a challenge among researchers. The first step towards a solution is to enhance the virtual scaffold and propose it as a System under Test (SuT). The entire work is proposed as a Model View Controller (MVC), where tenants log in through the View and write their snippet code for encapsulation. The proposed VirScaff schema acts as the Controller and provides authentication and authorization by role/session assignment for tenants, thus helping them access data from the dashboard (viz., Create, Read, Update, and Delete (CRUD)). The SuT supports and accommodates both SQL and Not only Structured Query Language (NoSQL) datasets. Finally, this paper shows that the SuT behaves well for both SQL and NoSQL datasets in terms of time and space complexities. To sum up, the entire work addresses the challenges of multi-tenant-aware SaaS application development and performs particularly well with NoSQL datasets.

Keywords: Virtual scaffold, Multi-Tenant common gateway, pattern, model view controller, role-based access control, JavaScript object notation, not only structured query language, software as a service.

Received July 14, 2017; accepted July 28, 2019

https://doi.org/10.34028/iajit/17/3/7


Using Total Probability in Image Template Matching

Haval Sadeq

College of Engineering, Salahaddin University-Erbil, Erbil, Iraq

Abstract: Image template matching is a main task in photogrammetry and computer vision. Matching can be used to automatically determine the 3D coordinates of a point. One of the earliest image matching methods in photogrammetry and computer vision is area-based matching, which measures correlation using normalised cross-correlation. However, this method fails at discontinuous edges, in areas of low illumination, or under geometric distortion caused by changes in imaging location; such points are considered outliers. The proposed method measures normalised cross-correlation at each point using windows of various sizes and then considers the probability of correlation for each window. Thereafter, the determined probability values are integrated. On the basis of a specific threshold value, the point of maximum total probability correlation is recognised as a corresponding point. The algorithm is applied to aerial images for Digital Surface Model (DSM) generation. Results show that the corresponding points are identified successfully at different locations, especially at discontinuous points, and that a high-resolution Digital Surface Model is generated.
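A minimal sketch of the multi-window idea: compute normalised cross-correlation at a candidate point for several window sizes, map each value to a pseudo-probability, and combine them. Averaging is used here as the combination rule; the paper's actual integration and thresholding are not reproduced:

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation of two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def total_probability_score(left, right, row, col_l, col_r,
                            windows=(3, 5, 7)):
    """Combine NCC over several window sizes at one candidate match.
    (row, col_l) indexes the left image, (row, col_r) the right."""
    probs = []
    for w in windows:
        h = w // 2
        a = left[row - h:row + h + 1, col_l - h:col_l + h + 1]
        b = right[row - h:row + h + 1, col_r - h:col_r + h + 1]
        probs.append((ncc(a, b) + 1.0) / 2.0)  # map [-1, 1] -> [0, 1]
    return sum(probs) / len(probs)
```

A candidate `col_r` maximizing this combined score, above a threshold, would be accepted as the corresponding point.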

Keywords: Digital surface model, template image matching, normalised cross-correlation, probability.

Received February 11, 2018; accepted June 11, 2019
https://doi.org/10.34028/iajit/17/3/8

A Novel Physical Machine Overload Detection Algorithm Combined with Quiescing for Dynamic Virtual Machine Consolidation in Cloud Data Centers

Loiy Alsbatin1, Gürcü Öz1, and Ali Ulusoy2

1Department of Computer Engineering, Eastern Mediterranean University, North Cyprus via Mersin 10 Turkey

2Department of Information Technology, Eastern Mediterranean University, North Cyprus via Mersin 10 Turkey

Abstract: Further growth of computing performance has begun to be limited by the increasing energy consumption of cloud data centers; it is therefore important to pay attention to resource management. Dynamic virtual machine consolidation is a successful approach to improve the utilization of resources and energy efficiency in cloud environments. Consequently, optimizing the online energy-performance trade-off directly influences Quality of Service (QoS). In this paper, a novel approach known as Percentage of Overload Time Fraction Threshold (POTFT) is proposed, which decides to migrate a Virtual Machine (VM) if the current Overload Time Fraction (OTF) value of a Physical Machine (PM) exceeds a defined percentage of the maximum allowed OTF value, so as to avoid exceeding the maximum allowed resulting OTF value after a decision of VM migration or during VM migration. The proposed POTFT algorithm is also combined with VM quiescing to maximize the time until migration while meeting the QoS goal. A number of benchmark PM overload detection algorithms are implemented using different parameters for comparison with POTFT with and without VM quiescing. We evaluate the algorithms through simulations with real-world workload traces, and the results show that the proposed approaches outperform the benchmark PM overload detection algorithms. The results also show that the proposed approaches lead to a longer time until migration by keeping average resulting OTF values below the allowed values. Moreover, the POTFT algorithm with VM quiescing is able to minimize the number of migrations according to QoS requirements and meet the OTF constraint with a few quiescings.
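The core POTFT decision can be sketched as follows, assuming one boolean overload observation per monitoring interval; the parameter values and the simple counting of intervals are illustrative, not the paper's exact scheme:

```python
def should_migrate(overload_history, max_otf=0.1, percentage=0.8):
    """POTFT-style check (sketch): trigger a VM migration when the
    PM's current Overload Time Fraction exceeds `percentage` of the
    maximum allowed OTF.

    overload_history -- booleans, one per monitoring interval,
                        True when the PM was overloaded."""
    if not overload_history:
        return False
    otf = sum(overload_history) / len(overload_history)
    return otf > percentage * max_otf
```

Deciding at a fraction of the allowed maximum leaves headroom so the resulting OTF does not overshoot the constraint while the migration is in progress.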

Keywords: Distributed systems, cloud computing, dynamic consolidation, overload detection and energy efficiency.

Received November 15, 2017; accepted July 29, 2018
https://doi.org/10.34028/iajit/17/3/9

Issues of Dialectal Saudi Twitter Corpus

Meshrif Alruily

College of Computer and Information Sciences, Jouf University, Saudi Arabia

Abstract: Text mining research relies heavily on the availability of a suitable corpus. This paper presents a dialectal Saudi corpus that contains 207,452 tweets generated by Saudi Twitter users. In addition, a comparison between the Saudi tweets dataset, an Egyptian Twitter corpus, and an Arabic top news raw corpus (representing Modern Standard Arabic (MSA)) was carried out in various aspects, such as the differences between formal and colloquial texts. Moreover, an investigation into issues and phenomena such as shortening, concatenation, colloquial language, compounding, foreign language, spelling errors, and neologisms in this type of dataset was performed.

Keywords: Microblogs, tweets, Saudi colloquial, corpus and modern standard Arabic.

Received January 27, 2018; accepted August 13, 2018
https://doi.org/10.34028/iajit/17/3/10


An Enhanced MSER Pruning Algorithm for Detection and Localization of Bangla Texts from Scene Images

Rashedul Islam, Rafiqul Islam, and Kamrul Talukder

Computer Science and Engineering Discipline, Khulna University, Bangladesh

 

Abstract: Text detection and localization have great importance for content-based image analysis and text-based image indexing. The efficiency of text recognition depends on the efficiency of text localization, so the main goal of the proposed method is to detect and localize text regions with high accuracy. To achieve this goal, a new and efficient method has been introduced for the localization of Bangla text in scene images. In order to improve precision and recall as well as f-measure, a Maximally Stable Extremal Region (MSER) based method along with double filtering techniques has been used. As the MSER algorithm generates many false positives, we have introduced a double filtering method for removing these false positives to increase the f-measure to a great extent. Our proposed method works at three basic levels. Firstly, MSER regions are generated from the input color image by converting it into a gray-scale image. Secondly, some heuristic features are used to filter out most of the false positives, or non-text regions. Lastly, a Stroke Width Transform (SWT) based filtering method is used to filter out the remaining non-text regions. The remaining components are then grouped into candidate text regions, each marked by a bounding box. As there is no benchmark database for Bangla text, the proposed method was implemented on our own database consisting of 200 scene images of Bangla text and achieved prominent performance. To evaluate the performance of our approach, we have also tested the proposed method on the International Conference on Document Analysis and Recognition (ICDAR) 2013 benchmark database and obtained better results than the related existing methods.
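The heuristic filtering level can be illustrated with a simple aspect-ratio and area filter over candidate bounding boxes. The thresholds below are illustrative assumptions, and the MSER generation and SWT stages are not reproduced:

```python
def heuristic_filter(regions, min_aspect=0.1, max_aspect=10.0,
                     min_area=20, max_area=50000):
    """First-pass filter over candidate (x, y, w, h) boxes: discard
    regions whose aspect ratio or area is implausible for text.
    Threshold values here are illustrative, not the paper's."""
    kept = []
    for (x, y, w, h) in regions:
        if h == 0:
            continue
        aspect = w / h
        area = w * h
        if min_aspect <= aspect <= max_aspect and min_area <= area <= max_area:
            kept.append((x, y, w, h))
    return kept
```

Survivors of this cheap geometric pass would then go to the more expensive stroke-width-based filter.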

Keywords: MSER, scene image, ICDAR, aspect ratio, Euler number, Bangla text.

Received July 27, 2017; accepted June 19, 2018
https://doi.org/10.34028/iajit/17/3/11


A Smart Card Oriented Secure Electronic Voting Machine Built on NTRU

Safdar Shaheen1, Muhammad Yousaf1, and Mudassar Jalil2

1Riphah Institute of Systems Engineering, Riphah International University, Pakistan

2Department of Mathematics, COMSATS Institute of Information Technology, Pakistan

Abstract: Free and fair elections are indispensable for quantifying the sentiments of the populace when forming a government of representatives in democratic countries. Due to its complexity and procedural variation from country to country, this is a challenging task to manage. Since orthodox paper-based electoral systems are slow and error-prone, a secure and efficient electoral system has always remained a key area of research. Although a lot of literature is available on this topic, reported anomalies and weaknesses in the American and French elections of 2016 have once again made it a pivotal subject of research. In this article, we propose a new secure and efficient electronic voting scheme based on the public key cryptosystem dubbed Number Theory Research Unit (NTRU). Furthermore, an efficient and robust three-factor authentication protocol based on a personalized memorable password, a smartcard, and BioHash is proposed to validate the legitimacy of a voter for casting a legal vote. NTRU-based blind signatures are used to preserve the anonymity and privacy of votes and voters, whereas secure and efficient counting of votes is achieved through an NTRU-based homomorphic tally. Non-coercibility and individual verifiability are attained through the MarkPledge scheme. The proposed electronic voting scheme is secure, transparent, and efficient for large-scale elections.

Keywords: EVM, blind signature, homomorphic tally, smart card, NTRU.

Received July 29, 2017; accepted June 19, 2018
https://doi.org/10.34028/iajit/17/3/12

Direct Text Classifier for Thematic Arabic Discourse Documents

Khalid Nahar1, Ra’ed Al-Khatib1, Moy'awiah Al-Shannaq1, Mohammad Daradkeh2, and Rami Malkawi3

1Department of Computer Sciences, Yarmouk University, Jordan

2Department of Management Information System, Yarmouk University, Jordan

3Department of Computer Information System, Yarmouk University, Jordan

Abstract: Maintaining topical coherence while writing a discourse is a major challenge confronting novice and non-novice writers alike. This challenge is even more intense for Arabic discourse because of the complex morphology and the widespread use of synonyms in the Arabic language. In this research, we present a direct classification of Arabic discourse documents during writing. The proposed framework consists of the following stages: data collection, pre-processing, construction of a Language Model (LM), topic identification, topic classification, and topic notification. To demonstrate the proposed framework, we designed a system and applied it to a corpus of 2800 Arabic discourse documents synthesized into four predefined topics: Culture, Economy, Sport, and Religion. System performance was analysed in terms of accuracy, recall, precision, and F-measure. The results demonstrated that the proposed topic modeling-based decision framework is able to classify topics while writing a discourse with an accuracy of 91.0%.
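The language-model-based classification stage can be sketched with a simple per-topic n-gram count model — a hypothetical simplification; the paper's actual LM construction and Arabic pre-processing are not reproduced:

```python
from collections import Counter

def ngrams(tokens, n=2):
    """Yield n-gram tuples from a token list."""
    return zip(*(tokens[i:] for i in range(n)))

def train(topic_docs, n=2):
    """topic_docs: {topic: [token lists]} -> per-topic n-gram counts."""
    models = {}
    for topic, docs in topic_docs.items():
        counts = Counter()
        for doc in docs:
            counts.update(ngrams(doc, n))
        models[topic] = counts
    return models

def classify(tokens, models, n=2):
    """Assign the topic whose n-gram counts best cover the document."""
    grams = list(ngrams(tokens, n))
    return max(models, key=lambda t: sum(models[t][g] for g in grams))
```

In the writing-assistance setting described above, `classify` would run on the text typed so far and notify the writer when the detected topic drifts.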

Keywords: Text mining, Arabic discourse, text classification, topic modeling, n-gram language model, topical coherence.

Received February 24, 2018; accepted August 13, 2018
https://doi.org/10.34028/iajit/17/3/13

A Novel Image Retrieval Technique using Automatic and Interactive Segmentation

Asjad Amin and Muhammad Qureshi

Department of Telecommunication Engineering, The Islamia University of Bahawalpur, Pakistan

Abstract: In this paper, we present a new region-based image retrieval technique based on robust image segmentation. Traditional content-based image retrieval deals with the global description of a query image. We combine state-of-the-art segmentation algorithms with the traditional approach to narrow the area of interest to a specific region within a query image. In the case of automatic segmentation, the algorithm divides a query image automatically and computes Zernike moments for each region. For interactive segmentation, our proposed scheme takes as input a query image and some information regarding the region of interest. The scheme then computes a Geodesic-based segmentation of the query image. The segmented image is our region of interest, which is then used for computing the Zernike moments. The Euclidean distance is then used to retrieve relevant images. The experimental results clearly show that the proposed scheme works efficiently and produces excellent results.
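The final retrieval step can be sketched as a nearest-neighbor ranking by Euclidean distance over per-region feature vectors (standing in for the Zernike moments, whose computation is not reproduced here):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query_features, database, top_k=3):
    """Rank database images by distance between the query region's
    feature vector and each stored vector.
    database: {image_id: feature vector}"""
    ranked = sorted(database,
                    key=lambda k: euclidean(query_features, database[k]))
    return ranked[:top_k]
```

Because the query vector describes only the segmented region of interest, the ranking reflects region-level rather than whole-image similarity.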

Keywords: CBIR, information retrieval, image segmentation, multimedia image retrieval.

Received September 29, 2017; accepted June 19, 2018
https://doi.org/10.34028/iajit/17/3/14

A New Metric for Class Cohesion for Object Oriented Software

Anjana Gosain1 and Ganga Sharma2

1University School of Information, Communication and Technology, Guru Gobind Singh Indraprastha University, India

2School of Engineering, G D Goenka University, India

Abstract: Various class cohesion metrics exist in the literature, both at the design level and the source code level, to assess the quality of Object Oriented (OO) software. However, the idea of cohesive interactions (or relationships) between instance variables (i.e., attributes) and methods of a class for measuring cohesion varies from one metric to another. Some authors have used instance variable usage by the methods of a class to measure class cohesion, while others focus on the similarity of methods based on the sharing of instance variables. However, researchers believe that such metrics still do not properly capture the cohesiveness of classes, so measures based on a different perspective on the idea of cohesive interactions should be developed. Consequently, in this paper, we propose a source code level class cohesion metric based on instance variable usage by methods. We first formalize three types of cohesive interactions and then categorize these cohesive interactions by assigning them ranks and weights in order to compute the proposed measure. To determine the usefulness of the proposed measure, theoretical validation using a property-based axiomatic framework has been done. For empirical validation, we have used Pearson correlation analysis and logistic regression in an experimental study conducted on 28 Java classes to determine the relationship between the proposed measure and the maintenance effort of classes. The results indicate that the proposed cohesion measure is strongly correlated with maintenance effort and can serve as a good predictor of it.
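A simplified interaction-based cohesion measure in the spirit of the proposal: count a pair of methods as cohesively interacting when they share at least one instance variable. The paper's three interaction types and their ranks/weights are not reproduced:

```python
from itertools import combinations

def cohesion(method_vars):
    """Fraction of method pairs sharing at least one instance variable.
    method_vars: {method name: set of instance variables it uses}.
    A class with 0 or 1 methods is treated as fully cohesive."""
    pairs = list(combinations(method_vars, 2))
    if not pairs:
        return 1.0
    shared = sum(1 for a, b in pairs if method_vars[a] & method_vars[b])
    return shared / len(pairs)
```

A weighted variant would multiply each interacting pair by the rank/weight of its interaction type before normalizing, which is where the proposed metric departs from this sketch.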

Keywords: Class cohesion, metrics, OO software, maintenance-effort, metric validation.

Received June 18, 2017; accepted March 11, 2018
https://doi.org/10.34028/iajit/17/3/15


Gene Expression Prediction Using Deep Neural Networks

Raju Bhukya and Achyuth Ashok

Department of Computer Science and Engineering, National Institute of Technology, India

Abstract: In the field of molecular biology, gene expression is a term that encompasses all the information contained in an organism’s genome. Although researchers have developed several clinical techniques to quantitatively measure the expression of genes, these are too costly to be used extensively. The NIH LINCS program revealed that human gene expressions are highly correlated. Further research at the University of California, Irvine (UCI) led to the development of D-GEX, a Multi-Layer Perceptron (MLP) model trained to predict unknown target expressions from previously identified landmark expressions. However, owing to hardware limitations, the target genes were split into different sets and separate models were constructed to profile the whole genome. This paper proposes an alternative solution using a combination of a deep autoencoder and an MLP to overcome this bottleneck and improve prediction performance. The microarray-based Gene Expression Omnibus (GEO) dataset was employed to train the neural networks. Experimental results show that the new model, abbreviated as E-GEX, outperforms D-GEX by 16.64% in terms of overall prediction accuracy on the GEO dataset. The models were further tested on an RNA-Seq based 1000G dataset, where E-GEX was found to be 49.23% more accurate than D-GEX.
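A toy stand-in for the regression MLP: a one-hidden-layer network trained with plain gradient descent to map landmark-like inputs to target-like outputs. The architecture and hyperparameters are illustrative and far smaller than D-GEX/E-GEX, and the autoencoder stage is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mlp(X, Y, hidden=8, lr=0.1, epochs=500):
    """Tiny one-hidden-layer regression MLP (tanh activation),
    trained with full-batch gradient descent on mean squared error.
    Returns a prediction function."""
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, Y.shape[1])); b2 = np.zeros(Y.shape[1])
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)       # hidden activations
        P = H @ W2 + b2                # linear output layer
        G = 2 * (P - Y) / len(X)       # dLoss/dP for MSE
        gW2 = H.T @ G; gb2 = G.sum(0)
        GH = (G @ W2.T) * (1 - H ** 2)  # backprop through tanh
        gW1 = X.T @ GH; gb1 = GH.sum(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return lambda Xn: np.tanh(Xn @ W1 + b1) @ W2 + b2
```

In the gene-expression setting, `X` would hold landmark expressions and `Y` the target expressions; D-GEX-scale models differ mainly in depth, width, and optimizer.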

Keywords: Gene expression, regression, deep learning, autoencoder, multilayer perceptron.

Received April 25, 2018; accepted October 28, 2018
https://doi.org/10.34028/iajit/17/3/16
