
ANN Based Execution Time Prediction Model and Assessment of Input Parameters through ISM

Anju Shukla, Shishir Kumar, and Harikesh Singh

Department of Computer Science and Engineering, Jaypee University of Engineering and Technology, India

Abstract: Cloud computing is an on-demand network access model that provides dynamic resource provisioning, selection, and scheduling. The performance of these techniques depends extensively on the prediction of various factors, e.g., task execution time and resource trust value. Since the accuracy of a prediction model depends heavily on the input data fed into the network, the selection of suitable inputs also plays a vital role in predicting an appropriate value. Based on the predicted value, the scheduler can choose a suitable resource and perform scheduling for efficient resource utilization and a reduced makespan. However, precise prediction of execution time is difficult in a cloud environment due to the heterogeneous nature of resources and varying input data. As each task has different characteristics and execution criteria, the environment must be intelligent enough to select a suitable resource. To address these issues, an Artificial Neural Network (ANN) based prediction model is proposed to predict the execution time of tasks. First, input parameters are identified and selected through the Interpretive Structural Modeling (ISM) approach. Second, a prediction model is proposed for predicting task execution time for a varying number of inputs. Third, the proposed model is validated and provides a 21.72% reduction in mean relative error compared to other state-of-the-art methods.
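
A minimal sketch of the core idea, for illustration only: a small feed-forward ANN regressor trained to predict task execution time and evaluated by mean relative error. The input features used here (task length, CPU speed, bandwidth, data size) and the synthetic data are assumptions; the paper selects its actual input parameters through ISM.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
task_len  = rng.uniform(1e3, 1e5, n)   # task length in million instructions (assumed feature)
cpu_speed = rng.uniform(1e3, 4e3, n)   # resource speed in MIPS (assumed feature)
bandwidth = rng.uniform(10, 100, n)    # link bandwidth in Mbps (assumed feature)
data_size = rng.uniform(1, 500, n)     # input data size in MB (assumed feature)
# Synthetic ground truth: compute time + transfer time, with mild noise
exec_time = (task_len / cpu_speed + 8 * data_size / bandwidth) * rng.normal(1.0, 0.05, n)

X = np.column_stack([task_len, cpu_speed, bandwidth, data_size])
X_tr, X_te, y_tr, y_te = train_test_split(X, exec_time, test_size=0.2, random_state=0)

model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0))
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
mre = np.mean(np.abs(pred - y_te) / y_te)   # mean relative error, the metric reported above
print(f"Mean relative error: {mre:.4f}")
```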

Keywords: Cloud computing, neural network, Prediction model, Resource selection.

Received September 20, 2018; accepted January 28, 2020

https://doi.org/10.34028/iajit/17/5/1


Otsu’s Thresholding Method Based on Plane Intercept Histogram and Geometric Analysis

Leyi Xiao1, Honglin Ouyang1, and Chaodong Fan2

1College of Electrical and Information Engineering, Hunan University, China

2Foshan Green Intelligent Manufacturing Research Institute, Xiangtan University, China

Abstract: The Three-Dimensional (3-D) Otsu’s method is an effective improvement on the traditional Otsu’s method. However, it not only has high computational complexity but also limited anti-noise ability. This paper presents a new Otsu’s method based on the 3-D histogram. The method transforms the 3-D histogram into a 1-D histogram by means of a plane perpendicular to the main diagonal of the 3-D histogram, and designs a new maximum-variance criterion for threshold selection. To enhance its anti-noise ability, a noise-correcting method based on geometric analysis is used for image segmentation. Simulation experiments show that this method has stronger anti-noise ability and lower time consumption compared with the conventional 3-D Otsu’s method, the recursive 3-D Otsu’s method, the 3-D Otsu’s method with SFLA, the equivalent 3-D Otsu’s method, and the improved 3-D Otsu’s method.
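
For reference, the sketch below shows the classical 1-D Otsu criterion (maximizing between-class variance), which is the building block the paper generalizes. The plane-intercept projection of the 3-D histogram and the geometric noise correction are not reproduced here.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold maximizing between-class variance for an 8-bit image."""
    hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
    p = hist.astype(float) / hist.sum()
    omega = np.cumsum(p)                    # class-0 probability up to each grey level
    mu = np.cumsum(p * np.arange(256))      # cumulative mean
    mu_t = mu[-1]                           # global mean
    # Between-class variance for every candidate threshold t
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))

# Usage (assumes an 8-bit grayscale image loaded as a NumPy array `img`):
# t = otsu_threshold(img); binary = (img > t).astype(np.uint8) * 255
```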

Keywords: 3-D Otsu’s method, threshold selection, Otsu’s method, 3-D histogram, image segmentation.

Received October 27, 2018; accepted January 28, 2020

https://doi.org/10.34028/iajit/17/5/2

A Deep Learning Based Prediction of Arabic Manuscripts Handwriting Style

Manal Khayyat1 and Lamiaa Elrefaei2

1Computer Science Department, King Abdulaziz University, Saudi Arabia

2Electrical Engineering Department, Benha University, Egypt

Abstract: With the increasing amount of unorganized images on the internet today and the need to use them efficiently in various types of applications, there is a critical need for robust models that can classify and predict images successfully and instantaneously. Therefore, this study aims to collect Arabic manuscript images in a dataset and predict their handwriting styles using the most powerful and trending technologies. There are many types of Arabic handwriting styles, including Al-Reqaa, Al-Nask, Al-Thulth, Al-Kufi, Al-Hur, Al-Diwani, Al-Farsi, Al-Ejaza, Al-Maghrabi, and Al-Taqraa. The study classified the collected dataset images according to handwriting style and focused on the six styles present in the collected Arabic manuscripts. To reach our goal, we applied the MobileNet pre-trained deep learning model to our classified dataset images to automatically capture and extract their features. Afterward, we evaluated the performance of the developed model by computing its evaluation metrics. We found that the MobileNet convolutional neural network is a promising technology, as it reached 0.9583 as the highest recorded accuracy and 0.9633 as the average F-score.
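
A hedged sketch of MobileNet-based transfer learning for a six-class handwriting-style task, using Keras. The directory layout, image size, and training settings are illustrative assumptions and may differ from the paper's configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 6            # six handwriting styles present in the collected manuscripts
IMG_SIZE = (224, 224)      # MobileNet's default input resolution

base = tf.keras.applications.MobileNet(include_top=False, weights="imagenet",
                                        input_shape=IMG_SIZE + (3,), pooling="avg")
base.trainable = False     # use the pre-trained network purely as a feature extractor

model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1.0),   # map pixel values to [-1, 1]
    base,
    layers.Dropout(0.2),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Hypothetical directory layout: one sub-folder per handwriting style
train_ds = tf.keras.utils.image_dataset_from_directory(
    "manuscripts/train", image_size=IMG_SIZE, label_mode="categorical", batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "manuscripts/val", image_size=IMG_SIZE, label_mode="categorical", batch_size=32)

model.fit(train_ds, validation_data=val_ds, epochs=10)
```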

Keywords: Deep Learning Model, Convolutional Neural Network, Handwriting Style Prediction, Arabic Manuscript Images.

Received October 6, 2019; accepted April 6, 2020

https://doi.org/10.34028/iajit/17/5/3


Saliency Cuts: Salient Region Extraction based on Local Adaptive Thresholding for Image Information Recognition of the Visually Impaired

Mukhriddin Mukhiddinov1, Rag-Gyo Jeong2, and Jinsoo Cho3

1Department of Hardware and Software of Control Systems in Telecommunications, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, Uzbekistan

2Korea Railroad Research Institute, Uiwang, Gyeonggi-do 16105, Republic of Korea

3Department of Computer Engineering, Gachon University, Republic of Korea

Abstract: In recent years, there has been an increased scope for assistive software and technologies that help the visually impaired perceive and recognize natural scene images. In this article, we propose a novel saliency cuts approach using local adaptive thresholding to obtain four regions from a given saliency map. The saliency cuts approach is an effective tool for salient object detection. First, we produce four regions for image segmentation using a saliency map as an input image and applying an automatic threshold operation. Second, the four regions are used to initialize an iterative version of the GrabCut algorithm and to produce a robust and high-quality binary mask at full resolution. Lastly, based on the binary mask and the extracted salient object, outer boundaries and internal edges are detected by the Canny edge detection method. Extensive experiments demonstrate that the proposed method correctly detects and extracts the main contents of image sequences for delivering visually salient information to visually impaired people, compared with the results of existing salient object segmentation algorithms.
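
A rough sketch of the pipeline described above, using OpenCV: the saliency map is split into four GrabCut regions, GrabCut is run in mask-initialization mode, and Canny edges are extracted from the result. The global cut-off values used here are placeholders; the paper derives its four regions with local adaptive thresholding.

```python
import cv2
import numpy as np

def saliency_cuts(image_bgr, saliency_map):
    """image_bgr: 8-bit colour image; saliency_map: float map in [0, 1]."""
    s = (saliency_map * 255).astype(np.uint8)
    mask = np.full(s.shape, cv2.GC_BGD, np.uint8)   # sure background
    mask[s > 64]  = cv2.GC_PR_BGD                   # probable background (cut-offs assumed)
    mask[s > 128] = cv2.GC_PR_FGD                   # probable foreground
    mask[s > 192] = cv2.GC_FGD                      # sure foreground

    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, None, bgd_model, fgd_model,
                iterCount=5, mode=cv2.GC_INIT_WITH_MASK)

    binary = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
    salient = cv2.bitwise_and(image_bgr, image_bgr, mask=binary)
    gray = cv2.cvtColor(salient, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)               # outer boundaries and internal edges
    return binary, salient, edges
```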

Keywords: Saliency region extraction, saliency map, saliency cuts, local adaptive thresholding, the visually impaired.

Received February 7, 2018; accepted January 6, 2020
https://doi.org/10.34028/iajit/17/5/4


A Novel Feature Selection Method Based on Maximum Likelihood Logistic Regression for Imbalanced Learning in Software Defect Prediction

Kamal Bashir1, Tianrui Li1, and Mahama Yahaya2

1School of Information Science and Technology, Southwest Jiaotong University, China

2School of Transport and Logistics Engineering, Southwest Jiaotong University, China

Abstract: The most frequently used machine learning feature ranking approaches fail to present an optimal feature subset for accurate prediction of defective software modules in out-of-sample data. Machine learning Feature Selection (FS) algorithms such as Chi-Square (CS), Information Gain (IG), Gain Ratio (GR), RelieF (RF) and Symmetric Uncertainty (SU) perform relatively poorly at prediction, even after balancing the class distribution in the training data. In this study, we propose a novel FS method based on Maximum Likelihood Logistic Regression (MLLR). We apply this method to six software defect datasets in their sampled and unsampled forms to select useful features for classification in the context of Software Defect Prediction (SDP). The Support Vector Machine (SVM) and Random Forest (RaF) classifiers are applied to the FS subsets based on the sampled and unsampled datasets. The performance of the models, captured using the Area Under the Receiver Operating Characteristic Curve (AUC) metric, is compared for all FS methods considered. The Analysis Of Variance (ANOVA) F-test results validate the superiority of the proposed method over all the FS techniques, in both sampled and unsampled data. The results confirm that MLLR can be useful in selecting an optimal feature subset for more accurate prediction of defective modules in the software development process.
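
A simplified sketch of logistic-regression-based feature ranking in the spirit of MLLR feature selection; the paper's exact selection rule may differ. A synthetic imbalanced dataset stands in for the defect data, and the subset size k is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Imbalanced synthetic data, mimicking the class skew of defect datasets
X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           weights=[0.85, 0.15], random_state=0)

# Fit a maximum-likelihood logistic regression on standardized features and
# rank features by the magnitude of their coefficients.
X_std = StandardScaler().fit_transform(X)
lr = LogisticRegression(max_iter=1000).fit(X_std, y)
ranking = np.argsort(-np.abs(lr.coef_[0]))
top_k = ranking[:8]                       # keep the 8 highest-ranked features (assumed k)

# Evaluate a Random Forest on the selected subset using AUC.
auc = cross_val_score(RandomForestClassifier(random_state=0),
                      X[:, top_k], y, cv=5, scoring="roc_auc").mean()
print(f"Mean AUC with selected features: {auc:.3f}")
```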

Keywords: Software defect prediction, Machine learning, Class imbalance, Maximum-likelihood logistic regression.

Received April 30, 2018; accepted January 28, 2020

https://doi.org/10.34028/iajit/17/5/5


Text Similarity Computation Model for Identifying Rumor Based on Bayesian Network in Microblog

Chengcheng Li, Fengming Liu, and Pu Li

Business School, Shandong Normal University, China 

Abstract: This research studies text similarity, especially for rumor texts: a calculation model is constructed from known rumors and used to compute the similarity of new texts, so that people can recognize rumors in advance and raise their vigilance to effectively block and control rumor dissemination. Based on a Bayesian network, a similarity calculation model for microblog rumor texts was built. Considering that not only rumor texts but also rumor producers share similar characteristics, a similarity calculation model for rumor-text makers was also constructed. The text and user similarities were then integrated to establish the microblog similarity calculation model. Finally, the performance of the proposed model was studied experimentally on a microblog rumor text and user dataset. The experimental results indicate that the proposed similarity algorithm can identify rumor texts and predict user characteristics more accurately and effectively.

Keywords: Microblog Rumor, Similarity, Bayesian Network.

Received May 8, 2018; accepted March 31, 2019

https://doi.org/10.34028/iajit/17/5/6

Enhanced Latent Semantic Indexing Using Cosine Similarity Measures for Medical Application

Fawaz Al-Anzi1 and Dia AbuZeina2

1Department of Computer Engineering, Kuwait University, Kuwait

2Computer Science Department, Palestine Polytechnic University, Palestine

Abstract: The Vector Space Model (VSM) is widely used in data mining and Information Retrieval (IR) systems as a common document representation model. However, there are some challenges with this technique, such as the high-dimensional space and the semantic looseness of the representation. Consequently, Latent Semantic Indexing (LSI) was suggested to reduce the feature dimensions and to generate semantically rich features that can represent conceptual term-document associations. In fact, LSI has been effectively employed in search engines and many other Natural Language Processing (NLP) applications, and researchers continue to seek better performance. In this paper, we propose an innovative method that can be used in search engines to find better-matched content among the retrieved documents. The proposed method introduces a new extension of the LSI technique based on cosine similarity measures. The performance evaluation was carried out using an Arabic language data collection that contains 800 medical-related documents with more than 47,222 unique words. The proposed method was assessed using a small testing set that contains five medical keywords. The results show that the performance of the proposed method is superior when compared to standard LSI.
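
For orientation, the sketch below shows a standard LSI retrieval baseline (TF-IDF, truncated SVD, cosine similarity), i.e., the pipeline the paper extends; the proposed cosine-based extension itself is not reproduced, and the sample documents are English placeholders for the Arabic medical collection.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder documents standing in for the 800 Arabic medical documents
documents = [
    "diabetes treatment and insulin therapy",
    "symptoms of seasonal influenza and fever",
    "insulin dosage guidelines for diabetic patients",
    "heart disease risk factors and prevention",
]
query = "insulin for diabetes"

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(documents)                 # term-document matrix

svd = TruncatedSVD(n_components=2, random_state=0)      # latent semantic space (k assumed)
X_lsi = svd.fit_transform(X)

q_lsi = svd.transform(vectorizer.transform([query]))
scores = cosine_similarity(q_lsi, X_lsi).ravel()
print(scores.argsort()[::-1])                           # documents ranked by similarity
```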

Keywords: Arabic Text, Latent Semantic Indexing, Search Engine, Dimensionality Reduction, Text Classification.

Received December 25, 2018; accepted January 28, 2020

https://doi.org/10.34028/iajit/17/5/7


Performance Evaluation of Industrial Firms Using DEA and DECORATE Ensemble Method

Hassan Najadat1, Ibrahim Al-Daher2, and Khaled Alkhatib1

1Computer Information Systems Department, Jordan University of Science and Technology, Jordan

2Computer Science Department, Jordan University of Science and Technology, Jordan

Abstract: This study introduces an approach that combines Data Envelopment Analysis (DEA) and ensemble methods in order to classify and predict the efficiency of Decision Making Units (DMUs). The approach applies DEA in the first stage to compute the efficiency score for each DMU, then uses a variable ranker to extract the most important variables that affect the DMU’s performance, and then adopts J48 to build a classifier whose outcomes are enhanced by the Diverse Ensemble Creation by Oppositional Relabeling of Artificial Training Examples (DECORATE) ensemble method. To examine the approach, this study utilizes a dataset from the financial statements of firms listed on the Amman Stock Exchange. The dataset was preprocessed and includes 53 industrial firms for the years 2012 to 2015, with 11 input variables and 11 output ratios. The examination of financial variables and ratios plays a vital role in financial analysis practice. This paper shows that financial variable and ratio averages are points of reference for evaluating and measuring firms’ future financial performance as well as that of other similar firms in the same sector. In addition, the results of this work support comparative analyses of the financial performance of the industrial sector.
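
A hedged sketch of the first stage only: an input-oriented CCR DEA model in multiplier form, solved per DMU with linear programming. The firm data are random placeholders, and the later ranking, J48, and DECORATE steps are not shown.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_dmu, n_in, n_out = 10, 3, 2
X = rng.uniform(1, 10, (n_dmu, n_in))    # input variables per firm (placeholder data)
Y = rng.uniform(1, 10, (n_dmu, n_out))   # output ratios per firm (placeholder data)

def ccr_efficiency(o):
    # Decision variables: output weights u (n_out) followed by input weights v (n_in)
    c = np.concatenate([-Y[o], np.zeros(n_in)])            # maximize u . y_o
    A_eq = np.concatenate([np.zeros(n_out), X[o]])[None]   # normalization: v . x_o = 1
    b_eq = [1.0]
    A_ub = np.hstack([Y, -X])                               # u . y_j - v . x_j <= 0 for all j
    b_ub = np.zeros(n_dmu)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return -res.fun                                         # efficiency score in (0, 1]

scores = [ccr_efficiency(o) for o in range(n_dmu)]
print(np.round(scores, 3))
```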

Keywords: Data Envelopment Analysis, Decision Trees, Ensemble Methods, Financial Variables, Financial Ratios.

Received October 12, 2019; accepted April 8, 2020

https://doi.org/10.34028/iajit/17/5/8


Ontology-Based Transformation and Verification of UML Class Model

Abdul Hafeez1, Syed Abbas2, and Aqeel-ur-Rehman3

1Department of Computer Science, SMI University, Karachi

2Faculty Engineering Science and Technology, Indus University, Karachi

3Faculty of Engineering Science and Technology, Hamdard University, Karachi

Abstract: Software models describe the structures, relationships, and features of a software system. In Model Driven Engineering (MDE) in particular, they are considered first-class elements instead of programming code, and all software development activities revolve around these models. In MDE, programming code is automatically generated from the models, and model defects can implicitly transfer to the code. These defects can be harder to discover and rectify. Model verification is a promising solution to this problem. The Unified Modelling Language (UML) class model is an important part of UML and is used in both analysis and design. However, UML only provides graphical elements without any formal foundation; therefore, verification of formal properties such as consistency, satisfiability, and consequences is not possible in UML. This paper mainly focuses on ontology-based transformation and verification of UML class model elements that have not been addressed in existing verification methods, e.g., the xor association constraint and dependency relationships. We validate the scalability and effectiveness of the proposed solution using various UML class models. The empirical study shows that the proposed approach scales in the presence of large and complex models.

Keywords: UML Class Model Verification, Dependency Relationship, XOR Association Constraints.

Received September 11, 2017; accepted January 28, 2019

https://doi.org/10.34028/iajit/17/5/9

Improved Streaming Quotient Filter: A Duplicate Detection Approach for Data Streams

Shiwei Che, Wu Yang, and Wei Wang

Information Security Research Center, Harbin Engineering University, China

Abstract: The unprecedented development and popularization of the Internet, combined with the emergence of a variety of modern applications such as search engines, online transactions, and climate warning systems, has caused worldwide data storage to grow at an unprecedented rate. Efficient storage, management, and processing of such huge amounts of data has become an important academic research topic. The detection and removal of duplicate and redundant data from such multi-trillion-record collections, while ensuring resource and computational efficiency, constitutes a challenging area of research. Because all the data of potentially unbounded data streams cannot be stored, and duplicated data needs to be deleted as accurately as possible, intelligent approximate duplicate-detection algorithms are urgently required. Many well-known methods based on the bitmap structure, the Bloom Filter, and its variants are reported in the literature. In this paper, we propose a new data structure, the Improved Streaming Quotient Filter (ISQF), to efficiently detect and remove duplicate data in a data stream. ISQF intelligently stores the signatures of elements in a data stream, while using an eviction strategy to provide near-zero error rates. We show that ISQF achieves near-optimal performance with fairly low memory requirements, making it an ideal and efficient method for duplicate detection, with a very low error rate. Empirically, we compared ISQF with some existing methods, especially the Streaming Quotient Filter (SQF). The results show that our proposed method outperforms the existing methods in terms of memory usage and accuracy. We also discuss a parallel implementation of ISQF.
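
A very rough sketch of approximate duplicate detection with per-bucket signatures and eviction, in the spirit of (but not identical to) the streaming-quotient-filter family. The bucket count, signature width, and FIFO eviction rule are illustrative assumptions, not the ISQF design.

```python
import hashlib
from collections import deque

class SignatureFilter:
    def __init__(self, num_buckets=1 << 16, bucket_capacity=4):
        # Each bucket keeps at most `bucket_capacity` short signatures (FIFO eviction)
        self.buckets = [deque(maxlen=bucket_capacity) for _ in range(num_buckets)]
        self.num_buckets = num_buckets

    def _hash(self, item):
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        bucket = value % self.num_buckets                   # quotient-like bucket index
        signature = (value // self.num_buckets) & 0xFFFF    # short remainder signature
        return bucket, signature

    def seen_before(self, item):
        """Return True if item is a (probable) duplicate, otherwise record it."""
        bucket, signature = self._hash(item)
        if signature in self.buckets[bucket]:
            return True
        self.buckets[bucket].append(signature)   # deque(maxlen=...) evicts the oldest entry
        return False

# Usage on a small stream:
f = SignatureFilter()
stream = ["a", "b", "a", "c", "b"]
print([f.seen_before(x) for x in stream])        # [False, False, True, False, True]
```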

Keywords: Bloom filters, Computer Network, Data stream, Duplicate detection, False positive rates.

Received November 30, 2017; accepted July 21, 2019

https://doi.org/10.34028/iajit/17/5/10

A Concept-based Sentiment Analysis Approach for Arabic

Ahmed Nasser1 and Hayri Sever2

1Control and Systems Engineering Department, University of Technology, Iraq

2Department of Computer Engineering, Çankaya University, Turkey

Abstract: Concept-Based Sentiment Analysis (CBSA) methods are considered more advanced and more accurate than ordinary sentiment analysis methods because they can detect the emotions conveyed by multi-word-expression concepts in language. This paper presents a CBSA system for Arabic that utilizes both machine learning approaches and a concept-based sentiment lexicon. To extract concepts from Arabic, a rule-based concept extraction algorithm called the semantic parser is proposed. Different types of feature extraction and representation techniques are explored in building the sentiment analysis model of the presented Arabic CBSA system. Comprehensive and comparative experiments using different classification methods and classifier fusion models, together with different combinations of our proposed feature sets, are used to evaluate and test the presented CBSA system. The results show that the best performance is achieved by a combined Support Vector Machine-Logistic Regression (SVM-LR) model, which obtained an F-score of 93.23% using the Concept-Based-Features+Lexicon-Based-Features+Word2vec-Features (CBF+LEX+W2V) feature combination.

Keywords: Arabic Sentiment Analysis, Concept-based Sentiment Analysis, Machine Learning and Ensemble Learning.

Received December 13, 2017; accepted July 29, 2019

https://doi.org/10.34028/iajit/17/5/11
 

An Enhanced Corpus for Arabic Newspapers Comments

Hichem Rahab1, Abdelhafid Zitouni2, and Mahieddine Djoudi3

1ICOSI Laboratory, University of Khenchela, Algeria

3TechNE Laboratory, University of Poitiers, France

Abstract: In this paper, we propose an enhanced approach to create a dedicated corpus of Algerian Arabic newspaper comments. The developed approach enhances an existing approach by enriching the available corpus and including the annotation step, following the Model Annotate Train Test Evaluate Revise (MATTER) approach. A corpus is created by collecting comments from the web sites of three well-known Algerian newspapers. Three classifiers, support vector machines, naïve Bayes, and k-nearest neighbors, were used to classify comments into positive and negative classes. To identify the influence of stemming on the obtained results, classification was tested with and without stemming. The results show that stemming does not considerably enhance classification, owing to the nature of Algerian comments, which are tied to the Algerian Arabic dialect. The promising results motivate us to improve our approach, especially in dealing with non-Arabic sentences, particularly dialectal and French ones.

Keywords: Opinion mining, sentiment analysis, K-Nearest Neighbours, Naïve Bayes, Support Vector Machines, Arabic, comment.

Received December 22, 2017; accepted June 18, 2019

https://doi.org/10.34028/iajit/17/5/12


The Performance of Penalty Methods on Tree-Seed Algorithm for Numerical Constrained Optimization Problems

Ahmet Cinar1 and Mustafa Kiran2

1Department of Computer Engineering, Selçuk University, Turkey

2Department of Computer Engineering, Konya Technical University, Turkey

Abstract: Constraints are the most important part of many optimization problems. Metaheuristic algorithms were initially designed for solving continuous unconstrained optimization problems, and constraint handling methods are integrated into these algorithms to solve constrained optimization problems. Penalty approaches are not only the simplest but also as effective as other constraint handling techniques. In the literature there are many penalty approaches, grouped as static, dynamic, and adaptive. In this study, we collect them and discuss the key benefits and drawbacks of these techniques. The Tree-Seed Algorithm (TSA) is a recently developed metaheuristic algorithm, and in this study nine different penalty approaches are integrated with TSA. The performance of these approaches is analyzed on thirteen well-known constrained benchmark functions. The obtained results are compared with state-of-the-art algorithms such as Differential Evolution (DE), Particle Swarm Optimization (PSO), Artificial Bee Colony (ABC), and the Genetic Algorithm (GA). The experimental results and comparisons show that TSA outperforms all of them on these benchmark functions.
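
A minimal sketch of a static penalty approach: constraint violation is added to the objective with a fixed coefficient R, and the penalized function is handed to a metaheuristic (SciPy's differential evolution as a stand-in, since TSA is not packaged in common libraries). The coefficient and the toy problem are assumptions.

```python
import numpy as np
from scipy.optimize import differential_evolution

R = 1e6   # static penalty coefficient (assumed)

def constrained_objective(x):
    f = x[0] ** 2 + x[1] ** 2          # objective to minimize
    g = 1.0 - x[0] - x[1]              # constraint g(x) <= 0, i.e., x1 + x2 >= 1
    violation = max(0.0, g) ** 2
    return f + R * violation           # static penalty: objective plus weighted violation

result = differential_evolution(constrained_objective, bounds=[(-5, 5), (-5, 5)], seed=0)
print(result.x, result.fun)            # expected near (0.5, 0.5) with f close to 0.5
```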

Keywords: Constrained optimization, penalty functions, penalty approaches, tree-seed algorithm.

Received January 3, 2019; accepted February 26, 2020

https://doi.org/10.34028/iajit/17/5/13


Advanced Analysis of the Integrity of Access Control Policies: the Specific Case of Databases

 Faouzi Jaidi, Faten Ayachi, and Adel Bouhoula

Digital Security Research Lab, Higher School of Communication of Tunis, University of Carthage, Tunisia

Abstract: Databases are considered one of the most compromised assets according to the 2014-2016 Verizon Data Breach Reports, because they are at the heart of Information Systems (IS) and store confidential business or private records. Ensuring the integrity of sensitive records is highly required and even vital in critical systems (e-health, clouds, e-government, big data, e-commerce, etc.). Access control is a key mechanism for ensuring integrity and preserving privacy in large-scale and critical infrastructures. Nonetheless, excessive, unused, and abused access privileges are identified as among the most critical threats in the top ten database security threats according to the 2013-2015 Imperva Application Defense Center reports. To address this issue, we focus in this paper on the analysis of the integrity of access control policies within relational databases. We propose a rigorous and complete solution to help security architects verify the correspondence between the security planning and its concrete implementation. We define a formal framework for detecting non-compliance anomalies in concrete Role Based Access Control (RBAC) policies, and we rely on an example to illustrate the relevance of our contribution.

Keywords: Access Control, Databases Security, Formal Validation, Integrity Analysis, Conformity Verification.

Received November 11, 2016; accepted July 7, 2019

https://doi.org/10.34028/iajit/17/5/14

A Sparse Topic Model for Bursty Topic Discovery in Social Networks

Lei Shi, Junping Du, and Feifei Kou

Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, China

Abstract: Bursty topic discovery aims to automatically identify bursty events and continuously keep track of known events. Existing methods focus on topic models; however, the sparsity of short text challenges traditional topic models because the words are too few to learn from the original corpus. To tackle this problem, we propose a Sparse Topic Model (STM) for bursty topic discovery. First, we distinguish the modeling of the bursty topic from that of the common topic to detect the change of words over time and discover the bursty words. Second, we introduce a “Spike and Slab” prior to decouple the sparsity and smoothness of a distribution. The bursty words are leveraged to achieve automatic discovery of the bursty topics. Finally, to evaluate the effectiveness of our proposed algorithm, we collect a Sina Weibo dataset and conduct various experiments. Both qualitative and quantitative evaluations demonstrate that the proposed STM algorithm performs favorably against several state-of-the-art methods.

Keywords: Bursty topic discovery, topic model, “Spike and Slab” prior.

Received August 15, 2017; accepted January 28, 2019

https://doi.org/10.34028/iajit/17/5/15


A Fast High Precision Skew Angle Estimation of Digitized Documents

Merouane Chettat, Djamel Gaceb, and Soumia Belhadi

Laboratory of Computer Science, Modeling, Optimization and Electronic Systems, Faculty of science, University M’hamed Bougara Boumerdes, Algeria

Abstract: In this paper, we treat the problem of automatic skew angle estimation of scanned documents. Document skew occurs very often, due to incorrect positioning of the documents or a manipulation error during scanning, and has negative consequences for the subsequent steps of automatic analysis and recognition of text. It is therefore essential to verify, before proceeding to these steps, the presence of skew in the document to be processed and to correct it. The difficulty of this verification is associated with the presence of graphic zones, sometimes dominant, that have a considerable impact on the accuracy of the text skew angle estimation. We also note the importance of preprocessing in improving the accuracy and the computational cost of skew estimation approaches. These two elements have been taken into consideration in our design and development of a new approach for skew angle estimation and correction. Our approach is based on local binarization followed by horizontal smoothing with the Run Length Smoothing Algorithm (RLSA), detection of horizontal contours, and the Hierarchical Hough Transform (HHT). The algorithms involved in our approach have been chosen to guarantee skew estimation that is accurate, fast, and robust, especially to graphic dominance, and suited to real-time application. Experimental tests show the effectiveness of our approach on a representative database from the Document Image Skew Estimation Contest (DISEC) of the International Conference on Document Analysis and Recognition (ICDAR).
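
A condensed sketch of a skew-estimation pipeline of this kind (local binarization, edge detection, Hough transform) with OpenCV; the RLSA smoothing and hierarchical Hough refinement used in the paper are omitted, and the parameter values are illustrative.

```python
import cv2
import numpy as np

def estimate_skew(gray):
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 25, 15)   # local binarization
    edges = cv2.Canny(binary, 50, 150)
    lines = cv2.HoughLines(edges, 1, np.pi / 360, threshold=200)    # (rho, theta) pairs
    if lines is None:
        return 0.0
    angles = []
    for _rho, theta in lines[:, 0]:
        deg = np.degrees(theta) - 90.0    # 0 degrees corresponds to a horizontal text line
        if abs(deg) <= 45:                # keep near-horizontal candidates only
            angles.append(deg)
    return float(np.median(angles)) if angles else 0.0

# Usage: estimate the angle, then rotate the page to deskew it
# (the sign of the rotation depends on the chosen angle convention).
# angle = estimate_skew(img)
# M = cv2.getRotationMatrix2D((img.shape[1] / 2, img.shape[0] / 2), angle, 1.0)
# deskewed = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
```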

Keywords: Skew angle estimation, document images, Hough transform, Binarization, edge detection, RLSA.

Received September 13, 2017; accepted December 24, 2018

https://doi.org/10.34028/iajit/17/5/16
 