A MapReduce-based Quick Search Approach on Large Files
Ye-feng
Li1, Jia-jin Le2, and Mei Wang2
1College of Computer Science
and Technology, Beijing University of Technology, China
2College of Computer Science and
Technology, Donghua University, China
Abstract: String search is an important branch of pattern matching for information retrieval in various fields. Over the past four decades, research has focused on skipping as many unnecessary characters as possible to improve search performance, while largely ignoring the problem of searching over large-scale data. This paper makes two main contributions. First, we propose a Quick Search algorithm for data Streams (QSS) on a single machine to support string search in a large text file, in contrast to previous work that is limited to data fitting in bounded memory. Second, we implement the search algorithm on the MapReduce framework to speed up retrieval of the search results. The experiments demonstrate that our approach is fast and effective for large files.
Keywords: String search, MapReduce, data stream, large file.
Received May 21, 2015; accepted September 24, 2017
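To make the chunked map-and-merge search idea concrete, the following minimal Python sketch scans a large file in fixed-size chunks (the map step) and merges the match offsets (the reduce step). The pattern, chunk size and demo file are illustrative assumptions; this is a generic sketch, not the authors' QSS algorithm or their MapReduce implementation.

```python
# Illustrative sketch only: a map/reduce-style substring search over a large
# file, not the authors' QSS algorithm. Pattern and chunk size are assumptions.
import os
from multiprocessing import Pool

PATTERN = b"needle"          # hypothetical search pattern
CHUNK = 64 * 1024 * 1024     # 64 MB per map task

def map_chunk(args):
    """Map task: scan one chunk of the file and emit absolute match offsets."""
    path, start = args
    overlap = len(PATTERN) - 1           # read a small overlap so matches
    with open(path, "rb") as f:          # spanning chunk borders are not lost
        f.seek(start)
        data = f.read(CHUNK + overlap)
    offsets, pos = [], data.find(PATTERN)
    while pos != -1:
        offsets.append(start + pos)
        pos = data.find(PATTERN, pos + 1)
    return offsets

def search_file(path):
    """Driver ('reduce' step): merge the offset lists from all map tasks."""
    tasks = [(path, off) for off in range(0, os.path.getsize(path), CHUNK)]
    with Pool() as pool:
        return sorted(off for part in pool.map(map_chunk, tasks) for off in part)

if __name__ == "__main__":
    with open("demo.txt", "wb") as f:    # tiny demo input
        f.write(b"x" * 100 + PATTERN + b"y" * 100 + PATTERN)
    print(search_file("demo.txt"))       # -> [100, 206]
```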
An
Efficient Line Clipping Algorithm in 2D Space
Mamatha Elliriki1,
Chandrasekhara Reddy2, and Krishna Anand3
1Department of Mathematics, GITAM
University, India
2Department of Mathematics, Cambridge
Institute of Technology-NC, India
3Department
of Computer Science, Sreenidhi Institute of Science and Technology, India
Abstract: The clipping problem seems quite simple from a human perspective, since with visualization it is easy to trace whether a line lies completely inside the window and, if not, which portion of it lies outside. From the system's point of view, however, the number of computations and comparisons for lines involving floating-point calculations is extremely large, which adds to the inherent complexity. The number of computations therefore needs to be minimized to achieve a significant gain in efficiency. In this work, a mathematical model is proposed for evaluating intersection points and clipping lines that relies mainly on integral calculations. Moreover, no further computations are found to be necessary for evaluating intersection points. The performance of the algorithm is consistently good in terms of speed for all sizes of clipping windows.
Keywords: Ortho lengths, raster
graphics system, line clipping, intersection points, geometrical slopes, rectangle
window.
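For context, the sketch below shows the classic Cohen-Sutherland clipper against a rectangular window, the kind of floating-point intersection procedure whose cost the proposed integer-oriented model aims to avoid. It is a reference baseline only, not the algorithm of this paper.

```python
# Reference baseline only (classic Cohen-Sutherland), not the paper's method.
INSIDE, LEFT, RIGHT, BOTTOM, TOP = 0, 1, 2, 4, 8

def out_code(x, y, xmin, ymin, xmax, ymax):
    """4-bit region code telling on which side(s) of the window a point lies."""
    code = INSIDE
    if x < xmin:   code |= LEFT
    elif x > xmax: code |= RIGHT
    if y < ymin:   code |= BOTTOM
    elif y > ymax: code |= TOP
    return code

def clip(x1, y1, x2, y2, xmin, ymin, xmax, ymax):
    """Return the clipped segment, or None if it lies entirely outside."""
    c1 = out_code(x1, y1, xmin, ymin, xmax, ymax)
    c2 = out_code(x2, y2, xmin, ymin, xmax, ymax)
    while True:
        if not (c1 | c2):            # both endpoints inside: trivially accept
            return x1, y1, x2, y2
        if c1 & c2:                  # both outside on the same side: reject
            return None
        c = c1 or c2                 # pick an endpoint that is outside
        if c & TOP:
            x, y = x1 + (x2 - x1) * (ymax - y1) / (y2 - y1), ymax
        elif c & BOTTOM:
            x, y = x1 + (x2 - x1) * (ymin - y1) / (y2 - y1), ymin
        elif c & RIGHT:
            x, y = xmax, y1 + (y2 - y1) * (xmax - x1) / (x2 - x1)
        else:                        # LEFT
            x, y = xmin, y1 + (y2 - y1) * (xmin - x1) / (x2 - x1)
        if c == c1:
            x1, y1 = x, y
            c1 = out_code(x1, y1, xmin, ymin, xmax, ymax)
        else:
            x2, y2 = x, y
            c2 = out_code(x2, y2, xmin, ymin, xmax, ymax)

print(clip(-5, 2, 15, 8, 0, 0, 10, 10))   # -> (0, 3.5, 10, 6.5)
```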
Parameter
Tuning of Neural Network for Financial Time Series Forecasting
Zeinab Fallahshojaei1 and
Mehdi Sadeghzadeh2
1Department
of Computer Engineering, Buin Zahra Branch, Islamic Azad University, Buin Zahra,
Iran
2Department of Computer Engineering, Mahshahr Branch, Islamic Azad
University, Mahshahr, Iran
Abstract: One of the most challenging problems in the pattern recognition domain is financial time series forecasting, which aims to accurately estimate the future variations in the value of a particular object. One of the best-known financial time series prediction methods is the Neural Network (NN), but it requires careful parameter tuning, such as the number of neurons in the hidden layer, the learning rate and the number of periods to be forecasted. To solve this problem, this paper proposes a new meta-heuristic parameter tuning scheme based on Harmony Search (HS). To improve the exploration and exploitation rates of HS, its control parameters are adapted over the generations. Evaluation of the proposed method on several financial time series datasets shows the efficiency of the improved HS in setting the NN parameters for time series prediction.
Keywords: Financial time series forecasting, parameter setting, NN, HS, parameter adaptation.
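A minimal sketch of the idea follows: a Harmony Search loop tunes two NN hyperparameters (hidden-layer size and learning rate) for one-step-ahead prediction, with the HMCR and PAR control parameters adapted linearly over the iterations as a simplified stand-in for the paper's adaptation rules. The synthetic series, parameter ranges and schedules are assumptions, not the authors' setup.

```python
# Minimal sketch, not the authors' exact scheme: Harmony Search (HS) tuning two
# neural-network hyperparameters for one-step time-series prediction.
import random
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = random.Random(0)
series = np.sin(np.linspace(0, 40, 400)) + 0.1 * np.random.RandomState(0).randn(400)
X = np.array([series[i:i + 5] for i in range(len(series) - 5)])
y = series[5:]
X_tr, X_te, y_tr, y_te = X[:300], X[300:], y[:300], y[300:]

def fitness(neurons, lr):
    """Validation MSE of an MLP with the candidate hyperparameters."""
    model = MLPRegressor(hidden_layer_sizes=(neurons,), learning_rate_init=lr,
                         max_iter=300, random_state=0)
    model.fit(X_tr, y_tr)
    return mean_squared_error(y_te, model.predict(X_te))

HM_SIZE, ITERS = 6, 20
memory = [[rng.randint(2, 30), 10 ** rng.uniform(-4, -1)] for _ in range(HM_SIZE)]
scores = [fitness(n, lr) for n, lr in memory]

for t in range(ITERS):
    hmcr = 0.7 + 0.25 * t / ITERS           # adapted memory-consideration rate
    par = 0.5 - 0.3 * t / ITERS             # adapted pitch-adjustment rate
    if rng.random() < hmcr:
        n, lr = rng.choice(memory)
        if rng.random() < par:              # pitch adjustment: local tweak
            n = min(30, max(2, n + rng.choice([-2, -1, 1, 2])))
            lr = min(0.1, max(1e-4, lr * rng.uniform(0.5, 2.0)))
    else:                                   # random new harmony
        n, lr = rng.randint(2, 30), 10 ** rng.uniform(-4, -1)
    score = fitness(n, lr)
    worst = max(range(HM_SIZE), key=scores.__getitem__)
    if score < scores[worst]:               # replace the worst harmony
        memory[worst], scores[worst] = [n, lr], score

best = min(range(HM_SIZE), key=scores.__getitem__)
print("best neurons/learning rate:", memory[best], "MSE:", scores[best])
```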
An Optimized and Efficient Radial Basis Neural Network using Cluster Validity Index for Diabetes Classification
Ramalingaswamy
Cheruku, Damodar Edla, and Venkatanareshbabu Kuppili
Department of Computer Science and Engineering, National Institute
of Technology Goa, India
Abstract: Radial Basis Function Neural Networks (RBFNNs) have been used for classification in the medical sciences, especially for diabetes classification. They are three-layer feed-forward neural networks consisting of an input layer, a hidden layer and an output layer. As the number of training patterns increases, the number of neurons in the hidden layer of an RBFNN increases, and the network complexity and classification time increase with it. Various efforts have been made to address this issue by using clustering algorithms such as k-means, k-medoids and the Self Organizing Feature Map (SOFM) to cluster the diabetes input data and thereby reduce the size of the hidden layer, but the main difficulty of determining the optimal number of hidden neurons remains unsolved. In this paper, we present an efficient method for predicting diabetes using an RBFNN with an optimal number of neurons in the hidden layer. This study focuses on determining the number of hidden neurons using cluster validity indexes and on finding the weights between the hidden layer and the output layer using a genetic algorithm. The proposed model was applied to the Pima Indian Diabetes detection problem and achieved an accuracy of 73.50%, which is better than most of the commonly known algorithms in the literature. The proposed methodology also reduced the complexity of the network by 90% in terms of the number of connections and, furthermore, reduced the classification time for new patterns.
Keywords: Radial basis function networks,
classification, medical diagnosis, diabetes, optimal number of clusters,
genetic algorithm.
Received February 13, 2016; accepted February 8, 2017
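The sketch below illustrates the general recipe with standard tools: a cluster validity index (silhouette here; the paper may use other indexes) selects the number of hidden RBF neurons, the cluster centres become the RBF centres, and the output weights are then fitted. Regularised least squares stands in for the genetic algorithm used in the paper, and the synthetic data is only a placeholder for the Pima Indian Diabetes dataset.

```python
# Illustrative sketch: hidden-layer size chosen by a cluster validity index,
# output weights by regularised least squares (the paper uses a GA instead).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.metrics import silhouette_score

X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# 1. Pick the number of hidden neurons k with the best validity index.
best_k, best_score = 2, -1.0
for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    s = silhouette_score(X, labels)
    if s > best_score:
        best_k, best_score = k, s

# 2. Use the k cluster centres as RBF centres, with a shared width.
km = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)
centres = km.cluster_centers_
sigma = np.mean(np.linalg.norm(X - centres[km.labels_], axis=1)) + 1e-9

def hidden(data):
    """Gaussian RBF activations of all samples against all centres."""
    d = np.linalg.norm(data[:, None, :] - centres[None, :, :], axis=2)
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

# 3. Output weights (a GA optimises these in the original paper).
H = hidden(X)
w = np.linalg.solve(H.T @ H + 1e-6 * np.eye(best_k), H.T @ y)
pred = (hidden(X) @ w > 0.5).astype(int)
print("hidden neurons:", best_k, "training accuracy:", (pred == y).mean())
```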
Edge Detection Optimization Using Fractional Order Calculus
Mohammed Mekideche and Youcef Ferdi
Department of Electrical Engineering, Skikda
University, Algeria
Abstract: In computer vision and image processing, time and quality are major factors to be taken into account. In the edge detection process, a smoothing operation with a low-pass filter is commonly performed first in order to reduce the effect of noise. However, the smoothing operation requires additional computational time and partially alters true edges as well. To resolve these problems, a new approach to edge detection optimization is addressed in this paper. For this purpose, a short edge detector algorithm without a smoothing operation is proposed and investigated. The algorithm is based on a fractional order mask used as a convolution kernel for edge enhancement. We show that in the proposed algorithm the smoothing pre-process is no longer necessary, because the efficiency of our fractional order mask is expressed in terms of immunity to noise and the capability of detecting edges. Simulation results show how the quality of edge detection can be improved by adjusting the fractional order parameter. Hence, our proposed edge detection method can be very useful in real-time applications in fields such as satellite and medical imaging.
Keywords: Edge detection, fractional order calculus, computational time, smoothing operation, performance evaluation.
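As an illustration, the sketch below builds a Grünwald-Letnikov style fractional-difference kernel and uses it as a convolution mask to obtain a gradient-like edge response. The kernel length, the order v and the synthetic test image are assumptions, and the exact mask construction in the paper may differ.

```python
# Hedged sketch: a Grunwald-Letnikov style fractional-difference kernel used as
# a convolution mask for edge enhancement; not the paper's exact mask.
import numpy as np
from scipy.ndimage import convolve1d

def gl_kernel(v, length=5):
    """Coefficients c_k = (-1)^k * C(v, k) of the fractional difference."""
    c = [1.0]
    for k in range(1, length):
        c.append(c[-1] * (k - 1 - v) / k)
    return np.array(c)

def fractional_edges(image, v=0.6):
    """Apply the mask along rows and columns, return a gradient-like magnitude."""
    k = gl_kernel(v)
    gx = convolve1d(image.astype(float), k, axis=1, mode="reflect")
    gy = convolve1d(image.astype(float), k, axis=0, mode="reflect")
    return np.hypot(gx, gy)

img = np.zeros((64, 64)); img[:, 32:] = 255.0   # synthetic vertical step edge
edges = fractional_edges(img, v=0.6)
print("strongest response near column:", int(np.argmax(edges.mean(axis=0))))
```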
Predicting the Winner of Delhi Assembly Election,
2015 from Sentiment Analysis on Twitter
Data-A BigData Perspective
Lija Mohan and Sudheep
Elayidom
Division of Computer Science, Cochin University of Science and
Technology, India
Abstract: Social media is currently a place where people create and share content at a massive rate. Because of its ease of use, speed and reach, it is rapidly changing public discourse in society and setting trends and agendas on topics including the environment, politics, technology and entertainment. As it is a form of collective wisdom, we decided to investigate its power to predict real-world outcomes. The objective was to design a Twitter-based sentiment mining framework. We introduce a keyword-aware, user-based collective tweet mining approach to rank the sentiment of each user. To prove the accuracy of this method, we chose an election winner prediction application and observed how the sentiments of people on different political issues at that time were reflected in their votes. A domain thesaurus is built by collecting keywords related to each issue. Because Twitter data is huge in size and difficult to process, we use a scalable and efficient MapReduce programming model-based approach to classify the tweets. The experiments were designed to predict the winner of the Delhi Assembly Elections 2015 by analyzing the sentiments of people on political issues, and from this analysis we correctly predicted that the Aam Admi Party had higher support than the ruling Bharathiya Janatha Party (BJP). Thus, a Big Data approach with widespread applications in today's world is used for sentiment analysis on Twitter data.
Keywords: Election winner prediction, big data,
sentiment analysis, tweet mining, map reduce.
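The following toy sketch shows the map/reduce shape of such a tally: each map step tags a tweet with the party it mentions and a lexicon-based sentiment score, and the reduce step aggregates scores per party. The keyword lists and example tweets are illustrative only, not the authors' domain thesaurus or dataset.

```python
# Toy sketch of the map/reduce tally, not the authors' full pipeline.
from collections import defaultdict
from functools import reduce

PARTY_KEYWORDS = {"AAP": {"aap", "kejriwal"}, "BJP": {"bjp", "modi"}}
POSITIVE = {"good", "great", "support", "win"}
NEGATIVE = {"bad", "corrupt", "fail", "against"}

def map_tweet(tweet):
    """Map step: emit (party, sentiment score) pairs for one tweet."""
    words = set(tweet.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return [(party, score) for party, keys in PARTY_KEYWORDS.items()
            if words & keys]

def reduce_pairs(acc, pairs):
    """Reduce step: accumulate scores per party."""
    for party, score in pairs:
        acc[party] += score
    return acc

tweets = ["AAP will win Delhi, great manifesto",
          "BJP campaign looks bad this time",
          "I support AAP on the water issue"]
totals = reduce(reduce_pairs, map(map_tweet, tweets), defaultdict(int))
print(dict(totals))   # -> {'AAP': 3, 'BJP': -1}
```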
A New Approach for A Domain-Independent Turkish Sentiment Seed Lexicon Compilation
Ekin Ekinci and Sevinç
Omurca
Department of Computer Engineering, Kocaeli University,
Turkey
Abstract: Sentiment analysis deals with opinions in documents and relies on sentiment lexicons; however, Turkish is one of the poorest languages in terms of having such ready-to-use sentiment lexicons. In this article, we propose a domain-independent Turkish sentiment seed lexicon, which is extended from an initial seed lexicon consisting of 62 positive/negative seeds. The lexicon is completed by using the beam search method to propagate the sentiment values of the initial seeds, exploiting the synonym and antonym relations in the Turkish Semantic Relations Dataset. Consequently, the proposed method assigned 94 words as positive sentiments and 95 words as negative sentiments. To test the correctness of the sentiment seeds and their values, the first sense, total sum and weighted sum algorithms, which are based on SentiWordNet and SenticNet 3, are used. According to the weighted sum, the experimental results indicate that the beam search algorithm is a good alternative for the automatic construction of a domain-independent sentiment seed lexicon.
Keywords: Sentiment lexicon, beam search, pattern generation, Turkish language, unsupervised framework.
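A small sketch of the propagation step is given below: seed polarities are pushed through synonym edges with the same sign and antonym edges with the opposite sign, and at each level only the most confident candidates are kept (the beam). The tiny relation graph, decay factor and beam width are illustrative assumptions, not the Turkish Semantic Relations Dataset or the paper's exact scoring.

```python
# Illustrative beam-search propagation of seed polarities over a toy graph.
import heapq

SYNONYMS = {"güzel": ["hoş"], "kötü": ["fena"], "hoş": ["keyifli"]}
ANTONYMS = {"güzel": ["kötü"], "hoş": ["nahoş"]}
SEEDS = {"güzel": 1.0, "kötü": -1.0}            # +1 positive, -1 negative
BEAM_WIDTH, DECAY = 5, 0.8

lexicon = dict(SEEDS)
frontier = [(-abs(v), w, v) for w, v in SEEDS.items()]   # ordered by |score|
heapq.heapify(frontier)

while frontier:
    # Keep only the BEAM_WIDTH most confident candidates (beam pruning).
    beam = [heapq.heappop(frontier) for _ in range(min(BEAM_WIDTH, len(frontier)))]
    frontier = []
    for _, word, score in beam:
        neighbours = ([(n, score * DECAY) for n in SYNONYMS.get(word, [])] +
                      [(n, -score * DECAY) for n in ANTONYMS.get(word, [])])
        for neighbour, value in neighbours:
            if neighbour not in lexicon:         # first (strongest) value wins
                lexicon[neighbour] = value
                heapq.heappush(frontier, (-abs(value), neighbour, value))

print(lexicon)
```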
Simulating Email Worm Propagation Based
on Social Network and User Behavior
Kexin Yin1, Wanlong Li1, Ming Hu3, and Jianqi Zhu2
1School of Computer Science and Engineering, Changchun University of Technology,
China
2School
of Computer Science and Technology, Jilin University,
China
3School of Computer Technology and Engineering,
Changchun Institute of Technology, China
Abstract: Email worms pose a significant security threat to organizations and computer users today. Because they propagate over a logical network, the traditional epidemic model is unsuitable for modeling their propagation over the Internet. There is no doubt, however, that accurately modeling the propagation of email worms helps to contain their attacks in advance. This paper presents a novel email worm propagation model based on a directed and weighted social network. Moreover, the effects of user behavior are also considered in this model. To the authors' knowledge, little information is available on the effect of user behavior in modeling worm propagation. A simulation algorithm is designed to verify the effectiveness of the presented model. The results show that the presented model can describe the propagation of email worms accurately. By simulating different containment strategies, we demonstrate that infected key nodes in the email social community can speed up worm propagation. Finally, a new General Susceptible Infectious Susceptible (G-SIS) email worm model is presented, which can predict the propagation scale of email worms accurately.
Keywords: Network security, email worm propagation, social network, user behavior, G-SIS.
Received April 2, 2016; accepted November 27, 2017
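The sketch below shows a simplified susceptible-infectious-susceptible simulation of this kind on a small directed, weighted contact graph, where edge weights model contact frequency and a per-user probability models whether the recipient opens the infected attachment. The graph and all parameters are toy assumptions, not the G-SIS model's actual equations or data.

```python
# Simplified SIS-style email worm simulation on a directed, weighted graph
# with per-user opening behaviour; parameters are illustrative only.
import random

rng = random.Random(1)
# edges[u] = [(v, contact_probability), ...]: chance that u emails v each step
edges = {0: [(1, 0.9), (2, 0.4)], 1: [(2, 0.7), (3, 0.3)],
         2: [(0, 0.5), (3, 0.8)], 3: [(1, 0.6)]}
open_prob = {0: 0.6, 1: 0.2, 2: 0.8, 3: 0.4}   # user behaviour: opens attachment
recover_prob = 0.1                             # infected host gets cleaned

infected = {0}                                 # initially infected node
for step in range(30):
    newly_infected, recovered = set(), set()
    for u in infected:
        for v, w in edges.get(u, []):
            if v not in infected and rng.random() < w * open_prob[v]:
                newly_infected.add(v)
        if rng.random() < recover_prob:
            recovered.add(u)                   # back to susceptible (SIS)
    infected = (infected | newly_infected) - recovered
    print(f"step {step:2d}: {len(infected)} infected")
```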
Performance Analysis of Microsoft Network Policy
Server and FreeRADIUS Authentication Systems
in 802.1x based Secured Wired
Ethernet using PEAP
Farrukh Chughtai1, Riaz
UlAmin1, Abdul Sattar Malik2, and Nausheen Saeed3
1Department of Computer Science, Balochistan
University of Information Technology Engineering and Management Sciences, Pakistan
2Department
of Electrical Engineering, Bahauddin
Zakariya University, Pakistan
3Department of Computer
Science, Sardar Bahadur Khan University, Pakistan
Abstract: IEEE 802.1x is an industry standard to implement
physical port level security in wired and wireless Ethernets by using RADIUS
infrastructure. Administrators of corporate networks need secure network
admission control for their environment in a way that adds minimum traffic
overhead and does not degrade the performance of the network. This research
focuses on two widely used Remote Authentication Dial In User Service (RADIUS)
servers, Microsoft Network Policy Server (NPS) and FreeRADIUS to evaluate their
efficiency and network overhead according to a set of pre-defined key
performance indicators using Protected Extensible Authentication Protocol
(PEAP) in conjunction with Microsoft Challenge Handshake Authentication Protocol version 2 (MSCHAPv2). The key performance indicators (authentication time, reconnection time and protocol overhead) were evaluated in a real test-bed configuration. The results of the experiments explain why the performance of a particular authentication system is better than the other in the given scenario.
Keywords: IEEE 802.1x, Microsoft NPS, FreeRADIUS,
PEAP, MSCHAPv2, performance analysis, RADIUS.
Tree Based Fast Similarity Query Search Indexing on Outsourced Cloud Data Streams
Balamurugan Balasubramanian1, Kamalraj Durai1, Jegadeeswari
Sathyanarayanan1, and Sugumaran Muthukumarasamy2
1Research Scholar, Computer Science, Bharathiar
University, India
2Computer Science and
Engineering, Pondicherry Engineering College, India
Abstract: A cloud may be seen as a flexible computing infrastructure comprising many nodes that support several concurrent end users. To fully harness the power of the cloud, efficient data query processing has to be ensured. This work provides extra functionality for cloud data query processing through a method called Hybrid Tree Fast Similarity Query Search (HT-FSQS). The hybrid tree structure used in HT-FSQS consists of an E-tree and an R+ tree for balancing the load and performing similarity search. In addition, we articulate performance optimization mechanisms for our method by indexing quasi data objects to improve the quality of similarity search using the R+ tree mechanism. Fast Similarity Query Search indexing builds indexes over cloud data streams to handle different types of user queries and produce results with less computational time. Fast Similarity Query Search uses an inter-intra bin pruning technique that retrieves the data most similar to the user query, and the branch-and-bound search of the E-tree/R+ tree FSQ method eliminates certain bins from consideration, speeding up the indexing operation. The experimental results demonstrate that HT-FSQS achieves significant performance gains in terms of computation time, quality of similarity search and load balance factor in comparison with non-indexing approaches.
Keywords: Cloud, hybrid tree, fast similarity query,
e-tree, r+ tree.
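To illustrate the pruning idea in isolation, the sketch below answers a nearest-neighbour style similarity query by visiting data bins in order of an optimistic lower bound and skipping bins that cannot improve on the current best. The flat binning used here is a generic stand-in, not the paper's E-tree/R+ tree structures.

```python
# Generic branch-and-bound bin pruning for a similarity query (illustrative).
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((2000, 8))

# Build bins: a simple split of the points ordered by their first coordinate.
order = np.argsort(data[:, 0])
bins = np.array_split(order, 20)
centroids = np.array([data[idx].mean(axis=0) for idx in bins])
radii = np.array([np.linalg.norm(data[idx] - c, axis=1).max()
                  for idx, c in zip(bins, centroids)])

def query(q):
    """Exact nearest neighbour using per-bin lower bounds to prune bins."""
    best_dist, best_idx = np.inf, -1
    # Optimistic lower bound: distance to centroid minus bin radius.
    lower = np.maximum(np.linalg.norm(centroids - q, axis=1) - radii, 0.0)
    for b in np.argsort(lower):
        if lower[b] >= best_dist:
            break                              # no remaining bin can improve
        d = np.linalg.norm(data[bins[b]] - q, axis=1)
        i = int(np.argmin(d))
        if d[i] < best_dist:
            best_dist, best_idx = float(d[i]), int(bins[b][i])
    return best_idx, best_dist

q = rng.random(8)
idx, dist = query(q)
assert np.isclose(dist, np.linalg.norm(data - q, axis=1).min())
print("nearest index:", idx, "distance:", round(dist, 4))
```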
A Cloud-based Architecture for Mitigating Privacy
Issues in Online Social Networks
Mustafa Kaiiali1, Auwal Iliyasu2,
Ahmad Wazan3, Adib Habbal4, and Yusuf Muhammad5
1Centre for Secure
Information Technologies, Queen's University Belfast, UK
2The Department of Computer
Engineering, Kano State Polytechnic, Nigeria
3Département Informatique,
Institut de Recherche en Informatique de Toulouse, France
4InterNetWorks Research Lab,
School of Computing, Universiti Utara Malaysia, Malaysia
5The Department of Computer Science, Saadatu Rimi College of Education,
Nigeria
Abstract: Online social media networks have revolutionized the
way information is shared across our societies and around the world.
Information is now delivered for free to a large audience within a short period
of time. Anyone can publish news and information and become a content creator
over the internet. However, along with these benefits comes the issue of privacy, which raises serious concern due to incidents of privacy breaches in Online Social Networks (OSNs). Various projects have been developed to protect users' privacy in OSNs. This paper discusses those projects and analyses their pros and cons. It then proposes a new cloud-based model to shield OSN users against unauthorized disclosure of their private data. The model supports both trusted (private) and untrusted (third-party) clouds. An efficiency analysis is provided at the end to show that the proposed model offers significant improvements over existing ones.
Keywords: Online social network, cloud computing, user’s privacy,
access control, broadcast encryption.
A Trusted Virtual Network Construction Method
Based on Data Sources Dependence
Xiaorong Cheng1 and Tianqi LI2
1Department of Computer Science, North China Electric Power University, China
2C-Epri Electric Power Engineering CO, LTD, China
Abstract: At present, an isolated, single data source cannot meet the needs of system security. Based on research in trusted computing theory, this paper puts forward a method to construct a trusted virtual network based on data source dependency. First, the credibility of each data source is calculated by the NEWACCU algorithm; then the trusted virtual network, composed of data source entities, is built dynamically by calculating the credibility between data sources, which provides technical support for future credibility assessment and further research on information security. Taking data from an e-commerce platform as an example, the experimental results verify the effectiveness of the method.
Keywords: Data source, credibility, trusted virtual network, dynamics, modeling
and simulation.
A Novel and Complete Approach for Storing RDF(S) in Relational Databases
Fu Zhang1, Qiang Tong2, and Jingwei Cheng1
1School of Computer
Science and Engineering, Northeastern University, China
2School of Software, Northeastern
University, China
Abstract: The Resource Description Framework (RDF) and RDF Schema (collectively called RDF(S)) are the normative languages for describing Web resource information. With the massive growth of RDF(S) information, how to store it effectively is becoming an important research issue. By analysing in depth the characteristics of RDF(S) data and schema semantic information, this paper proposes a multiple storage model for RDF(S) based on relational databases. An overall storage framework, detailed storage rules, a storage algorithm and a storage example are presented. The correctness of the storage approach is also discussed and proved. Based on the proposed storage approach, a prototype storage tool is implemented, and experiments show that the approach and the tool are feasible.
Keywords: RDF, RDF schema, relational database, storage.
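A minimal sketch of the general idea, storing RDF Schema information and RDF triples in relational tables with SQLite, is shown below. The paper's multiple storage model is considerably richer (separate rules, correctness proofs and tooling), so the table layout here is only an assumed simplification.

```python
# Minimal relational storage layout for RDF(S) data, illustrative only.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE classes   (class_uri TEXT PRIMARY KEY, super_class TEXT);
CREATE TABLE properties(prop_uri  TEXT PRIMARY KEY, domain TEXT, range TEXT);
CREATE TABLE triples   (subject TEXT, predicate TEXT, object TEXT);
CREATE INDEX idx_spo ON triples(subject, predicate, object);
""")

# RDF Schema part: class hierarchy and property signatures.
con.execute("INSERT INTO classes VALUES (?, ?)", ("ex:Student", "ex:Person"))
con.execute("INSERT INTO properties VALUES (?, ?, ?)",
            ("ex:enrolledIn", "ex:Student", "ex:Course"))
# RDF instance data as plain triples.
con.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "ex:enrolledIn", "ex:cs101"),
])

# Example query: everything known about ex:alice.
for row in con.execute("SELECT predicate, object FROM triples WHERE subject = ?",
                       ("ex:alice",)):
    print(row)
```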
UTP: A Novel PIN Number Based User Authentication Scheme
Srinivasan
Rajarajan and Ponnada Priyadarsini
School of Computing, SASTRA
Deemed University, India
Abstract: This paper proposes a Personal Identification Number (PIN) based authentication scheme named User Transformed PIN (UTP). It introduces a simple cognitive process with which users can transform their PIN into a dynamic one-time number. PINs are widely used for user authentication. They are entered directly and reused many times, which makes them vulnerable to many types of attacks. To overcome these drawbacks, One-Time Passwords (OTPs) are combined with PINs to form stronger two-factor authentication. Although OTPs are relatively difficult to attack, they are not foolproof. In our proposed work, we have devised a new scheme that withstands many of the common attacks on PINs and OTPs. In our scheme, users generate the UTP with the help of a visual pattern, a random alphabet sequence and a PIN. Because the UTP varies for each transaction, it acts like an OTP. Our scheme conceals the PIN within the UTP so that no direct entry of the PIN is required. The PIN can be retrieved from the UTP by the authenticator module at the server. To the best of our knowledge, this is the first scheme that enables users to transform their PIN into a one-time number without any special device or tool. Our scheme is inherently multi-factor, combining a knowledge factor and a possession factor within itself. The user studies we conducted on the prototype provided encouraging results supporting the scheme's security and usability.
Keywords: Personal identification
number, shoulder surfing, keylogging, user authentication, otp, internet
banking.
Detecting Sentences Types in the Standard Arabic
Language
Ramzi Halimouche and Hocine Teffahi
Laboratory of Spoken Communication and Signal
Processing, Electronics and Computer Science Faculty, University of Sciences
and Technology Houari Boumediene, Algeria
Abstract: The standard Arabic language, like many other languages, contains prosodic features hidden in the speech signal. Studies in this field are still at a preliminary stage, which restrains the performance of communication tools. Prosodic study allows people to have all the communication tools they need in their native language. In this paper, we therefore propose a prosodic study of the various types of sentences in the standard Arabic language. The sentences are recognized according to three modalities: declarative, interrogative and exclamatory. The results of this study will be used to synthesize the different types of pronunciation, which can be exploited in several domains, notably man-machine communication. To this end, we developed a specific dataset consisting of the three types of sentences. We then tested two sets of features, prosodic features (fundamental frequency, energy and duration) and spectral features (Mel-Frequency Cepstral Coefficients and Linear Predictive Coding), as well as their combination. We adopted the Multi-Class Support Vector Machine (MC-SVM) as the classifier. The experimental results are very encouraging.
Keywords: Standard Arabic language, sentence type detection, fundamental frequency, energy, duration, mel-frequency cepstral coefficients, linear predictive coding.
Received January 19, 2017; accepted August 23, 2017
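The sketch below outlines such a pipeline with common open-source tools: fundamental frequency, energy, duration and MFCC statistics are extracted per utterance and fed to a multi-class SVM. The audio is synthetic and the feature set is a simplification, so this only illustrates the shape of the experiments, not the authors' dataset or exact configuration.

```python
# Hedged sketch of the feature pipeline (F0, energy, duration, MFCC -> SVM).
import numpy as np
import librosa
from sklearn.svm import SVC

def utterance_features(signal, sr=16000):
    """Summary statistics of prosodic and spectral features for one utterance."""
    f0, _, _ = librosa.pyin(signal, fmin=60, fmax=400, sr=sr)
    f0 = f0[~np.isnan(f0)] if np.any(~np.isnan(f0)) else np.array([0.0])
    energy = librosa.feature.rms(y=signal)[0]
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
    return np.concatenate([[f0.mean(), f0.std(), energy.mean(), energy.std(),
                            len(signal) / sr],           # duration in seconds
                           mfcc.mean(axis=1)])

def synth(kind, sr=16000):
    """Toy stand-ins for the three sentence types (rising F0 for questions)."""
    t = np.linspace(0, 1.0, sr)
    base = {"declarative": 120, "interrogative": 180, "exclamatory": 150}[kind]
    f0 = base + (40 * t if kind == "interrogative" else -20 * t)
    return np.sin(2 * np.pi * np.cumsum(f0) / sr).astype(np.float32)

labels = ["declarative", "interrogative", "exclamatory"] * 10
X = np.array([utterance_features(synth(k) + 0.01 * np.random.randn(16000))
              for k in labels])
clf = SVC(kernel="rbf", decision_function_shape="ovo").fit(X, labels)
print(clf.predict(X[:3]))
```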
Data Deduplication for
Efficient Cloud Storage and Retrieval
Rishikesh
Misal and Boominathan Perumal
School of
Computer Engineering, Vellore Institute of Technology University, India
Abstract: Cloud services provide flawless service to clients by increasing the geographic availability of data. Increasing the availability of data, however, induces a high amount of redundancy and a large amount of space required to store that data. Data compression techniques can reduce the space required to store the data at the various sites without any loss of availability or consistency. As the demand for cloud services and storage grows, so does the amount of investment required. By using data compression, we can reduce this investment and also decrease the physical space and number of data centers required to store the data. Various security protocols can be incorporated to secure the compressed files at the different sites. We provide a reliable technique for storing and managing deduplicated data in a secure manner, achieving high consistency as well as availability.
Keywords: Data deduplication, cloud computing, storage, file
system, distributed system.
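As a concrete illustration of deduplicated storage, the sketch below splits files into fixed-size blocks, stores each distinct block once under its SHA-256 digest, and represents a file as an ordered list of digests. The block size and in-memory store are assumptions, not the paper's system.

```python
# Illustrative block-level deduplication with content hashing.
import hashlib

BLOCK = 4096
store = {}                      # digest -> block bytes (deduplicated pool)

def put_file(data: bytes):
    """Store a file; return its recipe (ordered list of block digests)."""
    recipe = []
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)      # identical blocks stored only once
        recipe.append(digest)
    return recipe

def get_file(recipe):
    """Reassemble a file from its recipe."""
    return b"".join(store[d] for d in recipe)

a = b"A" * 10000 + b"B" * 10000
b_ = b"A" * 10000 + b"C" * 10000             # shares its first half with `a`
ra, rb = put_file(a), put_file(b_)
assert get_file(ra) == a and get_file(rb) == b_
print("unique blocks stored:", len(store), "of", len(ra) + len(rb), "referenced")
```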
Self-Adaptive PSO Memetic Algorithm For
Multi Objective Workflow Scheduling in Hybrid Cloud
Padmaveni Krishnan and John Aravindhar
Department of Computer
Science and Engineering, Hindustan Institute of Technology and Science, India
Abstract: Cloud computing is a distributed computing technology that facilitates a pay-per-use model for solving large-scale problems. The main aim of cloud computing is to give optimal access to the distributed resources. Task scheduling in the cloud is the allocation of the best resource to each demand, considering parameters such as time, makespan, cost and throughput. The available workflow scheduling algorithms cannot be applied directly in the cloud, since they fail to account for the elasticity and heterogeneity of cloud resources. In this paper, the cloud workflow scheduling problem is modeled with makespan, cost, percentage of private cloud utilization and deadline violation as the four main objectives. A hybrid approach of Particle Swarm Optimization (PSO) and a Memetic Algorithm (MA), called the Self-Adaptive Particle Swarm Memetic Algorithm (SPMA), is proposed. SPMA can be used by cloud providers to maximize user quality of service and the profit of resources using an entropy optimization model. The heuristic is tested on several workflows, and the results obtained show that SPMA performs better than other state-of-the-art algorithms.
Keywords: Cloud
computing, memetic algorithm, particle swarm optimization, self-adaptive
particle swarm memetic algorithm.
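The sketch below shows the general shape of such a hybrid: a PSO loop searches over task-to-VM assignments and a greedy local search occasionally refines the global best (the memetic step). The fitness mixes only makespan and cost, whereas SPMA also handles deadlines and private-cloud utilization, and every number here is a toy assumption.

```python
# Toy PSO + local-search (memetic) scheduler sketch, not the SPMA algorithm.
import numpy as np

rng = np.random.default_rng(0)
N_TASKS, N_VMS = 20, 4
task_len = rng.uniform(1, 10, N_TASKS)          # task lengths (arbitrary units)
vm_speed = np.array([1.0, 1.5, 2.0, 0.8])       # VM processing speeds
vm_cost = np.array([0.10, 0.20, 0.40, 0.05])    # VM cost per unit of busy time

def decode(pos):
    """Continuous particle position -> integer VM index per task."""
    return np.clip(pos, 0, N_VMS - 1e-9).astype(int)

def fitness(assign):
    loads = np.array([task_len[assign == v].sum() / vm_speed[v] for v in range(N_VMS)])
    return 0.7 * loads.max() + 0.3 * (loads * vm_cost).sum()   # makespan + cost

def local_search(assign):
    """Memetic step: greedily try moving each task to every other VM."""
    best = assign.copy()
    for t in range(N_TASKS):
        for v in range(N_VMS):
            cand = best.copy(); cand[t] = v
            if fitness(cand) < fitness(best):
                best = cand
    return best

SWARM, ITERS, W, C1, C2 = 15, 40, 0.7, 1.5, 1.5
pos = rng.uniform(0, N_VMS, (SWARM, N_TASKS))
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(decode(p)) for p in pos])
gbest = pbest[pbest_fit.argmin()].copy()

for it in range(ITERS):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = W * vel + C1 * r1 * (pbest - pos) + C2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0, N_VMS - 1e-9)
    for i in range(SWARM):
        f = fitness(decode(pos[i]))
        if f < pbest_fit[i]:
            pbest[i], pbest_fit[i] = pos[i].copy(), f
    gbest = pbest[pbest_fit.argmin()].copy()
    if it % 10 == 0:                             # occasional memetic refinement
        refined = local_search(decode(gbest))
        if fitness(refined) < fitness(decode(gbest)):
            gbest = refined.astype(float) + 0.5  # re-encode at cell midpoints

print("best objective:", round(float(fitness(decode(gbest))), 3))
```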
A Novel Adaptive Two-phase Multimodal
Biometric Recognition System
Venkatramaphanikumar Sistla1,
Venkata Krishna Kishore Kolli1, and Kamakshi Prasad Valurouthu2
1Department of Computer Science and Engineering, Vignan’s
Foundation for Science, Technology and Research, India
2Department
of Computer Science and Engineering, Jawaharlal Nehru Technological University
Hyderabad College of Engineering, India
Abstract: Multimodal biometric recognition systems are intended to offer authentication without compromising security or accuracy, and they are also used to address the limitations of unimodal systems such as spoofing, intra-class variations, noise and non-universality. In this paper, a novel adaptive two-phase multimodal framework is proposed using face, fingerprint and speech traits. In this work, the face trait reduces the search space by retrieving the few enrolled candidates nearest to the probe using Gabor wavelets, semi-supervised kernel discriminant analysis and two-dimensional dynamic time warping. This nonlinear face classification serves as a search space reducer and affects the True Acceptance Rate (TAR). Then, level-1 and level-2 features of the fingerprint trait are fused with Dempster-Shafer theory, achieving a high TAR. In the second phase, to reduce the False Acceptance Rate (FAR) and to validate the user identity, a text-dependent speaker verification with an RBFNN classifier is proposed. The classification accuracy of the proposed method is evaluated on our own and standard datasets, and the experimental results clearly show that the proposed technique outperforms existing techniques in terms of search time, space and accuracy.
Keywords: Gabor filters, radial basis function, discrete wavelet transform, dynamic time warping, kernel discriminant analysis.
Received April 18, 2017; accepted June 13, 2017
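The sketch below illustrates the Dempster-Shafer fusion step for two matcher scores over the frame {genuine, impostor}; how a raw similarity score is mapped to basic mass assignments is a modelling assumption, not the paper's exact procedure.

```python
# Dempster-Shafer fusion of two matcher scores (illustrative mass assignment).
def score_to_mass(score, uncertainty=0.2):
    """Turn a similarity score in [0, 1] into a basic mass assignment."""
    return {"genuine": (1 - uncertainty) * score,
            "impostor": (1 - uncertainty) * (1 - score),
            "theta": uncertainty}                 # ignorance: {genuine, impostor}

def combine(m1, m2):
    """Dempster's rule of combination for the frame {genuine, impostor}."""
    conflict = m1["genuine"] * m2["impostor"] + m1["impostor"] * m2["genuine"]
    k = 1.0 - conflict
    g = (m1["genuine"] * m2["genuine"] + m1["genuine"] * m2["theta"]
         + m1["theta"] * m2["genuine"]) / k
    i = (m1["impostor"] * m2["impostor"] + m1["impostor"] * m2["theta"]
         + m1["theta"] * m2["impostor"]) / k
    return {"genuine": g, "impostor": i, "theta": m1["theta"] * m2["theta"] / k}

level1, level2 = score_to_mass(0.82), score_to_mass(0.67)
fused = combine(level1, level2)
decision = "accept" if fused["genuine"] > fused["impostor"] else "reject"
print(fused, "->", decision)
```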
EncCD: A Framework for Efficient Detection of Code Clones
Minhaj Khan
Department of Computer Science, Bahauddin Zakariya
University, Pakistan
Abstract: Code clones are similar snippets of code written for an application. The detection of code clones is essential for software maintenance, as modifying multiple snippets that share the same bug becomes cumbersome in a large software system. Clone detection techniques perform conventional parsing before the final match detection, and an inefficient parsing mechanism deteriorates the performance of the overall clone detection process. In this paper, we propose a framework called Encoded Clone Detector (EncCD), which is based on encoded pipeline processing for efficiently detecting clones. The proposed framework uses efficient labelled encoding followed by tokenization and match detection. Experiments performed on Intel Core i7 and Intel Xeon processor based systems show that the proposed EncCD framework outperforms the widely used JCCD and CCFinder frameworks, producing a significant performance improvement.
Keywords: Clone detection, software engineering, software maintenance, optimization, speedup.
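To illustrate token-level clone detection, the sketch below replaces identifiers and literals with placeholder labels (a simple stand-in for EncCD's encoding step), hashes fixed-length token windows and reports snippets that share fingerprints. The window length and normalization rules are assumptions.

```python
# Illustrative token-based clone detection; not the EncCD pipeline itself.
import io
import tokenize
from collections import defaultdict

WINDOW = 8

def encode(code):
    """Normalize a snippet: identifiers -> ID, literals -> LIT, keep operators."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type == tokenize.NAME:
            out.append("ID")
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append("LIT")
        elif tok.type == tokenize.OP:
            out.append(tok.string)
    return out

def fingerprints(code):
    """Hashes of all fixed-length windows over the normalized token stream."""
    toks = encode(code)
    return {hash(tuple(toks[i:i + WINDOW])) for i in range(len(toks) - WINDOW + 1)}

snippets = {
    "a": "def area(w, h):\n    return w * h\n",
    "b": "def size(x, y):\n    return x * y\n",          # rename-only clone of "a"
    "c": "def greet(name):\n    print('hi', name)\n",
}
index = defaultdict(set)
for name, code in snippets.items():
    for fp in fingerprints(code):
        index[fp].add(name)

pairs = {tuple(sorted(v)) for v in index.values() if len(v) > 1}
print("clone pairs:", pairs)     # expected: {('a', 'b')}
```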
Sentiment Analysis with Term Weighting and Word Vectors
Metin Bilgin1 and Haldun Köktaş2
1Department of
Computer Engineering, Bursa Uludağ University, Turkey
2Department of Mechatronic Engineering, Bursa Technical
University, Turkey
Abstract: Sentiment analysis, which attempts to predict the sentiment expressed in texts, is an area in which Natural Language Processing (NLP) studies have been used frequently in recent years. In this study, sentiment extraction is performed on Turkish texts and the performance of different text representation methods is compared. Besides the Bag of Words (BoW) method traditionally used to represent texts, Word2Vec, a word vector algorithm developed in recent years, and Doc2Vec, a document vector algorithm, are used. Five different Machine Learning (ML) algorithms are applied to classify texts represented in five different ways, using 3000 labeled tweets belonging to a telecom company. In conclusion, Word2Vec among the text representation methods and Random Forest among the ML algorithms were found to be the most successful and most applicable. This work is important as the first study to compare BoW and word vectors for sentiment analysis of Turkish texts.
Keywords: Word2vec, Doc2vec, sentiment
analysis, machine learning, natural language processing.
Received February 16, 2018; accepted July 22, 2018
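The sketch below contrasts the two representations compared in the study, Bag-of-Words and averaged Word2Vec vectors, each feeding a Random Forest classifier. The handful of example sentences only illustrates the pipeline and is not the 3000-tweet Turkish dataset.

```python
# Toy comparison of BoW vs averaged Word2Vec features with a Random Forest.
import numpy as np
from gensim.models import Word2Vec
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

texts = ["internet very fast great service", "connection keeps dropping terrible",
         "support solved my issue quickly", "billing error again very bad",
         "good coverage in the city", "slow speeds and bad support"]
labels = [1, 0, 1, 0, 1, 0]                       # 1 = positive, 0 = negative

# Bag-of-Words representation.
bow = CountVectorizer().fit_transform(texts).toarray()

# Averaged Word2Vec representation (one vector per text).
tokens = [t.split() for t in texts]
w2v = Word2Vec(tokens, vector_size=32, min_count=1, window=3, seed=1, epochs=200)
avg = np.array([np.mean([w2v.wv[w] for w in sent], axis=0) for sent in tokens])

for name, X in [("BoW", bow), ("Word2Vec", avg)]:
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
    print(name, "training accuracy:", clf.score(X, labels))
```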