Sentiment Analysis with Term Weighting and Word Vectors
Metin Bilgin1 and Haldun Köktaş2
1Department of Computer Engineering, Bursa Uludağ University, Turkey
2Department of Mechatronic Engineering, Bursa Technical University, Turkey
Abstract: Sentiment analysis, which aims to predict the sentiment expressed in a text, is an area in which Natural Language Processing (NLP) has been applied frequently in recent years. In this study, sentiment was extracted from Turkish texts and the performance of several text representation methods was compared. In addition to the Bag of Words (BoW) method, which is traditionally used to represent texts, the study used Word2Vec, a word vector algorithm developed in recent years, and Doc2Vec, a document vector algorithm. Five different Machine Learning (ML) algorithms were used to classify texts represented in five different ways, on a set of 3000 labeled tweets about a telecom company. The results showed that Word2Vec among the text representation methods and Random Forest among the ML algorithms were the most successful and the most applicable. The study is significant as the first to compare BoW and word vectors for sentiment analysis of Turkish texts.
Keywords: Word2Vec, Doc2Vec, sentiment analysis, machine learning, natural language processing.
Received February 16, 2018; accepted July 22, 2018
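
The two families of text representation compared in the abstract can be illustrated with a minimal sketch in plain Python. It builds a term-weighted BoW vector and a dense document vector obtained by averaging word vectors. Several things here are assumptions made for illustration only: TF-IDF stands in as one common term weighting scheme (the abstract does not state which weighting was applied), averaging is one common way to turn word vectors into a document vector, and the documents and two-dimensional embeddings are made-up toy data, not the tweet corpus or trained Word2Vec vectors from the study.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (assumption); real Turkish
    # tweets would need proper normalization and tokenization.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, list of TF-IDF weighted BoW vectors)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        # Raw term count times a smoothed IDF: log((1 + N) / (1 + df)) + 1.
        vectors.append(
            [tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t in vocab]
        )
    return vocab, vectors


def mean_word_vector(text, embeddings, dim):
    """Average the vectors of known words; zero vector if none are known."""
    known = [embeddings[t] for t in tokenize(text) if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Toy data, hypothetical and for illustration only.
docs = ["good service", "bad service", "good good signal"]
toy_embeddings = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "service": [0.0, 1.0],
}

vocab, bow = tfidf_vectors(docs)  # sparse, vocabulary-sized vectors
dense = [mean_word_vector(d, toy_embeddings, 2) for d in docs]  # fixed-size vectors
```

Either representation yields one numeric vector per text, which is the form a classifier such as Random Forest expects. The BoW vector grows with the vocabulary, while the averaged word vector keeps a fixed dimension regardless of vocabulary size.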