A Sentiment Analysis System for the Hindi Language by Integrating Gated Recurrent Unit with Genetic Algorithm
Kush Shrivastava and Shishir Kumar
Department of Computer Science Engineering, Jaypee
University of Engineering and Technology, India
Abstract: The growing availability and popularity of opinion-rich resources such
as blogs, shopping websites, review portals, and social media platforms have
attracted several researchers to the sentiment analysis task. Alongside
English, Chinese, and Spanish, web content in Indian languages such as Hindi,
Telugu, and Tamil has also increased at a rapid rate. Recognizing the growing
popularity of Hindi on the web, this research work considers it for the task
of sentiment analysis. It analyses the hidden sentiments in movie reviews
collected from the review sections of Hindi language e-newspapers. The reviews
are multilingual, which makes sentiment analysis a challenging task. To
overcome the challenges, this research work
proposes a deep learning-based approach in which a Gated Recurrent Unit (GRU)
network is combined with a Hindi word embedding model. This strategy enables
the network to efficiently capture the semantic and syntactic relations
between Hindi words and accurately classify the reviews into sentiment classes.
Because a GRU network's performance depends heavily on the choice of its
hyper-parameters, this research work also uses a Genetic Algorithm (GA) to
automatically build the GRU network architecture by selecting optimal
hyper-parameters. The proposed Genetic Algorithm-Gated Recurrent Unit (GA-GRU)
model is observed to be effective, outperforming traditional resource-based
and machine learning approaches on the Hindi movie review dataset.
Keywords: Sentiment analysis, Hindi language,
multilingual, deep learning, gated recurrent unit, genetic algorithm.
Received September 28, 2019; accepted May 9, 2020
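
The GA-driven hyper-parameter selection summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The search space (`SEARCH_SPACE`), the operators (tournament selection, uniform crossover, random-reset mutation, elitism), and all numeric settings are assumptions chosen for illustration. The function `placeholder_fitness` is a hypothetical stand-in for the validation accuracy of a GRU trained with a given configuration; in practice it would be replaced by an actual train-and-evaluate routine.

```python
import random

# Hypothetical search space: the abstract does not list the actual
# hyper-parameters or their ranges, so these are illustrative assumptions.
SEARCH_SPACE = {
    "hidden_units": [32, 64, 128, 256],
    "dropout": [0.1, 0.2, 0.3, 0.5],
    "learning_rate": [0.0001, 0.001, 0.01],
    "batch_size": [16, 32, 64],
}


def random_individual(rng):
    """Draw one candidate configuration uniformly from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def placeholder_fitness(individual):
    """Stand-in for the validation accuracy of a GRU trained with these
    settings. Assumption: it simply rewards agreement with an arbitrary
    target configuration and returns a score in [0, 1]."""
    target = {"hidden_units": 128, "dropout": 0.3,
              "learning_rate": 0.001, "batch_size": 32}
    return sum(individual[k] == v for k, v in target.items()) / len(target)


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {name: rng.choice([parent_a[name], parent_b[name]])
            for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """With probability `rate`, resample a gene from its allowed values."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def tournament(population, fitness, rng, k=3):
    """Pick the fittest of k randomly sampled individuals."""
    return max(rng.sample(population, k), key=fitness)


def genetic_search(fitness, population_size=12, generations=15, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Elitism: carry the current best forward unchanged, so the best
        # fitness never decreases from one generation to the next.
        children = [max(population, key=fitness)]
        while len(children) < population_size:
            parent_a = tournament(population, fitness, rng)
            parent_b = tournament(population, fitness, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    return max(population, key=fitness)


best = genetic_search(placeholder_fitness)
print(best, placeholder_fitness(best))
```

Because each fitness evaluation would normally require training a network, a real run would likely cache scores per configuration and keep the population and generation counts small.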