ML Based Social Media Data Emotion Analyzer and Sentiment Classifier with Enriched Preprocessor

Document Type: Special Issue: Big Data Analytics and Management in Internet of Things


1 Research Scholar, Computer Science Engineering, Bharath University, Chennai, India.

2 Provost, Bharath University, Chennai, India.


Sentiment analysis, or opinion mining, is a natural language processing (NLP) method for computationally identifying and categorizing user opinions expressed in text. It is mainly used to determine whether a user's opinions, emotions, appraisals, or judgments toward a specific event, topic, or product are positive, negative, or neutral. In this approach, large volumes of digital data generated online on blogs and social media websites are gathered and analyzed to discover insights and support business decisions. Social media are web-based applications designed to let people share digital content quickly and efficiently in real time. Many people think of social media as apps on their smartphone or tablet, but this communication tool started on computers. It has become an essential and inseparable part of human life. Most businesses use social media to market products, promote brands, connect with current customers, and foster new business. Online social media data is pervasive. Social media allows people to post their opinions and sentiments about products, events, and other people in the form of short text messages. For example, Twitter is an online social networking service where users post and interact with short messages called "tweets." Hence, social media has become a promising source from which businesses can discover people's sentiments and opinions about a particular event or product. This paper focuses on the development of a Multinomial Naïve Bayes-based social media data emotion analyzer and sentiment classifier. It also explains the enriched methods used in pre-processing, and covers various machine learning techniques, the steps for using the text classifier, and different types of language models.
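To make the approach concrete, the following is a minimal sketch of a tweet pre-processor feeding a Multinomial Naïve Bayes classifier with Laplace smoothing. It is an illustration under stated assumptions, not the implementation described in this paper: the stop-word list, the cleaning rules (removing URLs and @mentions, dropping the '#' sign), the class and function names, and the four training sentences are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list for illustration; the paper's list is not given.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "to", "and", "of", "i"}


def preprocess(text):
    """Lowercase, strip URLs, @mentions and the '#' sign, then tokenize."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop '#'
    tokens = re.findall(r"[a-z']+", text)      # keep alphabetic tokens only
    return [t for t in tokens if t not in STOP_WORDS]


class MultinomialNB:
    """Multinomial Naive Bayes over word counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = preprocess(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in preprocess(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:                       # log likelihoods
                count = self.word_counts[label][t]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for this sketch.
train_texts = [
    "I love this phone, the camera is great!",
    "Great service and a happy customer :) #satisfied",
    "This update is terrible, I hate it",
    "Worst delivery ever, very bad experience @shop",
]
train_labels = ["positive", "positive", "negative", "negative"]

clf = MultinomialNB().fit(train_texts, train_labels)
print(clf.predict("love the great camera"))
print(clf.predict("terrible and bad experience"))
```

The classifier sums log probabilities rather than multiplying raw probabilities, which avoids numerical underflow on longer messages. A neutral class or emotion labels could be handled the same way by adding training examples with those labels.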

