Presenting a Text Mining Algorithm to Identify Emotion in Persian Corpus

Document Type: Research Paper


1 Research Instructor, Faculty of Iran Telecommunication Research Center, Tehran, Iran

2 MSc, Software Engineering, Islamic Azad University, Karaj Branch, Tehran, Iran

3 Associate Prof. of Management, Islamic Azad University, Central Tehran Branch, Tehran, Iran


The literature on Persian text mining indicates that most studies aim to detect the polarity of opinions on social websites. The aim of this research is to present an algorithm that identifies the emotion expressed in a text in terms of six main emotions: happiness, sadness, fear, anger, surprise and disgust. Emotion is examined using an unsupervised lexicon-based method. Identifying the emotion conveyed by a text from a single emotional word yields low accuracy, because intervening boosters and negating words can also influence the emotion of the text. Therefore, the algorithm has been implemented in six approaches with different features. In the first approach, the algorithm detects only a single emotional word in a sentence; later approaches extend it to handle boosters, negating words and a stop-word list as well. The results of running the algorithm on data from two domains showed that the more features the algorithm uses, the more accurate it becomes, and that the most effective factor is part of speech.
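The lexicon-based idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries, booster weights, negation rule (a negated emotional word is simply skipped) and the function name `detect_emotion` are all assumptions made for illustration. English placeholder words stand in for a Persian lexicon, and part-of-speech information is left out.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> one of the six emotions named in the abstract.
EMOTION_LEXICON = {
    "happy": "happiness", "joy": "happiness",
    "sad": "sadness", "afraid": "fear", "angry": "anger",
    "surprised": "surprise", "disgusting": "disgust",
}
BOOSTERS = {"very": 2.0, "extremely": 3.0}  # assumed weights, not from the paper
NEGATORS = {"not", "never", "no"}           # assumed negating words
STOP_WORDS = {"i", "am", "is", "the", "a", "this", "and", "but"}


def detect_emotion(sentence):
    """Return the highest-scoring emotion in the sentence, or None."""
    # Naive whitespace tokenization with punctuation stripped.
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    # Drop stop words so boosters and negators sit next to the word they modify.
    tokens = [t for t in tokens if t and t not in STOP_WORDS]

    scores = Counter()
    weight, negated = 1.0, False
    for tok in tokens:
        if tok in NEGATORS:
            negated = True
        elif tok in BOOSTERS:
            weight *= BOOSTERS[tok]
        elif tok in EMOTION_LEXICON:
            # Assumption: a negated emotional word contributes nothing.
            if not negated:
                scores[EMOTION_LEXICON[tok]] += weight
            weight, negated = 1.0, False
        else:
            # Any other word ends the scope of a pending booster or negator.
            weight, negated = 1.0, False

    if not scores:
        return None
    return max(scores, key=scores.get)


print(detect_emotion("I am sad and extremely angry"))
```

In this sketch the booster makes "angry" outweigh "sad", while a sentence such as "I am not happy" returns no emotion at all. A fuller version might instead map a negated word to an opposing emotion, which is a design choice the abstract does not specify.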

