Opinion Mining in Persian Language

Document Type: Research Paper

Authors

1 MSc. student in Information Technology, Faculty of Industrial Engineering, K.N.Toosi University of Technology, Iran

2 Prof., Faculty of Industrial Engineering, K.N.Toosi University of Technology, Iran

Abstract

The rapid growth of networks and social networks has made people's opinions more accessible. These opinions contain useful information: by analyzing them, people's preferences and their positive and negative views on different subjects can be identified. Opinion mining is the process of analyzing people's emotions, feelings, and opinions to identify their preferences. This article introduces a method for opinion mining in the Persian language that combines a support vector machine (SVM) classifier with a sentiment lexicon used as the feature set. The lexicon was created using SentiWordNet. To assess the method, data from the hotel domain were collected. Four cases were defined, and the case in which each word's frequency is multiplied by its orientation gave the best result. The proposed method performs better than other methods in Persian opinion mining.
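The best-performing case described in the abstract, in which a word's frequency is multiplied by its orientation, can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon entries, their orientation scores, the English placeholder tokens, and the function name `weighted_features` are all assumptions made for illustration. A real lexicon would hold Persian words with scores derived from SentiWordNet.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): word -> orientation score in [-1, 1].
# English placeholder words stand in for Persian lexicon entries.
TOY_LEXICON = {
    "clean": 0.75,
    "friendly": 0.5,
    "dirty": -0.75,
    "noisy": -0.5,
}


def weighted_features(tokens, lexicon):
    """Build one feature per lexicon word: frequency in the review * orientation."""
    counts = Counter(tokens)
    # Sort the lexicon words so every review maps to the same feature order.
    return [counts[word] * score for word, score in sorted(lexicon.items())]


# A made-up, already tokenized hotel review (assumption).
review = "clean room friendly staff but noisy noisy street".split()
features = weighted_features(review, TOY_LEXICON)
print(features)
```

Each review becomes a fixed-length vector with one entry per lexicon word, in alphabetical order of the words. Vectors of this form, paired with positive or negative labels, are what an SVM classifier would be trained on.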

