Opinion Mining on Viet Thanh Nguyen’s The Sympathizer Using Topic Modelling and Sentiment Analysis

Document Type: Research Paper


1 School of Computer Sciences, Universiti Sains Malaysia, 11800 Minden, Penang, Malaysia.

2 School of Humanities, Universiti Sains Malaysia, 11800 Minden, Penang, Malaysia.


In attempts to examine the mapped spaces of a literary narrative, various quantitative approaches have been deployed to extract data from texts into graphs, maps, and trees. Though the existing methods offer invaluable insights, they undertake a rather different project from that of literary scholars who seek to examine privileged or unprivileged representations of certain spaces. This study proposes a computerized method to examine how matters of space and spatiality are addressed in literary writings. As its primary source of data, the study focuses on Viet Thanh Nguyen’s The Sympathizer (2015), which explores the lives of the Vietnamese diaspora in two geographical locations, Vietnam and America. To examine the portrayed spatial relations, that is, which country is privileged over the other, and to uncover the underlying opinion about the two places, the study performs topic modelling with Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA), and sentiment analysis with TextBlob. Python is used as the analytical tool because it gives access to two LDA implementations, Gensim and Mallet. To overcome the limitation that the model’s performance relies on the libraries available in Python, the study employs a machine learning approach. The results indicate that both geographical spaces are portrayed slightly positively; however, America achieves a higher polarity score than Vietnam and hence seems to be the favored space in the novel. The proposed method can assist literary scholars in analyzing spatial relations more accurately across large volumes of works.
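The comparison of polarity scores between the two places can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the toy lexicon, the place keyword lists, the sample sentences, and the names `toy_polarity` and `polarity_by_place` are all hypothetical and exist only to keep the example self-contained. The idea is to group sentences by the place they mention and average a per-sentence polarity score for each group.

```python
import re
from statistics import mean

# Hypothetical toy lexicon, for illustration only (not TextBlob's lexicon).
TOY_LEXICON = {
    "beautiful": 0.8, "free": 0.5, "safe": 0.5,
    "lonely": -0.5, "war": -0.6, "lost": -0.4,
}

# Hypothetical keyword lists used to assign a sentence to a place.
PLACE_TERMS = {
    "Vietnam": {"vietnam", "saigon"},
    "America": {"america", "california"},
}


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def toy_polarity(sentence):
    """Average lexicon score of the matched words; 0.0 when none match."""
    scores = [TOY_LEXICON[w] for w in tokenize(sentence) if w in TOY_LEXICON]
    return mean(scores) if scores else 0.0


def polarity_by_place(text, scorer=toy_polarity):
    """Average the scorer's polarity over the sentences mentioning each place."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    result = {}
    for place, terms in PLACE_TERMS.items():
        hits = [s for s in sentences if terms & set(tokenize(s))]
        # None marks a place that no sentence mentions.
        result[place] = mean(scorer(s) for s in hits) if hits else None
    return result


# Invented sample sentences, not quotations from the novel.
sample = (
    "Saigon was beautiful before the war. America felt safe and free. "
    "Vietnam was lost to us. California was lonely."
)
scores = polarity_by_place(sample)
print(scores)
```

With TextBlob installed, the same grouping could take `lambda s: TextBlob(s).sentiment.polarity` as the `scorer` argument, which returns a polarity value between -1.0 and 1.0 for each sentence.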

