TY  - JOUR
ID  - 91569
TI  - Social Media Toxic Content Filtering System using SOIR Model
JO  - Journal of Information Technology Management
JA  - JITM
LA  - en
SN  - 
AU  - Bhandari, Nidhi
AU  - Navalakhe, Rachna
AU  - Prajapati, G.L
AD  - Department of Applied Mathematics, Indore Institute of Engineering and Technology, Indore, India
AD  - Department of Applied Mathematics and Computational Science, Shri G. S. Institute of Technology and Science, Indore, India
AD  - Department of Applied Mathematics, Indore Institute of Engineering and Technology, Indore, India
Y1  - 2023
PY  - 2023
VL  - 15
IS  - Special Issue: Digital Twin Enabled Neural Networks Architecture Management for Sustainable Computing
SP  - 78
EP  - 94
KW  - Text mining
KW  - Semantic Knowledge
KW  - information retrieval
KW  - Sentiment analysis
KW  - Lexical Pattern Analysis
DO  - 10.22059/jitm.2023.91569
N2  - Social media is a popular data source in the research community. It provides opportunities to design practical applications that benefit humanity and society. A large number of people consume social media content, and content promoters and influencers sometimes publish misleading and toxic content. This paper therefore proposes a toxic content filtering system that uses the information retrieval model SOIR to identify and remove toxic content from social media. Semantic query Optimization-based Information Retrieval (SOIR) uses Fuzzy C-Means (FCM) clustering to organize the data into a particular data structure, and it incorporates a query generation technique that produces multiple queries to increase the probability of correct outcomes. In this work, the SOIR model is modified for use in social media toxic content filtering. The model uses linguistic and semantic information to craft new feature sets. Part-of-Speech (POS) tagging is used to construct the linguistic features. Finally, a pattern-matching algorithm is designed to classify tweets as toxic or nontoxic.
Class labels are assigned to tweets based on lexical and semantic analysis of semantically similar queries (tweets). Twitter text posts are used to create the training and test samples; a total of 2002 tweets are used in the experiment. An experimental comparison with different IR models (K-NN, Cosine) based on precision, recall, and F1-score demonstrates the superiority of the proposed classification model.
UR  - https://jitm.ut.ac.ir/article_91569.html
L1  - https://jitm.ut.ac.ir/article_91569_a0fa2368f3ab5a2bc5945d69d60b46f8.pdf
ER  - 