Advanced Information Retrieval Techniques in the Big Data Era: Trends, Challenges, and Applications

Document Type: Research Paper

Authors

Department of Information Science, Faculty of Arts and Humanities, King Abdulaziz University, Jeddah, Saudi Arabia.

10.22059/jitm.2026.107234

Abstract

The rapid expansion of Big Data has introduced novel opportunities and challenges for Information Retrieval (IR). This study examines the current state of IR techniques and how they have evolved to manage, organize, and derive meaningful insights from massive datasets. We explore how machine learning algorithms, deep learning models, and natural language processing (NLP) improve retrieval accuracy and speed. A comprehensive analysis of contemporary methodologies indicates that application areas such as personalized search engines, e-commerce, and healthcare offer significant potential for improving retrieval precision, scalability, and relevance. Furthermore, this study addresses critical ethical considerations, including data privacy and algorithmic bias, while exploring novel applications in autonomous systems and personalized AI assistants. Future research must focus on developing novel algorithmic procedures, integrating quantum computing, and establishing ethical AI practices. Ultimately, advancing IR methodologies is essential to overcoming Big Data constraints and fostering technological innovation.

Keywords

