Anand, A., Gupta, A., Yadav, N., & Bajaj, S. (2024).
A comprehensive survey of AI-driven advancements and techniques in automated program repair and code generation. arXiv.
http://arxiv.org/abs/2411.07586
Cambaz, D., & Zhang, X. (2024). Use of AI-driven code generation models in teaching and learning programming: A systematic literature review. In Proceedings of the 55th ACM Technical Symposium on Computer Science Education (SIGCSE 2024) (Vol. 1, pp. 172–178).
Chen, X., Liu, C., & Song, D. (2018). Tree-to-tree neural networks for program translation.
Chung, D. J. H., Gao, Z., Kvasiuk, Y., Li, T., Münchmeyer, M., Rudolph, M., Sala, F., & Tadepalli, S. C. (2025).
Theoretical physics benchmark (TPBench): A dataset and study of AI reasoning capabilities in theoretical physics. arXiv.
http://arxiv.org/abs/2502.15815
Cruz-Benito, J., Vishwakarma, S., Martin-Fernandez, F., & Faro, I. (2021). Automated source code generation and auto-completion using deep learning: Comparing and discussing current language model-related approaches. AI (Switzerland), 2(1), 1–16.
Dou, S., Jia, H., Wu, S., Zheng, H., Zhou, W., Wu, M., Chai, M., Fan, J., Huang, C., Tao, Y., Liu, Y., Zhou, E., Zhang, M., Zhou, Y., Wu, Y., Zheng, R., Wen, M., Weng, R., Wang, J., … Huang, X. (2024).
What’s wrong with your code generated by large language models? An extensive study. arXiv.
http://arxiv.org/abs/2407.06153
Fan, Z., Gao, X., Mirchev, M., Roychoudhury, A., & Tan, S. H. (2023).
Automated repair of programs from large language models. arXiv.
http://arxiv.org/abs/2205.10583
Gao, C., Hu, X., Gao, S., Xia, X., & Jin, Z. (2025). The current challenges of software engineering in the era of large language models. ACM Transactions on Software Engineering and Methodology, 34(5).
Hou, X., Zhao, Y., Liu, Y., Yang, Z., Wang, K., Li, L., Luo, X., Lo, D., Grundy, J., & Wang, H. (2024).
Large language models for software engineering: A systematic literature review. arXiv.
http://arxiv.org/abs/2308.10620
Huang, H., Wang, S., Liu, H., Wang, H., & Wang, Y. (2024). Benchmarking large language models on communicative medical coaching: A dataset and a novel system. In
Findings of the Association for Computational Linguistics: ACL 2024 (pp. 1624–1637).
https://aclanthology.org/2024.findings-acl.94.pdf
Joshi, S. (2025). A comprehensive review of DeepSeek: Performance, architecture and capabilities. Preprints.
Ladegaard, I. (2025). Differentiation by disruption: Gatekeeper perspectives on “AI-aided writing” in three academic disciplines. Socius, 11.
Li, M., & Krishnamachari, B. (2024).
Evaluating ChatGPT-3.5 efficiency in solving coding problems of different complexity levels: An empirical analysis. arXiv.
http://arxiv.org/abs/2411.07529
Manik, M. M. H. (2025). ChatGPT vs. DeepSeek: A comparative study on AI-based code generation.
Mulder, R., Aivaloglou, F., & Zhang, X. (2023).
AI in coding: How can code generation models support developing computational thinking skills? The use of code generation models in programming support activities.
http://repository.tudelft.nl/
Shakya, R., Vadiee, F., & Khalil, M. (2025).
A showdown of ChatGPT vs DeepSeek in solving programming tasks. In
International Conference on New Trends in Computing Sciences (pp. 413–418). IEEE.
https://arxiv.org/pdf/2503.13549
Shi, L., Tang, Z., Zhang, N., Zhang, X., & Yang, Z. (2024). A survey on employing large language models for text-to-SQL tasks. ACM Computing Surveys, 58(2).
Tang, X., Qian, B., Gao, R., Chen, J., Chen, X., & Gerstein, M. B. (2024). BioCoder: A benchmark for bioinformatics code generation with large language models. Bioinformatics, 40(Supplement_1), i266–i276.
Wang, X., Gong, Z., Wang, G., Jia, J., Xu, Y., Zhao, J., Fan, Q., Wu, S., Hu, W., & Li, X. (2023). ChatGPT performs on the Chinese National Medical Licensing Examination.
Xu, H., & Yu, X.-Y. (2025).
From PowerPoint UI sketches to web-based applications: Pattern-driven code generation for GIS dashboard development using knowledge-augmented LLMs, context-aware visual prompting, and the React framework.
http://arxiv.org/abs/2502.08756
Yao, X., Li, H., Chan, T. H., Xiao, W., Yuan, M., Huang, Y., Chen, L., & Yu, B. (2025).
HDLdebugger: Streamlining HDL debugging with large language models. ACM Transactions on Design Automation of Electronic Systems.
https://doi.org/10.1145/3735638