RAG for Smarter Resume Analysis: Beyond Basic LLMs
Igor Zuykov, Chief Software Engineer, G-71 Inc., Ashkelon, Israel
Abstract
The article examines an architectural approach to resume analysis based on Retrieval-Augmented Generation (RAG), designed to overcome the systemic limitations of traditional keyword-matching algorithms (such as TF-IDF and BM25) and the inherent constraints of large language models (LLMs) used in isolation in an overloaded and semantically heterogeneous hiring market. The relevance of the work is driven by the growth in the volume and variability of resumes, the need to capture latent semantic correspondences between the phrasing of experience and vacancy requirements, the risk of algorithmic bias, and the risk of plausible yet unreliable generated personnel decisions. The study aims to formalize a dual-loop scheme for processing a resume corpus, in which dense semantic retrieval over vector representations of document fragments is coupled with answer generation that is strictly constrained by the retrieved context and by complex refusal rules applied when grounds are insufficient. The scientific novelty lies in interpreting the RAG approach as a mechanism of search-based non-parametric memory for a corporate resume array, where the chunking strategy (determined at the ingestion phase) and the retrieval parameters, such as topK and similarityThreshold, directly govern the scope and quality of the information passed to generation and thereby act as controllable regulators of the recall–noise–cost trade-off, and where requirements for explainability, traceability, and privacy are derived from HR-specific constraints rather than declared post factum. It is demonstrated that separating the retrieval and generation functions, offloading compute-intensive corpus preparation into an asynchronous loop, and deploying models locally jointly reduce LLM load, decrease the incidence of hallucinations, and enable verifiable candidate ranking based on the semantic proximity of a candidate's experience to the recruiter's query.
It is concluded that the reliability of systems of this class is determined not by model strength, but by the architecture of source control and the discipline of context management. The article will be helpful for researchers and engineers developing intelligent talent selection systems, as well as for practicing recruiters and HR analysts implementing RAG solutions in corporate processes.
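As an illustration of how topK and the similarity threshold regulate what reaches the generator, the following minimal Python sketch filters resume chunks by cosine similarity and refuses when nothing passes the threshold. It is not the article's implementation: the function names (`retrieve`, `build_prompt_or_refuse`), the three-dimensional toy vectors, and the sample chunk texts are hypothetical, and a real system would obtain vectors from an embedding model and search them through a vector index rather than a linear scan.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: Vector,
    chunks: List[Tuple[str, Vector]],
    top_k: int,
    similarity_threshold: float,
) -> List[Tuple[float, str]]:
    """Return at most top_k (score, text) pairs scoring >= the threshold."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= similarity_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


def build_prompt_or_refuse(
    question: str, hits: List[Tuple[float, str]]
) -> Optional[str]:
    """Build a context-constrained prompt, or return None to signal refusal."""
    if not hits:
        return None  # insufficient grounds: do not call the LLM at all
    context = "\n".join(f"[{i + 1}] {text}" for i, (_, text) in enumerate(hits))
    return (
        "Answer using only the numbered context below. "
        "If the context is insufficient, say so.\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical resume chunks with hand-made toy vectors (assumption:
# a real pipeline would compute these with an embedding model).
CHUNKS = [
    ("Built REST services in Java and Spring Boot", [0.9, 0.1, 0.0]),
    ("Managed a team of five recruiters", [0.0, 0.2, 0.9]),
    ("Designed PostgreSQL schemas for billing", [0.6, 0.7, 0.1]),
]
QUERY = [1.0, 0.2, 0.0]  # toy vector standing in for "backend Java experience"

hits = retrieve(QUERY, CHUNKS, top_k=2, similarity_threshold=0.5)
prompt = build_prompt_or_refuse("Does the candidate have backend experience?", hits)
print(prompt if prompt is not None else "Refused: no sufficiently similar context.")
```

Raising `similarity_threshold` trades recall for less noise in the prompt, while lowering `top_k` caps the context size and therefore the generation cost; with a strict enough threshold the hit list becomes empty and the sketch refuses instead of generating.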
Keywords
resume analysis, Retrieval-Augmented Generation, semantic search, dense retrieval, vector representations
Copyright License
Copyright (c) 2025 Igor Zuykov

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain the copyright of their manuscripts, and all Open Access articles are disseminated under the terms of the Creative Commons Attribution License 4.0 (CC-BY), which licenses unrestricted use, distribution, and reproduction in any medium, provided that the original work is appropriately cited. The use of general descriptive names, trade names, trademarks, and so forth in this publication, even if not specifically identified, does not imply that these names are not protected by the relevant laws and regulations.
Engineering and Technology