AI-Powered Resume Screening: Opportunities, Biases, and Interpretability Challenges
DOI: https://doi.org/10.21590/ehz64g49

Abstract
Automated resume screening tools are increasingly adopted by HR departments to streamline talent acquisition, but concerns about algorithmic bias, fairness, and explainability have drawn scrutiny from regulators and researchers alike. This paper investigates the use of AI models, particularly natural language processing (NLP) and machine learning (ML), in resume parsing, ranking, and filtering. We develop and evaluate three models: a logistic regression baseline with TF-IDF features, a fine-tuned BERT model for semantic understanding, and a gradient-boosted decision tree (XGBoost) trained on hand-labeled hiring outcomes from a publicly available dataset. The BERT-based model improves prediction accuracy by 15% over traditional keyword matching, identifying relevant experience even in unstructured or unconventional resume formats; however, its interpretability suffers because of the opaque nature of deep language models. We analyze bias using gender-swapped resumes and show that both the BERT and XGBoost models exhibit measurable disparities in ranking outcomes, favoring traditionally male-coded language and penalizing resumes with employment gaps. Feature-importance and SHAP-value visualizations are used to probe the models' decision logic. Our study highlights the tension between performance and fairness in AI-based hiring tools. We propose a hybrid approach that combines interpretable shallow models for initial screening with deep models for contextual scoring, alongside human-in-the-loop validation, and we contribute guidelines for the responsible use of AI in recruitment systems.
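
As a concrete illustration of the screening pipeline and bias probe summarized above, the following Python sketch pairs a TF-IDF logistic-regression baseline with a simple gender-swap test. The toy resumes, the swap dictionary, and every identifier here are illustrative assumptions, not the authors' actual code or dataset.

# A minimal sketch of a TF-IDF + logistic regression screening baseline
# and a gender-swap bias probe. All data and names are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: resume text paired with a binary hiring outcome.
resumes = [
    "Led a team of engineers shipping distributed systems in Go",
    "Supported office operations and managed executive scheduling",
    "Built machine learning pipelines and published NLP research",
    "Career break 2019-2021; previously a junior data analyst",
]
labels = [1, 0, 1, 0]

# Baseline screener: TF-IDF features feeding a logistic regression.
screener = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
screener.fit(resumes, labels)

# Gender-swap probe: flip gendered tokens (one direction shown here)
# and compare the model's scores before and after the swap.
SWAPS = {"he": "she", "his": "her", "him": "her", "mr.": "ms."}

def gender_swap(text: str) -> str:
    # Replace gendered tokens with their counterparts (illustrative;
    # TF-IDF lowercases input, so lost capitalization is harmless).
    return " ".join(SWAPS.get(tok.lower(), tok) for tok in text.split())

resume = "He led his team to deliver the product ahead of schedule"
original_score = screener.predict_proba([resume])[0, 1]
swapped_score = screener.predict_proba([gender_swap(resume)])[0, 1]
print(f"original: {original_score:.3f}  swapped: {swapped_score:.3f}")
print(f"ranking disparity: {abs(original_score - swapped_score):.3f}")

A nonzero disparity on otherwise identical resumes is the kind of measurable ranking gap the abstract refers to; the paper's analysis applies the same swap logic to BERT and XGBoost scores and inspects the drivers with feature-importance and SHAP-value plots.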