Text Classification: A Comprehensive Survey of Methods, Applications, and Future Directions
DOI: https://doi.org/10.21590/ijtmh.8.03.04

Keywords: Text Classification, Natural Language Processing, Machine Learning, Deep Learning, Transformers, BERT, Neural Networks, Sentiment Analysis, Topic Categorization

Abstract
Text classification is a fundamental task in natural language processing: the automated assignment of predefined categories to textual documents, sentences, or phrases. This survey examines the evolution of text classification methodologies from traditional machine learning approaches through deep learning innovations to contemporary transformer-based architectures. We analyze the progression from feature-engineered methods such as Naïve Bayes and Support Vector Machines, through representation learning with CNNs and LSTMs, to pre-trained language models including BERT, RoBERTa, and GPT variants. Our analysis covers benchmark datasets, evaluation metrics, and quantitative performance comparisons across methodologies. We explore emerging paradigms including prompt-based learning, few-shot classification, and multilingual adaptation, while addressing critical challenges in interpretability, fairness, and computational efficiency. The survey provides practitioners with decision frameworks for selecting appropriate approaches based on task requirements, data availability, and resource constraints. We conclude by identifying open research questions and future directions that will shape the next generation of text classification systems.


