Text Classification: A Comprehensive Survey of Methods, Applications, and Future Directions
DOI:
https://doi.org/10.21590/ijtmh.8.03.04
Keywords:
Text Classification, Natural Language Processing, Machine Learning, Deep Learning, Transformers, BERT, Neural Networks, Sentiment Analysis, Topic Categorization
Abstract
Text classification stands as a fundamental task in natural language processing, involving the automated assignment of predefined categories to textual documents, sentences, or phrases. This comprehensive survey examines the evolution of text classification methodologies from traditional machine learning approaches through deep learning innovations to contemporary transformer-based architectures. We analyze the progression from feature-engineered methods like Naïve Bayes and Support Vector Machines, through representation learning with CNNs and LSTMs, to pre-trained language models including BERT, RoBERTa, and GPT variants. Our analysis encompasses benchmark datasets, evaluation metrics, and quantitative performance comparisons across methodologies. We explore emerging paradigms including prompt-based learning, few-shot classification, and multilingual adaptation while addressing critical challenges in interpretability, fairness, and computational efficiency. The survey provides practitioners with decision frameworks for selecting appropriate approaches based on task requirements, data availability, and resource constraints. We conclude by identifying open research questions and future directions that will shape the next generation of text classification systems.


