Explainable Bangla Linguistic Style Classification into Saint and Common Forms

October 28, 2025

Gazi Maliha Raisa Noor

Afia Fahmida

Raihan Tanvir

Faisal Muhammad Shah



Abstract

This study explores deep neural network and transformer-based approaches for classifying Bengali texts into saint (Sadhu) and common (Cholito) forms. We evaluated six architectures on the BanglaBlend dataset: BiLSTM with attention, GRU-CNN, BanglaBERT, BanglaBERT-Enhanced CNN, XLM-RoBERTa Large, and SahajBERT. Among these, the transformer-based models, particularly SahajBERT and XLM-RoBERTa Large, performed consistently well across the evaluation metrics. SahajBERT achieved the best overall results, scoring 0.95 ± 0.01 on the evaluation metrics and outperforming BiLSTM, GRU-CNN, and BanglaBERT by a significant margin in both predictive accuracy and robustness. To enhance interpretability, we incorporated LIME, a widely used explainable AI (XAI) technique that provides token-level attribution for individual predictions. These attributions enable transparent validation that the stylistic cues the models rely on align with linguistic expectations. We further examined the robustness of the explanations across random seeds, assessed lexical overlap between data splits to ensure fair evaluation, and benchmarked inference efficiency for the transformer models. Our findings demonstrate the strength of transformer-based models in capturing stylistic and lexical distinctions in Bangla, setting a benchmark for future research in literary style detection, text normalization, and digital language preservation.
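
The abstract mentions assessing lexical overlap between splits but does not say how it was measured. As a rough illustration only, one common way to check this is to compare the vocabularies of the training and test sets. The sketch below assumes whitespace tokenization and uses Jaccard similarity plus the share of test vocabulary already seen in training; the function names (`vocabulary`, `lexical_overlap`) and the romanized toy sentences are hypothetical and are not taken from the paper or from BanglaBlend.

```python
def vocabulary(texts):
    """Collect the set of whitespace-separated tokens across a list of texts."""
    return {token for text in texts for token in text.split()}


def lexical_overlap(train_texts, test_texts):
    """Return (Jaccard similarity, share of test vocabulary also seen in train).

    Assumption: whitespace tokenization; the paper may use a different
    tokenizer or a different overlap measure.
    """
    train_vocab = vocabulary(train_texts)
    test_vocab = vocabulary(test_texts)
    shared = train_vocab & test_vocab
    union = train_vocab | test_vocab
    jaccard = len(shared) / len(union) if union else 0.0
    test_coverage = len(shared) / len(test_vocab) if test_vocab else 0.0
    return jaccard, test_coverage


if __name__ == "__main__":
    # Romanized, made-up sentences used only to exercise the function.
    train = ["tahara bidyaloye jaitechhe", "ami bhat khaitechhi"]
    test = ["tara bidyaloye jachchhe", "ami bhat khachchhi"]
    jaccard, coverage = lexical_overlap(train, test)
    print(f"Jaccard: {jaccard:.2f}, test vocabulary seen in train: {coverage:.2f}")
```

A high share of test vocabulary already seen in training would suggest that a classifier could score well by memorizing words rather than learning stylistic patterns, which is the kind of leakage a split-overlap check is meant to surface.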

Type

Book section

Publication

3rd International Conference on Big Data, IoT and Machine Learning (BIM)

Last updated on October 28, 2025

Natural Language Processing, Bengali Language Processing, Text Classification, Explainable AI

Authors

Raihan Tanvir

Senior Lecturer

