A Hybrid MuRIL–Attention–Random Forest Framework for Hate Speech Detection Against Women in Hindi

Neha Tyagi; Gopal Krishna Sharma; Narendra Kumar Sharma

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 187 - Issue 96

Published: April 2026

10.5120/ijca2ad422afaaf6

Neha Tyagi, Gopal Krishna Sharma, Narendra Kumar Sharma. A Hybrid MuRIL–Attention–Random Forest Framework for Hate Speech Detection Against Women in Hindi. International Journal of Computer Applications. 187, 96 (April 2026), 51-59. DOI=10.5120/ijca2ad422afaaf6

@article{10.5120/ijca2ad422afaaf6,
  author    = {Neha Tyagi and Gopal Krishna Sharma and Narendra Kumar Sharma},
  title     = {A Hybrid MuRIL–Attention–Random Forest Framework for Hate Speech Detection Against Women in Hindi},
  journal   = {International Journal of Computer Applications},
  year      = {2026},
  volume    = {187},
  number    = {96},
  pages     = {51--59},
  doi       = {10.5120/ijca2ad422afaaf6},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2026
%A Neha Tyagi
%A Gopal Krishna Sharma
%A Narendra Kumar Sharma
%T A Hybrid MuRIL–Attention–Random Forest Framework for Hate Speech Detection Against Women in Hindi
%J International Journal of Computer Applications
%V 187
%N 96
%P 51-59
%R 10.5120/ijca2ad422afaaf6
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Hate speech against women on social media is growing, especially in low-resource languages such as Hindi, where linguistic diversity, informal writing styles, and social stratification make automated detection difficult. Existing machine learning and deep learning methods struggle to capture contextual semantics and to handle this variation, resulting in low accuracy. This study proposes a hybrid framework that combines MuRIL, a multilingual transformer-based language model for Indian languages, with an attention mechanism and a random forest classifier to detect Hindi hate speech directed at women. MuRIL embeddings provide deep contextual representations, while the attention layer highlights patterns of hateful language. A professionally labelled Hindi social media dataset of 2,020 entries is used for comprehensive evaluation. The proposed hybrid framework is compared against TF-IDF with SVM, CNN, Bi-LSTM with attention, and standalone MuRIL-based models. Experiments show that the MuRIL–Attention–Random Forest design outperforms these baseline models, with an average precision of 92.82% and better class-wise discrimination. Combining transformer-based semantic representations with ensemble machine learning improves detection accuracy in low-resource and imbalanced settings. The proposed framework is an effective and robust solution for Hindi hate speech detection and a solid foundation for multilingual and continuous content moderation systems.
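As an illustration of the attention step described in the abstract, the following minimal Python sketch pools per-token embeddings into a single sentence vector using softmax attention weights. The abstract does not specify the form of the attention layer, so the dot-product scoring against a single query vector is an assumption, and the names `attention_pool`, `tokens`, and `query` are hypothetical. The toy two-dimensional vectors stand in for MuRIL token embeddings, which are much wider in practice.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    token_vectors: Sequence[Vector], query: Vector
) -> Tuple[Vector, List[float]]:
    """Collapse per-token vectors into one sentence vector.

    Each token is scored by its dot product with `query` (a hypothetical
    learned parameter); the softmax of those scores gives the weights
    used for the weighted sum.
    """
    scores = [sum(t * q for t, q in zip(tok, query)) for tok in token_vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * tok[d] for w, tok in zip(weights, token_vectors))
        for d in range(dim)
    ]
    return pooled, weights


# Toy stand-ins for contextual token embeddings (assumption: not real MuRIL output).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical query vector that favours the second embedding dimension.
query = [0.0, 2.0]

sentence_vector, weights = attention_pool(tokens, query)
print("attention weights:", [round(w, 3) for w in weights])
print("sentence vector:", [round(x, 3) for x in sentence_vector])
```

In a pipeline of the kind the abstract outlines, one such pooled vector per comment would form a feature row for a tree ensemble such as scikit-learn's `RandomForestClassifier`. How the authors actually connect the attention layer to the random forest is not stated in the abstract.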

Index Terms

Computer Science

Information Sciences

Keywords

Hate speech; Hybrid framework; Women; Machine and deep learning models