Multilingual Toxic Comments Classification Using BERT
International Journal of Development Research
Received 10th December, 2024; Received in revised form 16th December, 2024; Accepted 25th January, 2025; Published online 27th February, 2025
Copyright © 2025, Akshaya et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The rapid growth of online platforms has led to a surge in toxic comments, which disrupt digital communities and harm users. Addressing the problem is especially difficult in a multilingual setting, because most available solutions focus primarily on English. This project presents a multilingual toxic comment classification system built on Multilingual BERT (mBERT). Drawing on mBERT's coverage of many languages, the system detects and classifies toxic content, ranging from hate speech to abusive language, in real time. The model is fine-tuned on a diverse multilingual dataset, which extends coverage to less-resourced languages, and it assigns each comment a toxicity score to support moderation. The approach offers a robust and scalable way to foster healthier and more respectful online communities worldwide.
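The per-comment toxicity score described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-class classification head on mBERT whose output index 1 means "toxic", the public `bert-base-multilingual-cased` checkpoint, and a hypothetical moderation threshold of 0.5; none of these details are stated in the article. The helper names `toxicity_score`, `flag_for_review`, and `score_comments` are illustrative only.

```python
import math
from typing import List, Sequence

# Assumption: index 1 of the classifier output is the "toxic" class.
TOXIC_INDEX = 1


def toxicity_score(logits: Sequence[float], toxic_index: int = TOXIC_INDEX) -> float:
    """Convert raw class logits into a toxicity probability in [0, 1] via softmax."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    return exps[toxic_index] / sum(exps)


def flag_for_review(score: float, threshold: float = 0.5) -> bool:
    """Hypothetical moderation rule: flag comments whose score meets the threshold."""
    return score >= threshold


def score_comments(
    comments: List[str],
    checkpoint: str = "bert-base-multilingual-cased",  # assumed checkpoint
) -> List[float]:
    """Score comments with an mBERT sequence classifier.

    The classification head loaded from the base checkpoint is randomly
    initialised, so the scores are meaningful only after fine-tuning on
    labelled toxic-comment data (or when `checkpoint` points at an
    already fine-tuned model).
    """
    # Imported lazily so the pure-Python helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
    model.eval()

    batch = tokenizer(
        comments, padding=True, truncation=True, max_length=128, return_tensors="pt"
    )
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**batch).logits
    return [toxicity_score(row) for row in logits.tolist()]
```

With equal logits, `toxicity_score([0.0, 0.0])` returns 0.5, which sits exactly on the assumed threshold. In practice the threshold would be tuned per language on held-out data, since a single cut-off may not transfer well across languages.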