Abstract:
In this study, we addressed the detection of offensive language in Arabic social media content. Arabic remains underrepresented in natural language processing (NLP) research. Using a recently published public dataset, we trained several machine learning and deep learning models for this task.
The machine learning models were Naive Bayes, support vector machines (SVM), Decision Tree, and Random Forest. We also explored deep learning architectures, namely convolutional neural networks (CNN) and recurrent neural networks (RNN). Our experiments yielded strong results, indicating that these approaches are effective at detecting offensive language in Arabic.
To make our work easier to apply in practice, we also developed a user interface in Python. It provides intuitive access to our detection models, so that non-technical users can use them.
The results are promising and leave room for future improvement, particularly by optimizing the current models and exploring other machine learning and deep learning techniques.