Impact of preprocessing techniques on the performance of diabetes classification models.

Date

2025

Publisher

University of Bordj Bou Arreridj

Abstract

Diabetes is a chronic disease for which early diagnosis is crucial to prevent serious complications. In this work, we study the impact of preprocessing techniques on the performance of classification models applied to diabetes data. To this end, we use two medical datasets: the Pima Indians dataset and a local dataset from Iraq. We evaluate three classification algorithms: logistic regression, support vector machines (SVM), and decision trees. We apply two normalization techniques (MinMaxScaler and StandardScaler) and three feature selection methods (SelectKBest, GenericUnivariateSelect, and SelectFromModel). The results, evaluated using cross-validation, show that a well-chosen preprocessing strategy significantly improves model accuracy, although the gain varies with the nature of the data and the algorithm used.
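
The comparison described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' code: the data is synthetic (generated with make_classification, since neither dataset is included in this record), and the number of selected features (k=5), the 5-fold split, the model hyperparameters, and the helper name evaluate_grid are all assumptions made for illustration.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 8 numeric features, binary target.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Normalization options named in the abstract, plus an unscaled baseline.
scalers = {
    "none": "passthrough",
    "minmax": MinMaxScaler(),
    "standard": StandardScaler(),
}

# The three classifier families named in the abstract (default-ish settings).
models = {
    "logreg": LogisticRegression(max_iter=1000),
    "svm": SVC(),
    "tree": DecisionTreeClassifier(random_state=0),
}


def evaluate_grid(X, y, k=5, cv=5):
    """Hypothetical helper: mean CV accuracy per (scaler, model) pair."""
    results = {}
    for scaler_name, scaler in scalers.items():
        for model_name, model in models.items():
            # Scaling and selection live inside the pipeline so they are
            # refit on each training fold, which avoids test-fold leakage.
            pipe = Pipeline([
                ("scale", scaler if scaler == "passthrough" else clone(scaler)),
                ("select", SelectKBest(score_func=f_classif, k=k)),
                ("clf", clone(model)),
            ])
            scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
            results[(scaler_name, model_name)] = scores.mean()
    return results


results = evaluate_grid(X, y)
for (scaler_name, model_name), acc in sorted(results.items()):
    print(f"{scaler_name:>8} + {model_name:<6} mean accuracy = {acc:.3f}")
```

The "select" step could be swapped for GenericUnivariateSelect or SelectFromModel to cover the other two feature selection methods; keeping every preprocessing step inside the pipeline is a design choice that keeps the cross-validated scores comparable across configurations.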

Keywords

Diabetes, Classification, Preprocessing, Feature Selection, Normalization, Cross-Validation.
