The Use of Big Data and Data Analytics in the Prevention, Diagnosis and Prediction of Long-Term Diseases

Date

2026

Publisher

University of Bordj Bou Arreridj

Abstract

The increasing prevalence of long-term diseases, particularly diabetes, presents significant challenges to global healthcare systems. Early prediction, accurate diagnosis, and continuous monitoring are crucial for improving patient outcomes and reducing healthcare costs. This thesis explores the use of Big Data and data analytics in the prevention, diagnosis, and monitoring of long-term diseases, focusing specifically on diabetes. The core objective is the development of an integrated system that supports individuals throughout the disease lifecycle. The proposed system is structured into three main phases: first, the creation of predictive algorithms capable of estimating an individual’s risk of developing diabetes within a ten-year period; second, the application of explainable neural networks to diagnose diabetes based on retinal imaging, ensuring transparency and trust in AI-driven decisions; and third, the development of a digital platform to continuously monitor patients, facilitating proactive management and personalized care. By leveraging machine learning, Big Data technologies, and explainable AI, this work aims to contribute to a more predictive, preventive, and participatory healthcare model for chronic disease management.

Description

Machine learning (ML) and deep learning (DL) techniques have garnered significant attention in medical diagnosis and healthcare due to their ability to analyze complex datasets, such as medical images, clinical records, and genetic information. These approaches are particularly effective in supporting healthcare professionals in the diagnosis and management of chronic diseases like diabetes, where ML and DL can detect subtle patterns in medical imaging and other data sources that may be challenging for humans to discern. Diabetes presents unique challenges for healthcare systems due to the complexity of its diagnosis, progression monitoring, and individualized management requirements. Traditionally, diabetes diagnosis and monitoring rely on blood tests to measure glucose levels, HbA1c, and other biomarkers. However, these methods are invasive, may require lab facilities, and can be cumbersome for continuous monitoring. Therefore, there is a growing need for accurate, non-invasive, and readily accessible diagnostic methods, particularly for early detection and risk assessment. ML and DL techniques show promise in addressing these challenges by leveraging data from imaging modalities such as retinal fundus photography, MRI, and CT scans to build reliable diagnostic models.
Imaging data, especially retinal images, are widely used in diabetes diagnosis and monitoring because they reveal microvascular changes related to diabetic retinopathy, one of the most common diabetes-related complications. ML and DL models trained on these images can offer significant diagnostic support by detecting early indicators of diabetes and related complications, reducing the dependency on invasive testing. This thesis seeks to develop a quick, accurate, and accessible diagnostic approach for diabetes by leveraging ML and DL techniques with a focus on image-based analysis.
In line with these goals, a platform was developed as part of this research to support the seamless monitoring and management of diabetes. The platform integrates ML and DL diagnostic models into intuitive user interfaces, enabling healthcare professionals to upload and analyze patient imaging data in real time. It includes features for continuous monitoring, automated risk stratification, and alert generation for critical findings. Designed with a focus on clinical applicability, the platform supports secure data handling and role-based access for healthcare staff. A mobile-compatible module was also implemented to enable remote screening and monitoring, especially in resource-limited settings. These tools not only enhance diagnostic efficiency but also facilitate timely interventions and follow-up by providing actionable insights at the point of care.
The research questions guiding this thesis focus on (1) the feasibility of using ML-based diagnostic systems to match or complement the performance of traditional glucose-monitoring methods, (2) the necessity of DL methods in developing a robust diabetes diagnosis system based on imaging data, (3) strategies for addressing class imbalance issues in available diabetes datasets, particularly in large imaging datasets such as retinal images from diabetic and non-diabetic patients, and (4) the development of platforms for deploying ML/DL models in clinical settings with monitoring and decision-making capabilities. Specific objectives were set to provide a thorough theoretical background in ML, DL, dimensionality reduction, and data augmentation, followed by a detailed literature review of diabetes detection studies. A total of 40 studies were categorized into ML-based, DL-based, and comparative analyses. This analysis spanned approaches from ML to DL, feature extraction and selection techniques, and data augmentation and class-balancing strategies.
Key limitations identified in the review included the need for high-quality, well-annotated imaging data, the lack of sufficient diabetic samples in some datasets, and the limited use of sensitivity metrics, which are often overlooked in existing studies, with reported values ranging from 73% to 81.2%.
Machine learning (ML) and deep learning (DL) are increasingly being used for diabetes diagnosis, particularly through the analysis of imaging data. These technologies can identify complex patterns in medical images, such as retinal scans, that may be subtle or challenging for human clinicians to discern. This capability enhances the accuracy of early diagnosis, supports disease progression monitoring, and aids in predicting potential complications associated with diabetes. For instance, retinal fundus photography is commonly used to detect diabetic retinopathy, while MRIs, CT scans, and foot thermography are employed to identify other complications like neuropathy and cardiovascular risks. In ML/DL applications for diabetes imaging, convolutional neural networks (CNNs) are widely used due to their strength in feature extraction. These models are trained to recognize visual cues such as microaneurysms or hemorrhages in retinal images, which signal diabetic retinopathy. The accuracy of these models can be high, often rivaling the diagnostic capabilities of trained professionals. However, the effectiveness of ML and DL models is heavily influenced by the quality of the data, the specific model architecture, and the type of complication being addressed. Real-world validation is essential to ensure these models perform well in diverse clinical environments. While ML and DL present innovative tools for diabetes diagnosis, they are best viewed as complements to traditional methods, not replacements. These models can help streamline diagnosis and identify at-risk patients, but confirmation through standard tests, like blood glucose measurements, remains crucial.
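To make the sensitivity metric mentioned among the review limitations concrete, the following is a minimal sketch of how sensitivity and specificity are computed from binary predictions. The function name and the toy labels are illustrative assumptions; they are not taken from the thesis or from any of the reviewed studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = diabetic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diabetic, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diabetic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: 4 diabetic and 6 non-diabetic cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={accuracy:.2f}")
# prints: sensitivity=0.75 specificity=0.83 accuracy=0.80
```

In this toy case accuracy is 0.80 although one in four diabetic cases is missed, which is why reporting sensitivity alongside accuracy matters on imbalanced data.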
There are challenges in developing these models, including the need for high-quality, annotated images, handling class imbalances (where non-diabetic cases often outnumber diabetic cases), and ensuring data privacy and security. Ethical considerations, such as mitigating model bias and maintaining transparency in predictions, are also paramount, especially in clinical settings. Data privacy in ML/DL diabetes research is maintained through methods like data anonymization and differential privacy, which help protect sensitive medical information. Looking ahead, future advancements in data augmentation techniques could address class imbalance issues, while hybrid models that integrate imaging data with clinical and genetic information may enhance diagnostic precision. Explainability methods, which clarify how models reach their conclusions, can build clinician trust. Real-time monitoring through remote imaging devices and wearables could further transform diabetes care by enabling continuous, proactive management. These advancements hold great promise for enhancing diabetes diagnosis and management, contributing to more personalized and timely healthcare interventions.
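One common way to handle the class imbalance described above is to weight each class inversely to its frequency during training. The sketch below illustrates that general idea only; it is not the balancing strategy used in the thesis, and the function name and label counts are assumptions. The formula, n_samples / (n_classes * class_count), is the same heuristic scikit-learn applies for class_weight="balanced".

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical dataset: 8 non-diabetic (0) and 2 diabetic (1) images.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
print(weights)  # {0: 0.625, 1: 2.5}
```

With these weights, an error on a diabetic sample contributes four times as much to a weighted loss as an error on a non-diabetic sample, offsetting the 8:2 imbalance.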

Keywords

DATA - DATA ANALYTICS - DIAGNOSIS
