Abstract:
With the explosion in the volume of data generated daily, Big Data has become a major
concern across many domains. Its significance lies in its capacity to yield valuable
insights and support informed decision-making. To fully harness this potential,
however, machine learning techniques are needed that can process, analyze, and
extract relevant information from these vast datasets.
This thesis presents a systematic literature review on machine learning methods
for Big Data processing and analysis, accompanied by a case study. The review covers
supervised, unsupervised, semi-supervised, and deep learning techniques, along
with their algorithms, including support vector machines (SVM), regression, decision
trees, convolutional neural
networks (CNN), recurrent neural networks (RNN), and clustering techniques such as
HDDC, SOM, FCM, and k-means. A rigorous methodology was employed to identify
and evaluate relevant studies. In the case study, the k-means algorithm was applied to
the Iris dataset, demonstrating its effectiveness in identifying patterns within the data.
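
As an illustration of the kind of clustering performed in the case study, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the code used in the thesis. The nine rows in `SAMPLE` are a small hand-copied subset of Iris measurements rather than the full dataset, and the names `SAMPLE`, `farthest_point_init`, and `kmeans`, the choice of k = 3, and the deterministic farthest-point seeding are all assumptions made for the example.

```python
# Illustrative sketch only: plain k-means on a tiny Iris sample.
# Not the thesis's experiment; names and seeding strategy are assumptions.
from math import dist  # Euclidean distance, Python 3.8+

# (sepal length, sepal width, petal length, petal width) in cm
SAMPLE = [
    (5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (4.7, 3.2, 1.3, 0.2),  # setosa
    (7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5), (6.9, 3.1, 4.9, 1.5),  # versicolor
    (6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9), (7.1, 3.0, 5.9, 2.1),  # virginica
]


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start from the first point,
    then repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Alternate assignment and centroid-update steps until labels stop changing."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


labels, centroids = kmeans(SAMPLE, k=3)
print(labels)
```

The species comments in `SAMPLE` are never shown to the algorithm; k-means is unsupervised, so the cluster indices it returns are arbitrary and need not coincide with species boundaries.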
In conclusion, this systematic review highlights the main machine learning
techniques for addressing Big Data challenges, together with their limitations. It
also identifies current open issues, paving the way for future work on resolving
them.