Identification of DNA-protein relationships using association rules
dc.contributor.author | CHADI, Nadjet | |
dc.contributor.author | SLAMA, Ratiba | |
dc.date.accessioned | 2022-02-22T09:26:07Z | |
dc.date.available | 2022-02-22T09:26:07Z | |
dc.date.issued | 2021 | |
dc.description.abstract | In biology, cells carry out many functional processes, most of them related to heredity. Two of these, transcription and translation, depend on the links between proteins and deoxyribonucleic acid (DNA), in particular on transcription factors (TFs) and transcription factor binding sites (TFBSs), which play a fundamental role in both processes. Growth in storage capacity has made it possible to store these data at scale. In this work, we explore the relationship between TFs and TFBSs using data mining techniques, specifically the Apriori and Eclat algorithms. | en_US |
dc.identifier.issn | MM/558 | |
dc.identifier.uri | http://10.10.1.6:4000/handle/123456789/1870 | |
dc.language.iso | fr | en_US |
dc.publisher | Université Mohamed el-Bachir el-Ibrahimi Bordj Bou Arréridj Faculté de Mathématique et Informatique | en_US |
dc.title | Identification of DNA-protein relationships using association rules | en_US |
dc.type | Thesis | en_US |
Files
Original bundle
- Name: mémoireF-converti.pdf
- Size: 1.4 MB
- Format: Adobe Portable Document Format
- Description:
- With the current deluge of biological data, computational methods have become essential to biological research. Bioinformatics, originally developed for the analysis of biological sequences, now covers a wide range of fields, including structural biology, genomics, and the study of gene expression. This thesis addresses one bioinformatics problem: identifying DNA-protein links using data mining techniques. We chose to apply two algorithms: Apriori, which counts support over a horizontal representation of the database and explores itemsets breadth-first (BFS: Breadth First Search), and Eclat, which counts support over a vertical representation and explores itemsets depth-first (DFS: Depth First Search). The choice of a data mining technique or algorithm depends strongly on the application context, the nature of the data, and the resources available, and an analysis of the data helps to choose the most suitable algorithm. To address the problem, we built a database of TF sequences and all of their TFBSs from TRANSFAC, one of the sequence libraries published online, and then followed two steps: preprocessing and data mining, which together yield association rules. The preprocessing phase splits the sequences into items using k-mers, extracts the frequent words, and then constructs the binary matrix. In the data mining phase, we applied two methods for extracting association rules, the Apriori and Eclat algorithms. In the end, we obtain relationships that are important in several biological activities.
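The pipeline described above (k-mer splitting, binary matrix, then Apriori and Eclat) can be illustrated with a minimal sketch. This is not the thesis code: the sequences in `toy_sequences` are invented, the value of `k`, the thresholds, and every function name are assumptions chosen for illustration, and the Apriori candidate step omits subset pruning to stay short.

```python
from itertools import combinations

# Invented toy sequences; the thesis uses TF/TFBS sequences from TRANSFAC.
toy_sequences = ["ACGTAC", "ACGTTT", "TTACGA", "GGGTAC"]

def kmers(seq, k):
    """Return the set of overlapping k-mers (items) found in one sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def binary_matrix(transactions):
    """Build a 0/1 matrix: one row per sequence, one column per k-mer."""
    items = sorted(set().union(*transactions))
    rows = [[1 if it in t else 0 for it in items] for t in transactions]
    return items, rows

def apriori(transactions, min_count):
    """Level-wise (breadth-first) search over the horizontal layout."""
    frequent = {}
    level = [frozenset([it]) for it in sorted(set().union(*transactions))]
    while level:
        # Count each candidate by scanning every transaction.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        kept = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(kept)
        # Join frequent itemsets of size n into candidates of size n + 1
        # (no subset pruning here; infrequent candidates fail the count).
        size = len(next(iter(kept))) + 1 if kept else 0
        level = list({a | b for a, b in combinations(kept, 2)
                      if len(a | b) == size})
    return frequent

def eclat(transactions, min_count):
    """Depth-first search over the vertical layout (item -> row ids)."""
    tidsets = {}
    for tid, t in enumerate(transactions):
        for it in t:
            tidsets.setdefault(it, set()).add(tid)
    frequent = {}

    def extend(prefix, candidates):
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect row-id sets to get the support of longer itemsets.
            nxt = [(o, tids & otids) for o, otids in candidates[i + 1:]]
            extend(itemset, [(o, s) for o, s in nxt if len(s) >= min_count])

    start = sorted(((it, t) for it, t in tidsets.items()
                    if len(t) >= min_count), key=lambda p: p[0])
    extend(frozenset(), start)
    return frequent

def rules(frequent, min_conf):
    """Derive rules A -> B with confidence = count(A and B) / count(A)."""
    out = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), r)):
                conf = n / frequent[lhs]
                if conf >= min_conf:
                    out.append((set(lhs), set(itemset - lhs), conf))
    return out

transactions = [kmers(s, 3) for s in toy_sequences]
items, rows = binary_matrix(transactions)
freq_apriori = apriori(transactions, min_count=2)
freq_eclat = eclat(transactions, min_count=2)
for lhs, rhs, conf in rules(freq_apriori, min_conf=1.0):
    print(sorted(lhs), "->", sorted(rhs), "confidence", conf)
```

Both functions should return the same frequent itemsets on the same input; they differ only in how support is counted (row scans versus row-id intersections) and in the order in which itemsets are explored.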