Deep Learning For Text Catégorization

ZEGADI, Walid

Faculté des mathématiques et de l'informatique, Master Informatique

dc.contributor.author	ZEGADI, Walid
dc.date.accessioned	2022-04-12T09:20:21Z
dc.date.available	2022-04-12T09:20:21Z
dc.date.issued	2020
dc.identifier.issn	MM578
dc.identifier.uri	https://dspace.univ-bba.dz:443/xmlui/handle/123456789/2195
dc.description.abstract	With the constant growth in the number of available digital texts, much research is devoted to organizing this immense body of information and making it exploitable. These data are an important source of information for several applications such as recommendation systems, community detection, marketing and computer vision. In this context, text categorization aims to group thematically similar texts within the same category. Most approaches treat this problem as a single whole. However, text categorization is a twofold problem. The first problem concerns textual representation, in other words how to obtain a mathematical, numerical representation of a text, and hence the selection of the best features (relevant terms) to ensure better classification. The second problem lies mainly in the domain of learning: using one of the most recent deep learning techniques to categorize any new text, starting from a training set. In this thesis, we apply a powerful deep learning model, the convolutional neural network (CNN), to a textual data set to solve text classification problems; in addition, we use the CNN for feature selection. We evaluate and compare the performance of the CNN with other machine learning algorithms such as SVM and decision trees. Moreover, we compare the performance of the CNN-based feature selection method, namely Back Propagation (BP CNN), with Information Gain (IG), the method most commonly used in text classification.	en_US
dc.language.iso	fr	en_US
dc.publisher	Université de Bordj Bou Arreridj	en_US
dc.subject	Text Mining, automatic categorization of texts, textual representation, Deep Learning, Convolutional neural network, Machine Learning, BackPropagation, Information Gain, Feature Selection.	en_US
dc.title	Deep Learning For Text Catégorization	en_US
dc.type	Thesis	en_US