dc.contributor.advisor | Ponce Cruz, Pedro | |
dc.contributor.author | Baltazar Reyes, Germán Eduardo | |
dc.creator | BALTAZAR REYES, GERMAN EDUARDO; 852898 | |
dc.date.accessioned | 2022-03-01T23:22:31Z | |
dc.date.available | 2022-03-01T23:22:31Z | |
dc.date.issued | 2021-12-02 | |
dc.identifier.citation | Baltazar Reyes, G. E. (2021). Analysis and Use of Textual Definitions through a Transformer Neural Network Model and Natural Language Processing [Unpublished doctoral thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. | es_MX |
dc.identifier.uri | https://hdl.handle.net/11285/645423 | |
dc.description | https://orcid.org/0000-0001-7035-5286 | es_MX |
dc.description.abstract | There is currently an information overload problem: data is excessive, disorganized, and presented statically. These three problems are closely tied to the vocabulary used in each document, since the usefulness of a document depends directly on how much of its vocabulary the reader understands. At the same time, multiple Machine Learning algorithms and applications analyze the structure of written information. However, most implementations focus on the bigger picture of text analysis, namely understanding the structure and use of complete sentences and creating new documents as long as the originals. This focus directly affects the static presentation of data. For these reasons, this proposal evaluates the semantic similarity between a complete phrase or sentence and a single keyword, following the structure of a regular dictionary, where a descriptive sentence explains and shares the exact meaning of a single word. The model uses a GPT-2 Transformer neural network to interpret a descriptive input phrase and generate a new phrase intended to express the same abstract concept, similar to a particular keyword. The generated text is validated by a Universal Sentence Encoder network, which was fine-tuned to relate the semantic similarity between the full set of words in a sentence and its corresponding keyword. The results showed that the proposal could generate new phrases that resemble the general context of the descriptive input sentence and the ground-truth keyword. The validation of the generated text was also able to assign a higher similarity score to these phrase-word pairs. Nevertheless, the process also showed that deeper analysis is still needed to weigh and separate the context of different pairs of textual inputs.
In general, this proposal opens a new area of study for analyzing the abstract relationship of meaning between sentences and particular words, and how an ordered series of words can be detected as similar to a single term. This marks a different direction of text analysis from the one currently proposed and researched in most of the Natural Language Processing community. | es_MX |
dc.format.medium | Text | es_MX |
dc.language.iso | eng | es_MX |
dc.publisher | Instituto Tecnológico y de Estudios Superiores de Monterrey | es_MX |
dc.relation.isFormatOf | published version | es_MX |
dc.relation.isreferencedby | REPOSITORIO NACIONAL CONACYT | |
dc.rights | openAccess | es_MX |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0 | es_MX |
dc.subject.classification | PHYSICAL-MATHEMATICAL SCIENCES AND EARTH SCIENCES::MATHEMATICS::COMPUTER SCIENCE::ARTIFICIAL INTELLIGENCE | es_MX |
dc.subject.lcsh | Technology | es_MX |
dc.title | Analysis and use of textual definitions through a transformer neural network model and natural language processing | es_MX |
dc.type | Doctoral Thesis | es_MX |
dc.contributor.department | School of Engineering and Sciences | es_MX |
dc.contributor.committeemember | McDaniel, Troy | |
dc.contributor.committeemember | Balderas Silva, David Christopher | |
dc.contributor.committeemember | Rojas Hernández, Mario | |
dc.contributor.mentor | López Caudana, Edgar Omar | |
dc.identifier.orcid | https://orcid.org/0000-0002-2358-2002 | es_MX |
dc.subject.keyword | Natural Language Processing | es_MX |
dc.subject.keyword | Machine Learning | es_MX |
dc.subject.keyword | Transformer | es_MX |
dc.subject.keyword | Semantic Analysis | es_MX |
dc.contributor.institution | Campus Ciudad de México | es_MX |
dc.contributor.cataloger | puemcuervo | es_MX |
dc.description.degree | Doctor of Philosophy in Engineering Science | es_MX |
dc.identifier.cvu | 852898 | es_MX |
dc.date.accepted | 2021-12-06 | |
dc.audience.educationlevel | General public | es_MX |
dc.identificator | 1||12||1203||120304 | es_MX |