
dc.contributor.advisor: Treviño Alvarado, Víctor Manuel
dc.contributor.author: Garza Hernández, Débora
dc.creator: GARZA HERNANDEZ, DEBORA; 501329
dc.date.accessioned: 2022-06-21T20:08:17Z
dc.date.available: 2022-06-21T20:08:17Z
dc.date.issued: 2019-08
dc.identifier.citation: Garza Hernández, D. (2022). Feature selection from biological bigdata: identification of significant associations applying multivariate machine learning algorithms to genome-wide association studies (GWAS) (Doctoral thesis). Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/648490 [es_MX]
dc.identifier.doi: https://doi.org/10.1016/j.compbiomed.2022.105398
dc.identifier.uri: https://hdl.handle.net/11285/648490
dc.description: https://orcid.org/0000-0002-7472-9844 [es_MX]
dc.description.abstract: Crohn's Disease (CD) is a type of Inflammatory Bowel Disease (IBD) that affects the gastrointestinal tract and presents with diverse symptoms. To date, Genome-Wide Association Studies (GWAS) have discovered over 140 genetic loci associated with CD. Standard univariate GWAS methods have enabled the discovery of minor effects from common variants. However, these methods assume independence among variants, which can lead to missing subtle combinatorial signals. Given the importance of CD, multivariate approaches can help elucidate the etiology of the disease and facilitate the identification of novel associations. However, current univariate-based and multivariate CD models span a broad performance spectrum and have been assessed on different datasets under diverse methodological settings. In this work, several multivariate methods and models (LASSO, XGBoost, Random Forest, BSWiMS, and LDpred) were compared under a strict sub-sampling and cross-validation approach to predict CD risk in a GWAS dataset (de Lange et al. 2017). The predictions were explored to assess whether the generated models could provide additional information about variants and genes associated with CD. Additionally, the effect of common strategies for increasing and decreasing the number of SNP markers (genotype imputation and LD-clumping, respectively) was assessed. The LDpred model without imputation appears to be the best of the tested models for predicting Crohn's disease risk in this dataset (AUROC = 0.667 ± 0.024). The best models were validated in a second dataset (NIDDK IBD Genetics), where LDpred was also the best method, with similar performance (AUROC = 0.634 ± 0.009). Finally, based on the importance of the variants yielded by the multivariate models, a previously unnoticed region was identified on chromosome 6 (SNP rs4945943, close to the gene MARCKS) that appears to contribute to CD risk. [es_MX]
dc.format.medium: Text [es_MX]
dc.language.iso: eng [es_MX]
dc.publisher: Instituto Tecnológico y de Estudios Superiores de Monterrey [es_MX]
dc.relation: CONACYT [es_MX]
dc.relation.isFormatOf: draft [es_MX]
dc.relation.isreferencedby: REPOSITORIO NACIONAL CONACYT
dc.relation.url: https://www.sciencedirect.com/science/article/abs/pii/S0010482522001901 [es_MX]
dc.rights: openAccess [es_MX]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0 [es_MX]
dc.subject.classification: ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY::CAUSAL MODELS [es_MX]
dc.subject.lcsh: Science [es_MX]
dc.title: Feature selection from biological bigdata: identification of significant associations applying multivariate machine learning algorithms to genome-wide association studies (GWAS) [es_MX]
dc.type: Doctoral thesis [es_MX]
dc.contributor.department: Escuela de Ingeniería y Ciencias [es_MX]
dc.contributor.committeemember: Brunck, Marion Emilie Genevieve
dc.contributor.committeemember: Trejo Rodríguez, Luis Angel
dc.contributor.mentor: Estrada, Karol
dc.identifier.orcid: https://orcid.org/0000-0002-0005-4223 [es_MX]
dc.subject.keyword: Crohn's disease [es_MX]
dc.subject.keyword: GWAS [es_MX]
dc.subject.keyword: Multivariate analysis [es_MX]
dc.subject.keyword: SNP [es_MX]
dc.contributor.institution: Campus Monterrey [es_MX]
dc.contributor.cataloger: puemcuervo, emipsanchez [es_MX]
dc.description.degree: Doctorate in Computer Science (Doctorado en Ciencias Computacionales) [es_MX]
dc.identifier.cvu: 501329 [es_MX]
dc.date.accepted: 2022-05-01
dc.audience.educationlevel: Researchers [es_MX]
dc.identifier.scopusid: 57217171283 [es_MX]
dc.identificator: 7||33||3304||120307 [es_MX]



Except where otherwise noted, this item's license is described as openAccess