Computational estimation of system-level gene coexpression across human tissues
Cortés Guzmán, Miguel Ángel
Large-scale gene coexpression projects have been a valuable resource for researchers in bioinformatics, molecular biology and the biomedical sciences, as they support the formulation of hypotheses about gene functions and interactions and the prioritization of genes in experimental designs. Such projects, however, contain results calculated from all sorts of samples, including healthy, diseased and experimental-condition specimens, and many of them are not based on sequencing technologies. Understanding normality in the context of human gene coexpression is pivotal: it helps uncover new functional associations for both known and previously unknown genes, and it serves as a point of comparison when studying disease states. Tools other than the Pearson Correlation Coefficient have not traditionally been explored for large-scale coexpression, so more complex non-linear associations between genes may go undetected. In this computer science master's thesis, a system-level coexpression estimation across a variety of normal human tissues is proposed. The objective is not only to address the current areas of opportunity in the large-scale coexpression research domain, but also to provide the scientific community with a novel and useful resource of system-level human coexpression data. The results comprise the first large-scale coexpression estimation in the literature that exclusively considers normal samples profiled with sequencing technologies, in combination with three distinct coexpression metrics: the Pearson Correlation Coefficient, the Spearman Rank Correlation Coefficient and the highly interpretable Chi-square test of independence.
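As a rough illustration of how the three metrics named in the abstract differ, the following Python sketch computes each of them for a single pair of genes. It is not taken from the thesis: the function names, the toy expression values and, in particular, the median-based binarization used to build the chi-square contingency table are assumptions made only for this example.

```python
import math
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def chi_square_independence(x, y):
    """Chi-square test of independence on a 2x2 table.

    ASSUMPTION: each gene is binarized as above/not above its own median.
    The discretization actually used in the thesis is not stated here.
    Returns (statistic, p_value) with 1 degree of freedom and no
    continuity correction.
    """
    high_x = [v > median(x) for v in x]
    high_y = [v > median(y) for v in y]
    n = len(x)
    statistic = 0.0
    for state_x in (False, True):
        for state_y in (False, True):
            observed = sum(
                1 for a, b in zip(high_x, high_y)
                if a == state_x and b == state_y
            )
            expected = high_x.count(state_x) * high_y.count(state_y) / n
            if expected > 0:  # skip empty margins to avoid division by zero
                statistic += (observed - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical expression values across eight samples; gene_b = gene_a ** 2.
gene_a = [1, 2, 3, 4, 5, 6, 7, 8]
gene_b = [v ** 2 for v in gene_a]

print("Pearson: ", pearson(gene_a, gene_b))
print("Spearman:", spearman(gene_a, gene_b))
print("Chi-square (statistic, p-value):", chi_square_independence(gene_a, gene_b))
```

On this toy pair the relationship is monotone but not linear, so the Spearman coefficient is exactly 1 while the Pearson coefficient falls below 1. The chi-square statistic depends entirely on the chosen discretization, which is why the binarization above should be read as a placeholder rather than as the procedure used in the thesis.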