Doctoral thesis

A novel functional tree for class imbalance problems

Abstract

Decision trees (DTs) are popular classifiers partly because they provide models that are easy to explain and because they show remarkable performance. To improve the classification performance of individual DTs, researchers have used linear combinations of features in inner nodes (Multivariate Decision Trees), leaf nodes (Model Trees), or both (Functional Trees). Our general objective is to develop a DT using linear feature combinations that outperforms other such DTs in terms of classification performance as measured by the Area Under the ROC Curve (AUC), particularly in class imbalance problems, where one of the classes in the database has few objects compared to another class. We establish that, in terms of classification performance, there exists a hierarchy in which Functional Trees (FTs) surpass Model Trees, which in turn surpass Multivariate Decision Trees. Having shown that Gama's FT, the only FT to date, has the best classification performance, we identify limitations that hinder its classification performance. To improve the classification performance of FTs, we introduce the Functional Tree for class imbalance problems (FT4cip), in which each design decision is made with the aim of improving AUC. The choice of pruning method led us to design the AUC-optimizing Cost-Complexity pruning algorithm, a novel pruning algorithm that does not degrade classification performance in class imbalance problems because it optimizes AUC. We show how each design decision taken when building FT4cip contributes either to classification performance or to simpler tree models. We demonstrate through a set of tests that FT4cip outperforms Gama's FT and excels in class imbalance problems. All our results are supported by a thorough experimental comparison on 110 databases using Bayesian statistical tests.
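
To illustrate why AUC, rather than accuracy, is a reasonable measure in class imbalance problems, the following minimal Python sketch compares the two on a toy problem. It is not taken from the thesis: the function names `auc` and `accuracy`, the 95/5 class split, and the constant-score classifier are all assumptions made for illustration. AUC is computed here through its standard rank interpretation, the probability that a randomly chosen minority-class object receives a higher score than a randomly chosen majority-class object, with ties counted as one half.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Illustrative O(n_pos * n_neg) version, not an optimized implementation.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one object of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(predictions, labels):
    """Fraction of objects whose predicted class equals the true class."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical imbalanced database: 95 majority objects (0), 5 minority objects (1).
labels = [0] * 95 + [1] * 5

# A trivial classifier that gives every object the same score and
# always predicts the majority class.
constant_scores = [0.0] * 100
constant_predictions = [0] * 100

trivial_accuracy = accuracy(constant_predictions, labels)
trivial_auc = auc(constant_scores, labels)
print(trivial_accuracy, trivial_auc)
```

Under these assumptions the trivial classifier reaches high accuracy while its AUC stays at chance level, which is the kind of degradation an AUC-oriented design is meant to avoid.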

Description

https://orcid.org/0000-0002-3465-995X
