Técnicas avanzadas de selección de atributos para clasificación: análisis y estudio empírico (Advanced feature selection techniques for classification: analysis and empirical study)
Abstract
Master's thesis for the Máster Universitario en Economía, Finanzas y Computación (2024-25). Supervisor: Dr. Antonio Javier Tallón Ballesteros. This study evaluates the effectiveness of advanced feature selection techniques in machine learning, with a particular focus on classification problems. In the Big Data era, where vast amounts of information are generated, selecting the most relevant features is crucial to building efficient classification models and avoiding overfitting. The research covers five techniques: Probability-based Particle Swarm Optimization (PSO), Golden Fish Search (GFS), Lasso, Ridge, and Elastic Net, and evaluates their capacity to improve model performance through dimensionality reduction. The techniques are applied to high-dimensional datasets, and their impact on accuracy, recall, F1-score, AUC-ROC, and execution time is analyzed. The findings help determine best practices for feature selection in complex environments and yield recommendations for applying these techniques across classification domains, highlighting their ability to improve model efficiency and accuracy in scenarios where traditional methods prove ineffective.
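As a rough illustration of how the three penalty-based techniques named in the abstract differ as feature selectors, the sketch below applies their closed-form one-coefficient solutions. It is not taken from the thesis: it assumes a squared-error loss with an orthonormal design (a textbook simplification, not the classification setting studied), and the coefficient values, the penalty strength `lam`, and all function names are hypothetical.

```python
# Minimal sketch (assumption: squared-error loss, orthonormal design), where each
# coefficient b minimizes 0.5 * (z - b)**2 + lam1 * abs(b) + 0.5 * lam2 * b**2
# and z is the unpenalized (least-squares) coefficient.

def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coef(z: float, lam: float) -> float:
    """L1 penalty only (lam2 = 0): can set a coefficient exactly to zero."""
    return soft_threshold(z, lam)

def ridge_coef(z: float, lam: float) -> float:
    """L2 penalty only (lam1 = 0): rescales the coefficient, never zeroes a nonzero one."""
    return z / (1.0 + lam)

def elastic_net_coef(z: float, lam1: float, lam2: float) -> float:
    """Both penalties: soft-threshold first, then apply the ridge-style rescaling."""
    return soft_threshold(z, lam1) / (1.0 + lam2)

def selected(coefs: list[float]) -> list[int]:
    """Indices of features whose penalized coefficient is nonzero."""
    return [j for j, b in enumerate(coefs) if b != 0.0]

# Hypothetical unpenalized coefficients for five features.
ols = [2.5, -0.3, 0.1, -1.2, 0.05]
lam = 0.5  # hypothetical penalty strength

lasso = [lasso_coef(z, lam) for z in ols]
ridge = [ridge_coef(z, lam) for z in ols]
enet = [elastic_net_coef(z, lam, lam) for z in ols]

print("lasso keeps:", selected(lasso))
print("ridge keeps:", selected(ridge))
print("elastic net keeps:", selected(enet))
```

Under these assumptions, the L1 term is what produces exact zeros, so Lasso and Elastic Net discard weak features directly, while Ridge only shrinks coefficients and would need an additional ranking or threshold step to act as a selector.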