BIRD, BCAM's Institutional Repository Data
https://bird.bcamath.org:443
The BIRD digital repository system captures, stores, indexes, preserves, and distributes digital research material.
2024-03-18T21:32:03Z
Characterization of rankings generated by pseudo-Boolean functions
http://hdl.handle.net/20.500.11824/1786
Characterization of rankings generated by pseudo-Boolean functions
Unanue, I.; Merino, M.; Lozano, J.A.
In this paper we pursue the study of pseudo-Boolean functions as ranking generators. The objective of the work is to find new insights into the relation between the degree of a pseudo-Boolean function and the rankings that these functions can generate. Based on a characterization theorem for pseudo-Boolean functions of bounded degree, several observations are made. First, we verify that pseudo-Boolean functions whose degree is lower than the search space dimension cannot generate all the possible rankings of the solutions. Secondly, a sufficient condition for a ranking to be generated by a pseudo-Boolean function of bounded degree is presented, and the corresponding necessary condition is conjectured. Finally, we observe that the same argument is not sufficient to prove which rankings can be generated by pseudo-Boolean functions of higher degree.
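As a toy illustration of the setting described in the abstract (not taken from the paper): a pseudo-Boolean function f: {0,1}^n → R induces a ranking of the 2^n solutions by sorting them according to their objective values. The particular function, its coefficients, and the choice n = 3 below are all illustrative assumptions.

```python
from itertools import product

def pseudo_boolean(x):
    # A toy degree-2 pseudo-Boolean function on {0,1}^3
    # (the coefficients are arbitrary, for illustration only):
    # f(x) = 2*x0 + x1 - 3*x0*x2
    return 2 * x[0] + x[1] - 3 * x[0] * x[2]

def ranking(f, n):
    """Rank all solutions in {0,1}^n by decreasing objective value."""
    solutions = list(product((0, 1), repeat=n))
    return sorted(solutions, key=f, reverse=True)

order = ranking(pseudo_boolean, 3)
# The best solution of this toy function is (1, 1, 0), with f = 3.
```

The question the paper studies is which of the (2^n)! possible orderings can arise this way once the degree of f is restricted.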
2024-01-01T00:00:00Z
Speeding-Up Evolutionary Algorithms to Solve Black-Box Optimization Problems
http://hdl.handle.net/20.500.11824/1785
Speeding-Up Evolutionary Algorithms to Solve Black-Box Optimization Problems
Echevarrieta, J.; Arza, E.; Pérez, A.
Population-based evolutionary algorithms are often considered when approaching computationally expensive black-box optimization problems. They employ a selection mechanism to choose the best solutions from a given population after comparing their objective values, which are then used to generate the next population. This iterative process explores the solution space efficiently, leading to improved solutions over time. However, one of the challenges of these algorithms is that they require a large number of evaluations to provide a quality solution, which might be prohibitive when the evaluation cost is high. In some cases, it is possible to replace the original objective function with a less accurate approximation of lower cost, which introduces a trade-off between the evaluation cost and its accuracy. In this paper, we propose a technique capable of choosing an appropriate approximation cost during the execution of the optimization algorithm. The proposal finds the minimum evaluation cost at which the solutions are still properly ranked, and consequently, more evaluations can be computed in the same amount of time with minimal accuracy loss. An experimental section on four very different problems reveals that the proposed approach can reach the same objective value in less than half of the time in certain cases.
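A minimal sketch of the criterion the abstract describes — finding the cheapest evaluation cost at which the solutions are still properly ranked. The `evaluate(solution, cost)` interface, the candidate cost set, and the use of Kendall-tau distance as the rank-agreement measure are assumptions for illustration; none of these names come from the paper.

```python
def kendall_tau_distance(a, b):
    """Fraction of discordant pairs between two rankings of the same items."""
    n = len(a)
    pos = {item: i for i, item in enumerate(b)}
    discordant = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if pos[a[i]] > pos[a[j]]
    )
    return discordant / (n * (n - 1) / 2)

def cheapest_reliable_cost(solutions, evaluate, costs, tol=0.1):
    """Return the smallest evaluation cost whose ranking stays within
    `tol` Kendall-tau distance of the ranking given by the most
    expensive (reference) evaluation. `evaluate(s, cost)` is a
    hypothetical approximate objective whose accuracy grows with cost."""
    reference = sorted(solutions, key=lambda s: evaluate(s, max(costs)))
    for cost in sorted(costs):
        ranked = sorted(solutions, key=lambda s: evaluate(s, cost))
        if kendall_tau_distance(reference, ranked) <= tol:
            return cost
    return max(costs)
```

Once the cheapest reliable cost is found, every subsequent evaluation in the evolutionary loop can use it, buying more evaluations per unit of time with little loss in selection accuracy.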
2024-01-10T00:00:00Z
Fast Computation of Cluster Validity Measures for Bregman Divergences and Benefits
http://hdl.handle.net/20.500.11824/1784
Fast Computation of Cluster Validity Measures for Bregman Divergences and Benefits
Capó, M.; Pérez, A.; Lozano, J.A.
Partitional clustering is one of the most relevant unsupervised learning and pattern recognition techniques. Unfortunately, one of the main drawbacks of these methodologies is that the number of clusters is generally assumed to be known beforehand, and automating its selection is not straightforward. By the same token, internal validity measures, such as the Silhouette index and the Davies-Bouldin and Caliński-Harabasz measures, have emerged as the standard techniques to be used when comparing the goodness of clustering results obtained via different clustering methods. These measures take into consideration both the inter- and intra-cluster similarities and can be adapted to different metrics. Unfortunately, their use has been hindered by their large computational complexities, which are commonly quadratic with respect to the number of instances of the data set. In this work, we show that the time complexity of computing the most popular internal validity measures can be drastically reduced by making use of the within-cluster errors and different properties of the Bregman divergences. This contribution ultimately allows us to massively speed up the selection of an adequate number of clusters for a given data set, as verified with extensive empirical comparisons.
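To illustrate the kind of speed-up the abstract refers to, for the special case of the squared Euclidean distance (the canonical Bregman divergence): the Caliński-Harabasz index can be computed in O(n·d) from within-cluster and between-cluster errors, with no pairwise distances. This sketch covers only that one measure and metric; the paper's treatment of general Bregman divergences and of the other measures is broader.

```python
import numpy as np

def calinski_harabasz(X, labels):
    """Caliński-Harabasz index in O(n*d), via within-cluster errors
    (squared Euclidean case only; illustrative, not the paper's code)."""
    n, _ = X.shape
    ks = np.unique(labels)
    K = len(ks)
    overall_mean = X.mean(axis=0)
    within = 0.0   # sum of squared errors to each cluster mean
    between = 0.0  # size-weighted squared distances of means to the overall mean
    for k in ks:
        cluster = X[labels == k]
        mean_k = cluster.mean(axis=0)
        within += ((cluster - mean_k) ** 2).sum()
        between += len(cluster) * ((mean_k - overall_mean) ** 2).sum()
    return (between / (K - 1)) / (within / (n - K))
```

Since the index is cheap to evaluate, it can be computed for every candidate number of clusters K, turning the selection of K into an inexpensive post-processing step.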
2023-01-01T00:00:00Z
Fast K-Medoids With the l_1-Norm
http://hdl.handle.net/20.500.11824/1783
Fast K-Medoids With the l_1-Norm
Capó, M.; Pérez, A.; Lozano, J.A.
K-medoids clustering is one of the most popular techniques in exploratory data analysis. The most commonly used algorithms for this problem are quadratic in the number of instances, n, and the quality of the obtained solutions usually depends strongly on their initialization phase. In this work, we propose an algorithm for the K-medoids problem under the l_1-norm/Manhattan distance with a computational complexity of O(n⋅max{log n, K}⋅d), along with theoretical guarantees on the accuracy of the obtained approximation. In addition, we propose a cheap split-merge mechanism that can be used to re-start the proposed algorithm after its convergence to a fixed point. Under some mild assumptions, we prove that such a re-start procedure reduces the error of the given fixed point. The work also includes extensive experimentation in which we compare our method to the most popular approaches for the K-medoids problem: PAM, CLARA and Park's K-medoids. The empirical results show that the proposed algorithm consistently converges to the solutions with the lowest errors, with relative errors up to two orders of magnitude lower than those of the previously mentioned methods, while also requiring the lowest running times among them: up to three orders of magnitude lower.
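A rough alternating sketch of the l_1 structure the abstract exploits — this is not the paper's O(n·max{log n, K}·d) algorithm, only an illustration of why the l_1 norm helps: the coordinate-wise median minimizes the within-cluster l_1 error, so each update reduces to medians plus a snap back to the nearest data instance to keep medoids in the data set.

```python
import numpy as np

def l1_kmedoids(X, K, iters=10, seed=0):
    """Illustrative alternating scheme (not the paper's algorithm):
    assign points to the nearest medoid under the l_1 norm, take the
    coordinate-wise median of each cluster (the l_1 minimizer), and
    snap it to the closest data point so medoids remain instances of X."""
    rng = np.random.default_rng(seed)
    medoids = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(iters):
        # n x K matrix of Manhattan distances to the current medoids
        dists = np.abs(X[:, None, :] - medoids[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(K):
            cluster = X[labels == k]
            if len(cluster) == 0:
                continue
            median = np.median(cluster, axis=0)
            # snap the geometric l_1 minimizer to the nearest instance
            medoids[k] = cluster[np.abs(cluster - median).sum(axis=1).argmin()]
    return medoids, labels
```

The paper's contribution lies in performing this kind of update in O(n·max{log n, K}·d) with accuracy guarantees, together with the split-merge re-start; the naive loop above is only meant to convey the role of the coordinate-wise median.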
2023-07-26T00:00:00Z