Data Science (DS)
http://hdl.handle.net/20.500.11824/9
Sat, 01 Oct 2022 14:09:12 GMT2022-10-01T14:09:12ZA General Framework Based on Walsh Decomposition for Combinatorial Optimization Problems
http://hdl.handle.net/20.500.11824/1521
A General Framework Based on Walsh Decomposition for Combinatorial Optimization Problems
Unanue, I.; Merino, M.; Lozano, J.A.
In this paper we pursue the use of the Fourier transform for a general analysis of combinatorial optimization problems. While combinatorial optimization problems are defined by means of different notions like weights in a graph, set of numbers, distance between cities, etc., the Fourier transform allows to put all of them in the same framework, the Fourier coefficients. This permits its comparison looking for similarities and differences. Particularly, the Walsh transform has recently been used over pseudo-boolean functions in order to design new surrogate models in the black-box scenario and to generate new algorithms for the linkage discovery problem, among others, presenting very promising results. In this paper we focus on binary problems and the Walsh transform. After presenting the Walsh decomposition and some main properties, we compute the transform of the Unconstrained Binary Quadratic Problem and several particular cases of this problem such as the Max-Cut Problem and the Number Partitioning Problem. The obtained Walsh coefficients not only reinforce the similarities and differences among the problems which are known in the literature, but given a set of Walsh coefficients we can say whether or not they are produced by any of the problems analyzed. Finally, a geometrical interpretation of the space of Walsh coefficients with maximum order 2 and the subspace of each analyzed problem is presented.
Fri, 01 Jan 2021 00:00:00 GMThttp://hdl.handle.net/20.500.11824/15212021-01-01T00:00:00ZA mathematical analysis of EDAs with distance-based exponential models
http://hdl.handle.net/20.500.11824/1520
A mathematical analysis of EDAs with distance-based exponential models
Unanue, I.; Merino, M.; Lozano, J.A.
Estimation of Distribution Algorithms have been successfully used to solve permutation-based Combinatorial Optimization Problems. In this case, the algorithms use probabilistic models specifically designed for codifying probability distributions over permutation spaces. One class of these probability models are distance-based exponential models, and one example of this class is the Mallows model. In spite of its practical success, the theoretical analysis of Estimation of Distribution Algorithms for permutation-based Combinatorial Optimization Problems has not been developed as extensively as it has been for binary problems. With this motivation, this paper presents a first mathematical analysis of the convergence behavior of Estimation of Distribution Algorithms based on Mallows models. The model removes the randomness of the algorithm in order to associate a dynamical system to it. Several scenarios of increasing complexity with different fitness functions and initial probability distributions are analyzed. The obtained results show: a) the strong dependence of the final results on the initial population, and b) the possibility to converge to non-degenerate distributions even in very simple scenarios, which has not been reported before in the literature.
Thu, 01 Sep 2022 00:00:00 GMThttp://hdl.handle.net/20.500.11824/15202022-09-01T00:00:00ZAd-Hoc Explanation for Time Series Classification
http://hdl.handle.net/20.500.11824/1517
Ad-Hoc Explanation for Time Series Classification
Abanda, A.; Mori, U.; Lozano, J.A.
In this work, a perturbation-based model-agnostic explanation method for time series classification is presented. One of the main novelties of the proposed method is that the considered perturbations are interpretable and specific for time series. In real-world time series, variations in the speed or the scale of a particular action, for instance, may determine the class, so modifying this type of characteristic leads to ad-hoc explanations for time series. To this end, four perturbations or transformations are proposed: warp, scale, noise, and slice. Given a transformation, an interval of a series is considered relevant for the prediction of a classifier if a transformation in this interval changes the prediction. Another novelty is that the method provides a two-level explanation: a high-level explanation, where the robustness of the prediction with respect to a particular transformation is measured, and a low-level explanation, where the relevance of each region of the time series in the prediction is visualized. In order to analyze and validate our proposal, first some illustrative examples are provided, and then a thorough quantitative evaluation is carried out using a specifically designed evaluation procedure.
Sat, 01 Jan 2022 00:00:00 GMThttp://hdl.handle.net/20.500.11824/15172022-01-01T00:00:00ZLearning a Battery of COVID-19 Mortality Prediction Models by Multi-objective Optimization
http://hdl.handle.net/20.500.11824/1515
Learning a Battery of COVID-19 Mortality Prediction Models by Multi-objective Optimization
Martinez, M.; García-Gutierrez, S.; Armañanzas, R.; Díaz, A.; Inza, I.; Lozano, J.A.
The COVID-19 pandemic is continuously evolving with drastically changing epidemiological situations which are approached with different decisions: from the reduction of fatalities to even the selection of patients with the highest probability of survival in critical clinical situations. Motivated by this, a battery of mortality prediction models with different performances has been developed to assist physicians and hospital managers. Logistic regression, one of the most popular classifiers within the clinical field, has been chosen as the basis for the generation of our models. Whilst a standard logistic regression only learns a single model focusing on improving accuracy, we propose to extend the possibilities of logistic regression by focusing on sensitivity and specificity. Hence, the log-likelihood function, used to calculate the coefficients in the logistic model, is split into two objective functions: one representing the survivors and the other for the deceased class. A multi-objective optimization process is undertaken on both functions in order to find the Pareto set, composed of models not improved by another model in both objective functions simultaneously. The individual optimization of either sensitivity (deceased patients) or specificity (survivors) criteria may be conflicting objectives because the improvement of one can imply the worsening of the other. Nonetheless, this conflict guarantees the output of a battery of diverse prediction models. Furthermore, a specific methodology for the evaluation of the Pareto models is proposed. As a result, a battery of COVID-19 mortality prediction models is obtained to assist physicians in decision-making for specific epidemiological situations.
Sat, 09 Jul 2022 00:00:00 GMThttp://hdl.handle.net/20.500.11824/15152022-07-09T00:00:00Z