Data Science (DS)
Recent Submissions

Revisiting Implicit and Explicit Averaging for Noisy Optimization
(2023-10-01) Explicit and implicit averaging are two well-known strategies for noisy optimization. Both strategies can counteract the disruptive effect of noise; however, a critical question remains: which one is more efficient? This ... 
Discretization-Based Feature Selection as a Bilevel Optimization Problem
(2023-08-01) Discretization-based feature selection (DBFS) approaches have shown interesting results when using several metaheuristic algorithms, such as particle swarm optimization (PSO), genetic algorithm (GA), ant colony optimization ... 
Double-Weighting for Covariate Shift Adaptation
(2023-07) Supervised learning is often affected by a covariate shift in which the marginal distributions of instances (covariates $x$) of training and testing samples $p_\text{tr}(x)$ and $p_\text{te}(x)$ are different but the label ... 
Challenging test problems for multi- and many-objective optimization
(2023-08-01) In spite of the extensive studies that have been conducted regarding the construction of multi-objective test problems, researchers have mainly focused their interests on designing complicated search spaces, disregarding, ... 
Time consistent expected mean-variance in multistage stochastic quadratic optimization: a model and a matheuristic
(2019) In this paper, we present a multistage time consistent Expected Conditional Risk Measure for minimizing a linear combination of the expected mean and the expected variance, so-called Expected Mean-Variance. The model is ... 
The Natural Bias of Artificial Instances
(2023) Many exact and metaheuristic algorithms presented in the literature are tested by comparing their performance in different sets of instances. However, it is known that when these sets of instances are generated randomly, ... 
Statistical Modelling for Recurrent Events in Sports Injury Research with Applications to Football Injury Data
(2023) Sports injuries stand as undesirable side effects of athletic participation, carrying serious consequences for athletes' health, their professional careers, and overall team performance. With the growing availability of ... 
On the Use of Second Order Neighbors to Escape from Local Optima
(2023-07-12) Designing efficient local search based algorithms requires considering the specific properties of the problems. We introduce a simple and efficient strategy, the Extended Reach, that escapes from local optima obtained ... 
Minimum-Fuel Low-Thrust Trajectory Optimization Via a Direct Adaptive Evolutionary Approach
(2023-11-28) Space missions with low-thrust propulsion systems are of appreciable interest to space agencies because of their practicality due to higher specific impulses. This research proposes a technique for the solution of minimum-fuel ... 
Adaptive Estimation of Distribution Algorithms for Low-Thrust Trajectory Optimization
(2023-08-02) A direct adaptive scheme is presented as an alternative approach for minimum-fuel low-thrust trajectory design in noncoplanar orbit transfers, utilizing fitness landscape analysis (FLA). Spacecraft dynamics is modeled ... 
Robust Estimation of Distribution Algorithms via Fitness Landscape Analysis for Optimal Low-Thrust Orbital Maneuvers
(2023-09) One particular kind of evolutionary algorithm, known as Estimation of Distribution Algorithms (EDAs), has gained the attention of the aerospace industry for its ability to solve nonlinear and complicated problems, particularly ... 
Learning a logistic regression with the help of unknown features at prediction stage
(2023) The use of features available at training time, but not at prediction time, as additional information for training models is known as the learning using privileged information paradigm. In this paper, the handling of ... 
Spatio‑temporal modelling of high‑throughput phenotyping data
(2023-10-13) High throughput phenotyping (HTP) platforms and devices are increasingly used to characterise growth and developmental processes for large sets of plant genotypes. This dissertation is motivated by the need to accurately ... 
Derivative curve estimation in longitudinal studies using P-splines
(2023-09-18) The estimation of curve derivatives is of interest in many disciplines. It allows the extraction of important characteristics to gain insight about the underlying process. In the context of longitudinal data, the derivative ... 
A revisited branch-and-cut algorithm for large-scale orienteering problems
(2024-02-16) The orienteering problem is a route optimization problem which consists of finding a simple cycle that maximizes the total collected profit subject to a maximum distance limitation. In the last few decades, the occurrence ... 
A kernel-enriched order-dependent nonparametric spatio-temporal process
(2023) Spatio-temporal processes are necessary modeling tools for various environmental, biological, and geographical problems. The underlying model is commonly considered to be parametric and to be a Gaussian process. Additionally, ... 
Female Models in AI and the Fight Against COVID-19
(2022-11-01) Gender imbalance has persisted over time and is well documented in science, technology, engineering and mathematics (STEM) and singularly in artificial intelligence (AI). In this article we emphasize the importance of ... 
Efficient Learning of Minimax Risk Classifiers in High Dimensions
(2023-08-01) High-dimensional data is common in multiple areas, such as health care and genomics, where the number of features can be tens of thousands. In such scenarios, the large number of features often leads to inefficient ... 
Selecting the number of categories of the lymph node ratio in cancer research: A bootstrap-based hypothesis test
(2021) The high impact of the lymph node ratio as a prognostic factor is widely established in colorectal cancer, and is being used as a categorized predictor variable in several studies. However, the cutoff points as well as ...