A Novel Grouping Harmony Search Algorithm for Clustering Problems
The problem of partitioning a data set into disjoint groups or clusters of related items plays a key role in data analytics, in particular when information retrieval becomes crucial for further data analysis. In this context, clustering approaches aim at obtaining a good partition of the data based on multiple criteria. One of the most challenging aspects of clustering techniques is the inference of the optimal number of clusters. In this regard, a number of clustering methods from the literature assume that the number of clusters is known a priori and subsequently assign instances to clusters based on distance, density or any other criterion. This paper proposes to override any prior assumption on the number of clusters or groups in the data at hand by hybridizing the grouping encoding strategy and the Harmony Search (HS) algorithm. The resulting hybrid approach optimally infers the number of clusters by means of the tailored design of the HS operators, which estimates this important structural clustering parameter as an implicit byproduct of the instance-to-cluster mapping performed by the algorithm. Apart from inferring the optimal number of clusters, simulation results verify that the proposed scheme achieves a better performance than other naïve clustering techniques in synthetic scenarios and widely known data repositories.
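The idea that the number of clusters falls out of the instance-to-cluster mapping can be illustrated with a minimal sketch. The abstract does not specify the tailored HS operators, so the code below is not the paper's method: it is a plain Harmony Search over label vectors, with the group part of the encoding derived from the assignment part. The silhouette fitness and the names `k_max`, `hms`, `hmcr`, `par` and `iters` are assumptions chosen for illustration.

```python
import math
import random


def group_part(labels):
    """Group part of the encoding: the distinct cluster labels in use."""
    return sorted(set(labels))


def relabel(labels):
    """Renumber labels as 0..k-1 in order of first appearance."""
    mapping = {}
    return [mapping.setdefault(lab, len(mapping)) for lab in labels]


def silhouette(points, labels):
    """Mean silhouette coefficient (assumed fitness; higher is better)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treated as worst case
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def harmony_search(points, k_max=6, hms=10, hmcr=0.95, par=0.05, iters=2000, seed=0):
    """Generic HS: k_max only bounds the label range, it does not fix k."""
    rng = random.Random(seed)
    n = len(points)
    memory = []
    for _ in range(hms):
        labels = relabel([rng.randrange(k_max) for _ in range(n)])
        memory.append((silhouette(points, labels), labels))
    for _ in range(iters):
        new = []
        for i in range(n):
            if rng.random() < hmcr:
                lab = rng.choice(memory)[1][i]  # memory consideration
                if rng.random() < par:
                    lab = rng.randrange(k_max)  # pitch adjustment: reassign the instance
            else:
                lab = rng.randrange(k_max)  # random consideration
            new.append(lab)
        new = relabel(new)
        score = silhouette(points, new)
        worst = min(range(hms), key=lambda m: memory[m][0])
        if score > memory[worst][0]:
            memory[worst] = (score, new)  # replace the worst harmony
    return max(memory, key=lambda h: h[0])


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    best_score, best_labels = harmony_search(pts)
    print(best_labels, len(group_part(best_labels)), round(best_score, 3))
```

In this sketch the number of clusters is never passed to the search as a target: it is read off afterwards as `len(group_part(best_labels))`. A grouping-aware design would presumably also operate on the group part directly, which this simplified version does not attempt.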