
dc.contributor.author: Capo, M.
dc.contributor.author: Pérez, A.
dc.contributor.author: Lozano, J.A.
dc.date.accessioned: 2018-05-28T20:44:24Z
dc.date.available: 2018-05-28T20:44:24Z
dc.date.issued: 2017-02-01
dc.identifier.issn: 0950-7051
dc.identifier.uri: http://hdl.handle.net/20.500.11824/797
dc.description.abstract: Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. In spite of its dependency on the initial settings and the large number of distance computations that it can require to converge, the K-means algorithm remains one of the most popular clustering methods for massive datasets. In this work, we propose an efficient approximation to the K-means problem intended for massive data. Our approach recursively partitions the entire dataset into a small number of subsets, each of which is characterized by its representative (center of mass) and weight (cardinality). Afterwards, a weighted version of the K-means algorithm is applied over this local representation, which can drastically reduce the number of distances computed. In addition to some theoretical properties, experimental results indicate that our method outperforms well-known approaches, such as the K-means++ and the minibatch K-means, in terms of the relation between the number of distance computations and the quality of the approximation. [en_US]
dc.description.sponsorship: MINECO (TIN2013-41272P), Spanish Ministry of Economy and Competitiveness [en_US]
dc.format: application/pdf [en_US]
dc.language.iso: eng [en_US]
dc.rights: Reconocimiento-NoComercial-CompartirIgual 3.0 España [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/es/ [en_US]
dc.subject: K-means [en_US]
dc.subject: clustering [en_US]
dc.subject: K-means++ [en_US]
dc.subject: minibatch K-means [en_US]
dc.title: An efficient approximation to the K-means clustering for Massive Data [en_US]
dc.type: info:eu-repo/semantics/article [en_US]
dc.identifier.doi: 10.1016/j.knosys.2016.06.031
dc.relation.publisherversion: https://www.sciencedirect.com/science/article/pii/S0950705116302027 [en_US]
dc.relation.projectID: ES/1PE/SEV-2013-0323 [en_US]
dc.relation.projectID: EUS/ELKARTEK [en_US]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [en_US]
dc.type.hasVersion: info:eu-repo/semantics/publishedVersion [en_US]
dc.journal.title: Knowledge-Based Systems [en_US]
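
The idea summarized in the abstract (replace the dataset with a small set of weighted representatives, then run a weighted K-means over them) can be illustrated with a minimal sketch. This is not the authors' implementation: a single-level regular grid stands in for the recursive partition, the initial centroids are picked at random, the data are synthetic, and the names `grid_representatives` and `weighted_kmeans` are hypothetical.

```python
import random
from collections import defaultdict


def grid_representatives(points, cells_per_axis):
    """Partition points with a regular grid (an assumption, not the paper's
    recursive scheme) and return (center of mass, weight) per non-empty cell."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    buckets = defaultdict(list)
    for p in points:
        key = []
        for d in range(dims):
            span = (highs[d] - lows[d]) or 1.0  # avoid division by zero
            idx = int((p[d] - lows[d]) / span * cells_per_axis)
            key.append(min(idx, cells_per_axis - 1))  # clamp the upper edge
        buckets[tuple(key)].append(p)
    reps = []
    for members in buckets.values():
        n = len(members)
        center = tuple(sum(m[d] for m in members) / n for d in range(dims))
        reps.append((center, n))  # weight = cardinality of the subset
    return reps


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(reps, k, iterations=20, seed=0):
    """Lloyd-style iterations over weighted representatives."""
    rng = random.Random(seed)
    centroids = [center for center, _ in rng.sample(reps, k)]
    dims = len(centroids[0])
    for _ in range(iterations):
        sums = [[0.0] * dims for _ in range(k)]
        totals = [0.0] * k
        for center, weight in reps:
            # Assign each representative to its nearest centroid.
            j = min(range(k), key=lambda i: sq_dist(center, centroids[i]))
            totals[j] += weight
            for d in range(dims):
                sums[j][d] += weight * center[d]
        # New centroid = weighted mean; keep the old one if a cluster is empty.
        centroids = [
            tuple(s / totals[j] for s in sums[j]) if totals[j] > 0 else centroids[j]
            for j in range(k)
        ]
    return centroids


# Synthetic two-blob dataset, for illustration only.
data_rng = random.Random(42)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(500)]
points += [(data_rng.gauss(8, 1), data_rng.gauss(8, 1)) for _ in range(500)]

reps = grid_representatives(points, cells_per_axis=8)
centroids = weighted_kmeans(reps, k=2)
```

Each iteration here computes distances from at most 64 representatives to the centroids rather than from all 1000 points, which is the source of the savings the abstract refers to.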



Except where otherwise noted, this item's license is described as Reconocimiento-NoComercial-CompartirIgual 3.0 España (Attribution-NonCommercial-ShareAlike 3.0 Spain).