12th International Symposium on Intelligent Distributed Computing
IDC 2018
15-17 October 2018, Bilbao, Spain

Latest news:

October 14th, 2018:
Slides for the opening talk available.

October 13th, 2018:
Social Program available (further details to be given during the conference opening)

October 11th, 2018:
New invited speaker: Josu Ceberio, replacing Prof. Herrera, who is unable to attend due to illness and sends his apologies

September 12th, 2018:
Early registration fees will be maintained until the conference takes place

September 4th, 2018:
Program is available

June 28th, 2018:
Registration is open

June 4th, 2018:
Updated notification deadline: June 5th, 2018

May 1st, 2018:
Submission deadline extended: May 15th, 2018 (no more extensions will be granted)

April 9, 2018:
Submission deadline extended: May 1st, 2018.

March 26, 2018:
New tutorial: ANDROPYTOOL.

March 6, 2018:
Confirmed Special Issue on Applied Soft Computing.

March 6, 2018:
New invited speaker: Albert Bifet.

January 19, 2018:
New Accepted Workshop: ML-PdM.

December 13, 2017:
New tutorial: JMETALSP.

December 13, 2017:
New tutorial: KMBD.

December 7, 2017:
New invited speakers: Francisco Herrera and Eleni I. Vlahogianni.

December 5, 2017:
New Accepted Workshop: COMPSUS.

November 15, 2017:
New Accepted Workshop: INDILOG.

November 6, 2017:
Confirmed Special Issue on Future Generation Computer Systems.

October 30, 2017:
Definitive conference dates published.

October 25, 2017:
Tentative conference dates published.

October 20, 2017:
First CFP published.

October 19, 2017:
Invited speakers: Jose A. Lozano and David Camacho.

October 18, 2017:
IDC 2018 web site was launched.

TUTORIAL: The K-means algorithm on Big Data domains (KMBD)

By Marco Capó, Aritz Pérez and Jose A. Lozano

Contents of the tutorial:

Cluster analysis is one of the most commonly used tasks in data analysis. Among its different techniques, the K-means algorithm [1] stands out as the most popular one [2–4] because of its ease of implementation, straightforward parallelizability and relatively low computational cost. However, due to the exponential increase in the data volumes that scientists from different backgrounds face on a daily basis, improving the scalability of the K-means algorithm without affecting the quality of its results has gained special attention [5]. In this tutorial, we review the state of the art in this topic and present different practical and theoretical results that we have recently produced.

[1] S. Lloyd, “Least squares quantization in PCM,” IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
[2] P. Berkhin, “A survey of clustering data mining techniques,” in Grouping Multidimensional Data, pp. 25–71, 2006.
[3] A. K. Jain, “Data clustering: 50 years beyond k-means,” Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
[4] X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, et al., “Top 10 algorithms in data mining,” Knowledge and Information Systems, vol. 14, no. 1, pp. 1–37, 2008.
[5] M. Jordan et al. (Committee on the Analysis of Massive Data, Committee on Applied and Theoretical Statistics, Board on Mathematical Sciences and Their Applications, Division on Engineering and Physical Sciences, National Research Council), Frontiers in Massive Data Analysis, 2013.
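
As a point of reference for the algorithm this tutorial is about, the basic K-means iteration [1] can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the presenters' implementation or any of the scalable variants covered in the tutorial. The function names `kmeans` and `sq_dist`, the random initialisation and the stopping rule are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Basic K-means on a list of equal-length numeric tuples (illustrative only)."""
    rng = random.Random(seed)
    # Assumption: initialise the centroids with k points drawn at random.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids would not move
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centres, labels = kmeans(data, k=2)
    print(centres, labels)
```

Each pass computes the distance from every point to every centroid, so its cost grows with the number of points times the number of clusters. The assignment step treats each point independently, which is what makes it easy to run in parallel.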

Intended audience:

Open to all audiences interested in unsupervised learning on Big Data applications.

Tutorial format:

Mainly practical, although some theoretical results may be presented and illustrated with examples.

About the presenters:

Marco Capó received the BSc degree in Applied Mathematics from Universidad Simón Bolívar, Venezuela, and an MSc in Mathematical Modelling Engineering from Universität Hamburg, Germany. He is currently a PhD candidate at the Basque Center for Applied Mathematics. His research interests are in machine learning and optimization, with a particular focus on unsupervised learning problems.

Aritz Pérez received his PhD degree in 2010 from the University of the Basque Country, Department of Computer Science and Artificial Intelligence. He is currently a postdoctoral researcher at the Basque Center for Applied Mathematics. His current scientific interests include supervised, unsupervised and weak classification, probabilistic graphical models, model selection and evaluation, time series and crowd learning.

Jose A. Lozano graduated in Mathematics (1991) and Computer Science (1992) at the University of the Basque Country. In 1998 he received his PhD degree from the University of the Basque Country. He became a full professor at the Department of Computer Science and Artificial Intelligence in 2008. Since 2005 he has led the Intelligent Systems Group (ISG), based in the Computer Science School. His research areas are evolutionary computation, machine learning and probabilistic graphical models, and their application to solving real problems in biomedicine, industry and finance. He has published 4 books, more than 100 scientific ISI journal articles and about 150 contributions to national and international conferences. These publications have received more than 8600 citations. Prof. Lozano is an associate editor of IEEE Trans. on Evolutionary Computation and IEEE Trans. on Neural Networks and Learning Systems, among other prestigious journals.