Connectivity-Based Clustering for Mixed Discrete and Continuous Data

Authors

Mahfuza Khatun¹ and Sikandar Siddiqui², ¹Jahangirnagar University, Bangladesh, ²Deloitte Audit Analytics GmbH, Germany

Abstract

This paper introduces a density-based clustering procedure for datasets with variables of mixed type. The proposed procedure, which is closely related to the concept of shared neighbourhoods, works particularly well when the individual clusters differ greatly in the average pairwise distance between their member objects. A number of concrete examples show that the proposed clustering algorithm succeeds in identifying subgroups of objects with statistically significant distributional characteristics.

Keywords

Cluster analysis, mixed data, distance measures