A customizable hybrid approach to data clustering

Yu Qian*, Kang Zhang

*Corresponding author for this work

Research output: Contribution to conference › Conference Paper › peer-review

5 Citations (Scopus)

Abstract

Most current data clustering algorithms in data mining are based on a distance calculation in a metric space. For Spatial Database Systems (SDBS), the Euclidean distance between two data points is often used to represent the relationship between them. However, in some spatial settings and many other applications, distance alone is not enough to represent all the attributes of the relation between data points. We need a more powerful model to record more relational information between data objects. This paper adopts a graph model in which a database is regarded as a graph: each vertex of the graph represents a data point, and each edge, weighted or unweighted, records the relation between the two data points it connects. Based on the graph model, this paper presents a set of cluster analysis criteria to guide data clustering. The criteria can be used to measure clustering results and help improve the quality of clustering. Further, a customizable algorithm using the criteria is proposed and implemented. This algorithm can produce clusters according to users' specifications. Preliminary experiments show encouraging results.
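
The graph model described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's criteria or algorithm, so everything below is an assumption made for illustration: the function names `build_graph` and `connected_components`, the `max_dist` threshold standing in for a user specification, the use of Euclidean distance as the edge weight, and the use of connected components as a stand-in grouping step. It is not the authors' method.

```python
import math
from itertools import combinations


def build_graph(points, max_dist):
    """Build a weighted graph: one vertex per data point, one edge per related pair.

    Assumption: two points are "related" when their Euclidean distance is at
    most max_dist, and the edge weight is that distance. The paper's graph
    model allows richer relations than this.
    """
    graph = {i: {} for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if d <= max_dist:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def connected_components(graph):
    """Group vertices into clusters by graph connectivity (illustrative stand-in)."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        clusters.append(sorted(component))
    return clusters


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
graph = build_graph(points, max_dist=1.5)
clusters = connected_components(graph)
print(clusters)
```

Changing `max_dist` changes which edges exist and therefore which clusters form, which gives a rough sense of how a user-supplied parameter could steer the result in a graph-based setting.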

Original language: English
Pages: 485-489
Number of pages: 5
Publication status: Published - 2003
Externally published: Yes
Event: Proceedings of the 2003 ACM Symposium on Applied Computing - Melbourne, FL, United States
Duration: 9 Mar 2003 → 12 Mar 2003

Conference

Conference: Proceedings of the 2003 ACM Symposium on Applied Computing
Country/Territory: United States
City: Melbourne, FL
Period: 9/03/03 → 12/03/03
