ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 3 Issue 1, March 2009

Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering
Hans-Peter Kriegel, Peer Kröger, Arthur Zimek
Article No.: 1
DOI: 10.1145/1497577.1497578

As a prolific research area in data mining, subspace clustering and related problems induced a vast quantity of proposed solutions. However, many publications compare a new proposition—if at all—with one or two competitors, or even...

Semi-analytical method for analyzing models and model selection measures based on moment analysis
Amit Dhurandhar, Alin Dobra
Article No.: 2
DOI: 10.1145/1497577.1497579

In this article we propose a moment-based method for studying models and model selection measures. By focusing on the probabilistic space of classifiers induced by the classification algorithm rather than on that of datasets, we obtain efficient...

Closed patterns meet n-ary relations
Loïc Cerf, Jérémy Besson, Céline Robardet, Jean-François Boulicaut
Article No.: 3
DOI: 10.1145/1497577.1497580

Set pattern discovery from binary relations has been extensively studied during the last decade. In particular, many complete and efficient algorithms for frequent closed set mining are now available. Generalizing such a task to n-ary...

DOLPHIN: An efficient algorithm for mining distance-based outliers in very large datasets
Fabrizio Angiulli, Fabio Fassetti
Article No.: 4
DOI: 10.1145/1497577.1497581

In this work a novel distance-based outlier detection algorithm, named DOLPHIN, working on disk-resident datasets and whose I/O cost corresponds to the cost of sequentially reading the input dataset file twice, is presented. It is both...

Bellwether analysis: Searching for cost-effective query-defined predictors in large databases
Bee-Chung Chen, Raghu Ramakrishnan, Jude W. Shavlik, Pradeep Tamma
Article No.: 5
DOI: 10.1145/1497577.1497582

How to mine massive datasets is a challenging problem with great potential value. Motivated by this challenge, much effort has concentrated on developing scalable versions of machine learning algorithms. However, the cost of mining large datasets...