Knowledge Discovery from Data (TKDD)


ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 9 Issue 4, June 2015

Mathematical Modeling and Analysis of Product Rating with Partial Information
Hong Xie, John C. S. Lui
Article No.: 26
DOI: 10.1145/2700386

Many Web services like Amazon, Epinions, and TripAdvisor provide historical product ratings so that users can evaluate the quality of products. Product ratings are important because they affect how well a product will be adopted by the market. The...

Optimizing Text Quantifiers for Multivariate Loss Functions
Andrea Esuli, Fabrizio Sebastiani
Article No.: 27
DOI: 10.1145/2700406

We address the problem of quantification, a supervised learning task whose goal is, given a class, to estimate the relative frequency (or prevalence) of the class in a dataset of unlabeled items. Quantification has several...

Information Measures in Statistical Privacy and Data Processing Applications
Bing-Rong Lin, Daniel Kifer
Article No.: 28
DOI: 10.1145/2700407

In statistical privacy, utility refers to two concepts: information preservation, how much statistical information is retained by a sanitizing algorithm, and usability, how (and with how much difficulty) one extracts this...

Density-Aware Clustering Based on Aggregated Heat Kernel and Its Transformation
Hao Huang, Shinjae Yoo, Dantong Yu, Hong Qin
Article No.: 29
DOI: 10.1145/2700385

Current spectral clustering algorithms suffer from the sensitivity to existing noise and parameter scaling and may not be aware of different density distributions across clusters. If these problems are left untreated, the consequent clustering...

Classification with Streaming Features: An Emerging-Pattern Mining Approach
Kui Yu, Wei Ding, Dan A. Simovici, Hao Wang, Jian Pei, Xindong Wu
Article No.: 30
DOI: 10.1145/2700409

Many datasets from real-world applications have very high-dimensional or increasing feature space. It is a new research problem to learn and maintain a classifier to deal with very high dimensionality or streaming features. In this article, we...

Supporting Exploratory Hypothesis Testing and Analysis
Guimei Liu, Haojun Zhang, Mengling Feng, Limsoon Wong, See-Kiong Ng
Article No.: 31
DOI: 10.1145/2701430

Conventional hypothesis testing is carried out in a hypothesis-driven manner. A scientist must first formulate a hypothesis based on what he or she sees and then devise a variety of experiments to test it. Given the rapid growth of data, it has...

Process Discovery under Precedence Constraints
Gianluigi Greco, Antonella Guzzo, Francesco Lupia, Luigi Pontieri
Article No.: 32
DOI: 10.1145/2710020

Process discovery has emerged as a powerful approach to support the analysis and the design of complex processes. It consists of analyzing a set of traces registering the sequence of tasks performed along several enactments of a transactional...

Improving Top-N Recommendation for Cold-Start Users via Cross-Domain Information
Nima Mirbakhsh, Charles X. Ling
Article No.: 33
DOI: 10.1145/2724720

Making accurate recommendations for cold-start users is a challenging yet important problem in recommendation systems. Including more information from other domains is a natural solution to improve the recommendations. However, most previous work...

Chromatic Correlation Clustering
Francesco Bonchi, Aristides Gionis, Francesco Gullo, Charalampos E. Tsourakakis, Antti Ukkonen
Article No.: 34
DOI: 10.1145/2728170

We study a novel clustering problem in which the pairwise relations between objects are categorical. This problem can be viewed as clustering the vertices of a graph whose edges are of different types (colors). We introduce an...