Knowledge Discovery from Data (TKDD)


ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 11 Issue 3, April 2017

Graph Manipulations for Fast Centrality Computation
Ahmet Erdem Sariyüce, Kamer Kaya, Erik Saule, Ümit V. Çatalyürek
Article No.: 26
DOI: 10.1145/3022668

Betweenness and closeness are widely used metrics in many network analysis applications. Yet, they are expensive to compute. For that reason, making the betweenness and closeness centrality computations faster is an important and...

Finding Dynamic Dense Subgraphs
Polina Rozenshtein, Nikolaj Tatti, Aristides Gionis
Article No.: 27
DOI: 10.1145/3046791

Online social networks are often defined by considering interactions of entities at an aggregate level. For example, a call graph is formed among individuals who have called each other at least once; or at least k times....

Modeling Buying Motives for Personalized Product Bundle Recommendation
Guannan Liu, Yanjie Fu, Guoqing Chen, Hui Xiong, Can Chen
Article No.: 28
DOI: 10.1145/3022185

Product bundling is a marketing strategy that offers several products/items for sale as one bundle. While the bundling strategy has been widely used, less effort has been made to understand how items should be bundled with respect to...

Combining Structured Node Content and Topology Information for Networked Graph Clustering
Ting Guo, Jia Wu, Xingquan Zhu, Chengqi Zhang
Article No.: 29
DOI: 10.1145/2996197

Graphs are popularly used to represent objects with shared dependency relationships. To date, all existing graph clustering algorithms consider each node as a single attribute or a set of independent attributes, without realizing that content...

An Influence Propagation View of PageRank
Qi Liu, Biao Xiang, Nicholas Jing Yuan, Enhong Chen, Hui Xiong, Yi Zheng, Yu Yang
Article No.: 30
DOI: 10.1145/3046941

For a long time, PageRank has been widely used for authority computation and has been adopted as a solid baseline for evaluating social influence related applications. However, when measuring the authority of network nodes, the traditional...

Learning Multiple Diagnosis Codes for ICU Patients with Local Disease Correlation Mining
Sen Wang, Xue Li, Xiaojun Chang, Lina Yao, Quan Z. Sheng, Guodong Long
Article No.: 31
DOI: 10.1145/3003729

In the era of big data, a mechanism that can automatically annotate disease codes to patients’ records in the medical information system is in demand. The purpose of this work is to propose a framework that automatically annotates the...

Scalable and Efficient Flow-Based Community Detection for Large-Scale Graph Analysis
Seung-Hee Bae, Daniel Halperin, Jevin D. West, Martin Rosvall, Bill Howe
Article No.: 32
DOI: 10.1145/2992785

Community detection is an increasingly popular approach to uncover important structures in large networks. Flow-based community detection methods rely on communication patterns of the network rather than structural properties to determine...

Robust Graph Regularized Nonnegative Matrix Factorization for Clustering
Chong Peng, Zhao Kang, Yunhong Hu, Jie Cheng, Qiang Cheng
Article No.: 33
DOI: 10.1145/3003730

Matrix factorization is often used for data representation in many data mining and machine-learning problems. In particular, for a dataset without any negative entries, nonnegative matrix factorization (NMF) is often used to find a low-rank...

Partitioned Similarity Search with Cache-Conscious Data Traversal
Xun Tang, Maha Alabduljalil, Xin Jin, Tao Yang
Article No.: 34
DOI: 10.1145/3014060

All pairs similarity search (APSS) is used in many web search and data mining applications. Previous work has used techniques such as comparison filtering, inverted indexing, and parallel accumulation of partial results. However, shuffling...

Recommendations Based on Comprehensively Exploiting the Latent Factors Hidden in Items’ Ratings and Content
Shanshan Feng, Jian Cao, Jie Wang, Shiyou Qian
Article No.: 35
DOI: 10.1145/3003728

To improve the performance of recommender systems in a practical manner, several hybrid approaches have been developed by considering item ratings and content information simultaneously. However, most of these hybrid approaches make...

Spatial Prediction for Multivariate Non-Gaussian Data
Xutong Liu, Feng Chen, Yen-Cheng Lu, Chang-Tien Lu
Article No.: 36
DOI: 10.1145/3022669

With the ever-increasing volume of geo-referenced datasets, there is a real need for better statistical estimation and prediction techniques for spatial analysis. Most existing approaches focus on predicting multivariate Gaussian spatial...

Moving Destination Prediction Using Sparse Dataset: A Mobility Gradient Descent Approach
Liang Wang, Zhiwen Yu, Bin Guo, Tao Ku, Fei Yi
Article No.: 37
DOI: 10.1145/3051128

Moving destination prediction offers an important category of location-based applications and provides essential intelligence to businesses and governments. In existing studies, a common approach to destination prediction is to match the given query...

A Randomized Rounding Algorithm for Sparse PCA
Kimon Fountoulakis, Abhisek Kundu, Eugenia-Maria Kontopoulou, Petros Drineas
Article No.: 38
DOI: 10.1145/3046948

We present and analyze a simple, two-step algorithm to approximate the optimal solution of the sparse PCA problem. In the proposed approach, we first solve an ℓ1-penalized version of the NP-hard sparse PCA optimization problem and...