ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 11 Issue 1, August 2016

Shop-Type Recommendation Leveraging the Data from Social Media and Location-Based Services
Zhiwen Yu, Miao Tian, Zhu Wang, Bin Guo, Tao Mei
Article No.: 1
DOI: 10.1145/2930671

It is an important yet challenging task for investors to determine the most suitable type of shop (e.g., restaurant, fashion) for a newly opened store. Traditional approaches rely predominantly on field surveys and empirical estimation, which are not...

Leveraging Neighbor Attributes for Classification in Sparsely Labeled Networks
Luke K. McDowell, David W. Aha
Article No.: 2
DOI: 10.1145/2898358

Many analysis tasks involve linked nodes, such as people connected by friendship links. Research on link-based classification (LBC) has studied how to leverage these connections to improve classification accuracy. Most such prior research...

Convex Sparse PCA for Unsupervised Feature Learning
Xiaojun Chang, Feiping Nie, Yi Yang, Chengqi Zhang, Heng Huang
Article No.: 3
DOI: 10.1145/2910585

Principal component analysis (PCA) has been widely applied to dimensionality reduction and data pre-processing in engineering, biology, social science, and other fields. Classical PCA and its variants seek linear...

Listwise Learning to Rank from Crowds
Ou Wu, Qiang You, Fen Xia, Lei Ma, Weiming Hu
Article No.: 4
DOI: 10.1145/2910586

Learning to rank has received great attention in recent years as it plays a crucial role in many applications such as information retrieval and data mining. The existing concept of learning to rank assumes that each training instance is associated...

Scalable Clustering by Iterative Partitioning and Point Attractor Representation
Junming Shao, Qinli Yang, Hoang-Vu Dang, Bertil Schmidt, Stefan Kramer
Article No.: 5
DOI: 10.1145/2934688

Clustering very large datasets while preserving cluster quality remains a challenging data-mining task to date. In this paper, we propose an effective scalable clustering algorithm for large datasets that builds upon the concept of...

Latent Time-Series Motifs
Josif Grabocka, Nicolas Schilling, Lars Schmidt-Thieme
Article No.: 6
DOI: 10.1145/2940329

Motifs are the most frequently repeated patterns of a time series. The discovery of motifs is crucial for practitioners who need to understand and interpret the phenomena occurring in sequential data. Currently, motifs are searched among series...

Sampling for Nyström Extension-Based Spectral Clustering: Incremental Perspective and Novel Analysis
Xianchao Zhang, Linlin Zong, Quanzeng You, Xing Yong
Article No.: 7
DOI: 10.1145/2934693

Sampling is the key aspect of Nyström extension-based spectral clustering. Traditional sampling schemes select the set of landmark points as a whole and focus on how to lower the matrix approximation error. However, the matrix approximation...

Fast Sampling for Time-Varying Determinantal Point Processes
Maoying Qiao, Richard Yi Da Xu, Wei Bian, Dacheng Tao
Article No.: 8
DOI: 10.1145/2943785

Determinantal Point Processes (DPPs) are stochastic models that assign each subset of a base dataset a probability proportional to the subset’s degree of diversity. It has been shown that DPPs are particularly appropriate in data...

Greedily Improving Our Own Closeness Centrality in a Network
Pierluigi Crescenzi, Gianlorenzo D'Angelo, Lorenzo Severini, Yllka Velaj
Article No.: 9
DOI: 10.1145/2953882

Closeness centrality is a well-known measure of the importance of a vertex within a given complex network. Having high closeness centrality can have a positive impact on the vertex itself: hence, in this paper we consider the optimization problem of...

The Convergence Behavior of Naive Bayes on Large Sparse Datasets
Xiang Li, Charles X. Ling, Huaimin Wang
Article No.: 10
DOI: 10.1145/2948068

Large, sparse datasets with many missing values, such as user behaviors over a large number of items, are common in the big data era. Classification in such datasets is an important topic for machine learning and data mining. Practically,...

Modeling of Geographic Dependencies for Real Estate Ranking
Yanjie Fu, Hui Xiong, Yong Ge, Yu Zheng, Zijun Yao, Zhi-Hua Zhou
Article No.: 11
DOI: 10.1145/2934692

It is traditionally a challenge for home buyers to understand, compare, and contrast the investment value of real estate. Although a number of appraisal methods have been developed to value real properties, the performance of these methods has...