ACM Transactions on Knowledge Discovery from Data (TKDD)

Latest Articles

Shop-Type Recommendation Leveraging the Data from Social Media and Location-Based Services

It is an important yet challenging task for investors to determine the most suitable type of shop (e.g., restaurant, fashion) for a newly opened...

Leveraging Neighbor Attributes for Classification in Sparsely Labeled Networks

Many analysis tasks involve linked nodes, such as people connected by friendship links. Research on link-based classification (LBC) has studied how to...

Convex Sparse PCA for Unsupervised Feature Learning

Principal component analysis (PCA) has been widely applied to dimensionality reduction and data pre-processing for different applications in...

Listwise Learning to Rank from Crowds

Learning to rank has received great attention in recent years as it plays a crucial role in many applications such as information retrieval and data mining. The existing concept of learning to rank assumes that each training instance is associated with a reliable label. However, in practice, this assumption does not necessarily hold true as it may...

Scalable Clustering by Iterative Partitioning and Point Attractor Representation

Clustering very large datasets while preserving cluster quality remains a challenging data-mining task to date. In this paper, we propose an effective...

Latent Time-Series Motifs

Motifs are the most repetitive/frequent patterns of a time-series. The discovery of motifs is crucial for practitioners in order to understand and interpret the phenomena occurring in sequential data. Currently, motifs are searched among series sub-sequences, aiming at selecting the most frequently occurring ones. Search-based methods, which try...

Sampling for Nyström Extension-Based Spectral Clustering

Sampling is the key aspect for Nyström extension based spectral clustering. Traditional sampling schemes select the set of landmark points on a...

Fast Sampling for Time-Varying Determinantal Point Processes

Determinantal Point Processes (DPPs) are stochastic models which assign each subset of a base dataset with a probability proportional to the...

Greedily Improving Our Own Closeness Centrality in a Network

The closeness centrality is a well-known measure of importance of a vertex within a given complex network. Having high closeness centrality can have...

The Convergence Behavior of Naive Bayes on Large Sparse Datasets

Large and sparse datasets with a lot of missing values are common in the big data era, such as user behaviors over a large number of items....

Modeling of Geographic Dependencies for Real Estate Ranking

It is traditionally a challenge for home buyers to understand, compare, and contrast the investment value of real estate. Although a number of...


New options for ACM authors to manage rights and permissions for their work

ACM introduces a new publishing license agreement, an updated copyright transfer agreement, and a new author-pays option which allows for perpetual open access through the ACM Digital Library. For more information, visit the ACM Author Rights webpage.

About TKDD 

The ACM Transactions on Knowledge Discovery from Data (TKDD) publishes original archival papers in the area of knowledge discovery from data and closely related disciplines. The majority of the papers that appear in TKDD are expected to address the logical and technical foundation of knowledge discovery and data mining.

Forthcoming Articles
Combining Structured Node Content and Topology Information for Networked Graph Clustering

In this paper, we formulate a new networked graph clustering task where the network contains a set of inter-connected (or networked) super-nodes, each of which is a single-attribute graph. The new super-node representation is applicable for many real-world applications, such as a citation network where each node denotes a paper whose content can be described as a graph and citation relationships between all papers form a super-graph. Networked graph clustering is to find similar node groups, each of which contains nodes with similar content and structure information. The main challenge is to properly calculate the similarity between super-nodes for clustering. To solve the problem, we propose to characterize node similarity by integrating the structure and content information of each super-node. To measure node content similarity, we use cosine distance by considering the overlapped attributes between two super-nodes. To measure structure similarity, we propose an attributed random walk kernel to calculate similarity between super-nodes. Detailed node content analysis is also included to build relationships between super-nodes with shared internal structure information, so the structure similarity can be calculated in a precise way. By integrating the structure similarity and content similarity as one matrix, the spectral clustering is used to achieve networked graph clustering.

Partitioning Networks with Node Attributes by Compressing Information Flow

Real-world networks are often organized as modules or communities of similar nodes that serve as functional units. These networks are also rich in content, with nodes having distinguishing features or attributes. In order to discover a network's modular structure, it is necessary to take into account not only its links but also node attributes. We describe an information-theoretic method that identifies modules by compressing descriptions of information flow on a network. Our formulation introduces node content into the description of information flow, which we then minimize to discover groups of nodes with similar attributes that also tend to trap the flow of information. The method has several advantages: it is conceptually simple and does not require ad-hoc parameters to specify the number of modules or to control the relative contribution of links and node attributes to network structure. We apply the proposed method to partition real-world networks with known community structure. We demonstrate that adding node attributes helps recover the underlying community structure in content-rich networks more effectively than using links alone. In addition, we show that our method is faster and more accurate than alternative state-of-the-art algorithms.

Comparing Clustering with Pairwise and Relative Constraints: A Unified Framework

Clustering can be improved with the help of side information about the similarity relationships among instances. Two types of constraints have been widely considered to represent such information: pairwise and relative constraints. A pairwise constraint indicates that two instances $x_a$ and $x_b$ are similar or dissimilar to each other, and a relative constraint specifies whether instance $x_a$ is more similar to $x_b$ than to $x_c$. Prior work has mostly considered these two types of constraints separately, and different algorithms have been developed to incorporate them into the learning problem. In practice, where constraints are obtained by querying users, it is critical to understand the usefulness of the two types of constraints as well as the cost of acquiring them; this is the primary focus of this paper. Specifically, our contributions are two-fold. First, we propose an efficient semi-supervised clustering framework that can incorporate both types of constraints in a unified manner. This framework enables us to compare the two constraints on equal ground. Second, we compare the effect of the two types of constraints from two different perspectives. To understand the impact on the users, we evaluate the ease of acquiring different types of constraints and the labeling accuracy of the constraints via a user study. To understand the influence on the learning system, we compare the effectiveness of different types of constraints in improving clustering. Our experiments with synthetically generated constraints demonstrate that the proposed framework is highly effective at utilizing both types of constraints to aid clustering; our user study provides valuable insights regarding the impact of the constraints on human users; and our experiments on clustering with different types of constraints (from the user study) reveal that relative constraints might be more efficient at improving clustering compared with pairwise constraints.

Scalable and Accurate Online Feature Selection for Big Data

Feature selection is important in many big data applications. There are at least two critical challenges. Firstly, in many applications, the dimensionality is extremely high, in millions, and keeps growing. Secondly, feature selection has to be highly scalable, preferably in an online manner such that each feature can be processed in a sequential scan. In this paper, we develop SAOLA, a Scalable and Accurate OnLine Approach for feature selection. With a theoretical analysis on bounds of the pairwise correlations between features, SAOLA employs novel online pairwise comparison techniques to address the two challenges and maintain a parsimonious model over time in an online manner. Furthermore, to tackle the dimensionality that arrives by groups, we extend our SAOLA algorithm, and then propose a novel group-SAOLA algorithm for online group feature selection. The group-SAOLA algorithm can online maintain a set of feature groups that is sparse at the level of both groups and individual features simultaneously. An empirical study using a series of benchmark real data sets shows that our two algorithms, SAOLA and group-SAOLA, are scalable on data sets of extremely high dimensionality, and have superior performance over the state-of-the-art feature selection methods.

A novel bipartite graph based competitiveness degree analysis from query logs

Competitiveness degree analysis is a focal point of business strategy and competitive intelligence, aiming to help managers closely monitor the extent to which their rivals are competing with them. This paper proposes a novel method, namely BCQ, to measure the competitiveness degree between peers from query logs, an important form of user-generated content that reflects the wisdom of crowds from the search engine users' perspective. In doing so, a bipartite graph model is developed to capture the competitive relationships through conjoint attributes hidden in query logs, where the notion of competitiveness degree for entity pairs is introduced, and then used to identify the competitive paths mapped in the bipartite graph. Subsequently, extensive experiments are conducted to demonstrate the effectiveness of BCQ in quantifying the competitiveness degrees. Experimental results reveal that BCQ can effectively support competitor ranking, which is helpful for devising competitive strategies and pursuing market performance. In addition, efficiency experiments on synthetic data show that BCQ scales well to large query logs.

Unsupervised Head-Modifier Detection in Search Queries

Interpreting the user intent in search queries is a key task in query understanding. Query intent classification has been widely studied. In this paper, we go one step further to understand the query from the view of head-modifier analysis. For example, given the query "popular iphone 5 smart cover", instead of using coarse-grained semantic classes (e.g., find electronic product), we interpret that "smart cover" is the head or the intent of the query and "iphone 5" is its modifier. Query head-modifier detection can help search engines to obtain particularly relevant content, which is also important for applications such as ads matching and query recommendation. We introduce an unsupervised semantic approach for query head-modifier detection. First, we mine a large number of instance-level head-modifier pairs from search logs. Then, we develop a conceptualization mechanism to generalize the instance-level pairs to the concept level. Finally, we derive weighted concept patterns that are concise, accurate, and have strong generalization power in head-modifier detection. The developed mechanism has been used in production for search relevance and ads matching. We use extensive experiment results to demonstrate the effectiveness of our approach.

World Knowledge as Indirect Supervision for Document Clustering

One of the key obstacles in making learning protocols realistic in applications is the need to supervise them, a costly process that often requires hiring domain experts. We consider a framework that uses world knowledge as indirect supervision. World knowledge is general-purpose knowledge, which is not designed for any specific domain. The key challenges are then how to adapt the world knowledge to domains and how to represent it for learning. In this paper, we provide an example of using world knowledge for domain-dependent document clustering. We provide three ways to specify the world knowledge to domains by resolving the ambiguity of the entities and their types, and represent the data with world knowledge as a heterogeneous information network. Then we propose a clustering algorithm that can cluster multiple types and incorporate the sub-type information as constraints. In the experiments, we use two existing knowledge bases as our sources of world knowledge. One is Freebase, which is collaboratively collected knowledge about entities and their organizations. The other is YAGO2, a knowledge base that is automatically extracted from Wikipedia and maps knowledge to the linguistic knowledge base WordNet. Experimental results on two text benchmark datasets (20newsgroups and RCV1) show that incorporating world knowledge as indirect supervision can significantly outperform the state-of-the-art clustering algorithms as well as clustering algorithms enhanced with world knowledge features.

Assignment Problems of Different-Sized Inputs in MapReduce

A MapReduce algorithm can be described by a mapping schema, which assigns inputs to a set of reducers, such that for each required output there exists a reducer that receives all the inputs that participate in the computation of this output. Reducers have a capacity, which limits the sets of inputs that they can be assigned. However, individual inputs may vary in terms of size. We consider, for the first time, mapping schemas where input sizes are part of the considerations and restrictions. One of the significant parameters to optimize in any MapReduce job is communication cost between the map and reduce phases. The communication cost can be optimized by minimizing the number of copies of inputs sent to the reducers. The communication cost is closely related to the number of reducers of constrained capacity that are used to accommodate appropriately the inputs, so that the requirement of how the inputs must meet in a reducer is satisfied. In this work, we consider a family of problems where it is required that each input meets with each other input in at least one reducer. We also consider a slightly different family of problems in which, each input of a set, $X$, is required to meet each input of another set, $Y$, in at least one reducer. We prove that finding an optimal mapping schema for these families of problems is NP-hard, and present a bin-packing-based approximation algorithm for finding a near optimal mapping schema.
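
As a rough illustration of the all-pairs requirement described in the abstract above (every input must meet every other input in at least one reducer), the following Python sketch packs inputs into bins of half the reducer capacity with first-fit decreasing and then assigns each pair of bins to one reducer. This is only one plausible reading of a "bin-packing-based" scheme, not the authors' algorithm; the function names `first_fit_decreasing` and `all_pairs_schema`, and the assumption that no single input exceeds half the reducer capacity, are hypothetical.

```python
from itertools import combinations


def first_fit_decreasing(sizes, bin_capacity):
    """Pack input indices into bins whose total size is at most bin_capacity."""
    bins = []   # each bin is a list of input indices
    loads = []  # running total size of each bin
    order = sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=True)
    for idx in order:
        if sizes[idx] > bin_capacity:
            raise ValueError("input %d does not fit in a bin" % idx)
        for b, load in enumerate(loads):
            if load + sizes[idx] <= bin_capacity:
                bins[b].append(idx)
                loads[b] += sizes[idx]
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([idx])
            loads.append(sizes[idx])
    return bins


def all_pairs_schema(sizes, reducer_capacity):
    """Return a list of reducers (lists of input indices) such that every
    pair of inputs shares at least one reducer.

    Assumption: every input is at most reducer_capacity / 2 in size, so
    any two bins fit together in a single reducer.
    """
    bins = first_fit_decreasing(sizes, reducer_capacity / 2)
    if len(bins) == 1:
        return [sorted(bins[0])]
    # One reducer per pair of bins; two half-capacity bins fit in one reducer.
    return [sorted(a + b) for a, b in combinations(bins, 2)]


if __name__ == "__main__":
    example_sizes = [4, 3, 3, 2, 2, 1]  # hypothetical input sizes
    for reducer in all_pairs_schema(example_sizes, reducer_capacity=10):
        print(reducer, sum(example_sizes[i] for i in reducer))
```

With k bins the sketch uses k(k-1)/2 reducers, so fewer bins mean fewer copies of each input are sent, which is the communication cost the abstract refers to.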

Solving Inverse Frequent Itemset Mining with Infrequency Constraints via Large-Scale Linear Programs

Efficient Discovery of Association Rules and Frequent Itemsets through Sampling with Tight Performance Guarantees

Exploiting Viral Marketing for Location Promotion in Location-based Social Networks

With the explosion of smartphones and social network services, location-based social networks (LBSNs) are increasingly seen as tools for businesses (e.g., restaurants, hotels) to promote their products and services. In this paper, we investigate the key techniques that can help businesses promote their locations by advertising wisely through the underlying LBSNs. In order to maximize the benefit of location promotion, we formalize it as an influence maximization problem in an LBSN, i.e., given a target location and an LBSN, which set of k users (called seeds) should be advertised to initially such that they can successfully propagate and attract most other users to visit the target location. Existing studies have proposed different ways to calculate the information propagation probability, that is, how likely it is that a user may influence another, in the setting of a static social network. However, it is more challenging to derive the propagation probability in an LBSN since it is heavily affected by the target location and the user mobility, both of which are dynamic and query dependent. This paper proposes two user mobility models, namely Gaussian-based and distance-based mobility models, to capture the check-in behavior of individual LBSN users, based on which location-aware propagation probabilities can be derived. Extensive experiments based on two real LBSN datasets have demonstrated the superior effectiveness of our proposals compared with existing static models of propagation probabilities to truly reflect the information propagation in LBSNs.

Adaptive Cluster Tendency Visualisation and Anomaly Detection for Streaming Data

The growth in pervasive network infrastructure called the Internet of Things (IoT) enables a wide range of physical objects and environments to be monitored in fine spatial and temporal detail. The detailed, dynamic data that are collected in large quantities from sensor devices provide the basis for a variety of applications. Automatic interpretation of these evolving voluminous data is required for timely detection of interesting events. This article develops and exemplifies two new relatives of the visual assessment of tendency (VAT) and improved visual assessment of tendency (iVAT) models, which use cluster heat maps to visualize structure in static data sets. One new model is initialized with a static VAT/iVAT image, and then incrementally (hence inc-VAT/inc-iVAT) updates the current minimal spanning tree (MST) used by VAT with an efficient edge insertion scheme. Similarly, dec-VAT/dec-iVAT efficiently removes a node from the current VAT MST. A sequence of inc-iVAT/dec-iVAT images can be used for (visual) anomaly detection in evolving data streams and for sliding-window-based cluster assessment for time series data. The method is illustrated with four real data sets (three of them being smart city IoT data). Evaluation demonstrates the algorithms' ability to successfully isolate anomalies and visualise changing cluster structure in the streaming data.

Structural Analysis of User Choices for Mobile App Recommendation

Advances in smartphone technology have promoted the rapid development of mobile apps. However, a huge number of mobile apps available in application stores has imposed the challenge of finding the right apps to meet the user needs. Indeed, there is a critical demand for personalized app recommendations. Along this line, there are opportunities and challenges posed by two unique characteristics of mobile apps. First, app markets have organized apps in a hierarchical taxonomy. Second, apps with similar functionalities are competing with each other. While there are a variety of approaches for mobile app recommendations, these approaches do not have a focus on dealing with these opportunities and challenges. To this end, in this paper, we provide a systematic study for addressing these challenges. Specifically, we develop a structural user choice model (SUCM) to learn fine-grained user preferences by exploiting the hierarchical taxonomy of apps as well as the competitive relationships among apps. As a result, the SUCM model can provide fine-level personalized app recommendations. Moreover, we design an efficient learning algorithm to estimate the parameters for the SUCM model. Finally, we perform extensive experiments on a large app adoption data set collected from Google Play. The results show that SUCM consistently outperforms state-of-the-art Top-N recommendation methods by a significant margin.

Parallel Field Ranking

Distributed Algorithms for Computing Very Large Thresholded Covariance Matrices

Mining for Topics to Suggest Knowledge Model Extensions

Electronic concept maps, interlinked with other concept maps and multimedia resources, can provide rich knowledge models to capture and share human knowledge. This article presents and evaluates methods to support experts as they extend existing knowledge models, by suggesting new context-relevant topics mined from Web search engines. The task of generating topics to support knowledge model extension raises two research questions: first, how to extract topic descriptors and discriminators from concept maps; and second, how to use these topic descriptors and discriminators to identify candidate topics on the Web with the right balance of novelty and relevance. To address these questions, this article first develops the theoretical framework required for a "topic suggester" to aid information search in the context of a knowledge model under construction. It then presents and evaluates algorithms based on this framework and applied in Extender, an implemented tool for topic suggestion. Extender has been developed and tested within CmapTools, a widely used system for supporting knowledge modeling using concept maps. However, the generality of the algorithms makes them applicable to a broad class of knowledge modeling systems, and to Web search in general.

Batch Mode Active Sampling based on Marginal Probability Distribution Matching

Life Cycle Modeling for Buzz Temporal Pattern Discovery

In social media analysis, one critical task is detecting a burst of topics or buzz, which is reflected by extremely frequent mentions of certain keywords in a short time interval. Detecting buzz not only provides useful insights into the information propagation mechanism, but also plays an essential role in preventing malicious rumors. However, buzz modeling is a challenging task because a buzz time-series often exhibits sudden spikes and heavy tails, which fails most existing time-series models. In this paper, we propose novel buzz modeling approaches which capture the rise and fade temporal patterns via Product Life Cycle (PLC) model, a classical concept in economics. More specifically, we propose to model multiple peaks in buzz time-series with PLC mixture or PLC group mixture, and develop a probabilistic graphical model (K-MPLC) to automatically discover inherent life cycle patterns within a collection of buzzes. Furthermore, we effectively utilize the model parameters of PLC mixture or PLC group mixture for burst prediction. Our experiment results show that our proposed methods significantly outperform existing leading approaches on buzz clustering and buzz type prediction.

Permanence and Community Structure in Complex Networks

The goal of community detection algorithms is to identify densely-connected units within large networks. An implicit assumption is that all the constituent nodes belong equally to their associated community. As a result, to date, efforts have been primarily driven to identify communities as a whole, rather than understanding by how much an individual node belongs to its community. In this paper, we argue that the belongingness of nodes in a community is not uniform. We quantify the degree of belongingness of a vertex within a community by a new vertex-based metric called permanence. The central idea of permanence is based on the observation that the strength of membership of a vertex to a community depends upon two factors: (i) the distribution of the connectivity of the vertex to individual communities, and (ii) how tightly the vertex is connected internally. We present the formulation of permanence based on these two quantities. We demonstrate that compared to other existing metrics, the change in permanence is more commensurate to the level of perturbation in ground-truth communities. We discuss how permanence can help us understand and utilize the structure and evolution of communities in a network. We further show that permanence is an excellent metric for identifying communities. We show that the process of maximizing permanence (abbreviated as MaxPerm) produces meaningful communities that concur with the ground-truth community structure of the networks more accurately than eight other popular community detection algorithms. Finally, we provide mathematical proofs to demonstrate the correctness of finding communities by maximizing permanence. In particular, we show that the communities obtained by this method are (i) less affected by the changes in vertex-ordering, and (ii) more resilient to resolution limit, degeneracy of solutions and asymptotic growth of values.

Differentially-Private Multidimensional Data Publishing

Scalable and Efficient Flow-Based Community Detection for Large-Scale Graph Analysis

Community detection is a powerful approach to uncover important structures in large networks. For real networks that often describe the flow of some entity, flow-based community detection methods are particularly important. Infomap is a flow-based community detection algorithm that optimizes the objective function known as the map equation. Third-party benchmarks have found that Infomap is the most effective algorithm for identifying clusters in large graphs. Unfortunately, though Infomap works well, it is an inherently serial algorithm and thus cannot take advantage of multi-core processing in modern computers, limiting its use for analyzing large graphs quickly. In this paper, we propose a novel algorithm to optimize the map equation called RelaxMap. RelaxMap provides two important improvements over Infomap: parallelization, so that the map equation can be optimized over much larger graphs, and prioritization, so that the most important work occurs first, iterations take less time, and the algorithm converges faster. We implement these techniques using OpenMP on shared-memory multicore systems, and evaluate our approach on a variety of graphs from standard graph clustering benchmarks as well as real graph datasets. Our evaluation shows that both techniques are effective: RelaxMap achieves 70% parallel efficiency on 8 cores, and prioritization improves algorithm performance by an additional 20% to 50%. Additionally, RelaxMap converges in a similar number of iterations and provides solutions of quality equivalent to the serial Infomap implementation.


Publication Years 2007-2016
Publication Count 258
Citation Count 2138
Available for Download 258
Downloads (6 weeks) 3546
Downloads (12 Months) 30571
Downloads (cumulative) 176031
Average downloads per article 682
Average citations per article 8
First Name Last Name Award
John Canny ACM Doctoral Dissertation Award (1987)
Carlos A. Castillo ACM Senior Member (2014)
Chris Clifton ACM Senior Member (2006)
Graham R. Cormode ACM Distinguished Member (2013)
Benjamin Fung ACM Senior Member (2013)
John E Hopcroft ACM Karl V. Karlstrom Outstanding Educator Award (2008)
ACM A. M. Turing Award (1986)
Piotr Indyk ACM Paris Kanellakis Theory and Practice Award (2012)
Jon Kleinberg ACM AAAI Allen Newell Award (2014)
ACM Prize in Computing (2008)
Chih-Jen Lin ACM Distinguished Member (2011)
ACM Senior Member (2010)
Chang-Tien Lu ACM Distinguished Member (2015)
Tao Mei ACM Senior Member (2012)
Sethuraman Panchanathan ACM Senior Member (2009)
Jian Pei ACM Senior Member (2007)
Domenico Sacca ACM Senior Member (2007)
Hui Xiong ACM Distinguished Member (2014)
ACM Senior Member (2010)
Qiang Yang ACM Distinguished Member (2011)
Mohammed Zaki ACM Distinguished Member (2010)
Ben Yanbin Zhao ACM Distinguished Member (2015)
Yu Zheng ACM Senior Member (2011)
Zhi-Hua Zhou ACM Distinguished Member (2013)
ACM Senior Member (2011)

First Name Last Name Paper Counts
Christos Faloutsos 12
Jieping Ye 7
Tao Li 5
Philip Yu 4
Hui Xiong 4
Jian Pei 4
Zhihua Zhou 4
Shenghuo Zhu 4
Heng Huang 4
John Lui 4
Feiping Nie 4
Huan Liu 4
Aristides Gionis 4
John Hopcroft 3
Zhiwen Yu 3
Lise Getoor 3
Jure Leskovec 3
Malik Magdon-Ismail 3
Mingsyan Chen 3
Jilles Vreeken 3
Yun Chi 3
Yasushi Sakurai 3
Evimaria Terzi 3
Yihong Gong 3
Lei Tang 3
Bin Guo 3
Hong Cheng 3
Dingding Wang 3
Fabio Fassetti 3
Christopher Jermaine 3
Fabrizio Angiulli 3
Andrea Esuli 2
Qi Liu 2
Jilei Tian 2
Ping Luo 2
B Prakash 2
Yuru Lin 2
Vivekanand Gopalkrishnan 2
Jie Tang 2
Xiaoli Fern 2
Eugene Agichtein 2
Sanjay Ranka 2
Carlotta Domeniconi 2
Jiliang Tang 2
Dantong Yu 2
Charalampos Tsourakakis 2
Srinivasan Parthasarathy 2
Hari Sundaram 2
Spiros Papadimitriou 2
Joydeep Ghosh 2
Wei Fan 2
Dino Pedreschi 2
Pinghui Wang 2
Hao Huang 2
Hong Qin 2
Yehuda Koren 2
Heikki Mannila 2
Panayiotis Tsaparas 2
Jianhui Chen 2
Yu Zhang 2
Fabrizio Sebastiani 2
Zhu Wang 2
Arthur Zimek 2
Chengqi Zhang 2
Michalis Vazirgiannis 2
Junzhou Zhao 2
Xiaohong Guan 2
Geoffrey Webb 2
Panagis Magdalinos 2
Indrajit Bhattacharya 2
Qiang Yang 2
Jin Huang 2
Charles Ling 2
Shinjae Yoo 2
Ruoming Jin 2
Ian Davidson 2
Antonella Guzzo 2
Guofei Jiang 2
Jiawei Han 2
Antônio Loureiro 2
Hanghang Tong 2
Xiang Zhang 2
Daniel Kifer 2
Laks Lakshmanan 2
Jimeng Sun 2
Don Towsley 2
Rita Chattopadhyay 2
Sucheta Soundarajan 2
Xianchao Zhang 2
Yong Ge 2
Dacheng Tao 2
Belle Tseng 2
Enhong Chen 2
Xiao Yu 2
Maguelonne Teisseire 1
Boleslaw Szymanski 1
Paolo Boldi 1
Lini Thomas 1
Sachindra Joshi 1
Tharam Dillon 1
Leonid Hrebien 1
Pei Yang 1
Li Li 1
Denian Yang 1
Zhishan Guo 1
Yunsing Koh 1
Yixin Chen 1
Xuanhong Dang 1
Silei Xu 1
Bo Liu 1
Shumo Chu 1
Luigi Pontieri 1
Bingrong Lin 1
Francesco Bonchi 1
Wei Ding 1
Kasim Candan 1
Sunil Vadera 1
S Upham 1
Thomas Porta 1
Hongzhi Yin 1
Jeffrey Erman 1
Ming Li 1
Dora Erdős 1
Joydeep Ghosh 1
Kaiyuan Zhang 1
Carlos Ordonez 1
Fosca Giannotti 1
James Cheng 1
Peter Christen 1
U Kang 1
Daniel Dunlavy 1
Christos Doulkeridis 1
Joao Duarte 1
David Dominguez-Sal 1
Danai Koutra 1
Hiroshi Motoda 1
Steven Skiena 1
Chris Volinsky 1
Andreas Krause 1
Hsiangfu Yu 1
Aditya Parameswaran 1
Binbin Lin 1
Johannes Gehrke 1
Christo Wilson 1
Ben Zhao 1
Yijuan Lu 1
Feng Liu 1
Yufeng Wang 1
Ernest Garcia 1
Shamkant Navathe 1
Wei Fan 1
Wei Wei 1
Hoangvu Dang 1
Rezwan Ahmed 1
Fen Xia 1
Linlin Zong 1
Duygu Ucar 1
Mustafa Bilgic 1
Ben Kao 1
David Cheung 1
Christopher Leckie 1
Cheng Zeng 1
Atreya Srivathsan 1
Tong Sun 1
Songhua Xu 1
Yanchi Liu 1
Kun Liu 1
Duo Zhang 1
Dmitry Pavlov 1
Raymond Ng 1
Piotr Indyk 1
Anne Laurent 1
Christopher Carothers 1
Satyanarayana Valluri 1
Ashish Verma 1
Jérémy Besson 1
Raghu Ramakrishnan 1
Rong Ge 1
Byronju Gao 1
Yubao Wu 1
Maryam Tahani 1
Hamid Rabiee 1
Li Tu 1
Saharon Rosset 1
Claudia Perlich 1
Tuannhon Dang 1
Ying Wei 1
Seekiong Ng 1
Kui Yu 1
Ramana Kompella 1
Chengkai Li 1
Salvatore Ruggieri 1
Hong Xie 1
Vasileios Kandylas 1
Jing Zhang 1
Rodrigo Alves 1
Juhua Hu 1
Yu Jin 1
Veerabhadran Baladandayuthapani 1
Giulio Rossetti 1
Timothy De Vries 1
Eric Xing 1
Albert Bifet 1
Xiaoming Li 1
Josep Brunat 1
Jiang Bian 1
Padhraic Smyth 1
Brandon Westover 1
Eamonn Keogh 1
Jiayu Pan 1
Claudia Plant 1
Ron Eyal 1
Avi Rosenfeld 1
Asaf Shabtai 1
Shifeng Weng 1
Dityan Yeung 1
Evangelos Papalexakis 1
Nicholas Sidiropoulos 1
Jilei Tian 1
Davoud Moulavi 1
Junming Shao 1
Yllka Velaj 1
Michael Lyu 1
George Karypis 1
Qiang Qu 1
Xiaojun Chang 1
Lars Schmidt-Thieme 1
Koji Hino 1
Masaru Kitsuregawa 1
Xiang Zhang 1
Jenwei Huang 1
James Bailey 1
Jianping Zhang 1
Manas Somaiya 1
Graham Cormode 1
Fernando Kuipers 1
Dick Epema 1
Linpeng Tang 1
Min Wang 1
Bin Li 1
Marc Maier 1
William Street 1
Lionel Ni 1
Mohamed Bouguessa 1
Mingxi Wu 1
Ye Chen 1
John Canny 1
Benjamin Fung 1
Dominique Laurent 1
Yeowwei Choong 1
Meghana Deodhar 1
Luca Becchetti 1
Ying Cui 1
Keli Xiao 1
Hans Kriegel 1
Bo Long 1
Martin Ester 1
Gunjan Gupta 1
Ling Feng 1
Diana Inkpen 1
Hongxia Yang 1
Haoda Fu 1
Dawei Zhou 1
Jingrui He 1
Liming Chen 1
Wei Wang 1
Michalis Faloutsos 1
Maryam Ramezani 1
Shebuti Rayana 1
Kuan Zhang 1
Vetle Torvik 1
Edoardo Serra 1
Claudio Schifanella 1
Nesreen Ahmed 1
Min Wang 1
Luigi Moccia 1
Shuiwang Ji 1
Ali Pınar 1
Michail Vlachos 1
Ling Chen 1
Chunxiao Xing 1
Yang Liu 1
Dechuan Zhan 1
Ruggero Pensa 1
Jose Hernández-Orallo 1
Rainer Gemulla 1
Guangtao Wang 1
Xueying Zhang 1
Yiping Ke 1
Saurabh Paul 1
Eli Upfal 1
Evrim Acar 1
Yang Zhou 1
Ben London 1
Jirong Wen 1
Charu Aggarwal 1
Joseph Ruiz MD 1
Neil Shah 1
Masahiro Kimura 1
Alexander Ihler 1
Kaiwei Chang 1
Forrest Briggs 1
Gustavo Batista 1
Qiang Zhu 1
Philip Yu 1
Jure Leskovec 1
Jon Kleinberg 1
Maya Bercovitch 1
Naren Ramakrishnan 1
Qi Tian 1
Jennifer Neary 1
Minoru Kanehisa 1
Ling Liu 1
Huilei He 1
Hua Wang 1
Stefan Kramer 1
Irwin King 1
Huaimin Wang 1
Qiang You 1
Luke McDowell 1
Miao Tian 1
Fei Zou 1
Virgílio Almeida 1
Christos Faloutsos 1
Laiwan Chan 1
Saurav Sahay 1
Xiaowen Ding 1
Jörg Sander 1
Siyuan Liu 1
Gianlorenzo D'Angelo 1
Yanjie Fu 1
Yu Zheng 1
Maria Halkidi 1
Luming Zhang 1
Lei Zou 1
Jian Wang 1
Nitin Agarwal 1
S Muthukrishnan 1
Kunta Chuang 1
Anthony Tung 1
Adelelu Jia 1
Alexandru Iosup 1
Aniket Chakrabarti 1
Reza Zafarani 1
Saurabh Kataria 1
Amin Saberi 1
Matthew Rattigan 1
Geoffrey Barbier 1
Limin Yao 1
Kristina Lerman 1
Olvi Mangasarian 1
Chris Clifton 1
Cheukkwong Lee 1
Mohammed Zaki 1
Jennifer Dy 1
Shaojun Wang 1
Loïc Cerf 1
Henry Tan 1
Yanjun Qi 1
Theodoros Lappas 1
Munmun De Choudhury 1
Wenjie Li 1
Chen Chen 1
Tina Eliassi-Rad 1
Yada Zhu 1
Leman Akoglu 1
Gianluigi Greco 1
Francesco Gullo 1
Guimei Liu 1
Jennifer Neville 1
Gensheng Zhang 1
Min Ding 1
Vassilios Vassiliadis 1
Yiming Yang 1
Kaiming Ting 1
Ayan Acharya 1
Sreangsu Acharyya 1
Zhiqiang Xu 1
Christophe Giraud-Carrier 1
Arnold Boedihardjo 1
Changtien Lu 1
Aditya Menon 1
Ruud Van De Bovenkamp 1
Manos Papagelis 1
Clyde Giles 1
Wei Peng 1
David Gleich 1
Steven Hoi 1
David Jensen 1
Tengfei Bao 1
Brook Wu 1
Glenn Fung 1
Zeeshan Syed 1
Kamalakar Karlapalem 1
Peer Kröger 1
Dimitrios Mavroeidis 1
Céline Robardet 1
Dale Schuurmans 1
Jean Boulicaut 1
Pradeep Tamma 1
Zengjian Hu 1
Boaz Ben-Moshe 1
Moshe Kam 1
Jieping Ye 1
Licong Cui 1
Xiaofeng Zhu 1
Neil Smalheiser 1
Shachar Kaufman 1
Ori Stitelman 1
Nikolaj Tatti 1
Leland Wilkinson 1
James Cheng 1
Hockhee Ang 1
Steven Hoi 1
Weekeong Ng 1
Mengling Feng 1
Xiao Jiang 1
Franco Turini 1
José Balcázar 1
Lyle Ungar 1
Comandur Seshadhri 1
Luan Tang 1
Quanquan Gu 1
Xintao Wu 1
Nick Duffield 1
Chun Li 1
Jianyong Wang 1
Feitony Liu 1
Petros Drineas 1
Zhongfei Zhang 1
Matthew Rowe 1
Edward Chang 1
Kazumi Saito 1
ChengXiang Zhai 1
Dong Xin 1
Dafna Shahaf 1
Stephen Fienberg 1
Raviv Raich 1
Bilson Campana 1
Christian Böhm 1
Vibhor Rastogi 1
Deng Cai 1
Chris Ding 1
Lior Rokach 1
Sigal Sina 1
Sarit Kraus 1
Longjie Li 1
Xiaolin Wang 1
Tingting Gao 1
Dityan Yeung 1
Bruno Abrahão 1
T Murali 1
Kiyoko Aoki-Kinoshita 1
Ravi Janardan 1
Sudhir Kumar 1
Tiancheng Lou 1
Guna Seetharaman 1
Giacomo Berardi 1
Xiaotong Zhang 1
Han Liu 1
Kathleen Carley 1
Lorenzo Severini 1
Ou Wu 1
Lei Ma 1
Xing Yong 1
Xiaodan Song 1
Yasuhiro Fujiwara 1
Wei Wang 1
ChienWei Chen 1
Weiyin Loh 1
John Salerno 1
Nitin Kumar 1
Flip Korn 1
Siqi Shen 1
Lei Li 1
Ying Wang 1
Sanjay Chawla 1
Jinpeng Wang 1
Arnau Prat-Pérez 1
Josep Larriba-Pey 1
Risa Myers 1
Qingtian Zeng 1
Brian Gallagher 1
John Hutchins 1
Taneli Mielikäinen 1
Ji Liu 1
Sethuraman Panchanathan 1
Abdullah Mueen 1
Manuel Gomez-Rodriguez 1
Yizhou Sun 1
Xiaofei He 1
Muthuramakrishnan Venkitasubramaniam 1
Victor Lee 1
Zhi Yang 1
Yafei Dai 1
Robert Kleinberg 1
Ying Jin 1
Hiroshi Mamitsuka 1
Pierluigi Crescenzi 1
Zijun Yao 1
Weiming Hu 1
Maoying Qiao 1
Wei Bian 1
Sitaram Asur 1
Jerry Kiernan 1
Kevin Yip 1
Wei Zheng 1
Zhenxing Wang 1
Ravi Konuru 1
Baoxing Huai 1
Hengshu Zhu 1
Pritam Gundecha 1
Lei Chen 1
Edward Wild 1
Murat Kantarcıoğlu 1
John Guttag 1
Fan Guo 1
Marc Plantevit 1
Alin Dobra 1
Jinlin Chen 1
Shantanu Godbole 1
Binay Bhattacharya 1
Ke Wang 1
Xinran He 1
Chris Ding 1
Benoît Dumoulin 1
Jing Zhang 1
Xiuyao Song 1
John Gums 1
Yin Zhang 1
Zhongfei Zhang 1
Yunxin Zhao 1
Jude Shavlik 1
Beilun Wang 1
Chihya Shen 1
Zhitao Wang 1
Wei Cheng 1
Jingrui He 1
Ali Hemmatyar 1
Meng Jiang 1
Peng Cui 1
Qian Sun 1
Yangqiu Song 1
Yi Zhen 1
Sibel Adalı 1
Xiaohui Lu 1
Francesco Lupia 1
Nima Mirbakhsh 1
Antti Ukkonen 1
Xindong Wu 1
Domenico Saccà 1
Zheng Wang 1
Johannes Schneider 1
Bin Cui 1
Juanzi Li 1
Patrick Haffner 1
Zhili Zhang 1
Qingyan Yang 1
Scott Burton 1
Christos Boutsidis 1
Bingsheng Wang 1
Hui Ke 1
Tamara Kolda 1
Jie Wang 1
Galileo Namata 1
Karthik Subbian 1
Yulan He 1
John Frenzel MD 1
Hua Duan 1
Zoran Obradović 1
Wangchien Lee 1
Sri Ravana 1
Alex Beutel 1
Shiqiang Yang 1
Bin Zhou 1
Anushka Anand 1
Yicheng Tu 1
Dan Simovici 1
Hao Wang 1
Madhav Jha 1
Alice Leung 1
Siddharth Gopal 1
Renato Assunção 1
Subhabrata Sen 1
Dino Ienco 1
Rosa Meo 1
Pauli Miettinen 1
Eduardo Hruschka 1
Hongliang Fei 1
Jun Huan 1
Carlos Garcia-Alvarado 1
Ana Appel 1
Jeffreyxu Yu 1
Zhen Guo 1
Yashu Liu 1
Waynexin Zhao 1
Faming Lu 1
Stephen North 1
Andrew Mehler 1
Seungil Huh 1
Chojui Hsieh 1
Chihjen Lin 1
Zheng Wang 1
Thanawin Rakthanmanon 1
Jesin Zakaria 1
Kedar Bellare 1
Brandon Norick 1
Jiawei Han 1
Ming Ji 1
Yuval Elovici 1
Changshui Zhang 1
Ming Lin 1
Sougata Mukherjea 1
Ashwin Ram 1
Zhanpeng Fang 1
Jing Peng 1
Yang Zhou 1
Yandong Liu 1
Joshua Vogelstein 1
Qiaozhu Mei 1
Takeshi Yamada 1
Suresh Iyengar 1
Jiawei Han 1
Ashwin Machanavajjhala 1
Erheng Zhong 1
Wei Fan 1
Xinjiang Lu 1
Qinli Yang 1
Josif Grabocka 1
Nicolas Schilling 1
Xiang Li 1
David Aha 1
Richard Xu 1
Dengyong Zhou 1
Ming Zhang 1
Biru Dai 1
Divesh Srivastava 1
Hungleng Chen 1
Zhenjie Zhang 1
Liang Hong 1
Hunghsuan Chen 1
Venu Satuluri 1
Rose Yu 1
Yan Liu 1
Yao Zhang 1
Aisling Kelliher 1
Paul Castro 1
Lian Duan 1
Bruno Ribeiro 1
Siyuan Liu 1
Anon Plangprasopchok 1
Shengrui Wang 1
Ganesh Ramesh 1
A Patterson 1
Patrick Hung 1
Manolis Kellis 1
Carlos Castillo 1
Amit Dhurandhar 1
Tianbing Xu 1
Sanmay Das 1
Beechung Chen 1
Fedja Hadzic 1
Elizabeth Chang 1
Aminul Islam 1
Feiyu Xiong 1
Guoqiang Zhang 1
Fei Wang 1
Shiqiang Tao 1
Weekeong Ng 1
Sethuraman Panchanathan 1
Michael Mampaey 1
Yu Lei 1
Li Wan 1
Haojun Zhang 1
Limsoon Wong 1
Maria Sapino 1
Shipeng Yu 1
Zhiting Hu 1
Pedro Melo 1
Yuan Jiang 1
Qinbao Song 1
Michele Coscia 1
Matteo Riondato 1
Yi Wang 1
Charles Elkan 1
Jaideep Srivastava 1
João Gama 1
Carlos Guestrin 1
Tomoharu Iwata 1
Naonori Ueda 1
Qi Lou 1
Wei Fan 1
Xifeng Yan 1
Julian McAuley 1
Kosuke Hashimoto 1
Nobuhisa Ueda 1
Jie Tang 1
Aparna Varde 1
Ricardo Campello 1
Bertil Schmidt 1
Haiqin Yang 1
Shuhui Wang 1
Yi Yang 1
Quanzeng You 1
Tao Mei 1
Pedro Vaz De Melo 1
Jeffrey Chan 1
Michael Houle 1
Dimitrios Gunopulos 1
Daxin Jiang 1
Muna Al-Razgan 1
Lei Zhang 1
Mohsen Bayati 1
Raymond Wong 1
Ada Fu 1
Peilin Zhao 1
Li Zheng 1
Noman Mohammed 1
Jaideep Vaidya 1
Chao Liu 1
Collin Stultz 1

Affiliation Paper Counts
University of Rochester 1
Naval Research Laboratory 1
Stevens Institute of Technology 1
Jerusalem College of Technology 1
University of California, Los Angeles 1
National Taiwan University of Science and Technology 1
Oracle Corporation 1
Lanzhou University 1
Northeastern University 1
Research Organization of Information and Systems National Institute of Informatics 1
University of Malaya 1
University of Milan 1
Temple University 1
Syracuse University 1
University of Queensland 1
Curtin University of Technology, Perth 1
US Naval Academy 1
University of Roma La Sapienza 1
University of New Mexico 1
Saarland University 1
Institute of Mathematics and Informatics Lithuanian 1, Inc. 1
Harvard School of Engineering and Applied Sciences 1
Ariel University Center of Samaria 1
Siemens USA 1
Microsoft Research Asia 1
Innopolis University 1
Ryukoku University 1
National Research Institute of Science and Technology for Environment and Agriculture 1
University of Michigan 1
Anhui University 1
University of Ontario Institute of Technology 1
Universite de Cergy-Pontoise 1
Princeton University 1
Queens College, City University of New York 1
University of Arkansas - Fayetteville 1
Yale University 1
University of Auckland 1
University of Missouri-Columbia 1
John F. Kennedy School of Government 1
The University of North Carolina at Charlotte 1
University of South Florida Tampa 1
Valley Laboratory 1
University of Salford 1
Hong Kong Polytechnic University 1
Australian National University 1
University of Texas at Dallas 1
University of Vermont 1
University of Arizona 1
Nanjing University of Science and Technology 1
Washington University in St. Louis 1
HP Labs 1
Universidad Politecnica de Valencia 1
BBN Technologies 1
Air Force Research Laboratory Information Directorate 1
University of Shizuoka 1
MITRE Corporation 1
Norwegian University of Science and Technology 1
Indian Institute of Science 1
Zhejiang Wanli University 1
Aston University 1
Colorado School of Mines 1
University of Southern California, Information Sciences Institute 1
John Carroll University 1
Radboud University Nijmegen 1
Brigham and Women's Hospital 1
University of Toronto 1
De Montfort University 1
Wright State University 1
Singapore Management University 1
Air Force Research Laboratory 1
Universite Montpellier 2 Sciences et Techniques 1
Nanjing University of Aeronautics and Astronautics 1
University of Florence 1
University of Connecticut 1
Industrial Technology Research Institute of Taiwan 1
Hong Kong Red Cross Blood Transfusion Service 1
Nokia USA 1
Universite Claude Bernard Lyon 1 1
Lancaster University 1
Osaka University 1
University of Iowa 1
National University of Defense Technology China 1
Missouri University of Science and Technology 1
University of California, Berkeley 1
Shanghai Jiaotong University 1
Wright-Patterson AFB 1
Eli Lilly and Company 1
Swiss Federal Institute of Technology, Zurich 1
Lawrence Livermore National Laboratory 1
Max Planck Institute for Informatics 2
Hefei University of Technology 2
Zhejiang University 2
Institute of High Performance Computing, Singapore 2
Johns Hopkins University 2
University of Electronic Science and Technology of China 2
Tel Aviv University 2
University of Minnesota System 2
University of Houston 2
The University of Hong Kong 2
Brigham Young University 2
Brown University 2
Montclair State University 2
Hong Kong Baptist University 2
Renmin University of China 2
University of California, Davis 2
Drexel University 2
University of Texas M. D. Anderson Cancer Center 2
University of Kansas Lawrence 2
Academia Sinica Taiwan 2
University of Quebec in Outaouais 2
Institute for Systems and Computer Engineering of Porto 2
University of Virginia 2
University of Massachusetts Boston 2
University of Tokyo 2
University of Michigan Ann Arbor 2
Nokia 2
University of Ottawa, Canada 2
University of Athens 2
IBM Zurich Research Laboratory 2
Kent State University 2
University of California, San Diego 2
Istituto di Scienza e Tecnologie dell'Informazione A. Faedo 2
Qatar Computing Research Institute 2
International Institute of Information Technology Hyderabad 3
Shandong University of Science and Technology 3
Bar-Ilan University 3
University of Hildesheim 3
Rice University 3
University of Pennsylvania 3
University of California, Irvine 3
University of Sao Paulo 3
The University of British Columbia 3
University of Kentucky 3
INSA Lyon 3
George Mason University 3
Institute of Automation Chinese Academy of Sciences 3
Xerox Corporation 3
Chinese Academy of Sciences 3
Binghamton University State University of New York 3
Italian National Research Council 3
University of Sydney 3
Microsoft 3
University of Melbourne 3
University of California, Santa Barbara 3
Wuhan University 3
University of Southern California 3
University of Alberta 3
Johannes Gutenberg University Mainz 3
Emory University 4
Institute for Infocomm Research, A-Star, Singapore 4
Google Inc. 4
Brookhaven National Laboratory 4
Universitat Politecnica de Catalunya 4
The University of Western Ontario 4
IBM Research 4
University of Antwerp 4
National University of Singapore 4
Athens University of Economics and Business 4
Monash University 4
Boston University 4
Massachusetts Institute of Technology 4
Ben-Gurion University of the Negev 4
Sharif University of Technology 4
University of Pisa 4
Rutgers, The State University of New Jersey 4
Yahoo Research Barcelona 4
Aalto University 4
Case Western Reserve University 5
Rutgers University-Newark Campus 5
University of Texas at San Antonio 5
Ohio State University 5
Dalian University of Technology 5
Purdue University 5
Kyoto University 5
University of Turin 5
Oregon State University 5
Sandia National Laboratories 5
New Jersey Institute of Technology 5
The University of North Carolina at Chapel Hill 5
Pennsylvania State University 6
Delft University of Technology 6
AT&T Laboratories Florham Park 6
University of Massachusetts Amherst 6
Nippon Telegraph & Telephone 6
Ludwig Maximilian University of Munich 6
University of Minnesota Twin Cities 6
Yahoo Inc. 6
University of Florida 7
University of Maryland 7
Federal University of Minas Gerais 7
University of Science and Technology of China 7
Peking University 7
Microsoft Research 7
University of Wisconsin Madison 7
University of California, Riverside 7
Stanford University 8
Virginia Tech 8
Nanjing University 8
Nanyang Technological University 8
Xi'an Jiaotong University 8
Stony Brook University 8
University of Texas at Austin 8
Yahoo Research Labs 8
Georgia Institute of Technology 8
IBM Thomas J. Watson Research Center 8
Florida International University 9
University of Illinois at Chicago 10
National Taiwan University 10
Hong Kong University of Science and Technology 10
Rensselaer Polytechnic Institute 11
University of Technology Sydney 12
University of Calabria 12
Cornell University 12
Northwestern Polytechnical University China 12
Simon Fraser University 12
NEC Laboratories America, Inc. 15
University of Texas at Arlington 15
University of Illinois at Urbana-Champaign 16
Chinese University of Hong Kong 19
Tsinghua University 21
Carnegie Mellon University 32
Arizona State University 45

ACM Transactions on Knowledge Discovery from Data (TKDD) Archive


Volume 11 Issue 1, August 2016
Volume 10 Issue 4, July 2016 Special Issue on SIGKDD 2014, Special Issue on BIGCHAT and Regular Papers
Volume 10 Issue 3, February 2016


Volume 10 Issue 2, October 2015
Volume 10 Issue 1, July 2015
Volume 9 Issue 4, June 2015
Volume 9 Issue 3, April 2015 TKDD Special Issue (SIGKDD'13)


Volume 9 Issue 2, November 2014
Volume 9 Issue 1, October 2014
Volume 8 Issue 4, October 2014
Volume 8 Issue 3, June 2014
Volume 8 Issue 2, June 2014
Volume 8 Issue 1, February 2014 Casin special issue


Volume 7 Issue 4, November 2013
Volume 7 Issue 3, September 2013 Special Issue on ACM SIGKDD 2012
Volume 7 Issue 2, July 2013
Volume 7 Issue 1, March 2013


Volume 6 Issue 4, December 2012 Special Issue on the Best of SIGKDD 2011
Volume 6 Issue 3, October 2012
Volume 6 Issue 2, July 2012
Volume 6 Issue 1, March 2012
Volume 5 Issue 4, February 2012


Volume 5 Issue 3, August 2011
Volume 5 Issue 2, February 2011


Volume 5 Issue 1, December 2010
Volume 4 Issue 4, October 2010
Volume 4 Issue 3, October 2010
Volume 4 Issue 2, May 2010
Volume 4 Issue 1, January 2010


Volume 3 Issue 4, November 2009
Volume 3 Issue 3, July 2009
Volume 3 Issue 2, April 2009
Volume 3 Issue 1, March 2009
Volume 2 Issue 4, January 2009


Volume 2 Issue 3, October 2008
Volume 2 Issue 2, July 2008
Volume 2 Issue 1, March 2008
Volume 1 Issue 4, January 2008


Volume 1 Issue 3, December 2007
Volume 1 Issue 2, August 2007
Volume 1 Issue 1, March 2007