ACM Transactions on Knowledge Discovery from Data (TKDD)

Latest Articles

Guest Editorial

Kernelized Information-Theoretic Metric Learning for Cancer Diagnosis Using High-Dimensional Molecular Profiling Data

With the advancement of genome-wide monitoring technologies, molecular expression data have become...

Jointly Modeling Label and Feature Heterogeneity in Medical Informatics

Multiple types of heterogeneity including label heterogeneity and feature heterogeneity often co-exist in many real-world data mining applications,...

Mining Dual Networks

Finding the densest subgraph in a single graph is a fundamental problem that has been extensively studied. In many emerging applications, there exist dual networks. For example, in genetics, it is important to use protein interactions to interpret genetic interactions. In this application, one network represents physical interactions among nodes,...

Biomedical Ontology Quality Assurance Using a Big Data Approach

This article presents recent progress made in using a scalable cloud computing environment, Hadoop and MapReduce, to perform ontology quality...

Less is More

Ensemble learning for anomaly detection has been barely studied, due to the difficulty of acquiring ground truth and the lack of inherent objective functions. In contrast, ensemble approaches for classification and clustering have long been studied and used effectively. Our work taps into this gap and builds a new ensemble approach for anomaly...

Co-Clustering Structural Temporal Data with Applications to Semiconductor Manufacturing

Recent years have witnessed data explosion in semiconductor manufacturing due to advances in instrumentation and storage techniques. The large amount...

Unsupervised Rare Pattern Mining

Association rule mining was first introduced to examine patterns among frequent items. The original motivation for seeking these rules arose from the need to examine customer purchasing behaviour in supermarket transaction data. It seeks to identify combinations of items, or itemsets, whose presence in a transaction affects the likelihood of the...


Multi-view graph clustering aims to enhance clustering performance by integrating heterogeneous information collected in different domains. Each domain provides a different view of the data instances. Leveraging cross-domain information has been demonstrated to be an effective way to achieve better clustering results. Despite the previous success,...

Featuring, Detecting, and Visualizing Human Sentiment in Chinese Micro-Blog

Micro-blogs have been increasingly used by the public to express their opinions, and by organizations to detect public sentiment about social events...


New options for ACM authors to manage rights and permissions for their work

ACM introduces a new publishing license agreement, an updated copyright transfer agreement, and a new author-pays option which allows for perpetual open access through the ACM Digital Library. For more information, visit the ACM Author Rights webpage.

About TKDD 

The ACM Transactions on Knowledge Discovery from Data (TKDD) publishes original archival papers in the area of knowledge discovery from data and closely related disciplines. The majority of the papers that appear in TKDD are expected to address the logical and technical foundation of knowledge discovery and data mining.

Forthcoming Articles
Sampling for Nyström Extension based Spectral Clustering: Incremental Perspective and Novel Analysis

Sampling is the key aspect of Nyström extension based spectral clustering. Traditional sampling schemes select the set of landmark points as a whole and focus on how to lower the matrix approximation error. However, matrix approximation error does not have a direct impact on clustering performance. In this paper, we propose a sampling framework from an incremental perspective, i.e., the landmark points are selected one by one, and each next point to be sampled is determined by the previously selected landmark points. Incremental sampling builds explicit relationships among landmark points, so they work together well and provide a theoretical guarantee on clustering performance. We provide two novel analysis methods and propose two schemes for the selecting-the-next-one step of the framework. The first scheme is based on clusterability analysis, which provides a better guarantee on clustering performance than schemes based on matrix approximation error analysis. The second scheme is based on loss analysis, which maximizes the predictive ability of the landmark points on the (implicit) labels of the unsampled points. Experimental results on a wide range of benchmark datasets demonstrate the superiority of our proposed incremental sampling schemes over existing sampling schemes.

Shop Type Recommendation Leveraging the Data from Social Media and Location-based Services

It is an important yet challenging task for investors to determine the most suitable type of shop (e.g., restaurant, fashion, etc.) for a newly opened store. Traditional ways are predominantly field surveys and empirical estimation, which are not effective as they lack shop-related data. As social media and location-based services (LBS) are becoming more and more pervasive, user-generated data from these platforms is providing rich information not only about individual consumption experiences but also about shop attributes. In this paper, we investigate the recommendation of shop types for a given location, by leveraging heterogeneous data that are mainly historical user preferences and location context from social media and LBS. Our goal is to select the most suitable shop type, seeking to maximize the number of customers served from a candidate set of types. We propose a novel bias learning matrix factorization method with feature fusion for shop popularity prediction. Features are defined and extracted from two perspectives: location, where features are closely related to location characteristics, and commercial, where features are about the relationships between shops in the neighborhood. Experimental results show that the proposed method outperforms state-of-the-art solutions.

Listwise Learning to Rank from Crowds

Learning to rank has received great attention in recent years as it plays a crucial role in many applications such as information retrieval and data mining. The existing concept of learning to rank assumes that each training instance is associated with a reliable label. However, in practice, this assumption does not necessarily hold, as it may be infeasible or remarkably expensive to obtain reliable labels for many learning to rank applications. Therefore, a feasible approach is to collect labels from crowds and then learn a ranking function from the crowdsourced labels. This study explores listwise learning to rank with crowdsourced labels obtained from multiple annotators, who may be unreliable. A new probabilistic ranking model is first proposed by combining two existing models. Subsequently, a ranking function is trained with a proposed maximum likelihood learning approach, which estimates ground-truth labels and annotator expertise, and learns the ranking function iteratively. In practical crowdsourced machine learning, valuable prior information (e.g., professional grades) about the annotators is normally attainable. Therefore, this study also investigates learning to rank from crowd labels when prior information on the expertise of the involved annotators is available. In particular, three basic types of prior information are investigated, and corresponding learning algorithms are introduced. The proposed algorithms are tested on both synthetic and real-world data. Results reveal that the maximum likelihood approach significantly outperforms the average approach, and its results are comparable to those of the learning model that uses reliable labels. The results of the investigation further indicate that prior information is helpful in inferring both ranking functions and expertise degrees of annotators.

Heterogeneous Translated Hashing: A Scalable Solution towards Multi-modal Similarity Search

Multi-modal similarity search has attracted considerable attention to meet the need for information retrieval across different types of media. To enable efficient multi-modal similarity search in large-scale databases, researchers have recently started to study multi-modal hashing. Most of the existing methods are applied to search across multiple views among which explicit correspondence is provided. Given a multi-modal similarity search task, we observe that abundant multi-view data can be found on the Web which can serve as an auxiliary bridge. In this paper, we propose a Heterogeneous Translated Hashing (HTH) method that incorporates such an auxiliary bridge, not only to improve current multi-view search but also to enable similarity search across heterogeneous media which have no direct correspondence. HTH provides more flexible and discriminative ability by embedding heterogeneous media into different Hamming spaces, compared to almost all existing methods, which map heterogeneous data into a common Hamming space. We formulate a joint optimization model to learn hash functions embedding heterogeneous media into different Hamming spaces, and a translator aligning different Hamming spaces. Extensive experiments on two real-world datasets, one publicly available dataset of Flickr and the other a MIRFLICKR-Yahoo Answers dataset, highlight the effectiveness and efficiency of our algorithm.

Introduction to the Special Issue of Best Papers in ACM SIGKDD 2014

Scalable Clustering by Iterative Partitioning and Point Attractor Representation

Clustering very large data sets while preserving cluster quality remains a challenging data mining task to date. In this paper, we propose an effective scalable clustering algorithm for large data sets that builds upon the concept of synchronization. Inherited from the powerful concept of synchronization, the proposed algorithm, CIPA (Clustering by Iterative Partitioning and Point Attractor Representations), is capable of handling very large data sets by iteratively partitioning them into thousands of subsets and clustering each subset separately. Using dynamic clustering by synchronization, each subset is then represented by a set of point attractors and outliers. Finally, CIPA identifies the cluster structure of the original data set by clustering the newly generated data set consisting of point attractors and outliers from all subsets. We demonstrate that our new scalable clustering approach has several attractive benefits: (a) CIPA faithfully captures the cluster structure of the original data by performing clustering on each separate data subset iteratively instead of using any sampling or statistical summarization technique. (b) It allows clustering very large data sets efficiently with high cluster quality. (c) CIPA is parallelizable and also suitable for distributed data. Extensive experiments demonstrate the effectiveness and efficiency of our approach.

Solving Inverse Frequent Itemset Mining with Infrequency Constraints via Large-Scale Linear Programs

Efficient Discovery of Association Rules and Frequent Itemsets through Sampling with Tight Performance Guarantees

Fast Sampling for Time-Varying Determinantal Point Processes

A Determinantal Point Process (DPP) is a stochastic model that assigns each subset of a base dataset a probability proportional to the subset's degree of diversity. DPPs have been shown to be particularly appropriate for subset selection and summarization of data (e.g., news, videos), where diversity among the selected items is preferred but conventional models cannot provide it. However, DPP inference algorithms have polynomial time complexity, which makes it difficult to handle large and time-varying datasets, especially when real-time processing is required. To address this limitation, we developed a fast sampling algorithm for DPPs that takes advantage of the nature of some time-varying data, where the changes in the data between successive time stamps are relatively small, such as updating news corpora and evolving communication networks. The proposed algorithm builds on a simplification of marginal density functions over successive time stamps and the sequential Monte Carlo (SMC) sampling technique. Evaluations on both a real-world news dataset and the Enron corpus confirm the efficiency of the proposed algorithm.
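
As an illustration of how a DPP favors diversity, the sketch below uses the standard L-ensemble form, in which the unnormalized weight of a subset is the determinant of the kernel matrix restricted to that subset. It is a minimal sketch and not the fast sampling algorithm from the abstract; the kernel values and the helper names `det` and `subset_weight` are assumptions made for this example.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    n = len(m)
    if n == 0:
        return 1.0  # determinant of the empty matrix, used for the empty subset
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        # Minor: drop the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def subset_weight(kernel, subset):
    """Unnormalized DPP weight of a subset: det of the kernel restricted to it."""
    sub = [[kernel[i][j] for j in subset] for i in subset]
    return det(sub)


# Hypothetical similarity kernel: items 0 and 1 are near-duplicates, item 2 is different.
L = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
similar_pair = subset_weight(L, [0, 1])  # about 0.19: near-duplicates are penalized
diverse_pair = subset_weight(L, [0, 2])  # about 0.99: dissimilar items are favored
```

Dividing each weight by the sum over all subsets would turn these values into probabilities; the Laplace expansion used here is exponential in the matrix size and is only meant for toy inputs.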

Leveraging Neighbor Attributes for Classification in Sparsely-Labeled Networks

Many analysis tasks involve linked nodes, such as people connected by friendship links. Research on "link-based classification" (LBC) has studied how to leverage these connections to improve classification accuracy. Most such prior research has assumed the provision of a densely-labeled training network. Instead, this article studies the common and challenging case when LBC must use a single sparsely-labeled network for both learning and inference, a case where existing methods often yield poor accuracy. To address this challenge, we introduce a novel method that enables prediction via "neighbor attributes," which were briefly considered by early LBC work but then abandoned due to perceived problems. We then explain, using both extensive experiments and loss decomposition analysis, how using neighbor attributes often significantly improves accuracy. We further show that using appropriate semi-supervised learning (SSL) is essential to obtaining the best accuracy in this domain, and that the gains of neighbor attributes remain across a range of SSL choices and data conditions. Finally, given the challenges of label sparsity for LBC and the impact of neighbor attributes, we show that multiple previous studies must be re-considered, including studies regarding the best model features, the impact of noisy attributes, and strategies for active learning.

Product Selection Problem: Improve Market Share by Learning Consumer Behavior

It is often crucial for manufacturers to decide what products to produce so that they can increase their market share in an increasingly fierce market. To decide which products to produce, manufacturers need to analyze the consumers' requirements and how consumers make their purchase decisions so that the new products will be competitive in the market. In this paper, we first present a general distance-based product adoption model to capture consumers' purchase behavior. Using this model, various distance metrics can be used to describe different real-life purchase behavior. We then provide a learning algorithm to decide which set of distance metrics one should use when we are given some accessible historical purchase data. Based on the product adoption model, we formalize the k most marketable products (or k-MMP) selection problem and formally prove that the problem is NP-hard. To tackle this problem, we propose an efficient greedy-based approximation algorithm with a provable solution guarantee. Using submodularity analysis, we prove that our approximation algorithm can achieve at least 63% of the optimal solution. We apply our algorithm on both synthetic datasets and real-world datasets, and show that our algorithm can easily achieve five or more orders of magnitude of speedup over the exhaustive search and achieve about 96% of the optimal solution on average. Our experiments also demonstrate the robustness of our distance metric learning method, and illustrate how one can adopt it to improve the accuracy of product selection.
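
The 63% figure is consistent with the classical 1 - 1/e (about 0.632) bound for greedy maximization of a monotone submodular function under a cardinality constraint. The sketch below shows that greedy rule on maximum coverage, used here as a stand-in objective; it is not the k-MMP objective or the algorithm from the paper, and the function name `greedy_max_coverage` and the toy data are assumptions.

```python
import math


def greedy_max_coverage(candidates, k):
    """Pick up to k candidate sets greedily, maximizing the number of covered items.

    Coverage is monotone and submodular, so this greedy rule reaches at least
    (1 - 1/e) of the optimal coverage. Stand-in objective, not k-MMP itself.
    """
    chosen = []
    covered = set()
    remaining = dict(candidates)  # name -> set of items (hypothetical layout)
    for _ in range(k):
        # Candidate with the largest marginal gain given what is already covered.
        best = max(remaining, key=lambda name: len(remaining[name] - covered), default=None)
        if best is None or not (remaining[best] - covered):
            break  # nothing left that adds coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Hypothetical example: each "product" maps to the set of consumers it would win.
products = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5},
    "C": {5, 6},
    "D": {1, 6},
}
chosen, covered = greedy_max_coverage(products, k=2)
guarantee = 1 - 1 / math.e  # about 0.632, matching the "63%" in the abstract
```

In this toy instance the greedy choice picks "A" first and then "C", because "C" adds two uncovered consumers while "B" and "D" add one each.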

The Convergence Behavior of Naive Bayes on Large Sparse Datasets

Large and sparse datasets with a lot of missing values are common in the big data era, such as user behaviors over a large number of items. Classification in such datasets is an important topic for machine learning and data mining. In practice, naive Bayes is still a popular classification algorithm for large sparse datasets, as its time and space complexity scale linearly with the number of non-missing values. However, several important questions about the behavior of naive Bayes are yet to be answered. For example, how do different missing-data mechanisms, data sparsity, and the number of attributes systematically affect the learning curves and convergence? In this paper, we address several common missing-data mechanisms and propose novel data generation methods based on these mechanisms. We generate large and sparse data systematically, and study the entire AUC (Area Under ROC Curve) learning curve and convergence behavior of naive Bayes. We not only report several important experimental observations, but also provide detailed theoretical studies. Our empirical and theoretical results provide a useful guideline for classifying large sparse datasets with naive Bayes.

Catching Synchronized Behaviors in Large Networks: A Graph Mining Approach

Given a directed graph of millions of nodes, how can we automatically spot anomalous, suspicious nodes, judging only from their connectivity patterns? Suspicious graph patterns show up in many applications, from Twitter users who buy fake followers, manipulating the social network, to botnet members performing distributed denial of service attacks, disturbing the network traffic graph. We propose a fast and effective method, CATCHSYNC, which exploits two of the tell-tale signs left in graphs by fraudsters: (a) synchronized behavior: suspicious nodes have extremely similar behavior patterns, because they are often required to perform some task together (such as following the same user); and (b) rare behavior: their connectivity patterns are very different from the majority. We introduce novel measures to quantify both concepts (synchronicity and normality) and we propose a parameter-free algorithm that works on the resulting synchronicity-normality plots. Thanks to careful design, CATCHSYNC has the following desirable properties: (a) it is scalable to large datasets, being linear in the graph size; (b) it is parameter free; and (c) it is side-information-oblivious: it can operate using only the topology, without needing labeled data or timing information, while still being capable of using side information, if available. We applied CATCHSYNC on three large, real datasets (a 1-billion-edge Twitter social graph, and 3-billion-edge and 12-billion-edge Tencent Weibo social graphs) and several synthetic ones; CATCHSYNC consistently outperforms existing competitors, both in detection accuracy, by 36% on Twitter and 20% on Tencent Weibo, and in speed.

Parallel Field Ranking

Inferring Dynamic Diffusion Networks in Online Media

Online media play an important role in information societies by providing a convenient infrastructure for different processes. Information diffusion, a fundamental process taking place on social and information networks, has been investigated in many studies. Research on information diffusion in these networks faces two main challenges: 1) In most cases diffusion takes place on an underlying network which is latent and whose structure is unknown. 2) This latent network is not fixed and changes over time. In this paper, we investigate the diffusion network extraction problem when the underlying network is dynamic and latent. We model the diffusion behavior (existence probability) of each edge as a stochastic process and utilize the Hidden Markov Model to discover the most probable diffusion links according to the current observation of the diffusion process, which is the infection time of nodes and the past diffusion behavior of links. We evaluate the performance of our Dynamic Diffusion Network Extraction (DDNE) method on both synthetic and real datasets. Experimental results show that the performance of the proposed method is independent of the cascade transmission model and that it outperforms the state-of-the-art method in terms of F-measure.

Latent Time-Series Motifs

Motifs are the most repetitive/frequent patterns of a time-series. The discovery of motifs is crucial for practitioners in order to understand and interpret the phenomena occurring in sequential data. Currently, motifs are searched among series sub-sequences, aiming at selecting the most frequently occurring ones. Search-based methods, which try out series sub-sequences as motif candidates, are currently believed to be the best methods for finding the most frequent patterns. However, this paper proposes an entirely new perspective on finding motifs. We demonstrate that searching is non-optimal since the domain of motifs is restricted, and instead we propose a principled optimization approach able to find optimal motifs. We treat the occurrence frequency as a function and time-series motifs as its parameters; we therefore learn the optimal motifs that maximize the frequency function. In contrast to searching, our method is able to discover the most repetitive patterns (hence optimal), even in cases where they do not explicitly occur as sub-sequences. Experiments on several real-life time-series datasets show that the motifs found by our method are considerably more frequent than the ones found through searching, for exactly the same distance threshold.
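
To make the idea of frequency as a function of the motif concrete, the sketch below counts how many sliding windows of a series lie within a distance threshold of a candidate motif, and shows a toy case where a pattern that never occurs literally is more frequent than every observed sub-sequence. The hard count, the Euclidean distance, the function name `occurrence_frequency`, and the data are all assumptions for illustration; the abstract does not state the exact frequency function or optimization procedure.

```python
import math


def occurrence_frequency(series, motif, threshold):
    """Count sliding windows of `series` within Euclidean `threshold` of `motif`."""
    m = len(motif)
    count = 0
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - p) ** 2 for w, p in zip(window, motif)))
        if dist <= threshold:
            count += 1
    return count


# Hypothetical series: two similar bumps separated by a spike.
series = [0.2, 1.2, 0.2, 5.0, -0.2, 0.8, -0.2]
threshold = 0.4
m = 3

# Search-based view: only observed sub-sequences are allowed as candidates.
best_subseq_freq = max(
    occurrence_frequency(series, series[s:s + m], threshold)
    for s in range(len(series) - m + 1)
)

# A latent candidate halfway between the two bumps; it never occurs in the series.
latent_motif = [0.0, 1.0, 0.0]
latent_freq = occurrence_frequency(series, latent_motif, threshold)
```

Here each observed bump matches only itself under the threshold, while the latent candidate matches both bumps, which is the kind of gap between searching and optimizing that the abstract describes.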

Distributed Algorithms for Computing Very Large Thresholded Covariance Matrices

Modeling of Geographical Dependencies for Real Estate Ranking

It is traditionally a challenge for home buyers to understand, compare and contrast the investment values of estates. While a number of estate appraisal methods have been developed to value real properties, the performances of these methods have been limited by the traditional data sources for estate appraisal. With the development of new ways of collecting estate-related mobile data, there is a potential to leverage geographic dependencies of estates for enhancing estate appraisal. Indeed, the geographic dependencies of the investment value of an estate can be from the characteristics of its own neighborhood (individual), the values of its nearby estates (peer), and the prosperity of the affiliated latent business area (zone). To this end, in this paper, we propose a geographic method, named ClusRanking, for estate appraisal by leveraging the mutual enforcement of ranking and clustering power. ClusRanking is able to exploit geographic individual, peer, and zone dependencies in a probabilistic ranking model. Specifically, we first extract the geographic utility of estates from geography data, estimate the neighborhood popularity of estates by mining taxicab trajectory data, and model the influence of latent business areas. Also, we fuse these three influential factors and predict real estate investment value. Moreover, we simultaneously consider individual, peer and zone dependencies, and derive an estate-specific ranking likelihood as the objective function. Furthermore, we propose an improved method named CR-ClusRanking by incorporating checkin information as a regularization term which reduces the performance volatility of estate ranking system. Finally, we conduct a comprehensive evaluation with the real estate related data of Beijing, and the experimental results demonstrate the effectiveness of our proposed methods.

Spatial-Proximity Optimization for Rapid Task Group Deployment

Spatial proximity is one of the most important factors for the quick deployment of task groups in various time-sensitive missions. This paper proposes a new spatial query, Spatio-Social Team Query (SSTQ), that forms a strong task group by considering 1) the group's spatial distance (i.e., transportation time), 2) skills of the candidate group members, and 3) social rapport among the candidates. Efficient processing of SSTQ is very challenging, because the aforementioned spatial, skill, and social factors need to be carefully examined. In this paper, therefore, we first formulate two subproblems of SSTQ, namely Hop-Constrained Team Problem (HCTP) and Connection-Oriented Team Query (COTQ). HCTP is a decision problem that considers only social and skill dimensions. We prove that HCTP is NP-Complete. Moreover, based on the hardness of HCTP, we prove that SSTQ is NP-Hard and inapproximable within any factor. On the other hand, COTQ is a special case of SSTQ that relaxes the social constraint. We prove that COTQ is NP-Hard and propose an approximation algorithm for COTQ, COTprox, that achieves the best approximation ratio. Furthermore, based on the observations on COTprox, we devise an approximation algorithm, SSTprox, with a guaranteed error bound for SSTQ. Finally, to efficiently obtain the optimal solution to SSTQ for small instances, we design two efficient algorithms, SpatialFirst and SkillFirst, with different scenarios in mind. These two algorithms incorporate various effective ordering and pruning techniques to reduce the search space for answering SSTQ. Experimental results on real datasets indicate that the proposed algorithms can efficiently answer SSTQ under various parameter settings.

Eigen-Optimization on Large Graphs by Edge Manipulation

Batch Mode Active Sampling based on Marginal Probability Distribution Matching

Convex Sparse PCA for Unsupervised Feature Analysis

Principal component analysis (PCA) has been widely applied to dimensionality reduction and data pre-processing for different applications in engineering, biology and social science. Classical PCA and its variants seek linear projections of the original variables to obtain a low-dimensional feature representation with maximal variance. One limitation is that it is very difficult to interpret the results of PCA. In addition, classical PCA is vulnerable to certain noisy data. In this paper, we propose a convex sparse principal component analysis (CSPCA) algorithm and apply it to feature analysis. First we show that PCA can be formulated as a low-rank regression optimization problem. Based on this discussion, $l_{2,1}$-norm minimization is incorporated into the objective function to make the regression coefficients sparse and thereby robust to outliers. In addition, based on the sparse model used in CSPCA, an optimal weight is assigned to each of the original features, which in turn provides the output with good interpretability. With the output of our CSPCA, we can effectively analyze the importance of each feature under the PCA criteria. The objective function is convex, and we propose an iterative algorithm to optimize it. We apply the CSPCA algorithm to feature selection and conduct extensive experiments on six different benchmark datasets. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art unsupervised feature selection algorithms.
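
For readers unfamiliar with the $l_{2,1}$-norm, the sketch below computes it under the row-wise convention common in feature selection: the sum of the Euclidean norms of the rows of a coefficient matrix. Penalizing this quantity tends to drive whole rows to zero, so each row norm can be read as a per-feature weight. The matrix values and the function names `l21_norm` and `row_importance` are assumptions for illustration, and the sketch does not reproduce the CSPCA objective or its optimization algorithm.

```python
import math


def row_importance(W):
    """Euclidean norm of each row; a zero row means that feature is unused."""
    return [math.sqrt(sum(x * x for x in row)) for row in W]


def l21_norm(W):
    """l_{2,1} norm under the row-wise convention: sum of the row norms."""
    return sum(row_importance(W))


# Hypothetical coefficient matrix with one row per original feature.
W = [
    [3.0, 4.0],  # row norm 5
    [0.0, 0.0],  # row norm 0: this feature is dropped entirely
    [6.0, 8.0],  # row norm 10
]
weights = row_importance(W)
total = l21_norm(W)
```

Ranking features by `weights` gives a simple importance ordering, which is the kind of interpretability the abstract attributes to the sparse model.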

A Framework for Exploiting Local Information to Enhance Density Estimation of Data Streams

Differentially-Private Multidimensional Data Publishing

Top Downloaded Articles

L-diversity: Privacy beyond k-anonymity

Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering

Addressing Big Data Time Series: Mining Trillions of Time Series Subsequences Under Dynamic Time Warping

Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection

Socializing by Gaming: Revealing Social Relationships in Multiplayer Online Games

Multimodal Data Mining in a Multimedia Database Based on Structured Max Margin Learning

User Identification Across Social Media

Mining User Development Signals for Online Community Churner Detection

Mining Influencers Using Information Flows in Social Streams

Mining Product Adopter Information from Online Reviews for Improving Product Recommendation


Publication Years 2007-2016
Publication Count 240
Citation Count 1893
Available for Download 240
Downloads (6 weeks) 3126
Downloads (12 Months) 29079
Downloads (cumulative) 166346
Average downloads per article 693
Average citations per article 8
First Name Last Name Award
John Canny ACM Doctoral Dissertation Award (1987)
Carlos A. Castillo ACM Senior Member (2014)
Chris Clifton ACM Senior Member (2006)
Graham R. Cormode ACM Distinguished Member (2013)
Benjamin C. M. Fung ACM Senior Member (2013)
John E Hopcroft ACM Karl V. Karlstrom Outstanding Educator Award (2008)
ACM A. M. Turing Award (1986)
Piotr Indyk ACM Paris Kanellakis Theory and Practice Award (2012)
Jon Kleinberg ACM AAAI Allen Newell Award (2014)
ACM-Infosys Foundation Award in the Computing Sciences (2008)
Chih-Jen Lin ACM Distinguished Member (2011)
ACM Senior Member (2010)
Sethuraman Panchanathan ACM Senior Member (2009)
Jian Pei ACM Senior Member (2007)
Domenico Sacca ACM Senior Member (2007)
Qiang Yang ACM Distinguished Member (2011)

First Name Last Name Paper Counts
Christos Faloutsos 10
Jieping Ye 7
Tao Li 5
Philip Yu 4
Jian Pei 4
Shenghuo Zhu 4
Aristides Gionis 4
Huan Liu 4
John Hopcroft 3
Lise Getoor 3
Jure Leskovec 3
Hui Xiong 3
Malik Magdon-Ismail 3
Jilles Vreeken 3
Zhihua Zhou 3
John Lui 3
Lei Tang 3
Heng Huang 3
Yun Chi 3
Yasushi Sakurai 3
Evimaria Terzi 3
Yihong Gong 3
Feiping Nie 3
Hong Cheng 3
Dingding Wang 3
Fabio Fassetti 3
Christopher Jermaine 3
Fabrizio Angiulli 3
Antonella Guzzo 2
Shinjae Yoo 2
Andrea Esuli 2
Jilei Tian 2
Ping Luo 2
Yuru Lin 2
Ian Davidson 2
Guofei Jiang 2
Jiawei Han 2
Daniel Kifer 2
Antônio Loureiro 2
Laks Lakshmanan 2
Jimeng Sun 2
Rita Chattopadhyay 2
Sucheta Soundarajan 2
Don Towsley 2
Enhong CHEN 2
Qi Liu 2
Belle Tseng 2
Mingsyan Chen 2
Vivekanand Gopalkrishnan 2
Jie Tang 2
Xiaoli Fern 2
Eugene Agichtein 2
Charalampos Tsourakakis 2
Dantong Yu 2
Carlotta Domeniconi 2
Sanjay Ranka 2
Jiliang Tang 2
Srinivasan Parthasarathy 2
Hari Sundaram 2
Spiros Papadimitriou 2
Joydeep Ghosh 2
Dino Pedreschi 2
Hong Qin 2
Hao Huang 2
Yehuda Koren 2
Pinghui Wang 2
Heikki Mannila 2
Panayiotis Tsaparas 2
Jianhui Chen 2
Fabrizio Sebastiani 2
Yu Zhang 2
Arthur Zimek 2
Michalis Vazirgiannis 2
Jin Huang 2
Geoffrey Webb 2
Panagis Magdalinos 2
Indrajit Bhattacharya 2
Xiao Yu 2
Junzhou Zhao 2
Xiaohong Guan 2
Yu Lei 1
Maria Sapino 1
Shipeng Yu 1
Zhiting Hu 1
Pedro Melo 1
Yuan Jiang 1
Matteo Riondato 1
Qinbao Song 1
Michele Coscia 1
Yi Wang 1
Lian Duan 1
Siyuan Liu 1
Bruno Ribeiro 1
Charles Elkan 1
Jaideep Srivastava 1
João Gama 1
Julian McAuley 1
Carlos Guestrin 1
Tomoharu Iwata 1
Naonori Ueda 1
Qi Lou 1
Wei Fan 1
Xifeng Yan 1
Kosuke Hashimoto 1
Nobuhisa Ueda 1
Luigi Pontieri 1
Bingrong Lin 1
Francesco Bonchi 1
Wei Ding 1
Pei Yang 1
Leonid Hrebien 1
Lei Zhang 1
Jie Tang 1
Aparna Varde 1
Haiqin Yang 1
Ricardo Campello 1
Xianchao Zhang 1
Shuhui Wang 1
Pedro Vaz De Melo 1
Michael Houle 1
Jeffrey Chan 1
Dimitrios Gunopulos 1
Daxin Jiang 1
Muna Al-Razgan 1
Mohsen Bayati 1
Peilin Zhao 1
Raymond Wong 1
Ada Fu 1
Noman Mohammed 1
Chao Liu 1
Dacheng Tao 1
Jaideep Vaidya 1
Collin Stultz 1
Boleslaw Szymanski 1
Maguelonne Teisseire 1
Paolo Boldi 1
Lini Thomas 1
Sachindra Joshi 1
Tharam Dillon 1
Yixin Chen 1
Xuanhong Dang 1
Shumo Chu 1
Kasim Candan 1
Yong Ge 1
S Upham 1
Sunil Vadera 1
Thomas Porta 1
Hongzhi Yin 1
Jeffrey Erman 1
Ming Li 1
Joydeep Ghosh 1
Dora Erdős 1
Kaiyuan Zhang 1
Carlos Ordonez 1
Fosca Giannotti 1
James Cheng 1
Li Zheng 1
U Kang 1
Peter Christen 1
Daniel Dunlavy 1
Christos Doulkeridis 1
Joao Duarte 1
David Dominguez-Sal 1
Christo Wilson 1
Ben Zhao 1
Danai Koutra 1
Steven Skiena 1
Hiroshi Motoda 1
Chris Volinsky 1
Andreas Krause 1
Hsiangfu Yu 1
Aditya Parameswaran 1
Binbin Lin 1
Johannes Gehrke 1
Yijuan Lu 1
Feng Liu 1
Yufeng Wang 1
Ernest Garcia 1
Shamkant Navathe 1
Seekiong Ng 1
Hong Xie 1
Kui Yu 1
Cheng Zeng 1
Wei Fan 1
Atreya Srivathsan 1
Tong Sun 1
Rezwan Ahmed 1
Duygu Ucar 1
Wei Wei 1
Mustafa Bilgic 1
Ben Kao 1
David Cheung 1
Christopher Leckie 1
Ron Eyal 1
Avi Rosenfeld 1
Asaf Shabtai 1
Shifeng Weng 1
Kun Liu 1
Duo Zhang 1
Dmitry Pavlov 1
Raymond Ng 1
Piotr Indyk 1
Christopher Carothers 1
Anne Laurent 1
Satyanarayana Valluri 1
Ashish Verma 1
Jérémy Besson 1
Raghu Ramakrishnan 1
Rong Ge 1
Byronju Gao 1
Li Tu 1
Saharon Rosset 1
Claudia Perlich 1
Tuannhon Dang 1
Ramana Kompella 1
Chengkai Li 1
Vasileios Kandylas 1
Limin Yao 1
Kristina Lerman 1
Cheukkwong Lee 1
Olvi Mangasarian 1
Chris Clifton 1
Mohammed Zaki 1
Jennifer Dy 1
Shaojun Wang 1
Loïc Cerf 1
Henry Tan 1
Jennifer Neville 1
Min Ding 1
Gensheng Zhang 1
Yiming Yang 1
Vassilios Vassiliadis 1
Kaiming Ting 1
Christophe Giraud-Carrier 1
Ayan Acharya 1
Sreangsu Acharyya 1
Arnold Boedihardjo 1
Changtien Lu 1
Zhiqiang Xu 1
Geoffrey Barbier 1
Aditya Menon 1
Zhongfei Zhang 1
Matthew Rowe 1
Edward Chang 1
Longjie Li 1
Bruno Abrahão 1
Xiaolin Wang 1
Tingting Gao 1
Kazumi Saito 1
ChengXiang Zhai 1
Dong Xin 1
Christian Böhm 1
Dafna Shahaf 1
Stephen Fienberg 1
Raviv Raich 1
Bilson Campana 1
Vibhor Rastogi 1
Deng Cai 1
T Murali 1
Kiyoko Aoki-Kinoshita 1
Ravi Janardan 1
Sudhir Kumar 1
Salvatore Ruggieri 1
Jing Zhang 1
Rodrigo Alves 1
Juhua Hu 1
Yu Jin 1
Giulio Rossetti 1
Veerabhadran Baladandayuthapani 1
Yanchi Liu 1
Songhua Xu 1
Timothy De Vries 1
Eric Xing 1
Albert Bifet 1
Xiaoming Li 1
Josep Brunat 1
Jiang Bian 1
Padhraic Smyth 1
Claudia Plant 1
Jiayu Pan 1
Brandon Westover 1
Eamonn Keogh 1
Dawei Zhou 1
Hongxia Yang 1
Haoda Fu 1
Jingrui He 1
Fernando Kuipers 1
Dick Epema 1
Linpeng Tang 1
Min Wang 1
Nicholas Sidiropoulos 1
Michael Lyu 1
Dityan Yeung 1
Evangelos Papalexakis 1
George Karypis 1
Jilei Tian 1
Bin Guo 1
Qiang Qu 1
Davoud Moulavi 1
Koji Hino 1
Masaru Kitsuregawa 1
Xiang Zhang 1
Jenwei Huang 1
James Bailey 1
Jianping Zhang 1
Manas Somaiya 1
Graham Cormode 1
Maya Bercovitch 1
Bin Li 1
Antti Ukkonen 1
Francesco Lupia 1
Nima Mirbakhsh 1
Xindong Wu 1
Beilun Wang 1
Siqi Shen 1
Lei Li 1
Tiancheng Lou 1
Guna Seetharaman 1
Xinran He 1
Giacomo Berardi 1
Zhu Wang 1
Xiaotong Zhang 1
Han Liu 1
Kathleen Carley 1
Xiaodan Song 1
Yasuhiro Fujiwara 1
Wei Wang 1
ChienWei Chen 1
Weiyin Loh 1
John Salerno 1
Nitin Kumar 1
Flip Korn 1
Ying Wang 1
Ke Wang 1
Jing Zhang 1
Benoît Dumoulin 1
Xiuyao Song 1
John Gums 1
Yin Zhang 1
Yunxin Zhao 1
Zhongfei Zhang 1
Jude Shavlik 1
Qian Sun 1
Sibel Adalı 1
Xiaohui Lu 1
Domenico Saccà 1
Zheng Wang 1
Bin Cui 1
Chengqi Zhang 1
Juanzi Li 1
Johannes Schneider 1
Patrick Haffner 1
Zhili Zhang 1
Qingyan Yang 1
Scott Burton 1
Christos Boutsidis 1
Marc Maier 1
Mohamed Bouguessa 1
Mingxi Wu 1
John Canny 1
Benjamin Fung 1
Ye Chen 1
Dominique Laurent 1
Yeowwei Choong 1
Meghana Deodhar 1
Luca Becchetti 1
Ying Cui 1
Keli Xiao 1
Bo Long 1
Hans Kriegel 1
Martin Ester 1
Gunjan Gupta 1
Ling Feng 1
Diana Inkpen 1
Kuan Zhang 1
Vetle Torvik 1
Wei Fan 1
Luigi Moccia 1
Edoardo Serra 1
Claudio Schifanella 1
Nesreen Ahmed 1
Min Wang 1
Ali Pınar 1
Shuiwang Ji 1
Michail Vlachos 1
Ling Chen 1
Yang Liu 1
Chunxiao Xing 1
Dechuan Zhan 1
Ruggero Pensa 1
Saurabh Paul 1
Jose Hernández-Orallo 1
Xueying Zhang 1
Rainer Gemulla 1
Eli Upfal 1
Guangtao Wang 1
Yiping Ke 1
William Street 1
Lionel Ni 1
Evrim Acar 1
Yang Zhou 1
Charu Aggarwal 1
Bingsheng Wang 1
Chris Ding 1
Hui Ke 1
Tamara Kolda 1
Jie Wang 1
Yulan He 1
Karthik Subbian 1
Galileo Namata 1
John Frenzel MD 1
Yandong Liu 1
Hua Duan 1
Erheng Zhong 1
Wei Fan 1
Qiang Yang 1
Joshua Vogelstein 1
Qiaozhu Mei 1
Takeshi Yamada 1
Suresh Iyengar 1
Jiawei Han 1
Ashwin Machanavajjhala 1
Saurav Sahay 1
Charles Ling 1
Mengling Feng 1
Jieping Ye 1
Moshe Kam 1
Luming Zhang 1
Lei Zou 1
Jian Wang 1
Ruud Van De Bovenkamp 1
Xiaowen Ding 1
Manos Papagelis 1
Clyde Giles 1
Wei Peng 1
B Prakash 1
Jörg Sander 1
Siyuan Liu 1
Maria Halkidi 1
David Gleich 1
Steven Hoi 1
David Jensen 1
Glenn Fung 1
Zeeshan Syed 1
Kamalakar Karlapalem 1
Dale Schuurmans 1
Dimitrios Mavroeidis 1
Jean Boulicaut 1
Peer Kröger 1
Céline Robardet 1
Pradeep Tamma 1
Zengjian Hu 1
Boaz Ben-Moshe 1
Neil Smalheiser 1
Ori Stitelman 1
James Cheng 1
Shachar Kaufman 1
Nikolaj Tatti 1
Weekeong Ng 1
Leland Wilkinson 1
Jirong Wen 1
Joseph Ruiz MD 1
Ben London 1
Neil Shah 1
Masahiro Kimura 1
Alexander Ihler 1
Kaiwei Chang 1
Forrest Briggs 1
Gustavo Batista 1
Qiang Zhu 1
Philip Yu 1
Jure Leskovec 1
Jon Kleinberg 1
Naren Ramakrishnan 1
Qi Tian 1
Jennifer Neary 1
Minoru Kanehisa 1
Gianluigi Greco 1
Francesco Gullo 1
Guimei Liu 1
Theodoros Lappas 1
Yanjun Qi 1
Adelelu Jia 1
Alexandru Iosup 1
Reza Zafarani 1
Aniket Chakrabarti 1
Saurabh Kataria 1
Irwin King 1
Ling Liu 1
Huilei He 1
Hua Wang 1
Fei Zou 1
Virgílio Almeida 1
Christos Faloutsos 1
Laiwan Chan 1
Nitin Agarwal 1
S Muthukrishnan 1
Anthony Tung 1
Kunta Chuang 1
Sarit Kraus 1
Sigal Sina 1
Chris Ding 1
Lior Rokach 1
Amin Saberi 1
Dityan Yeung 1
Matthew Rattigan 1
José Balcázar 1
Hockhee Ang 1
Steven Hoi 1
Xiao Jiang 1
Lyle Ungar 1
Comandur Seshadhri 1
Franco Turini 1
Luan Tang 1
Quanquan Gu 1
Xintao Wu 1
Nick Duffield 1
Jianyong Wang 1
Chun Li 1
Feitony Liu 1
Petros Drineas 1
Tengfei Bao 1
Brook Wu 1
Sanjay Chawla 1
Jinpeng Wang 1
Arnau Prat-Pérez 1
Josep Larriba-Pey 1
Risa Myers 1
Ruoming Jin 1
Victor Lee 1
Robert Kleinberg 1
Zhi Yang 1
Yafei Dai 1
Qingtian Zeng 1
Brian Gallagher 1
John Hutchins 1
Taneli Mielikäinen 1
Ji Liu 1
Manuel Gomez-Rodriguez 1
Sethuraman Panchanathan 1
Abdullah Mueen 1
Yizhou Sun 1
Xiaofei He 1
Muthuramakrishnan Venkitasubramaniam 1
Ying Jin 1
Hiroshi Mamitsuka 1
Dan Simovici 1
Hao Wang 1
Zhiwen Yu 1
Sitaram Asur 1
Jerry Kiernan 1
Kevin Yip 1
Wei Zheng 1
Zhenxing Wang 1
Yuval Elovici 1
Ming Lin 1
Changshui Zhang 1
Ravi Konuru 1
Fan Guo 1
Edward Wild 1
Murat Kantarcıoğlu 1
John Guttag 1
Marc Plantevit 1
Jinlin Chen 1
Shantanu Godbole 1
Alin Dobra 1
Binay Bhattacharya 1
Bin Zhou 1
Anushka Anand 1
Yicheng Tu 1
Siddharth Gopal 1
Madhav Jha 1
Alice Leung 1
Renato Assunção 1
Subhabrata Sen 1
Dino Ienco 1
Rosa Meo 1
Eduardo Hruschka 1
Hongliang Fei 1
Jun Huan 1
Pauli Miettinen 1
Carlos Garcia-Alvarado 1
Baoxing Huai 1
Hengshu Zhu 1
Pritam Gundecha 1
Lei Chen 1
Ana Appel 1
Jeffreyxu Yu 1
Zhen Guo 1
Yashu Liu 1
Waynexin Zhao 1
Faming Lu 1
Andrew Mehler 1
Stephen North 1
Seungil Huh 1
Chojui Hsieh 1
Chihjen Lin 1
Zheng Wang 1
Thanawin Rakthanmanon 1
Jesin Zakaria 1
Kedar Bellare 1
Brandon Norick 1
Jiawei Han 1
Ming Ji 1
Sougata Mukherjea 1
Ashwin Ram 1
Haojun Zhang 1
Limsoon Wong 1
Feiyu Xiong 1
Liang Hong 1
Zhanpeng Fang 1
Jing Peng 1
Hunghsuan Chen 1
Venu Satuluri 1
Rose Yu 1
Yan Liu 1
Yao Zhang 1
Yang Zhou 1
Xinjiang Lu 1
Dengyong Zhou 1
Ming Zhang 1
Biru Dai 1
Divesh Srivastava 1
Zhenjie Zhang 1
Hungleng Chen 1
Aisling Kelliher 1
Paul Castro 1
Anon Plangprasopchok 1
Shengrui Wang 1
Patrick Hung 1
Ganesh Ramesh 1
A Patterson 1
Manolis Kellis 1
Carlos Castillo 1
Tianbing Xu 1
Sanmay Das 1
Amit Dhurandhar 1
Beechung Chen 1
Fedja Hadzic 1
Elizabeth Chang 1
Aminul Islam 1
Li Wan 1
Weekeong Ng 1
Sethuraman Panchanathan 1
Michael Mampaey 1

Affiliation Paper Counts
University of Milan 1
Syracuse University 1
University of Queensland 1
Curtin University of Technology, Perth 1
University of Roma La Sapienza 1
University of the Saarland 1
Institute of Mathematics and Informatics Lithuanian 1, Inc. 1
Harvard School of Engineering and Applied Sciences 1
Ariel University Center of Samaria 1
Siemens USA 1
Microsoft Research Asia 1
Innopolis University 1
Ryukoku University 1
Cemagref 1
University of Michigan 1
Anhui University 1
University of Ontario Institute of Technology 1
Universite de Cergy-Pontoise 1
Princeton University 1
Queens College, City University of New York 1
University of Arkansas - Fayetteville 1
Yale University 1
University of Missouri-Columbia 1
John F. Kennedy School of Government 1
The University of North Carolina at Charlotte 1
University of South Florida Tampa 1
Valley Laboratory 1
University of Salford 1
Australian National University 1
University of Texas at Dallas 1
University of Vermont 1
Nanjing University of Science and Technology 1
Washington University in St. Louis 1
HP Labs 1
Universidad Politecnica de Valencia 1
BBN Technologies 1
Air Force Research Laboratory Information Directorate 1
University of Shizuoka 1
MITRE Corporation 1
Norwegian University of Science and Technology 1
Indian Institute of Science 1
Zhejiang Wanli University 1
Aston University 1
Colorado School of Mines 1
University of Southern California, Information Sciences Institute 1
John Carroll University 1
Radboud University Nijmegen 1
Brigham and Women's Hospital 1
University of Toronto 1
Wright State University 1
Singapore Management University 1
Air Force Research Laboratory 1
Universite Montpellier 2 Sciences et Techniques 1
Nanjing University of Aeronautics and Astronautics 1
Industrial Technology Research Institute of Taiwan 1
Hong Kong Red Cross Blood Transfusion Service 1
Nokia USA 1
Universite Claude Bernard Lyon 1 1
Lancaster University 1
Osaka University 1
University of Iowa 1
University of California, Berkeley 1
Shanghai Jiaotong University 1
Wright-Patterson AFB 1
Eli Lilly and Company 1
Swiss Federal Institute of Technology, Zurich 1
Lawrence Livermore National Laboratory 1
Stevens Institute of Technology 1
Jerusalem College of Technology 1
National Taiwan University of Science and Technology 1
Oracle Corporation 1
Lanzhou University 1
Northeastern University 1
Research Organization of Information and Systems National Institute of Informatics 1
Kent State University 1
Max Planck Institute for Informatics 2
Hefei University of Technology 2
Zhejiang University 2
Institute of High Performance Computing, Singapore 2
Johns Hopkins University 2
Tel Aviv University 2
University of Minnesota System 2
University of Houston 2
The University of Hong Kong 2
Brigham Young University 2
The University of Western Ontario 2
Brown University 2
Montclair State University 2
Hong Kong Baptist University 2
Renmin University of China 2
University of California, Davis 2
Drexel University 2
University of Texas M. D. Anderson Cancer Center 2
University of Kansas Lawrence 2
Chinese Academy of Sciences 2
University of Quebec in Outaouais 2
Institute for Systems and Computer Engineering of Porto 2
University of Virginia 2
University of Massachusetts Boston 2
University of Tokyo 2
University of Michigan Ann Arbor 2
Nokia 2
University of Ottawa, Canada 2
University of Athens 2
IBM Zurich Research Laboratory 2
University of California, San Diego 2
Rutgers University 2
Istituto di Scienza e Tecnologie dell'Informazione A. Faedo 2
Qatar Computing Research Institute 2
International Institute of Information Technology Hyderabad 3
Shandong University of Science and Technology 3
Bar-Ilan University 3
Dalian University of Technology 3
Rice University 3
University of Pennsylvania 3
University of California, Irvine 3
University of Sao Paulo 3
The University of British Columbia 3
IBM Research 3
INSA Lyon 3
George Mason University 3
Xerox Corporation 3
Binghamton University State University of New York 3
Italian National Research Council 3
The University of North Carolina at Chapel Hill 3
University of Sydney 3
Microsoft 3
University of Melbourne 3
University of California, Santa Barbara 3
Wuhan University 3
University of Southern California 3
University of Alberta 3
Rutgers University-Newark Campus 4
Emory University 4
Institute for Infocomm Research, A-Star, Singapore 4
Google Inc. 4
Brookhaven National Laboratory 4
Universitat Politecnica de Catalunya 4
University of Antwerp 4
National University of Singapore 4
Athens University of Economics and Business 4
Monash University 4
Boston University 4
Massachusetts Institute of Technology 4
Ben-Gurion University of the Negev 4
University of Pisa 4
Yahoo Research Barcelona 4
Aalto University 4
Pennsylvania State University 5
University of Texas at San Antonio 5
Ohio State University 5
Northwestern Polytechnical University China 5
Purdue University 5
Kyoto University 5
University of Turin 5
Oregon State University 5
Sandia National Laboratories 5
Microsoft Research 5
New Jersey Institute of Technology 5
University of Technology Sydney 5
Delft University of Technology 6
AT&T Laboratories Florham Park 6
University of Massachusetts Amherst 6
Georgia Institute of Technology 6
Nippon Telegraph & Telephone 6
Stony Brook University 6
Ludwig Maximilian University of Munich 6
University of Minnesota Twin Cities 6
Yahoo Inc. 6
Hong Kong University of Science and Technology 7
University of Florida 7
Peking University 7
University of Science and Technology of China 7
University of Maryland 7
Virginia Tech 7
University of California, Riverside 7
Federal University of Minas Gerais 7
University of Wisconsin Madison 7
Nanjing University 7
Nanyang Technological University 8
Stanford University 8
University of Texas at Austin 8
IBM Thomas J. Watson Research Center 8
Xi'an Jiaotong University 8
Yahoo Research Labs 8
National Taiwan University 9
Florida International University 9
University of Illinois at Chicago 10
Rensselaer Polytechnic Institute 11
Cornell University 12
Simon Fraser University 12
University of Calabria 12
University of Texas at Arlington 13
University of Illinois at Urbana-Champaign 15
NEC Laboratories America, Inc. 15
Chinese University of Hong Kong 17
Tsinghua University 18
Carnegie Mellon University 29
Arizona State University 41