Distributed Algorithms for Computing Very Large Thresholded Covariance Matrices
Zekai J. Gao, Chris Jermaine
Article No.: 12
Computation of covariance matrices from observed data is an important problem, as such matrices are used in applications such as principal component analysis (PCA), linear discriminant analysis (LDA), and increasingly in the learning and...
World Knowledge as Indirect Supervision for Document Clustering
Chenguang Wang, Yangqiu Song, Dan Roth, Ming Zhang, Jiawei Han
Article No.: 13
One of the key obstacles in making learning protocols realistic in applications is the need to supervise them, a costly process that often requires hiring domain experts. We consider a framework that uses world knowledge as indirect...
The goal of community detection algorithms is to identify densely connected units within large networks. An implicit assumption is that all the constituent nodes belong equally to their associated community. However, some nodes are more important...
Partitioning Networks with Node Attributes by Compressing Information Flow
Laura M. Smith, Linhong Zhu, Kristina Lerman, Allon G. Percus
Article No.: 15
Real-world networks are often organized as modules or communities of similar nodes that serve as functional units. These networks are also rich in content, with nodes having distinguished features or attributes. In order to discover a...
Scalable and Accurate Online Feature Selection for Big Data
Kui Yu, Xindong Wu, Wei Ding, Jian Pei
Article No.: 16
Feature selection is important in many big data applications. Two critical challenges are closely associated with big data. First, in many big data applications, the dimensionality is extremely high, in the millions, and keeps growing. Second, big data...
Advances in smartphone technology have promoted the rapid development of mobile apps. However, the availability of a huge number of mobile apps in application stores has imposed the challenge of finding the right apps to meet the user needs....
A MapReduce algorithm can be described by a mapping schema, which assigns inputs to a set of reducers, such that for each required output there exists a reducer that receives all the inputs participating in the computation of this output....
Unsupervised Head-Modifier Detection in Search Queries
Zhongyuan Wang, Fang Wang, Haixun Wang, Zhirui Hu, Jun Yan, Fangtao Li, Ji-Rong Wen, Zhoujun Li
Article No.: 19
Interpreting the user intent in search queries is a key task in query understanding. Query intent classification has been widely studied. In this article, we go one step further to understand the query from the view of head-modifier analysis. For...
Lifecycle Modeling for Buzz Temporal Pattern Discovery
Yi Chang, Makoto Yamada, Antonio Ortega, Yan Liu
Article No.: 20
In social media analysis, one critical task is detecting a burst of topics or buzz, which is reflected by extremely frequent mentions of certain keywords in a short-time interval. Detecting buzz not only provides useful insights into the...
Competitiveness degree analysis is a focal point of business strategy and competitive intelligence, aimed at helping managers closely monitor the extent to which their rivals are competing with them. This article proposes a novel method, namely BCQ, to...
Comparing Clustering with Pairwise and Relative Constraints: A Unified Framework
Yuanli Pei, Xiaoli Z. Fern, Teresa Vania Tjahja, Rómer Rosales
Article No.: 22
Clustering can be improved with the help of side information about the similarity relationships among instances. Such information has been commonly represented by two types of constraints: pairwise constraints and relative...
Electronic concept maps, interlinked with other concept maps and multimedia resources, can provide rich knowledge models to capture and share human knowledge. This article presents and evaluates methods to support experts as they...
Adaptive Cluster Tendency Visualization and Anomaly Detection for Streaming Data
Dheeraj Kumar, James C. Bezdek, Sutharshan Rajasegarar, Marimuthu Palaniswami, Christopher Leckie, Jeffrey Chan, Jayavardhana Gubbi
Article No.: 24
The growth in pervasive network infrastructure called the Internet of Things (IoT) enables a wide range of physical objects and environments to be monitored in fine spatial and temporal detail. The detailed, dynamic data that are collected in...
With the explosion of smartphones and social network services, location-based social networks (LBSNs) are increasingly seen as tools for businesses (e.g., restaurants and hotels) to promote their products and services. In this article, we...