From Context to Distance: Learning Dissimilarity for Categorical Data Clustering
Dino Ienco, Ruggero G. Pensa, Rosa Meo
Article No.: 1
Clustering data described by categorical attributes is a challenging task in data mining applications. Unlike numerical attributes, it is difficult to define a distance between pairs of values of a categorical attribute, since the values are not...
Efficient Mining of Gap-Constrained Subsequences and Its Various Applications
Chun Li, Qingyan Yang, Jianyong Wang, Ming Li
Article No.: 2
Mining frequent subsequence patterns is a typical data-mining problem and various efficient sequential pattern mining algorithms have been proposed. In many application domains (e.g., biology), the frequent subsequences confined by the predefined...
Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which...
A Modular Machine Learning System for Flow-Level Traffic Classification in Large Networks
Yu Jin, Nick Duffield, Jeffrey Erman, Patrick Haffner, Subhabrata Sen, Zhi-Li Zhang
Article No.: 4
The ability to accurately and scalably classify network traffic is of critical importance to a wide range of management tasks of large networks, such as tier-1 ISP networks and global enterprise networks. Guided by the practical constraints and...