Mahalanobis Distance Classification Essay

Simple Summary

This paper is an extended version of our paper published in the 1st International Electronic Conference on Entropy and Its Applications (www.sciforum.net/conference/ecea-1).


Network anomaly detection and classification is an important open issue in network security. Several approaches and systems based on different mathematical tools have been studied and developed. Among them is the Anomaly-Network Intrusion Detection System (A-NIDS), which monitors network traffic and compares it against an established baseline of a “normal” traffic profile; this makes it necessary to characterize “normal” Internet traffic. This paper presents an approach for anomaly detection and classification based on the Shannon, Rényi and Tsallis entropies of selected features. Regions for “normal” and abnormal traffic are constructed from the entropy data using the Mahalanobis distance (MD) and the One-Class Support Vector Machine (OC-SVM) with two kernels, the Radial Basis Function (RBF) and the Mahalanobis Kernel (MK). Regular and non-regular regions built from “normal” traffic profiles allow anomaly detection, while classification is performed under the assumption that the regions corresponding to the attack classes have been previously characterized. Although this approach allows the use of as many features as required, only four well-known significant features were selected in our case. To evaluate the approach, two data sets were used: one set of real traffic obtained from an academic Local Area Network (LAN), and a subset of the 1998 MIT-DARPA set. For these data sets, the approach yielded a true positive rate of up to 99.35%, a true negative rate of up to 99.83% and a false negative rate of about 0.16%. Experimental results show that certain q-values of the generalized entropies and the use of OC-SVM with the RBF kernel improve the detection rate in the detection stage, while the novel inclusion of the MK kernel in OC-SVM and k-temporal nearest neighbors improve accuracy in classification. In addition, the results show that, with the Box-Cox transformation, the Mahalanobis distance yielded high detection rates with an efficient computation time, while OC-SVM achieved slightly higher detection rates but is more computationally expensive.

Keywords: generalized entropies; network traffic; anomaly detection; OC-SVM; Mahalanobis kernel; Mahalanobis distance; non-Gaussian data
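
To make the entropy features named in the abstract concrete, here is a minimal Python sketch of the Shannon, Rényi and Tsallis entropies of one feature inside a traffic window. The function names, the natural logarithm and the sample window of port numbers are assumptions made for illustration; the abstract does not name the four features or the q-values the authors used.

```python
import math
from collections import Counter


def probabilities(values):
    """Empirical probability of each distinct value in one traffic window."""
    counts = Counter(values)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def shannon_entropy(p):
    """H = -sum(p_i * ln p_i)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def renyi_entropy(p, q):
    """H_q = ln(sum(p_i^q)) / (1 - q); tends to Shannon entropy as q -> 1."""
    if q == 1:
        return shannon_entropy(p)
    return math.log(sum(pi ** q for pi in p if pi > 0)) / (1 - q)


def tsallis_entropy(p, q):
    """S_q = (1 - sum(p_i^q)) / (q - 1); tends to Shannon entropy as q -> 1."""
    if q == 1:
        return shannon_entropy(p)
    return (1 - sum(pi ** q for pi in p if pi > 0)) / (q - 1)


# Hypothetical window: destination ports seen in one time slot.
window = [80, 443, 80, 22, 80, 443, 53, 80]
p = probabilities(window)
features = [shannon_entropy(p), renyi_entropy(p, 2), tsallis_entropy(p, 2)]
```

A window dominated by a single value gives entropies near zero, while evenly spread values give the maximum. One entropy per selected feature then forms the vector from which a region of “normal” traffic could be built.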


This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Title: Time Series Classification by Class-Specific Mahalanobis Distance Measures

Authors: Zoltán Prekopcsák, Daniel Lemire

(Submitted on 7 Oct 2010 (v1), last revised 2 Jul 2012 (this version, v6))

Abstract: To classify time series by nearest neighbors, we need to specify or learn one or several distance measures. We consider variations of the Mahalanobis distance measures which rely on the inverse covariance matrix of the data. Unfortunately, for time series data, the covariance matrix often has low rank. To alleviate this problem we can either use a pseudoinverse, covariance shrinking or limit the matrix to its diagonal. We review these alternatives and benchmark them against competitive methods such as the related Large Margin Nearest Neighbor Classification (LMNN) and the Dynamic Time Warping (DTW) distance. As we expected, we find that the DTW is superior, but the Mahalanobis distance measures are one to two orders of magnitude faster. To get the best results with Mahalanobis distance measures, we recommend learning one distance measure per class using either covariance shrinking or the diagonal approach.
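
The three remedies for a low-rank covariance matrix, and the idea of one distance measure per class, can be sketched in Python with NumPy as follows. This is only one plausible reading of the abstract, not the authors' implementation: the function names are hypothetical, the shrinkage weight `alpha` is an arbitrary fixed value rather than an estimated one, and shrinking toward the diagonal is an assumed choice of target.

```python
import numpy as np


def inverse_covariance(X, method="shrinkage", alpha=0.1):
    """Estimate an inverse covariance matrix from the rows of X."""
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    if method == "pinv":
        # The pseudoinverse exists even when cov has low rank.
        return np.linalg.pinv(cov)
    if method == "diagonal":
        # Keep only per-feature variances; guard against constant features.
        var = np.diag(cov)
        return np.diag(1.0 / np.where(var > 0, var, 1.0))
    if method == "shrinkage":
        # Blend cov with its own diagonal (assumed target, fixed alpha).
        # Invertible as long as every feature has nonzero variance.
        target = np.diag(np.diag(cov))
        return np.linalg.inv((1 - alpha) * cov + alpha * target)
    raise ValueError(f"unknown method: {method}")


def fit_class_metrics(X, y, method="shrinkage"):
    """Learn one inverse covariance matrix per class label."""
    return {label: inverse_covariance(X[y == label], method)
            for label in np.unique(y)}


def predict_1nn(x, X, y, metrics):
    """1-nearest-neighbor: the distance to each training series uses
    the matrix learned for that training series' class."""
    best_label, best_dist = None, np.inf
    for xi, yi in zip(X, y):
        diff = x - xi
        dist = diff @ metrics[yi] @ diff  # squared Mahalanobis distance
        if dist < best_dist:
            best_label, best_dist = yi, dist
    return best_label


# Synthetic example: two well-separated classes of length-5 series.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 5)),
               rng.normal(10.0, 1.0, size=(20, 5))])
y = np.array([0] * 20 + [1] * 20)
metrics = fit_class_metrics(X, y, method="diagonal")
label = predict_1nn(np.zeros(5), X, y, metrics)
```

The diagonal variant needs only one variance per time step, so it stays usable when there are far fewer training series than time steps. The pseudoinverse and shrinkage variants keep the correlations between time steps at a higher cost.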

Submission history

From: Zoltan Prekopcsak
[v1] Thu, 7 Oct 2010 19:48:23 GMT (33kb,DS)
[v2] Sat, 16 Apr 2011 01:25:57 GMT (28kb,D)
[v3] Mon, 27 Jun 2011 09:20:28 GMT (148kb,D)
[v4] Fri, 30 Dec 2011 22:21:45 GMT (178kb,D)
[v5] Tue, 22 May 2012 10:02:44 GMT (189kb,D)
[v6] Mon, 2 Jul 2012 20:57:01 GMT (190kb,D)
