To develop algorithms for the automatic detection of anomalies from multidimensional, undersampled, incomplete datasets and unreliable sources. This links to L_WP2 in exploiting domain knowledge, to L_WP3 and L_WP4 in aiding weak-signal detection (for example in MIMO), and to L_WP5 for real implementation. Data-quality and ambiguity measures will be used to ensure that the resulting models of normality are not corrupted by unreliable or ambiguous data.
L_WP1.1 Statistical anomaly detection
We will develop a generic framework for statistical anomaly detection and apply it to a number of relevant domains by designing algorithms appropriate to each aspect of operation. For Internet signals, for instance, the approach will involve the analysis of meta-data related to packets and their transmission. The optimal form of meta-data will be determined by comparing different summaries drawn from a selection identified as significant. The merit of deploying multiple detectors operating on different fields of the packet, or on data from different packet receptors, will be considered. Fusion algorithms will be investigated at both the feature level, using multiple-kernel fusion with kernel de-noising, and the decision level, using Dempster-Shafer theory, which is well suited to handling unreliable data. For objects or phenomena characterized by multiple measurement probability distributions, anomaly and drift will be detected using the Kullback-Leibler divergence or the Kolmogorov-Smirnov (K-S) statistic. Since relying on generative models for decision-making and outlier detection compromises classification performance and computational efficiency, we will realize fast anomaly detection using machine learning: a one-class classifier will be learnt for the mixture distribution combining all the class-conditional measurement distribution models. The robustness of the anomaly detection system to undersampling and missing values will be addressed by techniques such as the neutral point substitution recently developed at Surrey, sparse sensing, and robust tensor-based models.
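As a minimal illustration of the distribution-based detection described above, the following sketch flags drift between a reference "model of normality" and newly observed data when either the two-sample K-S test rejects equality of distributions or a histogram-based Kullback-Leibler estimate exceeds a threshold. The function names, thresholds, and bin count are illustrative assumptions, not part of the proposed framework.

```python
import numpy as np
from scipy.stats import ks_2samp

def kl_divergence(p, q, eps=1e-12):
    """Discrete Kullback-Leibler divergence KL(p || q) between two histograms."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def detect_drift(reference, observed, ks_alpha=0.01, kl_threshold=0.5, bins=20):
    """Flag drift if either the two-sample K-S test rejects equality of the
    distributions or the histogram KL divergence exceeds a threshold."""
    ks_stat, p_value = ks_2samp(reference, observed)
    lo = min(reference.min(), observed.min())
    hi = max(reference.max(), observed.max())
    p_hist, _ = np.histogram(reference, bins=bins, range=(lo, hi))
    q_hist, _ = np.histogram(observed, bins=bins, range=(lo, hi))
    kl = kl_divergence(q_hist, p_hist)
    return (p_value < ks_alpha) or (kl > kl_threshold)

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, 2000)    # reference model of normality
same = rng.normal(0.0, 1.0, 500)       # nominal observations
shifted = rng.normal(2.0, 1.0, 500)    # drifted observations

print(detect_drift(normal, same))
print(detect_drift(normal, shifted))   # mean shifted by 2 sigma -> True
```

In practice the reference histogram would be built from the learnt class-conditional measurement models rather than raw samples, and the thresholds tuned to the acceptable false-alarm rate.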
L_WP1.2 Anomaly detection in complex networks
Many battlefield applications can be modeled as complex networks (graphs) of interconnected nodes representing objects or phenomena that exhibit spatio-temporal relations (domain knowledge modeled in L_WP2). Typical examples are networks of communication systems, networks of radar, or scene interpretation systems. The behavior of such heterogeneous networks can be characterized in terms of hierarchical models: at the low level we deal with individual nodes and their class membership, while at the global network level the typical interaction of these nodes is manifest in configuration patterns that can exhibit dynamic properties, as well as inter-node relations. The multiplicity of interpretations (e.g. contextual and non-contextual) of the activity of each node in a complex network offers an efficient anomaly detection mechanism, based on detecting incongruence between the node-level and node-configuration-level interpretations. This would still allow the use of discriminative approaches for decision-making, engaging generative models only when an anomaly is present, to facilitate the identification of its nature and nuance. Incongruence will be gauged using the Bayesian surprise measure or its variants. To minimize false positives, we propose to use data-quality measures jointly with decision-confidence measures, and to develop mechanisms enabling the suppression of anomaly detection for observations of low quality (e.g. contaminated by noise or fading). The use of high-dimensional search techniques for both labelled and unlabelled datasets, such as latent semantic indexing emerging from data mining, will also be considered.
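The incongruence mechanism above can be sketched as follows: Bayesian surprise is measured as the KL divergence between the contextual (network-level) class distribution and the non-contextual (node-level) classifier posterior, and the alarm is suppressed when a data-quality score is low. The quality score, class probabilities, and threshold here are hypothetical placeholders for illustration only.

```python
import numpy as np

def bayesian_surprise(prior, posterior, eps=1e-12):
    """KL(posterior || prior): how much an observation shifts belief."""
    p = np.asarray(posterior, dtype=float) + eps
    q = np.asarray(prior, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def incongruence(contextual, non_contextual, quality, threshold=0.5):
    """Flag an anomaly when the contextual (network-level) and non-contextual
    (node-level) interpretations disagree, weighting the surprise by a
    hypothetical data-quality score in [0, 1] so that low-quality
    observations (noise, fading) do not raise false alarms."""
    surprise = bayesian_surprise(contextual, non_contextual)
    return quality * surprise > threshold

# Node classifier favours class 1, but the network context expects class 0:
# large surprise, so incongruence is flagged at full data quality.
context_prior = np.array([0.9, 0.1])    # P(class | network context)
node_posterior = np.array([0.1, 0.9])   # P(class | node measurements)

print(incongruence(context_prior, node_posterior, quality=1.0))  # True
print(incongruence(context_prior, node_posterior, quality=0.1))  # False (suppressed)
```

Multiplicative quality weighting is only one possible suppression mechanism; a gating rule that discards low-quality observations outright would serve equally well as a starting point.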