Dr Colin O'Reilly

Research Fellow

I am a Research Fellow at the Institute for Communication Systems (ICS), University of Surrey. My research is in the area of machine learning and data mining. I am involved in the FP7 SocIoTal project.

I received the B.Sc. degree in Mathematics from Queen Mary College, University of London, the M.Eng. degree in Telecommunications Engineering from Dublin City University, and the Ph.D. degree from the University of Surrey. My Ph.D. thesis focused on anomaly detection.

Research interests

Machine learning; Anomaly detection; Distributed data; Non-stationary data; Univariate and multivariate time-series analysis; Kernel methods; Applications of machine learning in a wide variety of domains


O'Reilly C, Gluhak A, Imran M (2014) Adaptive Anomaly Detection with Kernel Eigenspace Splitting and Merging, IEEE Transactions on Knowledge and Data Engineering, 27(1), pp. 3-16.
O'Reilly C, Gluhak A, Imran M, Rajasegarar S (2012) Online Anomaly Rate Parameter Tracking for Anomaly Detection in Wireless Sensor Networks, 2012 9th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), pp. 191-199.
O'Reilly C, Gluhak A, Imran M, Rajasegarar S (2014) Anomaly Detection in Wireless Sensor Networks in a Non-Stationary Environment, IEEE Communications Surveys and Tutorials, IEEE.
O'Reilly C, Gluhak A, Imran M (2013) Online Anomaly Detection with an Incremental Centred Kernel Hypersphere, 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1-6.
O'Reilly C, Gluhak A, Imran M (2016) Distributed Anomaly Detection using Minimum Volume Elliptical Principal Component Analysis, IEEE Transactions on Knowledge and Data Engineering, 28(9), pp. 2320-2333, IEEE.
Principal component analysis combined with the residual error is an effective anomaly detection technique. In an environment where anomalies are present in the training set, the derived principal components can be skewed by the anomalies. A further aspect of anomaly detection is that data might be distributed across different nodes in a network, and communicating it to a centralized processing unit may be prohibitively expensive. Current solutions to distributed anomaly detection rely on a hierarchical network infrastructure to aggregate data or models; however, in this setting links close to the root of the tree become critical and congested. In this paper, an algorithm is proposed that is more robust in its derivation of the principal components of a training set containing anomalies. A distributed form of the algorithm is then derived in which each node in a network can iterate towards the centralized solution by exchanging small matrices with neighbouring nodes. Experimental evaluations on both synthetic and real-world data sets demonstrate the superior performance of the proposed approach in comparison to principal component analysis and alternative anomaly detection techniques. In addition, it is shown that in a variety of network infrastructures, the distributed form of the anomaly detection model is able to derive a close approximation of the centralized model.
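The residual-error idea underlying this paper can be sketched as follows. This is a minimal illustration using plain PCA on synthetic data, not the robust minimum-volume elliptical variant the paper proposes; the function name and data are illustrative only:

```python
import numpy as np

def pca_residual_scores(X_train, X_test, k):
    """Score test points by their reconstruction error outside
    the top-k principal subspace of the training data."""
    mu = X_train.mean(axis=0)
    # Principal components of the centred training data via SVD.
    _, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
    P = Vt[:k].T                 # d x k basis of the principal subspace
    R = X_test - mu
    proj = R @ P @ P.T           # projection onto the principal subspace
    resid = R - proj             # residual component
    return np.linalg.norm(resid, axis=1)   # large residual => anomalous

rng = np.random.default_rng(0)
# Normal data lies near the line y = 2x in 2-D; anomalies lie off it.
t = rng.normal(size=(200, 1))
X_train = np.hstack([t, 2.0 * t]) + 0.05 * rng.normal(size=(200, 2))
X_test = np.array([[1.0, 2.0],    # on the line: normal
                   [1.0, -2.0]])  # off the line: anomalous
scores = pca_residual_scores(X_train, X_test, k=1)
print(scores)  # the second score is much larger than the first
```

If anomalies contaminate the training set, the subspace spanned by P is pulled towards them, which is the skewing problem the robust formulation addresses.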
Anomaly detection is an important aspect of data analysis, used to identify data items that differ significantly from normal data. It is applied in a variety of fields such as machine monitoring, environmental monitoring and security, and is a well-studied area in pattern recognition and machine learning. In this thesis, the key challenges of performing anomaly detection in non-stationary and distributed environments are addressed separately. In non-stationary environments the data distribution may alter, meaning that the concepts to be learned evolve in time. Anomaly detection techniques must be able to adapt to a non-stationary data distribution in order to perform optimally; this requires an update to the model that is being used to classify data. A batch approach to the problem requires a reconstruction of the model each time an update is required. Incremental learning overcomes this issue by using the previous model as the basis for an update. Two kernel-based incremental anomaly detection techniques are proposed. The first uses kernel principal component analysis to perform anomaly detection; the kernel eigenspace is incrementally updated by splitting and merging kernel eigenspaces, and the technique is shown to be more accurate than current state-of-the-art solutions. The second technique reduces the number of computations by using an incrementally updated hypersphere in kernel space. In addition to updating the model itself, a non-stationary environment also requires an update to the parameters of the model. Anomaly detection algorithms require the selection of appropriate parameters in order to perform optimally for a given data set; if the distribution of the data changes, the parameters must be updated as well.
An automatic parameter optimization procedure is proposed for the one-class quarter-sphere support vector machine, where the ν parameter is selected automatically based on the anomaly rate in the training set. In environments such as wireless sensor networks, data might be distributed amongst a number of nodes. In this case, distributed learning is required, where nodes construct a classifier, or an approximation of the classifier, that would have been formed had all the data been available to one instance of the algorithm. A principal component analysis based anomaly detection method is proposed that uses the solution to a convex optimization problem. The convex optimization problem is then derived in a distributed form, with each node running a local instance of the algorithm. Nodes are able to iterate towards an anomaly detector equivalent to the global solution by exchanging short messages. Detailed evaluations of the proposed techniques are performed against existing state-of-the-art techniques using a variety of synthetic and real-world data sets. Results in the non-stationary setting illustrate the necessity of adapting an anomaly detection model to the changing data distribution. It is shown that the proposed incremental techniques maintain accuracy while reducing the number of computations. In addition, optimal parameters derived from an unlabelled training set are shown to exhibit superior performance to statically selected parameters. In the distributed setting, it is shown that local learning alone is insufficient due to the lack of examples. Distributed learning can instead be performed in a manner where a centralized model is derived by passing small amounts of information between neighbouring nodes; this approach yields a model whose performance equals that of the centralized model.
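The idea of tying a detector's parameter to the anomaly rate in the training set can be illustrated with a much simpler stand-in for the quarter-sphere SVM's ν selection: set the decision threshold at the (1 - rate) quantile of the training scores, so that roughly the expected fraction of training points is flagged. All names and data here are illustrative, not the thesis's actual procedure:

```python
import numpy as np

def fit_threshold(scores_train, anomaly_rate):
    """Pick the threshold so that roughly `anomaly_rate` of the
    training points fall above it and are flagged as anomalies."""
    return np.quantile(scores_train, 1.0 - anomaly_rate)

rng = np.random.default_rng(1)
# Distance-from-centroid scores: 95 normal points with small
# scores, 5 anomalous points with large scores.
scores_train = np.concatenate([rng.normal(1.0, 0.1, 95),
                               rng.normal(5.0, 0.5, 5)])
thr = fit_threshold(scores_train, anomaly_rate=0.05)
flags = scores_train > thr
print(thr, flags.sum())  # threshold separates the two groups
```

If the data distribution drifts, re-estimating the threshold on recent scores adapts the detector without rebuilding the underlying model, which is the motivation for tracking the anomaly rate online.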