Dr David Windridge

Visiting Senior Fellow

Email:
Room no: 20 AB 05

Further information

Biography

Further details can be found on my personal web page.

Publications

Highlights

  • Windridge D, Kittler J, De Campos T, Yan F, William C, Aftab K. (2014) 'Rule Induction for Adaptive Sport Video Characterization Using MLN Clause Templates'. IEEE Transactions on Multimedia.
  • Khan A, Windridge D, Kittler J. (2014) 'Multilevel Chinese Takeaway Process and Label-Based Processes for Rule Induction in the Context of Automated Sports Video Annotation'. IEEE Transactions on Cybernetics, PP (99), pp. 1-1.

    Abstract

    We propose four variants of a novel hierarchical hidden Markov model strategy for rule induction in the context of automated sports video annotation, including a multilevel Chinese takeaway process (MLCTP) based on the Chinese restaurant process and a novel Cartesian product label-based hierarchical bottom-up clustering (CLHBC) method that employs prior information contained within label structures. Our results show significant improvement by comparison against the flat Markov model: optimal performance is obtained using a hybrid method, which combines the MLCTP-generated hierarchical topological structures with CLHBC-generated event labels. We also show that the methods proposed are generalizable to other rule-based environments including human driving behavior and human actions.

  • Kittler J, Christmas W, de Campos T, Windridge D, Yan F, Illingworth J, Osman M. (2013) 'Domain Anomaly Detection in Machine Perception: A System Architecture and Taxonomy'. IEEE Transactions on Pattern Analysis and Machine Intelligence.

    Abstract

    We address the problem of anomaly detection in machine perception. The concept of domain anomaly is introduced as distinct from the conventional notion of anomaly used in the literature. We propose a unified framework for anomaly detection which exposes the multifaceted nature of anomalies and suggest effective mechanisms for identifying and distinguishing each facet as instruments for domain anomaly detection. The framework draws on the Bayesian probabilistic reasoning apparatus which clearly defines concepts such as outlier, noise, distribution drift, novelty detection (object, object primitive), rare events, and unexpected events. Based on these concepts we provide a taxonomy of domain anomaly events. One of the mechanisms helping to pinpoint the nature of an anomaly is based on detecting incongruence between contextual and noncontextual sensor(y) data interpretation. The proposed methodology has wide applicability. It underpins in a unified way the anomaly detection applications found in the literature. To illustrate some of its distinguishing features, the domain anomaly detection methodology is here applied to the problem of anomaly detection for a video annotation system.

  • Windridge D, Felsberg M, Shaukat A. (2013) 'A Framework for Hierarchical Perception-Action Learning Utilizing Fuzzy Reasoning'. IEEE Transactions on Cybernetics, 43 (1), pp. 155-169.
  • Hope C, Sterr A, Elangovan P, Geades N, Windridge D, Young K, Wells K. (2013) 'High throughput screening for mammography using a human-computer interface with Rapid Serial Visual Presentation (RSVP)'. Proceedings of SPIE - The International Society for Optical Engineering, 8673

    Abstract

    The steady rise of the breast cancer screening population, coupled with data expansion produced by new digital screening technologies (tomosynthesis/CT), motivates the development of new, more efficient image screening processes. Rapid Serial Visual Presentation (RSVP) is a new fast-content recognition approach which uses electroencephalography to record brain activity elicited by fast bursts of image data. These brain responses are then subjected to machine classification methods to reveal the expert's 'reflex' response and so classify images according to the presence or absence of particular targets. The benefit of this method is that images can be presented at high temporal rates (∼10 per second), faster than that required for fully conscious detection, facilitating a high throughput of image (screening) material. In this paper we present the first application of RSVP to medical image data, and demonstrate how cortically coupled computer vision can be successfully applied to breast cancer screening. Whilst prior RSVP work has utilised multichannel approaches, we also present the first RSVP results demonstrating discriminatory response on a single electrode with a ROC area under the curve of 0.62-0.86 using a simple Fisher discriminator for classification. This increases to 0.75-0.94 when multiple electrodes are used in combination. © 2013 SPIE.

  • Windridge D, Shaukat A, Hollnagel E. (2012) 'Characterizing Driver Intention via Hierarchical Perception–Action Modeling'. IEEE Transactions on Human-Machine Systems, 43 (1), pp. 17-31.

    Abstract

    We seek a mechanism for the classification of the intentional behavior of a cognitive agent, specifically a driver, in terms of a psychological Perception-Action (P-A) model, such that the resulting system would be potentially suitable for use in intelligent driver assistance. P-A models of human intentionality assume that a cognitive agent's perceptual domain is learned in response to the outcome of the agent's actions rather than vice versa. In this way, the perceptual domain is maintained at an appropriate level of complexity in relation to the agent's embodied motor capabilities, greatly simplifying visual processing. A subsumptive P-A model further captures the hierarchical nature of the subtask structure implicit in human actions and assumes that a parallel hierarchical structuring exists within the perceptual domain. Adopting this model enables us to characterize intentions at each level of the P-A hierarchy in terms of a range of descriptors derived from the U.K. Highway Code by examining their correlation with driver gaze behavior. The problem of classifying intentions thus becomes one of reconciling high-level protocols (i.e., Highway Code rules) with low-level perceptual features. We perform a “proof-of-concept” assessment of the model by comparative evaluation of a number of logic-based methods (both stochastic and deductive) for carrying out this classification utilizing the control, signal, and motor inputs of an instrumented vehicle driven by a single driver, and find that a deductive model gives superior intentional classification performance due to the strongly protocol-governed nature of the driving environment.

  • Seredin O, Mottl V, Tatarchuk A, Razin N, Windridge D. (2012) 'Convex support and Relevance Vector Machines for selective multimodal pattern recognition'. Proceedings of the 21st International Conference on Pattern Recognition (ICPR 2012), Tsukuba, Japan. IEEE, pp. 1647-1650.

    Abstract

    We address the problem of featureless pattern recognition under the assumption that pair-wise comparison of objects is arbitrarily scored by real numbers. Such a linear embedding is much more general than the traditional kernel-based approach, which demands positive semi-definiteness of the matrix of object comparisons. This demand is frequently prohibitive and is further complicated if there exists a large number of comparison functions, i.e., multiple modalities of object representation. In these cases, the experimenter typically also has the problem of eliminating redundant modalities and objects. In the context of the general pair-wise comparison space this problem becomes mathematically analogous to that of wrapper-based feature selection. The resulting convex SVM-like training criterion is analogous to Tipping's Relevance Vector Machine, but essentially generalizes it via the presence of a structural parameter controlling the selectivity level. © 2012 ICPR Org Committee.

  • Razin N, Sungurov D, Mottl V, Torshin I, Sulimova V, Seredin O, Windridge D. (2012) 'Application of the multi-modal relevance vector machine to the problem of protein secondary structure prediction'. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7632 LNBI, pp. 153-165.

    Abstract

    The aim of the paper is to experimentally examine the plausibility of Relevance Vector Machines (RVM) for protein secondary structure prediction. We restrict our attention to detecting strands which represent an especially problematic element of the secondary structure. The commonly adopted local principle of secondary structure prediction is applied, which implies comparison of a sliding window in the given polypeptide chain with a number of reference amino-acid sequences cut out of the training proteins as benchmarks representing the classes of secondary structure. As distinct from the classical RVM, the novel version applied in this paper allows for selective combination of several tentative window comparison modalities. Experiments on the RS126 data set have shown its ability to essentially decrease the number of reference fragments in the resulting decision rule and to select a subset of the most appropriate comparison modalities within the given set of the tentative ones. © 2012 Springer-Verlag.

  • Taya S, Windridge D, Osman M. (2012) 'Looking to score: the dissociation of goal influence on eye movement and meta-attentional allocation in a complex dynamic natural scene'. PLoS One, 7 (6).

    Abstract

    Several studies have reported that task instructions influence eye-movement behavior during static image observation. In contrast, during dynamic scene observation we show that while the specificity of the goal of a task influences observers' beliefs about where they look, the goal does not in turn influence eye-movement patterns. In our study observers watched short video clips of a single tennis match and were asked to make subjective judgments about the allocation of visual attention to the items presented in the clip (e.g., ball, players, court lines, and umpire). However, before attending to the clips, observers were either told to simply watch the clips (non-specific goal), or they were told to watch the clips with a view to judging which of the two tennis players was awarded the point (specific goal). The results of subjective reports suggest that observers believed that they allocated their attention more to goal-related items (e.g., court lines) if they performed the goal-specific task. However, we did not find an effect of goal specificity on major eye-movement parameters (i.e., saccadic amplitudes, inter-saccadic intervals, and gaze coherence). We conclude that the specificity of a task goal can alter observers' beliefs about their attention allocation strategy, but such task-driven meta-attentional modulation does not necessarily correlate with eye-movement behavior.

  • Goswami D, Chan CH, Windridge D, Kittler J. (2011) 'Evaluation of face recognition system in heterogeneous environments (visible vs NIR)'. Proceedings of the IEEE International Conference on Computer Vision (ICCV 2011), Barcelona, Spain, pp. 2160-2167.

    Abstract

    Performing facial recognition between Near Infrared (NIR) and visible-light (VIS) images has been established as a common method of countering illumination variation problems in face recognition. In this paper we present a new database to enable the evaluation of cross-spectral face recognition. A series of preprocessing algorithms, followed by Local Binary Pattern Histogram (LBPH) representation and combinations with Linear Discriminant Analysis (LDA), are used for recognition. These experiments are conducted on both NIR→VIS and the less common VIS→NIR protocols, with permutations of uni-modal training sets. Twelve individual baseline algorithms are presented. In addition, the best performing fusion approaches involving a subset of the 12 algorithms are also described. © 2011 IEEE.

  • Panov M, Tatarchuk A, Mottl V, Windridge D. (2011) 'A modified neutral point method for kernel-based fusion of pattern-recognition modalities with incomplete data sets'. Lecture Notes in Computer Science, 6713, pp. 126-136.

    Abstract

    It is commonly the case in multi-modal pattern recognition that certain modality-specific object features are missing in the training set. We address here the missing data problem for kernel-based Support Vector Machines, in which each modality is represented by the respective kernel matrix over the set of training objects, such that the omission of a modality for some object manifests itself as a blank in the modality-specific kernel matrix at the relevant position. We propose to fill the blank positions in the collection of training kernel matrices via a variant of the Neutral Point Substitution (NPS) method, where the term “neutral point” stands for the locus of points defined by the “neutral hyperplane” in the hypothetical linear space produced by the respective kernel. The current method crucially differs from the previously developed neutral point approach in that it is capable of treating missing data in the training set on the same basis as missing data in the test set. It is therefore of potentially much wider applicability. We evaluate the method on the Biosecure DS2 data set.

  • De Campos T, Barnard M, Mikolajczyk K, Kittler J, Yan F, Christmas W, Windridge D. (2011) 'An evaluation of bags-of-words and spatio-temporal shapes for action recognition'. 2011 IEEE Workshop on Applications of Computer Vision, WACV 2011, pp. 344-351.

    Abstract

    Bags-of-visual-Words (BoW) and Spatio-Temporal Shapes (STS) are two very popular approaches for action recognition from video. The former (BoW) is an unstructured global representation of videos which is built using a large set of local features. The latter (STS) uses a single feature located on a region of interest (where the actor is) in the video. Despite the popularity of these methods, no comparison between them has been done. Also, given that BoW and STS differ intrinsically in terms of context inclusion and globality/locality of operation, an appropriate evaluation framework has to be designed carefully. This paper compares these two approaches using four different datasets with varying degrees of space-time specificity of the actions and varying relevance of the contextual background. We use the same local feature extraction method and the same classifier for both approaches. Further to BoW and STS, we also evaluate novel variations of BoW constrained in time or space. We observe that the STS approach leads to better results in all datasets whose background is of little relevance to action classification. © 2010 IEEE.

  • Windridge D. (2010) 'Tomographic Considerations in Ensemble Bias/Variance Decomposition'. Springer-Verlag Berlin, Multiple Classifier Systems, Proceedings, Cairo, Egypt: 9th International Workshop on Multiple Classifier Systems, 5997, pp. 43-53.
  • Windridge D, Kittler J. (2010) 'Perception-Action Learning as an Epistemologically-Consistent Model for Self-Updating Cognitive Representation'. Brain Inspired Cognitive Systems 2008, Sao Luis, Brazil: 657, pp. 95-134.

    Abstract

    As well as having the ability to formulate models of the world capable of experimental falsification, it is evident that human cognitive capability embraces some degree of representational plasticity, having the scope (at least in infancy) to modify the primitives in terms of which the world is delineated. We hence employ the term 'cognitive bootstrapping' to refer to the autonomous updating of an embodied agent's perceptual framework in response to the perceived requirements of the environment in such a way as to retain the ability to refine the environment model in a consistent fashion across perceptual changes. We will thus argue that the concept of cognitive bootstrapping is epistemically ill-founded unless there exists an a priori percept/motor interrelation capable of maintaining an empirical distinction between the various possibilities of perceptual categorization and the inherent uncertainties of environment modeling. As an instantiation of this idea, we shall specify a very general, logically-inductive model of perception-action learning capable of compact re-parameterization of the percept space. In consequence of the a priori percept/action coupling, the novel perceptual state transitions so generated always exist in bijective correlation with a set of novel action states, giving rise to the required empirical validation criterion for perceptual inferences. Environmental description is correspondingly accomplished in terms of progressively higher-level affordance conjectures which are likewise validated by exploratory action. Application of this mechanism within simulated perception-action environments indicates that, as well as significantly reducing the size and specificity of the a priori perceptual parameter-space, the method can significantly reduce the number of iterations required for accurate convergence of the world-model. It does so by virtue of the active learning characteristics implicit in the notion of cognitive bootstrapping.

  • Poh N, Windridge D, Mottl V, Tatarchuk A, Eliseyev A. (2010) 'Addressing Missing Values in Kernel-based Multimodal Biometric Fusion using Neutral Point Substitution'. IEEE Transactions on Information Forensics and Security, 5 (3), pp. 461-469.

    Abstract

    In multimodal biometric information fusion, it is common to encounter missing modalities in which matching cannot be performed. As a result, at the match score level, this implies that scores will be missing. We address the multimodal fusion problem involving missing modalities (scores) using support vector machines (SVMs) with the neutral point substitution (NPS) method. The approach starts by processing each modality using a kernel. When a modality is missing, at the kernel level, the missing modality is substituted by one that is unbiased with regards to the classification, called a neutral point. Critically, unlike conventional missing-data substitution methods, explicit calculation of neutral points may be omitted by virtue of their implicit incorporation within the SVM training framework. Experiments based on the publicly available Biosecure DS2 multimodal (scores) data set show that the SVM-NPS approach achieves very good generalization performance compared to the sum rule fusion, especially with severe missing modalities.

  • Shevchenko M, Windridge D, Kittler J. (2009) 'A linear-complexity reparameterisation strategy for the hierarchical bootstrapping of capabilities within perception-action architectures'. Image and Vision Computing, 27 (11), pp. 1702-1714.

    Abstract

    Perception-action (PA) architectures are capable of solving a number of problems associated with artificial cognition, in particular, difficulties concerned with framing and symbol grounding. Existing PA algorithms tend to be 'horizontal' in the sense that learners maintain their prior percept-motor competences unchanged throughout learning. We here present a methodology for simultaneous 'horizontal' and 'vertical' perception-action learning in which there additionally exists the capability for incremental accumulation of novel percept-motor competences in a hierarchical fashion. The proposed learning mechanism commences with a set of primitive 'innate' capabilities and progressively modifies itself via recursive generalising of parametric spaces within the linked perceptual and motor domains so as to represent environmental affordances in a maximally compact manner. Efficient reparameterising of the percept domain is here accomplished by the exploratory elimination of dimensional redundancy and environmental context. Experimental results demonstrate that this approach exhibits an approximately linear increase in computational requirements when learning in a typical unconstrained environment, as compared with at least polynomially-increasing requirements for a classical perception-action system. © 2008 Elsevier B.V. All rights reserved.

  • Windridge D, Poh N, Mottl V, Tatarchuk A, Eliseyev A. (2009) 'Handling Multimodal Information Fusion with Missing Observations Using the Neutral Point Substitution Method'. Springer-Verlag Berlin, Multiple Classifier Systems, Proceedings, University of Iceland, Reykjavik, Iceland: 8th International Workshop on Multiple Classifier Systems, 5519, pp. 161-170.
  • Windridge D, Kittler J. (2008) 'Epistemic constraints on autonomous symbolic representation in natural and artificial agents'. Studies in Computational Intelligence, 122, pp. 395-422.

    Abstract

    We set out to address, in the form of a survey, the fundamental constraints upon self-updating representation in cognitive agents of natural and artificial origin. The foundational epistemic problem encountered by such agents is that of distinguishing errors of representation from inappropriateness of the representational framework. Resolving this conceptual difficulty involves ensuring the empirical falsifiability of both the representational hypotheses and the entities so represented, while at the same time retaining their epistemic distinguishability. We shall thus argue that perception-action frameworks provide an appropriate basis for the development of an empirically meaningful criterion for validating perceptual categories. In this scenario, hypotheses about the agent’s world are defined in terms of environmental affordances (characterised in terms of the agent’s active capabilities). Agents with the capability to hierarchically abstract this framework to a level consonant with performing syntactic manipulations and making deductive conjectures are consequently able to form an implicitly symbolic representation of the environment within which new, higher-level modes of environment manipulation are implied (e.g. tool-use). This abstraction process is inherently open-ended, admitting a wide range of possible representational hypotheses: only the form of the lowest level of the hierarchy need be constrained a priori (being the minimally sufficient condition necessary for retention of the ability to falsify high-level hypotheses). In biological agents capable of autonomous cognitive-updating, we argue that the grounding of such a priori ‘bootstrap’ representational hypotheses is ensured via the process of natural selection.

  • Kittler J, Windridge D, Goswami D. (2008) 'Subsurface Scattering Deconvolution for Improved NIR-Visible Facial Image Correlation'. IEEE, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2008), Vols 1 and 2, Amsterdam, Netherlands, pp. 889-894.

    Abstract

    Significant improvements in face-recognition performance have recently been achieved by obtaining near infrared (NIR) probe images. We demonstrate that by taking into account the differential effects of sub-surface scattering, correlation between facial images in the visible (VIS) and NIR wavelengths can be significantly improved. Hence, by using Fourier analysis and Gaussian deconvolution with variable thresholds for the scattering deconvolution radius and frequency, sub-surface scattering effects are largely eliminated from perpendicular isomap transformations of the facial images. (Isomap images are obtained via scanning reconstruction, as in our case, or else, more generically, via model fitting.) Thus, small-scale features visible in both the VIS and NIR, such as skin-pores and certain classes of skin-mottling, can be equally weighted within the correlation analysis. The method can consequently serve as the basis for more detailed forms of facial comparison.

  • Windridge D, Mottl V, Tatarchuk A, Eliseyev A. (2007) 'The neutral point method for kernel-based combination of disjoint training data in multi-modal pattern recognition'. Springer-Verlag Berlin, Multiple Classifier Systems, Proceedings, Prague, Czech Republic: 7th International Workshop on Multiple Classifier Systems, 4472, pp. 13-21.
  • Windridge D, Kittler J. (2005) 'Performance measures of the tomographic classifier fusion methodology'. Int. Journal of Pattern Recognition and Artificial Intelligence, 19 (6), pp. 731-753.

    Abstract

    We seek to quantify both the classification performance and estimation error robustness of the authors' tomographic classifier fusion methodology by contrasting it in field tests and model scenarios with the sum and product classifier fusion methodologies. In particular, we seek to confirm that the tomographic methodology represents a generally optimal strategy across the entire range of problem dimensionalities, and at a sufficient margin to justify the general advocation of its use. Final results indicate, in particular, a near 25% improvement on the next nearest performing combination scheme at the extremity of the tested dimensional range.

  • Kittler J, Christmas WJ, Kostin A, Yan F, Kolonias I, Windridge D. (2005) 'A memory architecture and contextual reasoning framework for cognitive vision'. Springer-Verlag Berlin, Image Analysis, Proceedings, Joensuu, Finland: 14th Scandinavian Conference on Image Analysis, 3540, pp. 343-358.
  • Windridge D. (2005) 'Cognitive Bootstrapping: A Survey of Bootstrap Mechanisms for Emergent Cognition'. CVSSP Technical Report VSSP-TR-2/2005.

    Abstract

    We propose 'Cognitive Bootstrapping' as a blanket term for all instances of the process by which the perceptual apparatus of an autonomous agent is conceptually extended and experimentally validated. Bootstrap techniques are necessary to transcend the paradox inherent in validating perceptual categorisations via the objects of perception. Perception may thus become self-founding only within certain crucial a priori limits required to maintain referentiality and provide a validation criterion for the proposed perceptual updates. We hence survey the subject areas in which this mechanism occurs, ultimately advocating a hierarchically open-ended perception/action approach to artificial cognition in order to objectively ground perceptual updating.

  • Windridge D. (2005) 'Morphologically Debiased Classifier Fusion: A Tomography-Theoretic Approach'. Advances in Imaging and Electron Physics, 134, pp. 181-266.

    Abstract

    We set out in this review article to construct a generalized theory of classifier combination for classifiers that, at least in the theory's initial form, act within noncoincident feature-spaces. Doing so involves the postulation of an equivalence between the various strategies for classifier combination and the tomographic reconstruction of the joint pattern-space probability density function, where the classifiers themselves are interpreted as extremely bandwidth limited Radon transform data. This analogue will immediately suggest techniques for improving the process, as well as defining the optimal performance to be gained by such combinatorial approaches with respect to arbitrary joint pattern-space PDF morphologies. Furthermore, this methodology of optimality naturally will also encompass the feature selection process to present a unified perspective on the various differing aspects of classifier combination. A practical implementation of the methodology is also given, along with a series of tests to establish its performance in relation to both model and real-world classification scenarios.

  • Windridge D, Bowden R. (2005) 'Hidden Markov chain estimation and parameterisation via ICA-based feature-selection'. Pattern Analysis and Applications, 8 (1-2), pp. 115-124.
  • Windridge D, Bowden R. (2004) 'Induced Decision Fusion in Automated Sign Language Interpretation: Using ICA to Isolate the Underlying Components of Sign'. Multiple Classifier Systems, pp. 303-313.
  • Bowden R, Windridge D, Kadir T, Zisserman A, Brady M. (2004) 'A Linguistic Feature Vector for the Visual Interpretation of Sign Language'. European Conference on Computer Vision, pp. 390-401.
  • Kittler J, Ahmadyfard A, Windridge D. (2003) 'Serial multiple classifier systems exploiting a coarse to fine output coding'. Berlin: Springer, Multiple Classifier Systems, pp. 106-114.
  • Windridge D, Kittler J. (2003) 'A morphologically optimal strategy for classifier combination: Multiple expert fusion as a tomographic process'. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, pp. 343-353.
  • Kittler J, Yusoff Y, Christmas W, Windeatt T, Windridge D. (2001) 'Boosting multiple experts by joint optimisation of decision thresholds'. Pattern Recognition and Image Analysis, 11 (3), pp. 529-541.

    Abstract

    We consider a multiple classifier system which combines the hard decisions of experts by voting. We argue that the individual experts should not set their own decision thresholds. The respective thresholds should be selected jointly as this will allow compensation of the weaknesses of some experts by the relative strengths of the others. We perform the joint optimization of decision thresholds for a multiple expert system by a systematic sampling of the multidimensional decision threshold space. We show the effectiveness of this approach on the important practical application of video shot cut detection.

  • Windridge D, Phillipps S. (2000) 'A Fluctuation Analysis for Optical Cluster Galaxies - I. Theory'. Monthly Notices of the Royal Astronomical Society, 319, pp. 591-591.

Journal articles

  • Brown M, Windridge D, Guillemaut J. (2016) 'A Generalised Framework for Saliency-Based Point Feature Detection'. Computer Vision and Image Understanding, 157, pp. 117-137.

    Abstract

    Here we present a novel, histogram-based salient point feature detector that may naturally be applied to both images and 3D data. Existing point feature detectors are often modality specific, with 2D and 3D feature detectors typically constructed in separate ways. As such, their applicability in a 2D-3D context is very limited, particularly where the 3D data is obtained by a LiDAR scanner. By contrast, our histogram-based approach is highly generalisable and as such, may be meaningfully applied between 2D and 3D data. Using the generalised approach, we propose salient point detectors for images, and both untextured and textured 3D data. The approach naturally allows for the detection of salient 3D points based jointly on both the geometry and texture of the scene, allowing for broader applicability. The repeatability of the feature detectors is evaluated using a range of datasets including image and LiDAR input from indoor and outdoor scenes. Experimental results demonstrate a significant improvement in terms of 2D-2D and 2D-3D repeatability compared to existing multi-modal feature detectors.

  • Khan A, Windridge D, Kittler J. (2014) 'Multilevel Chinese Takeaway Process and Label-Based Processes for Rule Induction in the Context of Automated Sports Video Annotation'. IEEE Transactions on Cybernetics, PP (99), pp. 1-1.

    Abstract

    We propose four variants of a novel hierarchical hidden Markov model strategy for rule induction in the context of automated sports video annotation, including a multilevel Chinese takeaway process (MLCTP) based on the Chinese restaurant process and a novel Cartesian product label-based hierarchical bottom-up clustering (CLHBC) method that employs prior information contained within label structures. Our results show significant improvement by comparison against the flat Markov model: optimal performance is obtained using a hybrid method, which combines the MLCTP-generated hierarchical topological structures with CLHBC-generated event labels. We also show that the methods proposed are generalizable to other rule-based environments including human driving behavior and human actions.

  • Dalio M, Biral F, Bertolazzi E, Galvani M, Bosetti P, Windridge D, Saroldi A, Tango F. (2014) 'Artificial Codrivers as a Universal Enabling Technology for Future Intelligent Vehicles and Transportation Systems'. IEEE Transactions on Intelligent Transportation Systems and Intelligent Transportation Systems Magazine,
  • Windridge D, Kittler J, De Campos T, Yan F, William C, Aftab K. (2014) 'Rule Induction for Adaptive Sport Video Characterization Using MLN Clause Templates'. IEEE Transactions on MultiMedia,
  • Kittler J, Christmas W, de Campos T, Windridge D, Yan F, Illingworth J, Osman M. (2013) 'Domain Anomaly Detection in Machine Perception: A System Architecture and Taxonomy'. IEEE Trans Pattern Anal Mach Intell,

    Abstract

    We address the problem of anomaly detection in machine perception. The concept of domain anomaly is introduced as distinct from the conventional notion of anomaly used in the literature. We propose a unified framework for anomaly detection which exposes the multifaceted nature of anomalies and suggest effective mechanisms for identifying and distinguishing each facet as instruments for domain anomaly detection. The framework draws on the Bayesian probabilistic reasoning apparatus which clearly defines concepts such as outlier, noise, distribution drift, novelty detection (object, object primitive), rare events, and unexpected events. Based on these concepts we provide a taxonomy of domain anomaly events. One of the mechanisms helping to pinpoint the nature of anomaly is based on detecting incongruence between contextual and noncontextual sensor(y) data interpretation. The proposed methodology has wide applicability. It underpins in a unified way the anomaly detection applications found in the literature. To illustrate some of its distinguishing features, here the domain anomaly detection methodology is applied to the problem of anomaly detection for a video annotation system.

  • Windridge D, Felsberg M, Shaukat A. (2013) 'A Framework for Hierarchical Perception-Action Learning Utilizing Fuzzy Reasoning'. Cybernetics, IEEE Transactions on, 43 Article number 1, pp. 155-169.
  • Taya S, Windridge D, Osman M. (2013) 'Trained eyes: experience promotes adaptive gaze control in dynamic and uncertain visual environments'. PLoS One, United States: 8 (8)

    Abstract

    Current eye-tracking research suggests that our eyes make anticipatory movements to a location that is relevant for a forthcoming task. Moreover, there is evidence to suggest that with more practice anticipatory gaze control can improve. However, these findings are largely limited to situations where participants are actively engaged in a task. We ask: does experience modulate anticipative gaze control while passively observing a visual scene? To tackle this we tested people with varying degrees of experience of tennis, in order to uncover potential associations between experience and eye movement behaviour while they watched tennis videos. The number, size, and accuracy of saccades (rapid eye-movements) made around 'events' critical to the scene context (i.e. hits and bounces) were analysed. Overall, we found that experience improved anticipatory eye-movements while watching tennis clips. In general, those with extensive experience showed greater accuracy of saccades to upcoming event locations; this was particularly prevalent for events in the scene that carried high uncertainty (i.e. ball bounces). The results indicate that, even when passively observing, our gaze control system utilizes prior relevant knowledge in order to anticipate upcoming uncertain event locations.

  • Windridge D, Shaukat A, Hollnagel E. (2012) 'Characterizing Driver Intention via Hierarchical Perception–Action Modeling'. IEEE Transactions on Human-Machine Systems, 43 (1), pp. 17-31.

    Abstract

    We seek a mechanism for the classification of the intentional behavior of a cognitive agent, specifically a driver, in terms of a psychological Perception-Action (P-A) model, such that the resulting system would be potentially suitable for use in intelligent driver assistance. P-A models of human intentionality assume that a cognitive agent's perceptual domain is learned in response to the outcome of the agent's actions rather than vice versa. In this way, the perceptual domain is maintained at an appropriate level of complexity in relation to the agent's embodied motor capabilities, greatly simplifying visual processing. A subsumptive P-A model further captures the hierarchical nature of the subtask structure implicit in human actions and assumes that a parallel hierarchical structuring exists within the perceptual domain. Adopting this model enables us to characterize intentions at each level of the P-A hierarchy in terms of a range of descriptors derived from the U.K. Highway Code by examining their correlation with driver gaze behavior. The problem of classifying intentions thus becomes one of reconciling high-level protocols (i.e., Highway Code rules) with low-level perceptual features. We perform a “proof-of-concept” assessment of the model by comparative evaluation of a number of logic-based methods (both stochastic and deductive) for carrying out this classification utilizing the control, signal, and motor inputs of an instrumented vehicle driven by a single driver, and find that a deductive model gives superior intentional classification performance due to the strongly protocol-governed nature of the driving environment.

  • Taya S, Windridge D, Osman M. (2012) 'Looking to score: the dissociation of goal influence on eye movement and meta-attentional allocation in a complex dynamic natural scene'. PLoS One, United States: 7 (6)

    Abstract

    Several studies have reported that task instructions influence eye-movement behavior during static image observation. In contrast, during dynamic scene observation we show that while the specificity of the goal of a task influences observers' beliefs about where they look, the goal does not in turn influence eye-movement patterns. In our study observers watched short video clips of a single tennis match and were asked to make subjective judgments about the allocation of visual attention to the items presented in the clip (e.g., ball, players, court lines, and umpire). However, before attending to the clips, observers were either told to simply watch clips (non-specific goal), or they were told to watch the clips with a view to judging which of the two tennis players was awarded the point (specific goal). The results of subjective reports suggest that observers believed that they allocated their attention more to goal-related items (e.g. court lines) if they performed the goal-specific task. However, we did not find an effect of goal specificity on major eye-movement parameters (i.e., saccadic amplitudes, inter-saccadic intervals, and gaze coherence). We conclude that the specificity of a task goal can alter observers' beliefs about their attention allocation strategy, but such task-driven meta-attentional modulation does not necessarily correlate with eye-movement behavior.

  • Razin N, Sungurov D, Mottl V, Torshin I, Sulimova V, Seredin O, Windridge D. (2012) 'Application of the multi-modal relevance vector machine to the problem of protein secondary structure prediction'. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7632 LNBI, pp. 153-165.

    Abstract

    The aim of the paper is to experimentally examine the plausibility of Relevance Vector Machines (RVM) for protein secondary structure prediction. We restrict our attention to detecting strands which represent an especially problematic element of the secondary structure. The commonly adopted local principle of secondary structure prediction is applied, which implies comparison of a sliding window in the given polypeptide chain with a number of reference amino-acid sequences cut out of the training proteins as benchmarks representing the classes of secondary structure. As distinct from the classical RVM, the novel version applied in this paper allows for selective combination of several tentative window comparison modalities. Experiments on the RS126 data set have shown its ability to essentially decrease the number of reference fragments in the resulting decision rule and to select a subset of the most appropriate comparison modalities within the given set of the tentative ones. © 2012 Springer-Verlag.

  • Panov M, Tatarchuk A, Mottl V, Windridge D. (2011) 'A modified neutral point method for kernel-based fusion of pattern-recognition modalities with incomplete data sets'. Lecture Notes in Computer Science, 6713, pp. 126-136.

    Abstract

    It is commonly the case in multi-modal pattern recognition that certain modality-specific object features are missing in the training set. We address here the missing data problem for kernel-based Support Vector Machines, in which each modality is represented by the respective kernel matrix over the set of training objects, such that the omission of a modality for some object manifests itself as a blank in the modality-specific kernel matrix at the relevant position. We propose to fill the blank positions in the collection of training kernel matrices via a variant of the Neutral Point Substitution (NPS) method, where the term “neutral point” stands for the locus of points defined by the “neutral hyperplane” in the hypothetical linear space produced by the respective kernel. The current method crucially differs from the previously developed neutral point approach in that it is capable of treating missing data in the training set on the same basis as missing data in the test set. It is therefore of potentially much wider applicability. We evaluate the method on the Biosecure DS2 data set.

  • Poh N, Windridge D, Mottl V, Tatarchuk A, Eliseyev A. (2010) 'Addressing Missing Values in Kernel-based Multimodal Biometric Fusion using Neutral Point Substitution'. IEEE Transactions on Information Forensics and Security, 5 (3), pp. 461-469.

    Abstract

    In multimodal biometric information fusion, it is common to encounter missing modalities in which matching cannot be performed. As a result, at the match score level, this implies that scores will be missing. We address the multimodal fusion problem involving missing modalities (scores) using support vector machines (SVMs) with the neutral point substitution (NPS) method. The approach starts by processing each modality using a kernel. When a modality is missing, at the kernel level, the missing modality is substituted by one that is unbiased with regard to the classification, called a neutral point. Critically, unlike conventional missing-data substitution methods, explicit calculation of neutral points may be omitted by virtue of their implicit incorporation within the SVM training framework. Experiments based on the publicly available Biosecure DS2 multimodal (scores) data set show that the SVM-NPS approach achieves very good generalization performance compared to the sum rule fusion, especially with severe missing modalities.

  • Windridge D, Kittler J. (2010) 'Perception-Action Learning as an Epistemologically-Consistent Model for Self-Updating Cognitive Representation'. BRAIN INSPIRED COGNITIVE SYSTEMS 2008, Sao Luis, BRAZIL: 657, pp. 95-134.

    Abstract

    As well as having the ability to formulate models of the world capable of experimental falsification, it is evident that human cognitive capability embraces some degree of representational plasticity, having the scope (at least in infancy) to modify the primitives in terms of which the world is delineated. We hence employ the term 'cognitive bootstrapping' to refer to the autonomous updating of an embodied agent's perceptual framework in response to the perceived requirements of the environment in such a way as to retain the ability to refine the environment model in a consistent fashion across perceptual changes. We will thus argue that the concept of cognitive bootstrapping is epistemically ill-founded unless there exists an a priori percept/motor interrelation capable of maintaining an empirical distinction between the various possibilities of perceptual categorization and the inherent uncertainties of environment modeling. As an instantiation of this idea, we shall specify a very general, logically-inductive model of perception-action learning capable of compact re-parameterization of the percept space. In consequence of the a priori percept/action coupling, the novel perceptual state transitions so generated always exist in bijective correlation with a set of novel action states, giving rise to the required empirical validation criterion for perceptual inferences. Environmental description is correspondingly accomplished in terms of progressively higher-level affordance conjectures which are likewise validated by exploratory action. Application of this mechanism within simulated perception-action environments indicates that, as well as significantly reducing the size and specificity of the a priori perceptual parameter-space, the method can significantly reduce the number of iterations required for accurate convergence of the world-model. It does so by virtue of the active learning characteristics implicit in the notion of cognitive bootstrapping.

  • Shevchenko M, Windridge D, Kittler J. (2009) 'A linear-complexity reparameterisation strategy for the hierarchical bootstrapping of capabilities within perception-action architectures'. Image and Vision Computing, 27 (11), pp. 1702-1714.

    Abstract

    Perception-action (PA) architectures are capable of solving a number of problems associated with artificial cognition, in particular, difficulties concerned with framing and symbol grounding. Existing PA algorithms tend to be 'horizontal' in the sense that learners maintain their prior percept-motor competences unchanged throughout learning. We here present a methodology for simultaneous 'horizontal' and 'vertical' perception-action learning in which there additionally exists the capability for incremental accumulation of novel percept-motor competences in a hierarchical fashion. The proposed learning mechanism commences with a set of primitive 'innate' capabilities and progressively modifies itself via recursive generalising of parametric spaces within the linked perceptual and motor domains so as to represent environmental affordances in a maximally compact manner. Efficient reparameterising of the percept domain is here accomplished by the exploratory elimination of dimensional redundancy and environmental context. Experimental results demonstrate that this approach exhibits an approximately linear increase in computational requirements when learning in a typical unconstrained environment, as compared with at least polynomially-increasing requirements for a classical perception-action system. © 2008 Elsevier B.V. All rights reserved.

  • Windridge D, Kittler J. (2008) 'Epistemic constraints on autonomous symbolic representation in natural and artificial agents'. Studies in Computational Intelligence, 122, pp. 395-422.

    Abstract

    We set out to address, in the form of a survey, the fundamental constraints upon self-updating representation in cognitive agents of natural and artificial origin. The foundational epistemic problem encountered by such agents is that of distinguishing errors of representation from inappropriateness of the representational framework. Resolving this conceptual difficulty involves ensuring the empirical falsifiability of both the representational hypotheses and the entities so represented, while at the same time retaining their epistemic distinguishability. We shall thus argue that perception-action frameworks provide an appropriate basis for the development of an empirically meaningful criterion for validating perceptual categories. In this scenario, hypotheses about the agent’s world are defined in terms of environmental affordances (characterised in terms of the agent’s active capabilities). Agents with the capability to hierarchically-abstract this framework to a level consonant with performing syntactic manipulations and making deductive conjectures are consequently able to form an implicitly symbolic representation of the environment within which new, higher-level, modes of environment manipulation are implied (e.g. tool-use). This abstraction process is inherently open-ended, admitting a wide-range of possible representational hypotheses — only the form of the lowest-level of the hierarchy need be constrained a priori (being the minimally sufficient condition necessary for retention of the ability to falsify high-level hypotheses). In biological agents capable of autonomous cognitive-updating, we argue that the grounding of such a priori ‘bootstrap’ representational hypotheses is ensured via the process of natural selection.

  • Tatarchuk A, Mottl V, Eliseyev A, Windridge D. (2008) 'Selectivity supervision in combining pattern-recognition modalities by feature- and Kernel-selective Support Vector Machines'. Proceedings - International Conference on Pattern Recognition,
  • Windridge D, Kittler J. (2005) 'Performance measures of the tomographic classifier fusion methodology'. Int. Journal of Pattern Recognition and Artificial Intelligence, 19 (6), pp. 731-753.

    Abstract

    We seek to quantify both the classification performance and estimation error robustness of the authors' tomographic classifier fusion methodology by contrasting it in field tests and model scenarios with the sum and product classifier fusion methodologies. In particular, we seek to confirm that the tomographic methodology represents a generally optimal strategy across the entire range of problem dimensionalities, and at a sufficient margin to justify the general advocation of its use. Final results indicate, in particular, a near 25% improvement on the next nearest performing combination scheme at the extremity of the tested dimensional range.

  • Windridge D, Bowden R. (2005) 'Hidden Markov chain estimation and parameterisation via ICA-based feature-selection'. Pattern Analysis Applications, 8 Article number 1-2, pp. 115-124.
  • Windridge D. (2005) 'Morphologically Debiased Classifier Fusion: A Tomography-Theoretic Approach'. Advances in Imaging and Electron Physics, 134, pp. 181-266.

    Abstract

    We set out in this review article to construct a generalized theory of classifier combination for classifiers that, at least in the theory's initial form, act within noncoincident feature-spaces. Doing so involves the postulation of an equivalence between the various strategies for classifier combination and the tomographic reconstruction of the joint pattern-space probability density function, where the classifiers themselves are interpreted as extremely bandwidth limited Radon transform data. This analogue will immediately suggest techniques for improving the process, as well as defining the optimal performance to be gained by such combinatorial approaches with respect to arbitrary joint pattern-space PDF morphologies. Furthermore, this methodology of optimality naturally will also encompass the feature selection process to present a unified perspective on the various differing aspects of classifier combination. A practical implementation of the methodology is also given, along with a series of tests to establish its performance in relation to both model and real-world classification scenarios.

  • Windridge D, Kittler J. (2003) 'A morphologically optimal strategy for classifier combination: Multiple expert fusion as a tomographic process'. IEEE Trans. on Pattern Analysis and Machine Intelligence, 25, pp. 343-353.
  • Kittler J, Yusoff Y, Christmas W, Windeatt T, Windridge D. (2001) 'Boosting multiple experts by joint optimisation of decision thresholds'. Pattern Recognition and Image Analysis, 11 Article number 3, pp. 529-541.

    Abstract

    We consider a multiple classifier system which combines the hard decisions of experts by voting. We argue that the individual experts should not set their own decision thresholds. The respective thresholds should be selected jointly as this will allow compensation of the weaknesses of some experts by the relative strengths of the others. We perform the joint optimization of decision thresholds for a multiple expert system by a systematic sampling of the multidimensional decision threshold space. We show the effectiveness of this approach on the important practical application of video shot cut detection.

  • Windridge D, Phillipps S. (2000) 'A Fluctuation Analysis for Optical Cluster Galaxies - I. Theory'. Monthly Notices of the Royal Astronomical Society, 319, pp. 591-591.

Conference papers

  • Tatarchuk A, Sulimova V, Mottl V, Windridge D. (2014) 'Supervised Selective Kernel Fusion for Membrane Protein Prediction'. Springer 9th IAPR conference on Pattern Recognition in Bioinformatics (PRIB 2014), Stockholm, Sweden: 9th IAPR conference on Pattern Recognition in Bioinformatics (PRIB 2014)
    [ Status: Accepted ]
  • Chernousova E, Levdik P, Tatarchuk A, Mottl V, Windridge D. (2014) 'Hypothetical Cross Validation for the Choice of Structural Parameters in Feature-Selective Support Vector Machines'. Proc. 22nd International Conference on Pattern Recognition (ICPR 2014), Stockholm, Sweden: 22nd International Conference on Pattern Recognition
  • Campos TED, Khan A, Yan F, Faraji Davar N, Windridge D, Kittler J, Christmas W. (2013) 'A framework for automatic sports video annotation with anomaly detection and transfer learning'. Proceedings of Machine Learning and Cognitive Science, Palma de Mallorca: 3rd EUCogIII Members Conference
  • Hope C, Sterr A, Elangovan P, Geades N, Windridge D, Young K, Wells K. (2013) 'High throughput screening for mammography using a human-computer interface with Rapid Serial Visual Presentation (RSVP)'. Proceedings of SPIE - The International Society for Optical Engineering, 8673

    Abstract

    The steady rise of the breast cancer screening population, coupled with data expansion produced by new digital screening technologies (tomosynthesis/CT) motivates the development of new, more efficient image screening processes. Rapid Serial Visual Presentation (RSVP) is a new fast-content recognition approach which uses electroencephalography to record brain activity elicited by fast bursts of image data. These brain responses are then subjected to machine classification methods to reveal the expert's 'reflex' response to classify images according to their presence or absence of particular targets. The benefit of this method is that images can be presented at high temporal rates (∼10 per second), faster than that required for fully conscious detection, facilitating a high throughput of image (screening) material. In the present paper we present the first application of RSVP to medical image data, and demonstrate how cortically coupled computer vision can be successfully applied to breast cancer screening. Whilst prior RSVP work has utilised multichannel approaches, we also present the first RSVP results demonstrating discriminatory response on a single electrode with a ROC area under the curve of 0.62-0.86 using a simple Fisher discriminator for classification. This increases to 0.75-0.94 when multiple electrodes are used in combination. © 2013 SPIE.

  • Yan F, Kittler J, Mikolajczyk K, Windridge D. (2012) 'Automatic Annotation of Court Games with Structured Output Learning'. Tsukuba Science City, JAPAN: International Conference on Pattern Recognition (ICPR) 2012
  • Seredin O, Mottl V, Tatarchuk A, Razin N, Windridge D. (2012) 'Convex support and Relevance Vector Machines for selective multimodal pattern recognition'. IEEE Pattern Recognition (ICPR), 2012 21st International Conference on, Tsukuba, Japan: 21st International Conference on Pattern Recognition (ICPR2012), pp. 1647-1650.

    Abstract

    We address the problem of featureless pattern recognition under the assumption that pair-wise comparison of objects is arbitrarily scored by real numbers. Such a linear embedding is much more general than the traditional kernel-based approach, which demands positive semi-definiteness of the matrix of object comparisons. This demand is frequently prohibitive and is further complicated if there exist a large number of comparison functions, i.e., multiple modalities of object representation. In these cases, the experimenter typically also has the problem of eliminating redundant modalities and objects. In the context of the general pair-wise comparison space this problem becomes mathematically analogous to that of wrapper-based feature selection. The resulting convex SVM-like training criterion is analogous to Tipping's Relevance Vector Machine, but essentially generalizes it via the presence of a structural parameter controlling the selectivity level. © 2012 ICPR Org Committee.

  • Shaukat A, Gilbert A, Windridge D, Bowden R. (2012) 'Meeting in the Middle: A top-down and bottom-up approach to detect pedestrians'. IEEE Pattern Recognition (ICPR), 2012 21st International Conference on, Tsukuba, Japan: 21st International Conference on Pattern Recognition, pp. 874-877.

    Abstract

    This paper proposes a generic approach combining a bottom-up (low-level) visual detector with a top-down (high-level) fuzzy first-order logic (FOL) reasoning framework in order to detect pedestrians from a moving vehicle. Detections from the low-level visual corner-based detector are fed into the logical reasoning framework as logical facts. A set of FOL clauses utilising fuzzy predicates with piecewise linear continuous membership functions associates a fuzzy confidence (a degree-of-truth) to each detector input. Detections associated with lower confidence functions are deemed false positives and blanked out, thus adding top-down constraints based on global logical consistency of detections. We employ a state-of-the-art visual detector on a challenging pedestrian detection dataset, and demonstrate an increase in detection performance when used in a framework that combines bottom-up detections with (fuzzy FOL-based) top-down constraints. © 2012 ICPR Org Committee.

  • Kiani S, Gordon I, Windridge D, Wells K. (2012) 'On-line spatio-temporal independent component analysis for motion correction in renal DCE-MRI'. IEEE Nuclear Science Symposium Conference Record, pp. 2910-2915.

    Abstract

    Dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) renography, in common with other medical imaging techniques, is influenced by respiratory motion. As a result, data quantification may be inaccurate. This work presents a novel on-line approach for motion correction by implementing a spatio-temporal independent component analysis method (STICA). This methodology firstly results in removal of motion artefacts and secondly provides independent components that have physiological characteristics. The STICA was applied to 10 healthy volunteers' renal DCE-MRI data. The results were evaluated using independent component curve gradients (ICGs) from different regions of interest and by comparing them with the Rutland-Patlak (RP) analysis. The r² values for the ICGs were significantly higher compared to the RP curves. The standard deviations of the IC curve gradients also showed less dispersion in comparison with the RP curve gradients across all ten volunteers' renal data. © 2012 IEEE.

  • Almajai I, Yan F, de Campos T, Khan A, Christmas W, Windridge D, Kittler J. (2012) 'Anomaly Detection and Knowledge Transfer in Automatic Sports Video Annotation'. Springer Proceedings of DIRAC Workshop on Detection and Identification of Rare Audiovisual Cues, Barcelona, Spain: DIRAC Workshop on Detection and Identification of Rare Audiovisual Cues, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 384, pp. 109-117.

    Abstract

    A key question in machine perception is how to adaptively build upon existing capabilities so as to permit novel functionalities. Implicit in this are the notions of anomaly detection and learning transfer. A perceptual system must firstly determine at what point the existing learned model ceases to apply, and secondly, what aspects of the existing model can be brought to bear on the newly defined learning domain. Anomalies must thus be distinguished from mere outliers, i.e. cases in which the learned model has failed to produce a clear response; it is also necessary to distinguish novel (but meaningful) input from misclassification error within the existing models. We thus apply a methodology of anomaly detection based on comparing the outputs of strong and weak classifiers [8] to the problem of detecting the rule-incongruence involved in the transition from singles to doubles tennis videos. We then demonstrate how the detected anomalies can be used to transfer learning from one (initially known) rule-governed structure to another. Our ultimate aim, building on existing annotation technology, is to construct an adaptive system for court-based sport video annotation.

  • FarajiDavar N, de Campos TE, Windridge D, Kittler J, Christmas W. (2011) 'Domain Adaptation in the Context of Sport Video Action Recognition'. Sierra Nevada, Spain: NIPS 2011 Domain Adaptation Workshop

    Abstract

    We apply domain adaptation to the problem of recognizing common actions between differing court-game sport videos (in particular tennis and badminton games). Actions are characterized in terms of HOG3D features extracted at the bounding box of each detected player, and thus have large intrinsic dimensionality. The techniques evaluated here for domain adaptation are based on estimating linear transformations to adapt the source domain features in order to maximize the similarity between posterior PDFs for each class in the source domain and the expected posterior PDF for each class in the target domain. As such, the problem scales linearly with feature dimensionality, making the video-environment domain adaptation problem tractable on reasonable time scales and resilient to over-fitting. We thus demonstrate that significant performance improvement can be achieved by applying domain adaptation in this context.

  • Goswami D, Chan CH, Windridge D, Kittler J. (2011) 'Evaluation of face recognition system in heterogeneous environments (visible vs NIR)'. IEEE Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain: ICCV 2011, pp. 2160-2167.

    Abstract

    Performing facial recognition between Near Infrared (NIR) and visible-light (VIS) images has been established as a common method of countering illumination variation problems in face recognition. In this paper we present a new database to enable the evaluation of cross-spectral face recognition. A series of preprocessing algorithms, followed by Local Binary Pattern Histogram (LBPH) representation and combinations with Linear Discriminant Analysis (LDA) are used for recognition. These experiments are conducted on both NIR→VIS and the less common VIS→NIR protocols, with permutations of uni-modal training sets. Twelve individual baseline algorithms are presented. In addition, the best-performing fusion approaches involving a subset of the 12 algorithms are also described. © 2011 IEEE.

  • Huang Q, Cox S, Yan F, deCampos TE, Windridge D, Kittler J, Christmas W. (2011) 'Improved Detection of Ball Hit Events in a Tennis Game Using Multimodal Information'. Volterra, Italy: KTH Computer Science and Communication 11th International Conference on Auditory-Visual Speech Processing (AVSP), Volterra, Italy: International Conference on Auditory-Visual Speech Processing

    Abstract

    We describe a novel framework to detect ball hits in a tennis game by combining audio and visual information. Ball hit detection is a key step in understanding a game such as tennis, but single-mode approaches are not very successful: audio detection suffers from interfering noise and acoustic mismatch, while video detection is made difficult by the small size of the ball and the complex background of the surrounding environment. Our goal in this paper is to improve detection performance by focusing on high-level information (rather than low-level features), including the detected audio events, the ball’s trajectory, and inter-event timing information. Visual information supplies coarse detection of the ball-hit events. This information is used as a constraint for audio detection. In addition, useful gains in detection performance can be obtained by using inter-ball-hit timing information, which aids prediction of the next ball hit. This method seems to be very effective in reducing the interference present in low-level features. After applying this method to a women’s doubles tennis game, we obtained improvements in the F-score of about 30% (absolute) for audio detection and about 10% for video detection.

  • De Campos T, Barnard M, Mikolajczyk K, Kittler J, Yan F, Christmas W, Windridge D. (2011) 'An evaluation of bags-of-words and spatio-temporal shapes for action recognition'. 2011 IEEE Workshop on Applications of Computer Vision, WACV 2011, pp. 344-351.

    Abstract

    Bags-of-visual-Words (BoW) and Spatio-Temporal Shapes (STS) are two very popular approaches for action recognition from video. The former (BoW) is an unstructured global representation of videos which is built using a large set of local features. The latter (STS) uses a single feature located on a region of interest (where the actor is) in the video. Despite the popularity of these methods, no comparison between them has been done. Also, given that BoW and STS differ intrinsically in terms of context inclusion and globality/locality of operation, an appropriate evaluation framework has to be designed carefully. This paper compares these two approaches using four different datasets with varying degrees of space-time specificity of the actions and varying relevance of the contextual background. We use the same local feature extraction method and the same classifier for both approaches. Further to BoW and STS, we also evaluated novel variations of BoW constrained in time or space. We observe that the STS approach leads to better results in all datasets whose background is of little relevance to action classification. © 2010 IEEE.

  • Felsberg M, Shaukat A, Windridge D. (2010) 'Online Learning in Perception-Action Systems'. Hersonissos, Crete, Greece: ECCV 2010 Workshop on Vision for Cognitive Tasks

    Abstract

    In this position paper, we seek to extend the layered perception-action paradigm for on-line learning such that it includes an explicit symbolic processing capability. By incorporating symbolic processing at the apex of the perception-action hierarchy in this way, we ensure that abstract symbol manipulation is fully grounded, without the necessity of specifying an explicit representational framework. In order to carry out this novel interfacing between symbolic and sub-symbolic processing, it is necessary to embed fuzzy first-order logic theorem proving within a variational framework. The online learning resulting from the corresponding Euler-Lagrange equations establishes an extended adaptability compared to the standard subsumption architecture. We discuss an application of this approach within the field of advanced driver assistance systems, demonstrating that a closed-form solution to the Euler-Lagrange optimization problem is obtainable for simple cases.

  • Felsberg M, Shaukat A, Windridge D. (2010) 'Online Learning in Perception-Action Systems'. Proceedings of ECCV 2010 Workshop on Vision for Cognitive Tasks, 11th European Conference on Computer Vision (ECCV 2010), Crete, Greece.
  • Almajai I, Kittler J, De Campos T, Christmas W, Yan F, Windridge D, Khan A. (2010) 'Ball Event Recognition Using HMM for Automatic Tennis Annotation'. Proceedings of Intl. Conf. on Image Proc.
  • Taya S, Windridge D, Kittler J, Osman M. (2010) 'Rule-based modulation of visual attention allocation'. Perception (Vol 39, Abstract Suppl), Lausanne, Switzerland: Thirty-third European Conference on Visual Perception, ECVP 2010, pp. 81-81.

    Abstract

    In what way is information processing influenced by the rules underlying a dynamic scene? In two studies we consider this question by examining the relationship between attention allocation in a dynamic visual scene (ie a singles tennis match) and the absence/presence of rule application (ie point allocation task). During training participants observed short clips of a tennis match, and for each they indicated the order of the items (eg players, ball, court lines, umpire, and crowd) from most to least attended. Participants performed a similar task in the test phase, but were also presented with a specific goal which was to indicate which of the two players won the point. In the second experiment, the effects of goal-directed vs non-goal directed observation were compared based on behavioural measures (self-reported ranks and point allocation) and eye-tracking data. Critical differences were revealed between observers regarding their attention allocation for items related to the specific goal (eg court lines). Overall, by varying the levels of goal specificity, observers showed different sensitivity to rule-based items in a dynamic visual scene according to the allocation of attention.

  • Shaukat A, Windridge D, Hollnagel E, Macchi L, Kittler J. (2010) 'Adaptive, Perception-Action-based Cognitive Modelling of Human Driving Behaviour using Control, Gaze and Signal inputs'. Madrid, Spain: Brain Inspired Cognitive Systems

    Abstract

    A perception-action framework for cognition represents the world in terms of an embodied agent’s ability to bring about changes within that environment. This amounts to an affordance-based modelling of the environment. Recent psychological research suggests that a hierarchical perception-action model, known as the Extended Control Model (ECOM), is employed by humans within a vehicle driving context. We thus seek to use machine learning techniques to identify ECOM states (i.e. hierarchical driver intentions) using the modalities of eye-gaze, signalling and driver control input with respect to external visual features. Our approach consists in building a deductive logical model based on a priori highway-code and ECOM rules, which is then to be applied to non-contextual stochastic classifications of feature inputs from a test-car’s camera and detectors so as to determine the currently active ECOM state. Since feature inputs are both noisy and sparse, the goal of the logic system is to adaptively impose top-down consistency and completeness on the input. The cognitively-motivated combination of stochastic bottom-up and logical top-down representational induction means that the machine learning problem is one of symbol tethering in Sloman’s sense.

  • Shaukat A, Windridge D, Hollnagel E, Macchi L. (2010) 'Adaptive, Perception-Action-based Cognitive Modelling of Human Driving Behaviour Using Control, Gaze and Signal Inputs'. Proceedings of Brain Inspired Systems 2010 (BICS 2010).
  • Shaukat A, Windridge D, Hollnagel E, Macchi L. (2010) 'Induction of the Human Perception-Action Hierarchy Employed in Junction-Navigation Scenarios'. ETH Zurich, Switzerland: Proc. of 4th International Conference on Cognitive Systems, CogSys 2010
  • Shaukat A, Windridge D, Hollnagel E, Macchi L, Kittler J. (2010) 'Induction of the Human Perception-Action Hierarchy Employed in Junction-Navigation Scenarios'. Zurich, Switzerland: International Conference on Cognitive Systems, CogSys
  • Tatarchuk A, Urlov E, Mottl V, Windridge D. (2010) 'A Support Kernel Machine for Supervised Selective Combining of Diverse Pattern-Recognition Modalities'. Springer-Verlag Berlin Multiple Classifier Systems, Proceedings, Cairo, Egypt: 9th International Workshop on Multiple Classifier Systems 5997, pp. 165-174.
  • Almajai I, Yan F, de Campos TE, Khan A, Christmas W, Windridge D, Kittler J. (2010) 'Anomaly Detection and Knowledge Transfer in Automatic Sports Video Annotation'. Proceedings of DIRAC Workshop, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2010).
  • Windridge D. (2010) 'Tomographic Considerations in Ensemble Bias/Variance Decomposition'. Springer-Verlag Berlin Multiple Classifier Systems, Proceedings, Cairo, Egypt: 9th International Workshop on Multiple Classifier Systems 5997, pp. 43-53.
  • Khan A, Windridge D, De Campos T, Kittler J, Christmas W. (2010) 'Lattice-based anomaly rectification for sport video annotation'. Proceedings - International Conference on Pattern Recognition, pp. 4372-4375.

    Abstract

    Anomaly detection has received much attention within the literature as a means of determining, in an unsupervised manner, whether a learning domain has changed in a fundamental way. This may require continuous adaptive learning to be abandoned and a new learning process initiated in the new domain. A related problem is that of anomaly rectification; the adaptation of the existing learning mechanism to the change of domain. As a concrete instantiation of this notion, the current paper investigates a novel lattice-based HMM induction strategy for arbitrary court-game environments. We test (in real and simulated domains) the ability of the method to adapt to a change of rule structures going from tennis singles to tennis doubles. Our long term aim is to build a generic system for transferring game-rule inferences. © 2010 IEEE.

  • Windridge D, Poh N, Mottl V, Tatarchuk A, Eliseyev A. (2009) 'Handling Multimodal Information Fusion with Missing Observations Using the Neutral Point Substitution Method'. Springer-Verlag Berlin Multiple Classifier Systems, Proceedings, Univ Iceland, Reykjavik, Iceland: 8th International Workshop on Multiple Classifier Systems 5519, pp. 161-170.
  • Tatarchuk A, Sulimova V, Windridge D, Mottl V, Lange M. (2009) 'Supervised Selective Combining Pattern Recognition Modalities and Its Application to Signature Verification by Fusing On-Line and Off-Line Kernels'. Springer-Verlag Berlin Multiple Classifier Systems, Proceedings, Univ Iceland, Reykjavik, Iceland: 8th International Workshop on Multiple Classifier Systems 5519, pp. 324-334.
  • Tatarchuk A, Mottl V, Eliseyev A, Windridge D. (2008) 'Selectivity Supervision in Combining Pattern-Recognition Modalities by Feature- and Kernel-Selective Support Vector Machines'. IEEE 19th International Conference on Pattern Recognition, Tampa, USA: 19th ICPR 2008, pp. 2336-2339.

    Abstract

    Multi-modal pattern recognition must frequently truncate the set of initially available modalities. When a kernel-based approach is adopted within each modality, the problem of modality selection becomes mathematically analogous to that of wrapper-based feature selection. In this paper, we revise two implicitly wrapper-based methods of SVM-embedded selective kernel combination, the Relevance and Support Kernel Machines, so as to equip them with the ability to preset the desired level of feature-selectivity. Hence, a continuous axis of nested feature selection models is obtained, ranging from the absence of selectivity to the selection of single features. We thus unite the distinct processes of selection and classification within the two techniques in a manner suitable for general application within kernel-based multi-modal pattern recognition.

  • Windridge D, Shevchenko M, Kittler J. (2008) 'An Entropy-Based Approach to the Hierarchical Acquisition of Perception-Action Capabilities'. Springer-Verlag Berlin Cognitive Vision, Santorini, Greece: 4th International Cognitive Vision Workshop (ICVW 2008) 5329, pp. 79-92.
  • Kittler J, Windridge D, Goswami D. (2008) 'Subsurface Scattering Deconvolution for Improved NIR-Visible Facial Image Correlation'. IEEE 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2008), Vols 1 and 2, Amsterdam, Netherlands: 8th IEEE International Conference on Automatic Face and Gesture Recognition, pp. 889-894.

    Abstract

    Significant improvements in face-recognition performance have recently been achieved by obtaining near infrared (NIR) probe images. We demonstrate that by taking into account the differential effects of sub-surface scattering, correlation between facial images in the visible (VIS) and NIR wavelengths can be significantly improved. Hence, by using Fourier analysis and Gaussian deconvolution with variable thresholds for the scattering deconvolution radius and frequency, sub-surface scattering effects are largely eliminated from perpendicular isomap transformations of the facial images. (Isomap images are obtained via scanning reconstruction, as in our case, or else, more generically, via model fitting.) Thus, small-scale features visible in both the VIS and NIR, such as skin-pores and certain classes of skin-mottling, can be equally weighted within the correlation analysis. The method can consequently serve as the basis for more detailed forms of facial comparison.

  • Windridge D, Kittler J. (2008) 'A model for empirical validation in self-updating cognitive representation'. Proceedings of Brain Inspired Systems 2008 (BICS 2008).
  • Windridge D, Kittler J. (2007) 'Open-Ended Inference of Relational Representations in the COSPAL Perception-Action Architecture'. Proceedings of the 5th International Conference on Machine Vision Applications.
  • Windridge D, Mottl V, Tatarchuk A, Eliseyev A. (2007) 'The relationship between kernel and classifier fusion in kernel-based multi-modal pattern recognition: An experimental study'. IEEE Proceedings of 2007 International Conference on Machine Learning and Cybernetics, Vols 1-7, Hong Kong, People's Republic of China: 6th International Conference on Machine Learning and Cybernetics, pp. 3594-3600.
  • Windridge D, Mottl V, Tatarchuk A, Eliseyev A. (2007) 'The neutral point method for kernel-based combination of disjoint training data in multi-modal pattern recognition'. Springer-Verlag Berlin Multiple Classifier Systems, Proceedings, Prague, Czech Republic: 7th International Workshop on Multiple Classifier Systems 4472, pp. 13-21.
  • Kittler J, Shevchenko M, Windridge D. (2006) 'Visual Bootstrapping for Unsupervised Symbol Grounding'. Springer Proceedings of 8th Advanced Concepts for Intelligent Vision Systems International Conference, pp. 1037-1046.

    Abstract

    Most existing cognitive architectures integrate computer vision and symbolic reasoning. However, there is still a gap between low-level scene representations (signals) and abstract symbols. Manually attaching, i.e. grounding, the symbols to the physical context makes it impossible to expand system capabilities by learning new concepts. This paper presents a visual bootstrapping approach for unsupervised symbol grounding. The method is based on a recursive clustering of a perceptual category domain controlled by goal acquisition from the visual environment. The novelty of the method consists in the division of goals into the classes of parameter goal, invariant goal and context goal. The proposed system exhibits incremental learning in such a manner as to allow effective transferable representation of high-level concepts.

  • Kittler J, Shevchenko M, Windridge D. (2006) 'Cognitive learning with automatic goal acquisition'. IOS Press STAIRS 2006, Riva del Garda, Italy: 3rd Starting Artificial Intelligence Researchers Symposium 142, pp. 3-13.
  • Bowden R, Ellis L, Kittler J, Shevchenko M, Windridge D. (2005) 'Unsupervised symbol grounding and cognitive bootstrapping in cognitive vision'. Proc. 13th Int. Conference on Image Analysis and Processing, pp. 27-36.
  • Patenall R, Windridge D, Kittler J. (2005) 'Multiple Classifier Fusion Performance in Networked Stochastic Vector Quantisers'. Springer Proceedings of 6th International Workshop on Multiple Classifier Systems, pp. 128-135.
  • Kittler J, Christmas WJ, Kostin A, Yan F, Kolonias I, Windridge D. (2005) 'A memory architecture and contextual reasoning framework for cognitive vision'. Springer-Verlag Berlin Image Analysis, Proceedings, Joensuu, Finland: 14th Scandinavian Conference on Image Analysis 3540, pp. 343-358.
  • Kittler J, Christmas WJ, Yan F, Kolonias I, Windridge D. (2005) 'A memory architecture and contextual reasoning framework for cognitive vision'. Proc. SCIA, pp. 343-358.
  • Windridge D, Patenall R, Kittler J. (2004) 'The relationship between classifier factorisation and performance in stochastic vector quantisation'. Berlin: Springer Multiple Classifier Systems, pp. 194-203.
  • Windridge D, Bowden R. (2004) 'Induced Decision Fusion in Automated Sign Language Interpretation: Using ICA to Isolate the Underlying Components of Sign'. Multiple Classifier Systems, pp. 303-313.
  • Bowden R, Windridge D, Kadir T, Zisserman A, Brady M. (2004) 'A Linguistic Feature Vector for the Visual Interpretation of Sign Language'. European Conference on Computer Vision, pp. 390-401.
  • Windridge D, Bowden R, Kittler J. (2004) 'A General Strategy for Hidden Markov Chain Parameterisation in Composite Feature-Spaces'. SSPR/SPR, pp. 1069-1077.
  • Windridge D. (2003) 'The practical performance characteristics of tomographically filtered multiple classifier fusion'. Berlin: Springer Multiple Classifier Systems, pp. 186-195.
  • Kittler J, Ahmadyfard A, Windridge D. (2003) 'Serial multiple classifier systems exploiting a coarse to fine output coding'. Berlin: Springer Multiple Classifier Systems, pp. 106-114.
  • Windridge D, Kittler J. (2002) 'Morphologically unbiased classifier combination through graphical PDF correlation'. Windsor, Ontario, Canada: Joint IAPR International Workshop on Structural, Syntactic and Statistical Pattern Recognition, pp. 789-797.
  • Windridge D, Kittler J. (2002) 'On the general application of the tomographic classifier fusion methodology'. Cagliari: Multiple classifier systems 2002, pp. 149-158.
  • Windridge D, Kittler J. (2000) 'Combined classifier optimisation via feature selection'. Advances in Pattern Recognition, pp. 687-695.

Book chapters

  • Windridge D, Bober M. (2014) 'A Kernel-Based Framework for Medical Big-Data Analytics'. in Holzinger A, Jurisica I (eds.) Interactive Knowledge Discovery and Data Mining in Biomedical Informatics Springer Berlin Heidelberg 8401, pp. 197-208.
  • Almajai I, Yan F, de Campos TE, Khan A, Christmas W, Windridge D, Kittler J. (2012) 'Anomaly Detection and Knowledge Transfer in Automatic Sports Video Annotation'. in Weinshall D, Anemüller J, van Gool L (eds.) Detection and Identification of Rare Audiovisual Cues Springer 384, pp. 109-117.

    Abstract

    A key question in machine perception is how to adaptively build upon existing capabilities so as to permit novel functionalities. Implicit in this are the notions of anomaly detection and learning transfer. A perceptual system must firstly determine at what point the existing learned model ceases to apply, and secondly, what aspects of the existing model can be brought to bear on the newly-defined learning domain. Anomalies must thus be distinguished from mere outliers, i.e. cases in which the learned model has failed to produce a clear response; it is also necessary to distinguish novel (but meaningful) input from misclassification error within the existing models. We thus apply a methodology of anomaly detection based on comparing the outputs of strong and weak classifiers [10] to the problem of detecting the rule-incongruence involved in the transition from singles to doubles tennis videos. We then demonstrate how the detected anomalies can be used to transfer learning from one (initially known) rule-governed structure to another. Our ultimate aim, building on existing annotation technology, is to construct an adaptive system for court-based sport video annotation.

Reports

  • Windridge D, Patenall R, Kittler J. (2007) Factoriality as an Indicator of Stochastic Vector Quantiser Generalising Ability.
  • Windridge D. (2005) Cognitive Bootstrapping: A Survey of Bootstrap Mechanisms for Emergent Cognition. CVSSP Technical Report VSSP-TR-2/2005.

    Abstract

    We propose 'Cognitive Bootstrapping' as a blanket term for all instances of the process by which the perceptual apparatus of an autonomous agent is conceptually extended and experimentally validated. Bootstrap techniques are necessary to transcend the paradox inherent in validating perceptual categorisations via the objects of perception. Perception may thus become self-founding only within certain crucial a priori limits required to maintain referentiality and provide a validation criterion for the proposed perceptual updates. We hence survey the subject areas in which this mechanism occurs, ultimately advocating a hierarchically open-ended perception/action approach to artificial cognition in order to objectively ground perceptual updating.

  • Windridge D. (2004) On the Generalisation of Gaussian Mixture Model HMM Parametrisation Techniques. Univ. of Surrey Technical Report: VSSP-TR-1/2004 Article number 3
  • Windridge D, Kittler J. (2003) Economic Tomographic Classifier Fusion: Eliminating Redundant Högbom Deconvolution Cycles in the Sum-Rule Domain. Article number 1
  • Windridge D, Kittler J. (2000) An Optimal Solution to the Problem of Multiple Expert Fusion.
