Dr Krystian Mikolajczyk

Reader in Robot Vision

Qualifications: PhD, MSc

Email:
Phone: Work: 01483 68 3959
Room no: 29 AB 05

Further information

Publications

Highlights

  • Kalal Z, Matas J, Mikolajczyk K. (2012) 'Tracking-Learning-Detection'. IEEE IEEE Transactions on Pattern Analysis and Machine Intelligence, 34 (7), pp. 1409-1422.

    Abstract

    This paper investigates long-term tracking of unknown objects in a video stream. The object is defined by its location and extent in a single frame. In every frame that follows, the task is to determine the object's location and extent or indicate that the object is not present. We propose a novel tracking framework (TLD) that explicitly decomposes the long-term tracking task into tracking, learning and detection. The tracker follows the object from frame to frame. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary. The learning estimates detector's errors and updates it to avoid these errors in the future. We study how to identify detector's errors and learn from them. We develop a novel learning method (P-N learning) which estimates the errors by a pair of "experts": (i) P-expert estimates missed detections, and (ii) N-expert estimates false alarms. The learning process is modeled as a discrete dynamical system and the conditions under which the learning guarantees improvement are found. We describe our real-time implementation of the TLD framework and the P-N learning. We carry out an extensive quantitative evaluation which shows a significant improvement over state-of-the-art approaches.

  • Yan F, Kittler J, Mikolajczyk K, Tahir A. (2011) 'Non-Sparse Multiple Kernel Fisher Discriminant Analysis'. Microtome Publishing Journal of Machine Learning Research, 13, pp. 607-642.

    Abstract

    Sparsity-inducing multiple kernel Fisher discriminant analysis (MK-FDA) has been studied in the literature. Building on recent advances in non-sparse multiple kernel learning (MKL), we propose a non-sparse version of MK-FDA, which imposes a general ℓp norm regularisation on the kernel weights. We formulate the associated optimisation problem as a semi-infinite program (SIP), and adapt an iterative wrapper algorithm to solve it. We then discuss, in light of latest advances in MKL optimisation techniques, several reformulations and optimisation strategies that can potentially lead to significant improvements in the efficiency and scalability of MK-FDA. We carry out extensive experiments on six datasets from various application areas, and compare closely the performance of ℓp MK-FDA, fixed norm MK-FDA, and several variants of SVM-based MKL (MK-SVM). Our results demonstrate that ℓp MK-FDA improves upon sparse MK-FDA in many practical situations. The results also show that on image categorisation problems, ℓp MK-FDA tends to outperform its SVM counterpart. Finally, we also discuss the connection between (MK-)FDA and (MK-)SVM, under the unified framework of regularised kernel machines.

  • Mikolajczyk K, Uemura H. (2011) 'Action recognition with appearance-motion features and fast search trees'. Elsevier Computer Vision and Image Understanding, 115 (3), pp. 426-438.
  • Cai H, Mikolajczyk K, Matas J. (2011) 'Learning linear discriminant projections for dimensionality reduction of image descriptors'. IEEE IEEE Transactions on Pattern Analysis and Machine Intelligence, 33 (2), pp. 338-352.

    Abstract

    In this paper, we present Linear Discriminant Projections (LDP) for reducing dimensionality and improving discriminability of local image descriptors. We place LDP into the context of state-of-the-art discriminant projections and analyze its properties. LDP requires a large set of training data with point-to-point correspondence ground truth. We demonstrate that training data produced by a simulation of image transformations leads to nearly the same results as the real data with correspondence ground truth. This makes it possible to apply LDP as well as other discriminant projection approaches to the problems where the correspondence ground truth is not available, such as image categorization. We perform an extensive experimental evaluation on standard data sets in the context of image matching and categorization. We demonstrate that LDP enables significant dimensionality reduction of local descriptors and performance increases in different applications. The results improve upon the state-of-the-art recognition performance with simultaneous dimensionality reduction from 128 to 30.

  • Cai H, Yan F, Mikolajczyk K. (2010) 'Learning Weights for Codebook in Image Classification and Retrieval'. IEEE IEEE Conference on Computer Vision and Pattern Recognition, pp. 2320-2327.

    Abstract

    This paper presents a codebook learning approach for image classification and retrieval. It corresponds to learning a weighted similarity metric to satisfy that the weighted similarity between the same labeled images is larger than that between the differently labeled images with largest margin. We formulate the learning problem as a convex quadratic programming and adopt alternating optimization to solve it efficiently. Experiments on both synthetic and real datasets validate the approach. The codebook learning improves the performance, in particular in the case where the number of training examples is not sufficient for large size codebook.

  • Mikolajczyk K, Kalal Z, Matas J. (2010) 'P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints'. IEEE Conference on Computer Vision and Pattern Recognition
  • Yan F, Mikolajczyk K, Barnard M, Cai H, Kittler J. (2010) 'Lp Norm Multiple Kernel Fisher Discriminant Analysis for Object and Image Categorisation'. IEEE Conference on Computer Vision and Pattern Recognition
  • Yan F, Kittler J, Mikolajczyk K, Tahir A. (2009) 'Non-Sparse Multiple Kernel Learning for Fisher Discriminant Analysis'. Proceedings of The Ninth IEEE International Conference on Data Mining, Miami, USA: ICDM '09, pp. 1064-1069.

    Abstract

    We consider the problem of learning a linear combination of pre-specified kernel matrices in the Fisher discriminant analysis setting. Existing methods for such a task impose an ℓ1 norm regularisation on the kernel weights, which produces sparse solution but may lead to loss of information. In this paper, we propose to use ℓ2 norm regularisation instead. The resulting learning problem is formulated as a semi-infinite program and can be solved efficiently. Through experiments on both synthetic data and a very challenging object recognition benchmark, the relative advantages of the proposed method and its ℓ1 counterpart are demonstrated, and insights are gained as to how the choice of regularisation norm should be made.

  • Tahir MA, Kittler J, Mikolajczyk K, Yan F, Van De Sande KEA, Gevers T. (2009) 'Visual category recognition using spectral regression and kernel discriminant analysis'. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops 2009, pp. 178-185.
  • Tuytelaars T, Mikolajczyk K. (2008) 'Local invariant feature detectors: A survey'. Now Publishers Foundations and Trends in Computer Graphics and Vision, 3 (3), pp. 177-280.

Journal articles

  • Kalal Z, Matas J, Mikolajczyk K. (2012) 'Tracking-Learning-Detection'. IEEE IEEE Transactions on Pattern Analysis and Machine Intelligence, 34 (7), pp. 1409-1422.

    Abstract

    This paper investigates long-term tracking of unknown objects in a video stream. The object is defined by its location and extent in a single frame. In every frame that follows, the task is to determine the object's location and extent or indicate that the object is not present. We propose a novel tracking framework (TLD) that explicitly decomposes the long-term tracking task into tracking, learning and detection. The tracker follows the object from frame to frame. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary. The learning estimates detector's errors and updates it to avoid these errors in the future. We study how to identify detector's errors and learn from them. We develop a novel learning method (P-N learning) which estimates the errors by a pair of "experts": (i) P-expert estimates missed detections, and (ii) N-expert estimates false alarms. The learning process is modeled as a discrete dynamical system and the conditions under which the learning guarantees improvement are found. We describe our real-time implementation of the TLD framework and the P-N learning. We carry out an extensive quantitative evaluation which shows a significant improvement over state-of-the-art approaches.

  • Yan F, Kittler J, Mikolajczyk K, Tahir A. (2011) 'Non-Sparse Multiple Kernel Fisher Discriminant Analysis'. Microtome Publishing Journal of Machine Learning Research, 13, pp. 607-642.

    Abstract

    Sparsity-inducing multiple kernel Fisher discriminant analysis (MK-FDA) has been studied in the literature. Building on recent advances in non-sparse multiple kernel learning (MKL), we propose a non-sparse version of MK-FDA, which imposes a general ℓp norm regularisation on the kernel weights. We formulate the associated optimisation problem as a semi-infinite program (SIP), and adapt an iterative wrapper algorithm to solve it. We then discuss, in light of latest advances in MKL optimisation techniques, several reformulations and optimisation strategies that can potentially lead to significant improvements in the efficiency and scalability of MK-FDA. We carry out extensive experiments on six datasets from various application areas, and compare closely the performance of ℓp MK-FDA, fixed norm MK-FDA, and several variants of SVM-based MKL (MK-SVM). Our results demonstrate that ℓp MK-FDA improves upon sparse MK-FDA in many practical situations. The results also show that on image categorisation problems, ℓp MK-FDA tends to outperform its SVM counterpart. Finally, we also discuss the connection between (MK-)FDA and (MK-)SVM, under the unified framework of regularised kernel machines.

  • Mikolajczyk K, Uemura H. (2011) 'Action recognition with appearance-motion features and fast search trees'. Elsevier Computer Vision and Image Understanding, 115 (3), pp. 426-438.
  • Cai H, Mikolajczyk K, Matas J. (2011) 'Learning linear discriminant projections for dimensionality reduction of image descriptors'. IEEE IEEE Transactions on Pattern Analysis and Machine Intelligence, 33 (2), pp. 338-352.

    Abstract

    In this paper, we present Linear Discriminant Projections (LDP) for reducing dimensionality and improving discriminability of local image descriptors. We place LDP into the context of state-of-the-art discriminant projections and analyze its properties. LDP requires a large set of training data with point-to-point correspondence ground truth. We demonstrate that training data produced by a simulation of image transformations leads to nearly the same results as the real data with correspondence ground truth. This makes it possible to apply LDP as well as other discriminant projection approaches to the problems where the correspondence ground truth is not available, such as image categorization. We perform an extensive experimental evaluation on standard data sets in the context of image matching and categorization. We demonstrate that LDP enables significant dimensionality reduction of local descriptors and performance increases in different applications. The results improve upon the state-of-the-art recognition performance with simultaneous dimensionality reduction from 128 to 30.

  • Kalal Z, Mikolajczyk K, Matas J. (2010) 'Face-TLD: Tracking-learning-detection applied to faces'. Proceedings - International Conference on Image Processing, ICIP, pp. 3789-3792.
  • Tahir MA, Yan F, Barnard M, Awais M, Mikolajczyk K, Kittler J. (2010) 'The University of Surrey visual concept detection system at ImageCLEF@ICPR: Working notes'. Springer Lecture Notes in Computer Science: Recognising Patterns in Signals, Speech, Images and Videos, 6388, pp. 162-170.
  • Tuytelaars T, Mikolajczyk K. (2008) 'Local invariant feature detectors: A survey'. Now Publishers Foundations and Trends in Computer Graphics and Vision, 3 (3), pp. 177-280.
  • Mikolajczyk K, Leibe B, Schiele B. (2006) 'Multiple object class detection with a generative model'. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1, pp. 26-33.
  • Mikolajczyk K, Tuytelaars T, Schmid C, Zisserman A, Matas J, Schaffalitzky F, Kadir T, van Gool L. (2005) 'A comparison of affine region detectors'. Springer International Journal of Computer Vision, 65 (1-2), pp. 43-72.

Conference papers

  • Koniusz P, Mikolajczyk K. (2011) 'Spatial coordinate coding to reduce histogram representations, dominant angle and colour pyramid match'. IEEE 18th IEEE International Conference on Image Processing, Brussels, Belgium: ICIP 2011, pp. 661-664.
  • Koniusz P, Mikolajczyk K. (2011) 'Soft assignment of visual words as linear coordinate coding and optimisation of its reconstruction error'. IEEE 18th IEEE International Conference on Image Processing, Brussels, Belgium: ICIP 2011, pp. 2413-2416.
  • Awais M, Yan F, Mikolajczyk K, Kittler J. (2011) 'Novel fusion methods for pattern recognition'. Springer Lecture Notes in Computer Science: Proceedings of Machine Learning and Knowledge Discovery in Databases (Part 1), Athens, Greece: ECML PKDD 2011: Machine Learning and Knowledge Discovery in Databases 6911 (PART 1), pp. 140-155.
  • Yan F, Mikolajczyk K, Kittler J. (2011) 'Multiple kernel learning via distance metric learning for interactive image retrieval'. Springer Lecture Notes in Computer Science: Multiple Classifier Systems, Naples, Italy: MCS 2011: 10th International Workshop on Multiple Classifier Systems 6713, pp. 147-156.
  • Awais M, Yan F, Mikolajczyk K, Kittler J. (2011) 'Two-stage augmented kernel matrix for object recognition'. Springer Lecture Notes in Computer Science: Multiple Classifier Systems, Naples, Italy: MCS 2011: 10th International Workshop on Multiple Classifier Systems 6713, pp. 137-146.
  • De Campos T, Barnard M, Mikolajczyk K, Kittler J, Yan F, Christmas W, Windridge D. (2011) 'An evaluation of bags-of-words and spatio-temporal shapes for action recognition'. 2011 IEEE Workshop on Applications of Computer Vision, WACV 2011, pp. 344-351.

    Abstract

    Bags-of-visual-Words (BoW) and Spatio-Temporal Shapes (STS) are two very popular approaches for action recognition from video. The former (BoW) is an un-structured global representation of videos which is built using a large set of local features. The latter (STS) uses a single feature located on a region of interest (where the actor is) in the video. Despite the popularity of these methods, no comparison between them has been done. Also, given that BoW and STS differ intrinsically in terms of context inclusion and globality/locality of operation, an appropriate evaluation framework has to be designed carefully. This paper compares these two approaches using four different datasets with varied degree of space-time specificity of the actions and varied relevance of the contextual background. We use the same local feature extraction method and the same classifier for both approaches. Further to BoW and STS, we also evaluated novel variations of BoW constrained in time or space. We observe that the STS approach leads to better results in all datasets whose background is of little relevance to action classification.

  • Yan F, Kittler J, Mikolajczyk K. (2010) 'Multiple Kernel Learning and Feature Space Denoising'. IEEE International Conference on Machine Learning and Cybernetics, Qingdao: 2010 International Conference on Machine Learning and Cybernetics (ICMLC) 4, pp. 1771-1776.

    Abstract

    We review a multiple kernel learning (MKL) technique called ℓp regularised multiple kernel Fisher discriminant analysis (MK-FDA), and investigate the effect of feature space denoising on MKL. Experiments show that with both the original kernels or denoised kernels, ℓp MK-FDA outperforms its fixed-norm counterparts. Experiments also show that feature space denoising boosts the performance of both single kernel FDA and ℓp MK-FDA, and that there is a positive correlation between the learnt kernel weights and the amount of variance kept by feature space denoising. Based on these observations, we argue that in the case where the base feature spaces are noisy, linear combination of kernels cannot be optimal. An MKL objective function which can take care of feature space denoising automatically, and which can learn a truly optimal (non-linear) combination of the base kernels, is yet to be found.

  • Awais M, Mikolajczyk K. (2010) 'Feature pairs connected by lines for object recognition'. Proceedings of 20th International Conference on Pattern Recognition, Istanbul, Turkey: 2010 20th ICPR, pp. 3093-3096.
  • Kalal Z, Mikolajczyk K, Matas J. (2010) 'Forward-backward error: Automatic detection of tracking failures'. Proceedings of 20th International Conference on Pattern Recognition, Istanbul, Turkey: 2010 20th ICPR, pp. 2756-2759.
  • Koniusz P, Mikolajczyk K. (2010) 'On a quest for image descriptors based on unsupervised segmentation maps'. Proceedings of 20th International Conference on Pattern Recognition, Istanbul, Turkey: 2010 20th ICPR, pp. 762-765.
  • Cai H, Yan F, Mikolajczyk K. (2010) 'Learning Weights for Codebook in Image Classification and Retrieval'. IEEE IEEE Conference on Computer Vision and Pattern Recognition, pp. 2320-2327.

    Abstract

    This paper presents a codebook learning approach for image classification and retrieval. It corresponds to learning a weighted similarity metric to satisfy that the weighted similarity between the same labeled images is larger than that between the differently labeled images with largest margin. We formulate the learning problem as a convex quadratic programming and adopt alternating optimization to solve it efficiently. Experiments on both synthetic and real datasets validate the approach. The codebook learning improves the performance, in particular in the case where the number of training examples is not sufficient for large size codebook.

  • Tahir A, Yan F, Barnard M, Awais M, Mikolajczyk K, Kittler J. (2010) 'The University of Surrey Visual Concept Detection System at ImageCLEF 2010: Working Notes'. Springer Lecture Notes in Computer Science: Recognizing Patterns in Signals, Speech, Images and Videos: Contest Reports, Istanbul, Turkey: ICPR 2010

    Abstract

    Visual concept detection is one of the most important tasks in image and video indexing. This paper describes our system in the ImageCLEF@ICPR Visual Concept Detection Task which ranked first for large-scale visual concept detection tasks in terms of Equal Error Rate (EER) and Area under Curve (AUC) and ranked third in terms of hierarchical measure. The presented approach involves state-of-the-art local descriptor computation, vector quantisation via clustering, structured scene or object representation via localised histograms of vector codes, similarity measure for kernel construction and classifier learning. The main novelty is the classifier-level and kernel-level fusion using Kernel Discriminant Analysis with RBF/Power Chi-Squared kernels obtained from various image descriptors. For 32 out of 53 individual concepts, we obtain the best performance of all 12 submissions to this task.

  • Mikolajczyk K, Kalal Z, Matas J. (2010) 'P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints'. IEEE Conference on Computer Vision and Pattern Recognition
  • Tahir MA, Kittler J, Mikolajczyk K, Yan F, El Gayar N, Kittler J, Roli F. (2010) 'Improving Multilabel Classification Performance by Using Ensemble of Multi-label Classifiers'. Springer-Verlag Berlin Multiple Classifier Systems, Proceedings, Cairo, Egypt: 9th International Workshop on Multiple Classifier Systems 5997, pp. 11-21.
  • Yan F, Mikolajczyk K, Barnard M, Cai H, Kittler J. (2010) 'Lp Norm Multiple Kernel Fisher Discriminant Analysis for Object and Image Categorisation'. IEEE Conference on Computer Vision and Pattern Recognition
  • Yan F, Mikolajczyk K, Kittler J, Tahir MA, El Gayar N, Kittler J, Roli F. (2010) 'Combining Multiple Kernels by Augmenting the Kernel Matrix'. Springer-Verlag Berlin Multiple Classifier Systems, Proceedings, Cairo, Egypt: 9th International Workshop on Multiple Classifier Systems 5997, pp. 175-184.
  • Schubert F, Schertler K, Mikolajczyk K. (2009) 'A hands-on approach to high-dynamic-range and superresolution fusion'. Proceedings of the Ninth IEEE Computer Society Workshop on Application of Computer Vision, Snowbird, USA: 2009 9th IEEE WACV
  • Yan F, Kittler J, Mikolajczyk K, Tahir A. (2009) 'Non-Sparse Multiple Kernel Learning for Fisher Discriminant Analysis'. Proceedings of The Ninth IEEE International Conference on Data Mining, Miami, USA: ICDM '09, pp. 1064-1069.

    Abstract

    We consider the problem of learning a linear combination of pre-specified kernel matrices in the Fisher discriminant analysis setting. Existing methods for such a task impose an ℓ1 norm regularisation on the kernel weights, which produces sparse solution but may lead to loss of information. In this paper, we propose to use ℓ2 norm regularisation instead. The resulting learning problem is formulated as a semi-infinite program and can be solved efficiently. Through experiments on both synthetic data and a very challenging object recognition benchmark, the relative advantages of the proposed method and its ℓ1 counterpart are demonstrated, and insights are gained as to how the choice of regularisation norm should be made.

  • Tahir MA, Kittler J, Mikolajczyk K, Yan F, Van De Sande KEA, Gevers T. (2009) 'Visual category recognition using spectral regression and kernel discriminant analysis'. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops 2009, pp. 178-185.
  • Tahir MA, Kittler J, Yan F, Mikolajczyk K. (2009) 'Kernel Discriminant Analysis using Triangular Kernel for Semantic Scene Classification'. IEEE CBMI: 2009 International Workshop on Content-Based Multimedia Indexing, Chania, Greece: International Workshop on Content-Based Multimedia Indexing, pp. 1-6.
  • Tahir MA, Kittler J, Mikolajczyk K, Yan F, Benediktsson JA, Kittler J, Roli F. (2009) 'A Multiple Expert Approach to the Class Imbalance Problem Using Inverse Random under Sampling'. Springer-Verlag Berlin Multiple Classifier Systems, Proceedings, Univ Iceland, Reykjavik, Iceland: 8th International Workshop on Multiple Classifier Systems 5519, pp. 82-91.
  • Yan F, Mikolajczyk K, Kittler J, Tahir M. (2009) 'A Comparison of l(1) Norm and l(2) Norm Multiple Kernel SVMs in Image and Video Classification'. IEEE CBMI: 2009 International Workshop on Content-Based Multimedia Indexing, Chania, Greece: International Workshop on Content-Based Multimedia Indexing, pp. 7-12.
  • Tahir A, Kittler J, Yan F, Mikolajczyk K. (2009) 'Concept Learning for Image and Video Retrieval: the Inverse Random Under Sampling Approach'. European Signal Processing Conference, Glasgow: 17th European Signal Processing Conference (EUSIPCO 2009), pp. 574-578.
  • Kalal Z, Matas J, Mikolajczyk K. (2009) 'Online learning of robust object detectors during unstable tracking'. 2009 IEEE 12th International Conference on Computer Vision Workshops, Kyoto, Japan: 12th ICCV Workshops, pp. 1417-1424.
  • Mikolajczyk K, Uemura H. (2008) 'Action recognition with motion-appearance vocabulary forest'. IEEE 2008 IEEE Conference on Computer Vision and Pattern Recognition, Vols 1-12, Anchorage, AK: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2229-2236.
  • Mikolajczyk K, Matas J. (2007) 'Improving descriptors for fast tree matching by optimal linear projection'. IEEE 2007 IEEE 11th International Conference on Computer Vision, Vols 1-6, Rio de Janeiro, Brazil: 11th IEEE International Conference on Computer Vision, pp. 337-344.

Page Owner: ees1km
Page Created: Thursday 16 September 2010 14:40:05 by lb0014
Last Modified: Friday 21 September 2012 13:45:19 by ees1km
Expiry Date: Friday 16 December 2011 14:20:24
Assembly date: Tue Mar 26 22:38:40 GMT 2013