Guosheng Hu, Yongxin Yang, Dong Yi, Josef Kittler, William Christmas, Stan Li, Timothy Hospedales (2015) When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition, In: Computer Vision Workshop (ICCVW), 2015 IEEE International Conference on, pp. 384-392

Deep learning, in particular Convolutional Neural Networks (CNNs), has achieved promising results in face recognition recently. However, it remains an open question why CNNs work well and how to design a ‘good’ architecture. Existing works tend to focus on reporting CNN architectures that work well for face recognition rather than investigating the reasons. In this work, we conduct an extensive evaluation of CNN-based face recognition systems (CNN-FRS) on a common ground to make our work easily reproducible. Specifically, we use the public database LFW (Labeled Faces in the Wild) to train CNNs, unlike most existing CNNs, which are trained on private databases. We propose three CNN architectures, which are the first reported architectures trained using LFW data. This paper quantitatively compares the architectures of CNNs and evaluates the effect of different implementation choices. We identify several useful properties of CNN-FRS. For instance, the dimensionality of the learned features can be significantly reduced without adverse effect on face recognition accuracy. In addition, a traditional metric learning method exploiting CNN-learned features is evaluated. Experiments show that two crucial factors for good CNN-FRS performance are the fusion of multiple CNNs and metric learning. To make our work reproducible, source code and models will be made publicly available.

S Cornelsen, A Karrenbauer, SJ Li (2012) Leveling the Grid, In: Proceedings of 14th Workshop on Algorithm Engineering and Experiments

Motivated by an application in image processing, we introduce the grid-leveling problem. It turns out to be the dual of a minimum cost flow problem for an apex graph with a grid graph as its basis. We present an O(n^{3/2}) algorithm for this problem. The optimum solution recovers missing DC coefficients from image and video coding by Discrete Cosine Transform used in popular standards like JPEG and MPEG. Generally, we prove that there is an O(n^{3/2}) min-cost flow algorithm for networks that, after removing one node, are planar, have bounded degrees, and have bounded capacities. The costs may be arbitrary.

Ping Yang, Jing Zhu, Zilong Liu, Yue Xiao, Shaoqian Li, Wei Xiang (2018) Unified Power Allocation for Receive Spatial Modulation Based on Approximate Optimization, In: IEEE Access 6, pp. 49450-49459, Institute of Electrical and Electronics Engineers (IEEE)

In this paper, a novel unified power allocation (PA) framework is proposed for receive (pre-coding aided) spatial modulation (RSM). We find that the PA matrix design can be formulated as a non-convex quadratically constrained quadratic program (QCQP) problem, whose solution is generally intractable. To tackle this problem, we propose a pair of solvers having different trade-offs in terms of bit-error-rate (BER) and complexity. Specifically, we first propose a successive convex approximation (SCA) method to convert the non-convex QCQP problem under consideration into a series of linear convex subproblems, which can be easily solved by a classic polynomial-time optimization method, i.e., the interior point method. To further reduce the computational complexity, we propose an augmented Lagrangian multiplier (ALM) method, which transforms the challenging non-convex constrained PA optimization problem into its unconstrained counterpart, which can be efficiently solved in an iterative manner. Our simulation results show that both the proposed SCA and ALM methods are capable of substantially improving the system error performance compared with the conventional RSM system without PA as well as conventional PA-aided RSM schemes.
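
The general idea behind an augmented Lagrangian multiplier method, turning a constrained problem into a sequence of unconstrained ones, can be illustrated on a toy problem. The sketch below is not the paper's PA formulation: it assumes a small convex problem (minimise x^2 + y^2 subject to x + y = 1), and the function name, penalty weight `rho`, and step size are all hypothetical choices made for illustration.

```python
def augmented_lagrangian_demo(rho=10.0, outer_iters=20, inner_iters=200, step=0.04):
    """Minimise x^2 + y^2 subject to x + y = 1 (toy problem, not the paper's PA problem).

    The step size is chosen for rho = 10; a different rho may need a different step.
    """
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(outer_iters):
        # Inner loop: unconstrained gradient descent on
        # L(x, y) = x^2 + y^2 + lam * h + (rho / 2) * h^2, with h = x + y - 1.
        for _ in range(inner_iters):
            h = x + y - 1.0
            grad_x = 2.0 * x + lam + rho * h
            grad_y = 2.0 * y + lam + rho * h
            x -= step * grad_x
            y -= step * grad_y
        # Multiplier update driven by the remaining constraint violation.
        lam += rho * (x + y - 1.0)
    return x, y, lam


if __name__ == "__main__":
    print(augmented_lagrangian_demo())
```

For this toy problem the iterates approach x = y = 0.5 with multiplier -1, which is the known constrained minimiser; a non-convex QCQP such as the one described in the abstract would not in general come with such a guarantee.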

Shuoyang Li, Yuhui Luo, Jonathon Chambers, Wenwu Wang (2021) Dimension Selected Subspace Clustering, In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3195-3199, Institute of Electrical and Electronics Engineers (IEEE)

Subspace clustering is a popular method for clustering unlabelled data. However, the computational cost of the subspace clustering algorithm can be unaffordable when dealing with a large data set. Using a set of dimension sketched data instead of the original data set can be helpful for mitigating the computational burden. Thus, finding a way for dimension sketching becomes an important problem. In this paper, a new dimension sketching algorithm is proposed, which aims to select informative dimensions that have significant effects on the clustering results. Experimental results reveal that this method can significantly improve subspace clustering performance on both synthetic and real-world datasets, in comparison with two baseline methods.

Shuoyang Li, Wenwu Wang (2018) Randomly Sketched Sparse Subspace Clustering for Acoustic Scene Clustering, In: EUSIPCO 2018 IEEE

Acoustic scene classification has drawn much research attention, where labeled data are often used for model training. However, in practice, acoustic data are often unlabeled, weakly labeled, or incorrectly labeled. To classify unlabeled data, or detect and correct wrongly labeled data, we present an unsupervised clustering method based on sparse subspace clustering. The computational cost of the sparse subspace clustering algorithm becomes prohibitively high when dealing with high-dimensional acoustic features. To address this problem, we introduce a random sketching method to reduce the feature dimensionality for the sparse subspace clustering algorithm. Experimental results reveal that this method can reduce the computational cost significantly with a limited loss in clustering accuracy.
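
As a rough illustration of how random sketching reduces feature dimensionality, the sketch below projects each feature vector onto a small number of random Gaussian directions. The abstract does not specify which sketching matrix is used, so the Gaussian choice, the 1/sqrt(d) scaling, and the function name `gaussian_sketch` are assumptions made here for illustration only.

```python
import random


def gaussian_sketch(features, sketch_dim, seed=0):
    """Project each row of `features` (n_samples x n_dims) down to `sketch_dim`
    dimensions with a random Gaussian matrix scaled by 1/sqrt(sketch_dim)."""
    rng = random.Random(seed)
    n_dims = len(features[0])
    scale = 1.0 / sketch_dim ** 0.5
    # One random direction (of length n_dims) per output dimension.
    projection = [[rng.gauss(0.0, 1.0) * scale for _ in range(n_dims)]
                  for _ in range(sketch_dim)]
    # Each output coordinate is the inner product of a row with one direction.
    return [[sum(p * x for p, x in zip(direction, row)) for direction in projection]
            for row in features]


if __name__ == "__main__":
    # Four hypothetical 50-dimensional feature vectors reduced to 8 dimensions.
    data = [[float(i + j) for j in range(50)] for i in range(4)]
    sketched = gaussian_sketch(data, sketch_dim=8)
    print(len(sketched), len(sketched[0]))
```

A clustering algorithm would then run on the sketched rows instead of the original ones, which is where the computational saving would come from.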

S Li, S Dixon, DAA Black, MD Plumbley (2016) A model selection test on effective factors of the choice of expressive timing clusters for a phrase, In: Proceedings of the 13th Sound and Music Conference (SMC 2016)

We model expressive timing for a phrase in performed classical music as being dependent on two factors: the expressive timing in the previous phrase and the position of the phrase within the piece. We present a model selection test for evaluating candidate models that assert different dependencies for deciding the Cluster of Expressive Timing (CET) for a phrase. We use cross entropy and Kullback-Leibler (KL) divergence to evaluate the resulting models: with these criteria we find that both the expressive timing in the previous phrase and the position of the phrase in the music score affect expressive timing in a phrase. The results show that the expressive timing in the previous phrase has a greater effect on timing choices than the position of the phrase, as the phrase position only impacts the choice of expressive timing in combination with the choice of expressive timing in the previous phrase.
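
The two evaluation criteria named above have standard definitions for discrete distributions, sketched below. How the paper builds the observed and predicted distributions over timing clusters is not stated in the abstract, so the example distributions and function names here are hypothetical.

```python
import math


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i), in nats; terms with p_i = 0 contribute nothing."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    observed = [0.5, 0.5]     # hypothetical observed cluster frequencies
    predicted = [0.25, 0.75]  # hypothetical model prediction
    print(cross_entropy(observed, predicted), kl_divergence(observed, predicted))
```

The two quantities differ only by the entropy of the observed distribution, H(p, q) = H(p) + D_KL(p || q), so for a fixed set of observations they rank candidate models in the same order.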