My publications


Yang Ping, Zhu Jing, Liu Zilong, Xiao Yue, Li Shaoqian, Xiang Wei (2018) Unified Power Allocation for Receive Spatial Modulation Based on Approximate Optimization, In: IEEE Access 6 pp. 49450-49459 Institute of Electrical and Electronics Engineers (IEEE)
In this paper, a novel unified power allocation (PA) framework is proposed for receive (pre-coding aided) spatial modulation (RSM). We find that the PA matrix design can be formulated as a non-convex quadratically constrained quadratic program (QCQP), whose solution is generally intractable. To tackle this problem, we propose a pair of solvers offering different trade-offs between bit-error-rate (BER) and complexity. Specifically, we first propose a successive convex approximation (SCA) method that converts the non-convex QCQP under consideration into a series of linear convex subproblems, each of which can be easily solved by a classic polynomial-time optimization method, namely the interior point method. To further reduce the computational complexity, we propose an augmented Lagrangian multiplier (ALM) method, which transforms the challenging non-convex constrained PA optimization problem into its unconstrained counterpart, which can be solved efficiently in an iterative manner. Our simulation results show that both the proposed SCA and ALM methods are capable of substantially improving the system error performance compared with a conventional RSM system without PA as well as conventional PA-aided RSM schemes.
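The augmented Lagrangian idea mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the paper's PA matrix design: the objective, the constraint, the penalty weight `rho`, the step size, and the iteration counts are all assumptions chosen for illustration. It minimizes f(x, y) = x + y subject to the non-convex equality constraint x^2 + y^2 = 1, whose constrained minimizer is x = y = -1/sqrt(2), by alternating plain gradient descent on the unconstrained augmented Lagrangian with a multiplier update.

```python
def constraint(x, y):
    # Toy equality constraint h(x, y) = x^2 + y^2 - 1 = 0 (illustrative only).
    return x * x + y * y - 1.0


def solve_alm(x, y, rho=10.0, outer_iters=20, inner_iters=500, step=0.01):
    """Minimize f(x, y) = x + y subject to h(x, y) = 0.

    Augmented Lagrangian: L = f + lam * h + (rho / 2) * h^2.
    Its gradient is grad(f) + (lam + rho * h) * grad(h).
    """
    lam = 0.0  # Lagrange multiplier estimate
    for _ in range(outer_iters):
        # Inner loop: gradient descent on the unconstrained subproblem.
        for _ in range(inner_iters):
            c = lam + rho * constraint(x, y)
            gx = 1.0 + c * 2.0 * x
            gy = 1.0 + c * 2.0 * y
            x -= step * gx
            y -= step * gy
        # Outer step: first-order multiplier update.
        lam += rho * constraint(x, y)
    return x, y, lam


# Hypothetical starting point on the unit circle.
x_opt, y_opt, lam_opt = solve_alm(-1.0, 0.0)
print(x_opt, y_opt, constraint(x_opt, y_opt))
```

Each outer iteration solves an unconstrained problem, which is the property the abstract highlights; a practical solver would replace the fixed-step inner loop with a more robust unconstrained optimizer.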
Hu Guosheng, Yang Yongxin, Yi Dong, Kittler Josef, Christmas William, Li Stan, Hospedales Timothy (2015) When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition, In: Computer Vision Workshop (ICCVW), 2015 IEEE International Conference on pp. 384-392
Deep learning, in particular the Convolutional Neural Network (CNN), has achieved promising results in face recognition recently. However, it remains an open question why CNNs work well and how to design a ‘good’ architecture. Existing works tend to focus on reporting CNN architectures that work well for face recognition rather than investigating the reasons. In this work, we conduct an extensive evaluation of CNN-based face recognition systems (CNN-FRS) on a common ground to make our work easily reproducible. Specifically, we use the public database LFW (Labeled Faces in the Wild) to train CNNs, unlike most existing CNNs, which are trained on private databases. We propose three CNN architectures, which are the first reported architectures trained using LFW data. This paper quantitatively compares the architectures of CNNs and evaluates the effect of different implementation choices. We identify several useful properties of CNN-FRS. For instance, the dimensionality of the learned features can be significantly reduced without adverse effect on face recognition accuracy. In addition, a traditional metric learning method exploiting CNN-learned features is evaluated. Experiments show that two crucial factors for good CNN-FRS performance are the fusion of multiple CNNs and metric learning. To make our work reproducible, source code and models will be made publicly available.
Li Shuoyang, Wang Wenwu (2018) Randomly Sketched Sparse Subspace Clustering for Acoustic Scene Clustering, In: EUSIPCO 2018 IEEE
Acoustic scene classification has drawn much research attention where labeled data are often used for model training. However, in practice, acoustic data are often unlabeled, weakly labeled, or incorrectly labeled. To classify unlabeled data, or detect and correct wrongly labeled data, we present an unsupervised clustering method based on sparse subspace clustering. The computational cost of the sparse subspace clustering algorithm becomes prohibitively high when dealing with high dimensional acoustic features. To address this problem, we introduce a random sketching method to reduce the feature dimensionality for the sparse subspace clustering algorithm. Experimental results reveal that this method can reduce the computational cost significantly with a limited loss in clustering accuracy.
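A minimal sketch of dimensionality reduction by random sketching, as described in this abstract, is given below. The abstract does not say which sketching matrix is used, so the Gaussian projection here is an assumption, as are the function names, the feature dimension (200), the sketch size (50), and the synthetic data. The sparse subspace clustering step that would consume the sketched features is not shown.

```python
import math
import random


def gaussian_sketch(features, k, seed=0):
    """Project each d-dimensional row of `features` down to k dimensions.

    Uses a k x d matrix with i.i.d. N(0, 1/k) entries, so squared
    distances are preserved in expectation (assumed sketch type).
    """
    rng = random.Random(seed)
    d = len(features[0])
    scale = 1.0 / math.sqrt(k)
    sketch = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(s * v for s, v in zip(row, x)) for row in sketch] for x in features]


def distance(a, b):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


# Synthetic stand-in for high-dimensional acoustic features.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(200)] for _ in range(5)]
Y = gaussian_sketch(X, k=50)

# Ratio of a pairwise distance after and before sketching; close to 1
# when the sketch preserves the geometry well.
ratio = distance(Y[0], Y[1]) / distance(X[0], X[1])
print(len(Y[0]), ratio)
```

A smaller `k` lowers the cost of the downstream clustering but lets pairwise distances drift further from their original values, which mirrors the cost versus accuracy trade-off reported in the abstract.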
Cornelsen S, Karrenbauer A, Li SJ (2012) Leveling the Grid, In: Proceedings of 14th Workshop on Algorithm Engineering and Experiments
Motivated by an application in image processing, we introduce the grid-leveling problem. It turns out to be the dual of a minimum cost flow problem for an apex graph with a grid graph as its basis. We present an O(n^{3/2}) algorithm for this problem. The optimum solution recovers missing DC coefficients from image and video coding by Discrete Cosine Transform used in popular standards like JPEG and MPEG. Generally, we prove that there is an O(n^{3/2}) min-cost flow algorithm for networks that, after removing one node, are planar, have bounded degrees, and have bounded capacities. The costs may be arbitrary.
Li S, Dixon S, Black DAA, Plumbley MD (2016) A model selection test on effective factors of the choice of expressive timing clusters for a phrase, In: Proceedings of the 13th Sound and Music Conference (SMC 2016)
We model expressive timing for a phrase in performed classical music as being dependent on two factors: the expressive timing in the previous phrase and the position of the phrase within the piece. We present a model selection test for evaluating candidate models that assert different dependencies for deciding the Cluster of Expressive Timing (CET) for a phrase. We use cross entropy and Kullback-Leibler (KL) divergence to evaluate the resulting models. With these criteria, we find that both the expressive timing in the previous phrase and the position of the phrase in the music score affect expressive timing in a phrase. The results show that the expressive timing in the previous phrase has a greater effect on timing choices than the position of the phrase, as the phrase position only impacts the choice of expressive timing in combination with the choice of expressive timing in the previous phrase.
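The two evaluation criteria named in this abstract, cross entropy and KL divergence, can be sketched for discrete distributions as follows. The distributions and model names below are hypothetical and do not come from the paper; they only illustrate how a lower divergence from the observed cluster frequencies would favour one candidate model over another.

```python
import math


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i), in nats.

    Terms with p_i == 0 contribute nothing; assumes q_i > 0 wherever p_i > 0.
    """
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """D_KL(p || q) = H(p, q) - H(p, p)."""
    return cross_entropy(p, q) - cross_entropy(p, p)


# Hypothetical observed frequencies over three timing clusters,
# and the distributions predicted by two candidate models.
observed = [0.5, 0.3, 0.2]
model_a = [0.4, 0.4, 0.2]
model_b = [0.1, 0.1, 0.8]

kl_a = kl_divergence(observed, model_a)
kl_b = kl_divergence(observed, model_b)
best = "model_a" if kl_a < kl_b else "model_b"
print(kl_a, kl_b, best)
```

Because the entropy of the observed distribution is the same for every candidate, ranking models by cross entropy and ranking them by KL divergence give the same order.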