Dr Jingshu Zhang
Academic and research departments
Centre for Vision, Speech and Signal Processing (CVSSP), Faculty of Engineering and Physical Sciences.
Publications
In real rooms, recorded speech usually contains reverberation, which degrades the quality and intelligibility of the speech. It has proven effective to use neural networks to estimate complex ideal ratio masks (cIRMs) using mean square error (MSE) loss for speech dereverberation. However, in some cases, when using MSE loss to estimate complex-valued masks, phase may have a disproportionate effect compared to magnitude. We propose a new weighted magnitude-phase loss function, which is divided into a magnitude component and a phase component, to train a neural network to estimate complex ideal ratio masks. A weight parameter is introduced to adjust the relative contribution of magnitude and phase to the overall loss. We find that our proposed loss function outperforms the regular MSE loss function for speech dereverberation.
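The abstract above does not give the exact form of the weighted magnitude-phase loss, so the following is only a minimal sketch of the general idea, under stated assumptions: the magnitude component is a mean squared error on mask magnitudes, the phase component is `1 - cos` of the phase difference, and a single hypothetical parameter `weight` in [0, 1] balances the two. The function name and both component definitions are illustrative and are not taken from the paper.

```python
import numpy as np

def weighted_mag_phase_loss(est_mask, ref_mask, weight=0.5):
    """Illustrative weighted magnitude-phase loss between complex masks.

    Assumption: weight=1.0 uses only the magnitude term and
    weight=0.0 uses only the phase term.
    """
    est_mask = np.asarray(est_mask, dtype=complex)
    ref_mask = np.asarray(ref_mask, dtype=complex)
    # Magnitude term: mean squared error between mask magnitudes.
    mag_term = np.mean((np.abs(est_mask) - np.abs(ref_mask)) ** 2)
    # Phase term: 1 - cos(phase difference), which is zero for equal
    # phases and unaffected by 2*pi wrapping.
    phase_diff = np.angle(est_mask) - np.angle(ref_mask)
    phase_term = np.mean(1.0 - np.cos(phase_diff))
    return weight * mag_term + (1.0 - weight) * phase_term
```

With this form, a pure phase rotation of the estimated mask leaves the magnitude term at zero, so the loss is controlled entirely by `1 - weight`, which is one way to make the relative contribution of phase explicit.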
Realizing the theoretical limiting power conversion efficiency (PCE) in perovskite solar cells requires a better understanding and control over the fundamental loss processes occurring in the bulk of the perovskite layer and at the internal semiconductor interfaces in devices. One of the main challenges is to eliminate the presence of charge recombination centres throughout the film which have been observed to be most densely located at regions near the grain boundaries. Here, we introduce aluminium acetylacetonate to the perovskite precursor solution, which improves the crystal quality by reducing the microstrain in the polycrystalline film. At the same time, we achieve a reduction in the non-radiative recombination rate, a remarkable improvement in the photoluminescence quantum efficiency (PLQE) and a reduction in the electronic disorder deduced from an Urbach energy of only 12.6 meV in complete devices. As a result, we demonstrate a PCE of 19.1% with negligible hysteresis in planar heterojunction solar cells comprising all organic p and n-type charge collection layers. Our work shows that an additional level of control of perovskite thin film quality is possible via impurity cation doping, and further demonstrates the continuing importance of improving the electronic quality of the perovskite absorber and the nature of the heterojunctions to further improve the solar cell performance.
Multiple-input multiple-output filterbank multicarrier communication (MIMO-FBMC) is a promising technique to achieve very tight spectrum confinement (thus, higher spectral efficiency) as well as strong robustness against dispersive channels. In this paper, we present a novel training design for MIMO-FBMC systems which enables efficient estimation of frequency-selective channels (associated with multiple transmit antennas) with only one non-zero FBMC symbol. Our key idea is to design real-valued orthogonal training sequences (in the frequency domain) which display zero-correlation zone properties in the time domain. Compared to our earlier proposed training scheme requiring at least two non-zero FBMC symbols (separated by several zero guard symbols), the proposed scheme features ultra-low training overhead yet achieves channel estimation performance comparable to our earlier proposed complex training sequence decomposition (CTSD). Our simulations validate that the proposed method is an efficient channel estimation approach for practical preamble-based MIMO-FBMC systems.
We report the direct determination of nonradiative lifetimes in Si/SiGe asymmetric quantum well structures designed to access spatially indirect (diagonal) interwell transitions between heavy-hole ground states, at photon energies below the optical phonon energy. We show both experimentally and theoretically, using a six-band k·p model and a time-domain rate equation scheme, that, for the interface quality currently achievable experimentally (with an average step height ≥ 1 Å), interface roughness will dominate all other scattering processes up to about 200 K. By comparing our results obtained for two different structures we deduce that in this regime both barrier and well widths play an important role in the determination of the carrier lifetime. Comparison with recently published experimental and theoretical data obtained for mid-infrared GaAs/AlxGa1-xAs multiple quantum well systems leads us to the conclusion that the dominant role of interface roughness scattering at low temperature is a general feature of a wide range of semiconductor heterostructures not limited to IV-IV materials.
A robust video watermarking scheme for the state-of-the-art video coding standard H.264/AVC is proposed in this brief. 2-D 8-bit watermarks such as detailed company trademarks or logos can be used as inconvertible watermarks for copyright protection. A grayscale watermark pattern is first modified to accommodate the H.264/AVC computational constraints, and then embedded into video data in the compressed domain. With the proposed method, the video watermarking scheme can achieve high robustness and good visual quality without increasing the overall bit-rate. Experimental results show that our algorithm can robustly survive the transcoding process and strong common signal processing attacks, such as bit-rate reduction, Gaussian filtering and contrast enhancement. © 2007 IEEE.
In order to improve the manageability and adaptability of future 5G wireless networks, the software orchestration mechanism, named software defined networking (SDN) with control and user plane (C/U-plane) decoupling, has become one of the most promising key techniques. Based on these features, the hybrid satellite terrestrial network is expected to support flexible and customized resource scheduling for both massive machine-type communication (MTC) and high-quality multimedia requests while achieving broader global coverage, larger capacity and lower power consumption. In this paper, an end-to-end hybrid satellite terrestrial network is proposed and the performance metrics, e.g., coverage probability, spectral and energy efficiency (SE and EE), are analysed in both sparse networks and ultra-dense networks. The fundamental relationship between SE and EE is investigated, considering the overhead costs, fronthaul of the gateway (GW), density of small cells (SCs) and multiple quality-of-service (QoS) requirements. Numerical results show that compared with current LTE networks, the hybrid system with C/U split can achieve approximately 40% and 80% EE improvement in sparse and ultra-dense networks respectively, and greatly enhance the coverage. Various resource management schemes, bandwidth allocation methods, and on-off approaches are compared, and the applications of the satellite in future 5G networks with software defined features are proposed.
Identification of faulty variables is an important component of multivariate statistical process monitoring (MSPM); it provides crucial information for further analysis of the root cause of the detected fault. The main challenge is the large number of combinations of process variables under consideration, usually resulting in a combinatorial optimization problem. This paper develops a generic reconstruction based multivariate contribution analysis (RBMCA) framework to identify the variables that are the most responsible for the fault. A branch and bound (BAB) algorithm is proposed to efficiently solve the combinatorial optimization problem. The formulation of the RBMCA does not depend on a specific model, which allows it to be applicable to any MSPM model. We demonstrate the application of the RBMCA to a specific model: the mixture of probabilistic principal component analysis (PPCA mixture) model. Finally, we illustrate the effectiveness and computational efficiency of the proposed methodology through a numerical example and the benchmark simulation of the Tennessee Eastman process. © 2012 Elsevier Ltd. All rights reserved.
In this paper, we investigate the downlink secure beamforming (BF) design problem of cloud radio access networks (C-RANs) relying on multicast fronthaul, where millimeter-wave and microwave carriers are used for the access links and fronthaul links, respectively. The base stations (BSs) jointly serve users through cooperating hybrid analog/digital BF. We first develop an analog BF for cooperating BSs. On this basis, we formulate a secrecy rate maximization (SRM) problem subject both to a realistic limited fronthaul capacity and to the total BS transmit power constraint. Due to the intractability of the non-convex problem formulated, advanced convex approximated techniques, constrained concave convex procedures and semidefinite programming (SDP) relaxation are applied to transform it into a convex one. Subsequently, an iterative algorithm of jointly optimizing multicast BF, cooperative digital BF and the artificial noise (AN) covariance is proposed. Next, we construct the solution of the original problem by exploiting both the primal and the dual optimal solution of the SDP-relaxed problem. Furthermore, a per-BS transmit power constraint is considered, necessitating the reformulation of the SRM problem, which can be solved by an efficient iterative algorithm. We then eliminate the idealized simplifying assumption of having perfect channel state information (CSI) for the eavesdropper links and invoke realistic imperfect CSI. Furthermore, a worst-case SRM problem is investigated. Finally, by combining the so-called S-Procedure and convex approximated techniques, we design an efficient iterative algorithm to solve it. Simulation results are presented to evaluate the secrecy rate and demonstrate the effectiveness of the proposed algorithms.
Orthogonal frequency division multiplexing with index modulation (OFDM-IM) has attracted considerable interest recently. The technique uses the subcarrier indices as a source of information. In FBMC systems, double-dispersive channels lead to inter-carrier interference (ICI) and/or inter-symbol interference (ISI), which are caused by the neighboring symbols in the frequency and/or time domain. When we introduce index modulation to the FBMC system, the interference power will be smaller compared to that of the conventional FBMC system, as some subcarriers carry nothing but zeros. In this paper, the advantages of FBMC with index modulation (FBMC-IM) are investigated by comparing the signal to interference ratio (SIR) with that of the conventional FBMC system. However, the bit error rate (BER) performance is affected since there exists interference in the FBMC-IM system. To improve the BER performance, we propose an optimal combination-selection algorithm and an optimal combination-mapping rule. By abandoning some combinations whose error probabilities are larger and by mapping the remaining combinations into specified bits, a better BER performance can be achieved compared with that without optimization. The theoretical analysis and simulation results clearly show the FBMC-IM system has a good BER performance under double-dispersive channels.
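To illustrate how subcarrier indices can carry information, and why some combinations have to be abandoned, here is a minimal sketch of generic index modulation for one group of `n` subcarriers with `k` active ones. It is not the paper's combination-selection algorithm or mapping rule: the function names are hypothetical, and the `cost` argument is only a stand-in for the per-combination error probability, which the abstract does not specify.

```python
from itertools import combinations
from math import comb, floor, log2

def build_index_table(n, k, cost=None):
    """Return (table, n_bits): the kept activation patterns and bits per group.

    Only 2**n_bits of the C(n, k) patterns can be addressed by bits, so the
    rest are dropped. If a (hypothetical) cost function is given, the
    patterns with the highest cost are the ones dropped.
    """
    patterns = list(combinations(range(n), k))
    n_bits = floor(log2(comb(n, k)))
    if cost is not None:
        patterns.sort(key=cost)  # stable sort: ties keep lexicographic order
    return patterns[: 2 ** n_bits], n_bits

def map_bits(bits, table, n_bits, n):
    """Map n_bits index bits to an on/off pattern over n subcarriers."""
    if len(bits) != n_bits:
        raise ValueError("expected %d bits" % n_bits)
    index = 0
    for b in bits:
        index = (index << 1) | b
    active = table[index]
    return [1 if i in active else 0 for i in range(n)]

# Example: 4 subcarriers, 2 active. C(4, 2) = 6 patterns, but 2 bits address
# only 4 of them, so 2 patterns are abandoned.
table, n_bits = build_index_table(4, 2)
pattern = map_bits([1, 0], table, n_bits, 4)  # [1, 0, 0, 1]
```

In this example the inactive subcarriers carry zeros, which is the property the abstract credits for the lower interference power; which patterns are best to discard depends on the channel and is left to the caller through `cost`.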
Producing an electrically pumped silicon-based laser at terahertz frequencies is gaining increased attention these days. This paper reviews the recent advances in the search for a silicon-based terahertz laser. Topics covered include resonant tunneling in p-type Si/SiGe, terahertz intersubband electroluminescence from quantum cascade structures, intersubband lifetime measurements in Si/SiGe quantum wells, enhanced optical guiding using buried silicide layers, and the potential for exploiting common impurity dopants in silicon such as boron and phosphorus to realize a terahertz laser.
Physical layer security (PLS) technologies have attracted much attention in recent years for their potential to provide information-theoretically secure communications. Artificial Noise (AN)-aided transmission is considered one of the most practicable PLS technologies, as it can realize secure transmission independent of the eavesdropper’s channel status. In this paper, we reveal that AN transmission does depend on the eavesdropper’s channel condition by introducing our proposed attack method based on a supervised-learning algorithm which utilizes the modulation scheme, available from known packet preamble and/or header information, as supervisory signals of training data. Numerical simulation results, compared with conventional clustering methods, show that our proposed method improves the success probability of attack from 4.8% to at most 95.8% for the QPSK modulation. This implies that transmission to a receiver at the cell edge with low-order modulation will be cracked if the eavesdropper’s channel is good enough, which can be achieved by employing more antennas than the transmitter. This work brings new insights into the effectiveness of AN schemes and provides useful guidance for the design of robust PLS techniques for practical wireless systems.
Over the last decade, the explosive increase in demand for high-data-rate video services and massive access machine type communication (MTC) requests have become the main challenges for the future 5G wireless network. The hybrid satellite terrestrial network based on the control and user plane (C/U) separation concept is expected to support flexible and customized resource scheduling and management toward global ubiquitous networking and unified service architecture. In this paper, centralized and distributed resource management strategies (CRMS and DRMS) are proposed and compared comprehensively in terms of throughput, power consumption, spectral and energy efficiency (SE and EE) and coverage probability, utilizing the mature stochastic geometry. Numerical results show that, compared with the DRMS strategy, the U-plane cooperation between satellite and terrestrial network under the CRMS strategy could improve the throughput and EE by nearly 136% and 60% respectively in ultra-sparse networks and greatly enhance the U-plane coverage probability (approximately 77%). An efficient resource management mechanism is suggested for the hybrid network according to the network deployment for the future 5G wireless network.
Recent encoder-decoder approaches typically employ string decoders to convert images into serialized strings for image-to-markup. However, for tree-structured representational markup, string representations can hardly cope with the structural complexity. In this work, we first show via a set of toy problems that string decoders struggle to decode tree structures, especially as structural complexity increases. We then propose a tree-structured decoder that specifically aims at generating a tree-structured markup. Our decoder works sequentially, where at each step a child node and its parent node are simultaneously generated to form a sub-tree. This sub-tree is subsequently used to construct the final tree structure in a recurrent manner. The key to the success of our tree decoder is twofold: (i) it strictly respects the parent-child relationship of trees, and (ii) it explicitly outputs trees as opposed to a linear string. Evaluated on both math formula recognition and chemical formula recognition, the proposed tree decoder is shown to greatly outperform strong string decoder baselines.
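The step-by-step construction described above can be sketched as follows. This is only an illustration of assembling a tree from a sequence of (child, parent) decisions, not the paper's decoder: the neural network that would predict each pair is omitted, and the step format, the dictionary node layout, and the function names are all assumptions made for the example.

```python
def build_tree(steps):
    """Assemble a tree from decoding steps.

    Each step is a (child_label, parent_step) pair, where parent_step is the
    position of an earlier step, or None for the root. Every step therefore
    attaches one child to an already generated parent.
    """
    nodes = []
    root = None
    for label, parent_step in steps:
        node = {"label": label, "children": []}
        if parent_step is None:
            root = node
        else:
            nodes[parent_step]["children"].append(node)
        nodes.append(node)
    return root

def to_string(node):
    """Serialize the tree with explicit brackets, for inspection only."""
    if not node["children"]:
        return node["label"]
    inner = ",".join(to_string(child) for child in node["children"])
    return node["label"] + "(" + inner + ")"

# Hypothetical step sequence for a fraction with numerator x and denominator y + 1.
steps = [("frac", None), ("x", 0), ("+", 0), ("y", 2), ("1", 2)]
print(to_string(build_tree(steps)))  # frac(x,+(y,1))
```

Because each step names its parent explicitly, the parent-child relationship is fixed by construction rather than recovered afterwards from bracket tokens in a flat string.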