Dr Femi Adeyemi-Ejeye
Pronouns: He/Him
Academic and research departments
School of Arts, Humanities and Creative Industries, Music and Media.

About
Biography
Dr Femi Adeyemi-Ejeye is an Associate Professor in Video Technology at the University of Surrey whose research investigates how video compression, network conditions, and system design shape human perception of video and immersive media experiences. His work focuses on the evaluation and optimisation of Quality of Experience (QoE) in networked and next-generation media systems, spanning video compression, streaming, immersive media, XR, real-time communication, and safety-critical video applications.
His research sits at the intersection of multimedia systems, perceptual video quality, networked media delivery, and immersive technologies. Through experimental studies, perceptual evaluation, and systems-level analysis, he examines how technical choices such as compression strategies, transmission constraints, latency, and platform design affect the way users perceive and engage with digital media. A central aim of his work is to ensure that next-generation media technologies are designed, evaluated, and deployed with human experience at their core.
Dr Adeyemi-Ejeye contributes to the development of international multimedia quality standards through his work in ITU-T Study Group 12 and MPEG, helping to shape methodologies for the assessment of immersive and next-generation media systems. He is also a Board Member of the Video Quality Experts Group (VQEG) and a member of the IEEE Consumer Technology Society (CTSoc) Wireless and Network Technologies Technical Committee, reflecting his active role in the international research and standards community.
His research combines fundamental and applied perspectives, with a strong emphasis on translation into real-world systems. He collaborates with partners across the telecommunications, transport, and creative technology sectors to develop perceptually informed approaches to media system design and evaluation. This includes work on high-resolution video systems, immersive media delivery, and technologies used in operational and safety-critical environments.
An example of this applied research is his contribution to a £396,349 SBRI First of a Kind (FOAK) project led by Rail Innovations, which explored how 8K video, cloud technologies, and AI-based image recognition could enhance railway incident investigation and support faster restoration of services. Dr Adeyemi-Ejeye was the Surrey Principal Investigator on Surrey’s contribution to the project, working with partners including One Big Circle Ltd, Avanti West Coast, and Angel Trains Ltd.
Alongside his research, Dr Adeyemi-Ejeye contributes to teaching in video technology, multimedia systems, and computer imaging, supporting the development of students in areas such as video compression, networking, and immersive media systems. He supervises undergraduate and doctoral research and is committed to mentoring students and early-career researchers while fostering strong links between academic research and industry practice. He also currently serves as Director of Postgraduate Research in the School of Arts, Humanities and Creative Industries at the University of Surrey, where he supports doctoral training, interdisciplinary research culture, and collaboration between academia, industry, and public organisations.
He welcomes collaboration with researchers, industry partners, and public sector organisations interested in multimedia quality, video compression, networked media systems, immersive technologies, and the human-centred evaluation of next-generation digital media.
Areas of specialism
University roles and responsibilities
- Director of Postgraduate Research, School of Arts, Humanities and Creative Industries
- Senior Placement Tutor, Film Production and Broadcast Engineering
Previous roles
Affiliations and memberships
Business, industry and community links
Research

Research interests
Dr Femi Adeyemi-Ejeye’s research interests focus on the human-centred evaluation and optimisation of networked and next-generation media systems. His work sits at the intersection of video compression, perceptual video quality, multimedia systems, networked media delivery, and immersive media technologies, examining how technical design choices influence the way users perceive and experience digital media.
His research covers Quality of Experience for video and immersive systems, streaming and real-time communication, XR and immersive media, safety-critical video applications, and AI-enabled multimedia analysis. Through this work, he contributes to international standards development and is helping to shape how the field evaluates media systems.
He welcomes enquiries from prospective PhD students and collaborators interested in multimedia quality, immersive media systems, video technologies, and next-generation networked media applications.
Research projects
Right-Time performance is key to customers having a good journey on the railways, and fatalities are a major source of disruption. These incidents cause lines to be closed whilst British Transport Police (BTP) investigate and the railway is readied to reopen. TRUST data extracted by the National Disruption Fusion Unit of Network Rail (NR) show that, on average, there have been approximately 350 fatalities per annum on GB railways, causing 790,000 Delay Minutes per annum and equating to around 2,300 Delay Minutes per incident.
In response, this project will use emergent 8K video technology, Cloud technology and advanced Artificial Intelligence image recognition to provide BTP with high-quality video they can forensically analyse quickly. With better recordings, more incidents that would be judged as Unexplained due to current technology limitations will correctly be deemed as suicides, reducing hand-back time and alleviating wider customer disruption.
Dr Adeyemi-Ejeye was the Surrey Principal Investigator on this nine-month project, one of the winners of the 2021 First of a Kind (FOAK) competition funded by SBRI, the Small Business Research Initiative (total funding: £396,349). The project was led by Rail Innovations and included One Big Circle Ltd, Avanti West Coast, and Angel Trains Ltd as partners.
This project, funded by an ESRC Impact Acceleration Account, builds on the successes of a previous project and provides rail emergency responders and video systems providers with guidance on how to better capture rail forward-facing CCTV video, based on the improvements offered by 4K UHD and 8K UHD resolutions.
The safe deployment of autonomous and remotely operated vehicles is only possible with mature and fault-tolerant vehicle control systems where any kind of vehicle failure is taken into consideration, including overriding autonomous driving software in the event of significant computer system failures. These control systems, also called drive-by-wire (DbW) systems, are not yet commercially available to the vast majority of vehicle manufacturers, thereby severely limiting the potential roll-out of autonomous vehicles across all markets.
The SAFE (Systems for Autonomy in Fail-Operational Environments) project will develop and test technologies applicable for both NUIC (No User In Charge) and UIC (User in Charge) vehicle platforms integrating novel safety systems and subsystems capable of achieving SAE Level 4 (L4) autonomy within a wide range of Operational Design Domains (ODDs).
Perceptual quality evaluation for Mixed Reality Communications (PI)

Our research on assessing visual quality and motion sickness in 360° videos led to evaluation methods standardised by the International Telecommunication Union (ITU) as ITU-T Rec. P.919. We have found that these methods can also be applied to the subjective assessment of extended reality (XR) communications. Collaborating with the Video Quality Experts Group, we are developing comprehensive guidelines for subjectively evaluating the quality of XR communications, drawing on our expertise in mixed-reality visual communication within black-box systems and activity-based games. The aim is to make XR experience assessments more accurate and reliable. The proposed guidelines will be submitted to the ITU for consideration, potentially setting a new standard.
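As background, P.919-style subjective tests aggregate per-condition ratings into Mean Opinion Scores (MOS) with confidence intervals. A minimal illustration in Python of that aggregation step (the file and column names, and the 5-point rating scale, are illustrative assumptions rather than details of any specific test here):

```python
import pandas as pd
from scipy import stats

# Hypothetical ratings table: one row per (participant, condition) vote,
# e.g. on a 5-point Absolute Category Rating scale.
df = pd.read_csv("ratings.csv")  # columns: participant, condition, score

def mos_with_ci(scores, confidence=0.95):
    """Mean Opinion Score with a t-distribution confidence interval (n > 1)."""
    n = len(scores)
    mos = scores.mean()
    half_width = stats.t.ppf((1 + confidence) / 2, n - 1) * scores.sem()
    return mos, half_width

for condition, group in df.groupby("condition"):
    mos, ci = mos_with_ci(group["score"])
    print(f"{condition}: MOS = {mos:.2f} +/- {ci:.2f}")
```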
Research collaborations
Subjective test methodologies for 360° video on head-mounted displays
An international collaboration involving 10 labs and more than 300 participants. This collaboration involved members of the Immersive Media Group (IMG) of the Video Quality Experts Group (VQEG).
The results from this collaboration led to the development of ITU-T Recommendation P.919.
Quality of experience (QoE) requirements for real-time multimedia services over 5G networks
An international collaboration to produce a technical report that defines a scope for the analysis of QoE in 5G services and several use cases where this scope is applicable. These use cases include tele-operated driving, wireless content production, mixed reality offloading, and first responder networks.
This collaboration was with members of VQEG's 5G Key Performance Indicators (5GKPI) group.
The published report can be seen here: ITU GSTR-5GQoE
ITU-T P.BBQCG: Development of a bitstream model predicting cloud gaming QoE
The model will use bitstream information from packet headers and payloads to achieve higher audiovisual quality prediction accuracy than G.1072. In addition, three different types of codec and a wider range of network parameters will be considered to develop a generalizable model. The model will be trained and validated for the H.264, H.265, and AV1 video codecs and video resolutions up to 4K. Development will follow two paradigms, passive and interactive: the passive paradigm will cover a wide range of encoding parameters, while the interactive paradigm will cover the network parameters that might strongly influence players' interaction with the game.
This collaboration was with members of VQEG's CGI group.
More on the work item can be found here: ITU P.BBQCG
Indicators of esteem
Invited member of the IEEE CTSoc: Wireless and Network Technologies (WNT) Technical Committee (2020 - Present)
Supervision
Postgraduate research supervision
Dr Femi Adeyemi-Ejeye supervises doctoral research in multimedia systems, video technologies, and immersive media, with a focus on the human-centred evaluation and optimisation of next-generation media systems. His supervision is grounded in research at the intersection of video compression, networked media delivery, perceptual video quality, and Quality of Experience (QoE), and draws on his engagement with international standards bodies and industry collaborations.
His research group investigates how compression, networks, and system design influence the way users perceive and interact with digital media, combining experimental user studies, perceptual modelling, and systems-level analysis. Through this work, doctoral researchers contribute to advancing the methodologies used to evaluate emerging multimedia technologies and, in some cases, to discussions within international standards activities shaping the field.
Dr Adeyemi-Ejeye welcomes ambitious and interdisciplinary PhD projects that address challenges in networked video systems, immersive media technologies, and perceptual evaluation of multimedia experiences. Students working in this area may have opportunities to engage with industry partners, real-world application domains, and international research communities, particularly where multimedia technologies intersect with telecommunications, transport systems, digital media, and safety-critical environments.
Potential PhD research topics include
- Quality of Experience (QoE) modelling for immersive and XR media systems
- Perceptual evaluation of video compression and streaming technologies
- AI and machine learning approaches to video quality assessment
- Human perception and user experience in networked multimedia systems
- Evaluation methodologies for next-generation immersive media platforms
- High-resolution and safety-critical video systems for operational environments
Prospective PhD students interested in these areas are encouraged to get in touch to discuss potential research projects and funding opportunities.
Teaching
Student consultation
If you would like to book an appointment to discuss any of the modules below, please get in touch.
Module Leader
- FVP2009 Video Streaming and Computer Networks
- FVP1013 Computer Systems
- FVP3014 Research Methods
- FVP3012 Technical Project
Modules I teach on:
- FVP2009 Video Streaming and Computer Networks
- FVP1013 Computer Systems
- TON1024 Computer Systems
- FVP3014 Research Methods
Sustainable development goals
Publications
We present a scalable data-driven machine learning approach for early and continuous TCP flow-length prediction, enabling Software-Defined Networking controllers to make proactive, latency-aware routing decisions. Unlike traditional Elephant Flow versus Mice Flow classification, which depends on static thresholds and delayed observation, our method performs a regression-based estimation using only the first 400ms of traffic. We aggregate IP packets through tokenization to preserve temporal dynamics while reducing monitoring overhead. An ensemble of Long Short-Term Memory layers extracts temporal features, which are fused and processed by an uncertainty-modelling Mixture Density Network to predict the total flow length. Experiments on real-world CAIDA and MAWI datasets show that our approach reduces mean absolute error to 1.74s, nearly halving the error of state-of-the-art baselines.
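As an illustration of the pipeline this abstract describes (tokenized packet features feeding an ensemble of LSTM branches whose outputs are fused into a Mixture Density Network), here is a minimal Keras sketch; the layer sizes, token window, feature count, and mixture size are illustrative assumptions, not the paper's configuration:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

K = 5          # mixture components (assumed)
T, F = 40, 3   # e.g. 40 tokens covering the first 400ms x 3 features (assumed)

def mdn_nll(y_true, params):
    """Negative log-likelihood of the flow length under a Gaussian mixture."""
    pi, mu, sigma = tf.split(params, 3, axis=-1)
    pi = tf.nn.softmax(pi, axis=-1)
    sigma = tf.nn.softplus(sigma) + 1e-6
    likelihood = tf.reduce_sum(
        pi * tf.exp(-0.5 * tf.square((y_true - mu) / sigma))
           / (sigma * np.sqrt(2.0 * np.pi)),
        axis=-1)
    return -tf.math.log(likelihood + 1e-9)

inputs = layers.Input(shape=(T, F))
# Ensemble of LSTM branches whose temporal features are fused, per the abstract.
branches = [layers.LSTM(32)(inputs) for _ in range(3)]
fused = layers.Concatenate()(branches)
params = layers.Dense(3 * K)(fused)   # [pi | mu | sigma] for each component
model = tf.keras.Model(inputs, params)
model.compile(optimizer="adam", loss=mdn_nll)
# model.fit(X, y) with X of shape (n, T, F) and y of shape (n, 1).
```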
This paper presents the first study quantifying how practitioner experience affects the perceived quality and usability of compressed Forward-Facing Video (FFV). We captured a bespoke 8K UHD forward-facing dataset and encoded each sequence using H.264/AVC and HEVC at three bitrates (30, 50, and 60 Mbps). Eighteen rail practitioners were recruited and stratified by operational experience. Participants provided subjective ratings of video quality and usability and completed a task-based recognition assessment with associated decision-confidence ratings. FFV is increasingly integral to railway safety assurance and incident investigation, providing visual evidence for events such as Signals Passed at Danger (SPADs), near-misses, and collisions. In operational deployments, FFV acts as a human-in-the-loop decision-support resource that must be transmitted and archived under bandwidth and storage constraints, making compression unavoidable. However, compression artefacts can reduce the visibility of decision-critical cues and potentially undermine confidence in operational judgements. While expertise has been shown to shape video interpretation in medical and surveillance contexts, its influence on rail FFV interpretation has received limited empirical attention. Initial results indicate a clear effect of experience: more experienced practitioners appeared more tolerant of visible artefacts, rated compressed FFV as more usable, and reported higher confidence when making task judgements under degraded conditions. A three-way mixed-design ANOVA confirmed a significant effect of experience, while codec and bitrate effects were non-significant, with performance approaching a ceiling at 60 Mbps. These preliminary findings motivate practitioner-aware evaluation protocols and system requirements for safety-critical rail FFV, complementing fidelity metrics with interpretability-driven acceptance criteria. We outline implications for codec selection, bitrate provisioning, and role-based quality thresholds in railway video systems engineering.
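For readers interested in the style of analysis, a three-way mixed design of this kind (one between-subjects factor, two within-subjects factors) can be approximated in Python with a linear mixed model with a random intercept per participant; the data file and column names below are hypothetical:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format ratings: one row per participant x codec x bitrate trial.
# columns: participant, experience (between), codec (within), bitrate (within), rating
df = pd.read_csv("ffv_ratings.csv")

# Random intercept per participant accounts for repeated measures;
# the fixed-effect interactions mirror the three-way design.
model = smf.mixedlm(
    "rating ~ C(experience) * C(codec) * C(bitrate)",
    df,
    groups=df["participant"],
)
print(model.fit().summary())
```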
Industry 4.0, driven by enhanced connectivity through wireless technologies such as 5G and Wi-Fi 6, fosters flexible industrial scenarios for high-yield production and services. Private 5G networks and 802.11ax networks in unlicensed spectrum offer unique opportunities; however, existing techniques limit the flexibility needed to serve diverse industrial use cases. To address a subset of these challenges, this paper offers a solution for time-sensitive application use cases. A new technique is proposed to enable data-driven operations through machine learning for technologies sharing unlicensed bands. This enables proportionate spectrum sharing, informed by data, to improve performance metrics for critical applications. The results presented reveal improved performance in serving critical industrial operations, without degrading spectrum utilization.
This work proposes UIL-AQA for long-term Action Quality Assessment (AQA), designed to be clip-level interpretable and uncertainty-aware. AQA evaluates the execution quality of actions in videos. However, the complexity and diversity of actions, especially in long videos, increase the difficulty of AQA. Existing AQA methods address this by generally limiting themselves to short-term videos. These approaches lack detailed semantic interpretation for individual clips and fail to account for the impact of human biases and subjectivity in the data during model training. Moreover, although query-based Transformer networks demonstrate strong capabilities in long-term modelling, their interpretability in AQA remains insufficient. This is primarily due to a phenomenon we identified, termed Temporal Skipping, where the model skips self-attention layers to prevent output degradation. We introduce an Attention Loss function and a Query Initialization Module to enhance the modelling capability of query-based Transformer networks. Additionally, we incorporate a Gaussian Noise Injection Module to simulate biases in human scoring, mitigating the influence of uncertainty and improving model reliability. Furthermore, we propose a Difficulty-Quality Regression Module, which decomposes each clip's action score into independent difficulty and quality components, enabling a more fine-grained and interpretable evaluation. Our extensive quantitative and qualitative analysis demonstrates that our proposed method achieves state-of-the-art performance on three long-term real-world AQA datasets. Our code is available at: GitHub Repository.
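Of the modules named above, the Gaussian Noise Injection Module is the most self-contained to illustrate: during training, ground-truth scores are perturbed to mimic the spread of human judges' ratings. A one-function PyTorch sketch under that reading (the noise scale is an assumed hyperparameter, not a value from the paper):

```python
import torch

def inject_label_noise(scores: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Perturb ground-truth action scores with Gaussian noise during training
    to simulate human scoring bias; sigma is an illustrative hyperparameter."""
    return scores + torch.randn_like(scores) * sigma
```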
This preliminary study investigates the impact of packet loss on commercial 2D video conferencing systems, specifically Microsoft Teams and Google Meet, when used within a virtual reality (VR) workspace. These platforms, treated as black-box systems, were accessed through head-mounted displays (HMDs) via the Immersed VR application. Participants engaged in a gesture-based charade game under varying packet loss conditions, alternating between gesture-only (Mimer) and audio-enabled (Guesser) roles. Early results from 28 sessions (14 per platform) indicate a noticeable degradation in audiovisual experience as packet loss increases, particularly for gesture-based users. Microsoft Teams demonstrated greater resilience compared to Google Meet, although these findings remain exploratory. The study lays the groundwork for a more comprehensive Quality of Experience (QoE) evaluation. Future work will include additional participants and integrate metrics such as Simulator Sickness (SSQ), NASA TLX, and Quality of Interaction (QoI) to support a fuller assessment of conferencing feasibility in immersive settings.
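Black-box experiments of this kind typically impose controlled loss with a network emulator. A sketch using Linux tc/netem from Python (the interface name and loss level are illustrative; the paper does not state its exact tooling):

```python
import subprocess

IFACE = "eth0"  # hypothetical egress interface of the test network

def set_packet_loss(percent: float) -> None:
    """(Re)apply a netem qdisc that randomly drops the given percentage of packets.
    Requires root privileges."""
    # Remove any existing root qdisc first; ignore the error if none exists.
    subprocess.run(["tc", "qdisc", "del", "dev", IFACE, "root"], check=False)
    subprocess.run(
        ["tc", "qdisc", "add", "dev", IFACE, "root", "netem",
         "loss", f"{percent}%"],
        check=True,
    )

set_packet_loss(1.0)  # e.g. 1% random loss for one experimental condition
```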
Broadcast television traditionally employs a unidirectional transmission path to deliver low latency, high-quality media to viewers. To expand their viewing choices, audiences now demand internet OTT (Over The Top) streamed media with the same quality of experience they have become accustomed to with traditional broadcasting. Media streaming over the internet employs elephant flow characteristics and suffers long delays due to the inherent and variable latency of TCP/IP. This paper proposes to perform rapid elephant flow detection on IP networks within 200ms using a data-driven temporal sequence prediction model, reducing the existing detection time by half. Early detection of media streams (elephant flows) as they enter the network allows the controller in a software-defined network to reroute the elephant flows so that the probability of congestion is reduced and the latency-sensitive mice flows can be given priority. We propose a two-stage machine learning method that encodes the inherent and non-linear temporal data and volume characteristics of the sequential network packets using an ensemble of Long Short-Term Memory (LSTM) layers, followed by a Mixture Density Network (MDN) to model uncertainty, thus determining when an elephant flow (media stream) is being sent within 200ms of the flow starting. We demonstrate that on two standard datasets, we can rapidly identify elephant flows and signal them to the controller within 200ms, improving the current count-min-sketch method that requires more than 450ms of data to achieve comparable results.
This paper presents an evaluation of the latest MPEG-5 Part 2 Low Complexity Enhancement Video Coding (LCEVC) standard for video streaming applications using best-effort protocols. LCEVC is a new video standard by MPEG which enhances any base codec through an additional low-bitrate stream, improving both video compression efficiency and transmission. However, there is an interplay between packetization, packet loss visibility, choice of codec, and video quality, which implies that prior studies with other codecs may not be as relevant. The contribution of this paper is therefore twofold: it evaluates the compression performance of LCEVC, and then the impact of packet loss on its video quality compared to H.264 and HEVC. The results from this evaluation suggest that, regarding compression, LCEVC outperformed its base codecs in terms of overall average encoding bitrate savings when using constant rate factor (CRF) rate control. For example, at a CRF of 19, the average encoding bitrate was reduced by 18.7% and 15.8% compared with the base H.264 and HEVC codecs respectively. Furthermore, LCEVC produced better visual quality across the packet loss range than its base codecs; quality only started to decrease once packet loss exceeded 0.8-1%, and it decreased at a slower pace than for the equivalent base codecs. This suggests that the LCEVC enhancement layer also provides error concealment. The results presented in this paper will be of interest to those considering the LCEVC standard and expected video quality in error-prone environments.
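The H.264/HEVC side of such a CRF comparison can be reproduced with stock ffmpeg; encoding the LCEVC enhancement layer itself requires a separate encoder (for example a vendor SDK) and is therefore not shown. The file names and CRF value below are illustrative:

```python
import os
import subprocess

def encode(src: str, codec: str, crf: int, out: str) -> float:
    """Encode with the given codec at a constant rate factor; return Mb/s."""
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", codec,
                    "-crf", str(crf), out], check=True)
    duration = float(subprocess.check_output(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "csv=p=0", out]).decode().strip())
    return os.path.getsize(out) * 8 / duration / 1e6

h264 = encode("source.y4m", "libx264", 19, "out_h264.mp4")
hevc = encode("source.y4m", "libx265", 19, "out_hevc.mp4")
print(f"H.264: {h264:.1f} Mb/s, HEVC: {hevc:.1f} Mb/s")
```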
Recently, an impressive development in immersive technologies, such as Augmented Reality (AR), Virtual Reality (VR), and 360° video, has been witnessed. However, methods for quality assessment have not been keeping up. This paper studies quality assessment of 360° video based on cross-lab tests (involving ten laboratories and more than 300 participants) carried out by the Immersive Media Group (IMG) of the Video Quality Experts Group (VQEG). These tests were designed to assess and validate subjective evaluation methodologies for 360° video. Audiovisual quality, simulator sickness symptoms, and exploration behavior were evaluated with short (from 10 to 30 seconds) 360° sequences. The influence of factors such as the assessment methodology and sequence duration was also analyzed.
Long-term Action Quality Assessment (AQA) evaluates the execution of activities in videos. However, the length presents challenges in fine-grained interpretability, with current AQA methods typically producing a single score by averaging clip features, lacking detailed semantic meanings of individual clips. Long-term videos pose additional difficulty due to the complexity and diversity of actions, exacerbating interpretability challenges. While query-based transformer networks offer promising long-term modelling capabilities, their interpretability in AQA remains unsatisfactory due to a phenomenon we term Temporal Skipping, where the model skips self-attention layers to prevent output degradation. To address this, we propose an attention loss function and a query initialization method to enhance performance and interpretability. Additionally, we introduce a weight-score regression module designed to approximate the scoring patterns observed in human judgments and replace conventional single-score regression, improving the rationality of interpretability. Our approach achieves state-of-the-art results on three real-world, long-term AQA benchmarks.
Broadcast television traditionally employs a unidirectional transmission path to deliver low latency, high-quality media to viewers. To expand their viewing choices, audiences now demand internet OTT (Over The Top) streamed media with the same quality of experience they have become accustomed to with traditional broadcasting. Media streaming over the internet employs elephant flow characteristics and suffers long delays due to the inherent and variable latency of TCP/IP. Early detection of media streams (elephant flows) as they enter the network allows the controller in a software-defined network to re-route the elephant flows so that the probability of congestion is reduced and the latency-sensitive mice flows can be given priority. This paper proposes to perform rapid elephant flow detection, and hence media flow detection, on IP networks within 200ms using a data-driven temporal sequence prediction model, reducing the existing detection time by half. We propose a two-stage machine learning method that encodes the inherent and non-linear temporal data and volume characteristics of the sequential network packets using an ensemble of Long Short-Term Memory (LSTM) layers, followed by a Mixture Density Network (MDN) to model uncertainty, thus determining when an elephant flow (media stream) is being sent within 200ms of the flow starting. We demonstrate that on two standard datasets, we can rapidly identify elephant flows and signal them to the controller within 200ms, improving the current count-min-sketch method that requires more than 450ms of data to achieve comparable results.
After adjusting for coding gain between the H.264 and HEVC codecs, a comparison is made between the two codecs’ robustness to packet loss. A counter-intuitive finding arises that the less efficient codec is less affected by packet loss than the more efficient codec, even at very low levels of packet loss. The findings will be of interest to those designing portable devices that can display up to 4kUHD video.
This paper examines the 4kUHD video quality from streaming over an IEEE 802.11ac wireless channel, given measured levels of packet loss. Findings suggest that there is a strong content dependency to loss impact upon video quality but that, for short-range transmission, the quality is acceptable, making 4kUHD feasible on head-mounted displays.
Networked visual applications such as video streaming have grown exponentially in recent years, yet are known to be sensitive to network impairments. However, available measurement techniques that adopt a full-reference model are impractical in real-time streaming because they require the original video sequence to be available at the receiver's side. The primary aim of this study is to present a hybrid no-reference prediction model for the perceptual quality of 4kUHD H.265-coded video in the wireless domain. The contributions of this paper are twofold: first, an investigation of the impact of quality of service (QoS) parameters on 4kUHD H.265-coded video transmission in an experimental environment; second, an objective model based on a fuzzy logic inference system is developed to predict visual quality by mapping QoS parameters to the measured quality of experience. The model is evaluated against random neural networks. The results show that good prediction accuracy was obtained from the proposed hybrid prediction model. This study will help in the development of reference-free video quality prediction models and QoS control methods for 4kUHD video streaming.
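A type-1 fuzzy inference system of the general kind described can be sketched with scikit-fuzzy; the universes of discourse, membership functions, and rules below are illustrative stand-ins, not the paper's tuned model:

```python
import numpy as np
import skfuzzy as fuzz
from skfuzzy import control as ctrl

# One QoS input (packet loss, in percent) and the QoE output (MOS, 1-5).
loss = ctrl.Antecedent(np.arange(0, 5.01, 0.01), "packet_loss_pct")
mos = ctrl.Consequent(np.arange(1, 5.01, 0.01), "mos")

# Triangular membership functions (illustrative shapes).
loss["low"] = fuzz.trimf(loss.universe, [0, 0, 1])
loss["high"] = fuzz.trimf(loss.universe, [0.5, 5, 5])
mos["poor"] = fuzz.trimf(mos.universe, [1, 1, 3])
mos["good"] = fuzz.trimf(mos.universe, [2.5, 5, 5])

system = ctrl.ControlSystem([
    ctrl.Rule(loss["low"], mos["good"]),
    ctrl.Rule(loss["high"], mos["poor"]),
])
sim = ctrl.ControlSystemSimulation(system)
sim.input["packet_loss_pct"] = 0.5
sim.compute()
print(f"Predicted MOS: {sim.output['mos']:.2f}")
```

A real model of this kind would add further antecedents (jitter, throughput, bitrate) and a fuller rule base fitted to subjective test data.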
From a review of the literature and a range of experiments, this paper demonstrates that live video streaming to mobile devices with pixel resolutions from Standard Definition up to 4k Ultra High Definition (UHD) is now becoming feasible by means of high-throughput IEEE 802.11ad at 60 GHz or 802.11ac at 5 GHz, and 4kUHD streaming is even possible with 802.11n operating at 5 GHz. Through a customized implementation, the paper also shows that real-time compression at 4kUHD, assisted by Graphical Processing Units (GPUs), is becoming feasible. The paper further considers the impact of packet loss on H.264/AVC and HEVC codec compressed video streams in terms of Structural Similarity (SSIM) index video quality. It additionally gives an indication of wireless network latencies and currently feasible frame rates. Findings suggest that, for medium-range transmission, the video quality may be acceptable at low packet loss rates. For hardware-accelerated 4kUHD encoding, standard frame rates may be possible, but appropriate higher frame rates are only just being reached in hardware implementations. The target bitrate was found to be important in determining display quality, which depends on the coding complexity of the video content. Higher compressed bitrates are recommended, as video quality may improve disproportionately as a result.
The trend towards video streaming with increased spatial resolutions and dimensions, SD, HD, 3D, and 4kUHD, even for portable devices has important implications for displayed video quality. There is an interplay between packetization, packet loss visibility, choice of codec, and viewing conditions, which implies that prior studies at lower resolutions may not be as relevant. This paper presents two sets of experiments, the one at a Variable BitRate (VBR) and the other at a Constant BitRate (CBR), which highlight different aspects of the interpretation. The latter experiments also compare and contrast encoding with either an H.264 or a High Efficiency Video Coding (HEVC) codec, with all results recorded as objective Mean Opinion Score (MOS). The video quality assessments will be of interest to those considering: the bitrates and expected quality in error-prone environments; or, in fact, whether to use a reliable transport protocol to prevent all errors, at a cost in jitter and latency, rather than tolerate low levels of packet errors.
Ultra High Definition (UHD) video streaming to portable devices has become topical. Two standardized codecs are in current use: H.264/Advanced Video Coding (AVC) and the more recent High Efficiency Video Coding (HEVC). This paper compares the two codecs' robustness to packet loss, after making allowances for relative coding gain. A significant finding from the comparison is that the H.264/AVC codec is less impacted by packet loss than HEVC, despite their differing coding efficiencies, even at low levels of packet loss. The results will be especially relevant to those designing portable devices with 4K UHD video display capability, allowing them to estimate the level of error concealment necessary. The paper also includes the results of HEVC compressed UHD video streaming over an IEEE 802.11ad wireless link operating at 60 GHz as a pointer to future performance in an error-prone channel.
The Internet of things (IoT) has received a great deal of attention in recent years, and is still being approached with a wide range of views. At the same time, video data now accounts for over half of internet traffic. With content beyond high definition now available, it is worth understanding the performance effects, especially for real-time applications. High Efficiency Video Coding (HEVC) aims to reduce bandwidth utilisation while maintaining perceived video quality in comparison with its predecessor codecs, and its adoption promises significant improvements for areas such as television broadcast, multimedia streaming/storage, and mobile communications. Although there have been attempts at HEVC streaming, the literature and implementations offered do not take into consideration changes in the HEVC specifications. Beyond this, little research appears to exist on live streaming of real-time HEVC-coded content. Our contribution fills this gap by enabling compliant, real-time networked HEVC visual applications. This is done by implementing a technique for real-time HEVC encapsulation in MPEG-2 Transport Stream (MPEG-2 TS) and HTTP Live Streaming (HLS), thereby removing the need for multi-platform clients to receive and decode HEVC streams. It is taken further by evaluating the transmission of 4k UHDTV HEVC-coded content in a typical wireless environment using both computers and mobile devices, while considering well-known factors such as obstruction, interference, and other unseen factors that affect network performance and video quality. Our results suggest that 4kUHD can be streamed at 13.5 Mb/s, and can be delivered to multiple devices without loss in perceived quality.
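The encapsulation step described (HEVC in an MPEG-2 Transport Stream, segmented for HTTP Live Streaming) maps onto standard ffmpeg muxers. The following is a generic reconstruction rather than the authors' implementation; the segment length and file names are assumptions:

```python
import subprocess

# Package an already-encoded HEVC source as HLS: MPEG-2 TS segments
# plus an m3u8 playlist that generic clients can fetch over HTTP.
subprocess.run(
    ["ffmpeg", "-re", "-i", "input_hevc.mp4",
     "-c:v", "copy",               # stream copy: encoding happens upstream
     "-f", "hls",
     "-hls_time", "4",             # 4-second TS segments (illustrative)
     "-hls_playlist_type", "event",
     "stream.m3u8"],
    check=True,
)
```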
This paper proposes a prediction model for the perceptual quality of wireless 4kUHD H.265 video streaming. Based on Interval Type-2 Fuzzy Logic System (IT2FLS), the model exploits application and physical layer parameters. The results show that good prediction accuracy was obtained from the proposed prediction model. This study should help in the development of a reference-free video quality prediction model and QoS control methods for 4kUHD video streaming.
Door phone systems, allowing occupants of a building to communicate with visitors at the door, have evolved over the years, with the most recent advancements being fully internet protocol (IP)-based solutions. To adopt newer IP-based solutions, current analogue systems can be replaced, yet this may be costly and cumbersome, especially in a conventional multi-occupant building. We therefore propose an architecture which supports current analogue door phone systems while also providing IP-based functionality. We have implemented the proposed architecture based on SIP, WebRTC, and an IoT gateway system connected to a multi-occupant conventional video door phone system.