Dr Sefki Kolozali
Qualifications: B.Sc. M.Sc. Ph.D.
Biography
I am a Research Fellow at the Institute for Communication Systems (ICS), University of Surrey, currently involved in the CityPulse project under the supervision of Dr. Payam Barnaghi as WP3 leader on large-scale data analysis and seamless integration of data sources. I received the B.Sc. degree in Computer Engineering from the Near East University, Nicosia, Turkish Republic of Northern Cyprus, in 2005 and the M.Sc. in Electronic Commerce Technology from the University of Essex, Essex, UK, in 2008. I completed the Ph.D. degree at the Centre for Digital Music (Queen Mary, University of London) under the supervision of Professor Mark Sandler. My thesis focused on automatic ontology generation based on semantic audio analysis. The research investigated how the process of developing ontologies can be made less dependent on human supervision by exploring conceptual analysis techniques in a Semantic Web environment. The main application that I developed was a hybrid system that automatically builds an ontology using timbre-based instrument recognition techniques and Formal Concept Analysis. The potential benefits of such an automated knowledge management system span a broad range, from knowledge engineering (knowledge acquisition, representation, and management) to cognitive robotics and pervasive computing systems. I therefore took an interdisciplinary approach involving diverse research themes: (i) audio perception, (ii) information retrieval, (iii) knowledge management and representation, and (iv) semantic web technologies.
My research interests include Internet of Things, semantic web technologies, knowledge engineering, music information retrieval, Big Data, and machine learning techniques; and developing new technologies for the future Internet and Web systems.
The research carried out for my Ph.D. was used in two separate projects: Networked Environment for Music Analysis (NEMA) and Ontology-driven Music Retrieval & Annotation Sharing service 2 (OMRAS 2). The NEMA project addressed the problem of automation, distribution and integration of Music Information Retrieval (MIR) and Computational Music (CM) research tool development, evaluation and use. The OMRAS 2 project addressed annotating and searching collections of both recorded music and digital score representations such as MIDI. I also contributed to these projects by filtering and publishing a large set of music audio similarity features produced by the SoundBite playlist generator tool. The resulting database, called Isophone, was made available on the Semantic Web via a SPARQL endpoint, which can be used in Linked Data services.
I am currently working as WP3 leader on large scale data analysis and seamless integration of data sources at CityPulse: Real-Time IoT Stream Processing and Large-scale Data Analytics for Smart City Applications, EU FP7 project (2013-2017).
- 'On the Effect of Adaptive and Non-Adaptive Analysis of Time-Series Sensory Data'.
IEEE Internet of Things, 99
With the growing popularity of Information and Communications Technologies (ICT) and information sharing and integration, cities are evolving into large interconnected ecosystems by using smart objects and sensors that enable interaction with the physical world. However, it is often difficult to perform real-time analysis of large amounts of heterogeneous data and sensory information provided by various resources. This paper describes a framework for real-time semantic annotation and aggregation of data streams to support dynamic integration into the Web using the Advanced Message Queuing Protocol (AMQP). We provide a comprehensive analysis of the effect of adaptive and non-adaptive window size in segmentation of time series using the SensorSAX and SAX approaches for data streams with different variations and sampling rates in real-time processing. The framework is evaluated with three parameters, namely the window size parameter of the SAX algorithm and the sensitivity level and minimum window size parameters of the SensorSAX algorithm, based on the average data aggregation and annotation time, CPU consumption, data size, and data reconstruction rate. Based on a statistical analysis, a detailed comparison between various sensor points is made to investigate the memory and computational cost of the stream-processing framework. Our results suggest that, regardless of the segmentation approach used, because each geographically different sensory environment has a different level of dynamicity, it is desirable to find the optimal data aggregation parameters in order to reduce energy consumption and improve data aggregation quality.
- 'CityPulse: Large Scale Data Analytics Framework for Smart Cities'.
IEEE Access, 4
Repository URL: http://epubs.surrey.ac.uk/810693/
Our world and our lives are changing in many ways. Communication, networking, and computing technologies are among the most influential enablers that shape our lives today. Digital data and connected worlds of physical objects, people, and devices are rapidly changing the way we work, travel, socialize, and interact with our surroundings, and they have a profound impact on different domains, such as healthcare, environmental monitoring, urban systems, and control and management applications, among several other areas. Cities currently face an increasing demand for providing services that can have an impact on people's everyday lives. The CityPulse framework supports smart city service creation by means of a distributed system for semantic discovery, data analytics, and interpretation of large-scale (near-)real-time Internet of Things data and social media data streams. The goal is to break away from silo applications and enable cross-domain data integration. The CityPulse framework integrates multimodal, mixed quality, uncertain and incomplete data to create reliable, dependable information and continuously adapts data processing techniques to meet the quality of information requirements from end users. Unlike existing solutions that mainly offer unified views of the data, the CityPulse framework is also equipped with powerful data analytics modules that perform intelligent data aggregation, event detection, quality assessment, contextual filtering, and decision support. This paper presents the framework, describes its components, and demonstrates how they interact to support easy development of custom-made applications for citizens. The benefits and the effectiveness of the framework are demonstrated in a use-case scenario implementation presented in this paper.
- 'Automatic Ontology Generation for Musical Instruments Based on Audio Analysis'. IEEE Transactions on Audio, Speech & Language Processing, 21, Article number 10, pp. 2207-2220. (2013)
- 'A Validation Tool for the W3C SSN Ontology based Sensory Semantic Knowledge'.
The 13th International Semantic Web Conference
This paper describes an ontology validation tool that is designed for the W3C Semantic Sensor Networks Ontology (W3C SSN). The tool allows ontologies and linked-data descriptions to be validated against the concepts and properties used in the W3C SSN model. It generates validation reports and collects statistics regarding the most commonly used terms and concepts within the ontologies. An online version of the tool is available at: (http://iot.ee.surrey.ac.uk/SSNValidation). This tool can be used as a checking and validation service for new ontology developments in the IoT domain. It can also be used to give feedback to W3C SSN and other similar ontology developers regarding the most commonly used concepts and properties from the reference ontology and this information can be used to create core ontologies that have higher level interoperability across different systems and various application domains.
- 'A Knowledge-based Approach for Real-Time IoT Data Stream Annotation and Processing'.
IEEE International Conference on Internet of Things (iThings 2014)
Internet of Things is a generic term that refers to interconnection of real-world services which are provided by smart objects and sensors that enable interaction with the physical world. Cities are also evolving into large interconnected ecosystems in an effort to improve sustainability and operational efficiency of the city services and infrastructure. However, it is often difficult to perform real-time analysis of large amounts of heterogeneous data and sensory information that are provided by various sources. This paper describes a framework for real-time semantic annotation of streaming IoT data to support dynamic integration into the Web using the Advanced Message Queuing Protocol (AMQP). This will enable delivery of large volumes of data that can influence the performance of the smart city systems that use IoT data. We present an information model to represent summarisation and reliability of stream data. The framework is evaluated with the data size and average exchanged message time using summarised and raw sensor data. Based on a statistical analysis, a detailed comparison between various sensor points is made to investigate the memory and computational cost for the stream annotation framework.
- 'A framework for automatic ontology generation based on semantic audio analysis'.
Proceedings of the AES International Conference, pp. 87-96.
Ontologies have been established for knowledge sharing and are widely used for structuring domains of interest conceptually. With the growing amount of data on the internet, manual annotation and development of ontologies becomes critical. We propose a hybrid system to develop ontologies from audio signals automatically, in order to provide assistance to ontology engineers. The method is examined using various musical instruments, from wind and string families, that are classified using timbre features extracted from audio. To obtain models of the analysed instrument recordings, we use K-means clustering and determine an optimised codebook of Line Spectral Frequencies (LSFs) or Mel-frequency Cepstral Coefficients (MFCCs). The system was tested using two classification techniques, Multi-Layer Perceptron (MLP) neural network and Support Vector Machines (SVM). We then apply Formal Concept Analysis (FCA) to derive a lattice of concepts which is transformed into an ontology using the Web Ontology Language (OWL). The system was evaluated using Multivariate Analysis of Variance (MANOVA), with the feature and classifier attributes as independent variables and the lexical and taxonomic evaluation metrics as dependent variables.
- 'Knowledge Management On The Semantic Web: A Comparison of Neuro-Fuzzy and Multi-Layer Perceptron Methods For Automatic Music Tagging'.
The 9th International Symposium on Computer Music Modeling and Retrieval (CMMR 2012)
This paper presents the preliminary analyses towards the development of a formal method for generating autonomous, dynamic ontology systems in the context of web-based audio signals applications. In the music domain, social tags have become important components of database management, recommender systems, and song similarity engines. In this study, we map the audio similarity features from the Isophone database to social tags collected from the Last.fm online music streaming service, by using neuro-fuzzy (NF) and multi-layer perceptron (MLP) neural networks. The algorithms were tested on a large-scale dataset (Isophone) including more than 40 000 songs from 10 different musical genres. The classification experiments were conducted for a large number of tags (32) related to genre, instrumentation, mood, geographic location, and time-period. The neuro-fuzzy approach increased the overall F-measure by 25 percentage points in comparison with the traditional MLP approach. This highlights the interest of neuro-fuzzy systems which have been rarely used in music information retrieval so far, whereas they have been interestingly applied to classification tasks in other domains such as image retrieval and affective computing.
- 'Music recommendation for music learning: Hotttabs, a multimedia guitar tutor'.
CEUR Workshop Proceedings, 793, pp. 7-13.
Music recommendation systems built on top of music information retrieval (MIR) technologies are usually designed to provide new ways to discover and listen to digital music collections. However, they do not typically assist in another important aspect of musical activity, music learning. In this study we present the application Hotttabs, an online music recommendation system dedicated to guitar learning. Hotttabs makes use of The Echo Nest music platform to retrieve the latest popular or "hot" songs based on editorial, social and charts/sales criteria, and YouTube to find relevant guitar video tutorials. The audio tracks of the YouTube videos are processed with an automatic chord extraction algorithm in order to provide a visual feedback of the chord labels synchronised with the video. Guitar tablatures, a form of music notation showing instrument fingerings, are mined from the web and their chord sequences are extracted. The tablatures are then clustered based on the songs' chord sequences complexity so that guitarists can pick up those adapted to their performance skills.
- 'Knowledge Representation Issues in Musical Instrument Ontology Design'. University of Miami ISMIR, pp. 465-470. (2011)
- 'Towards the Automatic Generation of a Semantic Web Ontology for Musical Instruments'. Springer SAMT, 6725, pp. 186-187. (2010)
- 'Publishing Music Similarity Features on the Semantic Web'. International Society for Music Information Retrieval ISMIR, pp. 447-452. (2009)
- The SSN Ontology Validation Service (if you are on campus try this link) (Code: Sefki Kolozali/Tarek Elsaleh)
- Stream Annotation Ontology (if you are on campus try this link)
- SAOPY (if you are on campus try this link)
- (Simplified) KAT python library (if you are on campus try this link)
- Semantically Annotated Smart City Data (City of Aarhus - Traffic/Parking/Pollution Datasets) (if you are on campus try this link.)
- Semantics and Data Analytics for Smart City Applications, ESWC 2015
- Stream Processing and Data Analytics for Smart City. Big Data and Analytics Summer School 2015, University of Essex. (Register here)