Paul Krause

Professor Paul Krause

Professor in Complex Systems
BSc, PhD, CMath, FIMA
+44 (0)1483 689861
32 BB 02
Tuesdays 13:00 - 15:00


University roles and responsibilities

  • Programme Lead for MSc in Data Science
  • Senior Academic Misconduct Officer
  • CS Rep on HPC Stakeholders Committee
  • Year 2 Coordinator

My publications


Bryant D, Krause PJ, Vreeswijk GAW (2006) Argue tuProlog: A Lightweight Argumentation Engine for Agent Applications, COMPUTATIONAL MODELS OF ARGUMENT 144 pp. 27-32 IOS PRESS
Krause Paul, Razavi AR, Moschoyiannis Sotiris, Marinos A (2009) Stability and Complexity in Digital Ecosystems, 2009 3RD IEEE INTERNATIONAL CONFERENCE ON DIGITAL ECOSYSTEMS AND TECHNOLOGIES pp. 85-90 IEEE
In this paper we explore the concept of "ecosystem" as a metaphor in the development of the digital economy. We argue that the modelling of social ecosystems as self-organising systems is also relevant to the study of digital ecosystems. Specifically, that centralised control structures in digital ecosystems militate against emergence of innovation and adaptive response to pressures or shocks that may impact the ecosystem. We hope the paper will stimulate a more holistic approach to gaining empirical and theoretical understanding of digital ecosystems.
Leppenwell E, de Lusignan S, Vicente MT, Michalakidis G, Krause P, Thompson S, McGilchrist M, Sullivan F, Desombre T, Taweel A, Delaney B (2012) Developing a survey instrument to assess the readiness of primary care data, genetic and disease registries to conduct linked research: TRANSFoRm International Research Readiness (TIRRE) survey instrument., Inform Prim Care 20 (3) pp. 207-216
Clinical data are collected for routine care in family practice; there are also a growing number of genetic and cancer registry data repositories. The Translational Research and Patient Safety in Europe (TRANSFoRm) project seeks to facilitate research using linked data from more than one source. We performed a requirements analysis which identified a wide range of data and business process requirements that need to be met before linking primary care and either genetic or disease registry data.
Xhafa F, Wang J, Chen X, Liu JK, Li J, Krause P (2013) An efficient PHR service system supporting fuzzy keyword search and fine-grained access control, Soft Computing pp. 1-8
Outsourcing of personal health record (PHR) has attracted considerable interest recently. It can not only bring much convenience to patients, it also allows efficient sharing of medical information among researchers. As the medical data in PHR is sensitive, it has to be encrypted before outsourcing. To achieve fine-grained access control over the encrypted PHR data becomes a challenging problem. In this paper, we provide an affirmative solution to this problem. We propose a novel PHR service system which supports efficient searching and fine-grained access control for PHR data in a hybrid cloud environment, where a private cloud is used to assist the user to interact with the public cloud for processing PHR data. In our proposed solution, we make use of attribute-based encryption (ABE) technique to obtain fine-grained access control for PHR data. In order to protect the privacy of PHR owners, our ABE is anonymous. That is, it can hide the access policy information in ciphertexts. Meanwhile, our solution can also allow efficient fuzzy search over PHR data, which can greatly improve the system usability. We also provide security analysis to show that the proposed solution is secure and privacy-preserving. The experimental results demonstrate the efficiency of the proposed scheme. © 2013 Springer-Verlag Berlin Heidelberg.
Hierons RM, Bogdanov K, Bowen JP, Cleaveland R, Derrick J, Dick J, Gheorghe M, Harman M, Kapoor K, Krause P, Luettgen G, Simons AJH, Vilkomir S, Woodward MR, Zedan H (2009) Using Formal Specifications to Support Testing, ACM COMPUTING SURVEYS 41 (2) ARTN 9 ASSOC COMPUTING MACHINERY
Razavi A, Marinos A, Moschoyiannis Sotiris, Krause Paul (2009) Recovery management in RESTful Interactions, Proceedings of 3rd IEEE International Conference on Digital Ecosystems and Technologies pp. 436-441 IEEE
With REST becoming a dominant architectural paradigm for web services in distributed systems, more and more use cases are applied to it, including use cases that require transactional guarantees. We believe that the loose coupling that is supported by RESTful transactions, makes this currently our preferred interaction style for digital ecosystems (DEs). To further expand its value to DEs, we propose a RESTful transaction model that satisfies both the constraints of recoverable transactions and those of the REST architectural style. We then show the correctness and applicability of the model.
Bryant D, Krause Paul, Moschoyiannis Sotiris, Fisher M, VanDerHoek W, Konev B, Lisitsa A (2006) A tool to facilitate agent deliberation. In Proc. of 10th European Conference on Logics in Artificial Intelligence (JELIA'06), Lecture Notes in Computer Science 4160 pp. 465-468 Springer
In this paper we present a prototype of a tool that demonstrates how existing limitations in ensuring an agent's compliance to an argumentation-based dialogue protocol can be overcome. We also present the implementation of compliance enforcement components for a deliberation dialogue protocol, and an application that enables two human participants to engage in an efficiently moderated dialogue, where all inappropriate utterances attempted by an agent are blocked and prevented from inclusion within the dialogue.
Marinos A, Krause P (2009) Using SBVR, REST and Relational Databases to develop Information Systems native to the Digital Ecosystem, 2009 3RD IEEE INTERNATIONAL CONFERENCE ON DIGITAL ECOSYSTEMS AND TECHNOLOGIES pp. 424-429 IEEE
Moschoyiannis Sotiris, Razavi AR, Zheng YY, Krause Paul (2008) Long-running Transactions: semantics, schemas, implementation, Proceedings of 2nd IEEE International Conference on Digital Ecosystems and Technologies pp. 208-215 IEEE
In this paper we describe a formal model for the distributed coordination of long-running transactions in a Digital Ecosystem for business, involving Small and Medium Enterprises (SMEs). The proposed non-interleaving model of interaction-based service composition allows for communication between internal activities of transactions. The formal semantics of the various modes of service composition are represented by standard xml schemas. The current implementation framework uses suitable asynchronous message passing techniques and reflects the design decisions of the proposed model for distributed transactions in digital ecosystems.
Ryman-Tubb NF, Krause P (2011) Neural Network Rule Extraction to Detect Credit Card Fraud, ENGINEERING APPLICATIONS OF NEURAL NETWORKS, PT I 363 pp. 101-110 SPRINGER-VERLAG BERLIN
Liyanage H, de Lusignan S, Liaw ST, Kuziemsky CE, Mold F, Krause P, Fleming D, Jones S (2014) Big Data Usage Patterns in the Health Care Domain: A Use Case Driven Approach Applied to the Assessment of Vaccination Benefits and Risks. Contribution of the IMIA Primary Healthcare Working Group., Yearb Med Inform 9 pp. 27-35
BACKGROUND: Generally benefits and risks of vaccines can be determined from studies carried out as part of regulatory compliance, followed by surveillance of routine data; however there are some rarer and more long term events that require new methods. Big data generated by increasingly affordable personalised computing, and from pervasive computing devices is rapidly growing and low cost, high volume, cloud computing makes the processing of these data inexpensive. OBJECTIVE: To describe how big data and related analytical methods might be applied to assess the benefits and risks of vaccines. METHOD: We reviewed the literature on the use of big data to improve health, applied to generic vaccine use cases, that illustrate benefits and risks of vaccination. We defined a use case as the interaction between a user and an information system to achieve a goal. We used flu vaccination and pre-school childhood immunisation as exemplars. RESULTS: We reviewed three big data use cases relevant to assessing vaccine benefits and risks: (i) Big data processing using crowdsourcing, distributed big data processing, and predictive analytics, (ii) Data integration from heterogeneous big data sources, e.g. the increasing range of devices in the "internet of things", and (iii) Real-time monitoring for the direct monitoring of epidemics as well as vaccine effects via social media and other data sources. CONCLUSIONS: Big data raises new ethical dilemmas, though its analysis methods can bring complementary real-time capabilities for monitoring epidemics and assessing vaccine benefit-risk balance.
Krause PJ, Perez-Minana E, Thornton J (2012) Bayesian Networks for the management of Greenhouse Gas emissions in the British agricultural sector, Environmental Modelling and Software 35 pp. 132-148 Elsevier
Recent years have witnessed a rapid rise in the development of deterministic and non-deterministic models to estimate human impacts on the environment. An important failing of these models is the difficulty that most people have understanding the results generated by them, the implications to their way of life and also that of future generations. Within the field, the measurement of greenhouse gas emissions (GHG) is one such result. The research described in this paper evaluates the potential of Bayesian Network (BN) models for the task of managing GHG emissions in the British agricultural sector. Case study farms typifying the British agricultural sector were input into both the BN model and CALM, a Carbon accounting tool used by the Country Land and Business Association (CLA) in the UK for the same purpose. Preliminary results show that the BN model provides a better understanding of how the tasks carried out on a farm impact the environment through the generation of GHG emissions. This understanding is achieved by translating the emissions information into their cost in monetary terms using the Shadow Price of Carbon (SPC), something that is not possible using the CALM tool. In this manner, the farming sector should be more inclined to deploy measures for reducing its impact. At the same time, the output of the analysis can be used to generate a business plan that will not have a negative effect on a farm's capital income.
Fenton N, Neil M, Marsh W, Hearty P, Radlinski L, Krause P (2008) On the effectiveness of early life cycle defect prediction with Bayesian Nets, EMPIRICAL SOFTWARE ENGINEERING 13 (5) pp. 499-537 SPRINGER
Krause P, de Lusignan S (2010) Procuring interoperability at the expense of usability: a case study of UK National Programme for IT assurance process., Studies in Health Technology and Informatics: Seamless care, safe care: the challenges of interoperability and patient safety in health care: Proceedings of the EFMI Special Topic Conference 155 pp. 143-149
The allure of interoperable systems is that they should improve patient safety and make health services more efficient. The UK's National Programme for IT has made great strides in achieving interoperability; through linkage to a national electronic spine. However, there has been criticism of the usability of the applications in the clinical environment.
Sansom M, Salazar N, Krause P (2012) MindBeat Quintet: Kinetifying thought through movement and sound.
MindBeat is software developed at the University of Surrey that facilitates collaborative thinking within a multi-disciplinary set-up. The project involved the development of an electronic space that enabled an academic ensemble to carry out an 'ideas improvisation'. Five academics from very different disciplines were invited to post short 'beats' (texts made up of no more than 3-4 sentences), around a predefined question: what makes multidisciplinary collaborations work? The beats developed in time into a progressive thread of ideas. The aim of the software was to track the emergence, development and decline of new ideas within a multidisciplinary environment, and also to be able to understand the patterns that emerge in this process by representing the ideas visually as coloured squares.

The MindBeat software was launched in June 2012 as part of an electronic theatre production of Peter Handke's 'Offending the Audience'. The five Surrey academics played the parts remotely by feeding Handke's text onto the MindBeat website as part of a three-day durational performance. An open audience was then invited to interact with the play's five voices by sitting at one of five iMac stations set up in the studio space. These five computer monitors showed the text broken down into coloured square patterns. The audience could open the text by clicking onto the coloured squares, which would reveal the short text or beat. They could then add a comment or thought to the original text. The audience's participation produced almost 500 additional beats, creating an alternative version to the Handke script. The MindBeat software visualised this ideation as a complex pattern of coloured squares.

The installation featured generative video and generative electronic music played live throughout the entire three days. Using the colour and shape patterns of the ideas-exchange as their score, the musicians shared the software visualisation as a basis for their durational sonic improvisation.

Marinos A, Razavi A, Moschoyiannis Sotiris, Krause Paul, Damiani E, Zhang J, Chang R (2009) RETRO: A Consistent and Recoverable RESTful Transaction Model, 2009 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES, VOLS 1 AND 2 pp. 181-188 IEEE Computer Society
Krause PJ, Ambler S, Elvang-Goransson M, Fox J (1995) A Logic of Argumentation for Reasoning Under Uncertainty, Computational Intelligence 11 pp. 113-131 Wiley Blackwell
We present the syntax and proof theory of a logic of argumentation, LA. We also outline the development of a category theoretic semantics for LA. LA is the core of a proof theoretic model for reasoning under uncertainty. In this logic, propositions are labelled with a representation of the arguments which support their validity. Arguments may then be aggregated to collect more information about the potential validity of the propositions of interest. We make the notion of aggregation primitive to the logic, and then define strength mappings from sets of arguments to one of a number of possible dictionaries. This provides a uniform framework which incorporates a number of numerical and symbolic techniques for assigning subjective confidences to propositions on the basis of their supporting arguments. These aggregation techniques are also described, with examples.
Mak L-O, Krause P (2006) Detection & management of concept drift, Proceedings of 2006 International Conference on Machine Learning and Cybernetics, Vols 1-7 pp. 3486-3491 IEEE
Bryant D, Krause P (2008) A review of current defeasible reasoning implementations, KNOWLEDGE ENGINEERING REVIEW 23 (3) pp. 227-260 CAMBRIDGE UNIV PRESS
Razavi A, Krause Paul, Moschoyiannis Sotiris (2010) Digital Ecosystems: challenges and proposed solutions, In: Antonopoulos N, Exarchakos G, Li M, Liotta A (eds.), Handbook of research on P2P and grid systems for service-oriented computing: Models, Methodologies and Applications pp. 1003-1031 Information Science Reference - Imprint of: IGI Global Publishing
Razavi AR, Moschoyiannis SK, Krause PJ (2008) A Scale-free Business Network for Digital Ecosystems, 2008 2ND IEEE INTERNATIONAL CONFERENCE ON DIGITAL ECOSYSTEMS AND TECHNOLOGIES pp. 196-201 IEEE
Razavi AR, Moschoyiannis SK, Krause PJ (2007) Concurrency Control and Recovery Management for Open e-Business Transactions, WOTUG-30: COMMUNICATING PROCESS ARCHITECTURES 2007 65 pp. 267-285 IOS PRESS
Krause PJ, Sabry N (2013) Optimal Green Virtual Machine Migration Model, International Journal of Business Data Communications and Networking 9 (3) pp. 35-52
Cloud computing provides the opportunity to migrate virtual machines to "follow-the-green" data centres. That is, to migrate virtual machines between green data centres on the basis of clean energy availability, to mitigate the environmental impact of carbon footprint emissions and energy consumption. The virtual machine migration problem can be modelled to maximize the utility of computing resources or minimizing the cost of using computing resources. However, this would ignore the network energy consumption and its impact on the overall CO2 emissions. Unless this is taken into account the extra data traffic due to migration of data could then cause an increase in brown energy consumption and eventually lead to an unintended increase in carbon footprint emissions. Energy consumption is a key aspect in deploying distributed service in cloud networks within decentralized service delivery architectures. In this paper, the authors address an optimization view of the problem of locating a set of cloud services on a set of sites green data centres managed by a service provider or hybrid cloud computing brokerage. The authors' goal is to minimize the overall network energy consumption and carbon footprint emissions for accessing the cloud services for any pair of data centres i and j. The authors propose an optimization migration model based on the development of integer linear programming (ILP) models, to identify the leverage of green energy sources with data centres and the energy consumption of migrating VMs.
Zheng Y, Krause P (2006) Asynchronous semantics and anti-patterns for interacting web services, QSIC 2006: Sixth International Conference on Quality Software, Proceedings pp. 74-81 IEEE COMPUTER SOC
Razavi AR, Moschoyiannis SK, Krause PJ (2007) A coordination model for distributed transactions in Digital Business EcoSystems, 2007 INAUGURAL IEEE INTERNATIONAL CONFERENCE ON DIGITAL ECOSYSTEMS AND TECHNOLOGIES pp. 319-324 IEEE
de Lusignan S, Liaw ST, Krause P, Curcin V, Vicente MT, Michalakidis G, Agreus L, Leysen P, Shaw N, Mendis K (2011) Key Concepts to Assess the Readiness of Data for International Research: Data Quality, Lineage and Provenance, Extraction and Processing Errors, Traceability, and Curation. Contribution of the IMIA Primary Health Care Informatics Working Group., Yearb Med Inform 6 (1) pp. 112-120
To define the key concepts which inform whether a system for collecting, aggregating and processing routine clinical data for research is fit for purpose.
Moschoyiannis Sotiris, Marinos A, Krause Paul (2010) Generating SQL queries from SBVR rules, Lecture Notes in Computer Science: Semantic Web Rules 6403 pp. 128-143 Springer
Declarative technologies have made great strides in expressivity between SQL and SBVR. SBVR models are more expressive than SQL schemas, but not as imminently executable yet. In this paper, we complete the architecture of a system that can execute SBVR models. We do this by describing how SBVR rules can be transformed into SQL DML so that they can be automatically checked against the database using a standard SQL query. In particular, we describe a formalization of the basic structure of an SQL query which includes aggregate functions, arithmetic operations, grouping, and grouping on condition. We do this while staying within a predicate calculus semantics which can be related to the standard SBVR-LF specification and equip it with a concrete semantics for expressing business rules formally. Our approach to transforming SBVR rules into standard SQL queries is thus generic, and the resulting queries can be readily executed on a relational schema generated from the SBVR model.
Marinos A, Krause P (2010) Towards the web of models: A rule-driven RESTful architecture for distributed systems, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6403 LNCS pp. 251-258
Krause PJ, Marinos A (2009) An SBVR framework for RESTful Web Applications, Lecture Notes in Computer Science: Rule Interchange and Applications 5858 pp. 144-158 Springer
We propose a framework that can be used to produce functioning web applications from SBVR models. To achieve this, we begin by discussing the concept of declarative application generation and examining the commonalities between SBVR and the RESTful architectural style of the web. We then show how a relational database schema and RESTful interface can be generated from an SBVR model. In this context, we discuss how SBVR can be used to semantically describe hypermedia on the Web and enhance its evolvability and loose coupling properties. Finally, we show that this system is capable of exhibiting process-like behaviour without requiring explicitly defined processes.
Allinjawi AA, Al-Nuaim HA, Krause P (2014) An Achievement Degree Analysis Approach to Identifying Learning Problems in Object-Oriented Programming, ACM TRANSACTIONS ON COMPUTING EDUCATION 14 (3) ARTN 20 ASSOC COMPUTING MACHINERY
Marinos A, Krause P (2009) What, not How: A generative approach to service composition, 2009 3RD IEEE INTERNATIONAL CONFERENCE ON DIGITAL ECOSYSTEMS AND TECHNOLOGIES pp. 430-435 IEEE
Moschoyiannis Sotiris, Krause Paul, Bryant D, McBurney P (2009) Verifiable Protocol Design for Agent Argumentation Dialogues, 2009 3RD IEEE INTERNATIONAL CONFERENCE ON DIGITAL ECOSYSTEMS AND TECHNOLOGIES pp. 630-635 IEEE
We describe a formal approach to protocol design for dialogues between autonomous agents in a digital ecosystem that involve the exchange of arguments between the participants. We introduce a vector language-based representation of argumentation protocols, which captures the interplay between different agents' moves in a dialogue in a way that (a) determines the legal moves that are available to each participant, in each step, and (b) records the dialogue history. We use UML protocol state machines (PSMs) to model a negotiation dialogue protocol at both the individual participant level (autonomous agent viewpoint) and the dialogue level (overall interaction viewpoint). The underlying vector semantics is used to verify that a given dialogue was played out in compliance with the corresponding protocol.
Razavi A, Moschoyiannis Sotiris, Krause Paul (2009) An open digital environment to support business ecosystems, Peer-to-Peer Networking and Applications 2 (4) pp. 367-397 SPRINGER
We present a Peer-to-Peer network design which aims to support business activities conducted through a network of collaborations that generate value in different, mutually beneficial, ways for the participating organisations. The temporary virtual networks formed by long-term business transactions that involve the execution of multiple services from different providers are used as the building block of the underlying scale-free business network. We show how these local interactions, which are not governed by a single organisation, give rise to a fully distributed P2P architecture that reflects the dynamics of business activities. The design is based on dynamically formed permanent clusters of nodes, the so-called Virtual Super Peers (VSPs), and this results in a topology that is highly resilient to certain types of failure (and attacks). Furthermore, the proposed P2P architecture is capable of reconfiguring itself to adapt to the usage that is being made of it and respond to global failures of conceptual hubs. This fosters an environment where business communities can evolve to meet emerging business opportunities and achieve sustainable growth within a digital ecosystem.
Michalakidis G, Kumarapeli P, Ring A, van Vlymen J, Krause P, de Lusignan S (2010) A system for solution-orientated reporting of errors associated with the extraction of routinely collected clinical data for research and quality improvement., Studies in Health Technology and Informatics: Proceedings of the 13th World Congress on Medical Informatics 160 (Pt 1) pp. 724-728
We have used routinely collected clinical data in epidemiological and quality improvement research for over 10 years. We extract, pseudonymise and link data from heterogeneous distributed databases; inevitably encountering errors and problems.
de Lusignan S, Cashman J, Poh N, Michalakidis G, Mason A, Desombre T, Krause P (2012) Conducting Requirements Analyses for Research using Routinely Collected Health Data: a Model Driven Approach., Stud Health Technol Inform 180 pp. 1105-1107
Background: Medical research increasingly requires the linkage of data from different sources. Conducting a requirements analysis for a new application is an established part of software engineering, but rarely reported in the biomedical literature; and no generic approaches have been published as to how to link heterogeneous health data. Methods: Literature review, followed by a consensus process to define how requirements for research using multiple data sources might be modeled. Results: We have developed a requirements analysis: i-ScheDULEs - The first components of the modeling process are indexing and creating a rich picture of the research study. Secondly, we developed a series of reference models of progressive complexity: Data flow diagrams (DFD) to define data requirements; unified modeling language (UML) use case diagrams to capture study specific and governance requirements; and finally, business process models, using business process modeling notation (BPMN). Discussion: These requirements and their associated models should become part of research study protocols.
Fenton N, Neil M, Marsh W, Hearty P, Marquez D, Krause P, Mishra R (2007) Predicting software defects in varying development lifecycles using Bayesian nets, INFORMATION AND SOFTWARE TECHNOLOGY 49 (1) pp. 32-43 ELSEVIER SCIENCE BV
Liang P-C, Krause P (2016) Smartphone-Based Real-Time Indoor Location Tracking With 1-m Precision, IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS 20 (3) pp. 756-762 IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Zhang F, Povey D, Krause P (2007) Protein Attributes Microtuning System (PAMS): an effective tool to increase protein structure prediction by data purification, 2007 INAUGURAL IEEE INTERNATIONAL CONFERENCE ON DIGITAL ECOSYSTEMS AND TECHNOLOGIES pp. 53-58 IEEE
Manaf NA, Moschoyiannis Sotiris, Krause Paul (2015) Service Choreography, SBVR, and Time, Proceedings CONCUR 2015 - FOCLASA, EPTCS 201 pp. 63-77
We propose the use of structured natural language (English) in specifying service choreographies, focusing on the what rather than the how of the required coordination of participant services in realising a business application scenario. The declarative approach we propose uses the OMG standard Semantics of Business Vocabulary and Rules (SBVR) as a modelling language. The service choreography approach has been proposed for describing the global orderings of the invocations on interfaces of participant services. We therefore extend SBVR with a notion of time which can capture the coordination of the participant services, in terms of the observable message exchanges between them. The extension is done using existing modelling constructs in SBVR, and hence respects the standard specification. The idea is that users - domain specialists rather than implementation specialists - can verify the requested service composition by directly reading the structured English used by SBVR. At the same time, the SBVR model can be represented in formal logic so it can be parsed and executed by a machine.
Razavi A, Marinos A, Moschoyiannis Sotiris, Krause Paul (2009) RESTful Transactions Supported by the Isolation Theorems, WEB ENGINEERING, PROCEEDINGS 5648 pp. 394-409 Springer
With REST becoming the dominant architectural paradigm for web services in distributed systems, more and more use cases are applied to it, including use cases that require transactional guarantees. We propose a RESTful transaction model that satisfies both the constraints of transactions and those of the REST architectural style. We then apply the isolation theorems to prove the robustness of its properties on a formal level.
Liyanage H, Krause P, De Lusignan S (2015) Using ontologies to improve semantic interoperability in health data., Journal of innovation in health informatics 22 (2) pp. 309-315
The present-day health data ecosystem comprises a wide array of complex heterogeneous data sources. A wide range of clinical, health care, social and other clinically relevant information are stored in these data sources. These data exist either as structured data or as free-text. These data are generally individual person-based records, but social care data are generally case based and less formal data sources may be shared by groups. The structured data may be organised in a proprietary way or be coded using one-of-many coding, classification or terminologies that have often evolved in isolation and designed to meet the needs of the context that they have been developed. This has resulted in a wide range of semantic interoperability issues that make the integration of data held on these different systems challenging. We present semantic interoperability challenges and describe a classification of these. We propose a four-step process and a toolkit for those wishing to work more ontologically, progressing from the identification and specification of concepts to validating a final ontology. The four steps are: (1) the identification and specification of data sources; (2) the conceptualisation of semantic meaning; (3) defining to what extent routine data can be used as a measure of the process or outcome of care required in a particular study or audit and (4) the formalisation and validation of the final ontology. The toolkit is an extension of a previous schema created to formalise the development of ontologies related to chronic disease management. The extensions are focused on facilitating rapid building of ontologies for time-critical research studies.
Webb SJ, Hanser T, Howlin B, Krause P, Vessey JD (2014) Feature combination networks for the interpretation of statistical machine learning models: Application to Ames mutagenicity, Journal of Cheminformatics 6 (1)
Background: A new algorithm has been developed to enable the interpretation of black box models. The developed algorithm is agnostic to learning algorithm and open to all structural based descriptors such as fragments, keys and hashed fingerprints. The algorithm has provided meaningful interpretation of Ames mutagenicity predictions from both random forest and support vector machine models built on a variety of structural fingerprints. A fragmentation algorithm is utilised to investigate the model's behaviour on specific substructures present in the query. An output is formulated summarising causes of activation and deactivation. The algorithm is able to identify multiple causes of activation or deactivation in addition to identifying localised deactivations where the prediction for the query is active overall. No loss in performance is seen as there is no change in the prediction; the interpretation is produced directly on the model's behaviour for the specific query. Results: Models have been built using multiple learning algorithms including support vector machine and random forest. The models were built on public Ames mutagenicity data and a variety of fingerprint descriptors were used. These models produced a good performance in both internal and external validation with accuracies around 82%. The models were used to evaluate the interpretation algorithm. Interpretation was revealed that links closely with understood mechanisms for Ames mutagenicity. Conclusion: This methodology allows for a greater utilisation of the predictions made by black box models and can expedite further study based on the output for a (quantitative) structure activity model. Additionally the algorithm could be utilised for chemical dataset investigation and knowledge extraction/human SAR development. © 2014 Webb et al.; licensee Chemistry Central Ltd.
Razavi AR, Malone PJ, Moschoyiannis Sotiris, Jennings B, Krause Paul (2007) A distributed transaction and accounting model for digital ecosystem composed services, 2007 Inaugural IEEE international conference on digital ecosystems and technologies pp. 215-218 IEEE Computer Society
Zheng Y, Zhou J, Krause P (2007) A model checking based test case generation framework for web services, International Conference on Information Technology, Proceedings pp. 715-720 IEEE COMPUTER SOC
Moschoyiannis S, Krause PJ (2015) True Concurrency in Long-running Transactions for Digital Ecosystems, FUNDAMENTA INFORMATICAE 138 (4) pp. 483-514 IOS PRESS
Bryant D, Krause P (2006) An implementation of a lightweight argumentation engine for agent applications, LOGICS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS 4160 pp. 469-472 SPRINGER-VERLAG BERLIN
de Lusignan S, Krause P, Michalakidis G, Vicente MT, Thompson S, McGilchrist M, Sullivan F, van Royen P, Agreus L, Desombre T, Taweel A, Delaney B (2012) Business Process Modelling is an Essential Part of a Requirements Analysis. Contribution of EFMI Primary Care Working Group., Yearb Med Inform 7 (1) pp. 34-43 Schattauer Publishers
To perform a requirements analysis of the barriers to conducting research linking of primary care, genetic and cancer data.
Moschoyiannis Sotiris, Krause Paul, Shields MW (2009) A True-Concurrent Interpretation of Behavioural Scenarios. In Proc. of ETAPS 2007 - Formal Foundations of Embedded Software and Component-Based Software Architectures (FESCA'07), Electronic Notes in Theoretical Computer Science 203 (7) pp. 3-22 Elsevier
We describe a translation of scenarios given in UML 2.0 sequence diagrams into a tuples-based behavioural model that considers multiple access points for a participating instance and exhibits true-concurrency. This is important in a component setting since different access points are connected to different instances, which have no knowledge of each other. Interactions specified in a scenario are modelled using tuples of sequences, one sequence for each access point. The proposed unfolding of the sequence diagram involves mapping each location (graphical position) onto the so-called component vectors. The various modes of interaction (sequential, alternative, concurrent) manifest themselves in the order structure of the resulting set of component vectors, which captures the dependencies between participating instances. In previous work, we have described how (sets of) vectors generate concurrent automata. The extension to our model with sequence diagrams in this paper provides a way to verify the diagram against the state-based model.
Krause PJ, Fenton N, Neil M, Marsh W, Hearty P, Radlinski L (2008) On the effectiveness of early life cycle defect prediction with Bayesian Nets, Empirical Software Engineering: an international journal 13 (5) pp. 499-537 Springer
Standard practice in building models in software engineering normally involves three steps: (1) collecting domain knowledge (previous results, expert knowledge); (2) building a skeleton of the model based on step 1, including as yet unknown parameters; (3) estimating the model parameters using historical data. Our experience shows that it is extremely difficult to obtain reliable data of the required granularity, or of the required volume with which we could later generalize our conclusions. Therefore, in searching for a method for building a model we cannot consider methods requiring large volumes of data. This paper discusses an experiment to develop a causal model (Bayesian net) for predicting the number of residual defects that are likely to be found during independent testing or operational usage. The approach supports (1) and (2), does not require (3), yet still makes accurate defect predictions (an R² of 0.93 between predicted and actual defects). Since our method does not require detailed domain knowledge it can be applied very early in the process life cycle. The model incorporates a set of quantitative and qualitative factors describing a project and its development process, which are inputs to the model. The model variables, as well as the relationships between them, were identified as part of a major collaborative project. A dataset, elicited from 31 completed software projects in the consumer electronics industry, was gathered using a questionnaire distributed to managers of recent projects. We used this dataset to validate the model by analyzing several popular evaluation measures (R², measures based on the relative error and Pred). The validation results also confirm the need for using the qualitative factors in the model. The dataset may be of interest to other researchers evaluating models with similar aims. Based on some typical scenarios we demonstrate how the model can be used for better decision support in operational environments. We also performed sensitivity analysis in which we identified the most influential variables on the number of residual defects. This showed that the project size, scale of distributed communication and the project complexity cause most of the variation in the number of defects in our model. We make both the dataset and causal model available for research use.
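The evaluation measures named in this abstract (R², relative-error measures and Pred) can be illustrated with a minimal sketch. It assumes the definitions commonly used in software estimation: magnitude of relative error (MRE), its mean (MMRE), Pred(l) as the fraction of projects whose MRE is at most l, and R² as the squared Pearson correlation between predicted and actual values. The paper's exact definitions may differ, and the defect counts below are hypothetical, not taken from the paper's dataset.

```python
from statistics import mean


def mre(actual, predicted):
    """Magnitude of relative error for one project (actual must be non-zero)."""
    return abs(actual - predicted) / actual


def mmre(actuals, predictions):
    """Mean magnitude of relative error across all projects."""
    return mean(mre(a, p) for a, p in zip(actuals, predictions))


def pred(actuals, predictions, level=0.25):
    """Fraction of projects whose MRE does not exceed `level`."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= level)
    return hits / len(actuals)


def r_squared(actuals, predictions):
    """Squared Pearson correlation between actual and predicted values."""
    mean_a, mean_p = mean(actuals), mean(predictions)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actuals, predictions))
    var_a = sum((a - mean_a) ** 2 for a in actuals)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return (cov * cov) / (var_a * var_p)


# Hypothetical defect counts for four projects (illustration only).
actual_defects = [100, 200, 400, 50]
predicted_defects = [110, 180, 400, 80]

if __name__ == "__main__":
    print("MMRE:", mmre(actual_defects, predicted_defects))
    print("Pred(25):", pred(actual_defects, predicted_defects, 0.25))
    print("R^2:", r_squared(actual_defects, predicted_defects))
```

A high R² together with a low MMRE and a high Pred(25) would indicate that predictions track the actual defect counts both in trend and in magnitude.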
Zheng Y, Krause P (2007) Automata semantics and analysis of BPEL, 2007 INAUGURAL IEEE INTERNATIONAL CONFERENCE ON DIGITAL ECOSYSTEMS AND TECHNOLOGIES pp. 307-312 IEEE
At a time when climate change has become an accepted risk to civilisation, governments are developing and implementing policies designed to reduce the impacts of anthropogenic greenhouse gas (GHG) emissions. These can include "command and control" directives. However, in free market economies, such as the United Kingdom, there is a preference for a laissez-faire approach with taxes and incentives designed to shape the market rather than direct government intervention.

This thesis examines the feasibility of using agent-based modeling techniques for predictive analysis with respect to the application of carbon taxes on electricity consumption in the context of wider societal objectives and scenarios, particularly in the procurement of information and communication technology (ICT) equipment and services by small and medium businesses (SMEs) in the United Kingdom. In doing so it provides an area of novel research. With more than 2% of greenhouse gas (GHG) emissions associated with the usage of ICT and with that proportion expected to grow, it is an important sector to target with emission reduction strategies. Testing these strategies in simulation prior to application to the market place is important to policy makers. Normally, policy makers wish to reduce the risk of implementing policies that would not achieve the desired goals and/or would harm the economy. Agent-based modeling potentially offers policy makers valuable insight into probable emergent market behaviour from a "bottom-up" methodology that better suits free markets that contain millions of SMEs.

This thesis applies a multidisciplinary problem solving approach, including microeconomics, agent-based modeling and policy research, to examining potential market responses to carbon taxes such as the Climate Change Levy (CCL), the UK's primary carbon tax designed to reduce carbon emissions produced by SMEs. Areas of novelty in the thesis include the use of agent-based models to examine the effects of carbon taxes on the behaviour of SMEs and the use of ICT as a factor of production with the SME agents themselves.

Mak Lee-Onn, Krause Paul J. (2006) Detection & Management of Concept Drift, pp. 3486-3491

The ability to correctly detect the location and derive the contextual information where a concept begins to drift is essential in the study of domains with changing context. This paper proposes a top-down learning method that incorporates a learning accuracy mechanism to efficiently detect and manage context changes within a large dataset. Using simple search operators to perform convergent search, and JBNC with a graphical viewer to derive context information, the identified hidden contexts are shown with the location of the disjoint points, the contextual attributes that contribute to the concept drift, the graphical output of the true relationships between these attributes, and the Boolean characterisation which is the context.

Razavi AR, Moschoyiannis Sotiris, Krause Paul (2007) Concurrency control and recovery management for open e-business transactions, Proc. of Communicating Process Architectures: Concurrent Systems Engineering Series 65 pp. 267-285 IOS Press
Concurrency control mechanisms such as turn-taking, locking, serialization, transactional locking mechanism, and operational transformation try to provide data consistency when concurrent activities are permitted in a reactive system. Locks are typically used in transactional models for assurance of data consistency and integrity in a concurrent environment. In addition, recovery management is used to preserve atomicity and durability in transaction models. Unfortunately, conventional lock mechanisms severely (and intentionally) limit concurrency in a transactional environment. Such lock mechanisms also limit recovery capabilities. Finally, existing recovery mechanisms themselves impose a considerable overhead on concurrency. This paper describes a new transaction model that supports release of early results inside and outside of a transaction, decreasing the severe limitations of conventional lock mechanisms, yet still guarantees consistency and recoverability of released resources (results). This is achieved through use of a more flexible locking mechanism and by using two types of consistency graph. This provides an integrated solution for transaction management, recovery management and concurrency control. We argue that these are necessary features for management of long-term transactions within "digital ecosystems" of small to medium enterprises.
Razavi AR, Moschoyiannis Sotiris, Krause Paul (2007) A coordination model for distributed transactions in Digital Business EcoSystems, Proceedings of the 2007 Inaugural IEEE-IES Digital EcoSystems and Technologies Conference, DEST 2007 pp. 159-164 IEEE
In this paper we present a model for coordinating distributed long-running and multi-service transactions in Digital Business EcoSystems. The model supports various forms of service composition, which are translated into a tuples-based behavioural description that allows reasoning about the required behaviour in terms of ordering, dependencies and alternative execution. The compensation mechanism guarantees consistency, including omitted results, without breaking local autonomy. The proposed model is considered at the deployment level of SOA, rather than the realisation level, and is targeted to business transactions between collaborating SMEs as it respects the loose-coupling of the underlying services. © 2007 IEEE.
Razavi AR, Moschoyiannis Sotiris, Krause Paul (2008) A scale-free business network for digital ecosystems, Proceedings of 2nd IEEE International Conference on Digital Ecosystems and Technologies pp. 241-246 IEEE
The aim of this paper is to facilitate e-business transactions between small and medium enterprises (SMEs), in a way that respects their local autonomy, within a digital ecosystem. For this purpose, we distinguish transactions from services (and service providers) by considering virtual private transaction networks (VPTNs) and virtual service networks (VSNs). These two virtual levels are optimised individually and with respect to each other. The effect of one on the other can provide stability, failure resistance and small-world characteristics on the one hand, and durability, consistency and sustainability on the other. The proposed network design has a dynamic topology that adapts itself to changes in business models and availability of SMEs, and reflects the highly dynamic nature of a digital ecosystem.
Moschoyiannis Sotiris, Krause Paul, Georgiou P (2012) An animation tool for exploring transactions in a de, IEEE International Conference on Digital Ecosystems and Technologies IEEE
The concept of a digital ecosystem (DE) has been used to explore scenarios in which multiple online services and resources can be accessed by users without there being a single point of control. In previous work we have described how the so-called transaction languages can express concurrent and distributed interactions between online services in a transactional environment. In this paper we outline how transaction languages capture the history of a long-running transaction and highlight the benefits of our true-concurrent approach in the context of DEs. This includes support for the recovery of a long-running transaction whenever some failure is encountered. We introduce an animation tool that has been developed to explore the behaviours of long-running transactions within our modelling environment. Further, we discuss how this work supports the declarative approach to the development of open distributed applications. © 2012 IEEE.
Marinos A, Moschoyiannis Sotiris, Krause Paul (2009) Towards a RESTful infrastructure for digital ecosystems, Proceedings of the International Conference on Management of Emergent Digital EcoSystems, MEDES '09 pp. 340-344
Moschoyiannis Sotiris, Razavi A, Krause Paul (2010) Transaction Scripts: Making Implicit Scenarios Explicit, Electronic Notes in Theoretical Computer Science 238 (6) pp. 63-79 Elsevier
We describe a true-concurrent approach for managing dependencies between distributed and concurrent coordinator components of a long-running transaction. In previous work we have described how interactions specified in a scenario can be translated into a tuples-based behavioural description, namely vector languages. In this paper we show how reasoning against order-theoretic properties of such languages can reveal missing behaviours which are not explicitly described in the scenario but are still possible. Our approach supports the gradual refinement of scenarios of interaction into a complete set of behaviours that includes all desirable orderings of execution and prohibits emergent behaviour of the transaction. Crown Copyright © 2010.
Shields MW, Moschoyiannis S, Krause PJ (2010) Primes in Component Languages, In: Proceedings of Real-time and Embedded Systems (RTES 2010)
Shields MW, Moschoyiannis S, Krause PJ (2010) Behavioural Presentations and an Automata Theory of Components, In: Proceedings of Real-time and Embedded Systems (RTES 2010)
The 'connected world' forces us to think about 'interoperability' as a primary requirement when building health care databases in the present day. Whilst semantic interoperability has made a major contribution to data utilisation between systems it often has not been able to integrate some large heterogeneous datasets required for research. As health data gets 'bigger' and complex, we are required to shift to rapid and flexible ways of resolving problems related to semantic interoperability. Ontological approaches accelerate implementing interoperability due to the availability of robust tools and technology frameworks that promote reuse.

This thesis reports the results of a mixed methods study that proposes a pragmatic methodology that maximises the use of ontologies across a multilayered research readiness model which can be used in data-driven health care research projects. The research examined evidence for the use of ontologies across a majority of layers in the reference model. The first part of the thesis examines the methods used for assessing readiness to participate in research across six dimensions of health care. It reports on existing ontological elements that boost research readiness and also proposes ontological extensions for modelling the semantics of data sources and research study requirements. The second part of the thesis presents an ontology toolkit that supports rapid development of ontologies that can be used in health care research projects. It provides details of how an ontology toolkit for creating health care ontologies was developed through the consensus of a panel of informatics experts and clinicians. This toolkit evolved further to include a series of ontological building blocks that assist clinicians to rapidly build ontologies.

After the arrival of the web in the 1990s, educational institutions started to maintain their learning materials within Virtual Learning Environments (VLEs), as the web is a significant source of material for many students and teachers. However, there has been less development in the current VLEs in the past few years, which remain heavily centred on single institutions even though the web has been developing (e.g., web 2.0, web 3.0). There is a clear need to integrate VLEs with the wider and social Web and maintain its learning contents freely open in order to support the sharing and reuse of learning resources.
In this PhD project, we have prototyped a simple VLE that makes use of Version 7 of the Semantic Content Management System (SCMS) Drupal in order to provide a more open, social and semantically structured learning environment. Essentially, we aim to add semantic markup based on Schema.org vocabularies (the semantic markup that is supported by major search providers including Bing, Google, Yahoo! and Yandex), and integrate social networking and media to develop and enhance VLEs by improving sharing, discovering and reusing of learning contents.
In June 2011, the major search engines (Bing, Google, Yahoo! and Yandex) announced Schema.org. This PhD project also focuses on our proposal to Schema.org of additional concepts to describe VLE content with rich semantic information, given the limited support for describing educational resources in the current schema. This proposal aims to extend the previous work that has been included in the schema by the Learning Resource Metadata Initiative (LRMI) in order to provide an enhanced approach to describing learning contents with rich semantic data in a VLE context.
Through this thesis project, we will introduce, describe, evaluate and discuss the prototyped VLE in order to demonstrate the advantages of social and semantic web technologies for VLEs. We demonstrate how an advanced SCMS such as Drupal can offer advantages over existing VLE platforms in terms of sharing of learning content with social networks and provision of advanced media features. We also demonstrate how Drupal's support for Schema.org can be used to enhance the findability of on-line learning content, and propose enhancements to Schema.org that will make it more relevant to the needs of learning platforms. This proposal has been evaluated by Schema.org and LRMI, and a working group has been set up to take the proposal forward.
Ryman-Tubb Nick F., Krause Paul, Garn Wolfgang (2018) How Artificial Intelligence and machine learning research impacts payment card fraud detection: A survey and industry benchmark, Engineering Applications of Artificial Intelligence 76 pp. 130-157 Elsevier
The core goal of this paper is to identify guidance on how the research community can better transition their research into payment card fraud detection towards a transformation away from the current unacceptable levels of payment card fraud. Payment card fraud is a serious and long-term threat to society (Ryman-Tubb and d'Avila Garcez, 2010) with an economic impact forecast to be $416bn in 2017 (see Appendix A). The proceeds of this fraud are known to finance terrorism, arms and drug crime. Until recently the patterns of fraud (fraud vectors) have slowly evolved and the criminals' modus operandi (MO) has remained unsophisticated. Disruptive technologies such as smartphones, mobile payments, cloud computing and contactless payments have emerged almost simultaneously with large-scale data breaches. This has led to a growth in new fraud vectors, so that the existing methods for detection are becoming less effective. This in turn makes further research in this domain important. In this context, a timely survey of published methods for payment card fraud detection is presented with the focus on methods that use AI and machine learning. The purpose of the survey is to consistently benchmark payment card fraud detection methods for industry using transactional volumes in 2017. This benchmark will show that only eight methods have a practical performance to be deployed in industry despite the body of research. The key challenges in the application of artificial intelligence and machine learning to fraud detection are discerned. Future directions are discussed and it is suggested that a cognitive computing approach is a promising research direction while encouraging industry data philanthropy.
Improvements in communication technology mean that increasing numbers of people around the world can share information with increasing ease. This information is forming knowledge in forms that were not previously possible. It is enabling new communities to be formed.
This research aimed to determine how this data could be exploited and combined with additional complementary tools to enable automated large-scale non-intrusive monitoring of wildlife, and in particular keystone species.
Three proof-of-concept research studies explored automated camera traps, citizen science and large-scale crowdsourcing to determine the potential of a system that combines this technology and its use for automated monitoring of wild animals.
The results demonstrated that internet-connected camera traps are capable of collecting valuable visual data at a large-scale. However, for keystone species, such as tigers, the scale required for monitoring presents technical and economic challenges.
The participation of citizen scientists to collect and analyse data demonstrated a potential monitoring mechanism. However, the volume of data provided for such a focused practice proved insufficient for accurate large-scale monitoring.
The Wildsense project, which used publicly-available image data from the Web as its primary data source, demonstrated that there is additional data available that can be processed with the participation of citizen scientists. The popularity of and overall interest in this project showed that crowdsourcing is a viable method for collecting relevant data for animal monitoring.
It was concluded that the proof-of-concept experiments completed provided evidence that there is a potential to monitor individual animals through an automated approach and a system architecture is proposed. There is potential for automated large scale monitoring using the proposed framework. However, there are significant challenges to overcome and multiple directions for future work are recommended for exploration.
Many researchers and psychology specialists aim to develop educational applications and e-games which target cognitive abilities, behavioural and social skills for children with attention deficit hyperactivity disorder (ADHD). These applications apply certain learning strategies that might improve certain abilities and skills. They can be found easily in online stores, yet it is hard to judge the efficiency and desirability of each one unless they are evaluated and tested. For this reason, there was a need for a list of guidelines that could be used to assist in building learning systems with effective e-strategies for children with ADHD. The main objective of this work was to form a foundation that guides software developers in implementing effective educational applications to develop these children's abilities and skills. In addition, it may help educators and parents to distinguish between available applications. As our first stage of investigation, a meta-analytical review of multiple empirical studies was conducted that outlined the effective game features for the development of abilities and skills in children with ADHD. Five units of analysis were carried out separately, targeting attention, working memory, processing speed, behaviour, and social skills. The most significant and effective methods/features from the included studies were highlighted and used to draw up our list of guidelines. As the second stage, we investigated an existing e-game with certain game features to check if our guidelines apply, and evaluated its effectiveness in improving cognition, behaviour and social skills. Seventeen female students with ADHD, from two primary schools in Saudi Arabia, participated in the evaluation. They played the game in three sessions a week, for four months. Significant improvements were found in their cognition, behaviour, social skills and academic performance. As for our intervention, we validated the "e-socialization" component by developing and evaluating a social online tool for children with ADHD. Seven Saudi students with ADHD, aged between 6 and 8 years, participated in the evaluation. The intervention involved playing ACTIVATE mini games, and a chatting session after each game. Children showed fairly significant improvements in game scores. The online socialization tool was found to positively influence children's knowledge and experience exchange, motivation, and social skills. In conclusion, our list of guidelines might assist in building effective applications and games for children with ADHD, thereby aiding the process of improving their academic achievements, improving their cognition and behaviour, and supporting socialisation.