I graduated with a BSc degree in Mathematics from Royal Holloway, University of London and gained a PhD in Statistics also from Royal Holloway. In 2003 I joined the Department of Mathematics at the University of Surrey. I spent a period of time seconded to the Office of the Dean of Students and am currently a Reader in the Department of Mathematics.
My research interests are in the methodology of the statistical design and analysis of experiments. Experimental design is the process of determining the most appropriate use of resources, under a given set of circumstances, to obtain information to answer a set of experimental questions. My work combines theory and computation to solve problems typical in real experiments in engineering, agriculture and the pharmaceutical and manufacturing industries.
Current specific interests include:
- Planning an experiment which is robust against breakdown in the event of various patterns of observation loss, for a variety of design types;
- Construction of full and fractional factorial designs with single or double confounding, that is, with one or two forms of blocking, where main effects and selected interactions are required;
- Use of graph theory to obtain conditions on design properties and to inform and aid in design construction.
In 2020/21 I delivered the following modules and will be teaching the same modules in the 2021/22 academic year:
- MAT2002: General Linear Models
- MAT3021: Experimental Design.
Godolphin JD (2019) Construction of Row-Column Factorial Designs, Journal of the Royal Statistical Society: Series B
Godolphin JD (2017) Designs with Blocks of Size Two and Applications to Microarray Experiments, Annals of Statistics 46 (6A) pp. 2775-2805
Godolphin JD (2015) A Link between the E-value and the Robustness of Block Designs, Journal of the American Statistical Association
Digital computation is central to almost all scientific endeavour and has become integral to university physics education. Students collect experimental data using digital devices, process data using spreadsheets and graphical software, and develop scientific programming skills for modelling, simulation and computational work. Issues associated with the floating-point representation of numbers are rarely explored. In this article, problems of floating point are divided into three categories: significant-figure limits, propagation of floating-point representation error, and rounding. For each category, examples are presented of unexpected ways in which the digital representation of floating-point numbers can impact the veracity of scientific results. These examples cover aspects of classical dynamics, numerical integration, cellular automata, statistical analysis, and digital timing. Suggestions are made for curriculum enhancement and project-style investigations that reinforce the issues covered at a level suitable for physics undergraduate students.
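The three categories named in this abstract are easy to demonstrate in a few lines. The sketch below is illustrative only (the specific values are not taken from the article) and shows one standard example of each category in Python, where numbers are 64-bit IEEE 754 floats.

```python
# 1. Significant-figure limits: a 64-bit float carries roughly 15-17
#    significant decimal digits, so a small increment can vanish entirely.
big = 1e16
assert big + 1.0 == big  # 1.0 is below the spacing between floats near 1e16

# 2. Propagation of representation error: 0.1 has no exact binary
#    representation, and the tiny error accumulates under repeated addition.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)      # False
print(abs(total - 1.0))  # small but nonzero (~1e-16)

# 3. Rounding: values that look like exact ties are stored inexactly,
#    so they can round in the unexpected direction.
print(round(2.675, 2))   # 2.67, not 2.68: 2.675 is stored slightly low
```

Each print statement exposes a case where the displayed decimal value and the stored binary value disagree.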
Manual digital timing devices such as stopwatches are ubiquitous in the education sector for experimental work where automated electronic timing is unavailable or impractical. The disadvantage of manual timing is that the experimenter introduces an additional systematic error and random uncertainty to a measurement that hitherto could only be approximated and which masks useful information on uncertainty due to variations in the physical conditions of the experiment. A model for the reaction time of a timekeeper using a stopwatch for a single anticipated visual stimulus of the type encountered in physics experiments is obtained from a set of 4304 reaction times from timekeepers at swimming competitions. The reaction time is found to be well modelled by the normal distribution N(E, σ²) = N(0.11, 0.07²) in units of seconds, where E and σ² are the systematic error and variance for a single time measurement. Consistency between timekeepers is shown to be very good. The reaction time for a stopwatch-operated start and stop experiment can therefore be modelled by N(0, 0.10²), assuming that the average reaction time is the same in both cases. This makes a significant contribution to the uncertainty of most manually-timed measurements. This timing uncertainty can be subtracted out of the variation observed in repeat measurements in the real experiment to reveal the uncertainty solely associated with fluctuations in the physical conditions of the experiment.
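The variance arithmetic in this abstract can be checked numerically: the two press errors in a start-stop measurement add in variance, 2 × 0.07² ≈ 0.10², and subtracting that known timing variance from the observed spread recovers the physical spread. A Monte Carlo sketch, with an assumed physical fluctuation of 0.05 s chosen purely for illustration:

```python
import math
import random
import statistics

random.seed(1)

E, SIGMA = 0.11, 0.07  # systematic error and s.d. of one press (from the abstract)

# Start and stop presses: systematic errors cancel, variances add.
print(math.sqrt(2) * SIGMA)  # ~0.099, i.e. the 0.10 s quoted above

# Simulated repeat experiment: true duration fluctuates with s.d. 0.05 s
# (an assumed value, for illustration only).
PHYS_SD, TRUE_MEAN, N = 0.05, 10.0, 200_000
measured = []
for _ in range(N):
    true_t = random.gauss(TRUE_MEAN, PHYS_SD)
    start = random.gauss(E, SIGMA)   # start press is late by ~E
    stop = random.gauss(E, SIGMA)    # stop press is late by ~E
    measured.append((true_t + stop) - start)

obs_var = statistics.variance(measured)
timing_var = 2 * SIGMA**2
# Subtracting the known timing variance reveals the physical spread.
print(math.sqrt(obs_var - timing_var))  # ~0.05
```

The simulation confirms both claims: the start-stop timing error alone has standard deviation close to 0.10 s, and the variance subtraction isolates the assumed 0.05 s physical fluctuation.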
Robustness of binary incomplete block designs against giving rise to a disconnected design in the event of observation loss is investigated. A link is established between the E-value of a planned design and the extent of observation loss that can be experienced whilst still guaranteeing an eventual design from which all treatment contrasts can be estimated. Patterns of missing observations covered include loss of entire blocks and loss of individual observations. Simple bounds are provided enabling practitioners to easily assess the robustness of a planned design.
Robustness against design breakdown following observation loss is investigated for Partially Balanced Incomplete Block Designs with two associate classes (PBIBD(2)s). New results are obtained which add to the body of knowledge on PBIBD(2)s. In particular, using an approach based on the E-value of a design, all PBIBD(2)s with triangular and Latin square association schemes are established as having optimal block breakdown number. Furthermore, for group divisible designs not covered by existing results in the literature, a sufficient condition for optimal block breakdown number establishes that all members of some design sub-classes have this property.
Designs with blocks of size two have numerous applications. In experimental situations where observation loss is common, it is important for a design to be robust against breakdown. For designs with one treatment factor and a single blocking factor, with blocks of size two, conditions for connectivity and robustness are obtained using combinatorial arguments and results from graph theory. Lower bounds are given for the breakdown number in terms of design parameters. For designs with equal or near equal treatment replication, the concepts of treatment and block partitions, and of linking blocks, are used to obtain information on the number of blocks required to guarantee various levels of robustness. The results provide guidance for construction of designs with good robustness properties. Robustness conditions are also established for row-column designs in which one of the blocking factors involves blocks of size two. Such designs are particularly relevant for microarray experiments, where the high risk of observation loss makes robustness important. Disconnectivity in row-column designs can be classified into three types. Techniques are given to assess design robustness according to each type, leading to lower bounds for the breakdown number. Guidance is given for robust design construction. Cyclic designs and interwoven loop designs are shown to have good robustness properties.
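The graph-theoretic idea behind this abstract is that a design with blocks of size two is exactly a graph: treatments are vertices, blocks are edges, and all treatment contrasts are estimable precisely when the graph is connected. The breakdown number is then the smallest number of blocks whose loss disconnects it. A brute-force sketch on a small cyclic design (the design and the exhaustive search are illustrative, not the paper's method):

```python
from itertools import combinations

def is_connected(treatments, blocks):
    """Depth-first search over the treatment graph whose edges are the blocks."""
    adj = {t: set() for t in treatments}
    for a, b in blocks:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {treatments[0]}, [treatments[0]]
    while stack:
        for nbr in adj[stack.pop()]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(treatments)

def breakdown_number(treatments, blocks):
    """Smallest number of lost blocks that leaves a disconnected design."""
    for k in range(1, len(blocks) + 1):
        for lost in combinations(range(len(blocks)), k):
            kept = [b for i, b in enumerate(blocks) if i not in lost]
            if not is_connected(treatments, kept):
                return k

# Cyclic design on 5 treatments: blocks {i, i+1 mod 5} and {i, i+2 mod 5}.
treatments = list(range(5))
blocks = [(i, (i + 1) % 5) for i in range(5)] + [(i, (i + 2) % 5) for i in range(5)]
print(breakdown_number(treatments, blocks))  # 4: every treatment lies in 4 blocks
```

Here the breakdown number equals the replication of each treatment, which is the best possible: losing all blocks containing one treatment always disconnects it.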
For p two-level factors, designs comprising full replicates with runs in blocks of size two are investigated. The minimum number of replicates for estimation of all main effects and two-factor interactions is established and a construction method is developed based on replicate generators. Complete design classes are given in the minimum number of replicates for p ≤ 15. Designs in full replicates are used as root designs to obtain designs in fractional 2^(p−r) replicates, again to estimate main effects and two-factor interactions, and designs are recommended for p = 4, …, 15. Guidance is given on design construction when only a subset of the interactions is of interest.
The arrangement of 2^n factorials in row-column designs to estimate main effects and two-factor interactions is investigated. Single replicate constructions are given which enable estimation of all main effects and maximise the number of estimable two-factor interactions. Constructions and guidance are given for multi-replicate designs in single arrays and in multiple arrays. Consideration is given to constructions for 2^(n−t) fractional factorials.
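For readers unfamiliar with 2^n factorials, main-effect estimation works by simple ±1 contrasts over the runs. The sketch below is a generic illustration (the 2^3 design, the response model and its coefficients are assumed for the example, and the row-column blocking structure of the paper is not shown):

```python
from itertools import product

# The 8 runs of a full 2^3 factorial, each factor coded -1 / +1.
runs = list(product([-1, 1], repeat=3))

# Assumed true response, noise-free for clarity: y = 10 + 3*A - 2*B + 0*C.
def y(a, b, c):
    return 10 + 3 * a - 2 * b

# The contrast sum(x_j * y) / N recovers each factor's regression
# coefficient, because the ±1 columns of a full factorial are orthogonal.
effects = {}
for j, name in enumerate("ABC"):
    effects[name] = sum(run[j] * y(*run) for run in runs) / len(runs)
    print(name, effects[name])  # A 3.0, B -2.0, C 0.0
```

The orthogonality of the run columns is what the row-column constructions in the paper must preserve under blocking, so that these contrasts remain estimable.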
The problem of ascertaining conditions that ensure that an m-way design is connected has occupied the attention of research workers for very many years. One of the significant advances, as well as one of the earliest contributions, was provided by the classic work of J. N. Srivastava and D. A. Anderson in 1970, which gives a necessary and sufficient rank condition for an m-way design to be completely connected. In this article it is shown that the class of estimable parametric functions for an individual factor is derived directly from a simple extension of the Srivastava-Anderson result. This takes the form of a necessary and sufficient rank condition that is expressed in terms of the dimension of a segregated component of the kernel of the design matrix. The result has the interesting property that the connectivity status for all of the individual factors can be found simultaneously. Furthermore, it enables the formulation of several general results, which include the specification of conditions on designs exhibiting adjusted orthogonality. A number of examples are given to illustrate these results.
Two-level factorial designs are widely used in industry. For experiments involving n factors, the construction of designs comprising 2^n and 2^n
A recent article by Faux and Godolphin explored issues of floating-point error in situations relevant to classical dynamics, numerical integration, cellular automata, statistical analysis, and digital timing. Examples were given that were suitable for discussion and student project work. One of the examples explored the properties of an algorithm, described in an IBM Knowledge Center document, designed to convert a binary field representing the number of counts of a quartz oscillator to integers for digital display. In Ref. 1 it was demonstrated that the algorithm was vulnerable to rounding error resulting in an incorrect digital display. Investigations associated with this example form the focus of this note. The timing simulation results presented in Ref. 1 suggested that uncorrected rounding error in stopwatch timer displays could be impactful if used for precision timing, such as for race times or in experimental physics. Here we present and analyse race times obtained from swimming competitions. The data give a clear demonstration of anomalous stopwatch timing patterns, which can only be explained by rounding error. It is also shown that such rounding error can result in a set of times being wrongly ordered. In the context of a sporting event this could lead to the incorrect ranking of athletes and hence the incorrect awarding of race positions. As a spin-off of Ref. 1, this note may be of interest to educators, with the results providing a resource for discussion and the approach providing a template for additional student projects.
This paper considers the robustness of resolvable incomplete block designs in the event of two patterns of missing observations: loss of whole blocks and loss of whole replicates. The approach used to assess designs is based on the concept of block intersection which exploits the resolvability property of the design. This improves on methods using minimal treatment concurrence which have been used previously. It is shown that several classes of designs, including affine resolvable designs, square and rectangular lattice designs and two-category concurrence α-designs and αn-designs, are maximally robust; some of these classes of designs are also shown to be most replicate robust.
In swimming competitions, race times are generally measured using full electronic timing with backup times provided by semi-manual and manual (stopwatch) timing. The official race time is taken to be the electronic time. However, the reliability of the semi-manual and manual times are important in determining swimmer performance, since either can be used as the official race time, for example, if the electronic time is missing. National swimming governing bodies oversee the training of officials. Timekeeper training includes a practical assessment of performance at manual timekeeping conducted at the poolside. No consistent test criterion is applied to this practical assessment due to the absence of robust reaction time data upon which to base a test. The recent publication of a substantial data set of timekeeper reaction times has allowed the timing characteristics of trained timekeepers to be described by means of a timing profile. Consideration of timekeeper candidates with timing profiles consistent with that of a trained timekeeper and of candidates with significantly different timing profiles enables the properties of practical timekeeping tests to be evaluated. Two tests with good properties are proposed to assess performance at manual timekeeping. Further, a test to assess performance at semi-manual timekeeping is proposed. This is an aspect of timekeeping which has no practical assessment under the current training regime.
Criteria are proposed for assessing the robustness of a binary block design against the loss of whole blocks, based on summing entries of selected upper non-principal sections of the concurrence matrix. These criteria improve on the minimal concurrence concept that has been used previously and provide new conditions for measuring the robustness status of a design. The robustness properties of two-associate partially balanced designs are considered and it is shown that two categories of group divisible designs are maximally robust. These results expand a classic result in the literature, obtained by Ghosh, which established maximal robustness for the class of balanced block designs.
A set of measures is developed which indicate the robustness of a Balanced Incomplete Block Design (BIBD) against yielding a disconnected eventual design in the event of observation loss. The measures have uses as a pilot procedure and as a tool to aid in design selection in situations in which significant observation loss is thought possible. The measures enable non-isomorphic BIBDs with the same parameters to be ranked. Investigation of a class of BIBDs suggests there is some correspondence between robustness against becoming disconnected and rankings associated with A-efficiency.
Knowledge of the cardinality and the number of minimal rank reducing observation sets in experimental design is important information which makes a useful contribution to the statistician's tool-kit to assist in the selection of incomplete block designs. Its prime function is to guard against choosing a design that is likely to be altered to a disconnected eventual design if observations are lost during the course of the experiment. A method is given for identifying these observation sets based on the concept of treatment separation, which is a natural approach to the problem and provides a vastly more efficient computational procedure than a standard search routine for rank reducing observation sets. The properties of the method are derived and the procedure is illustrated by four applications which have been discussed previously in the literature.
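The treatment-separation method itself is not reproduced here, but the "standard search routine" it improves on is easy to sketch: enumerate observation subsets in order of size and test whether their loss disconnects the design. The toy design below (4 treatments in 4 blocks of size 2, arranged in a cycle) is assumed for illustration only, and the exhaustive search shows why a more efficient method matters for realistic designs.

```python
from itertools import combinations

# Each observation is a (treatment, block) pair.
obs = [(0, "b1"), (1, "b1"), (1, "b2"), (2, "b2"),
       (2, "b3"), (3, "b3"), (3, "b4"), (0, "b4")]

def connected(observations, n_treatments=4):
    """True if the treatment concurrence graph of the remaining observations
    is connected, i.e. all treatment contrasts are still estimable."""
    blocks = {}
    for t, b in observations:
        blocks.setdefault(b, set()).add(t)
    adj = {t: set() for t in range(n_treatments)}
    for members in blocks.values():
        for u in members:
            adj[u] |= members - {u}
    seen, stack = {0}, [0]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n_treatments

# Standard search: try all observation subsets of increasing size until
# some subset's loss disconnects the eventual design.
min_card = n_minimal = None
for k in range(1, len(obs) + 1):
    minimal = [s for s in combinations(obs, k)
               if not connected([o for o in obs if o not in s])]
    if minimal:
        min_card, n_minimal = k, len(minimal)
        print(min_card, n_minimal)  # 2 24
        break
```

For this cycle design the minimal rank reducing observation sets have cardinality 2 (any two observations taken from two different blocks), and there are 24 of them. The subset enumeration grows combinatorially with design size, which is the inefficiency the treatment-separation approach avoids.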
Objective: The aim of this study was to investigate the prevalence of pulmonary nodules at presentation in cases of soft tissue sarcoma (STS) in dogs with no previous thoracic imaging. Animals: Client-owned dogs with a histologic diagnosis of STS. Procedures: Dogs were retrospectively included in this study if the first thoracic imaging performed was at the time of presentation to our referral center. De novo and recurrent tumors were included, and information regarding tumor grade, history (primary mass vs scar vs recurrence), duration, location and size was also collected. Results: One hundred and forty-six dogs were included. Routine staging was performed with computed tomography (131 dogs, 89.7%) or 3-view thoracic radiographs (15 dogs, 10.3%). STS were grade 1 in 55.5% of dogs, grade 2 in 27.4% and grade 3 in 17.1%. Pulmonary nodules suggestive of metastasis were present in 11.7% of cases overall and in 6.5%, 5.6% and 37.5% of grade 1, grade 2 and grade 3 STS cases, respectively. Tumor grade (low/intermediate versus high) and tumor duration ( 3 months) were significantly associated with presence of pulmonary nodules at presentation. Conclusions and Clinical Relevance: This is the first large study reporting prevalence of pulmonary nodules at presentation in dogs with STS having had no previous thoracic imaging. The prevalence of pulmonary nodules suggestive of metastasis at presentation is low (
In experimental situations where observation loss is common, it is important for a design to be robust against breakdown. For incomplete block designs, with one treatment factor and a single blocking factor, conditions for connectivity and robustness are developed using the concepts of treatment and block partitions, and of linking blocks. Lower bounds are given for the block breakdown number in terms of parameters of the design and its support. The results provide guidance for construction of designs with good robustness properties.