A rating scale was developed to assess the contribution made by computer software towards the delivery of a quality consultation, with the purpose of informing the development of the next generation of systems. Two software programmes were compared, using this scale to test their ability to enable or inhibit the delivery of an ideal consultation with a patient with heart disease. The context was a general-practice-based, nurse-run clinic for the secondary prevention of heart disease. One of the programmes was customized for this purpose; the other was a standard general practice programme. Consultations were video-recorded, and then assessed by an expert panel using the new assessment tool. Both software programmes were oriented towards the implementation of the evidence, rather than facilitating patient-centred practice. The rating scale showed, not surprisingly, significantly greater support from the customized software in the consultation in five out of eight areas. However, the scale's reliability, measured by Cronbach's Alpha, was sub-optimal. With further refinement, this rating scale may become a useful tool that will inform software developers of the effectiveness of their programmes in the consultation, and suggest where they need development. © 2002 Informa UK Ltd All rights reserved.
We propose a unified formulation for the problem of
3D human pose estimation from a single raw RGB image
that reasons jointly about 2D joint estimation and 3D pose
reconstruction to improve both tasks. We take an integrated
approach that fuses probabilistic knowledge of 3D
human pose with a multi-stage CNN architecture and uses
the knowledge of plausible 3D landmark locations to refine
the search for better 2D locations. The entire process is
trained end-to-end, is extremely efficient and obtains state-of-the-art
results on Human3.6M, outperforming previous
approaches both on 2D and 3D errors.
Kusner MJ, Loftus J, Russell Christopher, Silva R (2017) Counterfactual Fairness, Advances in Neural Information Processing Systems 30 (NIPS 2017) pre-proceedings 30
Machine learning can impact people with legal or ethical consequences when it is used to automate decisions in areas such as insurance, lending, hiring, and predictive policing. In many of these scenarios, previous decisions have been made that are unfairly biased against certain subpopulations, for example those of a particular race, gender, or sexual orientation. Since this past data may be biased, machine learning predictors must account for this to avoid perpetuating or creating discriminatory practices. In this paper, we develop a framework for modeling fairness using tools from causal inference. Our definition of counterfactual fairness captures the intuition that a decision is fair towards an individual if it is the same in (a) the actual world and (b) a counterfactual world where the individual belonged to a different demographic group. We demonstrate our framework on a real-world problem of fair prediction of success in law school.
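The intuition in the abstract above (same decision in the actual world and in a counterfactual world) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the deterministic toy causal model X = U + SHIFT * A, the constant SHIFT, and the two hypothetical predictors.

```python
# Toy structural causal model (assumed for illustration, not from the paper):
#   X = U + SHIFT * A
# A is the protected attribute, U is a latent background variable, X is observed.
SHIFT = 2.0


def abduct_u(x, a):
    """Recover the latent U that is consistent with the observed (x, a)."""
    return x - SHIFT * a


def counterfactual_x(x, a, a_cf):
    """Feature value the same individual would have had under attribute a_cf."""
    return abduct_u(x, a) + SHIFT * a_cf


def predictor_on_x(x, a):
    """Hypothetical predictor that uses X directly, so it inherits the effect of A."""
    return 0.5 * x


def predictor_on_u(x, a):
    """Hypothetical predictor that uses only the inferred background variable U."""
    return 0.5 * abduct_u(x, a)


def is_counterfactually_fair(predictor, x, a, a_cf, tol=1e-9):
    """Compare the prediction in the actual world with the counterfactual world."""
    actual = predictor(x, a)
    counterfactual = predictor(counterfactual_x(x, a, a_cf), a_cf)
    return abs(actual - counterfactual) <= tol


x_obs, a_obs = 3.0, 1
print(is_counterfactually_fair(predictor_on_x, x_obs, a_obs, a_cf=0))
print(is_counterfactually_fair(predictor_on_u, x_obs, a_obs, a_cf=0))
```

In this toy model the predictor built on X changes its output when the attribute is flipped, while the predictor built on the inferred U does not. Real causal models are usually probabilistic, in which case the comparison would be between distributions of predictions rather than single values.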
Deep generative models provide powerful tools for distributions over complicated manifolds, such as those of natural images. But many of these methods, including generative adversarial networks (GANs), can be difficult to train, in part because they are prone to mode collapse, which means that they characterize only a few modes of the true distribution. To address this, we introduce VEEGAN, which features a reconstructor network, reversing the action of the generator by mapping from data to noise. Our training objective retains the original asymptotic consistency guarantee of GANs, and can be interpreted as a novel autoencoder loss over the noise. In sharp contrast to a traditional autoencoder over data points, VEEGAN does not require specifying a loss function over the data, but rather only over the representations, which are standard normal by assumption. On an extensive set of synthetic and real world image datasets, VEEGAN indeed resists mode collapsing to a far greater extent than other recent GAN variants, and produces more realistic samples.
Machine learning is now being used to make crucial decisions about people's lives.
For nearly all of these decisions there is a risk that individuals of a certain race,
gender, sexual orientation, or any other subpopulation are unfairly discriminated
against. Our recent method has demonstrated how to use techniques from counterfactual
inference to make predictions fair across different subpopulations. This
method requires that one provides the causal model that generated the data at hand.
In general, validating all causal implications of the model is not possible without
further assumptions. Hence, it is desirable to integrate competing causal models to
provide counterfactually fair decisions, regardless of which causal "world" is the
correct one. In this paper, we show how it is possible to make predictions that are
approximately fair with respect to multiple possible causal models at once, thus
mitigating the problem of exact causal specification. We frame the goal of learning
a fair classifier as an optimization problem with fairness constraints entailed by
competing causal explanations. We show how this optimization problem can be
efficiently solved using gradient-based methods. We demonstrate the flexibility of
our model on two real-world fair classification problems. We show that our model
can seamlessly balance fairness in multiple worlds with prediction accuracy.
Submodular extensions of an energy function can be used to efficiently compute approximate marginals via variational inference. The accuracy of the marginals depends crucially on the quality of the submodular extension. To identify the best possible extension, we show an equivalence between the submodular extensions of the energy and the objective functions of linear programming (LP) relaxations for the corresponding MAP estimation problem. This allows us to (i) establish the worst-case optimality of the submodular extension for Potts model used in the literature; (ii) identify the worst-case optimal submodular extension for the more general class of metric labeling; and (iii) efficiently compute the marginals for the widely used dense CRF model with the help of a recently proposed Gaussian filtering method. Using synthetic and real data, we show that our approach provides comparable upper bounds on the log-partition function to those obtained using tree-reweighted message passing (TRW) in cases where the latter is computationally feasible. Importantly, unlike TRW, our approach provides the first practical algorithm to compute an upper bound on the dense CRF model.
This paper proposes new search algorithms for counterfactual explanations based upon mixed integer programming. We are concerned with complex data in which variables may take any value from a contiguous range or an additional set of discrete states. We propose a novel set of constraints that we refer to as a "mixed polytope" and show how this can be used with an integer programming solver to efficiently find coherent counterfactual explanations, i.e. solutions that are guaranteed to map back onto the underlying data structure, while avoiding the need for brute-force enumeration. We also look at the problem of diverse explanations and show how these can be generated within our framework.
There has been much discussion of the "right to explanation" in the EU General Data Protection Regulation, and its existence, merits, and disadvantages. Implementing a right to explanation that opens the "black box" of algorithmic decision-making faces major legal and technical barriers. Explaining the functionality of complex algorithmic decision-making systems and their rationale in specific cases is a technically challenging problem. Some explanations may offer little meaningful information to data subjects, raising questions around their value. Data controllers have an interest in not disclosing information about their algorithms that contains trade secrets, violates the rights and freedoms of others (e.g. privacy), or allows data subjects to game or manipulate decision-making. Explanations of automated decisions need not hinge on the general public understanding how algorithmic systems function. Even though interpretability is of great importance and should be pursued, explanations can, in principle, be offered without opening the black box. Looking at explanations as a means to help a data subject act rather than merely understand, one can gauge the scope and content of explanations according to the specific goal or action they are intended to support. From the perspective of individuals affected by automated decision-making, we propose three aims for explanations: (1) to inform and help the individual understand why a particular decision was reached, (2) to provide grounds to contest the decision if the outcome is undesired, and (3) to understand what could be changed to receive a desired result in the future, based on the current decision-making model. We assess how each of these goals finds support in the GDPR, and the extent to which they hinge on opening the "black box". We suggest data controllers should offer a particular type of explanation, "unconditional counterfactual explanations", to support these three aims.
These counterfactual explanations describe the smallest change to the world that would obtain a desirable outcome, or arrive at a "close possible world." As multiple variables or sets of variables can lead to one or more desirable outcomes, multiple counterfactual explanations can be provided, corresponding to different choices of nearby possible worlds for which the counterfactual holds. Counterfactuals describe a dependency on the external facts that lead to that decision without the need to convey the internal state or logic of an algorithm. As a result, counterfactuals serve as a minimal solution that bypasses the current technical limitations of interpretability, while striking a balance between transparency and the rights and freedoms of others (e.g. privacy, trade secrets).
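As a rough illustration of "the smallest change that would obtain a desirable outcome", the sketch below computes the closest counterfactual for a linear decision rule. The linear rule, the Euclidean notion of "smallest", the function names, and the example weights are all assumptions chosen for illustration; the abstract does not prescribe a particular model or distance.

```python
def decision_score(x, weights, bias):
    """Linear decision rule (assumed): the outcome is favourable when the score is >= 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def closest_counterfactual(x, weights, bias, margin=1e-6):
    """Smallest Euclidean change to x that makes the score reach `margin`.

    For a linear rule this is a step along the weight vector, i.e. a
    projection just past the decision boundary. Requires a non-zero weight vector.
    """
    score = decision_score(x, weights, bias)
    if score >= 0:
        return list(x)  # already favourable, nothing needs to change
    norm_sq = sum(w * w for w in weights)
    step = (margin - score) / norm_sq
    return [xi + step * w for xi, w in zip(x, weights)]


# Hypothetical applicant described by two numeric features.
weights, bias = [0.5, 1.0], -4.0
applicant = [4.0, 1.0]

counterfactual = closest_counterfactual(applicant, weights, bias)
changes = [cf - xi for cf, xi in zip(counterfactual, applicant)]
print(counterfactual)
print(changes)
```

The list of changes reads as a statement of the form "had these features been this much higher, the decision would have been favourable", without exposing anything about the model beyond that dependency. Other distance measures would generally yield different, possibly sparser, counterfactuals.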
We propose a CNN-based approach for multi-camera markerless motion capture of the human body. Unlike existing methods that first perform pose estimation on individual cameras and generate 3D models as post-processing, our approach makes use of 3D reasoning throughout a multi-stage approach. This novelty allows us to use provisional 3D models of human pose to rethink where the joints should be located in the image and to recover from past mistakes. Our principled refinement of 3D human poses lets us make use of image cues, even from images where we previously misdetected joints, to refine our estimates as part of an end-to-end approach. Finally, we demonstrate how the high-quality output of our multi-camera setup can be used as an additional training source to improve the accuracy of existing single camera models.
Mittelstadt Brent, Russell Christopher, Wachter Sandra (2019) Explaining Explanations in AI, FAT* '19 Proceedings of the Conference on Fairness, Accountability, and Transparency pp. 279-288
Recent work on interpretability in machine learning and AI has focused on the building of simplified models that approximate the true criteria used to make decisions. These models are a useful pedagogical device for teaching trained professionals how to predict what decisions will be made by the complex system, and most importantly how the system might break. However, when considering any such model it's important to remember Box's maxim that "All models are wrong but some are useful." We focus on the distinction between these models and explanations in philosophy and sociology. These models can be understood as a "do it yourself kit" for explanations, allowing a practitioner to directly answer "what if questions" or generate contrastive explanations without external assistance. Although a valuable ability, giving these models as explanations appears more difficult than necessary, and other forms of explanation may not have the same trade-offs. We contrast the different schools of thought on what makes an explanation, and suggest that machine learning might benefit from viewing the problem more broadly.
When an individual purchases a home, they simultaneously purchase its structural features, its accessibility to work, and the neighborhood amenities. Some amenities, such as air quality, are measurable whilst others, such as the prestige or the visual impression of a neighborhood, are difficult to quantify. Despite the well-known impacts intangible housing features have on house prices, limited attention has been given to systematically quantifying these difficult-to-measure amenities. Two issues have led to this neglect. Not only do few quantitative methods exist that can measure the urban environment, but the collection of such data is also both costly and subjective. We show that street image and satellite image data can capture these urban qualities and improve the estimation of house prices.
We propose a pipeline that uses a deep neural network model to automatically extract visual features from images to estimate house prices in London, UK. We make use of traditional housing features such as age, size and accessibility as well as visual features from Google Street View images and Bing aerial images in estimating the house price model. We find encouraging results where learning to characterize the urban quality of a neighborhood improves house price prediction, even when generalizing to previously unseen London boroughs.
We explore the use of non-linear vs. linear methods to fuse these cues with conventional models of house pricing, and show how the interpretability of linear models allows us to directly extract the visual desirability of neighborhoods as proxy variables that are both of interest in their own right, and could be used as inputs to other econometric methods. This is particularly valuable as once the network has been trained with the training data, it can be applied elsewhere, allowing us to generate vivid dense maps of the desirability of London streets.
We introduce the first approach to solve the challenging problem of unsupervised 4D visual scene understanding for complex dynamic scenes with multiple interacting people from multi-view video. Our approach simultaneously estimates a detailed model that includes a per-pixel semantically and temporally coherent reconstruction, together with instance-level segmentation exploiting photo-consistency, semantic and motion information. We further leverage recent advances in 3D pose estimation to constrain the joint semantic instance segmentation and 4D temporally coherent reconstruction. This enables per-person semantic instance segmentation of multiple interacting people in complex dynamic scenes. Extensive evaluation of the joint visual scene understanding framework against state-of-the-art methods on challenging indoor and outdoor sequences demonstrates a significant (≈ 40%) improvement in semantic segmentation, reconstruction and scene flow accuracy.
We present a novel data-driven regularizer for weakly-supervised learning of 3D
human pose estimation that eliminates the drift problem that affects existing approaches.
We do this by moving the stereo reconstruction problem into the loss of the network
itself. This avoids the need to reconstruct 3D data prior to training and, unlike previous
semi-supervised approaches, avoids the need for a warm-up period of supervised training.
The conceptual and implementational simplicity of our approach is fundamental to its
appeal. Not only is it straightforward to augment many weakly-supervised approaches
with our additional re-projection based loss, but it is obvious how it shapes reconstructions
and prevents drift. As such we believe it will be a valuable tool for any researcher working
in weakly-supervised 3D reconstruction. Evaluating on Panoptic, the largest multi-camera
and markerless dataset available, we obtain an accuracy that is essentially indistinguishable
from a strongly-supervised approach making full use of 3D ground truth in training.
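A re-projection based loss of the general kind mentioned above can be sketched as follows. This is only an illustration under stated assumptions: ideal pinhole cameras with identity rotation, a single joint, hypothetical function names, and a plain mean squared 2D error. The abstract does not give the exact form of the loss used in the paper.

```python
def project(point, camera_center, focal=1.0):
    """Pinhole projection of a 3D point into a camera with identity rotation (assumed)."""
    x = point[0] - camera_center[0]
    y = point[1] - camera_center[1]
    z = point[2] - camera_center[2]
    return (focal * x / z, focal * y / z)


def reprojection_loss(pred_3d, camera_centers, detections_2d, focal=1.0):
    """Mean squared distance between the projected 3D estimate and each 2D detection."""
    total = 0.0
    for center, (u_obs, v_obs) in zip(camera_centers, detections_2d):
        u, v = project(pred_3d, center, focal)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total / len(camera_centers)


# Two hypothetical cameras one unit apart, both observing the same joint.
cameras = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true_joint = (0.5, 0.2, 4.0)
detections = [project(true_joint, c) for c in cameras]

print(reprojection_loss(true_joint, cameras, detections))
print(reprojection_loss((0.5, 0.2, 5.0), cameras, detections))
```

Because the loss is computed from 2D detections in several views, a 3D estimate with the wrong depth is penalized even though no 3D ground truth is supplied, which is the sense in which the multi-view geometry sits inside the loss rather than in a separate reconstruction step.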