Lugano,

**Tuesday 27 July**

Speaker: Scott Ferson

These five approaches redress, or comprehensively solve, several major deficiencies
of Monte Carlo simulations and of standard probability theory in risk assessments.
For instance, it is almost always difficult, if not impossible, to completely
characterize precise distributions of all the variables in a risk assessment,
or the multivariate dependencies among the variables. As a result, in the
practical situations where empirical data are limiting, analysts are often
forced to make assumptions that can result in assessments that are arbitrarily
over-specified and therefore misleading. In practice, the assumptions typically
made in these situations, such as independence, (log)normality of distributions,
and linear relationships, can under

More fundamentally, it can be argued that probability theory has an inadequate
model of ignorance because it uses equiprobability as a model for incertitude
and thus cannot distinguish uniform risk from pure lack of knowledge. In most
practical risk assessments, some uncertainty is epistemic rather than aleatory,
that is, it is incertitude rather than variability. For example, uncertainty
about the shape of a probability distribution and most other instances of
model uncertainty are typically epistemic. Treating incertitude as though
it were variability is even worse than overspecification because it confounds
epistemic and aleatory uncertainty and leads to risk conclusions that are
simply wrong. The five approaches based on interval and imprecise probabilities
allow an analyst to keep these kinds of uncertainty separate and treat them
differently as necessary to maintain the interpretation of risk as the frequency
of adverse outcomes.

The five approaches also make backcalculations possible and practicable in
risk assessments. Backcalculation is required to compute cleanup goals, remediation
targets and performance standards from available knowledge and constraints
about uncertain variables. The needed calculations are notoriously difficult
with standard probabilistic methods and cannot be done at all with straightforward
Monte Carlo simulation, except by approximate, trial-and-error strategies.
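To see why backcalculation is not just the forward calculation run in reverse, here is a minimal sketch using plain intervals rather than full distributions. It assumes a hypothetical additive model (total = background + allowance) and made-up numbers; the function names are illustrative and are not taken from the tutorial software.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def add(a: Interval, b: Interval) -> Interval:
    """Ordinary interval addition."""
    return (a[0] + b[0], a[1] + b[1])


def sub(a: Interval, b: Interval) -> Interval:
    """Ordinary interval subtraction (widest possible difference)."""
    return (a[0] - b[1], a[1] - b[0])


def backcalc_add(a: Interval, c: Interval) -> Interval:
    """Largest interval b such that a + b stays inside c."""
    lo, hi = c[0] - a[0], c[1] - a[1]
    if lo > hi:
        raise ValueError("no solution: a is wider than the target c")
    return (lo, hi)


# Hypothetical numbers: a known background load and a target for the total.
background = (1.0, 2.0)
target = (0.0, 10.0)

naive = sub(target, background)          # forward arithmetic run "in reverse"
safe = backcalc_add(background, target)  # backcalculated allowance

print(naive, add(background, naive))  # the total overshoots the target
print(safe, add(background, safe))    # the total matches the target
```

With these numbers, ordinary interval subtraction yields an allowance that overshoots the target once the background is added back, while the backcalculated interval is narrower and reproduces the target.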

Although the five approaches arose from distinct scholarly traditions and
have many important differences, the tutorial emphasizes that they share a
commonality of purpose and employ many of the same ideas and methods. They
can be viewed as complementary, and they constitute a single perspective on
risk analysis that is sharply different from both traditional worst-case and
standard probabilistic approaches. Each approach is illustrated with a numerical
case study and summarized by a checklist of reasons to use, and not to use,
the approach.

The presentation style will be casual and interactive. Participants will receive
a CD of some demonstration software and the illustrations used during the
tutorial.

Overview of topics

What's missing from Monte Carlo?

Correlations are special cases of dependencies

Probability theory has an inadequate model of ignorance

Model uncertainty is epistemic rather than aleatory in nature

Backcalculation cannot be done with Monte Carlo methods

Interval probability

Conjunction and disjunction (ANDs and ORs)

Fréchet case (no assumption about dependence)

Mathematical programming solution

Case study 1: fault-tree for a pressurized tank system

Why and why not use interval probability

Robust Bayes and Bayesian sensitivity analysis

Bayes' rule and the joy of conjugate pairs

Dogma of Ideal Precision

Classes of priors and classes of likelihoods

Robustness and escaping subjectivity

Case study 2: extinction risk and conservation of pinnipeds

Why and why not use robust Bayes

Dempster-Shafer theory

Indistinguishability in evidence

Belief and plausibility

Convolution via the Cartesian product

Case study 3: reliability of dike construction

Case study 4: human health risk from ingesting PCB-contaminated waterfowl

Why and why not use Dempster-Shafer theory

Probability bounds analysis

Marrying interval analysis and probability theory

Fréchet case in convolutions

Case study 5: environmental exposure of wild mink to mercury contamination (or of birds
to an agricultural insecticide)

Backcalculation

Case study 6: planning cleanup for selenium contamination in San Francisco Bay

Why and why not use probability bounds analysis

Imprecise probabilities

Comparative probabilities

Closed convex sets of probability distributions

Multifurcation of the concept of independence

Case study 7: medical diagnosis

Why and why not use imprecise probabilities
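The conjunction and disjunction items listed under interval probability can be illustrated with a short sketch. It compares the Fréchet bounds, which assume nothing about dependence, with the result under an independence assumption, for two events whose probabilities are known only as intervals. The event probabilities and function names are invented for illustration.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower probability, upper probability)


def and_frechet(p: Interval, q: Interval) -> Interval:
    """Bounds on P(A and B) with no assumption about dependence."""
    return (max(0.0, p[0] + q[0] - 1.0), min(p[1], q[1]))


def or_frechet(p: Interval, q: Interval) -> Interval:
    """Bounds on P(A or B) with no assumption about dependence."""
    return (max(p[0], q[0]), min(1.0, p[1] + q[1]))


def and_indep(p: Interval, q: Interval) -> Interval:
    """Bounds on P(A and B) if A and B are assumed independent."""
    return (p[0] * q[0], p[1] * q[1])


def or_indep(p: Interval, q: Interval) -> Interval:
    """Bounds on P(A or B) if A and B are assumed independent."""
    return (1.0 - (1.0 - p[0]) * (1.0 - q[0]),
            1.0 - (1.0 - p[1]) * (1.0 - q[1]))


# Hypothetical interval probabilities for two component failures.
p_a = (0.25, 0.5)
p_b = (0.5, 0.75)

print("AND, Frechet:     ", and_frechet(p_a, p_b))
print("AND, independence:", and_indep(p_a, p_b))
print("OR,  Frechet:     ", or_frechet(p_a, p_b))
print("OR,  independence:", or_indep(p_a, p_b))
```

The independence intervals sit strictly inside the Fréchet intervals, which is one way to see how much of an answer's apparent precision can come from the dependence assumption alone.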

***

Slides of Scott's talk and exercises are available.

**Wednesday 28 July**

Speaker:

(i) *lower and upper previsions*; (ii) *sets of probability measures*; and (iii)
*sets of desirable gambles*. For each of these models we study their
interpretation in terms of behaviour, the rationality criteria of *avoiding
sure loss* and *coherence*, and the underlying mechanism, called *natural
extension*, that allows us to make deductions based on these models. We
also study their mutual relationships. We show that these models encompass
a number of popular uncertainty models and reasoning mechanisms extant in
the literature, such as classical propositional logic, Bayesian or precise
probabilities, 2-monotone lower probabilities, belief functions, and possibility
measures.
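One of the relationships between these models can be sketched numerically: a set of probability measures induces lower and upper previsions as the lower and upper envelopes of the corresponding expectations. The example below assumes a finite possibility space and a credal set given by a few hypothetical extreme points; because expectation is linear in the probability mass function, minimizing over the extreme points is enough to cover their convex hull.

```python
from typing import Sequence


def expectation(pmf: Sequence[float], gamble: Sequence[float]) -> float:
    """Expected value of a gamble under one probability mass function."""
    return sum(p * x for p, x in zip(pmf, gamble))


def lower_prevision(credal_set, gamble) -> float:
    """Lower envelope of the expectations over the credal set."""
    return min(expectation(pmf, gamble) for pmf in credal_set)


def upper_prevision(credal_set, gamble) -> float:
    """Conjugate upper prevision: upper(f) = -lower(-f)."""
    return -lower_prevision(credal_set, [-x for x in gamble])


# Hypothetical extreme points of a credal set on three states.
credal_set = [
    (0.5, 0.25, 0.25),
    (0.25, 0.5, 0.25),
    (0.25, 0.25, 0.5),
]
gamble = [4.0, 0.0, -4.0]  # reward in each state

low = lower_prevision(credal_set, gamble)
up = upper_prevision(credal_set, gamble)
print(low, up)
```

On the behavioural interpretation, the lower value is read as a supremum acceptable buying price for the gamble and the upper value as an infimum acceptable selling price.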

In the afternoon session, we move on to the notion of conditioning, and shed
more light on fundamental results such as the *Generalised Bayes Rule*
and the *Marginal Extension Theorem*. These lead to techniques, based
on the rationality criterion of coherence, that allow us to construct a conditional
model from an unconditional one, and to combine conditional and marginal models.
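A small numerical sketch of conditioning may help here. It assumes a finite credal set given by hypothetical extreme points and a conditioning event whose lower probability is positive; under that assumption, the conditional lower prevision can be obtained by applying Bayes' rule to each extreme point and taking the minimum. The last function checks the Generalised Bayes Rule equation directly: the lower prevision of the gamble's deviation from the conditional value, restricted to the event, should be zero.

```python
from typing import Sequence, Tuple


def conditional_expectation(pmf: Sequence[float], gamble: Sequence[float],
                            event: Tuple[int, ...]) -> float:
    """E[gamble | event] under a single probability mass function."""
    p_event = sum(pmf[i] for i in event)
    return sum(pmf[i] * gamble[i] for i in event) / p_event


def lower_conditional(credal_set, gamble, event) -> float:
    """Lower envelope of the conditional expectations (needs lower P(event) > 0)."""
    if min(sum(pmf[i] for i in event) for pmf in credal_set) <= 0.0:
        raise ValueError("lower probability of the event must be positive")
    return min(conditional_expectation(pmf, gamble, event) for pmf in credal_set)


def gbr_residual(credal_set, gamble, event, mu: float) -> float:
    """Lower prevision of I_event * (gamble - mu); zero at the GBR solution."""
    return min(sum(pmf[i] * (gamble[i] - mu) for i in event) for pmf in credal_set)


# Hypothetical extreme points of a credal set on three states.
credal_set = [
    (0.5, 0.25, 0.25),
    (0.25, 0.5, 0.25),
    (0.25, 0.25, 0.5),
]
event = (0, 1)            # we learn that the state is 0 or 1
gamble = [1.0, 0.0, 0.0]  # indicator of state 0

mu = lower_conditional(credal_set, gamble, event)
print(mu, gbr_residual(credal_set, gamble, event, mu))
```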

The classroom exercises are intended to allow the students to become more
familiar with the more theoretical notions discussed in the theory part.

***

Slides of Gert's talk and exercises are available.

**Thursday 29 July**

Speaker:

We begin with a discussion of canonical expected utility theory and topics
related to static (non-sequential) decisions. First, these include criteria
relating to coherence, avoiding sure loss – sometimes called “Book” – and
admissibility. Second, we will consider criteria relating to ordering assumptions.
Third, we will review results that do not require an Archimedean (or Continuity)
condition.

Following that, the discussion will focus on some criteria that affect sequential
decision theory, including equivalence of normal and extensive form decisions,
and various notions of “dynamic” coherence.

Next, we will examine what becomes of these same criteria with various decision
rules that apply when either probability or utility is allowed to go indeterminate.
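As an illustration of what a decision rule can look like once probability is indeterminate, the sketch below implements two rules that are commonly discussed in this literature, Γ-maximin and maximality. The abstract does not say which rules the lecture covers, so this choice is an assumption, as are the acts, payoffs and credal set, which are invented for the example.

```python
from typing import Dict, List, Sequence


def expectation(pmf: Sequence[float], payoff: Sequence[float]) -> float:
    return sum(p * x for p, x in zip(pmf, payoff))


def lower_expectation(credal_set, payoff) -> float:
    """Minimum expected payoff over the extreme points of the credal set."""
    return min(expectation(pmf, payoff) for pmf in credal_set)


def gamma_maximin(credal_set, acts: Dict[str, Sequence[float]]) -> str:
    """Pick the act with the best worst-case expected payoff."""
    return max(acts, key=lambda name: lower_expectation(credal_set, acts[name]))


def maximal_acts(credal_set, acts: Dict[str, Sequence[float]]) -> List[str]:
    """Keep act a unless some act b has lower expectation of (b - a) above zero."""
    keep = []
    for a, pay_a in acts.items():
        dominated = any(
            lower_expectation(credal_set, [y - x for x, y in zip(pay_a, pay_b)]) > 0.0
            for b, pay_b in acts.items() if b != a
        )
        if not dominated:
            keep.append(a)
    return keep


# Hypothetical problem: two states, P(state 1) only known to lie in [0.25, 0.75].
credal_set = [(0.25, 0.75), (0.75, 0.25)]
acts = {
    "a": (4.0, 0.0),
    "b": (0.0, 4.0),
    "c": (1.5, 1.5),
    "d": (1.0, 1.0),
}

print(gamma_maximin(credal_set, acts))
print(maximal_acts(credal_set, acts))
```

In this example Γ-maximin selects a single act, whereas maximality only rules out the act that is worse than another in every state and leaves the remaining three incomparable.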

The class will include some practice with tools for elicitation and for sequential
decision analysis.

**Friday 30 July**

Speakers:

*imprecise Dirichlet model*.

*Bayesian networks* are models to represent complex and uncertain relationships
between a large number of variables. They are based on an explicit representation
of independence relationships by means of a graph, and on procedures that exploit
the factorization associated with independence in order to produce fast inferences.
They have been very successful in building real applications, but one of their
main drawbacks is that it is very often necessary to give precise estimates
for a large number of probability values, sometimes with very small sample
sizes. *Credal networks* try to avoid this difficulty by allowing the
use of imprecise probabilities. We will review the work that has been done
on credal networks, focusing on two main points: inference (much more difficult than
with precise probability) and learning from data (in general, probabilistic
procedures are applied to learn the structure, and very few genuine methods
based on imprecision have been proposed).

More generally speaking, we will show that the task called *knowledge discovery
from data sets* can benefit from adopting imprecise probability methods.
Knowledge discovery typically assumes that data are the only source of information
about a domain, and aims at inferring models that make domain knowledge explicit.
Learning from data thus starts in conditions of prior ignorance, and the
data are often available only in incomplete form, such as when values are missing
in the data set, which involves another form of ignorance, this time about the
data themselves. Where *pattern classification* is concerned, the inferred
models are used in practice to do medical diagnosis, fraud detection, or image
recognition, just to name a few applications. Modeling ignorance carefully
is central to making these models and applications reliable. This issue
is closely related to the possibility of stating, and working with, weak assumptions.

Initially we will show how imprecise probability allows us to deal reliably
with incomplete data in a way that departs significantly from established
approaches. *Missing data* are a serious problem in knowledge discovery
applications and can severely limit the credibility of the inferred models.
Imprecise probability makes robust modeling of missing data possible by requiring
no assumptions about the mechanism that turns complete data into incomplete
data. The issues of learning from, and classifying, incomplete data will be
treated in a unified framework by a generalized updating rule. This will naturally
produce generalized classifiers, called *credal classifiers*, with the
novel characteristic of being able to (partially) suspend judgment when
there are reasonable doubts about the correct classification. Credal classifiers
will also be shown to deal carefully with the prior ignorance problem,
by relying on the imprecise Dirichlet model.
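The idea of suspending judgment can be sketched with the imprecise Dirichlet model applied to class counts alone. Under that model, a class observed n times out of N has lower probability n/(N+s) and upper probability (n+s)/(N+s), where s is a hyperparameter. The sketch below then drops a class only when another class's lower probability exceeds its upper probability (interval dominance). This is a deliberately simplified stand-in, not the dominance test used by the credal classifiers discussed in the lecture, and the counts and the choice s = 2 are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def idm_bounds(counts: Dict[str, int], s: float = 2.0) -> Dict[str, Tuple[float, float]]:
    """Lower and upper class probabilities under the imprecise Dirichlet model."""
    total = sum(counts.values())
    return {c: (n / (total + s), (n + s) / (total + s)) for c, n in counts.items()}


def undominated(bounds: Dict[str, Tuple[float, float]]) -> Set[str]:
    """Classes whose upper probability is not beaten by another class's lower one."""
    return {
        c for c, (_, up_c) in bounds.items()
        if not any(lo_o > up_c for o, (lo_o, _) in bounds.items() if o != c)
    }


# Hypothetical class counts from a larger and a very small sample.
enough_data = {"a": 6, "b": 2}
little_data = {"a": 2, "b": 1}

print(undominated(idm_bounds(enough_data)))  # a single class is returned
print(undominated(idm_bounds(little_data)))  # judgment is suspended between classes
```

With the larger sample the intervals separate and one class is returned; with the small sample they overlap, so both classes are kept, which is the set-valued output described above.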

Finally, we will focus on the practical design of credal classifiers. We will
consider the *naive Bayes*, *TAN*, and *C4.5* classifiers.
Naive Bayes and TAN are special cases of Bayesian networks, while C4.5 is
a classification tree. These are traditional classifiers, which are very popular
and widely recognized to be good in the knowledge discovery community. We
will review the extension of these models to credal classification, showing
how to infer them from data and to carry out the classification. Real case
studies will be presented to show the impact of credal classification.

***

Slides of Serafin's talk and exercises are available.

Slides of Marco's talk and exercises are available.

**Saturday 31 July**

Speaker:

***

Slides of Thomas' talk and summary lecture are available.

**Typical schedule**

08:30-10:30 Talk

10:30-11:00 Coffee break

11:00-13:00 Exercises

13:00-14:30 Lunch

14:30-16:30 Talk

16:30-17:00 Coffee break

17:00-19:00 Exercises