MULTIPLE MODEL INFERENCE IN MANAGEMENT
In science, we need
to consider multiple alternative explanations for any phenomenon. Here, we are specifically
interested in multiple models because of their role in affecting optimal
decision making, and we’ll see some concrete examples that should make some of
the points raised in the morning workshop a little clearer.
USING MODELS AS ALTERNATIVE HYPOTHESES
I’m going to illustrate some of these points with an example from my
experience with the development of adaptive harvest management (AHM) models for
American black ducks (Anas rubripes), but the lessons should be generally
applicable to many systems. If you wish
to get more details on black duck AHM, you can consult the website that we maintain
for the project. Please note that in the
specific examples below, I have taken some liberties with the actual data
results (parameter estimates, model weights, etc.). I’ve done this both to
simplify the problem, and to make specific points that would have been
confusing if the actual analyses were used.
I’ll boil down the main “model” issues for black duck AHM as follows.
CONSTRUCTING THE MODELS
In the black duck problem, we first had to develop models that expressed our alternative views about what makes black duck populations tick. We are particularly interested in the views that say different things about the impacts of management on black duck populations.
This boils down to two sub-models of black duck dynamics: (1) a production sub-model and (2) a survival sub-model. The production sub-model describes the effects of black duck abundance (density dependence) and mallard numbers (competition) on fall age ratios for black ducks. The general form of this model is

$$\ln(A_t) = \beta_0 + \beta_1 N_t + \beta_2 M_t + \beta_3 T_t$$

where $N_t$ and $M_t$ are the numbers (in 100 thousands) of black ducks and mallards in the breeding population in year $t$, $A_t$ is the predicted age ratio for black ducks in the fall population, and $T_t$ is a time index (describing a downward trend in productivity through time). The above model assumes both density dependence and competition; by setting $\beta_2 = 0$ we obtain the alternative reproduction model, which does not include competition.
We fit both models to historical survey data and obtained parameter estimates and AIC values (see the first spreadsheet tab, “estimates”). The AIC values can be used for two purposes. First, they can be used to compute model-averaged parameter estimates. For example, the weighted estimate and unconditional standard error for a coefficient that appears in both models is -0.14563 (SE=0.04683). Second, the AIC weights give a measure of the current (based on data to this point) belief in each model. Based on our analyses, we give the “no competition” model a weight of 0.41 and the “competition” model a weight of 0.59.
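As a minimal sketch of how AIC values turn into model weights, the snippet below applies the usual Akaike-weight formula (each model's relative likelihood is exp(-ΔAIC/2), normalized to sum to 1). The AIC values are hypothetical, not the spreadsheet's; they are chosen only so that their difference of 0.7 gives weights close to the 0.41 and 0.59 quoted above.

```python
import math

def akaike_weights(aic):
    """Convert a dict of AIC values into Akaike weights that sum to 1."""
    best = min(aic.values())
    # Relative likelihood of each model: exp(-delta_AIC / 2)
    rel = {name: math.exp(-(value - best) / 2.0) for name, value in aic.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}

# Hypothetical AIC values (assumption); only their difference matters.
aic = {"no_competition": 100.7, "competition": 100.0}
weights = akaike_weights(aic)
for name, w in weights.items():
    print(f"{name}: {w:.2f}")
```

Only differences in AIC affect the weights, so adding the same constant to every AIC value leaves them unchanged.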
On the survival side, the key issue of contention is the impact of harvest on survival. This can be neatly summarized by the sub-model

$$S_t = S_0 - b\,h_t$$

which assumes an additive impact of harvest mortality; that is, starting at $S_0$, each additional increment of harvest mortality $h_t$ is assumed to decrease annual survival by $b$. The extreme forms of this specify $b = 1$ (completely additive mortality, AMH) and $b = 0$ (completely compensatory mortality, CMH, at least up to a threshold; we will assume that harvest rates never exceed this threshold).

For a number of technical reasons, the empirical estimates of $S_0$ and $b$ are unsatisfactory, and here we use values for these that are (1) based on life history characteristics of black ducks, and (2) assume either complete compensation or additivity. These assumptions result in

$$S_t = 0.6 - h_t \quad \text{(AMH)}$$

$$S_t = 0.6 \quad \text{(CMH)}$$
Further, because of the technical issues alluded to, we do not have reliable, empirical weights for these 2 models (Conroy et al. 2002 came up with some weights, which put almost all weight on AMH, but we will not use these here). Given the contentious nature of harvest, and depending on the data used to “test” AMH vs. CMH, one could derive weights over nearly the entire range of 0 (no evidence for AMH, perfect for CMH) to 1 (the reverse). In such a circumstance the best thing may be to take weights that spread uncertainty evenly among the models, so 0.5 for each model in this case.
Under the assumption that the evidence in favor of competition/no competition is independent of that in favor of AMH/CMH, we can simply multiply the corresponding model weights: e.g., “no competition-AMH” = 0.41 × 0.5, etc. We can now summarize the 4 competing hypotheses and their weights this way:
Combined models

| Model | Mallards | Harvest | Weight |
|---|---|---|---|
| 1 | no_comp | cmh | 0.206691 |
| 2 | no_comp | amh | 0.206691 |
| 3 | compet | cmh | 0.293309 |
| 4 | compet | amh | 0.293309 |
Notice that instead of
trying to “reject” any of these alternative models, we are keeping all 4 of
them, but applying weights of belief / evidence to each. In this way, we can take the next step:
predicting under each model, and comparing these predictions to new data.
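The weight multiplication behind the combined-model table can be sketched as follows. The production weights 0.413382 and 0.586618 are back-calculated from the tabulated combined weights (the text rounds them to 0.41 and 0.59); the dictionary names are illustrative only.

```python
from itertools import product

# Marginal weights for each sub-model (production weights back-calculated
# from the combined table; the text rounds them to 0.41 and 0.59).
production = {"no_comp": 0.413382, "compet": 0.586618}
survival = {"cmh": 0.5, "amh": 0.5}

# Under independence, the weight of a combined model is the product
# of its production weight and its survival weight.
combined = {
    (p, s): wp * ws
    for (p, wp), (s, ws) in product(production.items(), survival.items())
}

for (p, s), w in combined.items():
    print(f"{p:8s} {s:4s} {w:.6f}")
```

Because each set of marginal weights sums to 1, the four combined weights also sum to 1.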
PREDICTION
Now we will take the above models and apply them to predicting next year’s population size of black ducks, given observed current levels of black ducks and mallards, habitat conditions, and harvest rates. Whereas earlier we fit data to an estimation model, now we are going to take estimated parameters and apply them to predicting age ratios, survival rates, and population size from current conditions. This will involve 3 steps. First, we need a prediction equation for production,

$$\ln(\tilde{A}_t) = \hat\beta_0 + \hat\beta_1 N_t + \hat\beta_2 M_t + \hat\beta_3 T_t$$

which predicts the natural log of age ratio, with coefficient estimates $\hat\beta_0, \dots, \hat\beta_3$ and predictors $N_t$, $M_t$, and $T_t$ (black ducks, mallards, and the time index); note that this prediction will be different under each model! Once we get $\ln(\tilde{A}_t)$ we easily get the predicted age ratio by the exponential function, $\tilde{A}_t = \exp[\ln(\tilde{A}_t)]$. Finally, note that we’re using ‘~’ to stand for prediction, to distinguish this from estimation ‘^’: under estimation, we fit the equation to data, while under prediction we apply the estimated equation to predicting the response, based on specific values of the predictors.
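For the survival portion we have simply

$$\tilde{S}_t = S_0 - b\,h_t$$

where $S_0$ is survival without harvest, $h_t$ is the harvest mortality rate, and $b$=1 under AMH and $b$=0 under CMH. Finally, we put the production and survival predictions together to predict population size next year as

$$\tilde{N}_{t+1} = N_t\,(1 + \tilde{A}_t)\,\tilde{S}_t,$$

under each of the 4 models (combinations of production and survival assumptions).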
For example (prediction tab), take current conditions of 400,000 black ducks, 300,000 mallards, and a time index of 32 (this essentially assumes late 1990’s habitat conditions, and is effectively treated as an intercept term), so $N_t = 4$, $M_t = 3$, and $T_t = 32$ (abundances in 100 thousands). We will also assume a harvest mortality rate of 0.2 ($h_t = 0.2$).
First, these values would be used in each of the alternative production models to predict age ratios of 0.9749 under the “no competition” model and 0.9534 under the “competition” model. Likewise, for each of the compensation models, we have survival predictions: 0.6 under CMH, and 0.4 under AMH. These are put together into the combined models to predict abundance next year as
| Model | Production | Survival | Predicted N (100 thousands) |
|---|---|---|---|
| 1 | No compet. | CMH | 4.73976088 |
| 2 | No compet. | AMH | 3.15984059 |
| 3 | Compet. | CMH | 4.68818457 |
| 4 | Compet. | AMH | 3.12545638 |
Note that because
the predictions are based on statistical estimates, they have standard errors.
Also, other forms of uncertainty (including environmental uncertainty) will add
to statistical uncertainty. The methods for incorporating all these sources of
uncertainty are complex; here we will assume that prediction error is
proportional to the size of the prediction, by a constant coefficient of
variation, which is initially 0.2 (20%).
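The predictions in the table can be reproduced, to rounding, with the short sketch below. Two things here are inferred from the tabulated numbers rather than stated outright, so treat them as assumptions: the projection form N(t+1) = N(t) × (1 + age ratio) × survival, and a no-harvest survival of 0.6 shared by both survival models. The variable names are illustrative.

```python
# Current conditions from the example (abundance in units of 100,000 birds)
n_now = 4.0      # 400,000 black ducks
harvest = 0.2    # harvest mortality rate
s0 = 0.6         # survival without harvest (assumption: implied by the CMH prediction)
cv = 0.2         # coefficient of variation assumed for prediction error

# Predicted fall age ratios quoted in the text for each production model
age_ratio = {"no_comp": 0.9749, "compet": 0.9534}
# Slope on harvest: 0 = compensatory (CMH), 1 = additive (AMH)
slope = {"cmh": 0.0, "amh": 1.0}

predictions = {}
for prod, a in age_ratio.items():
    for surv, b in slope.items():
        s = s0 - b * harvest               # predicted annual survival
        n_next = n_now * (1.0 + a) * s     # projection form inferred from the table (assumption)
        predictions[(prod, surv)] = (n_next, cv * n_next)  # (prediction, prediction SD)

for key, (n_next, sd) in predictions.items():
    print(key, round(n_next, 4), round(sd, 4))
```

The small differences from the table in the competition rows come from using the age ratio rounded to four decimals.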
Finally, we can, if we wish, obtain a single prediction, which averages over the alternative models. This is done in much the same way as model averaging of parameter estimates. In this case, the model-averaged prediction, obtained in columns S and T of the spreadsheet, is about 392,000 (SE=302,000). Note that this SE is quite large, reflecting the fact that there is a lot of uncertainty (due to both model and statistical uncertainty).
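A minimal sketch of the model-averaged point prediction is just a weighted sum of the four model predictions, using the combined model weights. Only the point estimate is computed here; the spreadsheet's standard error calculation is not reproduced.

```python
# Combined model weights and predicted abundances (100 thousands), models 1-4
weights = [0.206691, 0.206691, 0.293309, 0.293309]
predicted = [4.73976088, 3.15984059, 4.68818457, 3.12545638]

# Model-averaged prediction: each prediction weighted by belief in its model
avg = sum(w * n for w, n in zip(weights, predicted))
print(f"{avg:.2f} (x 100,000 birds)")
```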
I’ve coded this particular example into the spreadsheet (prediction tab). You can vary the inputs by changing the numbers of black ducks and mallards and the harvest rates (highlighted in yellow) and see how this affects predictions.
COMPARING PREDICTIONS TO OBSERVATIONS: PREDICTION LIKELIHOOD
Predicting outcomes
under alternative conditions is a good thing to do, and we’ll do more of it as
we get into decision analysis, where we see how these predictions can be used
to guide management. Here we’re going to skip to the next step, which is how to
incorporate new information from surveys into our models. That is, once we’ve made some predictions, we
are (hopefully) going to continue to monitor the system to see how well our
predictions match up with observations we obtain via our monitoring
programs. To see how this might work, consider the example which we just set up, in which we observed 400,000 black ducks, 300,000 mallards, and harvest rates were 0.2. We then made predictions under each of our 4 models, which ranged from about 313,000 (Competition, AMH) to 474,000 (No competition, CMH).
Suppose that next year we observe 370,000 black ducks in the surveys (column N, spreadsheet, likelihood tab). Look at the first figure in this spreadsheet, which compares predicted (under each model) to observed numbers of black ducks. Eyeballing this figure, we can see that 2 of the 4 models (2 and 4) seem to agree better with the observation than the other 2. What we need is a formal way of measuring this agreement of observations to predictions that takes into account prediction error. That is, we’re not necessarily going to get too excited if a model is off a bit on its predictions, if there was a lot of prediction error. We can do this formally by means of a likelihood, which now has a very similar meaning to the likelihoods we refer to in maximum likelihood estimation, except now we are talking about the likelihood of an observation (the survey value) given the prediction of a particular model.
Here we’ll use a likelihood form that is based on the normal distribution, and is nice because it’s very easy to see how it works. The general expression for the likelihood of the observation under model i is:

$$L_i(x) = \exp\left[-\left(\frac{x - \tilde{N}_i}{\sigma_i}\right)^2\right]$$

where $x$ is the observed value, $\tilde{N}_i$ is the prediction under model i, and $\sigma_i$ is the prediction error referred to earlier (here, 0.2 times the prediction).

Note that if the observed value exactly matches the prediction, the likelihood will be 1; in all other cases the values will be highest for the models that come closest to predicting the observed value, and vice-versa. In our case the observed value is 3.7, and the likelihood under model 1 (No competition, CMH) is

$$L_1(3.7) = \exp\left[-\left(\frac{3.7 - 4.7398}{0.2 \times 4.7398}\right)^2\right] = 0.3003.$$
A similar procedure is
used to obtain the likelihoods under the other 3 models, so that
| Model | Likelihood |
|---|---|
| 1 | 0.30026831 |
| 2 | 0.48164109 |
| 3 | 0.32931995 |
| 4 | 0.4296389 |
From this, we can see that in this example, the observation of 370,000 black ducks is most likely under model 2 (no competition, AMH) and least likely under model 1 (no competition, CMH). However, all the models have decent likelihood weights, and so do need to be considered in management.
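The tabulated likelihoods can be reproduced with the sketch below. It assumes a prediction SD of 0.2 times each prediction and a normal-type kernel of the form exp(-((observed - predicted) / SD)^2). That form is inferred from the tabulated values; it omits the factor of 1/2 found in the usual normal density, so treat it as an assumption about what the spreadsheet does rather than a general recommendation.

```python
import math

def prediction_likelihood(observed, predicted, cv=0.2):
    """Normal-type likelihood of an observation given a model prediction.

    Equals 1 when the observation matches the prediction exactly.
    The kernel (no factor of 1/2) is inferred from the tabulated values.
    """
    sd = cv * predicted  # prediction error proportional to the prediction
    return math.exp(-((observed - predicted) / sd) ** 2)

# Predicted abundances (100 thousands) for models 1-4, from the prediction table
predicted = {1: 4.73976088, 2: 3.15984059, 3: 4.68818457, 4: 3.12545638}
observed = 3.7  # 370,000 black ducks observed the next year

likelihood = {m: prediction_likelihood(observed, p) for m, p in predicted.items()}
for m, value in likelihood.items():
    print(m, round(value, 4))
```

Because the SD scales with the prediction, models that predict larger populations are penalized less for the same absolute miss.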
UPDATING BELIEF: BAYES’ THEOREM
The above likelihood measure is useful, but does not actually give us what we need, namely, an updated measure of belief in our models. There are 2 reasons for this. First, the likelihood is not on a probability scale (notice for instance that the likelihood weights do not sum to one). More important, the likelihood does not take into account prior information about the models: our prior relative beliefs, which may (or may not) be informed by data.
We can solve both problems by invoking one of the most famous (yet simple) theorems from probability, Bayes’ Theorem (BT). In our case, BT says that the new weight of a model is obtained by multiplying the likelihood of that model (see above) by the prior (previous) model weight, and dividing by the sum of this quantity over all the models:

$$w_i(t+1) = \frac{w_i(t)\,L_i}{\sum_{j=1}^{4} w_j(t)\,L_j}$$

The $w_i(t)$ is just the current weight for each model; initially this will be the model weights we obtained when we developed the models (see the estimation tab of the spreadsheet). The likelihoods $L_i$ are obtained by the procedure we just went through, comparing model predictions to observed values and computing a likelihood. For example, we can get the new model weight for model 3 as

$$w_3(t+1) = \frac{0.293309 \times 0.329320}{0.062063 + 0.099551 + 0.096593 + 0.126017} = \frac{0.096593}{0.384223} = 0.2514$$
The table below summarizes the inputs and results for the 4 models, from the updating (1) tab of the spreadsheet.

| Model | Model wt (prior) | Likelihood | Updated wt |
|---|---|---|---|
| 1 | 0.206691 | 0.30026831 | 0.16152808 |
| 2 | 0.206691 | 0.48164109 | 0.25909681 |
| 3 | 0.293309 | 0.32931995 | 0.25139674 |
| 4 | 0.293309 | 0.4296389 | 0.32797837 |
In the 3rd figure on the spreadsheet, you can see side-by-side the previous (prior) and updated (posterior) model weights. Now, model 4 (Competition, AMH) is receiving the most weight, based (1) on our prior belief (partially informed by data) which has been (2) modified by data (the likelihood).
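The updating step can be sketched in a few lines: multiply each prior weight by its likelihood, then divide by the sum so the results are on a probability scale. The function name `bayes_update` is illustrative; the inputs are the prior weights and likelihoods from the table.

```python
def bayes_update(prior, likelihood):
    """Return posterior model weights from prior weights and likelihoods."""
    joint = [w * l for w, l in zip(prior, likelihood)]  # prior x likelihood
    total = sum(joint)                                  # normalizing constant
    return [j / total for j in joint]

# Prior weights and likelihoods for models 1-4, as tabulated
prior = [0.206691, 0.206691, 0.293309, 0.293309]
likelihood = [0.30026831, 0.48164109, 0.32931995, 0.4296389]

posterior = bayes_update(prior, likelihood)
for model, w in enumerate(posterior, start=1):
    print(model, round(w, 4))
```

Note that only the relative sizes of the likelihoods matter: multiplying all four by the same constant leaves the updated weights unchanged.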
KEEPING THE CIRCLE UNBROKEN
Do we call it quits? No, not unless we’ve quit monitoring, which, I hope you are convinced by now, is probably a dumb idea. Now that we’re armed with new model weights, how do we use them? Quite simply, we replace the previous (prior) model weights by these new weights, and go on our merry way as before. This means that this year’s updated (posterior) weights become next year’s prior weights.
This results in an unending process in which we are always folding new information from surveys into updating our model weights. So, for example in the updating (2) tab of the spreadsheet, I’ve pasted a link from column Q of the updating (1) tab into column D of the updating (2) tab. Secondly, I’m going to assume that we are at 370,000 black ducks (as just observed in our surveys), but that mallards and other factors remain unchanged; this will slightly change the predictions of our 4 models, which will affect the likelihoods for next year’s observation of black ducks. Finally, say that in the next year we observe 350,000 black ducks. We now use this value and the model likelihoods to update the model weights, as displayed in column Q of the updating (2) tab in the spreadsheet; this process could be repeated through any number of sequential prediction-observation-updating sequences. In this example, we seem to be accumulating a bit more evidence in favor of model 4 (Competition, AMH). In practice, the model updating will not be “smooth” over time, for all kinds of reasons, including statistical variation, but also environmental “noise” and management uncertainty (which we’ll discuss later).
WHAT? “PRIORS” BOTHER YOU?
Occasionally, discussion of Bayes’ updating (which is what this is) gets bogged down with the issue of “priors”, that is, the initial weights of evidence that we apply to our alternative models. There are several ways that these weights can arise, two of which we’ve seen in the black duck example: weights derived from data (the AIC weights for the production models) and weights that spread uncertainty evenly among the models (0.5 each for AMH and CMH).
The key point is that some means must be established for deriving these weights, in order for this process to work. Because the initial weights will start out having a lot of influence, it is a good idea to use the most objective (or at least rational and explainable) procedures to obtain these weights. More important is a dedication to the long term process of prediction, monitoring, and updating that will eventually make everyone forget what it is they were arguing about when the weights were first formulated. However, if objective procedures are unavailable (or cannot be agreed to), then any reasonable procedure (see above) will do. In a moment of frustration at a black duck AHM meeting, I even suggested that the attending members cast ballots for their favorite model(s), with the tally of votes per model divided by votes cast as the model weight.
Why such a cavalier attitude? The answer is that the adaptive process should soon become independent of the initial weights selected. Effectively, each additional year of updating means that the model weights accumulate one more year that is based on monitoring. After 1 year, there is an initial weight and 1 year of data; after 2, an initial weight and 2 years of data; after 10, an initial weight and 10 years of data. By year 10, the effect of the initial weights, unless very different among models, will essentially have disappeared in the data. Eventually, the updated weights are guaranteed to become independent of the initial weights, via a property in these types of Markovian models that is the same property that guarantees stability in age- or stage-based population growth models, regardless of initial age distributions.
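To see the initial weights wash out, the sketch below runs the same yearly update from two different starting points and tracks the largest gap between the resulting weights. The recurring likelihoods are a loud simplification made only for illustration (the year-1 values are reused every year); in practice each year's likelihoods come from that year's predictions and survey. The names `weights_a`, `weights_b`, and `gap_by_year` are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Return posterior model weights from prior weights and likelihoods."""
    joint = [w * l for w, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]

# Assumption for illustration only: the same likelihoods recur every year.
likelihood = [0.30026831, 0.48164109, 0.32931995, 0.4296389]

weights_a = [0.25, 0.25, 0.25, 0.25]                  # equal initial weights
weights_b = [0.206691, 0.206691, 0.293309, 0.293309]  # initial weights from the example

gap_by_year = {}
for year in range(1, 101):
    # Each year's posterior becomes the next year's prior
    weights_a = bayes_update(weights_a, likelihood)
    weights_b = bayes_update(weights_b, likelihood)
    gap_by_year[year] = max(abs(a - b) for a, b in zip(weights_a, weights_b))

for year in (1, 10, 50, 100):
    print(year, gap_by_year[year])
```

How quickly the gap closes depends on how strongly the likelihoods discriminate among models; with likelihoods as similar as these, it can take considerably longer than a handful of years.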