How to use SpatialSablefishAssessment

Configuring and checking model inputs (data and parameters)

This package contains TMB models and summarising and plotting functions for a spatial model used in sablefish research. Users are responsible for configuring the data and parameter structures that are passed to TMB's MakeADFun function, which builds the model object. The best place to find the expected dimensions and element names for data and parameters is an example or the source code for the model (see here). To view an example data set, run

load(system.file("testdata", "MockSablefishModel.RData", package = "SpatialSablefishAssessment"))
names(data)
names(parameters)
region_key

Once you have built the data and parameter objects, you can check that their dimensions are consistent with the model using,

validate_input_data_and_parameters(data, parameters)

This function will report a message indicating either success or which object's dimensions are inconsistent. Any reported problems must be fixed before proceeding, because passing malformed objects to TMB's MakeADFun will likely crash your R session.

The following are some useful utility functions contained in this package for visualising data inputs (see the sketch after this list for example calls). All functions are documented and can be queried using the usual R ? method

  • plot_input_observations
  • plot_input_catches
  • plot_mean_weight
  • plot_age_length_matrix

Once you have checked the data and parameter objects, you can build the TMB model as follows,

my_model = TMB::MakeADFun(data = data,
                          parameters = parameters,
                          DLL = "SpatialSablefishAssessment_TMBExports",
                          silent = TRUE)

During simulations I would sometimes encounter the following error:

Error in getParameterOrder(data, parameters, new.env(), DLL = DLL) : NOT A VECTOR!

If you encounter this, use the function ?convert_simdata_integers.
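
This error can arise when data elements that the model expects to be integers are stored as numeric (for example, after simulating new data). A minimal base-R sketch of the kind of coercion involved, where the object sim_data and element name ages are purely hypothetical; see ?convert_simdata_integers for the package's helper:

## re-coerce an integer-valued data element that has become numeric (hypothetical element name)
sim_data$ages = as.integer(sim_data$ages)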

Simple sanity checks

It is possible to configure a model that has zero gradients for some parameters; you can check this is not the case using the

## check gradients and NA's in likelihoods
pre_optim_sanity_checks(my_model)  
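
For context, a minimal manual version of the same checks using only the standard TMB model object methods:

## evaluate gradient and objective at the starting values
g = as.numeric(my_model$gr(my_model$par))
which(g == 0)                        ## parameters with exactly zero gradient
is.finite(my_model$fn(my_model$par)) ## FALSE indicates NA/Inf in the likelihood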

Optimisation

mle_optim = nlminb(start = my_model$par,
                   objective = my_model$fn,
                   gradient = my_model$gr,
                   control = list(iter.max = 10000, eval.max = 10000))

# Try to improve the optimisation by running two additional Newton-Raphson iterations
try_improve = tryCatch(expr =
                         for(i in 1:2) {
                           g = as.numeric(my_model$gr(mle_optim$par))
                           h = optimHess(mle_optim$par, fn = my_model$fn, gr = my_model$gr)
                           mle_optim$par = mle_optim$par - solve(h, g)
                           mle_optim$objective = my_model$fn(mle_optim$par)
                         }
                       , error = function(e){e})

try_improve ## should be NULL (i.e., no error); an error here suggests a non-positive-definite Hessian matrix

Check convergence

## check PD covariance
## check parameters at bounds
## check final gradients
## check eigen value sizes
post_optim_sanity_checks(my_model)  
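
Once these checks pass, a common follow-up is TMB's standard sdreport, which gives standard errors for estimated and derived quantities via the delta method (this is standard TMB, not package-specific):

## standard errors for estimated and derived quantities
sd_rep = TMB::sdreport(my_model)
summary(sd_rep)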

Fixing estimable parameters at input values

?fix_pars()

Sharing parameter estimates among estimable variables, e.g., male and female having a common selectivity

?set_pars_to_be_the_same()

For the TagIntegrated model there is a specific function, set_up_parameters, which will set up the map object used by TMB's MakeADFun to fix values at input values during optimisation and to set shared parameters.

?set_up_parameters()
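
For background, all of these helpers ultimately build TMB's map argument. A minimal manual sketch of the same mechanism, where the parameter name ln_M is purely hypothetical:

## fix ln_M at its input value by mapping it to NA
map = list(ln_M = factor(NA))
my_model = TMB::MakeADFun(data = data, parameters = parameters, map = map,
                          DLL = "SpatialSablefishAssessment_TMBExports", silent = TRUE)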

Model Summary

Once you are satisfied that the model has converged at a global minimum, you can get the model to report derived quantities

## get MLE quantities
mle_report = my_model$report(mle_optim$par)

#########
## Plot interesting model quantities,
## e.g., recruitment, SSBs, model fits, etc.
#########

## movement assumption
plot_movement(mle_report, region_key = region_key)
## movement at the starting values (OM_report assumed, e.g., my_model$report() at the initial parameter values)
plot_movement(OM_report, region_key = region_key)
## plot selectivities
plot_selectivities(mle_report)
## selectivities at the starting values
plot_selectivities(OM_report)

## Regional SSBs
ssb_plt = plot_SSB(mle_report, region_key = region_key)
ssb_plt
## sum SSB over all regions (requires ggplot2 and dplyr)
library(ggplot2)
library(dplyr)
ggplot(ssb_plt$data %>% group_by(Year) %>% summarise(total_ssb = sum(SSB)),
       aes(x = Year, y = total_ssb)) +
  geom_line(linewidth = 1.1) +
  ylim(0, NA)

## plot F's
plot_fishing_mortalities(MLE_report = mle_report, region_key = region_key)

## plot fits to observations

# catch and index (n_regions is the number of regions in your model)
plot_catch_fit(MLE_report = mle_report, region_key = region_key) +
  facet_wrap(label ~ Region, ncol = n_regions) +
  ylab("Catch (mt)")
plot_index_fit(MLE_report = mle_report, region_key = region_key)

# survey AFs
first_year_set = c(1981, seq(from = 1985, to = 1993, by = 2), 1996:1999)
plot_AF(MLE_report = mle_report, region_key = region_key, label = "srv_dom_ll", subset_years = first_year_set, sex = "male") +
  ggtitle("Male survey AF") +
  guides(shape = "none", linetype = "none") +
  ylim(0,20)
plot_AF(MLE_report = mle_report, region_key = region_key, label = "srv_dom_ll", subset_years = first_year_set, sex = "female") +
  ggtitle("Female survey AF") +
  guides(shape = "none", linetype = "none")
  
# fishery LFs
# Fixed gear
plot_LF(MLE_report = mle_report, region_key = region_key, label = "fixed", subset_years = 1991:1999, sex = "male") +
  ggtitle("Male fixed LF") +
  guides(shape = "none", linetype = "none")
plot_LF(MLE_report = mle_report, region_key = region_key, label = "fixed", subset_years = 1991:1999, sex = "female") +
  ggtitle("Female fixed LF") +
  guides(shape = "none", linetype = "none")

# plot by tag releases event
tag_fits_by_age_sex = get_tag_recovery_obs_fitted_values(mle_report, region_key = region_key)
tag_fits = tag_fits_by_age_sex %>%
  group_by(release_event, unique_recovery_id) %>%
  summarise(observed = sum(observed),
            predicted = sum(predicted),
            recovery_year = unique(recovery_year),
            recovery_region = unique(recovery_region),
            release_region = unique(release_region),
            release_year = unique(release_year))
tag_fits$resid = tag_fits$observed - tag_fits$predicted
tag_fits$stand_resid = tag_fits$resid / tag_fits$predicted
tag_fits$resid_sign = ifelse(tag_fits$resid < 0, "negative", "positive")
unique_release_events = unique(tag_fits$release_event)
gplt = ggplot(tag_fits %>% filter(release_event %in% unique_release_events[1:16]),
              aes(x = recovery_year, y = recovery_region)) +
  geom_point(aes(size = abs(stand_resid), col = resid_sign)) +
  labs(x = "Recovery year", y = "Recovery region", size = "Standardised residuals") +
  theme_bw() +
  scale_size_area() +
  facet_wrap(~release_event) +
  scale_x_discrete(breaks = every_nth(n = 5))
gplt

MCMC inference

Previously, the ADMB implementation had both frequentist and Bayesian inference available; because this package is embedded in TMB, it natively only does frequentist inference. However, the tmbstan package allows TMB models to plug into the Stan back-end for MCMC inference. You will need to install that package as follows.

devtools::install_github("kaskr/tmbstan/tmbstan")

Once you have an MLE fit, you can use the following R code to run MCMC and assess posteriors/convergence

library(tmbstan)
## Run four chains with the default settings
fit <- tmbstan(my_model, chains = 4)

## Can also get ESS and Rhat from rstan::monitor
## (rules of thumb: Rhat close to 1 and ESS in the hundreds indicate convergence)
mon <- rstan::monitor(fit)
max(mon$Rhat)
min(mon$Tail_ESS)

## Other methods provided by 'rstan'
class(fit)
methods(class="stanfit")
## Pairs plot of the fixed effects
pairs(fit, pars = names(my_model$par))
## Trace plot
traceplot(fit, pars = names(my_model$par), inc_warmup = TRUE)

## Can extract marginal posteriors easily
post <- as.matrix(fit)

hist(post[,'ln_mean_rec'])  ## the name must match one of your model's parameters

## What if you want a posterior for derived quantities in the report? Just
## loop through each posterior sample (row) and call the report function
## which returns a list. The last column is the log-posterior density (lp__)
## and needs to be dropped
my_model$report(post[1,-ncol(post)])    ## inspect the reported list for one posterior sample
Bzero <- rep(NA, length.out = nrow(post))
for(i in 1:nrow(post)){
  r <- my_model$report(post[i,-ncol(post)])
  Bzero[i] <- r$Bzero
}
hist(Bzero)
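
From here, summarising the posterior of a derived quantity is just base R; for example:

## posterior median and 95% credible interval for Bzero
quantile(Bzero, probs = c(0.025, 0.5, 0.975))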