Chapter 3 Current Alaskan sablefish stock assessment model

The latest published stock assessment (D. Goethel et al. 2021) is a single area and sexually disaggregated age-structured model. Let \(\boldsymbol{N_{y,s}}\) denote a vector of ages in year \(y\) for sex \(s\) (the partition) i.e., \(\boldsymbol{N_{y,s}} = (N_{1,y,s}, N_{2,y,s}, \dots, N_{a_+,y,s})^T\). The general process model is sequential and follows the general Equation (3.1),

\[\begin{equation} \boldsymbol{N_{y,s}} = \begin{cases} g\left(\boldsymbol{\theta}\right), & y = 1959 \text{ Initial model year}\\ f\left(\boldsymbol{N_{y-1,s}}|\boldsymbol{\theta}\right), & y > 1959 \\ \end{cases} \tag{3.1} \end{equation}\] where, \(g(.)\) is the function describing initial conditions for the partition and \(f(.)\) is the function that applies populations dynamics each year i.e., birth, death, growth and migration. See later sections for a detailed description of \(g(.)\) and \(f(.)\) and \(\boldsymbol{\theta}\) is the set of estimable (not all are estimated) parameters.

Maximum Likelihood Estimates (MLE) for estimated parameters \(\widehat{\boldsymbol{\theta}}_{MLE}\) are evaluated by, \[\begin{equation} \widehat{\boldsymbol{\theta}}_{MLE} = \underset{\boldsymbol{\theta}}{\arg\max} \left( L\left(\boldsymbol{\theta} | \boldsymbol{y^{obs}}\right) \right) \tag{3.2} \end{equation}\] where, \(\boldsymbol{y^{obs}}\) is a set of observations and \(L\left( . \right)\) is an objective function that is made up of priors/penalties and probability densities. All the models in this research estimate \(\widehat{\boldsymbol{\theta}}_{MLE}\) by minimising the negative log of \(ll\left( . \right) = -\log(L\left( . \right))\). See the observation section for components of \(ll\left( . \right)\).

All symbols used in the following equations are defined in the table at the end of this chapter.

Process equations

Initialisation (\(g\left(.\right)\))

\[\begin{align*} N_{a,1,s} = \begin{cases} R_1, & a = a_1\\ \exp\bigg( \mu_r + \tau_{a_1 - a + 1}\bigg) \exp-\bigg( a - a_1\bigg) \bigg( M + F_{hist} * \mu_{LL} * S^{LL}_{a,1}\bigg), & a_0 < a < a_+\\ \exp( \mu_r) \exp-( a - 1) ( M + F_{hist} * \mu_{LL} * S^{LL}_{a - 1,1})(1 - \exp( M + F_{hist} * \mu_{LL} * S^{LL}_{a - 1,1}))^{-1}, & a = a_+ \end{cases} \end{align*}\]

Population dynamics (\(f\left(.\right)\))

The assessment assumes a closed population that is only effected by mortality (natural and fishing), recruitment and growth. Mortality is applied assuming

\[ Z_{a,y,s} = M + \sum_g F^g_{y} S^g_{a,y,s} \] where, \(S^g_{a,y,s}\) is the fishery selectivity and \(F^g_{y}\) is the annual estimated fishing mortality for fishing fleet \(g\). The annual cycle follows,

\[\begin{align*} N_{a,y,s} = \begin{cases} p^s_{y} R_y, & a = a_1\\ N_{a - 1,y - 1,s} \times \exp\bigg( -Z_{a - 1,y - 1,s} \bigg), & a_0 < a < a_+\\ \exp\bigg( -Z_{a - 1,y - 1,s} \bigg) + \exp\bigg( -Z_{a,y - 1,s} \bigg), & a = a_+ \end{cases} \end{align*}\] where, \[ R_y = \exp\{\mu_r + \tau_y + 0.5\sigma_R^2\} \] and \(p^s_{y}\) is the proportion of recruits in year \(y\) for sex \(s\).

Observation equations (\(ll\left( . \right)\))

Three are three observation types used in the current sablefish stock assessment - Relative abundance indices - Age composition (aggregated over sex) - Length composition (disaggregated by sex)

These three observation types come from both fishery dependent i.e., observer programs and fishery independent i.e., research surveys.

Catch at age

Fishery dependent catch at age observations for gear type \(g\) denoted by \({C^g}_{a,y,s}\) are calculated as follows

\[\begin{equation} {C^g}_{a,y,s} = \frac{F^g_{a,y,s}}{Z_{a,y,s}} N_{a,y,s} \left(1 - S_{a,y,s} \right) \tag{3.3} \end{equation}\] Currently all age observations are sex aggregated which means the model expected values before applying ageing error is \[ {C^g}_{a,y} = 0.5 \sum_s \frac{{C^g}_{a,y,s}}{\sum_a{C^g}_{a,y,s}} \] why the 0.5? should be omitted going forward. Ageing error is then incorporated and the values are normalized so that they are proportions, before being passed to the multinomial log-likelihood function.

Survey age composition is similar but instead of being a function of \(F\) it is calculated at the beginning of the year. For survey \(k\) the numbers at age are denoted by \({C^k}_{a,y}\) and calculated following \[\begin{equation} {C^k}_{a,y} = \sum_s p^s N_{a,y,s} S^k_{y,a,s} \tag{3.4} \end{equation}\]

I am not sure exactly what the timing of these surveys are, but do we need to account for some mid-year mortality? or changes in timing of the survey? if so we could easily replace \[ N_{a,y,s} \] with \[ N_{a,y,s} \left(1 - \exp\{-p^k_y Z_{a,y,s}\} \right) \] where,\(p^k_y\) is the proportion of mortality that we want to account for in year \(y\) for survey \(k\).

The survey numbers at age \({C^k}_{a,y}\) are then adjusted for ageing error and normalised so they sum to one for each year.

Relative abundance indices

\[\begin{equation} \widehat{I}^g_{y} = \sum_s\sum_a p^s N_{a,y,s} \exp \{-0.5 Z_{a,y,s}\} S^g_{y,a,s} \bar{w}_{a,y,s} \tag{3.5} \end{equation}\] where, \(\bar{w}_{a,y,s}\) is mean weight at age, this can be omitted if the observation is in numbers i.e., abundance instead of biomass and \(p^s\) is the proportion for each sex. This is currently a user input, but should be dealt within the model either as having different sex selectivities or through the sex ratio of recruitment.

A list of slight improvements

  • change how sex ratio is handled in age and length composition (Chapter 16)
  • fishery dependent abundance indices i.e., CPUE change \(N_{a,y,s} \exp \{-0.5 Z_{a,y,s}\}\) with \(\boldsymbol{C^g}_{a,y,s}\) which is calculated in the catch at age observations.

Catch at length

For each year that has a length frequency observation, numbers at length denoted by \(\boldsymbol{C^l}_{y,s} = (C^l_{1,y,s}, \dots, C^l_{n_l,y,s})^T\) (dimension of \(\boldsymbol{C^l}_{y,s}\) is \(n_l \ \times \ 1\)) were calculated for each sex. This involved multiplying the catch at age (see above for how it is calculated) through a sex specific age-length transition matrix denoted by \(\boldsymbol{A}^l_{y,s}\) (dimensions of \(\boldsymbol{A}^l_{y,s}\) are \(n_a \ \times \ n_l\) and its rows must sum to 1). The calculation follows Equation (3.6),

\[\begin{equation} \boldsymbol{C^l}_{y,s} = \left(\boldsymbol{A}^l_{y,s} \right)^T \ \times \ \boldsymbol{C_{y,s}} \tag{3.6} \end{equation}\] where, \(\boldsymbol{C_{y,s}}\) is a column vector of numbers at age (dimension \(n_a \ \times \ 1\)) at the beginning of year \(y\) for sex \(s\).

Symbol Notation

Symbol Description
\(y\) Year, \(y = 1960, \dots, T\)
\(T\) Terminal year of the model
\(s\) Sex index \(s \in \{1,2\}\)
\(a\) Model age cohort, i.e., \(a = a_0, a_0 + 1, \dots\)
\(a_{1}\) Recruitment age to the model = 2
\(a_+\) Plus-group age class (oldest age considered plus all older ages)
\(n_a\) Number of age classes modeled \(a_+ \ - a_1\)
\(l\) length class
\(n_l\) Number of length classes
\(g\) gear type index, i.e. longline survey, fixed gear fishery, trawl fishery
\(x\) log-likelihoos index
\(\bar{w}_{a,y, s}\) Average weight at age \(a\), year \(y\) and sex \(s\)
\(\phi_{a,y}\) Proportion of female mature by age and year
\(p^s_{y}\) Proportion of recruits for sex \(s\). Often assumed = 0.5
\(\ln \mu_{r}\) Average log-recruitment
\(\ln \mu_{f}\) Average log-fishing mortality
\(\phi_{y,g}\) annual fishing mortality deviation by gear (log space)
\(\tau_{y}\) annual recruitment deviation \(\sim LogNormal\left(0,\sigma_r\right)\)
\(\sigma_r\) Recruitment standard deviation
\(N_{a,y,s}\) Numbers of fish at age \(a\) in year \(y\) of sex \(s\)
\(M\) Natural mortality
\(F^g_{a,y}\) Fishing mortality for year \(y\), age \(a\) and gear \(g\)
\(F_{hist}\) Historical proportion of Fishing mortality
\(Z_{a,y}\) Total mortality for year \(y\), age \(a\) \(=\sum\limits_g F^g_{a,y} + M\)
\(R_{y}\) Annual recruitment
\(B_{y}\) Spawning biomass in year \(y\)
\(S^g_{a,y,s}\) Selectivity at age \(a\) for gear type \(g\) and sex \(s\)
\(a_50\) age at 50% selection for ascending limb
\(d_50\) age at 50% selection for descending limb
\(\delta\) slope/shape parameters for different logistic curves
\(\boldsymbol{A}\) ageing-error matrix dimensions \(n_a \ \times \ n_a\)
\(\boldsymbol{A}^l_s\) age to length conversion matrix by sex. dimensions \(n_a \ \times \ n_l\)
\(q_g\) abundance index catchability coeffecient by gear
\(\lambda_x\) Statistical weight (penalty) for component \(x\)
\(P^g_{l,y,s}\) Observed proportions at length for gear \(g\) in year \(y\) and sex \(s\)
\(P^g_{a,y,s}\) Observed proportions at age for gear \(g\) in year \(y\) and sex \(s\)

Inference

If random effects are considered the joint probability model follows, \[\begin{equation} Pr\left[ \boldsymbol{y^{obs}}, \boldsymbol{u}| \boldsymbol{\theta} \right] = Pr\left[\boldsymbol{y^{obs}} |\boldsymbol{\theta^f}, \boldsymbol{u} \right] Pr\left[\boldsymbol{u} |\boldsymbol{\theta^h} \right] \end{equation}\]

where, \(\boldsymbol{\theta}\) denotes “fixed-effect” parameters which are furthur seperated by \(\boldsymbol{\theta} = (\boldsymbol{\theta^f},\boldsymbol{\theta^h})\), with \(\boldsymbol{\theta^h}\) denoting fixed effects that are hyperprior parameters for the “random-effect” variables denoted by \(\boldsymbol{u}\) and \(\boldsymbol{\theta^f}\) are all other fixed-effect parameters.

Inference is conducted by maximising the marginal likelihood noting \(L\left(\boldsymbol{\theta} | \boldsymbol{y^{obs}} \right) \propto Pr\left[ \boldsymbol{y^{obs}} | \boldsymbol{\theta} \right]\) \[\begin{equation}\label{eq:marginal_ll} L\left(\boldsymbol{\theta} | \boldsymbol{y^{obs}}\right) = \int \left(Pr\left[\boldsymbol{y^{obs}} |\boldsymbol{\theta^f}, \boldsymbol{\theta^g}, \boldsymbol{u} \right] Pr\left[\boldsymbol{u} |\boldsymbol{\theta^h} \right] \right) \boldsymbol{du} \end{equation}\] In general this integral is not tractable, and so approximations are necessary. The software used here implement the Laplace approximation, which relies on Gaussian assumptions.

Maximum Likelihood Estimates (MLE) for fixed effect parameters \(\widehat{\boldsymbol{\theta}}_{MLE}\) are evaluated, \[\begin{equation} \widehat{\boldsymbol{\theta}}_{MLE} = \underset{\boldsymbol{\theta}}{\arg\max} \left( L\left(\boldsymbol{\theta} | \boldsymbol{y^{obs}}\right) \right) \end{equation}\]

and Empirical Bayes estimates are evaluated for \(\widehat{\boldsymbol{u}}\), which are used model diagnostics and other model quantities, \[\begin{equation} \widehat{\boldsymbol{u}} = \underset{\boldsymbol{u}}{\arg\max} \left( Pr\left[ \boldsymbol{y^{obs}}, \boldsymbol{u}| \widehat{\boldsymbol{\theta}}_{MLE} \right] \right) \end{equation}\]

TODO

  • Build a validate function to help catch users setting up parameters or data structures that will cause a crash once supplied to TMB.
  • Self test
  • Change array column casting from vector<Type>(array.col(i)) to array.col(i).vec()

References

Goethel, DR, DH Hanselman, CJ Rodgveller, KH Fenski, SK Shotwell, KB Echave, PW Malecha, A Siwicke, and CR Lunsford. 2021. “Assessment of the Sablefish Stock in Alaska.” Stock Assessment and Fishery Evaluation Report for the Groundfish Resources of the Gulf of Alaska and the Bering Sea/Aleutian Islands as Projected for.