16.5 Be careful when specifying a prior on initial conditions

You should also be aware that mis-specification of the prior on the initial states (\(\boldsymbol{\pi}\) and \(\boldsymbol{\Lambda}\)) can have catastrophic effects on your parameter estimates if your prior conflicts with the distribution of the initial states implied by the MARSS model. These effects can be very difficult to detect because the model will appear to be well-fitted. Unless you have a good idea of what the parameters should be, you might not realize that your prior conflicts.

The default behavior for MARSS() is to set \(\boldsymbol{\Lambda}\) to zero and estimate \(\boldsymbol{\pi}\). This does not put any constraints on \(\boldsymbol{\Lambda}\) (there is no \(\boldsymbol{\Lambda}\) to put constraints on) and circumvents this problem. However, if you plan to put constraints on \(\boldsymbol{\pi}\) or \(\boldsymbol{\Lambda}\), you should familiarize yourself with the most common problems. The common problems we have found with priors on \(\mathbf{x}_0\) are the following.

Problem 1) The correlation structure in \(\boldsymbol{\Lambda}\) (whether the prior is diffuse or not) does not match the correlation structure in \(\mathbf{x}_0\) implied by your model. For example, you specify a diagonal \(\boldsymbol{\Lambda}\) (independent states), but the implied distribution has correlations.

Problem 2) The correlation structure in \(\boldsymbol{\Lambda}\) does not match the structure in \(\mathbf{x}_0\) implied by constraints you placed on \(\boldsymbol{\pi}\). For example, you specify that all values in \(\boldsymbol{\pi}\) are shared, yet you specify that \(\boldsymbol{\Lambda}\) is diagonal (independent).
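To make Problem 1 concrete, the following base R sketch computes the state correlation implied by a small model and compares it with a diagonal prior. It rests on several assumptions that are not in the text above: the values of `B` and `Q` are made up, the model is stationary, and "the implied distribution" is read as the stationary distribution of the states, whose variance \(\mathbf{V}\) solves \(\mathbf{V} = \mathbf{B}\mathbf{V}\mathbf{B}^\top + \mathbf{Q}\). For non-stationary models this shortcut does not apply.

```r
# Hypothetical 2-state model: x_t = B x_{t-1} + w_t, w_t ~ MVN(0, Q).
# B and Q are illustrative values only, not taken from any real fit.
B <- matrix(c(0.7, 0.1,
              0.2, 0.6), nrow = 2, byrow = TRUE)
Q <- diag(c(0.1, 2))  # process errors are independent

# The stationary variance only exists if all eigenvalues of B
# lie inside the unit circle.
stopifnot(all(Mod(eigen(B)$values) < 1))

# Solve V = B V B' + Q using vec(V) = (I - B (x) B)^{-1} vec(Q).
m <- nrow(B)
V <- matrix(solve(diag(m^2) - kronecker(B, B), as.vector(Q)), m, m)

implied_cor <- cov2cor(V)  # correlation implied by the model
prior_cor <- diag(m)       # correlation asserted by a diagonal Lambda

print(round(implied_cor, 3))
print(prior_cor)
```

Even though `Q` is diagonal here, the off-diagonal terms in `B` induce a non-zero correlation between the states, so a diagonal \(\boldsymbol{\Lambda}\) would conflict with this model.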

Unfortunately, using a diffuse prior does not help with these two problems because the diffuse prior still has a correlation structure and can still conflict with the implied correlation in \(\mathbf{x}_0\). One way to get around these problems is to set \(\boldsymbol{\Lambda}=0\) (an \(m \times m\) matrix of zeros) and estimate \(\boldsymbol{\pi} \equiv \mathbf{x}_0\) only. Now \(\boldsymbol{\pi}\) is a fixed but unknown (estimated) parameter, not the mean of a distribution. In this case, \(\boldsymbol{\Lambda}\) does not exist in your model and there is no conflict with the model. This is the default behavior of MARSS(). Unfortunately, estimating \(\boldsymbol{\pi}\) as a parameter is not always robust. If you specify that \(\boldsymbol{\Lambda}=0\) and that \(\boldsymbol{\pi}\) corresponds to \(\mathbf{x}_0\), but your model explodes when run backwards in time, you cannot estimate \(\boldsymbol{\pi}\) because the data cannot give you a good estimate of \(\mathbf{x}_0\). Sometimes this can be avoided by specifying that \(\boldsymbol{\pi}\) corresponds to \(\mathbf{x}_1\) so that it can be constrained by the data \(\mathbf{y}_1\).
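The two ways of treating \(\boldsymbol{\pi}\) as a parameter can be sketched as `model` lists for MARSS(). This is a minimal sketch, assuming the `x0`, `V0`, and `tinitx` list elements behave as described in the MARSS package help (`V0` for \(\boldsymbol{\Lambda}\), `x0` for \(\boldsymbol{\pi}\), and `tinitx` for whether \(\boldsymbol{\pi}\) refers to \(t=0\) or \(t=1\)). The data matrix `dat` in the commented call is hypothetical.

```r
# Option A: Lambda = 0 and pi = x_0 is estimated as a parameter.
# Per the text, this is the default behavior of MARSS().
mod_x0 <- list(x0 = "unequal", V0 = "zero", tinitx = 0)

# Option B: Lambda = 0, but pi now refers to x_1, so the first
# observation y_1 helps constrain it.
mod_x1 <- list(x0 = "unequal", V0 = "zero", tinitx = 1)

# Fitting needs the MARSS package and a data matrix with time across
# the columns. `dat` is a hypothetical placeholder, so the call is
# left commented out.
# fit_x0 <- MARSS::MARSS(dat, model = mod_x0)
# fit_x1 <- MARSS::MARSS(dat, model = mod_x1)

str(mod_x1)
```

Only `tinitx` differs between the two lists. Any other model elements (such as `B` or `Q`) would be added to both lists unchanged.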

In summary, if the implied correlation structure of your initial states is independent (diagonal variance-covariance matrix), you should generally be OK with a diagonal, high-variance prior or with treating the initial states as parameters (with \(\boldsymbol{\Lambda}=0\)). But if your initial states have an implied correlation structure that is not independent, then proceed with caution. "With caution" means that you should assume you have problems and test how your model fits with simulated data.
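One way to set up such a test is to simulate data from a model with known parameters and correlated states, then fit it under each candidate specification of the initial conditions and compare the estimates with the truth. The base R sketch below covers only the simulation step. All parameter values, the identity observation matrix, and the series length are assumptions chosen for illustration.

```r
set.seed(42)
n_t <- 50

# Known "true" parameters (illustrative values only).
B <- matrix(c(0.7, 0.1,
              0.2, 0.6), nrow = 2, byrow = TRUE)
Q <- matrix(c(1.0, 0.5,
              0.5, 1.0), nrow = 2)  # correlated process errors
r_var <- 0.1                         # observation error variance
x0_true <- c(1, -1)

# Lower-triangular factor of Q, so that L_Q %*% z has variance Q
# when z is standard normal.
L_Q <- t(chol(Q))

# Simulate the states; column 1 of x holds x_0.
x <- matrix(0, nrow = 2, ncol = n_t + 1)
x[, 1] <- x0_true
for (t in seq_len(n_t)) {
  x[, t + 1] <- B %*% x[, t] + L_Q %*% rnorm(2)
}

# Observations y_t = x_t + v_t for t = 1..n_t (identity Z assumed).
y <- x[, -1] + sqrt(r_var) * matrix(rnorm(2 * n_t), nrow = 2)

# Next step (not run here): fit y with each initial-condition
# specification and compare the estimated B, Q, and x0 with the
# known values above.
```

Repeating the simulation over many seeds gives a better picture than a single replicate, since one data set can look well-fitted by chance.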