Chapter 15 Troubleshooting
Tip: Use MARSSinfo() for information on various common error and warning messages.
Numerical errors due to ill-conditioned matrices are not uncommon when fitting MARSS models. The Kalman and EM algorithms need inverses of matrices, and if those matrices become ill-conditioned, for example when all elements are close to the same value, the algorithms become unstable. Warning messages will be printed if the algorithms are becoming unstable, and you can set control$trace=1 to see details of where the algorithm is becoming unstable. Whenever possible, you should avoid using shared \(\boldsymbol{\pi}\) values in your model. An example of a \(\boldsymbol{\pi}\) with shared values is \(\boldsymbol{\pi}=\bigl[\begin{smallmatrix} a\\a\\a \end{smallmatrix} \bigr]\). The way the EM algorithm deals with \(\boldsymbol{\Lambda}\) tends to make this case unstable, especially if \(\mathbf{R}\) is not diagonal. In general, estimation of a non-diagonal \(\mathbf{R}\) is more difficult, more prone to ill-conditioning, and more data-hungry.
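As a minimal sketch (dat is a hypothetical \(n \times T\) matrix of observed time series, here with three states):

library(MARSS)
# trace = 1 prints iteration-level diagnostics, including warnings
# about matrices that are becoming ill-conditioned
fit <- MARSS(dat, control = list(trace = 1))
# giving each initial state its own name in x0 avoids shared pi values
fit <- MARSS(dat, model = list(x0 = matrix(c("a1", "a2", "a3"), 3, 1)))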
You may also see non-convergence warnings, especially if your MLE model turns out to be degenerate. This means that one of the elements on the diagonal of your \(\mathbf{Q}\) or \(\mathbf{R}\) matrix is going to zero (is degenerate). It will take the EM algorithm forever to get to zero, and BFGS will have the same problem, although it will often get a bit closer to the degenerate solution. If you are using method="kem", MARSS will warn you if it looks like the solution is degenerate. If you use control=list(allow.degen=TRUE), the EM algorithm will attempt to set the degenerate variances to zero (instead of trying to reach zero via an infinite number of iterations). However, if one of the variances is going to zero, first think about why this is happening. This is typically caused by one of three problems: 1) you made a mistake inputting your data, e.g., you used -99 as the missing-value code but did not replace these with NAs before passing the data to MARSS; 2) your data are not sufficient to estimate multiple variances; or 3) your data are inconsistent with the model you are trying to fit.
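A sketch of the first check combined with allow.degen (dat and the -99 missing-value code are hypothetical):

# replace a -99 missing-value code with NA before fitting
dat[!is.na(dat) & dat == -99] <- NA
# allow EM to set variances that are collapsing toward zero exactly to zero
fit <- MARSS(dat, method = "kem", control = list(allow.degen = TRUE))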
The algorithms in the MARSS package are designed for cases where the \(\mathbf{Q}\) and \(\mathbf{R}\) diagonals are all non-minuscule. For example, the EM update equation for \(\mathbf{u}\) will grind to a halt (not update \(\mathbf{u}\)) if \(\mathbf{Q}\) is tiny (like 1E-7). Conversely, the BFGS equations are likely to miss the maximum likelihood when \(\mathbf{R}\) is tiny because the likelihood surface then becomes hyper-sensitive to \(\boldsymbol{\pi}\). The solution is to use the degenerate likelihood function for the likelihood calculation and for the EM update equations. MARSS implements this automatically when \(\mathbf{Q}\) or \(\mathbf{R}\) diagonal elements are set to zero, and it will try setting \(\mathbf{Q}\) and \(\mathbf{R}\) terms to zero automatically if control$allow.degen=TRUE.
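For example, if you already know a variance is zero, you can fix it at zero in the model specification rather than estimating it (a sketch; "zero" is one of the text shortcuts MARSS accepts for variance matrices):

# fix the observation variance at zero; MARSS then uses the
# degenerate likelihood function and update equations
fit <- MARSS(dat, model = list(R = "zero"))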
One odd case can occur when \(\mathbf{R}\) goes to zero (a matrix of zeros) while you are estimating \(\boldsymbol{\pi}\). If model$tinitx=1, then \(\boldsymbol{\pi}=\mathbf{x}_1^0\), and \(\mathbf{y}_1-\mathbf{Z}\mathbf{x}_1^0\) can go to 0, as can \(\text{var}(\mathbf{y}_1-\mathbf{Z}\mathbf{x}_1^0)\), by driving \(\mathbf{R}\) to zero. But as this happens, the log-likelihood associated with \(\mathbf{y}_1\) will go (correctly) to infinity, and thus the total log-likelihood goes to infinity. But if you set \(\mathbf{R}=0\), the log-likelihood will be finite. The reason is that \(\mathbf{R} \approx 0\) and \(\mathbf{R}=0\) specify different likelihoods for \(\mathbf{y}_1-\mathbf{Z}\mathbf{x}_1^0\). With \(\mathbf{R}=0\), \(\mathbf{y}_1-\mathbf{Z}\mathbf{x}_1^0\) does not have a distribution; it is just a fixed value, so there is no likelihood to go to infinity. If some elements of the diagonal of \(\mathbf{R}\) are going to zero, you should be suspicious of the parameter estimates. Sometimes the structure of your data, e.g., one data value followed by a long string of missing values, causes an odd spike in the likelihood at \(\mathbf{R} \approx 0\). Try manually setting \(\mathbf{R}\) equal to zero to get the correct log-likelihood. The likelihood returned when \(\mathbf{R} \approx 0\) is not incorrect; it is just not the likelihood that you probably want. You want the likelihood where the \(\mathbf{R}\) term is dropped because it is zero.
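A sketch of that comparison (dat is hypothetical; the fitted marssMLE object stores the log-likelihood in $logLik):

# log-likelihood with R estimated; this may spike as R approaches zero
fit_est <- MARSS(dat, model = list(tinitx = 1))
# log-likelihood with R fixed at zero; the R term is dropped
fit_zero <- MARSS(dat, model = list(R = "zero", tinitx = 1))
c(estimated = fit_est$logLik, fixed.zero = fit_zero$logLik)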