### linear-algebra

#### Eigen - directly compute log determinant of huge sparse matrix

I would like to compute the log-determinant of a very large matrix (5e6 x 5e6). It is, however, highly sparse: there are only 6 nonzero off-diagonal entries in each row (7 counting the diagonal). It's also symmetric and positive definite. In Eigen I've tried the Cholesky decomposition SimplicialLDLT<SparseMatrix<double>>, followed by summing the log-values of the diagonal (accessible via SimplicialLDLT::vectorD()), but the decomposition runs for a very long time without finishing. Any better approaches? I don't actually need any sort of decomposition, just the log-determinant itself (or a good estimate).
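For reference, the identity behind the approach described above is that for A = L D L^T with unit-diagonal L, log det A is the sum of log d_i over the diagonal of D. The following is only a minimal sketch of that identity on a small dense matrix. It deliberately avoids Eigen to stay self-contained, and `logdet_ldlt` is a hypothetical helper name, not an Eigen function. It uses no pivoting or fill-reducing ordering, so it says nothing about performance on the 5e6 x 5e6 case.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not part of Eigen): log-determinant of a small dense
// symmetric matrix via an unpivoted LDL^T factorization, A = L D L^T.
// Returns NaN as soon as a pivot d_j is not strictly positive, i.e. when A
// is not (numerically) positive definite.
double logdet_ldlt(const std::vector<std::vector<double>>& A) {
    const std::size_t n = A.size();
    for (const auto& row : A) assert(row.size() == n);  // must be square

    std::vector<std::vector<double>> L(n, std::vector<double>(n, 0.0));
    std::vector<double> d(n, 0.0);
    double logdet = 0.0;

    for (std::size_t j = 0; j < n; ++j) {
        // d_j = A_jj - sum_{k<j} L_jk^2 * d_k
        double dj = A[j][j];
        for (std::size_t k = 0; k < j; ++k) dj -= L[j][k] * L[j][k] * d[k];
        if (!(dj > 0.0)) return std::numeric_limits<double>::quiet_NaN();
        d[j] = dj;
        L[j][j] = 1.0;

        // L_ij = (A_ij - sum_{k<j} L_ik * L_jk * d_k) / d_j for i > j
        for (std::size_t i = j + 1; i < n; ++i) {
            double s = A[i][j];
            for (std::size_t k = 0; k < j; ++k) s -= L[i][k] * L[j][k] * d[k];
            L[i][j] = s / dj;
        }

        // Accumulate logs instead of multiplying pivots, which would
        // overflow or underflow for large n.
        logdet += std::log(dj);
    }
    return logdet;
}
```

For the 3 x 3 matrix with 4 on the diagonal and -1 on the first off-diagonals, the pivots are 4, 3.75 and about 3.7333, whose product is 56, so the function returns log(56). Summing the logs of SimplicialLDLT::vectorD(), as described in the question, is the sparse counterpart of the last step.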

How fast do you expect it to be? You could try all the sparse solvers in version 3.3-beta1 and find which one is fastest for your problem. https://eigen.tuxfamily.org/dox-devel/group__TopicSparseSystems.html

I might as well put this as an answer so I can show a figure.

First, Eigen’s documentation on sparse solvers says SimplicialLDLT is “Recommended for very sparse and not too large problems”. Your problem is very sparse but also very large.

Second, SimplicialLDLT needs its input to be not just symmetric (“selfadjoint”) but also positive definite, which yours is almost certainly not (unless you have reason to think otherwise?). It’s possible that SimplicialLDLT is spending a ton of time computing the Cholesky factorization, which it won’t find successfully if your matrix isn’t positive definite.

That brings me to a third point. I generated, in Matlab 😭, a small version of your problem: a 1e4 by 1e4 sparse matrix, symmetric, with small integers between 1 and 5 on the diagonal, and each row having six other entries of -1. Matlab’s equivalent of LDLT is ldl, and oddly enough it doesn’t need a positive definite matrix, so it chewed through my 1e4 by 1e4 example in a few seconds (it takes longer to generate the sparse matrix than to factorize it into LDL'). Here’s the lower-triangular factor’s sparsity pattern (via spy(L) in Matlab):

[figure: sparsity pattern of L]

It has 1e7 non-zero elements and takes up 162 MB RAM. Recall this is for a 1e4 by 1e4 problem. If memory usage scales linearly with the dimension of the matrix (1e4 → 5e6, a factor of 500), you’re looking at nearly 80 GB RAM usage. If instead it scales with the number of elements (1e4^2 → 5e6^2, a factor of 250,000), you’d need 38 TB RAM… None of this analysis is conclusive: it could well be that scaling to 5e6 by 5e6 greatly increases sparsity in the LDL' factors, but this may explain why Eigen is hanging. As mentioned in comments, check if your swap file is thrashing.

A fourth issue is that, for my test example of 1e4 by 1e4, I have a lot of exactly-zero entries on the lower-triangular L matrix’s diagonal, so the determinant of the entire sparse matrix is zero to double-precision, logs or no logs.
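The memory extrapolation above can be reproduced with a few lines of arithmetic. This is only a sketch: `extrapolate_mib` is a hypothetical helper, the power-law growth model is the same rough assumption the answer makes, and reading the 162 MB figure as binary megabytes (MiB) is my assumption.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: extrapolate factor memory from one measured case,
// assuming memory grows like (n_target / n_measured)^exponent.
//   exponent = 1: linear in the matrix dimension
//   exponent = 2: proportional to the number of matrix entries
// Input and result are in MiB (assumed unit for the measured 162 MB).
double extrapolate_mib(double measured_mib, double n_measured,
                       double n_target, double exponent) {
    assert(measured_mib >= 0.0 && n_measured > 0.0 && n_target > 0.0);
    return measured_mib * std::pow(n_target / n_measured, exponent);
}

// Unit conversions for reporting.
double mib_to_gib(double mib) { return mib / 1024.0; }
double mib_to_tib(double mib) { return mib / (1024.0 * 1024.0); }
```

With the measured 162 MiB at n = 1e4 and a target of n = 5e6, `mib_to_gib(extrapolate_mib(162, 1e4, 5e6, 1))` gives about 79 GiB and `mib_to_tib(extrapolate_mib(162, 1e4, 5e6, 2))` gives about 38.6 TiB, in line with the figures quoted above. The true growth depends on the fill-in produced by the ordering, so both numbers are rough bounds on a guess, not predictions.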
