The low-rank canonical polyadic tensor decomposition is useful in data analysis and can be computed by solving a sequence of overdetermined least squares subproblems. Motivated by consideration of sparse tensors, we propose sketching each subproblem using leverage scores to select a subset of the rows, with probabilistic guarantees of the solution accuracy. We randomly sample rows proportional to leverage score upper bounds that can be efficiently computed using the special Khatri-Rao subproblem structure inherent in tensor decomposition. Crucially, the number of rows in the sketched system is independent of the number of nonzeros in the full tensor and the number of rows in the full system. Along the way, we provide a practical solution to the generic matrix sketching problem of sampling overabundance for high-leverage-score rows, proposing to include such rows deterministically and combine repeated samples in the sketched system; we conjecture that this can lead to improved theoretical bounds. Numerical results on real-world large-scale tensors show the method is significantly faster than deterministic methods without sacrificing accuracy.

Type

Publication

arXiv

Date

Dec 2020

Tags

Citation

B. W. Larsen, T. G. Kolda.
**Practical Leverage-Based Sampling for Low-Rank Tensor Decomposition**.
arXiv:2006.16438, submitted for publication,
2020.
http://arxiv.org/abs/2006.16438

tensor decomposition, CANDECOMP/PARAFAC (CP), canonical polyadic (CP), matrix sketching, leverage score sampling, randomized numerical linear algebra (RandNLA)

Initial version posted June 30, 2020. Revised December 10, 2020, January 3, 2022.

```
@misc{arXiv-LaKo20,
author = {Brett W. Larsen and Tamara G. Kolda},
title = {Practical Leverage-Based Sampling for Low-Rank Tensor Decomposition},
month = {December},
year = {2020},
note = {submitted for publication},
eprint = {2006.16438},
eprintclass = {math.NA},
}
```