作者:Vladimir Braverman,Robert Krauthgamer,Aditya Krishnan,Roi Sinoff
摘要:矩阵的频谱包含关于基础数据的重要结构信息,因此对计算矩阵频谱的各种函数有相当大的兴趣。这种函数的一个基本例子是频谱的lp范数,称为矩阵的Schattenp范数。表示真实世界数据的大矩阵通常是\ emph {稀疏}(大多数条目为零)或\ emph {双稀疏},即在行和列中都是稀疏的。这些大型矩阵通常作为更新的\ emph {stream}访问,通常以\ emph {row-order}组织。在该设置中,空间(存储器)是限制资源,计算频谱函数是一项昂贵的任务,并且已知算法需要在矩阵的维度中是多项式的空间,即使对于稀疏矩阵也是如此。因此,非常希望设计需要明显更小空间的算法。
我们通过提供第一种算法来回答这一挑战,该算法使用space \ emph {独立于矩阵维度}来计算以行顺序呈现的双稀疏矩阵的Schattenp范数。相反,我们的算法在稀疏性参数k中使用空间多项式,并且使得O(p)在数据流上传递。我们进一步证明在这种情况下多次通过是不可避免的,并且显示了我们主要技术的几个扩展,包括特殊矩阵族的更强上界,更难转动模型的算法,以及空间要求和通过次数之间的权衡。
原文标题:Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension
原文摘要:The spectrum of a matrix contains important structural information about the underlying data, and hence there is considerable interest in computing various functions of the matrix spectrum. A fundamental example for such functions is thelp-norm of the spectrum, called the Schattenp-norm of the matrix. Large matrices representing real-world data are often \emph{sparse} (most entries are zeros) or \emph{doubly sparse}, i.e., sparse in both rows and columns. These large matrices are usually accessed as a \emph{stream} of updates, typically organized in \emph{row-order}. In this setting, where space (memory) is the limiting resource, computing spectral functions is an expensive task and known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. Thus, it is highly desirable to design algorithms requiring significantly smaller space.
We answer this challenge by providing the first algorithm that uses space \emph{independent of the matrix dimension} to compute the Schattenp-norm of a doubly-sparse matrix presented in row order. Instead, our algorithm uses space polynomial in the sparsity parameterkand makesO(p)passes over the data stream. We further prove that multiple passes are unavoidable in this setting and show several extensions of our primary technique, including stronger upper bounds for special matrix families, algorithms for the more difficult turnstile model, and a trade-off between space requirements and number of passes.
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。