前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >矩阵流中的Schatten规范:Hello Sparsity,Goodbye Dimension

矩阵流中的Schatten规范:Hello Sparsity,Goodbye Dimension

原创
作者头像
罗大琦
发布2019-07-18 17:10:27
6340
发布2019-07-18 17:10:27
举报
文章被收录于专栏:算法和应用

作者:Vladimir Braverman,Robert Krauthgamer,Aditya Krishnan,Roi Sinoff

摘要:矩阵的频谱包含关于基础数据的重要结构信息,因此对计算矩阵频谱的各种函数有相当大的兴趣。这种函数的一个基本例子是频谱的lp范数,称为矩阵的Schattenp范数。表示真实世界数据的大矩阵通常是\ emph {稀疏}(大多数条目为零)或\ emph {双稀疏},即在行和列中都是稀疏的。这些大型矩阵通常作为更新的\ emph {stream}访问,通常以\ emph {row-order}组织。在该设置中,空间(存储器)是限制资源,计算频谱函数是一项昂贵的任务,并且已知算法需要在矩阵的维度中是多项式的空间,即使对于稀疏矩阵也是如此。因此,非常希望设计需要明显更小空间的算法。

我们通过提供第一种算法来回答这一挑战,该算法使用space \ emph {独立于矩阵维度}来计算以行顺序呈现的双稀疏矩阵的Schattenp范数。相反,我们的算法在稀疏性参数k中使用空间多项式,并且使得O(p)在数据流上传递。我们进一步证明在这种情况下多次通过是不可避免的,并且显示了我们主要技术的几个扩展,包括特殊矩阵族的更强上界,更难转动模型的算法,以及空间要求和通过次数之间的权衡。

原文标题:Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension

原文摘要:The spectrum of a matrix contains important structural information about the underlying data, and hence there is considerable interest in computing various functions of the matrix spectrum. A fundamental example for such functions is thelp-norm of the spectrum, called the Schattenp-norm of the matrix. Large matrices representing real-world data are often \emph{sparse} (most entries are zeros) or \emph{doubly sparse}, i.e., sparse in both rows and columns. These large matrices are usually accessed as a \emph{stream} of updates, typically organized in \emph{row-order}. In this setting, where space (memory) is the limiting resource, computing spectral functions is an expensive task and known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. Thus, it is highly desirable to design algorithms requiring significantly smaller space.

We answer this challenge by providing the first algorithm that uses space \emph{independent of the matrix dimension} to compute the Schattenp-norm of a doubly-sparse matrix presented in row order. Instead, our algorithm uses space polynomial in the sparsity parameterkand makesO(p)passes over the data stream. We further prove that multiple passes are unavoidable in this setting and show several extensions of our primary technique, including stronger upper bounds for special matrix families, algorithms for the more difficult turnstile model, and a trade-off between space requirements and number of passes.

地址:https://arxiv.org/abs/1907.05457

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档