Papers
arxiv:1803.10172

Distributed Adaptive Sampling for Kernel Matrix Approximation

Published on Mar 27, 2018
Authors:
,
,

Abstract

SQUEAK is a novel RLS sampling algorithm for kernel approximation that processes large datasets efficiently without constructing the full kernel matrix.

AI-generated summary

Most kernel-based methods, such as kernel or Gaussian process regression, kernel PCA, ICA, or k-means clustering, do not scale to large datasets, because constructing and storing the kernel matrix K_n requires at least O(n^2) time and space for n samples. Recent works show that sampling points with replacement according to their ridge leverage scores (RLS) generates small dictionaries of relevant points with strong spectral approximation guarantees for K_n. The drawback of RLS-based methods is that computing exact RLS requires constructing and storing the whole kernel matrix. In this paper, we introduce SQUEAK, a new algorithm for kernel approximation based on RLS sampling that sequentially processes the dataset, storing a dictionary which creates accurate kernel matrix approximations with a number of points that only depends on the effective dimension d_{eff}(γ) of the dataset. Moreover since all the RLS estimations are efficiently performed using only the small dictionary, SQUEAK is the first RLS sampling algorithm that never constructs the whole matrix K_n, runs in linear time mathcal{O}(nd_{eff}(γ)^3) w.r.t. n, and requires only a single pass over the dataset. We also propose a parallel and distributed version of SQUEAK that linearly scales across multiple machines, achieving similar accuracy in as little as mathcal{O}(log(n)d_{eff}(γ)^3) time.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/1803.10172 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/1803.10172 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.