RandomizedSparsification

Implementation of the randomized sparsification algorithm from [1].

Index

Algorithm

RandomizedSparsification API

RandomizedSparsification.sparsify — Function
sparsify(A; r, multithreaded=false, progress=false)

Sparsify the input matrix A using the randomized element-wise sparsification algorithm with r samples.
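As a self-contained illustration of what element-wise sparsification does (a sketch, not the package implementation — here with a purely ℓ2-based sampling distribution, whereas [1] defines the exact one used by `sparsify`):

```julia
using LinearAlgebra, SparseArrays, Random

# Sketch of randomized element-wise sparsification: sample r entries of A
# with probabilities p_ij ∝ A_ij² / ‖A‖_F² and rescale so that E[Â] = A.
function sparsify_sketch(A::AbstractMatrix, r::Integer; rng=Random.default_rng())
    m, n = size(A)
    p = abs2.(A) ./ norm(A)^2             # sampling probabilities (ℓ2-based)
    cdf = cumsum(vec(p))                  # CDF over linear indices
    Â = spzeros(m, n)
    for _ in 1:r
        u = rand(rng)
        k = min(searchsortedfirst(cdf, u), length(cdf))  # inverse-CDF sampling
        i, j = Tuple(CartesianIndices(A)[k])
        Â[i, j] += A[i, j] / (r * p[i, j])  # rescale for an unbiased estimator
    end
    Â
end
```

Each sampled entry is rescaled by $1/(r\,p_{ij})$, so the estimator is unbiased entry-wise and has at most r stored entries.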

source
RandomizedSparsification.r_ε — Function
r_ε(A, ε, [rank])
r_ε(d₁, d₂, ε, rank)

Compute a sufficient number of samples r to achieve a given error tolerance ε, in expectation, when sparsifying the matrix A ($A \in \mathbb{R}^{d_1 \times d_2}$). Optionally use rank as an upper bound for $\operatorname{srank}$. The number of samples returned by this function is sufficient but not necessarily the smallest for which the error bound holds.
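Assuming the package is loaded, the two functions compose as follows (a usage sketch; the matrix and the tolerance value are illustrative):

```julia
using RandomizedSparsification

A = randn(1_000, 1_000)
r = r_ε(A, 0.1)      # samples sufficient for the expected-error bound at ε = 0.1
Â = sparsify(A; r)   # sparsified estimate of A
```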

source
Utility functions
RandomizedSparsification.srank — Function
srank(A::AbstractMatrix)

Compute the stable rank of the input matrix A.

The stable rank is defined as the ratio of the squared Frobenius norm to the squared spectral norm of the matrix.

\[ \frac{\|A\|^2_\text{F}}{\|A\|^2_\text{op}}\]
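The definition above translates directly to LinearAlgebra's `norm` (Frobenius) and `opnorm` (spectral); `srank_demo` below is an illustrative stand-in, not the package function:

```julia
using LinearAlgebra

# Stable rank: squared Frobenius norm over squared spectral norm.
srank_demo(A::AbstractMatrix) = norm(A)^2 / opnorm(A)^2
```

For example, a rank-1 matrix has stable rank 1, while the n×n identity has stable rank n.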

source
RandomizedSparsification.ρ_s — Function
ρ_s(A::AbstractSparseArray)

Compute the proportional sparsity of the input matrix A.

\[ \frac{\#\{(i,j)\mid A_{ij} \neq 0 \}}{\#\{(i,j)\}}\]
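With SparseArrays this is a one-liner (`ρ_s_demo` is an illustrative stand-in; note that `nnz` counts stored entries, which may include explicitly stored zeros):

```julia
using SparseArrays

# Proportional sparsity: stored nonzeros over total number of entries.
ρ_s_demo(A::AbstractSparseMatrix) = nnz(A) / length(A)
```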

source
Internals
RandomizedSparsification.FindIndices — Module

This module contains functions that determine indices from the cumulative sum of the probability matrix P_ij_cs (i.e. the cumulative distribution function on the indices) and numbers sampled uniformly at random from $[0,1]$, with as many samples as the user prescribes to meet the required accuracy bounds.

Danger
  1. The methods do not check whether the number of samples taken fits into memory. This must be handled earlier in the call stack: the desired number of samples must be split into batches that fit into memory for the particular index-location algorithm used.
  2. It is assumed that P_ij_cs fits into memory. If it does not, it would need to be computed efficiently at evaluation time for a given index, which is not yet implemented.

The functions that find the summation indices in the sparsification algorithm have the signature:

_indices(
    P_ij_cs::AbstractVector{<:Real},    # cumulative distribution function on matrix indices
    rands01::AbstractVector{<:Real},    # sampled values in [0,1]
    m::Integer, n::Integer;             # original matrix size
    progress=false                      # report progress to the user
)::AbstractVector{CartesianIndex{2}}

The indices are thus returned as a vector [ij::CartesianIndex{2}...], with repetitions. This vector is passed to the sparse function to build the sparsified matrix.
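A minimal implementation of this signature (ignoring the progress option) can run a binary search into the CDF for each sampled value; `indices_sketch` below is hypothetical and need not match the package's actual methods:

```julia
# For each u in rands01, find the first CDF value ≥ u (inverse-CDF sampling)
# and convert the resulting linear index to a CartesianIndex{2}.
function indices_sketch(P_ij_cs::AbstractVector{<:Real},
                        rands01::AbstractVector{<:Real},
                        m::Integer, n::Integer)
    map(rands01) do u
        k = min(searchsortedfirst(P_ij_cs, u), length(P_ij_cs))
        CartesianIndices((m, n))[k]      # linear → Cartesian index (column-major)
    end
end
```

Repeated samples yield repeated indices, which is exactly what the downstream call to `sparse` expects.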

source
RandomizedSparsification._sparsify_context_data — Function
_sparsify_context_data(A)

Compute the context data, A_1_norm, A_F_norm, P_ij and P_ij_cs (the cumulative distribution on the matrix elements), needed by the randomized sparsification algorithm in sparsify.
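A sketch of this computation, with the caveat that the exact distribution P_ij is the one defined in [1] — the ℓ1/ℓ2 hybrid below is an assumption chosen only because both norms appear in the context data:

```julia
using LinearAlgebra

# Compute the listed context quantities. The hybrid ℓ1/ℓ2 weights for P_ij
# are an illustrative assumption, not necessarily the package's distribution.
function context_data_sketch(A::AbstractMatrix)
    A_1_norm = sum(abs, A)               # entry-wise ℓ1 norm
    A_F_norm = norm(A)                   # Frobenius norm
    P_ij = abs.(A) ./ (2A_1_norm) .+ abs2.(A) ./ (2A_F_norm^2)
    P_ij_cs = cumsum(vec(P_ij))          # CDF over linear indices
    (; A_1_norm, A_F_norm, P_ij, P_ij_cs)
end
```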

source
RandomizedSparsification.FindIndices._indices_groupsort — Function
_indices_groupsort(P_ij_cs, rands01, m, n)

Algorithm

  1. Define a DataFrame holding P_ij_cs and rands01 in a :vals column, with a boolean :group column distinguishing values from P_ij_cs from those from rands01.
  2. Sort the rows by the :vals column, interleaving the rands01 values.
  3. Determine each index by counting the number of preceding :group == P_ij_cs values for a given :group == rands01 value.
Perf
`DataFrames.jl` should only add a small overhead compared to performing the underlying sort manually.
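The same group-sort idea without DataFrames, to make the counting explicit (`groupsort_positions` is hypothetical and returns linear indices; ties between CDF values and samples occur with probability zero for continuous samples):

```julia
# Tag each value with its group, sort everything once, and for each sampled
# value count how many CDF values precede it; that count + 1 is its linear index.
function groupsort_positions(P_ij_cs, rands01)
    vals  = vcat(P_ij_cs, rands01)
    group = vcat(falses(length(P_ij_cs)), trues(length(rands01)))  # true ⇔ rands01
    pos   = zeros(Int, length(rands01))
    npred = 0                            # running count of CDF values seen so far
    for k in sortperm(vals)
        if group[k]
            pos[k - length(P_ij_cs)] = npred + 1
        else
            npred += 1
        end
    end
    pos
end
```

For non-tied inputs this agrees with a per-sample `searchsortedfirst` into the CDF, but performs a single sort over both vectors instead of one binary search per sample.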
source

Bibliography

[1]
A. Kundu and P. Drineas. A Note on Randomized Element-wise Matrix Sparsification (2014). Accessed on Jan 6, 2025.