Interface

This section documents the interface for structured matrices, that is, the operations they must implement to work with SINGD. It serves internal purposes only and is aimed at developers who wish to add a new structured matrix class that cannot be constructed with one of the available templates.

singd.structures.base.StructuredMatrix

StructuredMatrix()

Bases: ABC

Base class for structured matrices closed under addition and multiplication.

This base class defines the functions that need to be implemented to support a new structured matrix class with SINGD.

At a minimum, adding a new structured matrix class requires implementing the following methods:

  • to_dense
  • from_dense

All other operations then fall back to naive implementations that internally reconstruct unstructured dense matrices. By default, these operations trigger a warning, which can be used to identify methods that could be implemented more efficiently by exploiting the structure.

Note

Tensors that represent parts of the matrix need to be registered with the register_tensor method. This is similar to the mechanism in PyTorch modules, which provide a register_parameter method, and it allows many operations to be supported out of the box (see the sketch below).
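
For orientation, here is a minimal sketch of such a subclass, assuming a hypothetical diagonal structure. The class name DiagonalSketch and its constructor are illustrative only and not part of SINGD; the sketch is reused in the usage examples further down this page.

from __future__ import annotations

import torch
from torch import Tensor

from singd.structures.base import StructuredMatrix


class DiagonalSketch(StructuredMatrix):
    """Hypothetical structured matrix that keeps only the diagonal."""

    def __init__(self, diag: Tensor) -> None:
        super().__init__()
        # Register the tensor so that generic operations (add_, mul_, all_reduce, ...)
        # can find it via named_tensors.
        self.register_tensor(diag, "_diag")

    @classmethod
    def from_dense(cls, sym_mat: Tensor) -> DiagonalSketch:
        # Keep only the diagonal, discarding non-zero off-diagonal entries.
        return cls(sym_mat.diag())

    def to_dense(self) -> Tensor:
        return torch.diag(self._diag)

All remaining operations (addition, matrix multiplication, eye, zeros, ...) fall back to the naive dense implementations of this base class and emit performance warnings until they are overridden.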

Attributes:

  • WARN_NAIVE (bool) –

    Warn the user if a method falls back to a naive implementation of this base class. This indicates a method that should be implemented to save memory and run time by considering the represented structure. Default: True.

  • WARN_NAIVE_EXCEPTIONS (Set[str]) –

    Set of methods that should not trigger a warning even if WARN_NAIVE is True. This can be used to silence warnings for methods for which it is too complicated to leverage a specific structure and which should therefore call out to this class's implementation without performance warnings.

Initialize the structured matrix.

Source code in singd/structures/base.py
def __init__(self) -> None:
    """Initialize the structured matrix."""
    self._tensor_names: List[str] = []

__add__

__add__(other: StructuredMatrix) -> StructuredMatrix

Add another matrix of the same structure.

Parameters:

  • other (StructuredMatrix) –

    Another structured matrix which will be added.

Returns:

  • StructuredMatrix

    A structured matrix resulting from the addition.

Source code in singd/structures/base.py
def __add__(self, other: StructuredMatrix) -> StructuredMatrix:
    """Add another matrix of same structure.

    Args:
        other: Another structured matrix which will be added.

    Returns:
        A structured matrix resulting from the addition.
    """
    self._warn_naive_implementation("__add__")
    return self.from_dense(self.to_dense() + other.to_dense())

__matmul__

__matmul__(other: Union[StructuredMatrix, Tensor]) -> Union[StructuredMatrix, Tensor]

Multiply onto a matrix (@ operator).

Parameters:

  • other (Union[StructuredMatrix, Tensor]) –

    Another matrix which will be multiplied onto. Can be represented by a PyTorch tensor or a structured matrix.

Returns:

  • Union[StructuredMatrix, Tensor]

    Result of the multiplication. If a PyTorch tensor was passed as argument, the result will be a PyTorch tensor. Otherwise, it will be a structured matrix.

Source code in singd/structures/base.py
def __matmul__(
    self, other: Union[StructuredMatrix, Tensor]
) -> Union[StructuredMatrix, Tensor]:
    """Multiply onto a matrix ([@ operator](https://peps.python.org/pep-0465/)).

    Args:
        other: Another matrix which will be multiplied onto. Can be represented
            by a PyTorch tensor or a structured matrix.

    Returns:
        Result of the multiplication. If a PyTorch tensor was passed as argument,
        the result will be a PyTorch tensor. Otherwise, it will be a structured
        matrix.
    """
    self._warn_naive_implementation("__matmul__")

    dense = self.to_dense()
    if isinstance(other, Tensor):
        return supported_matmul(dense, other)

    other_dense = other.to_dense()
    return self.from_dense(supported_matmul(dense, other_dense))

__mul__

__mul__(other: float) -> StructuredMatrix

Multiply with a scalar.

Parameters:

  • other (float) –

    A scalar that will be multiplied onto the structured matrix.

Returns:

  • StructuredMatrix

    The structured matrix, multiplied by the scalar.

Source code in singd/structures/base.py
def __mul__(self, other: float) -> StructuredMatrix:
    """Multiply with a scalar.

    Args:
        other: A scalar that will be multiplied onto the structured matrix.

    Returns:
        The structured matrix, multiplied by the scalar.
    """
    self._warn_naive_implementation("__mul__")
    return self.from_dense(self.to_dense() * other)
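
Taken together, the arithmetic operators fall back to dense computations via to_dense and from_dense. A usage sketch, assuming the hypothetical DiagonalSketch subclass from the note above:

import torch

A = DiagonalSketch.from_dense(2.0 * torch.eye(3))
B = DiagonalSketch.from_dense(3.0 * torch.eye(3))

C = A + B    # structured result (naive fallback: dense addition, then re-extraction)
D = A * 0.5  # structured result, scaled by the scalar

# `@` returns a dense tensor if the right operand is a tensor ...
dense_result = A @ torch.randn(3, 4)
assert isinstance(dense_result, torch.Tensor)

# ... and a structured matrix if the right operand is structured.
structured_result = A @ B
assert isinstance(structured_result, DiagonalSketch)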

add_

add_(other: StructuredMatrix, alpha: float = 1.0) -> StructuredMatrix

In-place addition with another structured matrix.

Parameters:

  • other (StructuredMatrix) –

    Another structured matrix which will be added in-place.

  • alpha (float, default: 1.0 ) –

    A scalar that will be multiplied onto other before adding it. Default: 1.0.

Returns:

  • StructuredMatrix

    Reference to the in-place updated matrix.

Source code in singd/structures/base.py
def add_(self, other: StructuredMatrix, alpha: float = 1.0) -> StructuredMatrix:
    """In-place addition with another structured matrix.

    Args:
        other: Another structured matrix which will be added in-place.
        alpha: A scalar that will be multiplied onto `other` before adding it.
            Default: `1.0`.

    Returns:
        Reference to the in-place updated matrix.
    """
    for (_, tensor), (_, tensor_other) in zip(
        self.named_tensors(), other.named_tensors()
    ):
        tensor.add_(tensor_other, alpha=alpha)

    return self
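
A typical use is an exponential-moving-average style update of a running quantity, sketched here with the hypothetical DiagonalSketch from above (mul_ is documented further down this page):

import torch

running = DiagonalSketch.from_dense(torch.eye(4))
update = DiagonalSketch.from_dense(2.0 * torch.eye(4))

# In-place step: running <- 0.9 * running + 0.1 * update.
running.mul_(0.9).add_(update, alpha=0.1)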

all_reduce

all_reduce(op: dist.ReduceOp = dist.ReduceOp.AVG, group: Union[dist.ProcessGroup, None] = None, async_op: bool = False) -> Union[None, Tuple[torch._C.Future, ...]]

Reduce the structured matrix across all devices.

This method only has to be implemented to support distributed data parallel training.

Parameters:

  • op (ReduceOp, default: AVG ) –

    The reduction operation to perform (default: dist.ReduceOp.AVG).

  • group (Union[ProcessGroup, None], default: None ) –

    The process group to work on. If None, the default process group will be used.

  • async_op (bool, default: False ) –

    If True, this function will return a torch.distributed.Future object. Otherwise, it will block until the reduction completes (default: False).

Returns:

  • Union[None, Tuple[Future, ...]]

    If async_op is True, a (tuple of) torch.distributed.Future object(s), else None.

Source code in singd/structures/base.py
def all_reduce(
    self,
    op: dist.ReduceOp = dist.ReduceOp.AVG,
    group: Union[dist.ProcessGroup, None] = None,
    async_op: bool = False,
) -> Union[None, Tuple[torch._C.Future, ...]]:
    """Reduce the structured matrix across all devices.

    This method only has to be implemented to support distributed data
    parallel training.

    Args:
        op: The reduction operation to perform (default: `dist.ReduceOp.AVG`).
        group: The process group to work on. If `None`, the default process group
            will be used.
        async_op: If `True`, this function will return a
            `torch.distributed.Future` object.
            Otherwise, it will block until the reduction completes
            (default: `False`).

    Returns:
        If `async_op` is `True`, a (tuple of) `torch.distributed.Future`
        object(s), else `None`.
    """
    handles = []
    for _, tensor in self.named_tensors():
        tensor = tensor.contiguous()
        if async_op:
            handles.append(
                dist.all_reduce(tensor, op=op, group=group, async_op=True)
            )
        else:
            dist.all_reduce(tensor, op=op, group=group, async_op=False)
    if async_op:
        return tuple(handles)
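
A sketch of how this could be used in data-parallel training. It assumes that a default process group has already been initialized (for example via torch.distributed.init_process_group in a torchrun-launched script) and reuses the hypothetical DiagonalSketch from above:

import torch
import torch.distributed as dist

K = DiagonalSketch.from_dense(torch.eye(3))

# Blocking average across all ranks.
K.all_reduce(op=dist.ReduceOp.AVG)

# Non-blocking variant: wait on the returned handles before using `K` again.
handles = K.all_reduce(async_op=True)
for handle in handles:
    handle.wait()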

average_trace

average_trace() -> Tensor

Compute the average trace of the represented matrix.

Returns:

  • Tensor

    The average trace of the represented matrix.

Source code in singd/structures/base.py
def average_trace(self) -> Tensor:
    """Compute the average trace of the represented matrix.

    Returns:
        The average trace of the represented matrix.
    """
    self._warn_naive_implementation("trace")
    return self.to_dense().diag().mean()

diag_add_

diag_add_(value: float) -> StructuredMatrix

In-place add a value to the diagonal of the represented matrix.

Parameters:

  • value (float) –

    Value to add to the diagonal.

Returns:

  • StructuredMatrix

    A reference to the updated matrix.

Source code in singd/structures/base.py
def diag_add_(self, value: float) -> StructuredMatrix:
    """In-place add a value to the diagonal of the represented matrix.

    Args:
        value: Value to add to the diagonal.

    Returns:
        A reference to the updated matrix.
    """
    self._warn_naive_implementation("diag_add_")
    dense = self.to_dense()
    diag_add_(dense, value)

    # NOTE `self` is immutable, so we have to update its state with the following
    # hack (otherwise, the call `a.diag_add_(b)` will not modify `a`). See
    # https://stackoverflow.com/a/37658673 and https://stackoverflow.com/q/1015592.
    new = self.from_dense(dense)
    self.__dict__.update(new.__dict__)
    return self
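
For example, Tikhonov-style damping can be added in place (sketch, assuming the hypothetical DiagonalSketch from above):

import torch

K = DiagonalSketch.from_dense(4.0 * torch.eye(3))
K.diag_add_(1e-3)  # K now represents 4.001 * I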

eye classmethod

eye(dim: int, dtype: Union[torch.dtype, None] = None, device: Union[torch.device, None] = None) -> StructuredMatrix

Create a structured matrix representing the identity matrix.

Parameters:

  • dim (int) –

    Dimension of the (square) matrix.

  • dtype (Union[dtype, None], default: None ) –

    Optional data type of the matrix. If not specified, uses the default tensor type.

  • device (Union[device, None], default: None ) –

    Optional device of the matrix. If not specified, uses the default tensor type.

Returns:

  • StructuredMatrix

    A structured matrix representing the identity matrix.

Source code in singd/structures/base.py
@classmethod
def eye(
    cls,
    dim: int,
    dtype: Union[torch.dtype, None] = None,
    device: Union[torch.device, None] = None,
) -> StructuredMatrix:
    """Create a structured matrix representing the identity matrix.

    Args:
        dim: Dimension of the (square) matrix.
        dtype: Optional data type of the matrix. If not specified, uses the default
            tensor type.
        device: Optional device of the matrix. If not specified, uses the default
            tensor type.

    Returns:
        A structured matrix representing the identity matrix.
    """
    cls._warn_naive_implementation("eye")
    return cls.from_dense(supported_eye(dim, dtype=dtype, device=device))

from_dense abstractmethod classmethod

from_dense(sym_mat: Tensor) -> StructuredMatrix

Extract the represented structure from a dense symmetric matrix.

This will discard elements that are not part of the structure, even if they are non-zero.

Warning

We do not verify internally whether sym_mat is symmetric.

Parameters:

  • sym_mat (Tensor) –

    A symmetric dense matrix which will be converted into a structured one.

Returns:

  • StructuredMatrix

    Structured matrix.

Raises:

  • NotImplementedError

    Must be implemented by a child class.

Source code in singd/structures/base.py
@classmethod
@abstractmethod
def from_dense(cls, sym_mat: Tensor) -> StructuredMatrix:
    """Extract the represented structure from a dense symmetric matrix.

    This will discard elements that are not part of the structure, even if they
    are non-zero.

    Warning:
        We do not verify whether `sym_mat` is symmetric internally.

    Args:
        sym_mat: A symmetric dense matrix which will be converted into a structured
            one.

    Returns:
        Structured matrix.

    Raises:
        NotImplementedError: Must be implemented by a child class.
    """
    raise NotImplementedError

from_inner

from_inner(X: Union[Tensor, None] = None) -> StructuredMatrix

Extract the represented structure from self.T @ X @ X^T @ self.

We can recycle terms by writing self.T @ X @ X^T @ self as S @ S^T with S := self.T @ X.

Parameters:

  • X (Union[Tensor, None], default: None ) –

    Optional arbitrary 2d tensor. If None, X = I will be used.

Returns:

  • StructuredMatrix

    The structured matrix extracted from self.T @ X @ X^T @ self.

Source code in singd/structures/base.py
def from_inner(self, X: Union[Tensor, None] = None) -> StructuredMatrix:
    """Extract the represented structure from `self.T @ X @ X^T @ self`.

    We can recycle terms by writing `self.T @ X @ X^T @ self` as `S @ S^T`
    with `S := self.T @ X`.

    Args:
        X: Optional arbitrary 2d tensor. If `None`, `X = I` will be used.

    Returns:
        The structured matrix extracted from `self.T @ X @ X^T @ self`.
    """
    self._warn_naive_implementation("from_inner")
    S_dense = self.to_dense().T if X is None else self.rmatmat(X)
    return self.from_dense(supported_matmul(S_dense, S_dense.T))
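
The recycling works because self.T @ X @ X^T @ self equals S @ S^T with S := self.T @ X, so a single intermediate product suffices. A small numerical check, assuming the hypothetical DiagonalSketch from above:

import torch

A = DiagonalSketch.from_dense(torch.diag(torch.tensor([1.0, 2.0, 3.0])))
X = torch.randn(3, 5)

dense = A.to_dense()
full = dense.T @ X @ X.T @ dense            # the quantity whose structure is extracted
expected = DiagonalSketch.from_dense(full)  # keep only the represented structure

assert torch.allclose(A.from_inner(X).to_dense(), expected.to_dense(), atol=1e-6)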

from_inner2

from_inner2(XXT: Tensor) -> StructuredMatrix

Extract the represented structure from self.T @ XXT @ self.

Parameters:

  • XXT (Tensor) –

    2d square symmetric matrix.

Returns:

  • StructuredMatrix

    The structured matrix extracted from self.T @ XXT @ self.

Source code in singd/structures/base.py
def from_inner2(self, XXT: Tensor) -> StructuredMatrix:
    """Extract the represented structure from `self.T @ XXT @ self`.

    Args:
        XXT: 2d square symmetric matrix.

    Returns:
        The structured matrix extracted from `self.T @ XXT @ self`.
    """
    self._warn_naive_implementation("from_inner2")
    dense = self.to_dense()
    return self.from_dense(supported_matmul(dense.T, XXT, dense))

infinity_vector_norm

infinity_vector_norm() -> Tensor

Compute the infinity vector norm.

The infinity vector norm is the absolute value of the largest entry. Note that this is different from the infinity matrix norm; compare torch.linalg.vector_norm and torch.linalg.matrix_norm.

Returns:

  • Tensor

    The matrix's infinity vector norm.

Source code in singd/structures/base.py
def infinity_vector_norm(self) -> Tensor:
    """Compute the infinity vector norm.

    The infinity vector norm is the absolute value of the largest entry.
    Note that this is different from the infinity matrix norm, compare
    [here](https://pytorch.org/docs/stable/generated/torch.linalg.vector_norm.html)
    and
    [here](https://pytorch.org/docs/stable/generated/torch.linalg.matrix_norm.html).

    Returns:
        The matrix's infinity vector norm.
    """
    # NOTE `.max` can only be called on tensors with non-zero shape
    return max(t.abs().max() for _, t in self.named_tensors() if t.numel() > 0)
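
A quick illustration of the distinction in plain PyTorch, followed by the structured version (hypothetical DiagonalSketch from above):

import torch

M = torch.tensor([[1.0, -2.0], [-2.0, 4.0]])

# Infinity *vector* norm: largest absolute entry (what this method computes).
torch.linalg.vector_norm(M, ord=float("inf"))  # 4.0

# Infinity *matrix* norm: largest absolute row sum, a different quantity.
torch.linalg.matrix_norm(M, ord=float("inf"))  # |-2| + |4| = 6.0

# On a structured matrix, only the registered tensors enter the maximum.
A = DiagonalSketch.from_dense(M)  # keeps the diagonal [1.0, 4.0]
assert A.infinity_vector_norm() == 4.0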

mul_

mul_(value: float) -> StructuredMatrix

In-place multiplication with a scalar.

Parameters:

  • value (float) –

    A scalar that will be multiplied onto the structured matrix.

Returns:

  • StructuredMatrix

    Reference to the in-place updated matrix.

Source code in singd/structures/base.py
def mul_(self, value: float) -> StructuredMatrix:
    """In-place multiplication with a scalar.

    Args:
        value: A scalar that will be multiplied onto the structured matrix.

    Returns:
        Reference to the in-place updated matrix.
    """
    for _, tensor in self.named_tensors():
        tensor.mul_(value)

    return self

named_tensors

named_tensors() -> Iterator[Tuple[str, Tensor]]

Yield all tensors that represent the matrix and their names.

Yields:

  • Tuple[str, Tensor]

    A tuple of the tensor's name and the tensor itself.

Source code in singd/structures/base.py
def named_tensors(self) -> Iterator[Tuple[str, Tensor]]:
    """Yield all tensors that represent the matrix and their names.

    Yields:
        A tuple of the tensor's name and the tensor itself.
    """
    for name in self._tensor_names:
        yield name, getattr(self, name)

register_tensor

register_tensor(tensor: Tensor, name: str) -> None

Register a tensor that represents a part of the matrix structure.

Parameters:

  • tensor (Tensor) –

    A tensor that represents a part of the matrix structure.

  • name (str) –

    A name for the tensor. The tensor will then be available as an attribute of that name, i.e. via getattr(self, name).

Raises:

  • ValueError

    If the name is already in use.

Source code in singd/structures/base.py
def register_tensor(self, tensor: Tensor, name: str) -> None:
    """Register a tensor that represents a part of the matrix structure.

    Args:
        tensor: A tensor that represents a part of the matrix structure.
        name: A name for the tensor. The tensor will be available under
            `self.name`.

    Raises:
        ValueError: If the name is already in use.
    """
    if hasattr(self, name):
        raise ValueError(f"Variable name {name!r} is already in use.")

    setattr(self, name, tensor)
    self._tensor_names.append(name)

rmatmat

rmatmat(mat: Tensor) -> Tensor

Multiply the structured matrix's transpose onto a matrix (self.T @ mat).

Parameters:

  • mat (Tensor) –

    A dense matrix that will be multiplied onto.

Returns:

  • Tensor

    A dense PyTorch tensor resulting from the multiplication.

Source code in singd/structures/base.py
def rmatmat(self, mat: Tensor) -> Tensor:
    """Multiply the structured matrix's transpose onto a matrix (`self.T @ mat`).

    Args:
        mat: A dense matrix that will be multiplied onto.

    Returns:
        A dense PyTorch tensor resulting from the multiplication.
    """
    self._warn_naive_implementation("rmatmat")
    return supported_matmul(self.to_dense().T, mat)

to_dense abstractmethod

to_dense() -> Tensor

Return a dense tensor representing the structured matrix.

Returns:

  • Tensor

    A dense PyTorch tensor representing the matrix.

Raises:

  • NotImplementedError

    Must be implemented by a child class.

Source code in singd/structures/base.py
@abstractmethod
def to_dense(self) -> Tensor:
    """Return a dense tensor representing the structured matrix.

    Returns:
        A dense PyTorch tensor representing the matrix.

    Raises:
        NotImplementedError: Must be implemented by a child class.
    """
    raise NotImplementedError

zeros classmethod

zeros(dim: int, dtype: Union[torch.dtype, None] = None, device: Union[torch.device, None] = None) -> StructuredMatrix

Create a structured matrix representing the zero matrix.

Parameters:

  • dim (int) –

    Dimension of the (square) matrix.

  • dtype (Union[dtype, None], default: None ) –

    Optional data type of the matrix. If not specified, uses the default tensor type.

  • device (Union[device, None], default: None ) –

    Optional device of the matrix. If not specified, uses the default tensor type.

Returns:

  • StructuredMatrix

    A structured matrix representing the zero matrix.

Source code in singd/structures/base.py
@classmethod
def zeros(
    cls,
    dim: int,
    dtype: Union[torch.dtype, None] = None,
    device: Union[torch.device, None] = None,
) -> StructuredMatrix:
    """Create a structured matrix representing the zero matrix.

    Args:
        dim: Dimension of the (square) matrix.
        dtype: Optional data type of the matrix. If not specified, uses the default
            tensor type.
        device: Optional device of the matrix. If not specified, uses the default
            tensor type.

    Returns:
        A structured matrix representing the zero matrix.
    """
    cls._warn_naive_implementation("zero")
    return cls.from_dense(zeros((dim, dim), dtype=dtype, device=device))
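
A small usage sketch for the two constructors, again assuming the hypothetical DiagonalSketch from above:

import torch

K = DiagonalSketch.eye(10, dtype=torch.float32)
m_K = DiagonalSketch.zeros(10, dtype=torch.float32)

assert torch.allclose(K.to_dense(), torch.eye(10))
assert torch.allclose(m_K.to_dense(), torch.zeros(10, 10))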