Interface

This section documents the interface for structured matrices, that is, the operations they must implement to work with SINGD. It serves internal purposes only and is aimed at developers who wish to add a new structured matrix class that cannot be constructed with one of the available templates.

singd.structures.base.StructuredMatrix

StructuredMatrix()

Bases: ABC

Base class for structured matrices closed under addition and multiplication.

This base class defines the functions that need to be implemented to support a new structured matrix class with SINGD.

At a minimum, adding a new structured matrix class requires implementing the following methods:

  • to_dense
  • from_dense

All other operations then fall back to naive implementations that internally reconstruct unstructured dense matrices. By default, these operations trigger a warning, which can be used to identify methods that could be implemented more efficiently by exploiting the structure.

Note

Tensors that represent parts of the matrix need to be registered with the register_tensor method. This is similar to the mechanism in PyTorch modules, which provide a register_parameter method, and it allows many operations to be supported out of the box (see the sketch below).
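
For orientation, here is a minimal sketch of such a subclass, assuming a hypothetical diagonal structure. The class name DiagonalSketch and its constructor are illustrative only and not part of SINGD; the sketch is reused in the usage examples further down this page.

from __future__ import annotations

import torch
from torch import Tensor

from singd.structures.base import StructuredMatrix


class DiagonalSketch(StructuredMatrix):
    """Hypothetical structured matrix that keeps only the diagonal."""

    def __init__(self, diag: Tensor) -> None:
        super().__init__()
        # Register the tensor so that generic operations (add_, mul_, all_reduce, ...)
        # can find it via named_tensors.
        self.register_tensor(diag, "_diag")

    @classmethod
    def from_dense(cls, sym_mat: Tensor) -> DiagonalSketch:
        # Keep only the diagonal, discarding non-zero off-diagonal entries.
        return cls(sym_mat.diag())

    def to_dense(self) -> Tensor:
        return torch.diag(self._diag)

All remaining operations (addition, matrix multiplication, eye, zeros, ...) fall back to the naive dense implementations of this base class and emit performance warnings until they are overridden.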

Attributes:

  • WARN_NAIVE (bool) –

    Warn the user if a method falls back to a naive implementation of this base class. This indicates a method that should be implemented to save memory and run time by considering the represented structure. Default: True.

  • WARN_NAIVE_EXCEPTIONS (Set[str]) –

    Set of methods that should not trigger a warning even if WARN_NAIVE is True. This can be used to silence warnings for methods for which it is too complicated to leverage a specific structure and which should therefore call out to this class's implementation without performance warnings.

Initialize the structured matrix.

Source code in singd/structures/base.py
def __init__(self) -> None:
    """Initialize the structured matrix."""
    self._tensor_names: List[str] = []

__add__

__add__(other: StructuredMatrix) -> StructuredMatrix

Add another matrix of the same structure.

Parameters:

  • other (StructuredMatrix) –

    Another structured matrix which will be added.

Returns:

  • StructuredMatrix

    A structured matrix resulting from the addition.

Source code in singd/structures/base.py
def __add__(self, other: StructuredMatrix) -> StructuredMatrix:
    """Add another matrix of same structure.

    Args:
        other: Another structured matrix which will be added.

    Returns:
        A structured matrix resulting from the addition.
    """
    self._warn_naive_implementation("__add__")
    return self.from_dense(self.to_dense() + other.to_dense())

__matmul__

__matmul__(other: Union[StructuredMatrix, Tensor]) -> Union[StructuredMatrix, Tensor]

Multiply onto a matrix (@ operator).

Parameters:

  • other (Union[StructuredMatrix, Tensor]) –

    Another matrix which will be multiplied onto. Can be represented by a PyTorch tensor or a structured matrix.

Returns:

  • Union[StructuredMatrix, Tensor]

    Result of the multiplication. If a PyTorch tensor was passed as argument, the result will be a PyTorch tensor. Otherwise, it will be a structured matrix.

Source code in singd/structures/base.py
def __matmul__(
    self, other: Union[StructuredMatrix, Tensor]
) -> Union[StructuredMatrix, Tensor]:
    """Multiply onto a matrix ([@ operator](https://peps.python.org/pep-0465/)).

    Args:
        other: Another matrix which will be multiplied onto. Can be represented
            by a PyTorch tensor or a structured matrix.

    Returns:
        Result of the multiplication. If a PyTorch tensor was passed as argument,
        the result will be a PyTorch tensor. Otherwise, it will be a structured
        matrix.
    """
    self._warn_naive_implementation("__matmul__")

    dense = self.to_dense()
    if isinstance(other, Tensor):
        return supported_matmul(dense, other)

    other_dense = other.to_dense()
    return self.from_dense(supported_matmul(dense, other_dense))

__mul__

__mul__(other: float) -> StructuredMatrix

Multiply with a scalar.

Parameters:

  • other (float) –

    A scalar that will be multiplied onto the structured matrix.

Returns:

  • StructuredMatrix

    The structured matrix, multiplied by the scalar.

Source code in singd/structures/base.py
def __mul__(self, other: float) -> StructuredMatrix:
    """Multiply with a scalar.

    Args:
        other: A scalar that will be multiplied onto the structured matrix.

    Returns:
        The structured matrix, multiplied by the scalar.
    """
    self._warn_naive_implementation("__mul__")
    return self.from_dense(self.to_dense() * other)
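
Taken together, the arithmetic operators fall back to dense computations via to_dense and from_dense. A usage sketch, assuming the hypothetical DiagonalSketch subclass from the note above:

import torch

A = DiagonalSketch.from_dense(2.0 * torch.eye(3))
B = DiagonalSketch.from_dense(3.0 * torch.eye(3))

C = A + B    # structured result (naive fallback: dense addition, then re-extraction)
D = A * 0.5  # structured result, scaled by the scalar

# `@` returns a dense tensor if the right operand is a tensor ...
dense_result = A @ torch.randn(3, 4)
assert isinstance(dense_result, torch.Tensor)

# ... and a structured matrix if the right operand is structured.
structured_result = A @ B
assert isinstance(structured_result, DiagonalSketch)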

add_

add_(other: StructuredMatrix, alpha: float = 1.0) -> StructuredMatrix

In-place addition with another structured matrix.

Parameters:

  • other (StructuredMatrix) –

    Another structured matrix which will be added in-place.

  • alpha (float, default: 1.0 ) –

    A scalar that will be multiplied onto other before adding it. Default: 1.0.

Returns:

  • StructuredMatrix

    Reference to the in-place updated matrix.

Source code in singd/structures/base.py
def add_(self, other: StructuredMatrix, alpha: float = 1.0) -> StructuredMatrix:
    """In-place addition with another structured matrix.

    Args:
        other: Another structured matrix which will be added in-place.
        alpha: A scalar that will be multiplied onto `other` before adding it.
            Default: `1.0`.

    Returns:
        Reference to the in-place updated matrix.
    """
    for (_, tensor), (_, tensor_other) in zip(
        self.named_tensors(), other.named_tensors()
    ):
        tensor.add_(tensor_other, alpha=alpha)

    return self
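
A typical use is an exponential-moving-average style update of a running quantity, sketched here with the hypothetical DiagonalSketch from above (mul_ is documented further down this page):

import torch

running = DiagonalSketch.from_dense(torch.eye(4))
update = DiagonalSketch.from_dense(2.0 * torch.eye(4))

# In-place step: running <- 0.9 * running + 0.1 * update.
running.mul_(0.9).add_(update, alpha=0.1)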

all_reduce

all_reduce(op: dist.ReduceOp = dist.ReduceOp.AVG, group: Union[dist.ProcessGroup, None] = None, async_op: bool = False) -> Union[None, Tuple[torch._C.Future, ...]]

Reduce the structured matrix across all devices.

This method only has to be implemented to support distributed data parallel training.

Parameters:

  • op (ReduceOp, default: AVG ) –

    The reduction operation to perform (default: dist.ReduceOp.AVG).

  • group (Union[ProcessGroup, None], default: None ) –

    The process group to work on. If None, the default process group will be used.

  • async_op (bool, default: False ) –

    If True, this function will return a torch.distributed.Future object. Otherwise, it will block until the reduction completes (default: False).

Returns:

  • Union[None, Tuple[Future, ...]]

    If async_op is True, a (tuple of) torch.distributed.Future object(s), else None.

Source code in singd/structures/base.py
def all_reduce(
    self,
    op: dist.ReduceOp = dist.ReduceOp.AVG,
    group: Union[dist.ProcessGroup, None] = None,
    async_op: bool = False,
) -> Union[None, Tuple[torch._C.Future, ...]]:
    """Reduce the structured matrix across all devices.

    This method only has to be implemented to support distributed data
    parallel training.

    Args:
        op: The reduction operation to perform (default: `dist.ReduceOp.AVG`).
        group: The process group to work on. If `None`, the default process group
            will be used.
        async_op: If `True`, this function will return a
            `torch.distributed.Future` object.
            Otherwise, it will block until the reduction completes
            (default: `False`).

    Returns:
        If `async_op` is `True`, a (tuple of) `torch.distributed.Future`
        object(s), else `None`.
    """
    handles = []
    for _, tensor in self.named_tensors():
        tensor = tensor.contiguous()
        if async_op:
            handles.append(
                dist.all_reduce(tensor, op=op, group=group, async_op=True)
            )
        else:
            dist.all_reduce(tensor, op=op, group=group, async_op=False)
    if async_op:
        return tuple(handles)
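
A sketch of how this could be used in data-parallel training. It assumes that a default process group has already been initialized (for example via torch.distributed.init_process_group in a torchrun-launched script) and reuses the hypothetical DiagonalSketch from above:

import torch
import torch.distributed as dist

K = DiagonalSketch.from_dense(torch.eye(3))

# Blocking average across all ranks.
K.all_reduce(op=dist.ReduceOp.AVG)

# Non-blocking variant: wait on the returned handles before using `K` again.
handles = K.all_reduce(async_op=True)
for handle in handles:
    handle.wait()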

average_trace

average_trace() -> Tensor

Compute the average trace of the represented matrix.

Returns:

  • Tensor

    The average trace of the represented matrix.

Source code in singd/structures/base.py
def average_trace(self) -> Tensor:
    """Compute the average trace of the represented matrix.

    Returns:
        The average trace of the represented matrix.
    """
    self._warn_naive_implementation("trace")
    return self.to_dense().diag().mean()

diag_add_

diag_add_(value: float) -> StructuredMatrix

In-place add a value to the diagonal of the represented matrix.

Parameters:

  • value (float) –

    Value to add to the diagonal.

Returns:

  • StructuredMatrix

    A reference to the updated matrix.

Source code in singd/structures/base.py
def diag_add_(self, value: float) -> StructuredMatrix:
    """In-place add a value to the diagonal of the represented matrix.

    Args:
        value: Value to add to the diagonal.

    Returns:
        A reference to the updated matrix.
    """
    self._warn_naive_implementation("diag_add_")
    dense = self.to_dense()
    diag_add_(dense, value)

    # NOTE `self` is immutable, so we have to update its state with the following
    # hack (otherwise, the call `a.diag_add_(b)` will not modify `a`). See
    # https://stackoverflow.com/a/37658673 and https://stackoverflow.com/q/1015592.
    new = self.from_dense(dense)
    self.__dict__.update(new.__dict__)
    return self
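
For example, Tikhonov-style damping can be added in place (sketch, assuming the hypothetical DiagonalSketch from above):

import torch

K = DiagonalSketch.from_dense(4.0 * torch.eye(3))
K.diag_add_(1e-3)  # K now represents 4.001 * I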

eye classmethod

eye(dim: int, dtype: Union[torch.dtype, None] = None, device: Union[torch.device, None] = None) -> StructuredMatrix

Create a structured matrix representing the identity matrix.

Parameters:

  • dim (int) –

    Dimension of the (square) matrix.

  • dtype (Union[dtype, None], default: None ) –

    Optional data type of the matrix. If not specified, uses the default tensor type.

  • device (Union[device, None], default: None ) –

    Optional device of the matrix. If not specified, uses the default tensor type.

Returns:

  • StructuredMatrix

    A structured matrix representing the identity matrix.

Source code in singd/structures/base.py
@classmethod
def eye(
    cls,
    dim: int,
    dtype: Union[torch.dtype, None] = None,
    device: Union[torch.device, None] = None,
) -> StructuredMatrix:
    """Create a structured matrix representing the identity matrix.

    Args:
        dim: Dimension of the (square) matrix.
        dtype: Optional data type of the matrix. If not specified, uses the default
            tensor type.
        device: Optional device of the matrix. If not specified, uses the default
            tensor type.

    Returns:
        A structured matrix representing the identity matrix.
    """
    cls._warn_naive_implementation("eye")
    return cls.from_dense(supported_eye(dim, dtype=dtype, device=device))

from_dense abstractmethod classmethod

from_dense(sym_mat: Tensor) -> StructuredMatrix

Extract the represented structure from a dense symmetric matrix.

This will discard elements that are not part of the structure, even if they are non-zero.

Warning

We do not verify internally whether sym_mat is symmetric.

Parameters:

  • sym_mat (Tensor) –

    A symmetric dense matrix which will be converted into a structured one.

Returns:

  • StructuredMatrix

    Structured matrix.

Raises:

  • NotImplementedError

    Must be implemented by a child class.

Source code in singd/structures/base.py
@classmethod
@abstractmethod
def from_dense(cls, sym_mat: Tensor) -> StructuredMatrix:
    """Extract the represented structure from a dense symmetric matrix.

    This will discard elements that are not part of the structure, even if they
    are non-zero.

    Warning:
        We do not verify whether `sym_mat` is symmetric internally.

    Args:
        sym_mat: A symmetric dense matrix which will be converted into a structured
            one.

    Returns:
        Structured matrix.

    Raises:
        NotImplementedError: Must be implemented by a child class.
    """
    raise NotImplementedError

from_inner

from_inner(X: Union[Tensor, None] = None) -> StructuredMatrix

Extract the represented structure from self.T @ X @ X^T @ self.

We can recycle terms by writing self.T @ X @ X^T @ self as S @ S^T with S := self.T @ X.

Parameters:

  • X (Union[Tensor, None], default: None ) –

    Optional arbitrary 2d tensor. If None, X = I will be used.

Returns:

  • StructuredMatrix

    The structured matrix extracted from self.T @ X @ X^T @ self.

Source code in singd/structures/base.py
def from_inner(self, X: Union[Tensor, None] = None) -> StructuredMatrix:
    """Extract the represented structure from `self.T @ X @ X^T @ self`.

    We can recycle terms by writing `self.T @ X @ X^T @ self` as `S @ S^T`
    with `S := self.T @ X`.

    Args:
        X: Optional arbitrary 2d tensor. If `None`, `X = I` will be used.

    Returns:
        The structured matrix extracted from `self.T @ X @ X^T @ self`.
    """
    self._warn_naive_implementation("from_inner")
    S_dense = self.to_dense().T if X is None else self.rmatmat(X)
    return self.from_dense(supported_matmul(S_dense, S_dense.T))
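
The recycling works because self.T @ X @ X^T @ self equals S @ S^T with S := self.T @ X, so a single intermediate product suffices. A small numerical check, assuming the hypothetical DiagonalSketch from above:

import torch

A = DiagonalSketch.from_dense(torch.diag(torch.tensor([1.0, 2.0, 3.0])))
X = torch.randn(3, 5)

dense = A.to_dense()
full = dense.T @ X @ X.T @ dense            # the quantity whose structure is extracted
expected = DiagonalSketch.from_dense(full)  # keep only the represented structure

assert torch.allclose(A.from_inner(X).to_dense(), expected.to_dense(), atol=1e-6)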

from_inner2

from_inner2(XXT: Tensor) -> StructuredMatrix

Extract the represented structure from self.T @ XXT @ self.

Parameters:

  • XXT (Tensor) –

    2d square symmetric matrix.

Returns:

  • StructuredMatrix

    The structured matrix extracted from self.T @ XXT @ self.

Source code in singd/structures/base.py
def from_inner2(self, XXT: Tensor) -> StructuredMatrix:
    """Extract the represented structure from `self.T @ XXT @ self`.

    Args:
        XXT: 2d square symmetric matrix.

    Returns:
        The structured matrix extracted from `self.T @ XXT @ self`.
    """
    self._warn_naive_implementation("from_inner2")
    dense = self.to_dense()
    return self.from_dense(supported_matmul(dense.T, XXT, dense))

infinity_vector_norm

infinity_vector_norm() -> Tensor

Compute the infinity vector norm.

The infinity vector norm is the absolute value of the largest entry. Note that this is different from the infinity matrix norm; compare torch.linalg.vector_norm and torch.linalg.matrix_norm.

Returns:

  • Tensor

    The matrix's infinity vector norm.

Source code in singd/structures/base.py
def infinity_vector_norm(self) -> Tensor:
    """Compute the infinity vector norm.

    The infinity vector norm is the absolute value of the largest entry.
    Note that this is different from the infinity matrix norm, compare
    [here](https://pytorch.org/docs/stable/generated/torch.linalg.vector_norm.html)
    and
    [here](https://pytorch.org/docs/stable/generated/torch.linalg.matrix_norm.html).

    Returns:
        The matrix's infinity vector norm.
    """
    # NOTE `.max` can only be called on tensors with non-zero shape
    return max(t.abs().max() for _, t in self.named_tensors() if t.numel() > 0)
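
A quick illustration of the distinction in plain PyTorch, followed by the structured version (hypothetical DiagonalSketch from above):

import torch

M = torch.tensor([[1.0, -2.0], [-2.0, 4.0]])

# Infinity *vector* norm: largest absolute entry (what this method computes).
torch.linalg.vector_norm(M, ord=float("inf"))  # 4.0

# Infinity *matrix* norm: largest absolute row sum, a different quantity.
torch.linalg.matrix_norm(M, ord=float("inf"))  # |-2| + |4| = 6.0

# On a structured matrix, only the registered tensors enter the maximum.
A = DiagonalSketch.from_dense(M)  # keeps the diagonal [1.0, 4.0]
assert A.infinity_vector_norm() == 4.0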

mul_

mul_(value: float) -> StructuredMatrix

In-place multiplication with a scalar.

Parameters:

  • value (float) –

    A scalar that will be multiplied onto the structured matrix.

Returns:

  • StructuredMatrix

    Reference to the in-place updated matrix.

Source code in singd/structures/base.py
def mul_(self, value: float) -> StructuredMatrix:
    """In-place multiplication with a scalar.

    Args:
        value: A scalar that will be multiplied onto the structured matrix.

    Returns:
        Reference to the in-place updated matrix.
    """
    for _, tensor in self.named_tensors():
        tensor.mul_(value)

    return self

named_tensors

named_tensors() -> Iterator[Tuple[str, Tensor]]

Yield all tensors that represent the matrix and their names.

Yields:

  • Tuple[str, Tensor]

    A tuple of the tensor's name and the tensor itself.

Source code in singd/structures/base.py
def named_tensors(self) -> Iterator[Tuple[str, Tensor]]:
    """Yield all tensors that represent the matrix and their names.

    Yields:
        A tuple of the tensor's name and the tensor itself.
    """
    for name in self._tensor_names:
        yield name, getattr(self, name)

register_tensor

register_tensor(tensor: Tensor, name: str) -> None

Register a tensor that represents a part of the matrix structure.

Parameters:

  • tensor (Tensor) –

    A tensor that represents a part of the matrix structure.

  • name (str) –

    A name for the tensor. The tensor will then be available as an attribute of that name, i.e. via getattr(self, name).

Raises:

  • ValueError

    If the name is already in use.

Source code in singd/structures/base.py
def register_tensor(self, tensor: Tensor, name: str) -> None:
    """Register a tensor that represents a part of the matrix structure.

    Args:
        tensor: A tensor that represents a part of the matrix structure.
        name: A name for the tensor. The tensor will be available under
            `self.name`.

    Raises:
        ValueError: If the name is already in use.
    """
    if hasattr(self, name):
        raise ValueError(f"Variable name {name!r} is already in use.")

    setattr(self, name, tensor)
    self._tensor_names.append(name)

rmatmat

rmatmat(mat: Tensor) -> Tensor

Multiply the structured matrix's transpose onto a matrix (self.T @ mat).

Parameters:

  • mat (Tensor) –

    A dense matrix that will be multiplied onto.

Returns:

  • Tensor

    A dense PyTorch tensor resulting from the multiplication.

Source code in singd/structures/base.py
def rmatmat(self, mat: Tensor) -> Tensor:
    """Multiply the structured matrix's transpose onto a matrix (`self.T @ mat`).

    Args:
        mat: A dense matrix that will be multiplied onto.

    Returns:
        A dense PyTorch tensor resulting from the multiplication.
    """
    self._warn_naive_implementation("rmatmat")
    return supported_matmul(self.to_dense().T, mat)

to_dense abstractmethod

to_dense() -> Tensor

Return a dense tensor representing the structured matrix.

Returns:

  • Tensor

    A dense PyTorch tensor representing the matrix.

Raises:

  • NotImplementedError

    Must be implemented by a child class.

Source code in singd/structures/base.py
@abstractmethod
def to_dense(self) -> Tensor:
    """Return a dense tensor representing the structured matrix.

    Returns:
        A dense PyTorch tensor representing the matrix.

    Raises:
        NotImplementedError: Must be implemented by a child class.
    """
    raise NotImplementedError

zeros classmethod

zeros(dim: int, dtype: Union[torch.dtype, None] = None, device: Union[torch.device, None] = None) -> StructuredMatrix

Create a structured matrix representing the zero matrix.

Parameters:

  • dim (int) –

    Dimension of the (square) matrix.

  • dtype (Union[dtype, None], default: None ) –

    Optional data type of the matrix. If not specified, uses the default tensor type.

  • device (Union[device, None], default: None ) –

    Optional device of the matrix. If not specified, uses the default tensor type.

Returns:

  • StructuredMatrix

    A structured matrix representing the zero matrix.

Source code in singd/structures/base.py
@classmethod
def zeros(
    cls,
    dim: int,
    dtype: Union[torch.dtype, None] = None,
    device: Union[torch.device, None] = None,
) -> StructuredMatrix:
    """Create a structured matrix representing the zero matrix.

    Args:
        dim: Dimension of the (square) matrix.
        dtype: Optional data type of the matrix. If not specified, uses the default
            tensor type.
        device: Optional device of the matrix. If not specified, uses the default
            tensor type.

    Returns:
        A structured matrix representing the zero matrix.
    """
    cls._warn_naive_implementation("zero")
    return cls.from_dense(zeros((dim, dim), dtype=dtype, device=device))
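
A small usage sketch for the two constructors, again assuming the hypothetical DiagonalSketch from above:

import torch

K = DiagonalSketch.eye(10, dtype=torch.float32)
m_K = DiagonalSketch.zeros(10, dtype=torch.float32)

assert torch.allclose(K.to_dense(), torch.eye(10))
assert torch.allclose(m_K.to_dense(), torch.zeros(10, 10))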