Week 4 - Matrix-Vector to Matrix-Matrix Multiplication #
Preparation #
Partitioned Matrix-vector Multiplication #
$$\displaystyle \left( \begin{array}{ c | c | c | c } A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\ A_{1,0} & A_{1,1} & \cdots & A_{1,N-1} \\ \vdots & \vdots & \ddots & \vdots \\ A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,N-1} \end{array} \right) \left( \begin{array}{ c } x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{array} \right) = \left( \begin{array}{ c } A_{0,0} x_{0} + A_{0,1} x_{1} + \cdots + A_{0,N-1} x_{N-1} \\ A_{1,0} x_{0} + A_{1,1} x_{1} + \cdots + A_{1,N-1} x_{N-1} \\ \vdots \\ A_{M-1,0} x_{0} + A_{M-1,1} x_{1} + \cdots + A_{M-1,N-1} x_{N-1} \end{array} \right)$$
Two different algorithms to calculate Matrix-vector Multiplication
- First one, is calculating by rows.
- $\psi_1 := a_{10}^T x_0 + \alpha_{11} x_1 + a_{12}^T x_2 + \psi_1$
- Second one is by columns.
- $y_0 := \chi_1 a_{01} + y_0$
- $\psi_1 := \chi_1 \alpha_{11} + \psi_1$
- $y_2 := \chi_1 a_{21} + y_2$
- First one, is calculating by rows.
Transposing a Partitioned Matrix #
$$\left( \begin{array}{c | c | c | c} A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\ A_{1,0} & A_{1,1} & \cdots & A_{1,N-1} \\ \vdots & \vdots & & \vdots \\ A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,N-1} \end{array} \right)^ T = \left( \begin{array}{c | c | c | c} A_{0,0}^ T & A_{1,0}^ T & \cdots & A_{M-1,0}^ T \\ A_{0,1}^ T & A_{1,1}^ T & \cdots & A_{M-1,1}^ T \\ \vdots & \vdots & & \vdots \\ A_{0,N-1}^ T & A_{1,N-1}^ T & \cdots & A_{M-1,N-1}^ T \end{array} \right).$$
Example
Matrix-Vector Multiplication with Special Matrices #
- When doing calculation, we always concern flops and memops. And later we will prove that, memops is much slower that flops.
- Let’s take an example. If we want calculate $y := A^T x + y$. Normally, we will do, set $B = A^T$, then $y := B x +y$
- Which means, we are going to spend the most time to transpose matrix A to B, and little time computing.
- What we want is, compute with $A^T$ without actually transposing.
- The solution is, we can simply use columns of A for the dot products in the dot product based algorithm for $y := Ax + y$.
Triangular Matrix-Vector Multiplication #
- Let $U \in \mathbb{R}^{n \times n}$ be an upper triangular matrix and $x \in \mathbb{R}^n$ be a vector. Consider
- We notice that $u^T = 0$ (a vector of two zeroes) and hence we need not compute with it.
- Let’s calculate the flops:
- The calculate step: $\psi_1 := u_{11} x_1 + u_{12}^T x_2 + \psi_1$, (without $u_{10}^T x_0$)
- if we set the size of $U_{00}$ to $k$,then the size of $U_{02}$ and $U_{20}$ should be $n - k - 1$.
- flops($u_{11} x_1$) = 1, flops($u_{12}^T x_2$) = $2(n-k-1)$, flops($u_{11} x_1 + u_{12}^T x_2$) = 1
- So the flops = $\sum_{i=0}^k (2 + 2(n - k - 1)) = n*(n+1)$
Symmetric Matrix-Vector Multiplication #
- We purposely chose the matrix on the right to be symmetric. We notice that $a_{10}^T = a_{01}$ , $A_{20}^T = A_{02}$ , and $a_{12}^T = a_{21}$.
- So we just need to change the step of calculation
- By rows:
- from $\psi_1 := a_{10}^T x_0 + \alpha_{11} x_1 + a_{12}^T x_2 + \psi_1$
- to $\psi_1 := a_{01} x_0 + \alpha_{11} x_1 + a_{12}^T x_2 + \psi_1$
- By columns:
- from
- $y_0 := \chi_1 a_{01} + y_0$
- $\psi_1 := \chi_1 \alpha_{11} + \psi_1$
- $y_2 := \chi_1 a_{21} + y_2$
- to
- $y_0 := \chi_1 a_{01} + y_0$
- $\psi_1 := \chi_1 \alpha_{11} + \psi_1$
- $y_2 := \chi_1 (a_{11}^T)^T + y_2$
- from
- By rows:
Composing linear transformations #
- Let $L_A: \mathbb {R}^ k \rightarrow \mathbb {R}^ m$ and $L_ B: \mathbb {R}^ n \rightarrow \mathbb {R}^ k$ both be linear transformations and, for all $x \in \mathbb{R}^n$, define the function $L_C: \mathbb {R}^ n \rightarrow \mathbb {R}^ m$ by $L_ C( x ) = L_A( L_ B( x ) )$. Then $L_ C( x )$ is a linear transformations.
Matrix-matrix multiplication #
$$A B = A \left( \begin{array}{c | c | c | c } b_0 & b_1 & \cdots & b_{n-1} \end{array} \right) = \left( \begin{array}{c | c | c | c } A b_0 & A b_1 & \cdots & A b_{n-1} \end{array} \right).$$
If
- $$C = \begin{bmatrix} \gamma_{0,0} & \gamma_{0,1} & \cdots & \gamma_{0,n-1} \\ \gamma_{1,0} & \gamma_{1,1} & \cdots & \gamma_{1,n-1} \\ \vdots & \vdots & \vdots & \vdots \\ \gamma_{m-1,0} & \gamma_{m-1,1} & \cdots & \gamma_{m-1,n-1} \\ \end{bmatrix},$$
- $$C = \begin{bmatrix} \gamma_{0,0} & \gamma_{0,1} & \cdots & \gamma_{0,n-1} \\ \gamma_{1,0} & \gamma_{1,1} & \cdots & \gamma_{1,n-1} \\ \vdots & \vdots & \vdots & \vdots \\ \gamma_{m-1,0} & \gamma_{m-1,1} & \cdots & \gamma_{m-1,n-1} \\ \end{bmatrix},$$
- $$\quad A = \begin{bmatrix} \alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,k-1} \\ \alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,k-1} \\ \vdots & \vdots & \vdots & \vdots \\ \alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,k-1} \\ \end{bmatrix}$$
- $$\\ \text{and} \quad B = \begin{bmatrix} \beta_{0,0} & \beta_{0,1} & \cdots & \beta_{0,n-1} \\ \beta_{1,0} & \beta_{1,1} & \cdots & \beta_{1,n-1} \\ \vdots & \vdots & \vdots & \vdots \\ \beta_{k-1,0} & \beta_{k-1,1} & \cdots & \beta_{k-1,n-1} \\ \end{bmatrix}.$$
Then C = AB means that $\gamma_{i,j} = \sum_{p=0}^{k-1} \alpha_{i,p} \beta_{p,j}$
A table of matrix-matrix multiplications with matrices of special shape is given at the end of this unit.
Outer product #
- Let $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$. Then the outer product of x and y is given by $xy^T$. Notice that this yields an $m \times n$ matrix:
- $$\begin{aligned} xy^T &= \left( \begin{array}{c} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{m-1} \end{array} \right) \left( \begin{array}{c} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{array} \right)^ T = \left( \begin{array}{c} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{m-1} \end{array} \right) \left( \begin{array}{c c c c} \psi_0 & \psi_1 & \cdots & \psi_{n-1} \end{array} \right) \\ &= \left( \begin{array}{c c c c} \chi_0 \psi_0 & \chi_0 \psi_1 & \cdots & \chi_0 \psi_{n-1} \\ \chi_1 \psi_0 & \chi_1 \psi_1 & \cdots & \chi_1 \psi_{n-1} \\ \vdots & \vdots & & \vdots \\ \chi_{m-1} \psi_0 & \chi_{m-1} \psi_1 & \cdots & \chi_{m-1} \psi_{n-1} \end{array} \right). \end{aligned}$$
- The cost of memops of matrix-matrix multiplication is $2kmn$.