Linear Algebra in Four Hours (Part 1) - 11s' NoteBook

Systems of Linear Equations

In mathematics, a system of linear equations (or linear system) is a collection of two or more linear equations involving the same variables.^[1]^[2]

For example, here is a system of three linear equations in three unknowns:

\begin{cases} 3x + 2y - z = 1 \\ 2x - 2y + 4z = -2 \\ -x + \dfrac{1}{2}y - z = 0 \end{cases}

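Using the elimination method from secondary school (adding and subtracting equations to remove unknowns), we can solve this system and obtain: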

\begin{cases} x=1 \\ y=-2 \\ z=-2 \end{cases}
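The solution can be checked by substituting it back into each equation. The following is a minimal Python sketch of that check; the names `coefficients`, `constants`, `solution`, and `left_hand_sides` are illustrative and not part of the original note, and `Fraction` is used so that the coefficient 1/2 is handled exactly.

```python
from fractions import Fraction

# Coefficients and right-hand sides of the three equations above.
coefficients = [
    [Fraction(3), Fraction(2), Fraction(-1)],
    [Fraction(2), Fraction(-2), Fraction(4)],
    [Fraction(-1), Fraction(1, 2), Fraction(-1)],
]
constants = [Fraction(1), Fraction(-2), Fraction(0)]

# Claimed solution (x, y, z).
solution = [Fraction(1), Fraction(-2), Fraction(-2)]


def left_hand_sides(rows, values):
    """Evaluate the left-hand side of every equation at the given values."""
    return [sum(c * v for c, v in zip(row, values)) for row in rows]


solution_is_valid = left_hand_sides(coefficients, solution) == constants
print(solution_is_valid)  # True
```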

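Notice that every equation has the form (coefficient)x + (coefficient)y + (coefficient)z = (constant). We can therefore set aside the names of the unknowns and extract just the coefficients of the system: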

\begin{pmatrix} 3 & 2 & -1 \\ 2 & -2 & 4 \\ -1 & \frac12 & -1 \end{pmatrix}

This is what is called a matrix.

Matrix Basics

In mathematics, a matrix (pl.: matrices) is a rectangular array of numbers or other mathematical objects with elements or entries arranged in rows and columns, usually satisfying certain properties of addition and multiplication.

Below is an $m\times n$ matrix, where $m$ is the number of rows and $n$ is the number of columns:

A=(a_{ij})_{m\times n}= \begin{pmatrix} a_{11} & a_{12} & \cdots&a_{1n} \\ a_{21} & a_{22} & \cdots&a_{2n} \\ \vdots & \vdots && \vdots \\ a_{m1} & a_{m2} & \cdots&a_{mn} \end{pmatrix}

We can rewrite the linear system above as a matrix equation:

\begin{pmatrix} 3 & 2 & -1 \\ 2 & -2 & 4 \\ -1 & \dfrac{1}{2} & -1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}

How this product is computed is explained later, in the section on matrix multiplication.

Rows and Columns of a Matrix

A matrix with only one row is called a row matrix or row vector:

A=\begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix}

Similarly, a matrix with only one column is called a column matrix or column vector:

A=\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{pmatrix}

Special Matrices

This section introduces some special matrices, starting with the square matrix.

In mathematics, a square matrix is a matrix with the same number of rows and columns.

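A square matrix has as many rows as columns. A square matrix with $n$ rows and $n$ columns is said to be of order $n$: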

A= \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}

In mathematics, particularly linear algebra, a zero matrix or null matrix is a matrix all of whose entries are zero.

A zero matrix is a matrix whose entries are all 0. Zero matrices of different sizes are not equal to one another.

O= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}

In linear algebra, the identity matrix of size $n$ is the $n \times n$ square matrix with ones on the main diagonal and zeros elsewhere.

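The identity matrix plays the role of the number 1 among matrices: multiplying a matrix by it leaves that matrix unchanged, just as multiplying by 1 does in arithmetic: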

E= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

Every entry on the main diagonal of the identity matrix is 1, and every other entry is 0.

In linear algebra, a diagonal matrix is a matrix in which the entries outside the main diagonal are all zero.

A square matrix in which every entry off the main diagonal is 0 is called a diagonal matrix.

\begin{pmatrix} -1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 0 \end{pmatrix}

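Note that the diagonal entries themselves are unrestricted: they need not all be nonzero, and any of them may be 0, as in the example above. Even the square zero matrix counts as a diagonal matrix.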

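A diagonal matrix whose main-diagonal entries are all equal is called a scalar matrix. Equivalently, it is the identity matrix multiplied by a constant $k$: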

\begin{pmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & k \end{pmatrix}

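An upper triangular matrix has only zeros below the main diagonal, so nonzero entries can appear only on or above the diagonal. A lower triangular matrix has only zeros above the main diagonal, so nonzero entries can appear only on or below it. The first example below is upper triangular and the second is lower triangular: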

\begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{pmatrix}

\begin{pmatrix} 1 & 0 & 0 \\ 2 & 3 & 0 \\ 4 & 5 & 6 \end{pmatrix}

Both kinds of matrices have important computational properties, which are analyzed in detail later.
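The definitions of diagonal and triangular matrices translate directly into entry-by-entry checks. The following is a minimal Python sketch, assuming a matrix is stored as a list of rows; the helper names are hypothetical and chosen only for this illustration.

```python
def is_square(m):
    """A matrix is square when every row is as long as the number of rows."""
    return all(len(row) == len(m) for row in m)


def is_diagonal(m):
    """Square, and every entry off the main diagonal is 0."""
    n = len(m)
    return is_square(m) and all(
        m[i][j] == 0 for i in range(n) for j in range(n) if i != j
    )


def is_upper_triangular(m):
    """Square, and every entry below the main diagonal is 0."""
    n = len(m)
    return is_square(m) and all(m[i][j] == 0 for i in range(n) for j in range(i))


def is_lower_triangular(m):
    """Square, and every entry above the main diagonal is 0."""
    n = len(m)
    return is_square(m) and all(
        m[i][j] == 0 for i in range(n) for j in range(i + 1, n)
    )


diagonal_example = [[-1, 0, 0], [0, 5, 0], [0, 0, 0]]
upper_example = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
lower_example = [[1, 0, 0], [2, 3, 0], [4, 5, 6]]

print(is_diagonal(diagonal_example))       # True
print(is_upper_triangular(upper_example))  # True
print(is_lower_triangular(upper_example))  # False
```

Under these definitions a diagonal matrix passes both triangular checks, since it has zeros on both sides of the main diagonal.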

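A matrix obtained by applying a single elementary row (or column) operation to the identity matrix is called an elementary matrix. There are three types, one for each elementary operation:

Swapping two rows (columns)
Multiplying a row (column) by a nonzero constant k
Adding k times one row (column) to another row (column)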

Starting from the identity matrix of order 3, an elementary matrix of the first type (rows 1 and 2 swapped):

E_{12}= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}

An elementary matrix of the second type (row 2 multiplied by 5):

E_2(5)= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{pmatrix}

An elementary matrix of the third type (2 times row 1 added to row 3):

E_{31}(2)= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}

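In matrix multiplication, multiplying a matrix on the left by an elementary matrix applies the corresponding row operation to it, while multiplying on the right applies the corresponding column operation.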

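Gaussian elimination on a matrix amounts to repeatedly left-multiplying by elementary matrices until the matrix is in upper triangular form (row echelon form, for a general matrix).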
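To make the link between elementary matrices and row operations concrete, here is a hedged Python sketch. It builds the three types of elementary matrices, shows the effect of left and right multiplication, and performs forward elimination on a square matrix purely by left-multiplying elementary matrices. All function names are hypothetical, indices are 0-based (unlike the 1-based notation in the text), and `Fraction` is an assumption made to keep the arithmetic exact.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def swap_matrix(n, i, j):
    """Type 1: identity with rows i and j swapped."""
    e = identity(n)
    e[i], e[j] = e[j], e[i]
    return e


def scale_matrix(n, i, k):
    """Type 2: identity with row i multiplied by a nonzero k."""
    e = identity(n)
    e[i][i] = Fraction(k)
    return e


def add_matrix(n, target, source, k):
    """Type 3: identity with k times row `source` added to row `target`."""
    e = identity(n)
    e[target][source] = Fraction(k)
    return e


def forward_eliminate(m):
    """Left-multiply by elementary matrices until a square matrix is upper triangular."""
    n = len(m)
    m = [[Fraction(x) for x in row] for row in m]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            continue
        if pivot != col:
            m = matmul(swap_matrix(n, col, pivot), m)
        for r in range(col + 1, n):
            factor = -m[r][col] / m[col][col]
            m = matmul(add_matrix(n, r, col, factor), m)
    return m


m = [[3, 2, -1], [2, -2, 4], [-1, Fraction(1, 2), -1]]
rows_swapped = matmul(swap_matrix(3, 0, 1), m)  # left-multiply: swaps rows 1 and 2
cols_swapped = matmul(m, swap_matrix(3, 0, 1))  # right-multiply: swaps columns 1 and 2
upper = forward_eliminate(m)                    # zeros below the main diagonal
```

Multiplying full matrices for every row operation is wasteful in practice; the sketch does it only to mirror the statement in the text.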

Matrix Operations

Matrix Equality, Addition and Subtraction, and Scalar Multiplication

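Let $A$ and $B$ be two matrices of the same shape (equal numbers of rows and equal numbers of columns). If all of their corresponding entries are equal, the two matrices are said to be equal, written $A = B$. For example, the following two matrices are equal: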

A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},\quad B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}

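Matrix addition adds the corresponding entries of two matrices of the same shape, and subtraction works the same way. For example, keeping $A$ as above and taking a new $B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}$: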

A-B = \begin{pmatrix} 1-5 & 2-6 \\ 3-7 & 4-8 \end{pmatrix} = \begin{pmatrix} -4 & -4 \\ -4 & -4 \end{pmatrix}

Scalar multiplication multiplies every entry of a matrix by a number $k$. For the same $A$ and $k = 2$:

2A = \begin{pmatrix} 2 & 4 \\ 6 & 8 \end{pmatrix}
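These entrywise operations can be sketched in a few lines of Python. The function names below are hypothetical, matrices are assumed to be lists of rows, and the example reuses the matrices from the subtraction and scalar-multiplication examples.

```python
def same_shape(a, b):
    """True when both matrices have the same number of rows and columns."""
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


def add(a, b):
    """Entrywise sum; only defined for matrices of the same shape."""
    if not same_shape(a, b):
        raise ValueError("matrices must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(k, a):
    """Multiply every entry of the matrix by the number k."""
    return [[k * x for x in row] for row in a]


def subtract(a, b):
    """A - B, expressed as A + (-1)B."""
    return add(a, scale(-1, b))


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(subtract(A, B))  # [[-4, -4], [-4, -4]]
print(scale(2, A))     # [[2, 4], [6, 8]]
```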

Matrix Multiplication

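Two matrices can be multiplied only if the number of columns of the first equals the number of rows of the second. The product is a new matrix with as many rows as the first matrix and as many columns as the second.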

For example, an $m_1 \times n$ matrix and an $n \times m_2$ matrix can be multiplied (the two inner numbers agree), and the result is an $m_1 \times m_2$ matrix (the two outer numbers).

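The rule for the product $C = AB$ is $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$: the entry $c_{ij}$ is obtained by multiplying the entries of row $i$ of $A$ by the corresponding entries of column $j$ of $B$ and summing the results.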

\begin{aligned} A &= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},\quad B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} \\[6pt] AB &= \begin{pmatrix} 1\cdot5+2\cdot7 & 1\cdot6+2\cdot8 \\ 3\cdot5+4\cdot7 & 3\cdot6+4\cdot8 \end{pmatrix} \\[6pt] &= \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix} \end{aligned}
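The row-times-column rule can be written almost verbatim in Python. This is a minimal sketch rather than an optimized routine; `matmul` is a hypothetical name, and matrices are assumed to be non-empty lists of rows.

```python
def matmul(a, b):
    """Product of an (m1 x n) matrix and an (n x m2) matrix."""
    inner = len(b)          # number of rows of the second matrix
    cols = len(b[0])        # number of columns of the second matrix
    if any(len(row) != inner for row in a):
        raise ValueError("columns of the first must equal rows of the second")
    # c[i][j] is the sum over k of a[i][k] * b[k][j].
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(a))]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]

# A 2 x 3 matrix times a 3 x 1 matrix gives a 2 x 1 matrix.
print(matmul([[1, 2, 3], [4, 5, 6]], [[1], [0], [-1]]))  # [[-2], [-2]]
```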

Matrix multiplication is not commutative. In general:

AB \neq BA
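As a concrete check, take $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}$ from the multiplication example and reverse the order of the factors. The worked product below is added for illustration:

\begin{aligned} BA &= \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \\[6pt] &= \begin{pmatrix} 5\cdot1+6\cdot3 & 5\cdot2+6\cdot4 \\ 7\cdot1+8\cdot3 & 7\cdot2+8\cdot4 \end{pmatrix} \\[6pt] &= \begin{pmatrix} 23 & 34 \\ 31 & 46 \end{pmatrix} \neq \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix} = AB \end{aligned}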

As long as the order of the factors is preserved, matrix multiplication satisfies the associative and distributive laws:

(AB)C = A(BC)

A(B+C) = AB + AC

In matrix multiplication, the identity matrix plays the role of 1 (here $A$ is square and $E$ has the same order):

AE = A = EA

The product of two nonzero matrices can be the zero matrix, so $AB = O \nRightarrow A=O \text{ or } B=O$.

For the same reason, $A$ cannot in general be cancelled from both sides of the following equation:

AB = AC \nRightarrow B=C
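Both warnings can be seen with small matrices. The matrices below are illustrative choices, not taken from the original note:

A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\quad B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},\quad C = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}

AB = \begin{pmatrix} 1\cdot0+0\cdot0 & 1\cdot0+0\cdot1 \\ 0\cdot0+0\cdot0 & 0\cdot0+0\cdot1 \end{pmatrix} = O,\quad AC = \begin{pmatrix} 1\cdot0+0\cdot0 & 1\cdot0+0\cdot2 \\ 0\cdot0+0\cdot0 & 0\cdot0+0\cdot2 \end{pmatrix} = O

Here $A \neq O$ and $B \neq O$ although $AB = O$, and $AB = AC$ although $B \neq C$.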

Transpose of a Matrix

Swapping the rows and columns of a matrix produces a new matrix called its transpose, written $A^T$.

A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}

A^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}

Transposing a matrix twice gives back the original matrix:

(A^T)^T = A

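The transpose of a sum is the sum of the transposes: adding two matrices and then transposing is equivalent to transposing each and then adding.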

(A+B)^T = A^T + B^T

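The transpose of a scalar multiple is the scalar multiple of the transpose: scaling and then transposing is equivalent to transposing and then scaling.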

(kA)^T = kA^T

The transpose of a product is slightly more involved, because the order of the factors must be reversed:

(AB)^T = B^T A^T

(A_1A_2\cdots A_n)^T = A_n^T A_{n-1}^T \cdots A_1^T
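The transpose rules can be spot-checked numerically. The sketch below is only an illustration on one pair of matrices, not a proof; `transpose` and `matmul` are hypothetical helper names, and the 3 x 2 matrix `B` is an arbitrary choice made for this example.

```python
def transpose(m):
    """Turn the rows of a matrix into columns."""
    return [list(column) for column in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, 1], [0, 3]]  # arbitrary 3 x 2 matrix, chosen for illustration

print(transpose(A))                  # [[1, 4], [2, 5], [3, 6]]
print(transpose(transpose(A)) == A)  # True

# (AB)^T equals B^T A^T: note the reversed order on the right-hand side.
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```

Writing `matmul(transpose(A), transpose(B))` instead would multiply a 3 x 2 matrix by a 2 x 3 matrix and produce a 3 x 3 result, which cannot equal the 2 x 2 matrix $(AB)^T$.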

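A matrix that equals its own transpose ($A^T = A$) is called a symmetric matrix; its entries are mirror images across the main diagonal. Similarly, a matrix that equals $-1$ times its transpose ($A^T = -A$) is called a skew-symmetric (antisymmetric) matrix. Both definitions require $A$ to be square.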

From these definitions, any square matrix can be written as the sum of a symmetric matrix and a skew-symmetric matrix:

A = \frac{A+A^T}{2} + \frac{A-A^T}{2}
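The decomposition can be computed directly from the formula. The following Python sketch uses hypothetical helper names and `Fraction` (an assumption, to avoid floating-point rounding when halving) and checks that the two parts have the stated symmetry and add back up to the original matrix.

```python
from fractions import Fraction


def transpose(m):
    return [list(column) for column in zip(*m)]


def symmetric_part(a):
    """(A + A^T) / 2 for a square matrix A."""
    t = transpose(a)
    n = len(a)
    return [[Fraction(a[i][j] + t[i][j]) / 2 for j in range(n)] for i in range(n)]


def skew_part(a):
    """(A - A^T) / 2 for a square matrix A."""
    t = transpose(a)
    n = len(a)
    return [[Fraction(a[i][j] - t[i][j]) / 2 for j in range(n)] for i in range(n)]


A = [[1, 2], [3, 4]]
S = symmetric_part(A)  # entries 1, 5/2, 5/2, 4
K = skew_part(A)       # entries 0, -1/2, 1/2, 0

total = [[S[i][j] + K[i][j] for j in range(2)] for i in range(2)]
print(S == transpose(S))                                # True: S is symmetric
print(K == [[-x for x in row] for row in transpose(K)])  # True: K is skew-symmetric
print(total == A)                                        # True: S + K recovers A
```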

This decomposition is used in fields such as mechanics and machine learning.

Powers of a Matrix

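First, note that only square matrices have powers: the product $A \cdot A$ is defined only when the number of columns of $A$ equals its number of rows. By convention, $A^0 = E$. For example, with $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$: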

\begin{aligned} A^2 &= A \cdot A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \\ &= \begin{pmatrix} 1 \cdot 1 + 2 \cdot 0 & 1 \cdot 2 + 2 \cdot 1 \\ 0 \cdot 1 + 1 \cdot 0 & 0 \cdot 2 + 1 \cdot 1 \end{pmatrix} \\ &= \begin{pmatrix} 1 & 4 \\ 0 & 1 \end{pmatrix} \end{aligned}

\begin{aligned} A^3 &= A^2 \cdot A = \begin{pmatrix} 1 & 4 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \\ &= \begin{pmatrix} 1 & 6 \\ 0 & 1 \end{pmatrix} \end{aligned}

Powers of a square matrix satisfy two important rules, which have the same form as the exponent rules for numbers:

$A^k \cdot A^l = A^{k+l}$
$(A^k)^l = A^{kl}$

Note, however, that in general $(AB)^k \neq A^kB^k$ and $A^k - B^k \neq (A - B)(A^{k-1} + A^{k-2}B + \cdots + B^{k-1})$; equality holds when $AB = BA$, that is, when the two square matrices commute.

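Keep in mind that square matrices of the same order do not necessarily commute; only certain pairs do, for example a matrix and its own powers, or any square matrix and a scalar matrix of the same order.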
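The power rules, and the failure of $(AB)^k = A^kB^k$ for non-commuting matrices, can be checked with a short Python sketch. The helper names are hypothetical, the power is computed by plain repeated multiplication to mirror the definition, and the matrix `B` is an arbitrary choice that does not commute with `A`.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def power(a, k):
    """A^k for a square matrix A and an integer k >= 0, with A^0 = E."""
    if any(len(row) != len(a) for row in a):
        raise ValueError("only square matrices have powers")
    if k < 0:
        raise ValueError("this sketch handles non-negative exponents only")
    result = identity(len(a))
    for _ in range(k):
        result = matmul(result, a)
    return result


A = [[1, 2], [0, 1]]
B = [[1, 0], [1, 1]]  # arbitrary matrix with AB != BA

print(power(A, 3))                                         # [[1, 6], [0, 1]]
print(matmul(power(A, 2), power(A, 3)) == power(A, 5))     # True
print(power(power(A, 2), 3) == power(A, 6))                # True
print(power(matmul(A, B), 2))                              # [[11, 8], [4, 3]]
print(matmul(power(A, 2), power(B, 2)))                    # [[9, 4], [2, 1]]
```

The last two lines differ because `A` and `B` do not commute.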

Trace of a Matrix

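Likewise, only square matrices have a trace. The trace of a square matrix is defined as the sum of all the entries on its main diagonal.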

A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},\quad \text{tr}(A) = 1 + 4 = 5

The trace has several important properties, for example:

\text{tr}(A + B) = \text{tr}(A) + \text{tr}(B)

\text{tr}(kA) = k \cdot \text{tr}(A)

\text{tr}(A^T) = \text{tr}(A)

\text{tr}(AB) = \text{tr}(BA)

That is, swapping the order of the two factors in a product leaves the trace unchanged.
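The trace properties can be spot-checked on small matrices. The sketch below is illustrative only: `trace`, `matmul`, and `transpose` are hypothetical helper names, and the second matrix `B` is chosen for the example rather than taken from the text on the trace.

```python
def trace(a):
    """Sum of the main-diagonal entries of a square matrix."""
    if any(len(row) != len(a) for row in a):
        raise ValueError("only square matrices have a trace")
    return sum(a[i][i] for i in range(len(a)))


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(column) for column in zip(*m)]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]  # illustrative second matrix

print(trace(A))                               # 5
print(trace(transpose(A)) == trace(A))        # True
print(matmul(A, B) == matmul(B, A))           # False: the products differ
print(trace(matmul(A, B)), trace(matmul(B, A)))  # 69 69: the traces agree
```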