(This is not meant to show you how to solve the problem, just what the solution does in the most important special case).

Gaussian elimination is about *factorization*. Factorization is just
expressing something as a composition of simpler things. In this case,
it means expressing a linear function as a product of simpler linear
functions.

Say we have an *invertible* linear function \(A: V \to W\) and we want
to find the (unknown) vector \(x \in V\) that maps to a *known* vector
\(b \in W\). The assumption of invertibility is just to make the basic
idea easier to explain.

That means we have to solve \(A(x) = b\).

Since \(A\) is invertible, we just need to apply \(A^{-1}\) to both
sides to get \(x = A^{-1}(b)\). Gaussian elimination does that by
*factoring.*

The set of all invertible linear functions on an \(n\)-dimensional
space forms a group under the operation of composition (AKA matrix
multiplication): the general linear group \(GL(n)\). This group is
generated by the *elementary matrices*. Let’s call the set of those
elementary matrices \(E\). Then \(A = e_1 e_2 \dots e_k\) with each
\(e_i \in E\). Each elementary matrix is also invertible, and the
inverse of a product is the product of the inverses, *in reverse
order*: \(A^{-1} = e_k^{-1} \dots e_1^{-1}\). Applying this sequence
of inverses to both sides of \(A(x) = b\) gives a fast way of
computing \(x = A^{-1}(b)\).
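The reverse-order rule can be checked directly on a small example. Here is a minimal pure-Python sketch (the `matmul` helper and the particular \(2 \times 2\) elementary matrices are illustrative choices, not from the text):

```python
# Sketch: the inverse of a product of elementary matrices is the
# product of their inverses in reverse order. Matrices are lists of
# rows; matmul is a hypothetical helper for this illustration.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

swap = [[0, 1], [1, 0]]   # e1: swap rows 0 and 1 (its own inverse)
add  = [[1, 0], [2, 1]]   # e2: add 2 * row 0 to row 1
sub  = [[1, 0], [-2, 1]]  # e2^{-1}: subtract 2 * row 0 from row 1

A = matmul(swap, add)      # A = e1 e2
A_inv = matmul(sub, swap)  # A^{-1} = e2^{-1} e1^{-1}, reversed order

assert matmul(A, A_inv) == [[1, 0], [0, 1]]
```

Reversing the order matters: composing the inverses in the original order generally does not give the identity.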

Each step of Gaussian elimination is a left-multiplication by an elementary matrix; the steps accumulate until the original matrix is reduced to the identity, i.e. until you’ve inverted your original function.
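The process above can be sketched as follows: apply elementary row operations (each one a left-multiplication by an elementary matrix) to the augmented system \([A \mid b]\) until \(A\) becomes the identity, at which point the right column holds \(x\). This is a minimal illustration, not a production solver; it assumes \(A\) is invertible, as the text does:

```python
# Gaussian elimination as a sequence of elementary row operations on
# the augmented system [A | b]. Exact rational arithmetic via Fraction
# avoids floating-point noise; assumes A is invertible.
from fractions import Fraction

def solve(A, b):
    n = len(A)
    rows = [[Fraction(v) for v in A[i]] + [Fraction(b[i])] for i in range(n)]
    for col in range(n):
        # Elementary swap: bring a row with a nonzero pivot into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Elementary scaling: make the pivot entry equal to 1.
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        # Elementary additions: clear the column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [v - f * w for v, w in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

# 2*x0 + x1 = 5 and x0 + 3*x1 = 10, so x = [1, 3]
x = solve([[2, 1], [1, 3]], [5, 10])
```

Each pass through the loop is one batch of left-multiplications; composing them all gives exactly the factorization \(A^{-1}\) as a product of elementary inverses.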

Gaussian elimination is an algorithmic way of finding that factorization into elementary matrices, and it works over any field, not just the real numbers.
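To illustrate the "any field" point, here is a sketch of the same elimination over the two-element field \(GF(2)\), where every nonzero pivot is already 1 and row addition is bitwise XOR (the `solve_gf2` name and the example system are illustrative assumptions):

```python
# Gaussian elimination over GF(2): arithmetic mod 2, so subtraction
# and addition coincide and row addition is XOR. Assumes the system
# is invertible over GF(2); entries are 0 or 1.

def solve_gf2(A, b):
    n = len(A)
    rows = [A[i][:] + [b[i]] for i in range(n)]   # augmented [A | b]
    for col in range(n):
        # Elementary swap: find a row with pivot 1 in this column.
        pivot = next(r for r in range(col, n) if rows[r][col] == 1)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and rows[r][col] == 1:
                # Elementary row addition mod 2 is XOR.
                rows[r] = [u ^ v for u, v in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

# x0 + x1 = 1 and x1 = 1 over GF(2), so x = [0, 1]
x = solve_gf2([[1, 1], [0, 1]], [1, 1])
```

No scaling step is needed here, since the only nonzero scalar in \(GF(2)\) is 1; the swap and row-addition steps alone generate everything required.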

The funny thing is that you can do linear algebra without ever being able to solve such a simple problem. Not that that’s much fun.