What is a Tensor?


TL;DR

  • matrix = linear map
  • tensor = multilinear map

The Long Short Of It

This is a brief “explanation” and contains almost no detail. To be fair, it’s really just meant to give some intuition to someone who understands linear algebra semi-well.

Matrices

First, think of a matrix, which we’ll call \(A\).

(Digression: A matrix is nothing but an array of stuff, \((a_{ij})\), which are usually numbers, but don’t have to be. We’ll say they’re real numbers since in machine learning, that’s pretty much always the case. If you want to learn more, read about fields.)

These matrices can be used for more than linear algebra, but that’s where they really shine, because of the next piece of information.

Linear Functions

I assume you know what vector spaces and linear maps are. To emphasize that linear maps are just functions, I will call them linear functions for the rest of this post.

Linear function: A linear map.

Linear functions and matrices are in 1-1 correspondence (for finite-dimensional spaces, which is all we need here). Every linear function has a unique representation as a matrix (once you fix a basis), and every matrix represents some linear function (in that basis).

That means \(A\) represents some linear function \(f\), and \(f\) is represented by the matrix \(A\).
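
If it helps to see that in code, here's a minimal NumPy sketch (the matrix and vectors are made up for illustration) checking that \(x \mapsto Ax\) really is linear:

```python
import numpy as np

# A 3x2 matrix: the linear function f it represents maps R^2 -> R^3.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

def f(x):
    # Applying f is just matrix-vector multiplication.
    return A @ x

x = np.array([1.0, -1.0])
y = np.array([0.5, 2.0])
a, b = 3.0, -2.0

# Linearity: f(a*x + b*y) == a*f(x) + b*f(y)
assert np.allclose(f(a * x + b * y), a * f(x) + b * f(y))
```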

Incidentally, matrix multiplication is defined to be equivalent to function composition. So if the matrices \(A,B\) correspond to linear functions \(f,g\), and \(x\) is a vector,

\(ABx = (f \circ g)(x) \equiv f(g(x))\).
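
In NumPy terms (with random matrices, purely for illustration), multiplying the matrices first and then applying gives the same result as applying them one after the other:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # f: R^4 -> R^3
B = rng.standard_normal((4, 5))  # g: R^5 -> R^4
x = rng.standard_normal(5)

# (AB)x == A(Bx), i.e. applying the product equals composing f and g.
assert np.allclose((A @ B) @ x, A @ (B @ x))
```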

Additionally, an \(r \times c\) matrix (\(r\) for rows, \(c\) for columns) represents a linear function from \(\mathbb{R}^c\) to \(\mathbb{R}^r\) (or from \(F^c\) to \(F^r\) over any field \(F\), not just \(\mathbb{R}\)).

This is why an \(m \times n\) matrix \(A\) can only multiply an \(n \times 1\) vector \(x\) (as in \(Ax\)). Since \(A: \mathbb{R}^n \rightarrow \mathbb{R}^m\), it can only be applied to vectors in \(\mathbb{R}^n\), which are \(n \times 1\) vectors.
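
A quick sketch of that shape rule (the sizes here are arbitrary):

```python
import numpy as np

m, n = 3, 4
A = np.ones((m, n))      # A: R^4 -> R^3

y = A @ np.ones(n)       # fine: the input lives in R^4
assert y.shape == (m,)   # the output lives in R^3

try:
    A @ np.ones(n + 1)   # wrong domain: a vector in R^5
except ValueError as e:
    print("shape mismatch:", e)
```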

Notice that linear functions take in only one argument. It’s \(f(x)\), not \(f(x,y)\).

We’ll fix that now.

Multilinear Functions

There’s an extension of linear functions to multiple arguments that preserves linearity in a simple way. Consider the multilinear function \(g(x_1, \ldots, x_n)\).

Multilinear (with respect to a single argument \(x_i\)): Linear when every other argument is held fixed and only \(x_i\) is allowed to vary. Think of a partial derivative for the basic idea. A function is multilinear if this holds for each of its arguments.
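
Here's a minimal sketch of that definition, assuming a made-up bilinear function \(g(x, y) = x^\top M y\): hold \(y\) fixed, and \(g\) is linear in \(x\).

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 4))

def g(x, y):
    # Scalar-valued and bilinear: linear in x for fixed y, and vice versa.
    return x @ M @ y

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
y = rng.standard_normal(4)
a, b = 2.0, -0.5

# With y held fixed, g is linear in its first argument.
assert np.isclose(g(a * x1 + b * x2, y), a * g(x1, y) + b * g(x2, y))
```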

A tensor is a multilinear function. Tensors and multilinear functions are in the same kind of 1-1 correspondence that linear functions and matrices are. This is why matrices are a (simple) type of tensor: a linear function is just the \(n = 1\) case of a multilinear function.

There is a 1-1 correspondence between multilinear functions on several spaces and linear functions on a single, related space (their tensor product), so tensors can be written as multi-dimensional arrays, or as nested arrays in libraries such as NumPy.
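
As a sketch of the array view (the 3-index array \(T\) here is made up), a \(k\)-index array acts as a \(k\)-linear function via np.einsum, with matrices as the 2-index case:

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((2, 3, 4))  # a 3-index array of numbers

def trilinear(x, y, z):
    # Contract each index of T against one argument; the result is a scalar.
    return np.einsum('ijk,i,j,k->', T, x, y, z)

x, x2 = rng.standard_normal(2), rng.standard_normal(2)
y, z = rng.standard_normal(3), rng.standard_normal(4)

# Linear in the first slot when y and z are held fixed (likewise for the others).
assert np.isclose(trilinear(2 * x + x2, y, z),
                  2 * trilinear(x, y, z) + trilinear(x2, y, z))

# The matrix case for comparison: a 2-index array as a bilinear function.
A = rng.standard_normal((3, 4))
assert np.isclose(np.einsum('ij,i,j->', A, y, z), y @ A @ z)
```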

The Point

Tensors are multilinear functions. Matrices are linear functions.

Errors

If you find any errors, or if I’m just plain wrong, feel free to tell me.
