## What is an autoencoder?

An autoencoder is an approximation of the identity function.

Unlike many approximations, it’s usually meant to be an imperfect one.

An autoencoder $$A$$ is usually the composition of two parts: an encoder $$e$$ and a decoder $$d$$. So $$A := d \circ e$$. Generally the encoder is surjective and the decoder is injective, but neither is bijective. In English, the encoder maps the original points into a lower-dimensional space, and the decoder maps them back into a higher-dimensional one. This lower-dimensional bottleneck $$e(x)$$ is where most of the interesting properties of an autoencoder come from.
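As a minimal sketch of this composition, here are a linear encoder and decoder in numpy. The dimensions (8 in, 3 in the bottleneck) and the random weights are purely illustrative; a real autoencoder would learn its weights by minimizing reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: 8-dimensional inputs, 3-dimensional bottleneck.
d_in, d_bottleneck = 8, 3

# Randomly initialized linear maps standing in for trained networks.
W_enc = rng.normal(size=(d_bottleneck, d_in))
W_dec = rng.normal(size=(d_in, d_bottleneck))

def e(x):
    """Encoder: maps x into the lower-dimensional bottleneck space."""
    return W_enc @ x

def d(z):
    """Decoder: maps a bottleneck code back into input space."""
    return W_dec @ z

def A(x):
    """The autoencoder A = d ∘ e, an (imperfect) identity approximation."""
    return d(e(x))

x = rng.normal(size=d_in)
z = e(x)        # bottleneck code, shape (3,)
x_hat = A(x)    # reconstruction, shape (8,)
```

Note that the round trip cannot be the identity in general: information is lost passing through the 3-dimensional bottleneck.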

If you've seen the standard picture of an autoencoder, the one that looks like a tipped-over hourglass, this will make more sense.


Adding a sparsity regularizer such as the $$L_1$$ norm penalty on the bottleneck layer gives a sparse autoencoder.
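A sketch of what that loss looks like, assuming a mean-squared-error reconstruction term; the function names and the weight `lam` are made up for illustration:

```python
import numpy as np

def reconstruction_loss(x, x_hat):
    """Standard autoencoder objective: mean squared reconstruction error."""
    return np.mean((x - x_hat) ** 2)

def sparse_autoencoder_loss(x, x_hat, z, lam=0.1):
    """Reconstruction error plus an L1 penalty on the bottleneck
    activations z, which pushes most entries of z toward zero."""
    return reconstruction_loss(x, x_hat) + lam * np.sum(np.abs(z))

x = np.array([1.0, 2.0])
x_hat = np.array([1.0, 0.0])
z = np.array([0.5, -0.5, 0.0])
total = sparse_autoencoder_loss(x, x_hat, z, lam=0.1)
```

The penalty is on the bottleneck layer, not the weights, which is what makes the learned codes (rather than the parameters) sparse.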

Instead of mapping $$x \mapsto x$$, we can add some noise to the input and have the model learn to ignore the noise by giving it the real input as a label. This gives a denoising autoencoder.

In math, we use $$x + \varepsilon \mapsto x$$.
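A sketch of how a denoising training pair is built. The helper name and the Gaussian noise scale are assumptions for illustration; the key point is that the input is corrupted but the target stays clean.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_denoising_pair(x, noise_scale=0.1):
    """Return (noisy input, clean target): the model sees x + eps,
    but is trained to reproduce the original x."""
    eps = rng.normal(scale=noise_scale, size=x.shape)
    return x + eps, x

x = np.ones(4)
x_noisy, target = make_denoising_pair(x)
# During training the loss would compare A(x_noisy) against the clean
# target x, so the autoencoder must learn to strip the noise away.
```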

The chapter has loads of other stuff, but it didn’t feel interesting enough to me to write down.