03 Jan 2023

Latent Space

Latent (lat. “hidden”) space is a representation of compressed data. It is used to simplify the data representation for the purpose of finding patterns.

In machine learning, data is compressed (lossily, by definition) to learn the important information about the data points. This happens in the encoding stage.

The value of compression is that it allows us to discard the noise and keep only the information we need for learning → this “compressed state” is the Latent Space Representation.

The model needs to store all the relevant information because it is later required to reconstruct the original data from the compressed representation (in the decoding stage).
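As a rough sketch of these two stages (assuming PyTorch; the layer sizes here are arbitrary placeholder choices), the encoder maps a high-dimensional input down to a small latent vector, and the decoder tries to map it back:

```python
import torch
import torch.nn as nn

# Encoder: compresses a 784-dimensional input (e.g. a flattened 28x28 image)
# down to a 32-dimensional latent vector. The compression is lossy: 32 numbers
# cannot store everything about 784, so the network must keep what matters.
encoder = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 32),
)

# Decoder: reconstructs the 784-dimensional input from the latent vector.
decoder = nn.Sequential(
    nn.Linear(32, 128),
    nn.ReLU(),
    nn.Linear(128, 784),
)

x = torch.randn(1, 784)   # stand-in for one data point
z = encoder(x)            # latent space representation, shape (1, 32)
x_hat = decoder(z)        # reconstruction, shape (1, 784)
print(z.shape, x_hat.shape)
```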

What does it look like?

The latent space representation is usually a vector or matrix of numbers on which we can perform different operations.

A useful property of the latent space is that “similar” data points (e.g. objects of the same colour) will be closer to one another in the latent space.

(Note: “closeness” is an ambiguous term, since different models use different metrics to measure it.)
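As a toy illustration of that ambiguity (not tied to any particular model), two common choices of “closeness” between latent vectors can be computed like this:

```python
import torch
import torch.nn.functional as F

z1 = torch.tensor([1.0, 2.0, 0.5])
z2 = torch.tensor([1.1, 1.9, 0.6])   # a "similar" point
z3 = torch.tensor([-3.0, 0.2, 4.0])  # a "dissimilar" point

# Euclidean distance: smaller means closer.
print(torch.dist(z1, z2), torch.dist(z1, z3))

# Cosine similarity: closer to 1 means more similar in direction,
# regardless of vector length.
print(F.cosine_similarity(z1, z2, dim=0), F.cosine_similarity(z1, z3, dim=0))
```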

Autoencoders and Generative Models

  • An autoencoder is a neural network that acts as an identity function: it outputs whatever it receives as input. But we don’t care so much about what the model outputs. We care more about what the model learns in the process.
  • When we force a model to act as an identity function, we force it to store all the relevant features in a compressed form → the model can later reconstruct the input.
  • Similar data points in the latent space tend to cluster together, and we can later sample points from the latent space to generate new data.
  • Here’s an example of image generation through latent space interpolation:
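A minimal sketch of the idea in PyTorch (the architecture, toy random data, and training length are all placeholder choices, standing in for real images): train a tiny autoencoder as an identity function, then walk a straight line between two latent codes and decode each point along the way.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A tiny autoencoder: 64-dim input -> 8-dim latent -> 64-dim reconstruction.
encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 8))
decoder = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 64))

params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.MSELoss()

data = torch.randn(256, 64)  # stand-in dataset; real use would be e.g. images

# Train the model to reproduce its input (the identity function).
for step in range(500):
    optimizer.zero_grad()
    reconstruction = decoder(encoder(data))
    loss = loss_fn(reconstruction, data)  # penalize imperfect reconstruction
    loss.backward()
    optimizer.step()

# Interpolation: encode two inputs, then decode points on the straight line
# between their latent codes. Each decoded point is a "new" sample that
# blends properties of the two originals.
with torch.no_grad():
    z_a = encoder(data[0])
    z_b = encoder(data[1])
    for alpha in torch.linspace(0, 1, 5):
        a = alpha.item()
        z = (1 - a) * z_a + a * z_b   # linear interpolation in latent space
        generated = decoder(z)
        print(f"alpha={a:.2f} -> decoded vector of shape {tuple(generated.shape)}")
```

With a real image dataset, decoding the intermediate latent codes produces images that morph smoothly from the first input to the second, which is exactly what makes the latent space useful for generation.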