Autoencoders

 Autoencoders are:

  • Artificial neural networks.
  • Capable of learning efficient representations of the input data, called codings, without any supervision (the training set is unlabeled).

These codings typically have a much lower dimensionality than the input data, making autoencoders useful for dimensionality reduction.

Why use Autoencoders?

  • They are useful for dimensionality reduction.
  • They act as powerful feature detectors.
  • They can be used for unsupervised pre-training of deep neural networks.
  • Lastly, they are capable of randomly generating new data that looks very similar to the training data; a model that can do this is called a generative model.
  • Surprisingly, autoencoders work by simply learning to copy their inputs to their outputs.
  • This may sound like a trivial task, but we will see that constraining the network in various ways can make it rather difficult.

For example

  • You can limit the size of the internal representation, or you can add noise to the inputs and train the network to recover the original inputs.
  • These constraints prevent the autoencoder from trivially copying the inputs directly to the outputs, which forces it to learn efficient ways of representing the data.
  • In short, the codings are byproducts of the autoencoder's attempt to learn the identity function under some constraints.
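The noise constraint above can be made concrete with a small NumPy sketch: the encoder is fed a corrupted copy of the inputs, while the clean originals remain the reconstruction targets. The array shapes and noise level here are illustrative assumptions, not from the original.

```python
import numpy as np

rng = np.random.default_rng(0)
X_clean = rng.normal(size=(100, 3))        # clean training inputs (illustrative)
noise = rng.normal(scale=0.3, size=X_clean.shape)
X_noisy = X_clean + noise                  # what the encoder actually sees

# With inputs != targets, copying straight through is no longer optimal:
# even a perfect identity map keeps all the noise, paying roughly
# scale**2 = 0.09 in MSE against the clean targets.
identity_loss = np.mean((X_noisy - X_clean) ** 2)
```

A denoising autoencoder can only beat this identity baseline by exploiting structure shared across the clean inputs, which is exactly what forces it to learn a useful representation.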

Efficient Data Representations

Composition of an Autoencoder


  • There is just one hidden layer composed of two neurons (the encoder) and one output layer composed of three neurons (the decoder).
  • Because the internal representation has a lower dimensionality than the input data (it is 2D instead of 3D), the autoencoder is said to be undercomplete.
  • An undercomplete autoencoder cannot trivially copy its inputs to the codings, yet it must find a way to output a copy of its inputs.
  • It is forced to learn the most important features in the input data and drop the unimportant ones.

PCA with an Undercomplete Linear Autoencoder

If the autoencoder uses only linear activations and the cost function is the Mean Squared Error (MSE), then it can be shown that it ends up performing Principal Component Analysis.

Now we will build a simple linear autoencoder to perform PCA on a 3D dataset, projecting it to 2D.
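The original code is not reproduced in these notes; as a hedged stand-in, the following NumPy sketch trains the same 3→2→3 linear architecture with full-batch gradient descent on the MSE. The dataset, learning rate, and epoch count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative 3D dataset lying close to a 2D plane (assumed, not from the text)
m = 200
t = rng.normal(size=(m, 2))
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.7]])            # fixed mixing matrix: 2D -> 3D
X = t @ A + 0.05 * rng.normal(size=(m, 3))
X = X - X.mean(axis=0)                     # center the data, as PCA does

n_inputs, n_hidden = 3, 2                  # 2D codings -> undercomplete
W1 = rng.normal(scale=0.1, size=(n_inputs, n_hidden))  # encoder weights
W2 = rng.normal(scale=0.1, size=(n_hidden, n_inputs))  # decoder weights
lr = 0.02

for epoch in range(2500):
    codings = X @ W1                       # linear encoder (no activation)
    outputs = codings @ W2                 # linear decoder
    err = outputs - X                      # reconstruction error
    grad_W2 = 2 * codings.T @ err / m      # gradients of the per-sample MSE
    grad_W1 = 2 * X.T @ (err @ W2.T) / m
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2

codings_2d = X @ W1                        # the 2D projection of the 3D data
mse = np.mean((codings_2d @ W2 - X) ** 2)
```

After training, the columns of W1 span the same plane as the first two principal components (up to rotation), and the reconstruction MSE approaches the variance PCA discards along the third component.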

The two things to note in the previous code are:
  • The number of outputs is equal to the number of inputs
  • To perform simple PCA, we set activation_fn=None (i.e., all neurons are linear), and the cost function is the MSE.
