Stochastic Neural Networks

Hopfield network

A Markov chain of neurons that settles to some final state.

  • Only two states (1 or -1 / black or white)

  • Connections go both ways with the same weight

    • w_ij = w_ji

  • No neurons connect to themselves

    • w_ii = 0

  • Asynchronous operation

  • sign(v) is the activation function

Can converge very quickly on a custom circuit (but slow on CPU)

  • Will fall into stable states (attractors), as sketched below
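
A minimal sketch of the asynchronous recall dynamics, assuming a weight matrix W has already been learned (function and variable names are illustrative):

```python
import numpy as np

def hopfield_recall(W, state, steps=1000, rng=None):
    """Asynchronously update a bipolar (+1/-1) state until it settles."""
    rng = np.random.default_rng(rng)
    state = state.copy()
    for _ in range(steps):
        i = rng.integers(len(state))     # pick one neuron at random (asynchronous)
        h = W[i] @ state                 # local field from the other neurons (w_ii = 0)
        state[i] = 1 if h >= 0 else -1   # sign(v) activation
    return state
```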

Associative memory

  • So given a prompt with part of a pattern (e.g. part of an image of a dog), it falls back into the stored state of the reconstructed dog

Learning

Hebbian rule

  • w_ij = w_ij + x_ni x_nj (add the product of components i and j of each stored pattern n; see the sketch below)
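
A minimal sketch of Hebbian storage, assuming patterns is a NumPy array with one bipolar pattern per row (names are illustrative):

```python
import numpy as np

def hebbian_weights(patterns):
    """Sum of outer products over the stored patterns: w_ij += x_ni * x_nj."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for x in patterns:            # x is one stored bipolar pattern
        W += np.outer(x, x)
    np.fill_diagonal(W, 0)        # enforce w_ii = 0
    return W / len(patterns)      # common normalisation (optional)
```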

Capacity

  • Roughly 0.138 U patterns can be stored for U neurons (and recalled correctly)

  • Essentially because stored patterns interfere: when many patterns are similar (e.g. many start with 110...), it is difficult to retrieve the correct one

Will learn spurious states

  • Combinations of the learned states, or essentially random states

Slow in training and slow in evaluation (it also has to settle into a state)

Boltzmann machine

A stochastic Hopfield network with hidden units

Activation function is a sigmoid (between 0 and 1)

  • Probability of neuron being switched on / off

  • T is temperature - how gradual the change between states is

Could have neurons for input and neurons for output as the visible units

Anneal the T value: in theory, if you cool the system infinitely slowly you get the absolute minimum (but you can't really do that)

  • Design a schedule for this (high T is stochastic, low T is deterministic); see the sketch below

  • Very slow to settle
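
A minimal sketch of the stochastic unit update with temperature, plus a simple geometric cooling schedule; the schedule and all parameter values are illustrative assumptions, not a prescribed method:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def settle_with_annealing(W, state, T_start=10.0, T_end=0.1, alpha=0.9,
                          sweeps_per_T=10, rng=None):
    """Settle a binary (0/1) network by gradually lowering the temperature T."""
    rng = np.random.default_rng(rng)
    state = state.copy()
    T = T_start
    while T > T_end:
        for _ in range(sweeps_per_T * len(state)):
            i = rng.integers(len(state))
            net = W[i] @ state                        # input from the other units
            p_on = sigmoid(net / T)                   # high T -> near 0.5 (stochastic)
            state[i] = 1 if rng.random() < p_on else 0
        T *= alpha                                    # geometric cooling step
    return state
```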

Learning

  • Positive phase

    • ...

  • Negative phase

    • ...

Takes forever to settle (with simulated annealing)
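
For reference, the standard Boltzmann machine weight update compares unit-unit correlations measured in the two phases (visible units clamped to data vs. free-running); a sketch assuming those statistics have already been estimated by sampling:

```python
# corr_data:  <s_i s_j> with visible units clamped to training data (positive phase)
# corr_model: <s_i s_j> with the network running freely (negative phase)
# Both are n x n matrices estimated by sampling at (near-)equilibrium.
def boltzmann_update(W, corr_data, corr_model, lr=0.01):
    return W + lr * (corr_data - corr_model)
```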

Restricted Boltzmann machine

(i.e. all the removed connections are forced to be 0, so you lose a lot of the dynamics)

  • No connections between visible units

  • No connections between hidden units

  • Feedforward

Dynamics

  • Easier / faster to train

  • Not as powerful

  • Autoencoder

    • Lots -> little -> lots (a compression exercise: learn the important aspects)
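
Because there are no visible-visible or hidden-hidden connections, each layer is conditionally independent given the other, so inference is a single feedforward pass; a minimal sketch with an assumed weight matrix W and biases b, c:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_given_visible(W, c, v):
    return sigmoid(W @ v + c)      # p(h_j = 1 | v): one pass up through the weights

def visible_given_hidden(W, b, h):
    return sigmoid(W.T @ h + b)    # p(v_i = 1 | h): one pass back down
```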

Don't need to know the learning with contrastive divergence stuff

Deep Belief Network

  • Contrastive divergence

  • Stacked auto-encoders

  • Fine-tuned with backprop

  • Explaining away labels

First do unsupervised learning, then use those features to derive labels (see the rough pipeline sketch after this list)

  • Essentially an autoencoder learning step by step, where the 'input' layer moves through the network.

    • A series of restricted Boltzmann machines.

    • Requires sigmoids (can't use ReLU)

    • Showed deeper doesn't make it worse

  • Then you add the output neurons, and then use backpropagation.

    • Essentially a way of fine-tuning. This was better than SVM.

  • Can be used to run computation backwards, to generate possible 'inputs'
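
A rough illustration of the "unsupervised features first, supervised layer after" recipe using scikit-learn's BernoulliRBM stacked in a Pipeline. Note this only trains a classifier on top of the pretrained features; it does not backprop fine-tune the whole stack, so it is an approximation of the DBN procedure:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import minmax_scale

X, y = load_digits(return_X_y=True)
X = minmax_scale(X)                  # RBMs expect inputs in [0, 1]

model = Pipeline([
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20)),  # layer-wise, unsupervised
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20)),
    ("clf", LogisticRegression(max_iter=1000)),                               # supervised output layer
])
model.fit(X, y)
print("train accuracy:", model.score(X, y))
```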

Modern Hopfield Network

Can learn more patterns than there are neurons

Roughly a ReLU unit raised to some power k is used as the interaction (see the sketch below)

  • Ali covers these in more detail in his notes

May be useful for reinforcement learning applications due to its high capacity!
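
A minimal sketch of one such formulation (Krotov and Hopfield's dense associative memory), where the usual pairwise energy is replaced by a rectified polynomial interaction F(x) = relu(x)^k; the names and the choice k = 3 are illustrative:

```python
import numpy as np

def F(x, k=3):
    return np.maximum(x, 0.0) ** k       # ReLU raised to a power k

def dense_recall(patterns, state, k=3, sweeps=5):
    """Asynchronous retrieval: flip each neuron to whichever sign lowers the
    energy E = -sum_mu F(pattern_mu . state)."""
    state = state.copy()
    for _ in range(sweeps):
        for i in range(len(state)):
            plus, minus = state.copy(), state.copy()
            plus[i], minus[i] = 1, -1
            gap = np.sum(F(patterns @ plus, k) - F(patterns @ minus, k))
            state[i] = 1 if gap >= 0 else -1
    return state
```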
