LSTM architecture


Neural networks are loosely inspired by the biological nervous system, and most neural network algorithms borrow in some way from how the human brain processes information. Standard feedforward networks, however, have no persistence: each input is processed independently, with no memory of what came before. A recurrent neural network (RNN) addresses this. An RNN is a repeating structure with a loop that passes the output of one step on to the next step, where it is combined with the new input to generate the next output. This loop is how the network preserves memory. In practice, a plain RNN struggles to retain information over long sequences, so it does not perform as well as originally hoped. RNNs often suffer from vanishing and exploding gradients, because the gradient is multiplied repeatedly by the recurrent weights and the derivative of the tanh layer as it is propagated back through time. To address this, a new neural network architecture was introduced: Long Short-Term Memory (LSTM).
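As a rough illustration of the recurrence and of why gradients can vanish, here is a minimal sketch of a scalar RNN in plain Python. The function names (rnn_step, run_rnn, gradient_through_time) and the weight values are assumptions chosen for the example, not part of any library.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a scalar vanilla RNN: h_t = tanh(w_x*x + w_h*h_prev + b)."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed the inputs one by one, passing each output on to the next step."""
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

def gradient_through_time(states, w_h=0.8):
    """Derivative of the last state with respect to the initial state.

    By the chain rule this is the product over all steps of
    w_h * (1 - h_t^2), where (1 - h_t^2) is the tanh derivative.
    """
    grad = 1.0
    for h in states:
        grad *= w_h * (1.0 - h * h)
    return grad

# Illustrative values only: 30 identical inputs and a recurrent weight below 1.
states = run_rnn([1.0] * 30)
grad = gradient_through_time(states)
print(grad)  # a very small positive number: the gradient has almost vanished
```

With a recurrent weight below 1, every factor in the product is smaller than 1, so the gradient shrinks with each step. With a large enough recurrent weight the same product can grow instead, which corresponds to the exploding case.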




LSTM follows the same repeating structure as an RNN, but the internals of each repeating unit are more complex. A plain RNN unit contains a single tanh layer. An LSTM unit contains three sigmoid gates, a tanh layer, and pointwise multiplications and additions. Each sigmoid layer calculates how much information should be let through for further prediction; its output ranges from 0 to 1. Two cell-state quantities appear here. One is the old cell state, which carries long-term memory from previous steps. The other is the candidate cell state, a vector computed by passing the current input and the previous output through tanh.

The first gate is the forget gate. It controls how much of the old memory should be discarded because it is no longer needed. The second is the input gate. It decides which values of the cell state are updated, that is, how much of the candidate cell state is written into memory. The third is the output gate. It decides which part of the cell state is passed on as output.

Now, let's see how these three gates work together. The current input and the previous output are passed to the LSTM unit along with the old cell state. The forget gate produces a value between 0 and 1: 0 means forget all the information, and 1 means preserve all of it. Next, the sigmoid output of the input gate is multiplied by the candidate cell state passed through tanh. The old cell state is then updated into the new cell state: it is scaled by the forget gate, and the gated candidate is added to it. Finally, the output is generated by multiplying the output gate by the new cell state passed through tanh.

LSTM has many applications, such as language identification, speech recognition, and image captioning.
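The gate workflow described above can be sketched for a single scalar unit in plain Python. This is a minimal illustration under stated assumptions, not a production implementation: the function lstm_step, the params dictionary layout, and all weight values are hypothetical and chosen only to make the behaviour of the gates visible.

```python
import math

def sigmoid(z):
    """Squash a value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step for scalar input and state.

    p maps a layer name to (w_x, w_h, b), the weights applied to the
    current input, the previous output, and the bias.
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))             # forget gate: how much old cell state to keep
    i = sigmoid(pre("input"))              # input gate: how much of the candidate to write
    c_tilde = math.tanh(pre("candidate"))  # candidate cell state from current input
    o = sigmoid(pre("output"))             # output gate: how much cell state to expose

    c = f * c_prev + i * c_tilde           # old cell state updated into new cell state
    h = o * math.tanh(c)                   # output from output gate and tanh of cell state
    return h, c

# Hypothetical parameters, for illustration only.
params = {
    "forget": (0.0, 0.0, 10.0),     # large bias: forget gate close to 1 (preserve)
    "input": (0.0, 0.0, -10.0),     # large negative bias: input gate close to 0
    "candidate": (1.0, 1.0, 0.0),
    "output": (0.0, 0.0, 10.0),     # output gate close to 1
}

h, c = 0.0, 2.0                     # start with some memory in the cell state
for x in [0.3, -0.7, 1.2]:
    h, c = lstm_step(x, h, c, params)
print(h, c)                         # c stays close to its initial value of 2.0
```

With the forget gate near 1 and the input gate near 0, the cell state is carried across steps almost unchanged, which is how the unit preserves long-term memory. Setting the forget bias to a large negative value instead drives the forget gate toward 0 and erases the old cell state.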










References

  1. https://colah.github.io/posts/2015-08-Understanding-LSTMs/

  2. https://wandb.ai/sauravmaheshkar/LSTM-PyTorch/reports/How-to-Use-LSTMs-in-PyTorch--VmlldzoxMDA2NTA5?galleryTag=
