{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Figuring How Bidirectional RNN works in Pytorch"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import torch, torch.nn as nn\n",
"from torch.autograd import Variable"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize Input Sequence Randomly\n",
"For demonstration purpose, we are going to feed RNNs only one sequence of length 5 with only one dimension."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
"-0.1308\n",
"-0.4986\n",
"-0.2581\n",
" 1.7486\n",
" 1.4340\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"random_input = Variable(torch.FloatTensor(5, 1, 1).normal_(), requires_grad=False)\n",
"random_input[:, 0, 0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize a Bidirectional GRU Layer"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"bi_grus = torch.nn.GRU(input_size=1, hidden_size=1, num_layers=1, batch_first=False, bidirectional=True)"
]
},
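{
"cell_type": "markdown",
"metadata": {},
"source": [
"A bidirectional layer keeps a separate set of parameters for each direction; the backward set carries a `_reverse` suffix, which is where the names used below come from. A quick way to see them (a minimal sketch):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# the backward direction has weight_ih_l0_reverse, weight_hh_l0_reverse, etc.\n",
"[name for name, _ in bi_grus.named_parameters()]"
]
},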
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize a GRU Layer ( for Feeding the Sequence Reversely)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"reverse_gru = torch.nn.GRU(input_size=1, hidden_size=1, num_layers=1, batch_first=False, bidirectional=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now make sure the weights of the reverse gru layer match ones of the (reversed) bidirectional's:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"reverse_gru.weight_ih_l0 = bi_grus.weight_ih_l0_reverse\n",
"reverse_gru.weight_hh_l0 = bi_grus.weight_hh_l0_reverse\n",
"reverse_gru.bias_ih_l0 = bi_grus.bias_ih_l0_reverse\n",
"reverse_gru.bias_hh_l0 = bi_grus.bias_hh_l0_reverse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feed Input Sequence into Both Networks"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"bi_output, bi_hidden = bi_grus(random_input)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"reverse_output, reverse_hidden = reverse_gru(random_input[np.arange(4, -1, -1), :, :])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check Outputs"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.7001\n",
" 0.8531\n",
" 0.4716\n",
" 0.4065\n",
" 0.4960\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reverse_output[:, 0, 0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The outputs of the reverse GRUs sit in the [latter half of the output](https://discuss.pytorch.org/t/get-forward-and-backward-output-seperately-from-bidirectional-rnn/2523)(in the last dimension):"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.4960\n",
" 0.4065\n",
" 0.4716\n",
" 0.8531\n",
" 0.7001\n",
"[torch.FloatTensor of size 5]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bi_output[:, 0, 1]"
]
},
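{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sanity check (a minimal sketch reusing the tensors computed above), flipping `reverse_output` back into the original time order should reproduce the backward half of `bi_output` exactly, since both layers share the same weights:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# flip reverse_output back to the original time order, then compare it with\n",
"# the backward half (last-dimension index 1) of the bidirectional output\n",
"flipped = reverse_output[np.arange(4, -1, -1), :, :]\n",
"print(torch.equal(bi_output[:, :, 1].data, flipped[:, :, 0].data))"
]
},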
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check Hidden States"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
"(0 ,.,.) = \n",
" 0.4960\n",
"[torch.FloatTensor of size 1x1x1]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reverse_hidden"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The hidden states of the reversed GRUs sits in [the odd indices in the first dimension](https://discuss.pytorch.org/t/how-can-i-know-which-part-of-h-n-of-bidirectional-rnn-is-for-backward-process/3883/4)."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Variable containing:\n",
" 0.4960\n",
"[torch.FloatTensor of size 1x1]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bi_hidden[1]"
]
},
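{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two hidden states should match exactly as well (again just comparing the tensors computed above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# the backward hidden state of the bidirectional layer equals the final\n",
"# hidden state of the reverse GRU\n",
"print(torch.equal(bi_hidden[1].data, reverse_hidden[0].data))"
]
},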
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"1. The returned outputs of bidirectional RNN at timestep t is just the output after feeding input to both the reverse and normal RNN unit at timestep t. (where normal RNN has seen inputs 1...t and reverse RNN has seen inputs t...n, n being the length of the sequence)\n",
"2. The returned hidden state of bidirectional RNN is the hidden state after the whole sequence is consume. For normal RNN it's after timestep n; for reverse RNN it's after timestep 1."
]
},
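{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick check of point 2 for the forward direction (a minimal sketch reusing the tensors above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# the forward hidden state equals the forward output (last-dimension index 0)\n",
"# at the final timestep, index 4 of our length-5 sequence\n",
"print(torch.equal(bi_hidden[0].data, bi_output[4, :, :1].data))"
]
}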
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}