tf.contrib.rnn.IndyLSTMCell

Basic IndyLSTM recurrent network cell.

Inherits From: LayerRNNCell

    tf.contrib.rnn.IndyLSTMCell(
        num_units, forget_bias=1.0, activation=None, reuse=None,
        kernel_initializer=None, bias_initializer=None, name=None, dtype=None
    )

Based on IndRNNs (https://arxiv.org/abs/1803.04831) and similar to
BasicLSTMCell, yet with the \(U_f\), \(U_i\), \(U_o\) and \(U_c\)
matrices in the regular LSTM equations replaced by diagonal matrices, i.e. a
Hadamard product with a single vector:

$$f_t = \sigma_g\left(W_f x_t + u_f \circ h_{t-1} + b_f\right)$$
$$i_t = \sigma_g\left(W_i x_t + u_i \circ h_{t-1} + b_i\right)$$
$$o_t = \sigma_g\left(W_o x_t + u_o \circ h_{t-1} + b_o\right)$$
$$c_t = f_t \circ c_{t-1} + i_t \circ \sigma_c\left(W_c x_t + u_c \circ h_{t-1} + b_c\right)$$

where \(\circ\) denotes the Hadamard operator. This means that each IndyLSTM
node sees only its own state \(h\) and \(c\), as opposed to seeing all
states in the same layer.
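As a concrete reading of these equations, here is a minimal NumPy sketch of a
single IndyLSTM step. The names and shapes are illustrative, and the final
output step \(h_t = o_t \circ \tanh(c_t)\) is the standard LSTM one, assumed
here rather than shown in the equations above.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def indy_lstm_step(x, h_prev, c_prev, W, u, b):
        # W[g]: (input_dim, num_units) input weight matrix for gate g.
        # u[g]: (num_units,) recurrent weight *vector* for gate g; the
        #       elementwise product u[g] * h_prev is the Hadamard recurrence,
        #       so unit k only ever sees its own previous state h_prev[k].
        # b[g]: (num_units,) bias for gate g.
        f = sigmoid(x @ W['f'] + u['f'] * h_prev + b['f'])  # forget gate
        i = sigmoid(x @ W['i'] + u['i'] * h_prev + b['i'])  # input gate
        o = sigmoid(x @ W['o'] + u['o'] * h_prev + b['o'])  # output gate
        c = f * c_prev + i * np.tanh(x @ W['c'] + u['c'] * h_prev + b['c'])
        h = o * np.tanh(c)  # standard LSTM output (assumption, see above)
        return h, c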
We add forget_bias (default: 1) to the biases of the forget gate in order to
reduce the scale of forgetting at the beginning of training.
It does not allow cell clipping or a projection layer, and it does not use
peephole connections: it is the basic baseline.

For a detailed analysis of IndyLSTMs, see https://arxiv.org/abs/1903.08023.
Args
num_units
int, The number of units in the LSTM cell.
forget_bias
float, The bias added to forget gates (see above).
Must be set to 0.0 manually when restoring from CudnnLSTM-trained
checkpoints.
activation
Activation function of the inner states. Default: tanh.
reuse
(optional) Python boolean describing whether to reuse variables
in an existing scope. If not True, and the existing scope already has
the given variables, an error is raised.
kernel_initializer
(optional) The initializer to use for the weight
matrix applied to the inputs.
bias_initializer
(optional) The initializer to use for the bias.
name
String, the name of the layer. Layers with the same name will
share weights, but to avoid mistakes we require reuse=True in such
cases.
dtype
Default dtype of the layer (default of None means use the type
of the first input). Required when build is called before call.
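As a usage sketch (TF 1.x graph mode; the placeholder shape below is an
illustrative assumption), the cell drops into tf.nn.dynamic_rnn like any
other RNNCell:

    import tensorflow as tf  # TensorFlow 1.x

    # [batch, time, features] inputs; the shape here is made up for the demo.
    inputs = tf.placeholder(tf.float32, [None, 50, 64])

    cell = tf.contrib.rnn.IndyLSTMCell(num_units=128, forget_bias=1.0)
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
    # outputs: [batch, time, 128]; final_state carries the cell's c and h.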
Attributes
graph
DEPRECATED FUNCTION

Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future
version. Instructions for updating: Stop using this property because
tf.layers layers no longer track their graph.
output_size
Integer or TensorShape: size of outputs produced by this cell.
scope_name
state_size
size(s) of state(s) used by this cell.
It can be represented by an Integer, a TensorShape or a tuple of Integers
or TensorShapes.
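For the 128-unit cell from the sketch above, these attributes can be
inspected directly. The (c, h) pairing shown in the comment mirrors
BasicLSTMCell and is an assumption here, not documented behavior:

    print(cell.output_size)  # 128
    print(cell.state_size)   # expected: LSTMStateTuple(c=128, h=128)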
Methods

get_initial_state

    get_initial_state(
        inputs=None, batch_size=None, dtype=None
    )

zero_state

    zero_state(
        batch_size, dtype
    )

Return zero-filled state tensor(s).

Args
batch_size
int, float, or unit Tensor representing the batch size.
dtype
the data type to use for the state.
Returns
If state_size is an int or TensorShape, then the return value is an
N-D tensor of shape [batch_size, state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of 2-D tensors with
the shapes [batch_size, s] for each s in state_size.
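Continuing the sketch above, an explicit all-zeros initial state can be
created and passed to tf.nn.dynamic_rnn (the batch size must then match the
fed inputs at runtime):

    init_state = cell.zero_state(batch_size=32, dtype=tf.float32)
    outputs, final_state = tf.nn.dynamic_rnn(
        cell, inputs, initial_state=init_state)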
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2020-10-01 UTC."],[],[],null,["# tf.contrib.rnn.IndyLSTMCell\n\n|------------------------------------------------------------------------------------------------------------------------------------------|\n| [View source on GitHub](https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/rnn/python/ops/rnn_cell.py#L3278-L3414) |\n\nBasic IndyLSTM recurrent network cell.\n\nInherits From: [`LayerRNNCell`](../../../tf/contrib/rnn/LayerRNNCell) \n\n tf.contrib.rnn.IndyLSTMCell(\n num_units, forget_bias=1.0, activation=None, reuse=None,\n kernel_initializer=None, bias_initializer=None, name=None, dtype=None\n )\n\nBased on IndRNNs (\u003chttps://arxiv.org/abs/1803.04831\u003e) and similar to\nBasicLSTMCell, yet with the \\\\(U_f\\\\), \\\\(U_i\\\\), \\\\(U_o\\\\) and \\\\(U_c\\\\)\nmatrices in the regular LSTM equations replaced by diagonal matrices, i.e. a\nHadamard product with a single vector: \n$$f_t = \\\\sigma_g\\\\left(W_f x_t + u_f \\\\circ h_{t-1} + b_f\\\\right)$$ \n$$i_t = \\\\sigma_g\\\\left(W_i x_t + u_i \\\\circ h_{t-1} + b_i\\\\right)$$ \n$$o_t = \\\\sigma_g\\\\left(W_o x_t + u_o \\\\circ h_{t-1} + b_o\\\\right)$$ \n$$c_t = f_t \\\\circ c_{t-1} + i_t \\\\circ \\\\sigma_c\\\\left(W_c x_t + u_c \\\\circ h_{t-1} + b_c\\\\right)$$\n\nwhere \\\\(\\\\circ\\\\) denotes the Hadamard operator. This means that each IndyLSTM\nnode sees only its own state \\\\(h\\\\) and \\\\(c\\\\), as opposed to seeing all\nstates in the same layer.\n\nWe add forget_bias (default: 1) to the biases of the forget gate in order to\nreduce the scale of forgetting in the beginning of the training.\n\nIt does not allow cell clipping, a projection layer, and does not\nuse peep-hole connections: it is the basic baseline.\n\nFor a detailed analysis of IndyLSTMs, see \u003chttps://arxiv.org/abs/1903.08023\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ---- ||\n|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `num_units` | int, The number of units in the LSTM cell. |\n| `forget_bias` | float, The bias added to forget gates (see above). Must set to `0.0` manually when restoring from CudnnLSTM-trained checkpoints. |\n| `activation` | Activation function of the inner states. Default: `tanh`. |\n| `reuse` | (optional) Python boolean describing whether to reuse variables in an existing scope. If not `True`, and the existing scope already has the given variables, an error is raised. |\n| `kernel_initializer` | (optional) The initializer to use for the weight matrix applied to the inputs. |\n| `bias_initializer` | (optional) The initializer to use for the bias. |\n| `name` | String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases. |\n| `dtype` | Default dtype of the layer (default of `None` means use the type of the first input). Required when `build` is called before `call`. 
|\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Attributes ---------- ||\n|---------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `graph` | DEPRECATED FUNCTION \u003cbr /\u003e | **Warning:** THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Stop using this property because tf.layers layers no longer track their graph. |\n| `output_size` | Integer or TensorShape: size of outputs produced by this cell. |\n| `scope_name` | \u003cbr /\u003e |\n| `state_size` | size(s) of state(s) used by this cell. \u003cbr /\u003e It can be represented by an Integer, a TensorShape or a tuple of Integers or TensorShapes. |\n\n\u003cbr /\u003e\n\nMethods\n-------\n\n### `get_initial_state`\n\n[View source](https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/python/ops/rnn_cell_impl.py#L281-L309) \n\n get_initial_state(\n inputs=None, batch_size=None, dtype=None\n )\n\n### `zero_state`\n\n[View source](https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/python/ops/rnn_cell_impl.py#L311-L340) \n\n zero_state(\n batch_size, dtype\n )\n\nReturn zero-filled state tensor(s).\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ||\n|--------------|---------------------------------------------------------|\n| `batch_size` | int, float, or unit Tensor representing the batch size. |\n| `dtype` | the data type to use for the state. |\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Returns ||\n|---|---|\n| If `state_size` is an int or TensorShape, then the return value is a `N-D` tensor of shape `[batch_size, state_size]` filled with zeros. \u003cbr /\u003e If `state_size` is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of `2-D` tensors with the shapes `[batch_size, s]` for each s in `state_size`. ||\n\n\u003cbr /\u003e"]]