|  View source on GitHub | 
SequenceQueueingStateSaver provides access to stateful values from input.
tf.contrib.training.SequenceQueueingStateSaver(
    batch_size, num_unroll, input_length, input_key, input_sequences, input_context,
    initial_states, capacity=None, allow_small_batch=False, name=None
)
This class is meant to be used instead of, e.g., a Queue, for splitting
variable-length sequence inputs into segments of sequences with fixed length
and batching them into mini-batches.  It maintains contexts and state for a
sequence across the segments.  It can be used in conjunction with a
QueueRunner (see the example below).
The SequenceQueueingStateSaver (SQSS) accepts one example at a time via the
inputs input_length, input_key, input_sequences (a dict),
input_context (a dict), and initial_states (a dict).
The sequences, values in input_sequences, may have variable first dimension
(the padded_length), though this dimension must always be a multiple of
num_unroll.  All other dimensions must be fixed and accessible via
get_shape calls.  The length prior to padding can be recorded in
input_length.  The context values in input_context must all have fixed and
well defined dimensions.  The initial state values must all have fixed and
well defined dimensions.
The SQSS splits the sequences of an input example into segments of length
num_unroll.  Across examples minibatches of size batch_size are formed.
These minibatches contain a segment of the sequences, copy the context values,
and maintain state, length, and key information of the original input
examples.  In the first segment of an example the state is still the initial
state.  It can then be updated; and updated state values are accessible in
subsequent segments of the same example. After each segment
batch.save_state() must be called which is done by the state_saving_rnn.
Without this call, the dequeue op associated with the SQSS will not run.
Internally, SQSS has a queue for the input examples. Its capacity is
configurable.  If set smaller than batch_size then the dequeue op will block
indefinitely.  A small multiple of batch_size is a good rule of thumb to
prevent that queue from becoming a bottleneck and slowing down training.
If set too large (and note that it defaults to unbounded) memory consumption
goes up.  Moreover, when iterating over the same input examples multiple times
reusing the same key the capacity must be smaller than the number of
examples.
The prefetcher, which reads one unrolled, variable-length input sequence at
a time, is accessible via prefetch_op.  The underlying Barrier object
is accessible via barrier.  Processed minibatches, as well as
state read and write capabilities are accessible via next_batch.
Specifically, next_batch provides access to all of the minibatched
data, including the following, see NextQueuedSequenceBatch for details:
- total_length,- length,- insertion_index,- key,- next_key,
- sequence(the index each minibatch entry's time segment index),
- sequence_count(the total time segment count for each minibatch entry),
- context(a dict of the copied minibatched context values),
- sequences(a dict of the split minibatched variable-length sequences),
- state(to access the states of the current segments of these entries)
- save_state(to save the states for the next segments of these entries)
Example usage:
batch_size = 32
num_unroll = 20
lstm_size = 8
cell = tf.compat.v1.nn.rnn_cell.BasicLSTMCell(num_units=lstm_size)
initial_state_values = tf.zeros(cell.state_size, dtype=tf.float32)
raw_data = get_single_input_from_input_reader()
length, key, sequences, context = my_parser(raw_data)
assert "input" in sequences.keys()
assert "label" in context.keys()
initial_states = {"lstm_state": initial_state_value}
stateful_reader = tf.SequenceQueueingStateSaver(
    batch_size, num_unroll,
    length=length, input_key=key, input_sequences=sequences,
    input_context=context, initial_states=initial_states,
    capacity=batch_size*100)
batch = stateful_reader.next_batch
inputs = batch.sequences["input"]
context_label = batch.context["label"]
inputs_by_time = tf.split(value=inputs, num_or_size_splits=num_unroll, axis=1)
assert len(inputs_by_time) == num_unroll
lstm_output, _ = tf.contrib.rnn.static_state_saving_rnn(
  cell,
  inputs_by_time,
  state_saver=batch,
  state_name="lstm_state")
# Start a prefetcher in the background
sess = tf.compat.v1.Session()
num_threads = 3
queue_runner = tf.compat.v1.train.QueueRunner(
    stateful_reader, [stateful_reader.prefetch_op] * num_threads)
tf.compat.v1.train.add_queue_runner(queue_runner)
tf.compat.v1.train.start_queue_runners(sess=session)
while True:
  # Step through batches, perform training or inference...
  session.run([lstm_output])
| Args | |
|---|---|
| batch_size | int or int32 scalar Tensor, how large minibatches should
be when accessing thestate()method andcontext,sequences, etc,
properties. | 
| num_unroll | Python integer, how many time steps to unroll at a time.
The input sequences of length kare then split intok / num_unrollmany segments. | 
| input_length | An int32 scalar Tensor, the length of the sequence prior
to padding.  This value may be at mostpadded_lengthfor any given
input (see below for the definition ofpadded_length).
Batched and total lengths of the current iteration are made accessible
via thelengthandtotal_lengthproperties.  The shape of
input_length (scalar) must be fully specified. | 
| input_key | A string scalar Tensor, the unique key for the given
input.  This is used to keep track of the split minibatch elements
of this input.  Batched keys of the current iteration are made
accessible via thekeyproperty.  The shape ofinput_key(scalar)
must be fully specified. | 
| input_sequences | A dict mapping string names to Tensorvalues.  The
values must all have matching first dimension, calledpadded_length.
TheSequenceQueueingStateSaverwill split these tensors along
this first dimension into minibatch elements of dimensionnum_unroll. Batched and segmented sequences of the current iteration
are made accessible via thesequencesproperty. | 
| input_context | A dict mapping string names to Tensorvalues.  The values
are treated as "global" across all time splits of the given input,
and will be copied across for all minibatch elements accordingly.
Batched and copied context of the current iteration are made
accessible via thecontextproperty. | 
| initial_states | A dict mapping string state names to multi-dimensional
values (e.g. constants or tensors).  This input defines the set of
states that will be kept track of during computing iterations, and
which can be accessed via the stateandsave_statemethods. | 
| capacity | The max capacity of the SQSS in number of examples. Needs to be
at least batch_size. Defaults to unbounded. | 
| allow_small_batch | If true, the SQSS will return smaller batches when there aren't enough input examples to fill a whole batch and the end of the input has been reached (i.e., the underlying barrier has been closed). | 
| name | An op name string (optional). | 
| Raises | |
|---|---|
| TypeError | if any of the inputs is not an expected type. | 
| ValueError | if any of the input values is inconsistent, e.g. if not enough shape information is available from inputs to build the state saver. | 
| Attributes | |
|---|---|
| barrier | |
| batch_size | |
| name | |
| next_batch | The NextQueuedSequenceBatchproviding access to batched output data.Also provides access to the  In order to access data in  | 
| num_unroll | |
| prefetch_op | The op used to prefetch new data into the state saver. Running it once enqueues one new input example into the state saver.
The first time this gets called, it additionally creates the prefetch_op.
Subsequent calls simply return the previously created  It should be run in a separate thread via e.g. a  | 
Methods
close
close(
    cancel_pending_enqueues=False, name=None
)
Closes the barrier and the FIFOQueue.
This operation signals that no more segments of new sequences will be enqueued. New segments of already inserted sequences may still be enqueued and dequeued if there is a sufficient number filling a batch or allow_small_batch is true. Otherwise dequeue operations will fail immediately.
| Args | |
|---|---|
| cancel_pending_enqueues | (Optional.) A boolean, defaulting to False. IfTrue, all pending enqueues to the underlying queues will
be cancelled, and completing already started sequences is not possible. | 
| name | Optional name for the op. | 
| Returns | |
|---|---|
| The operation that closes the barrier and the FIFOQueue. |