TensorFlow 1 version
 | 
  
     
    View source on GitHub
  
 | 
Pads sequences to the same length.
tf.keras.preprocessing.sequence.pad_sequences(
    sequences, maxlen=None, dtype='int32', padding='pre',
    truncating='pre', value=0.0
)
This function transforms a list (of length num_samples)
of sequences (lists of integers)
into a 2D Numpy array of shape (num_samples, num_timesteps).
num_timesteps is either the maxlen argument if provided,
or the length of the longest sequence in the list.
Sequences that are shorter than num_timesteps
are padded with value until they are num_timesteps long.
Sequences longer than num_timesteps are truncated
so that they fit the desired length.
The position where padding or truncation happens is determined by
the arguments padding and truncating, respectively.
Pre-padding or removing values from the beginning of the sequence is the
default.
sequence = [[1], [2, 3], [4, 5, 6]]tf.keras.preprocessing.sequence.pad_sequences(sequence)array([[0, 0, 1],[0, 2, 3],[4, 5, 6]], dtype=int32)
tf.keras.preprocessing.sequence.pad_sequences(sequence, value=-1)array([[-1, -1, 1],[-1, 2, 3],[ 4, 5, 6]], dtype=int32)
tf.keras.preprocessing.sequence.pad_sequences(sequence, padding='post')array([[1, 0, 0],[2, 3, 0],[4, 5, 6]], dtype=int32)
tf.keras.preprocessing.sequence.pad_sequences(sequence, maxlen=2)array([[0, 1],[2, 3],[5, 6]], dtype=int32)
Args | |
|---|---|
sequences
 | 
List of sequences (each sequence is a list of integers). | 
maxlen
 | 
Optional Int, maximum length of all sequences. If not provided, sequences will be padded to the length of the longest individual sequence. | 
dtype
 | 
(Optional, defaults to int32). Type of the output sequences.
To pad sequences with variable length strings, you can use object.
 | 
padding
 | 
String, 'pre' or 'post' (optional, defaults to 'pre'): pad either before or after each sequence. | 
truncating
 | 
String, 'pre' or 'post' (optional, defaults to 'pre'):
remove values from sequences larger than
maxlen, either at the beginning or at the end of the sequences.
 | 
value
 | 
Float or String, padding value. (Optional, defaults to 0.) | 
Returns | |
|---|---|
Numpy array with shape (len(sequences), maxlen)
 | 
Raises | |
|---|---|
ValueError
 | 
In case of invalid values for truncating or padding,
or in case of invalid shape for a sequences entry.
 | 
  TensorFlow 1 version
    View source on GitHub