tfio.experimental.columnar.parse_avro

Parses avro records into a dict of tensors.

tfio.experimental.columnar.parse_avro(
    serialized, reader_schema, features, avro_names=None, name=None
)

This op parses serialized avro records into a dictionary mapping keys to Tensor and SparseTensor objects. features is a dict from keys to VarLenFeature, SparseFeature, RaggedFeature, and FixedLenFeature objects. Each VarLenFeature and SparseFeature is mapped to a SparseTensor; each FixedLenFeature is mapped to a Tensor.

Each VarLenFeature maps to a SparseTensor of the specified type representing a ragged matrix. Its indices are [batch, index] where batch identifies the example in serialized, and index is the value's index in the list of values associated with that feature and example.
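The [batch, index] layout can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not the op's implementation: the helper name varlen_to_sparse is hypothetical, and the dense_shape of [batch_size, longest list] is an assumption borrowed from the tf.io.parse_example convention.

```python
def varlen_to_sparse(batch):
    """Build (indices, values, dense_shape) for a batch of variable-length lists.

    Hypothetical helper: mirrors the [batch, index] layout described above.
    """
    indices, values = [], []
    for b, row in enumerate(batch):        # b identifies the example in the batch
        for i, v in enumerate(row):        # i is the value's position in its list
            indices.append([b, i])
            values.append(v)
    # Assumption: the second dimension is the length of the longest list.
    max_len = max((len(row) for row in batch), default=0)
    return indices, values, [len(batch), max_len]


# Three examples; the second one has no values for this feature.
indices, values, dense_shape = varlen_to_sparse([[1, 2, 3], [], [4]])
# indices     == [[0, 0], [0, 1], [0, 2], [2, 0]]
# values      == [1, 2, 3, 4]
# dense_shape == [3, 3]
```

An example with no values contributes no entries to indices, which is how the ragged structure is preserved.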

Each SparseFeature maps to a SparseTensor of the specified type representing a Tensor of dense_shape [batch_size] + SparseFeature.size. Its values come from the feature in the examples with key value_key. Each values[i] comes from a position k in the feature of an example at batch entry batch. This positional information is recorded in indices[i] as [batch, index_0, index_1, ...], where index_j is the k-th value of the feature in the example with key SparseFeature.index_key[j]. In other words, the indices of a SparseTensor (except the first index, which indicates the batch entry) are split by dimension into different features of the avro record. Because of this complexity, prefer a VarLenFeature over a SparseFeature whenever possible.
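The way indices are assembled from several index features can be sketched in plain Python. This is an illustration of the description above, not the op's code: the helper sparse_feature_indices and the field names "row", "col", and "val" are hypothetical.

```python
def sparse_feature_indices(examples, index_keys, value_key):
    """Combine per-dimension index features into [batch, index_0, index_1, ...].

    Hypothetical helper: each example is a dict of equally long lists.
    """
    indices, values = [], []
    for b, example in enumerate(examples):
        for k, v in enumerate(example[value_key]):
            # index_j is the k-th value of the feature stored under index_keys[j]
            indices.append([b] + [example[key][k] for key in index_keys])
            values.append(v)
    return indices, values


examples = [
    {"row": [0, 2], "col": [1, 3], "val": [0.5, 1.5]},
    {"row": [1], "col": [0], "val": [2.5]},
]
indices, values = sparse_feature_indices(examples, ["row", "col"], "val")
# indices == [[0, 0, 1], [0, 2, 3], [1, 1, 0]]
# values  == [0.5, 1.5, 2.5]
```

Each record has to keep one index feature per dimension aligned with the value feature, which is the extra bookkeeping a VarLenFeature avoids.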

Each FixedLenFeature df maps to a Tensor of the specified type (or tf.float32 if not specified) and shape (serialized.size(),) + df.shape. FixedLenFeature entries with a default_value are optional. Without a default value, the op fails if that feature is missing from any example in serialized.
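The shape and default-value rules can be sketched as follows. This is a simplified pure-Python model under stated assumptions, not the op itself: fixed_len_column is a hypothetical helper, values are kept as flat lists, and a ValueError stands in for whatever error the real op raises.

```python
def fixed_len_column(examples, key, shape, default=None):
    """Collect one fixed-length feature across a batch of dict-like examples.

    Returns (rows, output_shape), where output_shape is (batch_size,) + shape.
    """
    size = 1
    for dim in shape:
        size *= dim
    rows = []
    for example in examples:
        if key in example:
            row = list(example[key])
        elif default is not None:
            row = list(default)            # missing feature: use the default
        else:
            raise ValueError(f"feature {key!r} is missing and has no default")
        if len(row) != size:
            raise ValueError(f"feature {key!r} does not match shape {shape}")
        rows.append(row)
    return rows, (len(examples),) + tuple(shape)


rows, out_shape = fixed_len_column([{"x": [1, 2]}, {}], "x", (2,), default=[0, 0])
# rows      == [[1, 2], [0, 0]]
# out_shape == (2, 2)
```

Calling the same helper without default on a batch that lacks "x" raises, which mirrors the failure described above.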

Use this op inside the function passed to dataset.map, for example dataset.map(parser_fn), where parser_fn calls parse_avro.

This op only works on batched serialized input!
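A possible way to wire the op into a tf.data pipeline is sketched below. Several things here are assumptions rather than documented behavior: the schema, the field names "id" and "weight", the helpers example_features and parse_batches, the use of tf.io.FixedLenFeature as the feature type, and plain field names as feature keys. The dataset argument is assumed to yield serialized avro records as scalar strings read with the same reader schema.

```python
import json

# Hypothetical reader schema with two scalar fields.
READER_SCHEMA = json.dumps({
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "weight", "type": "float"},
    ],
})


def example_features():
    """Feature spec for the schema above (assumed key syntax: field names)."""
    import tensorflow as tf  # imported lazily so the sketch loads without TF

    return {
        "id": tf.io.FixedLenFeature([], tf.int64),
        "weight": tf.io.FixedLenFeature([], tf.float32),
    }


def parse_batches(dataset, reader_schema, features, batch_size=32):
    """Batch serialized records first, then parse each batch with parse_avro."""
    import tensorflow_io as tfio

    def parser_fn(serialized):
        # serialized is a batch (1-D string tensor), as the op requires
        return tfio.experimental.columnar.parse_avro(
            serialized, reader_schema, features
        )

    return dataset.batch(batch_size).map(parser_fn)
```

Batching before the map call is what satisfies the batched-input requirement; reversing the order would hand scalar strings to the op.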

Args
`serialized`	The batched, serialized string tensors.
`reader_schema`	The reader schema. This MUST match the reader schema from the avro_record_dataset; otherwise, this op will segfault!
`features`	A map of feature names mapped to feature information.
`avro_names`	(Optional.) May contain descriptive names for the corresponding serialized avro parts. These may be useful for debugging, but they have no effect on the output. If not `None`, `avro_names` must be the same length as `serialized`.
`name`	The name of the op.

Returns
A map of feature names to tensors.
