tf.data.experimental.TFRecordWriter
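Writes a dataset to a TFRecord file. (deprecated)
Deprecated: This class is deprecated and will be removed in a future version. To write TFRecords to disk, use tf.io.TFRecordWriter. To save and load the contents of a dataset, use tf.data.experimental.save and tf.data.experimental.load.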
tf.data.experimental.TFRecordWriter(
    filename, compression_type=None
)
The elements of the dataset must be scalar strings. To serialize dataset
elements as strings, you can use the tf.io.serialize_tensor function.
import tensorflow as tf
# Serialize each int64 element to a scalar string before writing.
dataset = tf.data.Dataset.range(3)
dataset = dataset.map(tf.io.serialize_tensor)
writer = tf.data.experimental.TFRecordWriter("/path/to/file.tfrecord")
writer.write(dataset)
To read back the elements, use tf.data.TFRecordDataset.
dataset = tf.data.TFRecordDataset("/path/to/file.tfrecord")
dataset = dataset.map(lambda x: tf.io.parse_tensor(x, tf.int64))
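Because this class is marked deprecated, the same round trip can also be sketched with tf.io.TFRecordWriter. The following is a minimal sketch, not the library's prescribed migration: the temporary path and the variable names are assumptions made for illustration, and it relies on eager execution so that each element can be converted to bytes.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical output location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "file.tfrecord")

dataset = tf.data.Dataset.range(3).map(tf.io.serialize_tensor)

# tf.io.TFRecordWriter writes one record per call and expects bytes,
# so iterate eagerly and convert each scalar string tensor with .numpy().
with tf.io.TFRecordWriter(path) as writer:
    for record in dataset:
        writer.write(record.numpy())

# Read the records back and decode them to int64 tensors.
restored = tf.data.TFRecordDataset(path).map(
    lambda x: tf.io.parse_tensor(x, tf.int64))
values = [int(v.numpy()) for v in restored]
print(values)
```

Unlike the dataset-level writer, this variant performs the iteration in Python, which is simple but may be slower for large datasets.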
To shard a dataset across multiple TFRecord files:
# PATH_PREFIX (string) and NUM_SHARDS (integer) are assumed to be defined by the caller.
dataset = ... # dataset to be written

def reduce_func(key, dataset):
  # Build the shard filename from the prefix and the shard index.
  filename = tf.strings.join([PATH_PREFIX, tf.strings.as_string(key)])
  writer = tf.data.experimental.TFRecordWriter(filename)
  # Drop the enumeration index and write only the element.
  writer.write(dataset.map(lambda _, x: x))
  return tf.data.Dataset.from_tensors(filename)

dataset = dataset.enumerate()
dataset = dataset.apply(tf.data.experimental.group_by_window(
  lambda i, _: i % NUM_SHARDS, reduce_func, tf.int64.max
))

# Iterate through the dataset to trigger data writing.
for _ in dataset:
  pass
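As a concrete variation on the sharding idea above, the sketch below swaps in a different technique: Dataset.shard combined with tf.io.TFRecordWriter, instead of group_by_window. The values chosen for NUM_SHARDS and PATH_PREFIX, the ten-element dataset, and the read_shard helper are all assumptions made for illustration.

```python
import os
import tempfile

import tensorflow as tf

NUM_SHARDS = 3  # assumed shard count for this sketch
PATH_PREFIX = os.path.join(tempfile.mkdtemp(), "shard-")  # hypothetical prefix

dataset = tf.data.Dataset.range(10).map(tf.io.serialize_tensor)

filenames = []
for index in range(NUM_SHARDS):
    filename = PATH_PREFIX + str(index)
    # Dataset.shard keeps every NUM_SHARDS-th element starting at `index`,
    # which yields the same grouping as `i % NUM_SHARDS` on enumerated elements.
    with tf.io.TFRecordWriter(filename) as writer:
        for record in dataset.shard(NUM_SHARDS, index):
            writer.write(record.numpy())
    filenames.append(filename)


def read_shard(filename):
    """Hypothetical helper: decode every record of one shard to a Python int."""
    records = tf.data.TFRecordDataset(filename)
    return [int(tf.io.parse_tensor(r, tf.int64).numpy()) for r in records]


shards = [read_shard(f) for f in filenames]
print(shards)
```

This version iterates over the source dataset once per shard, so it trades some efficiency for simplicity and for avoiding the deprecated writer.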
Args
filename: a string path indicating where to write the TFRecord data.
compression_type: (Optional.) a string indicating what type of compression to use when writing the file. See tf.io.TFRecordCompressionType for what types of compression are available. Defaults to None.
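To illustrate the compression_type argument, here is a minimal sketch that writes a GZIP-compressed file and reads it back. The temporary path is an assumption, and the example presumes that the reader is given the same compression type that was used for writing.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical output location for the compressed file.
path = os.path.join(tempfile.mkdtemp(), "file.tfrecord.gz")

dataset = tf.data.Dataset.range(3).map(tf.io.serialize_tensor)

# Write with GZIP compression.
writer = tf.data.experimental.TFRecordWriter(path, compression_type="GZIP")
writer.write(dataset)

# The reader must be told which compression was used.
restored = tf.data.TFRecordDataset(path, compression_type="GZIP").map(
    lambda x: tf.io.parse_tensor(x, tf.int64))
values = [int(v.numpy()) for v in restored]
print(values)
```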
Methods
write
write(
    dataset
)
Writes a dataset to a TFRecord file.
An operation that writes the content of the specified dataset to the file
specified in the constructor.
If the file exists, it will be overwritten.
Args
dataset: a tf.data.Dataset whose elements are to be written to a file.
Returns
In graph mode, this returns an operation which, when executed, performs the
write. In eager mode, the write is performed by the method itself and
there is no return value.
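The difference between the two execution modes can be sketched as follows. This is an illustrative example under assumptions: the temporary paths and variable names are hypothetical, and the graph-mode half uses an explicit tf.Graph with a tf.compat.v1.Session to run the returned operation.

```python
import os
import tempfile

import tensorflow as tf

tmp_dir = tempfile.mkdtemp()
eager_path = os.path.join(tmp_dir, "eager.tfrecord")  # hypothetical paths
graph_path = os.path.join(tmp_dir, "graph.tfrecord")

# Eager mode (the TF2 default): the file is written during the call.
dataset = tf.data.Dataset.range(3).map(tf.io.serialize_tensor)
eager_result = tf.data.experimental.TFRecordWriter(eager_path).write(dataset)

# Graph mode: write() only builds an operation; nothing is written yet.
graph = tf.Graph()
with graph.as_default():
    graph_dataset = tf.data.Dataset.range(3).map(tf.io.serialize_tensor)
    write_op = tf.data.experimental.TFRecordWriter(graph_path).write(
        graph_dataset)

exists_before_run = os.path.exists(graph_path)

# Running the operation in a session performs the write.
with tf.compat.v1.Session(graph=graph) as sess:
    sess.run(write_op)

exists_after_run = os.path.exists(graph_path)
print(eager_result, exists_before_run, exists_after_run)
```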
Raises
TypeError: if dataset is not a tf.data.Dataset.
TypeError: if the elements produced by the dataset are not scalar strings.
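Both error conditions can be observed with a short sketch. The temporary path and the two sample inputs are assumptions chosen for illustration: a plain Python list stands in for a non-dataset argument, and an int64 range dataset stands in for a dataset whose elements are not scalar strings.

```python
import os
import tempfile

import tensorflow as tf

writer = tf.data.experimental.TFRecordWriter(
    os.path.join(tempfile.mkdtemp(), "file.tfrecord"))  # hypothetical path

bad_inputs = [
    [b"not", b"a dataset"],     # not a tf.data.Dataset
    tf.data.Dataset.range(3),   # int64 elements, not scalar strings
]

errors = []
for bad_input in bad_inputs:
    try:
        writer.write(bad_input)
    except TypeError as e:
        # Record the exception type; the message text may vary by version.
        errors.append(type(e).__name__)

print(errors)
```

Mapping tf.io.serialize_tensor over the second input before writing would turn its elements into scalar strings and avoid the error.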
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2023-10-06 UTC.