tfds.beam.ReadFromTFDS
Creates a beam pipeline yielding TFDS examples.
tfds.beam.ReadFromTFDS(
    pipeline,
    builder: tfds.core.DatasetBuilder,
    split: str,
    workers_per_shard: int = 1,
    **as_dataset_kwargs
)
Each dataset shard will be processed in parallel.
Usage:
import apache_beam as beam
import tensorflow_datasets as tfds

builder = tfds.builder('my_dataset')
_ = (
    pipeline
    | tfds.beam.ReadFromTFDS(builder, split='train')
    | beam.Map(tfds.as_numpy)
    | ...
)
Use tfds.as_numpy to convert each example from tf.Tensor to numpy.
The split argument can use subsplits, e.g. 'train[:100]', but only when
batch_size=None (passed in as_dataset_kwargs). Note: the same examples will be
read as with tfds.load(split='train[:100]'), but their order will differ.
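The subsplit case can be sketched as follows. This is a minimal illustration rather than a reference pipeline: the dataset name 'mnist', the helper name build_subsplit_pipeline, the call to download_and_prepare() and the final count-and-print steps are all assumptions added for the example. Only ReadFromTFDS, tfds.as_numpy, the 'train[:100]' subsplit and batch_size=None come from the description above.

```python
def build_subsplit_pipeline(dataset_name: str = "mnist"):
    """Builds (but does not run) a pipeline that reads a 100-example subsplit.

    `dataset_name` and this helper are hypothetical, chosen for illustration.
    """
    # Imported inside the function so the heavy dependencies are only
    # required when the pipeline is actually built.
    import apache_beam as beam
    import tensorflow_datasets as tfds

    builder = tfds.builder(dataset_name)
    # Assumption: the dataset files must exist on disk before they can be read.
    builder.download_and_prepare()

    pipeline = beam.Pipeline()
    _ = (
        pipeline
        # batch_size=None is forwarded to builder.as_dataset via
        # **as_dataset_kwargs; the text above requires it for subsplits.
        | tfds.beam.ReadFromTFDS(builder, split="train[:100]", batch_size=None)
        | beam.Map(tfds.as_numpy)
        # Count the examples instead of relying on their order, since the
        # order may differ from tfds.load(split='train[:100]').
        | beam.combiners.Count.Globally()
        | beam.Map(print)
    )
    return pipeline
```

Calling build_subsplit_pipeline().run().wait_until_finish() would execute the pipeline on Beam's default runner. Counting is used here because it does not depend on example order.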
Args
pipeline | Beam pipeline (automatically set).
builder | Dataset builder to load.
split | Split name to load (e.g. train+test, train).
workers_per_shard | Number of workers that should read a shard in parallel. The shard will be split in this many parts. Note that workers cannot skip to a specific row in a tfrecord file, so they need to read the file up until that point without using that data.
**as_dataset_kwargs | Arguments forwarded to builder.as_dataset.

Returns
The PCollection containing the TFDS examples.
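The note on workers_per_shard implies a trade-off: more workers per shard means more rows read and discarded. The pure-Python sketch below makes that cost concrete under a simplifying assumption, namely that a shard is divided into contiguous, near-equal row ranges. The names WorkerPlan and plan_shard are hypothetical and do not reflect how TFDS actually partitions a shard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerPlan:
    start: int  # first row this worker keeps
    stop: int   # one past the last row this worker keeps

    @property
    def rows_kept(self) -> int:
        return self.stop - self.start

    @property
    def rows_scanned(self) -> int:
        # A sequential reader cannot seek, so it reads from row 0 up to `stop`.
        return self.stop


def plan_shard(num_rows: int, workers_per_shard: int) -> list[WorkerPlan]:
    """Splits rows [0, num_rows) into contiguous, near-equal parts."""
    if workers_per_shard < 1:
        raise ValueError("workers_per_shard must be >= 1")
    base, extra = divmod(num_rows, workers_per_shard)
    plans, start = [], 0
    for i in range(workers_per_shard):
        # The first `extra` workers take one additional row each.
        size = base + (1 if i < extra else 0)
        plans.append(WorkerPlan(start, start + size))
        start += size
    return plans


plans = plan_shard(num_rows=1000, workers_per_shard=4)
total_kept = sum(p.rows_kept for p in plans)        # 1000
total_scanned = sum(p.rows_scanned for p in plans)  # 250 + 500 + 750 + 1000 = 2500
```

Under this model, four workers on a 1000-row shard scan 2500 rows in total to keep 1000. Raising workers_per_shard can shorten wall-clock time per shard, but it increases the total amount of data read.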
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-06-19 UTC.