tfds.core.ShardedFileTemplate

Template to produce filenames for sharded datasets.

tfds.core.ShardedFileTemplate(
    data_dir: epath.Path,
    template: str = DEFAULT_FILENAME_TEMPLATE,
    dataset_name: Optional[str] = None,
    split: Optional[str] = None,
    filetype_suffix: Optional[str] = None
)

Attributes
`data_dir`	the directory that contains the files for the shards.
`template`	template of the sharded files, e.g. '{SPLIT}/data.{FILEFORMAT}-{SHARD_INDEX}'.
`dataset_name`	the name of the dataset.
`split`	the split of the dataset.
`filetype_suffix`	the filetype suffix to denote the type of file. For example, `tfrecord`.
`regex`	the regular expression for this template. Can be used to test whether a filename matches this template.

Methods

`filepath_prefix`

filepath_prefix() -> str

`is_valid`

is_valid(
    filename: str
) -> bool

Returns whether the given filename follows this template.

`parse_filename_info`

parse_filename_info(
    filename: str
) -> Optional[FilenameInfo]

Parses the filename using this template.

Note that when the filename doesn't specify the dataset name, split, or filetype suffix, but this template does, then the value in the template will be used.

Arguments
`filename`	the filename that should be parsed.

Returns
the FilenameInfo corresponding to the given file if it could be parsed. None otherwise.
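The fallback behaviour described above can be pictured with a small stand-alone sketch. It does not import TFDS: the function `parse_filename`, the regular expression, and the dictionary result are all hypothetical stand-ins for the real `regex` attribute and `FilenameInfo`, chosen only to show fields missing from the filename being filled from the template and `None` being returned on a mismatch.

```python
import re
from typing import Dict, Optional

def parse_filename(
    filename: str,
    pattern: "re.Pattern[str]",
    defaults: Dict[str, str],
) -> Optional[Dict[str, str]]:
    """Hypothetical helper: parse `filename`, falling back to `defaults`."""
    match = pattern.fullmatch(filename)
    if match is None:
        # The filename does not follow the template.
        return None
    # Start from the values the template already knows...
    info = dict(defaults)
    # ...and let values found in the filename override them.
    info.update({k: v for k, v in match.groupdict().items() if v is not None})
    return info

# Assumed pattern for a template that only encodes file format and shard index.
pattern = re.compile(r"data\.(?P<filetype_suffix>\w+)-(?P<shard_index>\d{5})")
defaults = {"dataset_name": "mnist", "split": "train"}

parsed = parse_filename("data.tfrecord-00003", pattern, defaults)
unparsed = parse_filename("notes.txt", pattern, defaults)
```

Here `parsed` carries `dataset_name` and `split` from the defaults even though neither appears in the filename, while `unparsed` is `None`.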

`relative_filepath`

relative_filepath(
    *, shard_index: int, num_shards: Optional[int]
) -> str

Returns the path (relative to the data dir) of the shard.

`replace`

replace(
    **kwargs
) -> 'ShardedFileTemplate'

Returns a copy of the ShardedFileTemplate with updated attributes.
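Copy-with-updates semantics of this kind are what `dataclasses.replace` from the standard library provides. The sketch below uses a hypothetical `MiniTemplate` dataclass rather than the real class, and it is an assumption that `ShardedFileTemplate.replace` behaves the same way; it only shows that the original object is left untouched.

```python
import dataclasses
from typing import Optional

@dataclasses.dataclass(frozen=True)
class MiniTemplate:
    """Hypothetical stand-in with a few of the documented attributes."""
    data_dir: str
    split: Optional[str] = None
    filetype_suffix: Optional[str] = None

base = MiniTemplate(data_dir="/data/mnist", filetype_suffix="tfrecord")

# Build a new object with `split` set; `base` itself is not modified.
train = dataclasses.replace(base, split="train")
```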

`sharded_filenames`

sharded_filenames(
    num_shards: int
) -> List[str]

`sharded_filepath`

sharded_filepath(
    *, shard_index: int, num_shards: Optional[int]
) -> epath.Path

Returns the filename (including full path if data_dir is set) for the given shard.

`sharded_filepaths`

sharded_filepaths(
    num_shards: int
) -> List[epath.Path]

`sharded_filepaths_pattern`

sharded_filepaths_pattern(
    *, num_shards: Optional[int] = None
) -> str

Returns a pattern describing all the file paths captured by this template.

If `num_shards` is given, then it returns `/path/dataset_name-split.fileformat@num_shards`. If `num_shards` is not given, then it returns `/path/dataset_name-split.fileformat*`.

Arguments
`num_shards`	optional specification of the number of shards.

Returns
the pattern describing all shards captured by this template.
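The two pattern forms can be sketched in a few lines of plain Python. The function `filepaths_pattern` and its `prefix` argument are hypothetical and do not call TFDS; the sketch only mirrors the `@num_shards` versus `*` rule stated above.

```python
from typing import Optional

def filepaths_pattern(prefix: str, num_shards: Optional[int] = None) -> str:
    """Hypothetical helper mirroring the two documented pattern forms."""
    if num_shards is not None:
        # Known shard count: "<prefix>@<num_shards>".
        return f"{prefix}@{num_shards}"
    # Unknown shard count: glob over every shard sharing the prefix.
    return f"{prefix}*"

prefix = "/path/mnist-train.tfrecord"  # example value, not taken from TFDS
with_count = filepaths_pattern(prefix, num_shards=5)
without_count = filepaths_pattern(prefix)
```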

`__eq__`

__eq__(
    other
)

Class Variables
dataset_name	`None`
filetype_suffix	`None`
split	`None`
template	`'{DATASET}-{SPLIT}.{FILEFORMAT}-{SHARD_X_OF_Y}'`
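As a rough illustration of how the default template turns into a shard filename, the stand-alone sketch below expands the placeholders with `str.format`. It does not import TFDS: `expand_template` is a hypothetical helper, and the zero-padded `00000-of-00005` rendering of `{SHARD_X_OF_Y}` is an assumption based on the filenames TFDS commonly writes.

```python
DEFAULT_TEMPLATE = "{DATASET}-{SPLIT}.{FILEFORMAT}-{SHARD_X_OF_Y}"

def expand_template(
    template: str,
    *,
    dataset: str,
    split: str,
    fileformat: str,
    shard_index: int,
    num_shards: int,
) -> str:
    """Hypothetical helper: fill the template placeholders for one shard."""
    # Assumption: shard position is rendered as five-digit, zero-padded numbers.
    shard_x_of_y = f"{shard_index:05d}-of-{num_shards:05d}"
    return template.format(
        DATASET=dataset,
        SPLIT=split,
        FILEFORMAT=fileformat,
        SHARD_X_OF_Y=shard_x_of_y,
    )

name = expand_template(
    DEFAULT_TEMPLATE,
    dataset="mnist",
    split="train",
    fileformat="tfrecord",
    shard_index=0,
    num_shards=5,
)
print(name)  # mnist-train.tfrecord-00000-of-00005
```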
