Init module for TF.Transform.
Modules
coders
module: Module level imports for tensorflow_transform.coders.
experimental
module: Module level imports for tensorflow_transform.experimental.
Classes
class DatasetMetadata
: Metadata about a dataset used for the "instance dict" format.
class TFTransformOutput
: A wrapper around the output of tf.Transform.
class TransformFeaturesLayer
: A Keras layer for applying a tf.Transform output to input layers.
Functions
annotate_asset(...)
: Creates mapping between user-defined keys and SavedModel assets.
apply_buckets(...)
: Returns a bucketized column, with a bucket index assigned to each input.
apply_buckets_with_interpolation(...)
: Interpolates within the provided buckets and then normalizes to 0 to 1.
apply_pyfunc(...)
: Applies a Python function to some Tensors.
apply_vocabulary(...)
: Maps x to a vocabulary specified by the deferred tensor.
bag_of_words(...)
: Computes a bag of "words" based on the specified ngram configuration.
bucketize(...)
: Returns a bucketized column, with a bucket index assigned to each input.
bucketize_per_key(...)
: Returns a bucketized column, with a bucket index assigned to each input.
compute_and_apply_vocabulary(...)
: Generates a vocabulary for x and maps it to an integer with this vocab.
count_per_key(...)
: Computes the count of each element of a Tensor.
covariance(...)
: Computes the covariance matrix over the whole dataset.
deduplicate_tensor_per_row(...)
: Deduplicates each row (0-th dimension) of the provided tensor.
estimated_probability_density(...)
: Computes an approximate probability density at each x, given the bins.
get_analyze_input_columns(...)
: Return columns that are required inputs of AnalyzeDataset.
get_num_buckets_for_transformed_feature(...)
: Provides the number of buckets for a transformed feature if annotated.
get_transform_input_columns(...)
: Return columns that are required inputs of TransformDataset.
hash_strings(...)
: Hash strings into buckets.
histogram(...)
: Computes a histogram over x, given the bin boundaries or bin count.
make_and_track_object(...)
: Keeps track of the object created by invoking trackable_factory_callable.
max(...)
: Computes the maximum of the values of x over the whole dataset.
mean(...)
: Computes the mean of the values of a Tensor over the whole dataset.
min(...)
: Computes the minimum of the values of x over the whole dataset.
ngrams(...)
: Create a SparseTensor of n-grams.
pca(...)
: Computes PCA on the dataset using biased covariance.
quantiles(...)
: Computes the quantile boundaries of a Tensor over the whole dataset.
scale_by_min_max(...)
: Scale a numerical column into the range [output_min, output_max].
scale_by_min_max_per_key(...)
: Scale a numerical column into a predefined range on a per-key basis.
scale_to_0_1(...)
: Returns a column which is the input column scaled to have range [0,1].
scale_to_0_1_per_key(...)
: Returns a column which is the input column scaled to have range [0,1].
scale_to_gaussian(...)
: Returns an (approximately) normal column with mean 0 and variance 1.
scale_to_z_score(...)
: Returns a standardized column with mean 0 and variance 1.
scale_to_z_score_per_key(...)
: Returns a standardized column with mean 0 and variance 1, grouped per key.
segment_indices(...)
: Returns a Tensor of indices within each segment.
size(...)
: Computes the total size of instances in a Tensor over the whole dataset.
sparse_tensor_left_align(...)
: Re-arranges a tf.SparseTensor and returns a left-aligned version of it.
sparse_tensor_to_dense_with_shape(...)
: Converts a SparseTensor into a dense tensor and sets its shape.
sum(...)
: Computes the sum of the values of a Tensor over the whole dataset.
tfidf(...)
: Maps the terms in x to their term frequency * inverse document frequency.
tukey_h_params(...)
: Computes the h parameters of the values of a Tensor over the dataset.
tukey_location(...)
: Computes the location of the values of a Tensor over the whole dataset.
tukey_scale(...)
: Computes the scale of the values of a Tensor over the whole dataset.
var(...)
: Computes the variance of the values of a Tensor over the whole dataset.
vocabulary(...)
: Computes the unique values of x over the whole dataset.
word_count(...)
: Find the token count of each document/row.
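
Many of the functions above are described as operating "over the whole dataset" (for example mean, var, and scale_to_z_score), while others assign "a bucket index ... to each input" (for example apply_buckets). The following is a minimal plain-Python sketch of those two ideas, not the library's implementation: the helper names full_pass_mean_and_var, z_score, and bucket_indices are hypothetical, the use of population variance is an assumption of this sketch, and so is the choice that a value equal to a boundary falls into the higher bucket.

```python
import bisect
import math


def full_pass_mean_and_var(values):
    """One pass over every value: population mean and variance (assumed here)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def z_score(values):
    """Standardize values using statistics computed over the full list."""
    mean, var = full_pass_mean_and_var(values)
    std = math.sqrt(var)
    # Guard against a constant column, where the variance is zero.
    return [(v - mean) / std if std > 0 else 0.0 for v in values]


def bucket_indices(values, boundaries):
    """Assign each value the index of the bucket it falls into.

    `boundaries` must be sorted ascending. With k boundaries there are
    k + 1 buckets; a value equal to a boundary goes to the higher bucket
    (an assumption of this sketch).
    """
    return [bisect.bisect_right(boundaries, v) for v in values]


if __name__ == "__main__":
    print(z_score([1.0, 2.0, 3.0]))
    print(bucket_indices([0.5, 1.0, 2.5, 9.0], [1.0, 2.0, 3.0]))
```

The split mirrors the descriptions above: the statistics need every value before any single value can be transformed, whereas the bucket lookup is applied to each input independently once the boundaries are known.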