Computes the quantile boundaries of a Tensor over the whole dataset.
tft.quantiles(
    x: tf.Tensor,
    num_buckets: int,
    epsilon: float,
    weights: Optional[tf.Tensor] = None,
    reduce_instance_dims: bool = True,
    name: Optional[str] = None
) -> tf.Tensor
Quantile boundaries are computed using approximate quantiles, and the error
tolerance is specified using epsilon. The boundaries divide the input tensor
into num_buckets approximately equal parts.
See go/squawd for details, and how to control the error due to approximation.
NaN input values and values with NaN weights are ignored.
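To make the relationship between num_buckets and the returned boundaries concrete, here is a minimal sketch in plain Python. It is not the tft.quantiles implementation: it computes exact quantiles with the standard library instead of an approximate summary, and the helper names exact_quantile_boundaries and bucket_counts are hypothetical. It only illustrates that num_buckets buckets need num_buckets - 1 boundaries and that NaN values are dropped before the boundaries are computed.

```python
import bisect
import math
import statistics


def exact_quantile_boundaries(values, num_buckets):
    """Exact (not approximate) boundaries; NaN values are dropped first."""
    finite = [v for v in values if not math.isnan(v)]
    # statistics.quantiles returns n - 1 cut points for n groups.
    return statistics.quantiles(finite, n=num_buckets, method="inclusive")


def bucket_counts(values, boundaries):
    """Counts how many non-NaN values land in each bucket."""
    counts = [0] * (len(boundaries) + 1)
    for v in values:
        if math.isnan(v):
            continue  # mirror the "NaN inputs are ignored" behaviour
        counts[bisect.bisect_right(boundaries, v)] += 1
    return counts


# 100 ordinary values plus one NaN that should be ignored.
data = [float(i) for i in range(1, 101)] + [float("nan")]
boundaries = exact_quantile_boundaries(data, num_buckets=4)
counts = bucket_counts(data, boundaries)

print(len(boundaries))  # number of boundaries for 4 buckets
print(counts)           # how evenly the 100 finite values are split
```

With an approximate algorithm the bucket sizes would be allowed to drift from this even split by an amount governed by epsilon.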
Args |
x
|
An input Tensor.
|
num_buckets
|
Values in x are divided into approximately equal-sized buckets, where the
number of buckets is num_buckets. The number of returned quantiles is
num_buckets - 1.
|
epsilon
|
Error tolerance, typically a small fraction close to zero (e.g.
0.01). Higher values of epsilon increase the quantile approximation error,
and hence result in more unequal buckets, but could improve performance
and reduce resource consumption. Some measured results on memory consumption:
for epsilon = 0.001, the amount of memory for each buffer to hold the
summary for 1 trillion input values is ~25000 bytes. If epsilon is
relaxed to 0.01, the buffer size drops to ~2000 bytes for the same input
size. The buffer size also determines the amount of work in the
different stages of the Beam pipeline. In general, a larger epsilon
results in fewer and smaller stages, and less time. For more performance
trade-offs see also http://web.cs.ucla.edu/~weiwang/paper/SSDBM07_2.pdf
|
weights
|
(Optional) Weights tensor for the quantiles. Tensor must have the
same batch size as x.
|
reduce_instance_dims
|
By default collapses the batch and instance dimensions
to arrive at a single output vector. If False, only collapses the batch
dimension and outputs a vector of the same shape as the input.
|
name
|
(Optional) A name for this operation.
|
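The role of epsilon can be sketched as a rank-error check. This assumes that epsilon bounds the rank error of each boundary as a fraction of the number of input values, which is the usual convention for approximate quantile summaries but is not spelled out in this section. The helper names rank_error and within_tolerance, and the sample boundaries, are hypothetical and exist only for illustration.

```python
import bisect


def rank_error(sorted_values, boundary, target_fraction):
    """Returns |rank(boundary) / N - target_fraction| for one boundary."""
    n = len(sorted_values)
    rank = bisect.bisect_right(sorted_values, boundary)  # values <= boundary
    return abs(rank / n - target_fraction)


def within_tolerance(sorted_values, boundaries, epsilon):
    """Checks every boundary against its ideal quantile fraction."""
    num_buckets = len(boundaries) + 1
    return all(
        rank_error(sorted_values, b, (i + 1) / num_buckets) <= epsilon
        for i, b in enumerate(boundaries)
    )


values = sorted(float(i) for i in range(1, 1001))  # 1.0 .. 1000.0
exact = [250.0, 500.0, 750.0]         # ideal quartile boundaries
slightly_off = [255.0, 495.0, 752.0]  # hypothetical approximate result

print(within_tolerance(values, exact, epsilon=0.001))
print(within_tolerance(values, slightly_off, epsilon=0.01))
print(within_tolerance(values, slightly_off, epsilon=0.001))
```

Under this reading, a looser epsilon accepts boundaries that sit further from the ideal ranks, which is why the resulting buckets become more unequal.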
Returns |
The bucket boundaries represented as a list, with num_buckets - 1 elements,
unless reduce_instance_dims is False, which results in a Tensor of
shape x.shape + [num_buckets - 1].
See code below for discussion on the type of bucket boundaries.
|
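The effect of reduce_instance_dims on the result can be sketched with exact quantiles in plain Python. This is one reading of the description above, not the library's code: the batch is modelled as a list of rows, the hypothetical helper boundaries_all collapses both the batch and instance dimensions, and the hypothetical helper boundaries_per_feature collapses only the batch dimension, so each feature position gets its own boundaries.

```python
import statistics


def boundaries_all(batch, num_buckets):
    """Collapses batch and instance dimensions into one boundary list."""
    flat = [v for row in batch for v in row]
    return statistics.quantiles(flat, n=num_buckets, method="inclusive")


def boundaries_per_feature(batch, num_buckets):
    """Collapses only the batch dimension: one boundary list per feature."""
    columns = list(zip(*batch))  # transpose rows into feature columns
    return [
        statistics.quantiles(col, n=num_buckets, method="inclusive")
        for col in columns
    ]


# 8 rows with 2 features on very different scales.
batch = [[float(i), float(100 * i)] for i in range(1, 9)]

combined = boundaries_all(batch, num_buckets=4)
per_feature = boundaries_per_feature(batch, num_buckets=4)

print(len(combined))                  # one vector of num_buckets - 1 values
print([len(b) for b in per_feature])  # num_buckets - 1 values per feature
```

Keeping the features separate matters when they are on different scales, since a single shared set of boundaries would be dominated by the larger-valued feature.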