Maps the terms in x to their inverse document frequency in the same order.
tft.experimental.idf(
    x: tf.SparseTensor,
    vocab_size: int,
    smooth: bool = True,
    add_baseline: bool = True,
    name: Optional[str] = None
) -> tf.SparseTensor
The inverse document frequency of a term, by default, is calculated as
1 + log ((corpus size + 1) / (count of documents containing term + 1)).
Example usage:
import tempfile
import tensorflow as tf
import tensorflow_transform as tft
import tensorflow_transform.beam as tft_beam
def preprocessing_fn(inputs):
  integerized = tft.compute_and_apply_vocabulary(inputs['x'])
  vocab_size = tft.get_num_buckets_for_transformed_feature(integerized)
  idf_weights = tft.experimental.idf(integerized, vocab_size)
  return {
     'idf': idf_weights,
     'integerized': integerized,
  }
raw_data = [dict(x=["I", "like", "pie", "pie", "pie"]),
            dict(x=["yum", "yum", "pie"])]
feature_spec = dict(x=tf.io.VarLenFeature(tf.string))
raw_data_metadata = tft.DatasetMetadata.from_feature_spec(feature_spec)
with tft_beam.Context(temp_dir=tempfile.mkdtemp()):
  transformed_dataset, transform_fn = (
      (raw_data, raw_data_metadata)
      | tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))
transformed_data, transformed_metadata = transformed_dataset
# 1 + log(3/2) = 1.4054651
transformed_data
[{'idf': array([1.4054651, 1.4054651, 1., 1., 1.], dtype=float32),
  'integerized': array([3, 2, 0, 0, 0])},
 {'idf': array([1.4054651, 1.4054651, 1.], dtype=float32),
  'integerized': array([1, 1, 0])}]
  example strings: [["I", "like", "pie", "pie", "pie"], ["yum", "yum", "pie"]]
  in: SparseTensor(indices=[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4],
                            [1, 0], [1, 1], [1, 2]],
                   values=[1, 2, 0, 0, 0, 3, 3, 0])
  out: SparseTensor(indices=[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4],
                            [1, 0], [1, 1], [1, 2]],
                   values=[1 + log(3/2), 1 + log(3/2), 1, 1, 1,
                           1 + log(3/2), 1 + log(3/2), 1])
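The arithmetic behind these values can be checked without TensorFlow. The following is a minimal pure-Python sketch of the default formula, not the library's implementation; the helper name `idf_by_hand` and the list-of-lists input are assumptions made for illustration, and the integer ids are those from the `integerized` output of the example usage.

```python
import math


def idf_by_hand(docs, vocab_size):
    """Hypothetical helper: smoothed IDF with baseline, per term position.

    Computes 1 + log((corpus size + 1) / (document frequency + 1)).
    """
    corpus_size = len(docs)
    # Document frequency: number of documents that contain each term id.
    doc_freq = [0] * vocab_size
    for doc in docs:
        for term_id in set(doc):  # count each term at most once per document
            doc_freq[term_id] += 1
    idf_per_term = [
        1.0 + math.log((corpus_size + 1) / (count + 1)) for count in doc_freq
    ]
    # Map every position back to its term's weight, preserving input order.
    return [[idf_per_term[term_id] for term_id in doc] for doc in docs]


# Integerized documents taken from the example usage output.
docs = [[3, 2, 0, 0, 0], [1, 1, 0]]
weights = idf_by_hand(docs, vocab_size=4)
print(weights)
```

Term 0 ("pie") appears in both documents, so its weight is 1 + log(3/3) = 1; every other term appears in one document and gets 1 + log(3/2).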
| Args | |
|---|---|
| x | A 2D SparseTensor representing int64 values (most likely the result of calling compute_and_apply_vocabulary on a tokenized string). |
| vocab_size | An int: the size of the vocabulary used to turn the strings into int64s, including any OOV buckets. |
| smooth | A bool indicating whether the inverse document frequency should be smoothed. If True, which is the default, the idf is calculated as 1 + log((corpus size + 1) / (document frequency of term + 1)). Otherwise, the idf is 1 + log((corpus size) / (document frequency of term)), which could result in a division by zero error. |
| add_baseline | A bool indicating whether a constant baseline of 1.0 should be added to the inverse document frequency. If True, which is the default, the idf is calculated as 1 + log(r), where r is the ratio described under smooth. Otherwise, the idf is log(r) without the constant 1 baseline. Keeping the baseline reduces the discrepancy in idf between commonly seen terms and rare terms. |
| name | (Optional) A name for this operation. |
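How the smooth and add_baseline flags combine can be sketched for a single term as follows. This is an illustrative scalar version under stated assumptions, not the library's code: `idf_value` is a hypothetical helper, and in plain Python the unsmoothed case raises ZeroDivisionError for a term with zero document frequency, whereas the TensorFlow ops may surface the problem differently.

```python
import math


def idf_value(corpus_size, doc_freq, smooth=True, add_baseline=True):
    """Hypothetical scalar IDF mirroring the smooth / add_baseline options."""
    if smooth:
        # Adding 1 to numerator and denominator avoids dividing by zero.
        ratio = (corpus_size + 1) / (doc_freq + 1)
    else:
        # Fails when doc_freq == 0 (a term that appears in no document).
        ratio = corpus_size / doc_freq
    baseline = 1.0 if add_baseline else 0.0
    return baseline + math.log(ratio)


default_idf = idf_value(2, 1)                      # 1 + log(3/2)
unsmoothed_idf = idf_value(2, 1, smooth=False)     # 1 + log(2/1)
no_baseline_idf = idf_value(2, 2, add_baseline=False)  # log(3/3) = 0.0
```

Without the baseline, a term that appears in every document gets weight 0 and is effectively dropped; with the baseline it keeps weight 1.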
| Returns |
|---|
| A SparseTensor with indices [index_in_batch, index_in_local_sequence] and values equal to the inverse document frequency. Same shape as the input x. |
| Raises | |
|---|---|
| ValueError | If x does not have 2 dimensions. |