tf.contrib.layers.sparse_column_with_vocabulary_file
Creates a _SparseColumn with vocabulary file configuration.
tf.contrib.layers.sparse_column_with_vocabulary_file(
    column_name, vocabulary_file, num_oov_buckets=0, vocab_size=None,
    default_value=-1, combiner='sum', dtype=tf.dtypes.string
)
Use this when your sparse features are in string or integer format, and you
have a vocab file that maps each value to an integer ID.
output_id = LookupIdFromVocab(input_feature_string)
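The lookup above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation. It assumes the vocabulary file holds one entry per line and that an entry's ID is its 0-based line number. The helper names `load_vocab` and `lookup_id` are hypothetical, and `zlib.crc32` is a stand-in for whatever hash the library uses to assign out-of-vocabulary buckets.

```python
import zlib


def load_vocab(lines, vocab_size=None):
    """Map each vocabulary entry to its 0-based line number (assumed ID scheme)."""
    entries = [line.rstrip("\n") for line in lines]
    if vocab_size is not None:
        # Keep only the first vocab_size entries.
        entries = entries[:vocab_size]
    return {token: idx for idx, token in enumerate(entries)}


def lookup_id(token, vocab, num_oov_buckets=0, default_value=-1):
    """Return the ID for token, handling out-of-vocabulary (OOV) values."""
    if token in vocab:
        return vocab[token]
    if num_oov_buckets > 0:
        # Stand-in hash: OOV values land in IDs placed after the vocabulary.
        bucket = zlib.crc32(token.encode("utf-8")) % num_oov_buckets
        return len(vocab) + bucket
    # No OOV buckets: fall back to the default value.
    return default_value


# In practice the lines would come from open(vocabulary_file).
vocab = load_vocab(["apple\n", "banana\n", "cherry\n"])
ids_no_oov = [lookup_id(t, vocab) for t in ["banana", "durian"]]
ids_with_oov = [lookup_id(t, vocab, num_oov_buckets=2) for t in ["banana", "durian"]]
```

With no OOV buckets, the unknown value "durian" gets default_value (-1). With two buckets, it gets an ID of 3 or 4, just past the three in-vocabulary IDs.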
Args:
  column_name: A string defining the sparse column name.
  vocabulary_file: The vocabulary filename.
  num_oov_buckets: The number of out-of-vocabulary buckets. If zero, all
    out-of-vocabulary features are ignored.
  vocab_size: The number of elements in the vocabulary.
  default_value: The value to use for out-of-vocabulary feature values.
    Defaults to -1.
  combiner: A string specifying how to reduce if the sparse column is
    multivalent. Currently "mean", "sqrtn" and "sum" are supported, with
    "sum" the default. "sqrtn" often achieves good accuracy, in particular
    with bag-of-words columns.
    - "sum": do not normalize features in the column
    - "mean": do L1 normalization on features in the column
    - "sqrtn": do L2 normalization on features in the column
    For more information, see tf.embedding_lookup_sparse.
  dtype: The type of features. Only string and integer types are supported.
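The three combiner options can be illustrated with a small pure-Python sketch. It is not the library's code. It assumes the reduction described for tf.embedding_lookup_sparse: a weighted sum of the looked-up rows, divided by the sum of the weights for "mean" or by the square root of the sum of squared weights for "sqrtn". The function name `combine` and the sample values are hypothetical.

```python
import math


def combine(embeddings, weights, combiner="sum"):
    """Reduce several embedding rows for one example into a single vector."""
    dim = len(embeddings[0])
    # Weighted sum of the rows, computed per dimension.
    weighted_sum = [
        sum(w * vec[d] for vec, w in zip(embeddings, weights))
        for d in range(dim)
    ]
    if combiner == "sum":
        norm = 1.0  # no normalization
    elif combiner == "mean":
        norm = sum(weights)  # L1 normalization
    elif combiner == "sqrtn":
        norm = math.sqrt(sum(w * w for w in weights))  # L2 normalization
    else:
        raise ValueError("combiner must be 'sum', 'mean' or 'sqrtn'")
    return [x / norm for x in weighted_sum]


rows = [[1.0, 2.0], [3.0, 4.0]]   # two values in one multivalent feature
unit_weights = [1.0, 1.0]
summed = combine(rows, unit_weights, "sum")      # [4.0, 6.0]
averaged = combine(rows, unit_weights, "mean")   # [2.0, 3.0]
scaled = combine(rows, unit_weights, "sqrtn")    # summed / sqrt(2)
```

With unit weights, "sqrtn" divides the sum by the square root of the number of values, so the result grows more slowly than "sum" as a feature gains values, without shrinking as much as "mean".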
Returns:
  A _SparseColumn with vocabulary file configuration.

Raises:
  ValueError: vocab_size is not defined.
  ValueError: dtype is neither string nor integer.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2020-10-01 UTC.