tf.contrib.layers.sparse_column_with_integerized_feature
Stay organized with collections
Save and categorize content based on your preferences.
Creates an integerized _SparseColumn.
tf.contrib.layers.sparse_column_with_integerized_feature(
column_name, bucket_size, combiner='sum', dtype=tf.dtypes.int64
)
Use this when your features are already pre-integerized into int64 IDs, that
is, when the set of values to output is already coming in as what's desired in
the output. Integerized means we can use the feature value itself as id.
Typically this is used for reading contiguous ranges of integers indexes, but
it doesn't have to be. The output value is simply copied from the
input_feature, whatever it is. Just be aware, however, that if you have large
gaps of unused integers it might affect what you feed those in (for instance,
if you make up a one-hot tensor from these, the unused integers will appear as
values in the tensor which are always zero.)
Args |
column_name
|
A string defining sparse column name.
|
bucket_size
|
An int that is >= 1. The number of buckets. It should be bigger
than maximum feature. In other words features in this column should be an
int64 in range [0, bucket_size)
|
combiner
|
A string specifying how to reduce if the sparse column is
multivalent. Currently "mean", "sqrtn" and "sum" are supported, with "sum"
the default. "sqrtn" often achieves good accuracy, in particular with
bag-of-words columns.
- "sum": do not normalize features in the column
- "mean": do l1 normalization on features in the column
- "sqrtn": do l2 normalization on features in the column
For more information:
tf.embedding_lookup_sparse .
|
dtype
|
Type of features. It should be an integer type. Default value is
dtypes.int64.
|
Returns |
An integerized _SparseColumn definition.
|
Raises |
ValueError
|
bucket_size is less than 1.
|
ValueError
|
dtype is not integer.
|
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2020-10-01 UTC.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2020-10-01 UTC."],[],[]]