tf.contrib.opt.extend_with_decoupled_weight_decay
Factory function returning an optimizer class with decoupled weight decay.
tf.contrib.opt.extend_with_decoupled_weight_decay(
    base_optimizer
)
Returns an optimizer class. An instance of the returned class computes the update step of base_optimizer and additionally decays the weights. E.g., the class returned by extend_with_decoupled_weight_decay(tf.compat.v1.train.AdamOptimizer) is equivalent to tf.contrib.opt.AdamWOptimizer.
The API of the new optimizer class slightly differs from the API of the base optimizer:
- The first argument to the constructor is the weight decay rate.
- minimize and apply_gradients accept the optional keyword argument decay_var_list, which specifies the variables that should be decayed. If None, all variables that are optimized are decayed.
Usage example:
# MyAdamW is a new class
MyAdamW = extend_with_decoupled_weight_decay(tf.compat.v1.train.AdamOptimizer)
# Create a MyAdamW object
optimizer = MyAdamW(weight_decay=0.001, learning_rate=0.001)
sess.run(optimizer.minimize(loss, decay_var_list=[var1, var2]))
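The decay_var_list keyword is also accepted by apply_gradients; the following sketch (reusing the loss, var1, var2, and sess placeholders from the example above) shows the two-step compute/apply form:

# Sketch only: loss, var1, var2, and sess are assumed to exist as above.
grads_and_vars = optimizer.compute_gradients(loss)
train_op = optimizer.apply_gradients(grads_and_vars, decay_var_list=[var1, var2])
sess.run(train_op)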
Note that this extension decays weights BEFORE applying the update based on the gradient, i.e. this extension only has the desired behaviour for optimizers which do not depend on the value of 'var' in the update step!
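Roughly, the per-variable update performed by the extended optimizer can be sketched as follows (pseudocode, not the exact implementation; base_update is a placeholder for the base optimizer's gradient-based step):

# Pseudocode sketch of the update order; base_update is a placeholder
# for the base optimizer's gradient-based step.
var -= weight_decay * var      # decoupled weight decay, applied first
var -= base_update(grad, var)  # base optimizer update, e.g. Adam's step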
Args:
  base_optimizer: An optimizer class that inherits from tf.train.Optimizer.
Returns:
  A new optimizer class that inherits from DecoupledWeightDecayExtension and base_optimizer.
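The factory works the same way for other base optimizers; for instance, applying it to tf.compat.v1.train.MomentumOptimizer should yield a class behaving like tf.contrib.opt.MomentumWOptimizer (sketch only):

# Sketch: extend MomentumOptimizer with decoupled weight decay.
MyMomentumW = tf.contrib.opt.extend_with_decoupled_weight_decay(
    tf.compat.v1.train.MomentumOptimizer)
optimizer = MyMomentumW(weight_decay=0.001, learning_rate=0.001, momentum=0.9)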