tff.learning.optimizers.build_adamw
Returns a tff.learning.optimizers.Optimizer for AdamW.
tff.learning.optimizers.build_adamw(
    learning_rate: optimizer.Float,
    beta_1: optimizer.Float = 0.9,
    beta_2: optimizer.Float = 0.999,
    epsilon: optimizer.Float = 1e-07,
    weight_decay: optimizer.Float = 0.004
) -> tff.learning.optimizers.Optimizer
The AdamW optimizer is based on Decoupled Weight Decay Regularization.
The update rule given learning rate lr, epsilon eps, accumulator acc, preconditioner s, weight decay lambda, iteration t, weights w and gradients g is:
acc = beta_1 * acc + (1 - beta_1) * g
s = beta_2 * s + (1 - beta_2) * g**2
normalization = sqrt(1 - beta_2**t) / (1 - beta_1**t)
w = w - lr * (normalization * acc / (sqrt(s) + eps) + lambda * w)
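The update rule above can be sketched in plain NumPy as follows. This is an illustrative sketch, not the TFF implementation: the function name adamw_step is hypothetical, and it assumes that the iteration t starts at 1 (with t = 0 the normalization would divide by zero) and that acc and s start at zero.

```python
import numpy as np


def adamw_step(w, g, acc, s, t, lr,
               beta_1=0.9, beta_2=0.999, eps=1e-7, weight_decay=0.004):
    """Applies one AdamW step (hypothetical helper; t is assumed 1-based)."""
    # Exponential moving average of the gradients.
    acc = beta_1 * acc + (1 - beta_1) * g
    # Exponential moving average of the squared gradients.
    s = beta_2 * s + (1 - beta_2) * g**2
    # Bias correction for both moving averages, folded into one factor.
    normalization = np.sqrt(1 - beta_2**t) / (1 - beta_1**t)
    # Adam step plus the decoupled weight decay term (lambda * w).
    w = w - lr * (normalization * acc / (np.sqrt(s) + eps) + weight_decay * w)
    return w, acc, s


# One step from zero-initialized state (assumed initialization).
w = np.array([1.0, -2.0])
g = np.array([0.5, -0.5])
acc = np.zeros_like(w)
s = np.zeros_like(w)
w, acc, s = adamw_step(w, g, acc, s, t=1, lr=0.1)
```

On the first step the normalized gradient term is close to sign(g), so each weight moves by roughly lr in the direction opposite its gradient, plus a small shrinkage of lr * lambda * w.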
Args
learning_rate | A positive float for learning rate.
beta_1 | A float between 0.0 and 1.0 for the decay used to track the previous gradients.
beta_2 | A float between 0.0 and 1.0 for the decay used to track the magnitude (second moment) of previous gradients.
epsilon | A small non-negative float, used to maintain numerical stability.
weight_decay | A non-negative float, governing the amount of weight decay. When set to 0, this recovers Adam.
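To make the role of weight_decay concrete, the NumPy sketch below re-implements the documented update rule and compares two settings. It is not TFF code: adamw_step is a hypothetical helper, and t is assumed to start at 1 with zero-initialized state. With weight_decay set to 0 the decay term vanishes and the step is a plain Adam step. With a zero gradient, each step only shrinks the weights by a factor of (1 - lr * weight_decay).

```python
import numpy as np


def adamw_step(w, g, acc, s, t, lr, weight_decay,
               beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One step of the documented update rule (hypothetical helper)."""
    acc = beta_1 * acc + (1 - beta_1) * g
    s = beta_2 * s + (1 - beta_2) * g**2
    normalization = np.sqrt(1 - beta_2**t) / (1 - beta_1**t)
    w = w - lr * (normalization * acc / (np.sqrt(s) + eps) + weight_decay * w)
    return w, acc, s


lr = 0.1
w0 = np.array([2.0])
g = np.array([0.5])
zeros = np.zeros_like(w0)

# Same starting point, with and without weight decay.
w_adam, _, _ = adamw_step(w0, g, zeros, zeros, t=1, lr=lr, weight_decay=0.0)
w_adamw, _, _ = adamw_step(w0, g, zeros, zeros, t=1, lr=lr, weight_decay=0.004)
# The two results differ only by the decoupled term lr * weight_decay * w0.
decay_gap = w_adam - w_adamw

# With zero gradients, only the decay term changes the weights.
w_decayed = np.array([1.0])
acc = np.zeros_like(w_decayed)
s = np.zeros_like(w_decayed)
for t in range(1, 4):
    w_decayed, acc, s = adamw_step(
        w_decayed, np.zeros_like(w_decayed), acc, s, t, lr=lr, weight_decay=0.004)
```

Because the decay is applied directly to w rather than added to the gradient, it is not rescaled by the preconditioner s, which is the sense in which the weight decay is decoupled.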
Last updated 2024-09-20 UTC.