Optimization package definition.
Modules
adafactor_optimizer module: Adafactor optimizer.
base_config module: Base configurations to standardize experiments.
configs module
ema_optimizer module: Exponential moving average optimizer.
lamb module: Layer-wise Adaptive Moments (LAMB) optimizer.
lars module: Layer-wise adaptive rate scaling optimizer.
legacy_adamw module: Adam optimizer with weight decay that exactly matches the original BERT.
lr_cfg module: Dataclasses for learning rate schedule config.
lr_schedule module: Learning rate schedule classes (see the warmup sketch after this list).
math module: This module provides access to the mathematical functions defined by the C standard.
oneof module: Config class that supports oneof functionality.
opt_cfg module: Dataclasses for optimizer configs.
optimizer_factory module: Optimizer factory class.
slide_optimizer module: SLIDE optimizer.
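The lr_schedule module is typically used to compose a decay schedule with a warmup wrapper. A minimal sketch, assuming the package is importable as official.modeling.optimization and that LinearWarmup takes the wrapped schedule, a warmup step count, and a starting learning rate; check the module's docstrings for the exact signatures.

```python
import tensorflow as tf
from official.modeling.optimization import lr_schedule

# Base decay schedule: a standard Keras cosine decay over 10,000 steps.
cosine = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=0.1, decay_steps=10_000)

# Wrap it with a linear warmup over the first 500 steps.
# Argument names are assumptions; consult lr_schedule.LinearWarmup.
warmup = lr_schedule.LinearWarmup(
    after_warmup_lr_sched=cosine,
    warmup_steps=500,
    warmup_learning_rate=0.0)

# Any Keras optimizer can consume the resulting schedule.
optimizer = tf.keras.optimizers.SGD(learning_rate=warmup, momentum=0.9)
```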
Classes
class AdafactorConfig: Configuration for Adafactor optimizer.
class AdagradConfig: Configuration for Adagrad optimizer.
class AdamConfig: Configuration for Adam optimizer.
class AdamExperimentalConfig: Configuration for experimental Adam optimizer.
class AdamWeightDecayConfig: Configuration for Adam optimizer with weight decay.
class AdamWeightDecayExperimentalConfig: Configuration for experimental Adam optimizer with weight decay.
class BaseOptimizerConfig: Base optimizer config.
class ConstantLrConfig: Configuration for constant learning rate.
class CosineDecayWithOffset: A LearningRateSchedule that uses a cosine decay with optional warmup.
class CosineLrConfig: Configuration for cosine learning rate decay.
class DirectPowerDecay: Learning rate schedule that follows lr * (step)^power.
class DirectPowerLrConfig: Configuration for DirectPower learning rate decay.
class EMAConfig: Exponential moving average optimizer config.
class ExponentialDecayWithOffset: A LearningRateSchedule that uses an exponential decay schedule.
class ExponentialLrConfig: Configuration for exponential learning rate decay.
class ExponentialMovingAverage: Optimizer that computes an exponential moving average of the variables.
class LAMBConfig: Configuration for LAMB optimizer.
class LARSConfig: Layer-wise adaptive rate scaling config.
class LinearWarmup: Linear warmup schedule.
class LinearWarmupConfig: Configuration for linear warmup schedule.
class LrConfig: Configuration for learning rate schedule.
class OptimizationConfig: Configuration for optimizer and learning rate schedule.
class OptimizerConfig: Configuration for optimizer.
class OptimizerFactory: Optimizer factory class (usage sketch after this list).
class PiecewiseConstantDecayWithOffset: A LearningRateSchedule that uses a piecewise constant decay schedule.
class PolynomialDecayWithOffset: A LearningRateSchedule that uses a polynomial decay schedule.
class PolynomialLrConfig: Configuration for polynomial learning rate decay.
class PolynomialWarmUp: Applies a polynomial warmup schedule on a given learning rate decay schedule.
class PolynomialWarmupConfig: Configuration for polynomial warmup schedule.
class PowerAndLinearDecay: Learning rate schedule of a power decay multiplied by a linear decay at the end.
class PowerAndLinearDecayLrConfig: Configuration for power-and-linear learning rate decay.
class PowerDecayWithOffset: Power learning rate decay with offset.
class PowerDecayWithOffsetLrConfig: Configuration for power learning rate decay with step offset.
class RMSPropConfig: Configuration for RMSProp optimizer.
class SGDConfig: Configuration for SGD optimizer.
class SGDExperimentalConfig: Configuration for experimental SGD optimizer.
class SLIDEConfig: Configuration for SLIDE optimizer.
class StepCosineDecayWithOffset: Stepwise cosine learning rate decay with offset.
class StepCosineLrConfig: Configuration for stepwise cosine learning rate decay.
class StepwiseLrConfig: Configuration for stepwise learning rate decay.
class WarmupConfig: Configuration for warmup schedule.
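The config and factory classes above are designed to be used together: an OptimizationConfig selects the optimizer, learning rate schedule, and warmup via oneof fields, and OptimizerFactory builds the corresponding Keras objects. A minimal sketch, assuming the package is importable as official.modeling.optimization (also exposed as tfm.optimization); the exact oneof keys and field names should be checked against opt_cfg and lr_cfg.

```python
from official.modeling import optimization

# Oneof-style config: pick an optimizer, a decay schedule, and a warmup by name.
opt_config = optimization.OptimizationConfig({
    'optimizer': {'type': 'sgd', 'sgd': {'momentum': 0.9}},
    'learning_rate': {'type': 'cosine',
                      'cosine': {'initial_learning_rate': 0.1,
                                 'decay_steps': 10_000}},
    'warmup': {'type': 'linear', 'linear': {'warmup_steps': 500}},
})

# The factory turns the config into concrete Keras schedule and optimizer objects.
factory = optimization.OptimizerFactory(opt_config)
lr = factory.build_learning_rate()
optimizer = factory.build_optimizer(lr)
```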
Functions
register_optimizer_cls(...): Registers a custom optimizer class with the optimizer factory (see the sketch below).
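register_optimizer_cls makes a custom optimizer available to OptimizerFactory under a new key. A minimal sketch, assuming the function takes a string key and a Keras optimizer class; whether a legacy or new-style Keras optimizer is expected, and how the matching config dataclass is wired into OptimizerConfig's oneof, should be confirmed in optimizer_factory.

```python
import tensorflow as tf
from official.modeling import optimization

# Hypothetical custom optimizer, used only for illustration.
class ScaledSGD(tf.keras.optimizers.legacy.SGD):
  """Stand-in for a real custom optimizer implementation."""

# Register it under a new key so OptimizerFactory can resolve
# an optimizer config of {'type': 'scaled_sgd', ...}.
optimization.register_optimizer_cls('scaled_sgd', ScaledSGD)
```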