public class AdaDelta<Model: Differentiable>: Optimizer
where
Model.TangentVector: VectorProtocol & PointwiseMultiplicative
& ElementaryFunctions & KeyPathIterable,
Model.TangentVector.VectorSpaceScalar == Float
An AdaDelta optimizer.
Implements the AdaDelta optimization algorithm. AdaDelta is a stochastic gradient descent method based on first-order information. It adapts learning rates using a moving window of gradient updates rather than accumulating all past gradients, so it continues learning even after many updates have been made and adapts more quickly to the changing dynamics of the optimization problem.
Reference: “ADADELTA: An Adaptive Learning Rate Method” (Zeiler, 2012)
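The per-parameter update keeps two decaying accumulators and scales each gradient by the ratio of their root mean squares. Below is a minimal scalar sketch of one step, following Zeiler (2012); the names AdaDeltaState and adaDeltaStep are illustrative, and this mirrors the stored averageSquared and accumulatedDelta state rather than reproducing the library implementation:
struct AdaDeltaState {
    var averageSquared: Float = 0   // decaying average of squared gradients, E[g²]
    var accumulatedDelta: Float = 0 // decaying average of squared updates, E[Δx²]
}
func adaDeltaStep(
    _ parameter: inout Float,
    gradient g: Float,
    state: inout AdaDeltaState,
    learningRate: Float = 1,
    rho: Float = 0.95,
    epsilon: Float = 1e-6
) {
    // Accumulate the squared gradient with decay factor rho.
    state.averageSquared = rho * state.averageSquared + (1 - rho) * g * g
    // Scale the gradient by the ratio of the two RMS terms.
    let delta = ((state.accumulatedDelta + epsilon).squareRoot()
        / (state.averageSquared + epsilon).squareRoot()) * g
    // Accumulate the squared update with the same decay factor.
    state.accumulatedDelta = rho * state.accumulatedDelta + (1 - rho) * delta * delta
    parameter -= learningRate * delta
}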
-
Declaration
public typealias Model = Model
-
The learning rate.
Declaration
public var learningRate: Float
-
The decay factor, corresponding to the fraction of gradient to keep at each time step.
Declaration
public var rho: Float
-
A small scalar added to the denominator to improve numerical stability.
Declaration
public var epsilon: Float
-
The learning rate decay.
Declaration
public var decay: Float
-
The current step.
Declaration
public var step: Int
-
The accumulated, exponentially decaying average of squared gradients.
Declaration
public var averageSquared: Model.TangentVector
-
The accumulated parameter updates.
Declaration
public var accumulatedDelta: Model.TangentVector
-
Creates an instance for model.
Declaration
public init(
    for model: __shared Model,
    learningRate: Float = 1,
    rho: Float = 0.95,
    epsilon: Float = 1e-6,
    decay: Float = 0
)
Parameters
learningRate
The learning rate. The default value is 1.
rho
The decay factor. The default value is 0.95.
epsilon
A small scalar added to the denominator to improve numerical stability. The default value is 1e-6.
decay
The learning rate decay. The default value is 0.
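A hypothetical usage sketch, assuming the Swift for TensorFlow APIs (the TensorFlow module, Layer, gradient(at:), and Optimizer's update(_:along:)); the Linear model and the training data are illustrative:
import TensorFlow

struct Linear: Layer {
    var w = Tensor<Float>(randomNormal: [2, 1])

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        matmul(input, w)
    }
}

var model = Linear()
let optimizer = AdaDelta(for: model, rho: 0.95, epsilon: 1e-6)

let x = Tensor<Float>([[1, 2]])
let y = Tensor<Float>([[3]])
// Differentiate the loss with respect to the model and take one AdaDelta step.
let grads = gradient(at: model) { model -> Tensor<Float> in
    meanSquaredError(predicted: model(x), expected: y)
}
optimizer.update(&model, along: grads)
-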
Declaration
public required init(copying other: AdaDelta, to device: Device)
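This initializer replicates the optimizer, including its accumulated state, onto the given device. A brief hypothetical sketch, assuming the optimizer from the example above and the Swift for TensorFlow Device.defaultXLA constant as an example target:
// Copy the optimizer and its accumulators to the default XLA device.
let xlaOptimizer = AdaDelta(copying: optimizer, to: Device.defaultXLA)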