public class GeneralOptimizer<Model: EuclideanDifferentiable>: Optimizer
where
Model.TangentVector: VectorProtocol & ElementaryFunctions & KeyPathIterable,
Model.TangentVector.VectorSpaceScalar == Float
General optimizer that can express multiple possible optimizations. The optimizer is composed of a mapping from ParameterGroup to ParameterGroupOptimizer. The optimizer also tracks the number of elements participating in a cross-replica sum; this lets the gradient be traversed once rather than in multiple inefficient passes.
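As a hedged sketch of a typical training step with such an optimizer (assuming `model`, `optimizer`, and a labeled batch `(x, y)` already exist; `valueWithGradient`, `softmaxCrossEntropy`, and `update(_:along:)` are the standard Swift for TensorFlow training-loop APIs):

```swift
import TensorFlow

// Assumes `model`, `optimizer`, and a labeled batch `(x, y)` are already constructed.
let (loss, grad) = valueWithGradient(at: model) { model -> Tensor<Float> in
  softmaxCrossEntropy(logits: model(x), labels: y)
}
// A single pass over the gradient applies each weight's parameter-group optimizer.
optimizer.update(&model, along: grad)
```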
-
Declaration
public typealias Model = Model
-
The number of steps taken.
Declaration
public var step: Int
-
Used to determine the scaling factor of the cross-replica sum.
Declaration
public var crossReplicaSumCount: Int?
-
Global optimizer state.
Declaration
public var optimizerState: OptimizerState
-
Current device of the model (used for constructing hyperparameters).
Declaration
public var device: Device
-
An array mapping nested weight indices to parameter group optimizers: weight i is optimized by
parameterGroups[parameterGroupIndices[i]]
Declaration
public var parameterGroupIndices: [Int]
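For illustration (a hedged sketch with made-up values; in practice these arrays come from the initializers below), suppose the model flattens to four weights and two group optimizers are configured:

```swift
// Hypothetical setup with two parameter group optimizers:
// parameterGroups       == [sgdGroup, adamGroup]
// parameterGroupIndices == [0, 0, 1, 1]
//
// Weight 0 -> parameterGroups[0]  (sgdGroup)
// Weight 1 -> parameterGroups[0]  (sgdGroup)
// Weight 2 -> parameterGroups[1]  (adamGroup)
// Weight 3 -> parameterGroups[1]  (adamGroup)
```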
-
An array of parameter group optimizers.
Declaration
public var parameterGroups: [ParameterGroupOptimizer]
-
Overall learning rate of the optimizer.
Declaration
public var learningRate: Float { get set }
-
Per-parameter group optimizer learning rates.
Declaration
public var learningRates: [Float] { get set }
-
Constructs an optimizer from a list of parameter group optimizers and a selector that divides the weights into parameter groups. This is the most general constructor, since there are many ways to build the selector vector.
Declaration
public init(
  for model: __shared Model,
  _ kpPlan: TensorVisitorPlan<Model.TangentVector>,
  parameterGroupIndices: [Int],
  parameterGroups: [ParameterGroupOptimizer]
)
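A hedged construction sketch under stated assumptions: `MyModel` is a placeholder model type, `makeSGD`/`makeAdam` stand in for factory functions producing `ParameterGroupOptimizer` values, and `allTensorKeyPaths` is assumed to expose the flattened weight count on `TensorVisitorPlan` (names are illustrative, not confirmed by this reference):

```swift
import TensorFlow

var model = MyModel()  // hypothetical model conforming to the required protocols
let plan = TensorVisitorPlan(model.differentiableVectorView)

// Assumed factory functions returning ParameterGroupOptimizer values.
let groups = [makeSGD(learningRate: 0.1), makeAdam(learningRate: 1e-3)]

// One group index per flattened weight: the first two weights use SGD, the rest Adam.
let weightCount = plan.allTensorKeyPaths.count  // assumed TensorVisitorPlan property
let indices = [0, 0] + Array(repeating: 1, count: weightCount - 2)

let optimizer = GeneralOptimizer(
  for: model, plan,
  parameterGroupIndices: indices,
  parameterGroups: groups)
```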
-
Constructs an optimizer from a sequence of per-parameter-group optimizers followed by a final default parameter group optimizer. Each
[Bool]
array has one entry per weight and is true for the weights belonging to that parameter group. When a weight matches more than one group, the first matching parameterGroup takes precedence over subsequent ones.
Declaration
public convenience init(
  for model: __shared Model,
  _ kpPlan: TensorVisitorPlan<Model.TangentVector>,
  parameterGroups: ([Bool], ParameterGroupOptimizer)...,
  defaultOptimizer: ParameterGroupOptimizer
)
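A hedged sketch of the convenience initializer, assuming a hypothetical `biasMask` that is true exactly for the bias weights (how such a mask is derived from the `TensorVisitorPlan` is not specified here) and the same illustrative `makeSGD`/`makeWeightDecayedAdam` factories:

```swift
import TensorFlow

var model = MyModel()  // hypothetical model
let plan = TensorVisitorPlan(model.differentiableVectorView)

// Hypothetical per-weight mask: one Bool per flattened weight, true for biases.
let biasMask = [false, true, false, true]

let optimizer = GeneralOptimizer(
  for: model, plan,
  parameterGroups: (biasMask, makeSGD(learningRate: 0.1)),
  defaultOptimizer: makeWeightDecayedAdam(learningRate: 0.01))
// Bias weights are optimized with plain SGD; all remaining weights fall back
// to the default weight-decayed Adam group.
```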
-
Copies the optimizer to the specified device.
Declaration
public required init(copying other: GeneralOptimizer, to device: Device)