View source on GitHub |
Train action to recover from loss blowup.
tfm.core.actions.RecoveryAction(
checkpoint_manager: tf.train.CheckpointManager
)
Checks the loss value by the given threshold. If applicable, recover the model by reading the checkpoint on disk.
Methods
__call__
__call__(
_
)
Recovers the training by triggering checkpoint restoration.