tf.compat.v1.train.MonitoredSession
Session-like object that handles initialization, recovery and hooks.
tf.compat.v1.train.MonitoredSession(
    session_creator=None, hooks=None, stop_grace_period_secs=120
)
Example usage:
saver_hook = CheckpointSaverHook(...)
summary_hook = SummarySaverHook(...)
with MonitoredSession(session_creator=ChiefSessionCreator(...),
                      hooks=[saver_hook, summary_hook]) as sess:
  while not sess.should_stop():
    sess.run(train_op)
Initialization: At creation time, the monitored session does the following, in order:
- calls `hook.begin()` for each given hook
- finalizes the graph via `scaffold.finalize()`
- creates the session
- initializes the model via initialization ops provided by `Scaffold`
- restores variables if a checkpoint exists
- launches queue runners
- calls `hook.after_create_session()`
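The creation-time sequence above can be observed with a small recording hook. This is a minimal sketch, not part of the original reference: `LifecycleHook` and the variable `w` are hypothetical names, and the example assumes a local session with no checkpoint directory, so nothing is restored.

```python
import tensorflow as tf


class LifecycleHook(tf.compat.v1.train.SessionRunHook):
  """Hypothetical hook that records which lifecycle callbacks were invoked."""

  def __init__(self):
    self.events = []

  def begin(self):
    # Runs before the graph is finalized, so ops could still be added here.
    self.events.append("begin")

  def after_create_session(self, session, coord):
    # Runs once the session exists and variables are initialized or restored.
    self.events.append("after_create_session")

  def end(self, session):
    # Runs when the monitored session closes without an error.
    self.events.append("end")


hook = LifecycleHook()
with tf.Graph().as_default() as graph:
  # A single variable, so the Scaffold's default init op has work to do.
  w = tf.compat.v1.get_variable("w", initializer=3.0)
  with tf.compat.v1.train.MonitoredSession(hooks=[hook]) as sess:
    graph_is_finalized = graph.finalized  # finalized during creation
    value = sess.run(w)  # readable because initialization already ran

print(hook.events)
```

Because the graph is finalized during creation, any op the training loop needs has to be built before the `MonitoredSession` is constructed (or inside `hook.begin()`).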
Run: When `run()` is called, the monitored session does the following:
- calls `hook.before_run()`
- calls TensorFlow `session.run()` with merged fetches and feed_dict
- calls `hook.after_run()`
- returns the result of `session.run()` requested by the user
- if an `AbortedError` or `UnavailableError` occurs, recovers or reinitializes the session before executing the `run()` call again
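To illustrate how hook fetches are merged with the caller's fetches, here is a hedged sketch. `ExtraFetchHook`, `doubled`, and `squared` are hypothetical names chosen for this example; the hook asks for one extra tensor through `SessionRunArgs` and reads it back in `after_run()`.

```python
import tensorflow as tf


class ExtraFetchHook(tf.compat.v1.train.SessionRunHook):
  """Hypothetical hook that fetches one extra tensor on every run() call."""

  def __init__(self, extra_tensor):
    self._extra_tensor = extra_tensor
    self.seen = []

  def before_run(self, run_context):
    # These fetches are merged with the fetches passed by the caller.
    return tf.compat.v1.train.SessionRunArgs(fetches=self._extra_tensor)

  def after_run(self, run_context, run_values):
    # run_values.results contains only what this hook requested.
    self.seen.append(float(run_values.results))


with tf.Graph().as_default():
  x = tf.compat.v1.placeholder(tf.float32, shape=[])
  doubled = x * 2.0
  squared = x * x
  hook = ExtraFetchHook(squared)
  with tf.compat.v1.train.MonitoredSession(hooks=[hook]) as sess:
    # The caller only asks for `doubled`; the hook's fetch of `squared` is
    # evaluated in the same underlying session.run() with the same feed_dict.
    user_result = sess.run(doubled, feed_dict={x: 3.0})

print(user_result, hook.seen)
```

The caller receives only the value it asked for, while the hook sees only its own fetch.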
Exit: On `close()`, the monitored session does the following, in order:
- calls `hook.end()`
- closes the queue runners and the session
- suppresses the `OutOfRange` error, which indicates that all inputs have been processed, if the monitored session is used as a context manager
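The exit behaviour can be sketched with a finite input pipeline. The example below is an illustration under assumptions, not taken from the original page: it uses a three-element `tf.data` dataset with a one-shot iterator as the input source, and `EndHook` is a hypothetical hook that only records whether `end()` was called.

```python
import tensorflow as tf


class EndHook(tf.compat.v1.train.SessionRunHook):
  """Hypothetical hook that records whether end() was called."""

  def __init__(self):
    self.ended = False

  def end(self, session):
    self.ended = True


values = []
end_hook = EndHook()
with tf.Graph().as_default():
  dataset = tf.data.Dataset.range(3)
  next_value = tf.compat.v1.data.make_one_shot_iterator(dataset).get_next()
  with tf.compat.v1.train.MonitoredSession(hooks=[end_hook]) as sess:
    while not sess.should_stop():
      # Once the dataset is exhausted, run() raises OutOfRangeError. The error
      # leaves the loop and is suppressed when the `with` block exits.
      values.append(int(sess.run(next_value)))

print(values, end_hook.ended)
```

Without the `with` block, the caller would have to catch the out-of-range error and call `close()` explicitly.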
How to set `tf.compat.v1.Session` arguments:
- In most cases you can set session arguments as follows:
    MonitoredSession(
        session_creator=ChiefSessionCreator(master=..., config=...))
- In a distributed setting, for a non-chief worker, you can use the following:
    MonitoredSession(
        session_creator=WorkerSessionCreator(master=..., config=...))
See `MonitoredTrainingSession` for an example usage based on chief or worker.
Note: This is not a `tf.compat.v1.Session`. For example:
- it cannot be set as the default session.
- it cannot be passed to `saver.save`.
- it cannot be passed to `tf.train.start_queue_runners`.
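A minimal sketch of passing session arguments through a `ChiefSessionCreator`, assuming a local (non-distributed) run so that `master` keeps its default value. The `allow_soft_placement` setting is only an illustrative choice of config option. The example also checks that the monitored session is not an instance of `tf.compat.v1.Session`.

```python
import tensorflow as tf

# Illustrative session configuration; any ConfigProto fields could be set here.
config = tf.compat.v1.ConfigProto(allow_soft_placement=True)

with tf.Graph().as_default():
  total = tf.add(2, 3)
  creator = tf.compat.v1.train.ChiefSessionCreator(config=config)
  with tf.compat.v1.train.MonitoredSession(session_creator=creator) as sess:
    # The wrapper exposes run(), but it is not a Session subclass.
    is_plain_session = isinstance(sess, tf.compat.v1.Session)
    result = sess.run(total)

print(result, is_plain_session)
```

Code that needs the underlying session, for example to pass it to a saver, can reach it through `step_context.session` inside a `step_fn` given to `run_step_fn`.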
| Args | Description |
|---|---|
| `session_creator` | A factory object to create the session. Typically a `ChiefSessionCreator`, which is the default. |
| `hooks` | An iterable of `SessionRunHook` objects. |

| Returns |
|---|
| A `MonitoredSession` object. |

| Attributes | Description |
|---|---|
| `graph` | The graph that was launched in this session. |
Child Classes
class StepContext
Methods
close
close()
run
run(
    fetches, feed_dict=None, options=None, run_metadata=None
)
Run ops in the monitored session.
This method is completely compatible with the `tf.Session.run()` method.
| Args | Description |
|---|---|
| `fetches` | Same as `tf.Session.run()`. |
| `feed_dict` | Same as `tf.Session.run()`. |
| `options` | Same as `tf.Session.run()`. |
| `run_metadata` | Same as `tf.Session.run()`. |

| Returns |
|---|
| Same as `tf.Session.run()`. |
run_step_fn
run_step_fn(
    step_fn
)
Run ops using a step function.
| Args | Description |
|---|---|
| `step_fn` | A function or a method with a single argument of type `StepContext`. The function may use methods of the argument to perform computations with access to a raw session. The value returned by `step_fn` is returned from `run_step_fn`, unless a stop is requested. In that case, the next `should_stop` call will return `True`. Hooks interact with the `run_with_hooks()` call inside the `step_fn` as they do with a `MonitoredSession.run` call. |

Example usage:
```python
import tensorflow as tf

with tf.Graph().as_default():
  c = tf.compat.v1.placeholder(tf.float32)
  v = tf.add(c, 4.0)
  w = tf.add(c, 0.5)

  def step_fn(step_context):
    # Raw session access: this call bypasses the hooks.
    a = step_context.session.run(fetches=v, feed_dict={c: 0.5})
    if a <= 4.5:
      # Ends the step and makes the next should_stop() call return True.
      step_context.request_stop()
    # This call goes through the hooks, like MonitoredSession.run.
    return step_context.run_with_hooks(fetches=w, feed_dict={c: 0.1})

  with tf.compat.v1.train.MonitoredSession() as session:
    while not session.should_stop():
      a = session.run_step_fn(step_fn)
```
| Returns |
|---|
| The value returned by `step_fn`. |
| Raises | Description |
|---|---|
| `StopIteration` | If `step_fn` has called `request_stop()`. It may be caught by `with tf.MonitoredSession()` to close the session. |
| `ValueError` | If `step_fn` doesn't have a single argument called `step_context`. It may also optionally have `self` for cases when it belongs to an object. |
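The `ValueError` case can be illustrated with a step function whose argument has the wrong name. This is a sketch under assumptions: `good_step_fn`, `bad_step_fn`, and the constant `value` are hypothetical names, and the exact error message is not shown because it may differ between versions.

```python
import tensorflow as tf

with tf.Graph().as_default():
  value = tf.constant(1.0)

  def good_step_fn(step_context):
    # The single argument is named `step_context`, as required.
    return step_context.run_with_hooks(fetches=value)

  def bad_step_fn(ctx):
    # The argument is not named `step_context`, so run_step_fn rejects it.
    return ctx.run_with_hooks(fetches=value)

  with tf.compat.v1.train.MonitoredSession() as sess:
    good_result = sess.run_step_fn(good_step_fn)
    try:
      sess.run_step_fn(bad_step_fn)
      error_message = None
    except ValueError as e:
      error_message = str(e)

print(good_result, error_message is not None)
```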
should_stop
should_stop()
__enter__
__enter__()
__exit__
__exit__(
    exception_type, exception_value, traceback
)
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2021-08-16 UTC.