tf.compat.v1.tpu.outside_compilation
Builds part of a computation outside any current TPU replicate scope.
    tf.compat.v1.tpu.outside_compilation(
        computation: Callable[..., Any], *args, **kwargs
    ) -> Any
`tf.tpu.outside_compilation()` is used to run the ops in `computation` on the CPU instead of on the TPU. For example, users can run ops that are not supported on TPUs (e.g. `tf.summary.write()`) by explicitly placing them on the CPU. The usage of outside compilation below places the ops in `computation_with_string_ops` on the CPU.
Example usage:
    def computation_with_string_ops(x):
        # String types are not supported on TPUs, so the ops below must
        # run on the CPU instead.
        output = tf.strings.format('1{}', x)
        return tf.strings.to_number(output)

    def tpu_computation():
        # Expected output is 11.
        output = tf.tpu.outside_compilation(computation_with_string_ops, 1)
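To see why the expected output is 11, the example's string round-trip can be mirrored in plain Python (a dependency-free sketch with a hypothetical helper name, not part of the TensorFlow API):

```python
def computation_with_string_ops_plain(x):
    """Pure-Python model of the example above: format, then parse back.

    Mirrors tf.strings.format('1{}', x) followed by
    tf.strings.to_number(...), without requiring TensorFlow.
    """
    output = '1{}'.format(x)  # '1' prepended to the input, e.g. 1 -> '11'
    return float(output)      # tf.strings.to_number yields a numeric value
```

As in the docstring example, an input of 1 produces the string `'11'` and hence the number 11.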
Outside compilation should be called inside a TPUReplicateContext. That is, `tf.tpu.outside_compilation()` should be called inside a function that is passed to `tpu.split_compile_and_replicate()` -- this is implied when outside compilation is invoked inside a function passed to TPUStrategy `run()`. If invoked outside of a TPUReplicateContext, it simply returns the result of `computation` and is therefore a no-op. Note that outside compilation is different from `tf.distribute.experimental.TPUStrategy.merge_call()`: logic in outside compilation is replicated and executed separately for each replica, whereas `merge_call()` requires a `merge_fn` to aggregate the inputs from different replicas and is executed only once.
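The no-op fallback described above can be sketched in plain Python (a minimal model under stated assumptions, not the actual TensorFlow implementation; the `in_replicate_context` flag is hypothetical and stands in for the real context check):

```python
def outside_compilation_sketch(computation, *args,
                               in_replicate_context=False, **kwargs):
    """Minimal model of the documented fallback behavior.

    Inside a TPUReplicateContext the real implementation annotates ops
    for extraction to a host-side graph; outside one, the call reduces
    to invoking `computation` directly.
    """
    if not in_replicate_context:
        # Documented behavior: simply return the result of `computation`.
        return computation(*args, **kwargs)
    # Real implementation: tag ops with outside-compilation attributes so
    # a later pass moves them to the host graph (not modeled here).
    raise NotImplementedError("TPU path not modeled in this sketch")

# Outside a replicate context, the wrapper is transparent:
result = outside_compilation_sketch(lambda x: x + 1, 41)  # same as (41 + 1)
```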
For variables placed on a TPU device, which includes variables created inside a TPUStrategy scope, outside compilation logic must not include variable reads or writes. For variables placed on the host, variable reads and writes are only allowed if the variable is not accessed by any other ops in the TPU computation. Variable reads and writes from the outside compilation cluster are not visible to the TPU computation, and vice versa. Therefore, if outside compilation logic contains such host variable read/write ops and the variables are also accessed by the TPU computation, this may lead to deadlock.
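The visibility hazard above can be illustrated with a toy model (pure Python with hypothetical names; the real mechanism involves separate TPU and host graphs, not object copies): each side effectively sees its own view of a host variable, so a write from the outside compilation cluster does not reach the TPU computation's view.

```python
class ToyVariable:
    """Toy host variable. Each 'graph' that captures it gets an isolated
    snapshot, modeling that writes from the outside-compilation cluster
    are not visible to the TPU computation and vice versa."""

    def __init__(self, value):
        self.value = value

    def snapshot(self):
        return ToyVariable(self.value)

host_var = ToyVariable(0)
tpu_view = host_var.snapshot()      # the TPU computation's view
outside_view = host_var.snapshot()  # the outside-compilation cluster's view

outside_view.value = 5  # a write from outside compilation...
# ...is not visible to the TPU computation's view, so code on the TPU
# side that waits for this update would block forever: a deadlock.
```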
Internally, `tf.tpu.outside_compilation()` adds outside compilation attributes to all ops in `computation`. During a later pass, ops with outside compilation attributes are moved to a host-side graph. Inputs to this extracted host-side graph are sent from the TPU computation graph to the host graph via a pair of XlaSendToHost and XlaRecvFromHost ops. Note that using `tf.tpu.outside_compilation()` may result in tensor transfers between the TPU and the CPU, leading to a non-trivial performance impact.
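The XlaSendToHost/XlaRecvFromHost pair described above can be thought of as a channel between the device and host graphs. The following is a simplified sketch using queues and a thread (an analogy only; the real ops are XLA primitives with very different performance characteristics):

```python
import threading
from queue import Queue

# One channel per direction: device -> host carries inputs to the
# extracted host-side graph, host -> device carries its outputs back.
to_host = Queue()
to_device = Queue()

def device_side(x):
    to_host.put(x)          # models XlaSendToHost
    return to_device.get()  # models XlaRecvFromHost (blocks until ready)

def host_side(computation):
    x = to_host.get()               # receive input from the device graph
    to_device.put(computation(x))   # send the host result back

# The "host graph" runs concurrently with the "device graph".
t = threading.Thread(target=host_side, args=(lambda v: v * 2,))
t.start()
result = device_side(21)
t.join()
```

Each such round trip is a synchronization point between the TPU and the host, which is why frequent outside compilation calls can dominate step time.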
| Args ||
|---|---|
| `computation` | A Python function that builds the computation to place on the host. |
| `*args` | The positional arguments for the computation. |
| `**kwargs` | The keyword arguments for the computation. |

| Returns |
|---|
| The Tensors returned by `computation`. |
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2024-04-26 UTC.