tf.distribute.HierarchicalCopyAllReduce
Reduction using hierarchical copy all-reduce.
tf.distribute.HierarchicalCopyAllReduce(
num_packs=1
)
It reduces to one GPU along edges in some hierarchy and broadcasts back to
each GPU along the same path. Before performing all-reduce, tensors will be
repacked or aggregated for more efficient cross-device transportation.
This reduction was created for the Nvidia DGX-1 and assumes the GPUs are
connected as they are on a DGX-1 machine. If you have a different GPU
interconnect, it is likely to be slower than tf.distribute.ReductionToOneDevice.
Args
num_packs: Values will be packed in this many splits. num_packs should be greater than 0.

Raises
ValueError: if num_packs is zero or negative.
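The most common way to use this class is to pass an instance to tf.distribute.MirroredStrategy through its cross_device_ops argument, so that the reductions issued by the strategy use hierarchical copy. A minimal sketch, assuming a single host with multiple GPUs:

import tensorflow as tf

# Configure a MirroredStrategy to perform its cross-device reductions with
# hierarchical copy all-reduce (best suited to DGX-1-like GPU topologies).
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce(num_packs=1))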
Methods
batch_reduce
batch_reduce(
reduce_op, value_destination_pairs
)
Reduce PerReplica objects in a batch.
Reduce each first element in value_destination_pairs to its corresponding
second element, which indicates the destinations.
Args
reduce_op: Indicates how per_replica_value will be reduced. Accepted values are tf.distribute.ReduceOp.SUM and tf.distribute.ReduceOp.MEAN.
value_destination_pairs: A list or a tuple of tuples of PerReplica objects (or tensors with device set if there is one device) and destinations.

Returns
A list of Mirrored objects.

Raises
ValueError: if value_destination_pairs is not a list or a tuple of tuples of PerReplica objects and destinations.
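batch_reduce is normally reached through a strategy rather than called directly. A hedged sketch using strategy.extended.batch_reduce_to, which delegates to the configured cross-device op; the device names are illustrative, the machine is assumed to have two GPUs, and strategy.run was named experimental_run_v2 in TF 2.0/2.1:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce(num_packs=1))

def replica_fn():
    # Produce a distinct value on each replica so the reduction is visible.
    ctx = tf.distribute.get_replica_context()
    return tf.cast(ctx.replica_id_in_sync_group + 1, tf.float32)

per_replica = strategy.run(replica_fn)

# Each (value, destinations) pair is reduced independently; here the single
# pair is summed and the result is placed on "/gpu:0".
reduced = strategy.extended.batch_reduce_to(
    tf.distribute.ReduceOp.SUM, [(per_replica, "/gpu:0")])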
broadcast
broadcast(
tensor, destinations
)
Broadcast the tensor to destinations.
Args
tensor: The tensor to broadcast.
destinations: The broadcast destinations.

Returns
A Mirrored object.
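A small sketch of calling broadcast directly on an instance. The destination is given here as a device string, which the cross-device ops accept alongside mirrored values and variables; device names are illustrative and assume at least two GPUs:

import tensorflow as tf

hierarchical = tf.distribute.HierarchicalCopyAllReduce()

with tf.device("/gpu:0"):
    t = tf.constant([1.0, 2.0, 3.0])

# Copy the tensor from "/gpu:0" to the destination device; the result is
# returned as a Mirrored value.
mirrored = hierarchical.broadcast(t, "/gpu:1")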
reduce
reduce(
reduce_op, per_replica_value, destinations
)
Reduce per_replica_value to destinations.
It runs the reduction operation defined by reduce_op and puts the
result on destinations.
Args
reduce_op: Indicates how per_replica_value will be reduced. Accepted values are tf.distribute.ReduceOp.SUM and tf.distribute.ReduceOp.MEAN.
per_replica_value: A PerReplica object or a tensor with device set.
destinations: The reduction destinations.

Returns
A Mirrored object.

Raises
ValueError: if per_replica_value can't be converted to a PerReplica object.
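Like batch_reduce, this method is usually invoked through a strategy. A hedged sketch using strategy.extended.reduce_to, which forwards to the configured cross-device op; passing the per-replica value itself as destinations mirrors the result back onto every replica device (the typical all-reduce pattern), and the device names are illustrative:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce(num_packs=1))

def replica_fn():
    ctx = tf.distribute.get_replica_context()
    return tf.cast(ctx.replica_id_in_sync_group + 1, tf.float32)

# One value per replica (strategy.run was experimental_run_v2 in TF 2.0/2.1).
per_replica = strategy.run(replica_fn)

# Average the per-replica values and mirror the result to every replica device.
averaged = strategy.extended.reduce_to(
    tf.distribute.ReduceOp.MEAN, per_replica, destinations=per_replica)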
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2020-10-01 UTC."],[],[],null,["# tf.distribute.HierarchicalCopyAllReduce\n\n\u003cbr /\u003e\n\n|-------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------|\n| [TensorFlow 1 version](/versions/r1.15/api_docs/python/tf/distribute/HierarchicalCopyAllReduce) | [View source on GitHub](https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/python/distribute/cross_device_ops.py#L821-L849) |\n\nReduction using hierarchical copy all-reduce.\n\n#### View aliases\n\n\n**Compat aliases for migration**\n\nSee\n[Migration guide](https://www.tensorflow.org/guide/migrate) for\nmore details.\n\n[`tf.compat.v1.distribute.HierarchicalCopyAllReduce`](/api_docs/python/tf/distribute/HierarchicalCopyAllReduce)\n\n\u003cbr /\u003e\n\n tf.distribute.HierarchicalCopyAllReduce(\n num_packs=1\n )\n\nIt reduces to one GPU along edges in some hierarchy and broadcasts back to\neach GPU along the same path. Before performing all-reduce, tensors will be\nrepacked or aggregated for more efficient cross-device transportation.\n\nThis is a reduction created for Nvidia DGX-1 which assumes GPUs connects like\nthat on DGX-1 machine. If you have different GPU inter-connections, it is\nlikely that it would be slower than [`tf.distribute.ReductionToOneDevice`](../../tf/distribute/ReductionToOneDevice).\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ---- ||\n|-------------|----------------------------------------------------------------------------------|\n| `num_packs` | values will be packed in this many splits. `num_packs` should be greater than 0. |\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Raises ------ ||\n|---|---|\n| ValueError if `num_packs` is zero or negative. ||\n\n\u003cbr /\u003e\n\nMethods\n-------\n\n### `batch_reduce`\n\n[View source](https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/python/distribute/cross_device_ops.py#L284-L324) \n\n batch_reduce(\n reduce_op, value_destination_pairs\n )\n\nReduce PerReplica objects in a batch.\n\nReduce each first element in `value_destination_pairs` to each second\nelement which indicates the destinations.\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ||\n|---------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `reduce_op` | Indicates how per_replica_value will be reduced. Accepted values are [`tf.distribute.ReduceOp.SUM`](../../tf/distribute/ReduceOp#SUM), [`tf.distribute.ReduceOp.MEAN`](../../tf/distribute/ReduceOp#MEAN). |\n| `value_destination_pairs` | a list or a tuple of tuples of PerReplica objects (or tensors with device set if there is one device) and destinations. 
|\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Returns ||\n|---|---|\n| a list of Mirrored objects. ||\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Raises ||\n|--------------|--------------------------------------------------------------------------------------------------------|\n| `ValueError` | if `value_destination_pairs` is not a list or a tuple of tuples of PerReplica objects and destinations |\n\n\u003cbr /\u003e\n\n### `broadcast`\n\n[View source](https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/python/distribute/cross_device_ops.py#L326-L337) \n\n broadcast(\n tensor, destinations\n )\n\nBroadcast the `tensor` to destinations.\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ||\n|----------------|-----------------------------|\n| `tensor` | the tensor to broadcast. |\n| `destinations` | the broadcast destinations. |\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Returns ||\n|---|---|\n| a Mirrored object. ||\n\n\u003cbr /\u003e\n\n### `reduce`\n\n[View source](https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/python/distribute/cross_device_ops.py#L248-L282) \n\n reduce(\n reduce_op, per_replica_value, destinations\n )\n\nReduce `per_replica_value` to `destinations`.\n\nIt runs the reduction operation defined by `reduce_op` and put the\nresult on `destinations`.\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ||\n|---------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `reduce_op` | Indicates how per_replica_value will be reduced. Accepted values are [`tf.distribute.ReduceOp.SUM`](../../tf/distribute/ReduceOp#SUM), [`tf.distribute.ReduceOp.MEAN`](../../tf/distribute/ReduceOp#MEAN). |\n| `per_replica_value` | a PerReplica object or a tensor with device set. |\n| `destinations` | the reduction destinations. |\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Returns ||\n|---|---|\n| a Mirrored object. ||\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Raises ||\n|--------------|-----------------------------------------------------------------|\n| `ValueError` | if per_replica_value can't be converted to a PerReplica object. |\n\n\u003cbr /\u003e"]]