tf.nn.atrous_conv2d
Atrous convolution (a.k.a. convolution with holes or dilated convolution).
tf.nn.atrous_conv2d(
value, filters, rate, padding, name=None
)
This function is a simpler wrapper around the more general tf.nn.convolution, and exists only for backwards compatibility. You can use tf.nn.convolution to perform 1-D, 2-D, or 3-D atrous convolution.
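For example, the following two calls should produce the same result (a minimal sketch; the shapes and rate are arbitrary, and dilations is passed as a scalar as allowed in TF 2.x):

import tensorflow as tf

value = tf.random.normal([1, 32, 32, 3])   # NHWC input
filters = tf.random.normal([3, 3, 3, 8])   # [filter_height, filter_width, in_channels, out_channels]
rate = 2

out_a = tf.nn.atrous_conv2d(value, filters, rate, padding="SAME")
out_b = tf.nn.convolution(value, filters, padding="SAME", dilations=rate)
print(tf.reduce_max(tf.abs(out_a - out_b)).numpy())  # ~0.0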
Computes a 2-D atrous convolution, also known as convolution with holes or dilated convolution, given 4-D value and filters tensors. If the rate parameter is equal to one, it performs regular 2-D convolution. If the rate parameter is greater than one, it performs convolution with holes, sampling the input values every rate pixels in the height and width dimensions. This is equivalent to convolving the input with a set of upsampled filters, produced by inserting rate - 1 zeros between two consecutive values of the filters along the height and width dimensions, hence the name atrous convolution or convolution with holes (the French word "trous" means "holes" in English).
More specifically:
output[batch, height, width, out_channel] =
sum_{dheight, dwidth, in_channel} (
filters[dheight, dwidth, in_channel, out_channel] *
value[batch, height + rate*dheight, width + rate*dwidth, in_channel]
)
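As a sanity check, the sum above can be transcribed directly into NumPy loops for the 'VALID' case and compared against the op. This is a hedged sketch with small, arbitrary shapes, not the actual implementation:

import numpy as np
import tensorflow as tf

rate = 2
value = np.random.randn(1, 9, 9, 2).astype(np.float32)
filters = np.random.randn(3, 3, 2, 4).astype(np.float32)

# Direct transcription of the formula, restricted to 'VALID' output positions.
fh, fw, _, oc = filters.shape
out_h = value.shape[1] - rate * (fh - 1)
out_w = value.shape[2] - rate * (fw - 1)
out = np.zeros((1, out_h, out_w, oc), np.float32)
for h in range(out_h):
  for w in range(out_w):
    for dh in range(fh):
      for dw in range(fw):
        # value[0, ..., :] has shape [in_channels]; filters[dh, dw] has shape
        # [in_channels, out_channels], so `@` contracts over in_channel.
        out[0, h, w, :] += value[0, h + rate * dh, w + rate * dw, :] @ filters[dh, dw]

ref = tf.nn.atrous_conv2d(value, filters, rate, padding="VALID")
print(np.max(np.abs(out - ref.numpy())))  # ~0.0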
Atrous convolution allows us to explicitly control how densely to compute feature responses in fully convolutional networks. Used in conjunction with bilinear interpolation, it offers an alternative to conv2d_transpose in dense prediction tasks such as semantic image segmentation, optical flow computation, or depth estimation. It also allows us to effectively enlarge the field of view of filters without increasing the number of parameters or the amount of computation.
For a description of atrous convolution and how it can be used for dense
feature extraction, please see: Semantic Image Segmentation with Deep
Convolutional Nets and Fully Connected CRFs.
The same operation is investigated further in Multi-Scale Context Aggregation
by Dilated Convolutions. Previous works
that effectively use atrous convolution in different ways are, among others,
OverFeat: Integrated Recognition, Localization and Detection using
Convolutional Networks and Fast Image
Scanning with Deep Max-Pooling Convolutional Neural
Networks.
Atrous convolution is also closely related to the so-called noble identities
in multi-rate signal processing.
There are many different ways to implement atrous convolution (see the refs
above). The implementation here reduces
atrous_conv2d(value, filters, rate, padding=padding)
to the following three operations:
paddings = ...
net = space_to_batch(value, paddings, block_size=rate)
net = conv2d(net, filters, strides=[1, 1, 1, 1], padding="VALID")
crops = ...
net = batch_to_space(net, crops, block_size=rate)
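In the simple case where the input's spatial dimensions are already multiples of rate and the padding is 'VALID', the paddings and crops above are all zero, and the reduction can be written out directly. The sketch below uses the TF 2.x tf.space_to_batch / tf.batch_to_space signatures, which take a block_shape list instead of a scalar block_size:

import tensorflow as tf

value = tf.random.normal([1, 8, 8, 3])   # spatial dims are multiples of rate
filters = tf.random.normal([3, 3, 3, 5])
rate = 2

zeros = [[0, 0], [0, 0]]                 # no padding/cropping needed here
net = tf.space_to_batch(value, block_shape=[rate, rate], paddings=zeros)
net = tf.nn.conv2d(net, filters, strides=[1, 1, 1, 1], padding="VALID")
net = tf.batch_to_space(net, block_shape=[rate, rate], crops=zeros)

ref = tf.nn.atrous_conv2d(value, filters, rate, padding="VALID")
print(tf.reduce_max(tf.abs(net - ref)).numpy())  # ~0.0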
Advanced usage. Note the following optimization: a sequence of atrous_conv2d operations with identical rate parameters, 'SAME' padding, and filters with odd heights/widths:
net = atrous_conv2d(net, filters1, rate, padding="SAME")
net = atrous_conv2d(net, filters2, rate, padding="SAME")
...
net = atrous_conv2d(net, filtersK, rate, padding="SAME")
can be performed equivalently, and more cheaply in terms of computation and memory, as:
pad = ... # padding so that the input dims are multiples of rate
net = space_to_batch(net, paddings=pad, block_size=rate)
net = conv2d(net, filters1, strides=[1, 1, 1, 1], padding="SAME")
net = conv2d(net, filters2, strides=[1, 1, 1, 1], padding="SAME")
...
net = conv2d(net, filtersK, strides=[1, 1, 1, 1], padding="SAME")
net = batch_to_space(net, crops=pad, block_size=rate)
because a pair of consecutive space_to_batch and batch_to_space ops with the same block_size cancel out when their respective paddings and crops inputs are identical.
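As a concrete instance, the sketch below stacks two atrous convolutions with rate 2 on an input whose spatial dimensions are already multiples of the rate (so pad is zero) and checks the result against the fused form:

import tensorflow as tf

rate = 2
net0 = tf.random.normal([1, 8, 8, 4])
filters1 = tf.random.normal([3, 3, 4, 4])   # odd height/width, as required
filters2 = tf.random.normal([3, 3, 4, 4])

# Stacked atrous convolutions.
a = tf.nn.atrous_conv2d(net0, filters1, rate, padding="SAME")
a = tf.nn.atrous_conv2d(a, filters2, rate, padding="SAME")

# Fused form: one space_to_batch/batch_to_space pair around plain conv2ds.
zeros = [[0, 0], [0, 0]]
b = tf.space_to_batch(net0, block_shape=[rate, rate], paddings=zeros)
b = tf.nn.conv2d(b, filters1, strides=[1, 1, 1, 1], padding="SAME")
b = tf.nn.conv2d(b, filters2, strides=[1, 1, 1, 1], padding="SAME")
b = tf.batch_to_space(b, block_shape=[rate, rate], crops=zeros)

print(tf.reduce_max(tf.abs(a - b)).numpy())  # ~0.0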
Args:
  value: A 4-D Tensor of type float. It needs to be in the default "NHWC"
    format. Its shape is [batch, in_height, in_width, in_channels].
  filters: A 4-D Tensor with the same type as value and shape
    [filter_height, filter_width, in_channels, out_channels]. filters'
    in_channels dimension must match that of value. Atrous convolution is
    equivalent to standard convolution with upsampled filters with effective
    height filter_height + (filter_height - 1) * (rate - 1) and effective
    width filter_width + (filter_width - 1) * (rate - 1), produced by
    inserting rate - 1 zeros along consecutive elements across the filters'
    spatial dimensions.
  rate: A positive int32. The stride with which we sample input values across
    the height and width dimensions. Equivalently, the rate by which we
    upsample the filter values by inserting zeros across the height and width
    dimensions. In the literature, the same parameter is sometimes called
    input stride or dilation.
  padding: A string, either 'VALID' or 'SAME'. The padding algorithm.
  name: Optional name for the returned tensor.
Returns:
  A Tensor with the same type as value.
  Output shape with 'VALID' padding is:
    [batch, height - rate * (filter_height - 1),
     width - rate * (filter_width - 1), out_channels].
  Output shape with 'SAME' padding is:
    [batch, height, width, out_channels].
Raises:
  ValueError: If input/output depth does not match the filters' shape, or if
    padding is other than 'VALID' or 'SAME'.
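The output-shape formulas above can be checked with a quick example (arbitrary sizes):

import tensorflow as tf

value = tf.zeros([1, 20, 20, 3])
filters = tf.zeros([3, 5, 3, 8])   # filter_height=3, filter_width=5
rate = 2

print(tf.nn.atrous_conv2d(value, filters, rate, padding="VALID").shape)
# (1, 16, 12, 8): height 20 - 2*(3-1), width 20 - 2*(5-1)
print(tf.nn.atrous_conv2d(value, filters, rate, padding="SAME").shape)
# (1, 20, 20, 8)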