tf.raw_ops.QuantizedMatMulWithBias
Performs a quantized matrix multiplication of `a` by the matrix `b` with bias add.
tf.raw_ops.QuantizedMatMulWithBias(
    a,
    b,
    bias,
    min_a,
    max_a,
    min_b,
    max_b,
    Toutput=tf.dtypes.qint32,
    transpose_a=False,
    transpose_b=False,
    input_quant_mode='MIN_FIRST',
    name=None
)
The inputs must be two-dimensional matrices, and the bias must be a 1D vector. The inner
dimension of `a` (after being transposed if `transpose_a` is true) must
match the outer dimension of `b` (after being transposed if `transpose_b` is
true). The bias values are then broadcast-added to the matrix
multiplication result. The bias size must match the inner dimension of `b`.
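The shape rule above can be sketched in plain Python (a hypothetical reference implementation over nested lists, not the TensorFlow kernel): `a` is `(m, k)`, `b` is `(k, n)` after any transposes, `bias` has length `n`, and the result is `(m, n)`.

```python
def matmul_with_bias(a, b, bias):
    """Reference float matmul plus broadcast bias add over nested lists.

    Illustrates the dimension-matching rule only; the real op works on
    quantized tensors and also returns the output's min/max range.
    """
    m, k = len(a), len(a[0])
    k2, n = len(b), len(b[0])
    # Inner dimension of a must match outer dimension of b.
    assert k == k2, "inner dim of a must match outer dim of b"
    # Bias size must match the inner dimension of b.
    assert len(bias) == n, "bias size must match inner dimension of b"
    return [[sum(a[i][t] * b[t][j] for t in range(k)) + bias[j]
             for j in range(n)] for i in range(m)]
```

For example, multiplying a `(2, 2)` matrix by a `(2, 2)` matrix with a length-2 bias yields a `(2, 2)` result with the bias added to every row.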
Args:
  a: A `Tensor`. Must be one of the following types: `qint8`, `quint8`, `qint32`, `qint16`, `quint16`. A matrix to be multiplied. Must be a two-dimensional tensor of type `quint8`.
  b: A `Tensor`. Must be one of the following types: `qint8`, `quint8`, `qint32`, `qint16`, `quint16`. A matrix to be multiplied. Must be a two-dimensional tensor of type `qint8`.
  bias: A `Tensor`. Must be one of the following types: `float32`, `qint32`. A 1D bias tensor with size matching the inner dimension of `b` (after being transposed if `transpose_b` is true).
  min_a: A `Tensor` of type `float32`. The float value that the lowest quantized `a` value represents.
  max_a: A `Tensor` of type `float32`. The float value that the highest quantized `a` value represents.
  min_b: A `Tensor` of type `float32`. The float value that the lowest quantized `b` value represents.
  max_b: A `Tensor` of type `float32`. The float value that the highest quantized `b` value represents.
  Toutput: An optional `tf.DType` from: `tf.qint8`, `tf.quint8`, `tf.qint32`, `tf.qint16`, `tf.quint16`. Defaults to `tf.qint32`.
  transpose_a: An optional `bool`. Defaults to `False`. If true, `a` is transposed before multiplication.
  transpose_b: An optional `bool`. Defaults to `False`. If true, `b` is transposed before multiplication.
  input_quant_mode: An optional `string` from: `"MIN_FIRST"`, `"SCALED"`. Defaults to `"MIN_FIRST"`. Input data quantization mode, either `MIN_FIRST` (the default) or `SCALED`.
  name: A name for the operation (optional).
Returns:
  A tuple of `Tensor` objects (out, min_out, max_out).
  out: A `Tensor` of type `Toutput`.
  min_out: A `Tensor` of type `float32`.
  max_out: A `Tensor` of type `float32`.
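The `min_*`/`max_*` arguments and the returned `min_out`/`max_out` describe the float range each quantized tensor represents. The sketch below shows one simple affine (MIN_FIRST-style) mapping for 8-bit codes, where code 0 represents `min_v` and code 255 represents `max_v`; the exact rounding and clamping in the TensorFlow kernel may differ, so treat this as an illustrative assumption.

```python
def dequantize(q, min_v, max_v, lowest_q=0, highest_q=255):
    """Map an integer code q back to the float value it represents.

    Assumes a uniform affine mapping of [lowest_q, highest_q] onto
    [min_v, max_v]; illustrative only, not the exact TF kernel math.
    """
    scale = (max_v - min_v) / (highest_q - lowest_q)
    return min_v + (q - lowest_q) * scale
```

Under this mapping, a `quint8` code of 0 with range `[-1.0, 1.0]` dequantizes to -1.0, and a code of 255 dequantizes to 1.0.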
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2024-04-26 UTC.