Updated: June 2021
TensorFlow’s Model Optimization Toolkit (MOT) has been widely used to convert and optimize TensorFlow models into TensorFlow Lite models with smaller size, better performance, and acceptable accuracy, so that they can run on mobile and IoT devices. We are now working to extend MOT techniques and tooling beyond TensorFlow Lite to support TensorFlow SavedModel as well.
The following is a high-level overview of our roadmap. Be aware that this roadmap may change at any time, and the order below does not reflect any type of priority. We strongly encourage you to comment on our roadmap and provide feedback in the discussion group (https://groups.google.com/a/tensorflow.org/g/tflite).
Quantization
TensorFlow Lite
- Selective post-training quantization to exclude certain layers from
quantization.
- Quantization debugger to inspect per-layer quantization error (a sketch combining this with selective quantization follows this list).
- Extending quantization-aware training coverage to more models, e.g. the TensorFlow Model Garden.
- Quality and performance improvements for post-training dynamic-range quantization.
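
Both the selective-quantization and debugger items above build on the quantization debugger that already ships in TensorFlow Lite. Below is a minimal sketch, assuming a small stand-in Keras model; the denylisted node name is illustrative, since real TFLite node names differ from Keras layer names and can be read from debugger.layer_statistics.

```python
import numpy as np
import tensorflow as tf

# Small stand-in model; any trained Keras model works the same way.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, 3, activation='relu', name='conv_a'),
    tf.keras.layers.Conv2D(8, 3, activation='relu', name='conv_b'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

def representative_dataset():
    # Random calibration data for the sketch; use real samples in practice.
    for _ in range(10):
        yield [np.random.rand(1, 28, 28, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset

# Selective quantization: the denylist keeps the named nodes in float.
# The node name here is illustrative.
debugger = tf.lite.experimental.QuantizationDebugger(
    converter=converter,
    debug_dataset=representative_dataset,
    debug_options=tf.lite.experimental.QuantizationDebugOptions(
        denylisted_nodes=['sequential/conv_b/Relu']))
debugger.run()

# Dump per-layer quantization error metrics to CSV for inspection.
with open('layer_stats.csv', 'w') as f:
    debugger.layer_statistics_dump(f)
```

The resulting CSV lists per-node error metrics, which is how layers worth denylisting are usually identified.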
TensorFlow
- Post-training quantization (bf16 * int8 dynamic range).
- Quantization-aware training (bf16 * int8 weight-only with fake quant; see the sketch after this list).
- Selective post-training quantization to exclude certain layers from
quantization.
- Quantization debugger to inspect per-layer quantization error.
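
These TensorFlow-side items are planned work, so there is no public API to show yet. As a rough illustration of the fake-quant mechanic the QAT item refers to, TensorFlow's existing fake-quant op can emulate int8 weight-only quantization while the math stays in floating point. Shapes and values below are arbitrary; this is not the planned API.

```python
import tensorflow as tf

# Illustrative only: weights are rounded to int8 levels and dequantized
# back to float, so the matmul itself still runs in floating point
# (here cast to bf16, matching the "bf16 * int8 weight-only" item).
weights = tf.random.normal([256, 128])
w_min, w_max = tf.reduce_min(weights), tf.reduce_max(weights)

fq_weights = tf.quantization.fake_quant_with_min_max_vars(
    weights, min=w_min, max=w_max, num_bits=8, narrow_range=True)

x = tf.random.normal([4, 256])
y = tf.matmul(tf.cast(x, tf.bfloat16), tf.cast(fq_weights, tf.bfloat16))
print(y.shape)  # (4, 128)
```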
Sparsity
TensorFlow Lite
- Sparse model execution support for more models (see the sketch after this list).
- Target-aware authoring for sparsity.
- Extend sparse op set with performant x86 kernels.
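
As context for the sparse-execution item, today's flow prunes a model with the Model Optimization Toolkit and asks the TFLite converter to store the weights sparsely. A minimal sketch, with a toy model and random data standing in for a real model and dataset:

```python
import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10),
])

# Prune to 75% sparsity; in practice you fine-tune for longer on real data.
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.ConstantSparsity(
        target_sparsity=0.75, begin_step=0))
pruned.compile(optimizer='adam', loss='mse')
pruned.fit(np.random.rand(64, 32), np.random.rand(64, 10),
           epochs=1, callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
final = tfmot.sparsity.keras.strip_pruning(pruned)

# Ask the converter to store pruned weights in a sparse format so that
# compatible sparse kernels can exploit them at inference time.
converter = tf.lite.TFLiteConverter.from_keras_model(final)
converter.optimizations = [tf.lite.Optimize.EXPERIMENTAL_SPARSITY]
tflite_model = converter.convert()
```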
TensorFlow
- Sparsity support in TensorFlow.
Cascading compression techniques
- Quantization + Tensor Compression + Sparsity: demonstrate all three techniques working together (a hand-rolled sketch with today's APIs follows).
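
There is no single cascading API yet; the sketch below chains the three techniques by hand with existing TFMOT and TFLite APIs. Note that naively cascading like this can undo earlier gains (for example, clustering can reintroduce non-zero values into pruned weights), which is precisely the gap this roadmap item targets.

```python
import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10),
])
x, y = np.random.rand(64, 32), np.random.rand(64, 10)

# 1) Sparsity: prune to 50%, fine-tune briefly, strip the wrappers.
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.ConstantSparsity(0.5, begin_step=0))
pruned.compile(optimizer='adam', loss='mse')
pruned.fit(x, y, epochs=1,
           callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
stripped = tfmot.sparsity.keras.strip_pruning(pruned)

# 2) Compression stand-in: cluster each layer's weights to 16 centroids.
clustered = tfmot.clustering.keras.cluster_weights(
    stripped,
    number_of_clusters=16,
    cluster_centroids_init=tfmot.clustering.keras.CentroidInitialization.LINEAR)
clustered = tfmot.clustering.keras.strip_clustering(clustered)

# 3) Quantization: post-training dynamic-range quantization on conversion.
converter = tf.lite.TFLiteConverter.from_keras_model(clustered)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
```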
Compression
- Tensor compression API to help compression-algorithm developers implement their own model compression algorithms (e.g. Weight Clustering), including a standard way to test and benchmark them (see the benchmark sketch below).
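
The API itself is future work, so nothing can be shown verbatim. One piece it would standardize is the benchmark side; a common proxy today, used in the TFMOT tutorials, is comparing the gzipped size of the serialized model before and after compression. A sketch (gzipped_size_bytes is a hypothetical helper name, not part of any shipped API):

```python
import gzip
import os
import tempfile

def gzipped_size_bytes(tflite_model: bytes) -> int:
    """Gzipped size of a serialized model, a rough proxy for compressibility."""
    fd, path = tempfile.mkstemp(suffix='.tflite.gz')
    os.close(fd)
    with gzip.open(path, 'wb') as f:
        f.write(tflite_model)
    size = os.path.getsize(path)
    os.remove(path)
    return size

# Usage, assuming `baseline_bytes` and `clustered_bytes` hold the flatbuffers
# from converting the model before and after clustering:
# print(gzipped_size_bytes(baseline_bytes), gzipped_size_bytes(clustered_bytes))
```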