Google today introduced TensorFlow Lite 1.0, its framework for developers deploying AI models on mobile and IoT devices. Improvements include selective registration, as well as quantization during and after training, for faster, smaller models. Quantization has shrunk some models to a quarter of their original size.
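That roughly 4x compression comes from storing 32-bit float weights as 8-bit integers. A minimal sketch of the idea in plain Python, using the standard affine scale/zero-point mapping (an illustration of the general technique, not TensorFlow Lite's exact implementation):

```python
def quantize(weights, num_bits=8):
    """Affine quantization: map each float to an integer via a scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constant weights
    zero_point = round(qmin - lo / scale)
    return [round(w / scale) + zero_point for w in weights], scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Recover approximate float weights from the quantized integers."""
    return [(q - zero_point) * scale for q in quantized]

weights = [-0.5, 0.0, 0.25, 1.0]
q, scale, zp = quantize(weights)
approx = dequantize(q, scale, zp)
# Each weight now fits in 1 byte instead of 4 -- the 4x size reduction above,
# at the cost of a small rounding error in each recovered weight.
```

Quantization-aware training, also mentioned above, goes further by simulating this rounding during training so the model learns weights that survive it well.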
“We are going to fully support it. We’re not going to break things and make sure we guarantee its compatibility. I think a lot of people who deploy this on phones want those guarantees,” TensorFlow engineering director Rajat Monga told VentureBeat in a phone interview.
The Lite workflow begins with training a model in TensorFlow; the model is then converted to the Lite format so it can run on mobile devices. Lite was first introduced at the I/O developer conference in May 2017 and released as a developer preview later that year.
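In TensorFlow 2.x, that train-then-convert step is a few lines with the `tf.lite.TFLiteConverter` API. A minimal sketch (the tiny one-layer model here is a stand-in for a real trained model):

```python
import tensorflow as tf

# Train (or load) a regular TensorFlow/Keras model first...
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])

# ...then convert it to the TensorFlow Lite flat-buffer format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables post-training quantization
tflite_model = converter.convert()

# The resulting bytes are what ships to the mobile or IoT device.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

The `.tflite` file is then loaded on-device by the Lite interpreter rather than the full TensorFlow runtime.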
The TensorFlow Lite team at Google also shared its roadmap today, aimed at shrinking and speeding up AI models for edge deployment. Plans include model acceleration, especially for Android developers using neural nets, as well as a Keras-based connection pruning kit and additional quantization enhancements.
Other changes on the way:
- Support for control flow, which is essential to the operation of models like recurrent neural networks
- CPU performance optimization with Lite models, potentially involving partnerships with other companies
- Expanded coverage of GPU delegate operations, and finalization of the API to make it generally available
A converter for producing Lite models from TensorFlow 2.0 models will also be made available, to help developers better understand what goes wrong in the conversion process and how to fix it.
TensorFlow Lite is deployed on more than two billion devices today, TensorFlow Lite engineer Raziel Alvarez said onstage at the TensorFlow Dev Summit, held at Google offices in Sunnyvale, California.
TensorFlow Lite is increasingly making TensorFlow Mobile obsolete, except for users who want to use Mobile for training, and a solution for them is in the works, Alvarez said.