Rounding test of Mobilenet(1.0, 224 * 224 input) on flowers102 dataset.
Given floating parameters
V, first our goal is to represent
V as 8-bit integers
V'. And then we transformed back
V' back into its approximate high-precision value by performing the inverse of the quantization operation. At last, we perform gzip to our quantized && inverse-quantized model. The whole process can reduces our model by 70%.
First, we use UInt8 quantization, that is, the parameters are sacled to [0, 255]
Second, we inverse the quantization.
In the last, We apply gzip to compress the inverse-quantized model, and the compression ratio can be up to 70%.