Abstract: By reducing the size of the data transmitted between the device-side and edge-side parts of a machine learning model, intermediate activation (IA) compression can alleviate communication overhead, lower ...