2023-07-24

Pruning and clustering are optimization techniques:

Pruning: setting weights to zero

Clustering: grouping weights into a small number of clusters, so that all weights in a cluster share one value
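To make the two definitions concrete, here is a minimal NumPy sketch applied to a single weight tensor. It is only an illustration under stated assumptions: the function names `prune_by_magnitude` and `cluster_weights` are hypothetical, magnitude-based pruning and a simple 1-D k-means are just one common way to realize each idea, and this is not the API of any particular optimization toolkit or of the Vela compiler.

```python
import numpy as np


def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to zero.

    Hypothetical helper: `sparsity` is the target fraction of zeros (0.0 to 1.0).
    """
    magnitudes = np.abs(weights).ravel()
    k = int(sparsity * magnitudes.size)  # number of weights to zero out
    if k == 0:
        return weights.copy()
    # Magnitude of the k-th smallest weight; everything at or below it is pruned.
    # (Ties at the threshold may prune slightly more than k weights.)
    threshold = np.partition(magnitudes, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(weights) <= threshold] = 0.0
    return pruned


def cluster_weights(weights, n_clusters, n_iters=20):
    """Return a copy of `weights` where each value is replaced by its cluster centroid.

    Hypothetical helper: plain 1-D k-means with linearly spaced initial centroids.
    """
    flat = weights.ravel()
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iters):
        # Assign every weight to its nearest centroid.
        assign = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(n_clusters):
            members = flat[assign == c]
            if members.size:
                centroids[c] = members.mean()
    assign = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    return centroids[assign].reshape(weights.shape)


rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8)).astype(np.float32)

pruned = prune_by_magnitude(w, sparsity=0.5)
clustered = cluster_weights(w, n_clusters=4)

print("fraction of zeros after pruning:", np.mean(pruned == 0))
print("distinct values after clustering:", np.unique(clustered).size)
```

In a real workflow these transformations are usually applied during or alongside fine-tuning so the model can recover accuracy; the sketch above only shows what happens to the weight values themselves.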

These techniques modify the weights of a machine learning model. In some cases, they enable:

Significant speed-up of the inference execution

Reduction of the memory footprint

Reduction in the overall power consumption of the system

We assume that you can optimize your workload without loss in accuracy and that you target an Arm® Ethos NPU. You can therefore prune and cluster your neural network before using the Vela compiler and deploying it on the Ethos-U hardware. See below for more information on optimizing your workload.
