Resources:
Pruning
- Magnitude-based pruning: compare each weight's magnitude to a threshold value and remove the weights that fall below it.
- Types:
- Unstructured pruning: remove individual weights (connections) or neurons
- Structured pruning: remove entire channels or filters
- Others:
unstructured pruning
https://arxiv.org/pdf/1506.02626.pdf
Set selected weights in a weight matrix to zero → increases sparsity in the architecture
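A minimal sketch of threshold-based unstructured pruning, assuming NumPy and a plain 2-D weight matrix. The names `magnitude_prune` and `sparsity`, the example matrix, and the threshold of 0.1 are made up for illustration; they are not taken from the linked paper.

```python
import numpy as np

def magnitude_prune(weights, threshold):
    """Zero out every weight whose absolute value is below `threshold`.

    Returns the pruned copy and the boolean mask of surviving weights.
    """
    mask = np.abs(weights) >= threshold
    pruned = np.where(mask, weights, 0.0)
    return pruned, mask

def sparsity(weights):
    """Fraction of entries that are exactly zero."""
    return float(np.mean(weights == 0))

# Illustrative weight matrix (values are arbitrary).
W = np.array([[0.8, -0.05, 0.3],
              [0.01, -0.6, 0.02]])

pruned, mask = magnitude_prune(W, threshold=0.1)
print(pruned)
print("sparsity:", sparsity(pruned))  # 3 of the 6 entries fall below 0.1
```

The matrix keeps its shape; only the number of zeros changes. Any speed-up therefore depends on sparse storage or kernels that can skip zeros.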
structured pruning
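A rough sketch of removing entire filters from a convolution layer, assuming NumPy and weights laid out as (out_channels, in_channels, kH, kW). Ranking filters by their L1 norm is one common heuristic chosen here for illustration; the notes do not specify a criterion. `prune_filters` and `keep_ratio` are hypothetical names.

```python
import numpy as np

def prune_filters(conv_weight, keep_ratio):
    """Keep the filters with the largest L1 norm and drop the rest.

    conv_weight: array of shape (out_channels, in_channels, kH, kW).
    keep_ratio:  fraction of filters to keep (at least one is always kept).
    Returns the smaller weight tensor and the indices of the kept filters.
    """
    out_channels = conv_weight.shape[0]
    n_keep = max(1, int(round(out_channels * keep_ratio)))
    # One L1 norm per filter: sum of absolute values over all its weights.
    norms = np.abs(conv_weight).reshape(out_channels, -1).sum(axis=1)
    # Indices of the n_keep largest norms, restored to original order.
    keep = np.sort(np.argsort(norms)[-n_keep:])
    return conv_weight[keep], keep

rng = np.random.default_rng(0)
W_conv = rng.normal(size=(4, 3, 3, 3))
W_conv[1] *= 0.01  # make filters 1 and 3 clearly unimportant
W_conv[3] *= 0.01

W_small, kept = prune_filters(W_conv, keep_ratio=0.5)
print(W_small.shape, kept)
```

Unlike unstructured pruning, the tensor actually shrinks, so a dense layer gets cheaper without sparse kernels. In a real network the next layer's input channels would have to be sliced with the same `kept` indices.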
Quantization
Low-rank factorization
Knowledge distillation
- Transferring knowledge from a large trained model (ensemble of models) to a smaller model
- Training and inference have different requirements → the model deployed for inference can differ from the one that was trained
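A minimal sketch of a distillation loss, assuming NumPy and the commonly used formulation with temperature-softened teacher outputs: a KL term between teacher and student soft predictions, mixed with ordinary cross-entropy on the hard labels. The function names, `temperature=2.0`, and `alpha=0.5` are illustrative assumptions, not values from these notes.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by `temperature`."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    eps = 1e-12
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft-target term: how far the student is from the teacher's distribution.
    kl = np.sum(p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
                axis=-1)
    # Hard-label term: standard cross-entropy at temperature 1.
    p_hard = softmax(student_logits)
    ce = -np.log(p_hard[np.arange(len(labels)), labels] + eps)
    return float(np.mean(alpha * temperature ** 2 * kl + (1 - alpha) * ce))

teacher = np.array([[5.0, 1.0, 0.0]])
good_student = np.array([[4.0, 1.0, 0.5]])
bad_student = np.array([[0.0, 1.0, 5.0]])
labels = np.array([0])

print(distillation_loss(good_student, teacher, labels))
print(distillation_loss(bad_student, teacher, labels))
```

The student that agrees with the teacher gets the smaller loss. The factor T^2 is the usual scaling that keeps the soft-target term comparable in magnitude as the temperature changes.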