llmcompressor.modifiers.pruning.wanda
Modules:
- base
- wanda_sparsify
Classes:
- WandaPruningModifier – Modifier for applying the one-shot WANDA algorithm to a model
WandaPruningModifier
Bases: SparsityModifierBase
Modifier for applying the one-shot WANDA algorithm to a model from the paper: https://arxiv.org/abs/2306.11695
Sample yaml:
Lifecycle:
- on_initialize
    - register_hook(module, calibrate_module, "forward")
    - run_sequential / run_basic
        - make_empty_row_scalars
        - accumulate_row_scalars
- on_sequential_batch_end
    - sparsify_weight
- on_finalize
    - remove_hooks()
Parameters:
- sparsity – Target sparsity to compress the model to
- sparsity_profile – Can be set to 'owl' to use Outlier Weighed Layerwise Sparsity (OWL); more information can be found in the paper https://arxiv.org/pdf/2310.05175
- mask_structure – String to define the structure of the mask to apply. Must be of the form N:M where N, M are integers that define a custom block shape. Defaults to 0:0 which represents an unstructured mask.
- owl_m – Number of outliers to use for OWL
- owl_lmbda – Lambda value to use for OWL
- sequential_targets – List of layer names to compress, or 'ALL' to compress every layer in the model. Alias for targets.
- targets – List of layer names to compress, or 'ALL' to compress every layer in the model. Alias for sequential_targets.
- ignore – Optional list of module class names or submodule names to exclude from pruning even if they match a target. Defaults to empty list.
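To make the mask_structure format concrete, here is a minimal pure-Python sketch. It assumes that N:M means the N lowest-scoring weights in every group of M consecutive weights are pruned; that reading, and the helper names parse_mask_structure and nm_prune_mask, are illustrative assumptions and not part of llmcompressor.

```python
def parse_mask_structure(mask_structure: str) -> tuple[int, int]:
    """Split an 'N:M' string into two integers; '0:0' means unstructured."""
    n_text, m_text = mask_structure.split(":")
    n, m = int(n_text), int(m_text)
    if n < 0 or m < 0 or n > m:
        raise ValueError(f"invalid mask structure: {mask_structure!r}")
    return n, m


def nm_prune_mask(scores: list[float], n: int, m: int) -> list[bool]:
    """Return True where a weight is pruned: the n lowest scores in each group of m.

    Assumption: N of every M consecutive weights are removed.
    """
    if m == 0:
        raise ValueError("0:0 is unstructured; no group mask applies")
    mask = [False] * len(scores)
    for start in range(0, len(scores), m):
        group = range(start, min(start + m, len(scores)))
        # Sort the indices of this group by score and mark the n smallest.
        for idx in sorted(group, key=lambda i: scores[i])[:n]:
            mask[idx] = True
    return mask


n, m = parse_mask_structure("2:4")
mask = nm_prune_mask([0.9, 0.1, 0.5, 0.3, 0.2, 0.8, 0.7, 0.4], n, m)
```

With "2:4", each group of four scores has exactly two entries marked for pruning, regardless of how the scores compare across groups.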
Methods:
- calibrate_module – Calibration hook used to accumulate the row scalars of the input to the module
- compress_modules – Sparsify modules which have been calibrated
calibrate_module
Calibration hook used to accumulate the row scalars of the input to the module
Parameters:
- module (Module) – module being calibrated
- args (tuple[Tensor, ...]) – inputs to the module, the first element of which is the canonical input
- _output (Tensor) – uncompressed module output, unused
Source code in llmcompressor/modifiers/pruning/wanda/base.py
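The accumulation performed by this hook can be pictured with a small pure-Python sketch. It assumes the row scalar for each input feature is a running mean of that feature's squared activations over the calibration rows seen so far; accumulate_row_scalars_sketch is a hypothetical stand-in written for illustration, not the library function, and it treats every row as one sample.

```python
def accumulate_row_scalars_sketch(batch, row_scalars, num_samples):
    """Fold one batch into a running mean of squared activations per feature.

    batch: list of rows, each a list with one value per input feature.
    row_scalars: current running mean per input feature.
    num_samples: number of rows already folded in.
    """
    num_added = len(batch)
    total = num_samples + num_added
    updated = []
    for j, old in enumerate(row_scalars):
        batch_sq = sum(row[j] ** 2 for row in batch)
        # Rescale the old mean to the new sample count, then add this batch.
        updated.append(old * num_samples / total + batch_sq / total)
    return updated, total


scalars, seen = [0.0, 0.0], 0
scalars, seen = accumulate_row_scalars_sketch([[1.0, 2.0], [3.0, 0.0]], scalars, seen)
scalars, seen = accumulate_row_scalars_sketch([[1.0, 2.0], [1.0, 2.0]], scalars, seen)
```

Updating a running mean this way keeps only one scalar per input feature in memory, so the calibration inputs themselves never need to be stored.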
compress_modules
Sparsify modules which have been calibrated
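As a rough illustration of what sparsifying a calibrated module involves, the sketch below applies the WANDA criterion from the paper: each weight is scored by its magnitude times the norm of the matching input feature, and the lowest-scoring fraction within each output row is zeroed. It is pure Python, the helper wanda_prune_rows is hypothetical, and it assumes row_scalars holds squared norms; it is not the library's sparsify_weight.

```python
import math


def wanda_prune_rows(weight, row_scalars, sparsity):
    """Zero the lowest-scoring fraction of weights in each output row.

    weight: list of rows (one per output feature), one value per input feature.
    row_scalars: squared activation norm per input feature (assumed).
    sparsity: fraction of weights to remove in each row, between 0 and 1.
    """
    norms = [math.sqrt(s) for s in row_scalars]
    pruned = []
    for row in weight:
        # WANDA score: |weight| times the norm of the corresponding input feature.
        scores = [abs(w) * norm for w, norm in zip(row, norms)]
        num_prune = int(len(row) * sparsity)
        # Indices of the num_prune smallest scores in this row.
        drop = set(sorted(range(len(row)), key=lambda j: scores[j])[:num_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


weight = [[0.5, -1.0, 2.0, 0.3], [3.0, 0.2, -0.4, 1.0]]
row_scalars = [16.0, 1.0, 4.0, 100.0]
sparse = wanda_prune_rows(weight, row_scalars, sparsity=0.5)
```

In the first row the smallest-magnitude weight (0.3) survives because its input feature has a large norm, which is the difference between this criterion and plain magnitude pruning.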