llmcompressor.modifiers.awq.mappings
Classes:
- AWQMapping – Dataclass storing config of activation mappings to smooth
Functions:
- get_layer_mappings_from_architecture – Returns the layer mappings for the given model architecture
AWQMapping dataclass
AWQMapping(
    smooth_layer: str,
    balance_layers: list[str],
    activation_hook_target: str | None = None,
)
Dataclass storing the configuration of an activation mapping to smooth. The output activations of smooth_layer are the input activations of the balance_layers.
At runtime, AWQMappings are resolved into ResolvedMappings, which retain pointers to the actual torch.nn.Modules along with additional metadata.
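To make "balanced to offset the smoothing" concrete, here is a minimal numeric sketch in plain Python. It assumes the usual activation-smoothing convention: the smooth layer's output is divided by a per-channel scale, and the matching input columns of each balance layer's weight are multiplied by the same scale. The values of x, balance_weight, and scales are arbitrary illustrations, not anything computed by the library.

```python
def matvec(weight, x):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


# Output activations of the smooth layer (illustrative values).
x = [1.0, 3.0]
# Weight of one balance layer, shape (out_features, in_features).
balance_weight = [[1.0, 2.0], [3.0, 4.0]]
# Per-input-channel smoothing scales (illustrative values).
scales = [2.0, 4.0]

# Smoothing: divide each activation channel by its scale.
smoothed_x = [xi / s for xi, s in zip(x, scales)]
# Balancing: multiply the matching input column of the weight by the same scale.
balanced_weight = [[w * s for w, s in zip(row, scales)] for row in balance_weight]

original_out = matvec(balance_weight, x)
balanced_out = matvec(balanced_weight, smoothed_x)

# The two changes cancel, so the balance layer's output is unchanged.
print(original_out == balanced_out)
```

Because the scales cancel exactly, the mapping only has to name which layer is smoothed and which layers absorb the inverse scaling.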
Parameters:
- smooth_layer (str) – regex or name of the activation layer to smooth
- balance_layers (list[str]) – list of regexes or names of the weight layers that must be balanced to offset the smoothing
- activation_hook_target (str | None, default: None) – optional dotted attribute path, relative to the parent module (the lowest common ancestor of balance_layers), specifying which submodule to hook for activation caching. Useful for parallel transformer blocks (e.g. Cohere, Gemma 3), where the first balance layer is not the correct place to capture activations. When None (default), the hook is placed on balance_layers[0].
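The following sketch shows how the three fields might be filled in. To stay runnable without llmcompressor installed, it defines a local stand-in dataclass that mirrors the signature documented above; in real use you would import AWQMapping from llmcompressor.modifiers.awq.mappings. The layer-name patterns, the "re:" prefix marking a regex, and the "mlp" hook path are assumptions chosen for illustration, not values taken from this page.

```python
from dataclasses import dataclass
from typing import Optional


# Local stand-in mirroring the documented signature (assumption: used here
# only so the example runs without llmcompressor installed).
@dataclass
class AWQMapping:
    smooth_layer: str
    balance_layers: list[str]
    activation_hook_target: Optional[str] = None


# Sequential block: a norm layer feeding the attention projections.
# The "re:" regex prefix and the layer names are illustrative assumptions.
attn_mapping = AWQMapping(
    smooth_layer="re:.*input_layernorm$",
    balance_layers=["re:.*q_proj$", "re:.*k_proj$", "re:.*v_proj$"],
)

# Parallel block: one norm feeds both attention and MLP, so the hook is moved
# from balance_layers[0] to a submodule of the common parent.
# "mlp" is a hypothetical attribute path relative to that parent.
parallel_mapping = AWQMapping(
    smooth_layer="re:.*input_layernorm$",
    balance_layers=[
        "re:.*self_attn.q_proj$",
        "re:.*self_attn.k_proj$",
        "re:.*self_attn.v_proj$",
        "re:.*mlp.gate_proj$",
        "re:.*mlp.up_proj$",
    ],
    activation_hook_target="mlp",
)

print(attn_mapping.activation_hook_target)      # None: hook goes on balance_layers[0]
print(parallel_mapping.activation_hook_target)  # mlp
```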
ResolvedMapping dataclass
ResolvedMapping(
    smooth_name: str,
    smooth_layer: Module,
    balance_layers: list[Module],
    balance_names: list[str],
    parent: Module,
    parent_name: str,
    activation_hook_target: Module | None = None,
)
Dataclass storing a resolved mapping between an activation layer and the weight layers it feeds into, which must be balanced during smoothing.
Parameters:
- smooth_name (str) – name of the activation layer
- smooth_layer (Module) – PyTorch module storing the activation layer
- balance_layers (list[Module]) – list of PyTorch modules that smooth_layer feeds into; these must be balanced to offset the smoothing of smooth_layer
- balance_names (list[str]) – list of names of the balance_layers
- parent (Module) – parent module of the balance_layers
- parent_name (str) – name of the parent module
- activation_hook_target (Module | None, default: None) – optional resolved module to hook for activation caching. When set, the activation cache hook is placed on this module instead of balance_layers[0]. Populated from AWQMapping.activation_hook_target.
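The relationship between balance_names, parent_name, and activation_hook_target can be sketched in plain Python. This is not the library's resolution code: lowest_common_ancestor_name and resolve_attr_path are hypothetical helpers, and SimpleNamespace objects stand in for torch.nn.Module instances so the example runs without PyTorch. It only illustrates the idea that the parent is the deepest module shared by all balance layers and that the hook target is an attribute path followed from that parent.

```python
from functools import reduce
from types import SimpleNamespace


def lowest_common_ancestor_name(names):
    """Return the longest dotted prefix shared by all names (hypothetical helper).

    Note: with a single name this returns the name itself; the library may
    treat that case differently.
    """
    common = []
    for parts in zip(*(name.split(".") for name in names)):
        if len(set(parts)) != 1:
            break
        common.append(parts[0])
    return ".".join(common)


def resolve_attr_path(root, dotted_path):
    """Follow a dotted attribute path starting from root (hypothetical helper)."""
    return reduce(getattr, dotted_path.split("."), root)


# Toy stand-in for one decoder block; real code would hold torch.nn.Modules.
mlp = SimpleNamespace(gate_proj="gate", up_proj="up")
self_attn = SimpleNamespace(q_proj="q")
block = SimpleNamespace(self_attn=self_attn, mlp=mlp)

balance_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.gate_proj",
    "model.layers.0.mlp.up_proj",
]

# The parent is the deepest module containing every balance layer.
parent_name = lowest_common_ancestor_name(balance_names)
print(parent_name)  # model.layers.0

# activation_hook_target="mlp" is resolved relative to that parent.
hook_target = resolve_attr_path(block, "mlp")
print(hook_target is mlp)  # True
```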
get_layer_mappings_from_architecture
Parameters:
- architecture (str) – the architecture of the model
Returns:
- list[AWQMapping] – the layer mappings for the given architecture