llmcompressor.modifiers.awq.mappings

Classes:

  • AWQMapping

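    Dataclass storing config of activation mappings to smooth

  • ResolvedMapping

    Dataclass for storing the resolved mappings between an activation layer and the following weights that must be balanced during smoothing

Functions:

  • get_layer_mappings_from_architecture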

AWQMapping dataclass

AWQMapping(
    smooth_layer: str,
    balance_layers: list[str],
    activation_hook_target: str | None = None,
)

Dataclass storing the configuration of an activation mapping to smooth. The output activations of smooth_layer are the input activations of the balance_layers.

AWQMappings are resolved into ResolvedMappings, which, at runtime, retain pointers to the actual torch.nn.Modules along with additional metadata.
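The relationship between smoothing and balancing can be illustrated with a toy example. This is only a sketch in plain Python, assuming a per-channel scale applied to one linear layer; the values and the helper `linear` are hypothetical and are not part of llmcompressor.

```python
# Toy illustration (plain Python, no torch): dividing an activation by a
# per-channel scale and multiplying the matching weight columns by the same
# scale leaves the layer output unchanged.


def linear(weight, x):
    """Compute y = W @ x for a weight given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


x = [4.0, 0.5, 2.0]          # output of the "smooth" layer (hypothetical values)
weight = [                   # weight of one "balance" layer (hypothetical values)
    [1.0, 2.0, 3.0],
    [0.5, -1.0, 2.0],
]
scales = [2.0, 0.5, 4.0]     # hypothetical per-channel smoothing scales

# Smooth: divide each activation channel by its scale.
x_smoothed = [v / s for v, s in zip(x, scales)]

# Balance: multiply the corresponding weight input column by the same scale.
weight_balanced = [[w * s for w, s in zip(row, scales)] for row in weight]

y_original = linear(weight, x)
y_rescaled = linear(weight_balanced, x_smoothed)
```

Because each product `w * v` becomes `(w * s) * (v / s)`, the two outputs agree, which is why every layer consuming the smoothed activation has to appear in balance_layers.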

Parameters:

  • smooth_layer

    (str) –

    regex or name of the activation layer to smooth

  • balance_layers

    (list[str]) –

    list of regex or names of weight layers that must be balanced to offset the smoothing

  • activation_hook_target

    (str | None, default: None ) –

    optional dotted attribute path relative to the parent module (lowest common ancestor of balance_layers) specifying which submodule to hook for activation caching. Useful for parallel transformer blocks (e.g. Cohere, Gemma 3) where the first balance layer is not the correct place to capture activations. When None (default), the hook is placed on balance_layers[0].
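The parameters above can be sketched as follows. To stay self-contained, the example uses a stand-in dataclass (`AWQMappingSketch`) that only mirrors the documented fields, and `SimpleNamespace` objects instead of real modules. The `re:` prefix on the patterns, the layer names, the hook path, and the helper `resolve_dotted` are assumptions made for illustration; they are not taken from this page.

```python
from dataclasses import dataclass
from functools import reduce
from types import SimpleNamespace
from typing import List, Optional


@dataclass
class AWQMappingSketch:
    """Stand-in mirroring the documented AWQMapping fields (not the real class)."""

    smooth_layer: str
    balance_layers: List[str]
    activation_hook_target: Optional[str] = None


def resolve_dotted(parent, path):
    """Follow a dotted attribute path such as "mlp.gate_proj" starting at parent."""
    return reduce(getattr, path.split("."), parent)


# Hypothetical parent block whose submodules are reachable by attribute access.
gate_proj = SimpleNamespace(name="gate_proj")
parent = SimpleNamespace(mlp=SimpleNamespace(gate_proj=gate_proj))

# Default behaviour: no explicit hook target is configured.
plain = AWQMappingSketch(
    smooth_layer="re:.*post_attention_layernorm$",          # assumed pattern syntax
    balance_layers=["re:.*gate_proj$", "re:.*up_proj$"],    # assumed layer names
)

# Explicit hook target, given relative to the parent module.
hooked = AWQMappingSketch(
    smooth_layer="re:.*input_layernorm$",
    balance_layers=["re:.*q_proj$", "re:.*gate_proj$"],
    activation_hook_target="mlp.gate_proj",                 # hypothetical path
)

target = resolve_dotted(parent, hooked.activation_hook_target)
```

The path is interpreted one attribute at a time, so `"mlp.gate_proj"` means `parent.mlp.gate_proj` in this sketch.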

ResolvedMapping dataclass

ResolvedMapping(
    smooth_name: str,
    smooth_layer: Module,
    balance_layers: list[Module],
    balance_names: list[str],
    parent: Module,
    parent_name: str,
    activation_hook_target: Module | None = None,
)

Dataclass storing the resolved mapping between an activation layer and the weight layers that follow it, which must be balanced during smoothing.

Parameters:

  • smooth_name

    (str) –

    name of the activation layer

  • smooth_layer

    (Module) –

    PyTorch module storing the activation layer

  • balance_layers

    (list[Module]) –

    list of PyTorch modules that smooth_layer feeds into; these must be balanced to offset the smoothing of smooth_layer

  • balance_names

    (list[str]) –

    list of names of the balance_layers

  • parent

    (Module) –

    parent module of the balance_layers

  • parent_name

    (str) –

    name of the parent module

  • activation_hook_target

    (Module | None, default: None ) –

    optional resolved module to hook for activation caching. When set, the activation cache hook is placed on this module instead of balance_layers[0]. Populated from AWQMapping.activation_hook_target.
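The fallback described for activation_hook_target can be sketched like this. It is a minimal illustration under stated assumptions: `select_hook_module` is a hypothetical helper, not a function from the library, and `SimpleNamespace` objects stand in for both ResolvedMapping and torch modules.

```python
from types import SimpleNamespace


def select_hook_module(resolved):
    """Return the module on which the activation cache hook would be placed."""
    if resolved.activation_hook_target is not None:
        return resolved.activation_hook_target
    # Default documented above: hook the first balance layer.
    return resolved.balance_layers[0]


# Hypothetical stand-ins for resolved modules.
q_proj = SimpleNamespace(name="q_proj")
k_proj = SimpleNamespace(name="k_proj")
custom_target = SimpleNamespace(name="custom_target")

default_mapping = SimpleNamespace(
    balance_layers=[q_proj, k_proj],
    activation_hook_target=None,
)
custom_mapping = SimpleNamespace(
    balance_layers=[q_proj, k_proj],
    activation_hook_target=custom_target,
)

default_hook = select_hook_module(default_mapping)
custom_hook = select_hook_module(custom_mapping)
```

The check uses `is not None` rather than truthiness so that a set target is always honoured, whatever its truth value.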

get_layer_mappings_from_architecture

get_layer_mappings_from_architecture(
    architecture: str,
) -> list[AWQMapping]

Look up the layer mappings registered for a model architecture in AWQ_MAPPING_REGISTRY. If the architecture is not registered, an info message is logged and the default mappings are returned.

Parameters:

  • architecture

    (str) –

    The architecture of the model, used as the lookup key in AWQ_MAPPING_REGISTRY

Returns:

  • list[AWQMapping]

    The layer mappings for the given architecture, or the default mappings if the architecture is not registered

Source code in llmcompressor/modifiers/awq/mappings.py
def get_layer_mappings_from_architecture(architecture: str) -> list[AWQMapping]:
    """
    :param architecture: str: The architecture of the model
    :return: list: The layer mappings for the given architecture
    """

    if architecture not in AWQ_MAPPING_REGISTRY:
        logger.info(
            f"Architecture {architecture} not found in mappings. "
            f"Using default mappings: {_default_mappings}"
        )

    return AWQ_MAPPING_REGISTRY.get(architecture, _default_mappings)
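
The lookup-with-fallback pattern in the source above can be tried in isolation. The sketch below uses a hypothetical registry, hypothetical default mappings, and a made-up architecture name; only the control flow is taken from the listing.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical stand-ins for the module-level registry and default mappings.
default_mappings_sketch = ["default-mapping"]
registry_sketch = {"ExampleForCausalLM": ["example-mapping"]}


def get_mappings_sketch(architecture: str) -> list:
    """Mirror the lookup shown above: log a message, then fall back to defaults."""
    if architecture not in registry_sketch:
        logger.info("Architecture %s not found; using defaults", architecture)
    return registry_sketch.get(architecture, default_mappings_sketch)


known = get_mappings_sketch("ExampleForCausalLM")
unknown = get_mappings_sketch("UnregisteredModel")

# dict.get returns the stored object itself, not a copy, so copy before editing.
editable = list(unknown)
editable.append("extra-mapping")
```

Since an unregistered architecture is reported only at info level, the fallback is silent under the default logging configuration. The returned list is also the shared object held by the registry (or the shared defaults), so copying it before modification avoids changing what later callers receive.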