llmcompressor.observers.min_max

Classes:

  • MemorylessMinMaxObserver

    Compute quantization parameters by taking the min/max of the observed value

  • MinMaxObserver

    Compute quantization parameters by taking the moving average of all min/max values

  • StaticMinMaxObserver

    Compute quantization parameters by taking the min/max of all observed values
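All three observers reduce what they see to a min/max pair before quantization parameters are derived. This page does not show how the library turns that pair into a scale and zero point. As a rough illustration only, the sketch below uses the textbook asymmetric (affine) integer mapping. The function name affine_qparams and its num_bits parameter are hypothetical and not part of llmcompressor, whose actual calculation may differ.

```python
from typing import Tuple


def affine_qparams(min_val: float, max_val: float, num_bits: int = 8) -> Tuple[float, int]:
    """Hypothetical helper: textbook affine scale/zero point from a min/max pair."""
    # Signed integer range, e.g. [-128, 127] for 8 bits.
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1

    # Widen the range so that 0.0 is always representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)

    scale = (max_val - min_val) / (qmax - qmin)
    if scale == 0.0:
        # Degenerate case (all zeros): fall back to an identity-like mapping.
        return 1.0, 0

    # Zero point is the integer that the real value 0.0 maps to.
    zero_point = round(qmin - min_val / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


scale, zero_point = affine_qparams(-1.0, 3.0)
```

A wider observed range gives a larger scale and therefore coarser quantization steps, which is why the choice of observer (current batch, moving average, or global min/max) matters.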

MemorylessMinMaxObserver

MemorylessMinMaxObserver(
    base_name: str,
    args: QuantizationArgs,
    module: Optional[Module] = None,
    **observer_kwargs,
)

Bases: Observer

Compute quantization parameters from the min/max of the current observed value only. No statistics are retained between observations.

Parameters:

  • base_name (str) – str used to name the observer attribute

  • args (QuantizationArgs) – quantization args used to calibrate and quantize the observed value

  • module (Optional[Module], default: None) – optional module with attached quantization parameters. This argument is required to utilize existing qparams such as global_scale or g_idx

  • **observer_kwargs – keyword arguments for observer initialization

Source code in llmcompressor/observers/base.py
def __init__(
    self,
    base_name: str,
    args: QuantizationArgs,
    module: Optional[torch.nn.Module] = None,
    **observer_kwargs,
):
    super().__init__()
    self.module = ref(module) if module is not None else None
    self.base_name = base_name
    self.args = args

    # populate observer kwargs
    self.args.observer_kwargs = self.args.observer_kwargs or {}
    self.args.observer_kwargs.update(observer_kwargs)
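The constructor above stores no running statistics, which matches the "memoryless" behaviour: each observation is reduced to its own min/max independently of earlier ones. A minimal sketch of that idea on plain Python floats follows. The function memoryless_min_max is a hypothetical stand-in, not the library's method, and the real observer operates on tensors.

```python
from typing import List, Sequence, Tuple


def memoryless_min_max(observed: Sequence[float]) -> Tuple[float, float]:
    """Hypothetical helper: min/max of the current observation only."""
    # Nothing is read from or written to any persistent state.
    return min(observed), max(observed)


batches: List[List[float]] = [[-1.0, 0.5, 2.0], [-0.25, 0.1, 0.75]]

# Each batch yields its own range; the second is unaffected by the first.
ranges = [memoryless_min_max(batch) for batch in batches]
```

Because nothing is carried over, the resulting range follows each input exactly, which suits cases where parameters are recomputed per observation.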

MinMaxObserver

MinMaxObserver(
    base_name: str,
    args: QuantizationArgs,
    module: Optional[Module] = None,
    **observer_kwargs,
)

Bases: MovingAverageObserverBase

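Compute quantization parameters from a moving average of the min/max values observed so far. The averaging weight is read from the averaging_constant entry of observer_kwargs and defaults to 0.01.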

Parameters:

  • base_name (str) – str used to name the observer attribute

  • args (QuantizationArgs) – quantization args used to calibrate and quantize the observed value

  • module (Optional[Module], default: None) – optional module with attached quantization parameters. This argument is required to utilize existing qparams such as global_scale or g_idx

  • **observer_kwargs – keyword arguments for observer initialization

Source code in llmcompressor/observers/moving_base.py
def __init__(
    self,
    base_name: str,
    args: QuantizationArgs,
    module: Optional[torch.nn.Module] = None,
    **observer_kwargs,
):
    super().__init__(base_name, args, module, **observer_kwargs)
    self.avg_constant = self.args.observer_kwargs.get("averaging_constant", 0.01)

    self.past_min_vals = None
    self.past_max_vals = None
    self.past_global_min_vals = None
    self.past_global_max_vals = None
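The excerpt above shows the state and the averaging_constant default (0.01) but not the update rule itself. The sketch below assumes the common exponential moving average form, past + c * (new - past), with the first observation used to initialise the state. MovingAverageMinMaxSketch is a hypothetical class working on plain floats, not the library implementation.

```python
from typing import Optional, Sequence, Tuple


class MovingAverageMinMaxSketch:
    """Hypothetical illustration of a moving-average min/max tracker."""

    def __init__(self, averaging_constant: float = 0.01):
        # Same default as the averaging_constant observer kwarg shown above.
        self.avg_constant = averaging_constant
        self.past_min: Optional[float] = None
        self.past_max: Optional[float] = None

    def update(self, observed: Sequence[float]) -> Tuple[float, float]:
        new_min, new_max = min(observed), max(observed)
        if self.past_min is None or self.past_max is None:
            # First observation: adopt its range directly.
            self.past_min, self.past_max = new_min, new_max
        else:
            # Assumed rule: move a fraction c of the way toward the new values.
            c = self.avg_constant
            self.past_min = self.past_min + c * (new_min - self.past_min)
            self.past_max = self.past_max + c * (new_max - self.past_max)
        return self.past_min, self.past_max


tracker = MovingAverageMinMaxSketch(averaging_constant=0.5)
first = tracker.update([-1.0, 2.0])
second = tracker.update([-3.0, 4.0])
```

With a small constant such as 0.01, a single outlier batch shifts the tracked range only slightly. A value of 0.5 is used here purely to make the arithmetic easy to follow.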

StaticMinMaxObserver

StaticMinMaxObserver(*args, **kwargs)

Bases: Observer

Compute quantization parameters from the overall min/max across all values observed so far.

Parameters (passed through *args and **kwargs to the Observer constructor):

  • base_name – str used to name the observer attribute

  • args – quantization args used to calibrate and quantize the observed value

  • module – optional module with attached quantization parameters. This argument is required to utilize existing qparams such as global_scale or g_idx

  • **observer_kwargs – keyword arguments for observer initialization

Source code in llmcompressor/observers/min_max.py
def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.past_min_vals = None
    self.past_max_vals = None
    self.past_global_min_vals = None
    self.past_global_max_vals = None
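The constructor above only initialises the past_* attributes. How they are updated is not shown here. A minimal sketch of a running global min/max, assuming the tracked values are widened whenever a new observation falls outside them, might look as follows. RunningMinMaxSketch is a hypothetical class on plain floats, not the library implementation.

```python
from typing import Optional, Sequence, Tuple


class RunningMinMaxSketch:
    """Hypothetical illustration of a min/max over all observations so far."""

    def __init__(self) -> None:
        self.past_min: Optional[float] = None
        self.past_max: Optional[float] = None

    def update(self, observed: Sequence[float]) -> Tuple[float, float]:
        new_min, new_max = min(observed), max(observed)
        if self.past_min is None or self.past_max is None:
            # First observation: adopt its range directly.
            self.past_min, self.past_max = new_min, new_max
        else:
            # The range can only widen, never shrink.
            self.past_min = min(self.past_min, new_min)
            self.past_max = max(self.past_max, new_max)
        return self.past_min, self.past_max


running = RunningMinMaxSketch()
for batch in ([-1.0, 2.0], [-0.5, 3.0], [-4.0, 1.0]):
    final_range = running.update(batch)
```

Unlike the moving-average variant, this range never forgets an extreme value, so one outlier permanently widens it.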