llmcompressor.observers.moving_base
Classes:
- MovingAverageObserverBase – Compute quantization parameters by taking the moving average of min/max values
MovingAverageObserverBase
MovingAverageObserverBase(
    base_name: str,
    args: QuantizationArgs,
    module: Optional[Module] = None,
    **observer_kwargs,
)
Bases: Observer
Compute quantization parameters by taking the moving average of min/max values
Parameters:
- base_name (str) – str used to name the observer attribute
- args (QuantizationArgs) – quantization args used to calibrate and quantize the observed value
- module (Optional[Module], default: None) – optional module with attached quantization parameters. This argument is required to utilize existing qparams such as global_scale or g_idx
- **observer_kwargs – keyword arguments for observer initialization
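The moving average of min/max values described above can be illustrated with a minimal sketch. It assumes an exponential moving average controlled by an `averaging_constant`; that name, its default, and the helper `moving_average_min_max` are hypothetical and are not taken from this page. The sketch uses plain floats rather than tensors and is not the library's implementation.

```python
from typing import Optional, Tuple

MinMax = Tuple[float, float]


def moving_average_min_max(
    past: Optional[MinMax],
    current: MinMax,
    averaging_constant: float = 0.01,  # hypothetical smoothing factor
) -> MinMax:
    """Blend the newest (min, max) into the running (min, max)."""
    if past is None:
        # First observation: nothing to average against yet.
        return current
    past_min, past_max = past
    cur_min, cur_max = current
    # Move each running value a fraction of the way toward the new value.
    new_min = past_min + averaging_constant * (cur_min - past_min)
    new_max = past_max + averaging_constant * (cur_max - past_max)
    return new_min, new_max


state: Optional[MinMax] = None
for batch in ([-1.0, 2.0], [-3.0, 4.0]):
    state = moving_average_min_max(
        state, (min(batch), max(batch)), averaging_constant=0.5
    )
print(state)  # (-2.0, 3.0)
```

A small constant makes the running range react slowly to outlier batches; a constant of 1.0 would simply track the most recent batch.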
Methods:
- get_current_global_min_max – Calculate the min and max value of the observed value (without moving average) for the purposes of global scale calculation
- get_current_min_max – Calculate the min and max value of the observed value (without moving average)
- get_global_min_max – Calculate moving average of min and max values from observed value for the purposes of global scale calculation
- get_min_max – Calculate moving average of min and max values from observed value
Source code in llmcompressor/observers/moving_base.py
get_current_global_min_max abstractmethod
Calculate the min and max value of the observed value (without moving average) for the purposes of global scale calculation
get_current_min_max abstractmethod
Calculate the min and max value of the observed value (without moving average)
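The split between the abstract `get_current_min_max` and the concrete `get_min_max` can be sketched as a template-method pattern: a subclass supplies only the un-averaged statistics, and the base class applies the averaging. The classes below (`MovingAverageBaseSketch`, `PlainMinMaxSketch`) and the `averaging_constant` argument are hypothetical stand-ins written for illustration; they work on lists of floats and do not reproduce the real `Observer` interface.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple

MinMax = Tuple[float, float]


class MovingAverageBaseSketch(ABC):
    """Hypothetical, simplified stand-in for a moving-average observer base."""

    def __init__(self, averaging_constant: float = 0.01) -> None:
        self.averaging_constant = averaging_constant  # assumed smoothing factor
        self._past: Optional[MinMax] = None

    @abstractmethod
    def get_current_min_max(self, observed: List[float]) -> MinMax:
        """Return (min, max) of this observation only, without averaging."""

    def get_min_max(self, observed: List[float]) -> MinMax:
        """Return the running (min, max), updated with this observation."""
        cur_min, cur_max = self.get_current_min_max(observed)
        if self._past is None:
            self._past = (cur_min, cur_max)
        else:
            c = self.averaging_constant
            past_min, past_max = self._past
            self._past = (
                past_min + c * (cur_min - past_min),
                past_max + c * (cur_max - past_max),
            )
        return self._past


class PlainMinMaxSketch(MovingAverageBaseSketch):
    """Subclass that only defines how the current statistics are computed."""

    def get_current_min_max(self, observed: List[float]) -> MinMax:
        return min(observed), max(observed)


observer = PlainMinMaxSketch(averaging_constant=0.5)
first = observer.get_min_max([0.0, 4.0])
second = observer.get_min_max([2.0, 8.0])
print(first, second)  # (0.0, 4.0) (1.0, 6.0)
```

Under this arrangement, a different statistic (for example a clipped range) would only require overriding `get_current_min_max`; the averaging logic stays in one place.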
get_global_min_max
Calculate moving average of min and max values from observed value for the purposes of global scale calculation
Parameters:
- observed (Tensor) – value being observed whose shape is (num_observations, 1, group_size)
Returns:
- MinMaxTuple – minimum value and maximum value whose shapes are (1, )
get_min_max
Calculate moving average of min and max values from observed value
Parameters:
- observed (Tensor) – value being observed whose shape is (num_observations, *qparam_shape, group_size)
Returns:
- MinMaxTuple – minimum value and maximum value whose shapes are (*qparam_shape, )
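The shape contract above suggests that the first (`num_observations`) and last (`group_size`) dimensions are reduced away, leaving one min and one max per quantization parameter. The sketch below illustrates that reduction for the simplest case, where `qparam_shape` is a single dimension `(num_groups,)`. It uses nested Python lists instead of tensors, and the helper `reduce_min_max` is hypothetical; the moving-average step is omitted so that only the shape handling is shown.

```python
from typing import List, Tuple


def reduce_min_max(
    observed: List[List[List[float]]],
) -> Tuple[List[float], List[float]]:
    """Reduce (num_observations, num_groups, group_size) to (num_groups,)."""
    num_groups = len(observed[0])
    mins: List[float] = []
    maxs: List[float] = []
    for g in range(num_groups):
        # Pool every value belonging to group g across all observations.
        values = [v for observation in observed for v in observation[g]]
        mins.append(min(values))
        maxs.append(max(values))
    return mins, maxs


# Shape (2, 2, 2): two observations, two groups, group_size of two.
observed = [
    [[1.0, -2.0], [3.0, 4.0]],
    [[0.0, 5.0], [-6.0, 2.0]],
]
print(reduce_min_max(observed))  # ([-2.0, -6.0], [5.0, 4.0])

# Shape (2, 1, 2): the global case, which yields outputs of shape (1,).
global_observed = [[[1.0, -2.0]], [[0.0, 5.0]]]
print(reduce_min_max(global_observed))  # ([-2.0], [5.0])
```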