llmcompressor.modeling.qwen3_moe

Classes:

CalibrationQwen3MoeSparseMoeBlock: Calibration version of Qwen3MoeSparseMoeBlock that sends all tokens to all experts.

CalibrationQwen3MoeSparseMoeBlock

CalibrationQwen3MoeSparseMoeBlock(
    original: Qwen3MoeSparseMoeBlock,
    config: Qwen3MoeConfig,
    calibrate_all_experts: bool = True,
)

Bases: MoECalibrationModule

Calibration version of Qwen3MoeSparseMoeBlock that sends all tokens to all experts.

Source code in llmcompressor/modeling/qwen3_moe.py

def __init__(
    self,
    original: OriginalQwen3MoeSparseMoeBlock,
    config: Qwen3MoeConfig,
    calibrate_all_experts: bool = True,
):
    super().__init__()
    self.num_experts = config.num_experts
    self.top_k = config.num_experts_per_tok
    self.norm_topk_prob = config.norm_topk_prob

    self.calibrate_all_experts = calibrate_all_experts
    self.gate = original.gate
    self.experts = original.experts
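
The constructor above only stores the routing settings (`num_experts`, `top_k`, `norm_topk_prob`) and reuses the original `gate` and `experts`; the forward pass is not part of this excerpt. As a rough illustration of what "sends all tokens to all experts" can mean alongside top-k routing, here is a minimal pure-Python sketch. It assumes that, in calibration mode, every expert is run on every token so that its activations can be observed, while the block output is still built only from each token's top-k experts. The names `CountingExpert`, `moe_forward`, and `run` are hypothetical and are not part of llmcompressor.

```python
import math


class CountingExpert:
    """Toy expert (hypothetical): scales its input and counts the tokens it sees."""

    def __init__(self, scale):
        self.scale = scale
        self.tokens_seen = 0

    def __call__(self, tokens):
        self.tokens_seen += len(tokens)
        return [self.scale * t for t in tokens]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(tokens, router_logits, experts, top_k,
                norm_topk_prob=True, calibrate_all_experts=True):
    """Toy MoE forward pass over scalar tokens (an assumption, not the library code)."""
    # Routing: pick the top_k experts per token and optionally renormalize weights.
    routes = []
    for logits in router_logits:
        probs = softmax(logits)
        top = sorted(range(len(experts)), key=lambda e: probs[e], reverse=True)[:top_k]
        weights = {e: probs[e] for e in top}
        if norm_topk_prob:
            total = sum(weights.values())
            weights = {e: w / total for e, w in weights.items()}
        routes.append(weights)

    outputs = [0.0] * len(tokens)
    for e_idx, expert in enumerate(experts):
        routed = [i for i, w in enumerate(routes) if e_idx in w]
        if calibrate_all_experts:
            # Run the expert on every token, then keep only the routed results.
            all_out = expert(tokens)
            expert_out = [all_out[i] for i in routed]
        else:
            # Standard sparse behavior: the expert sees only its routed tokens.
            if not routed:
                continue
            expert_out = expert([tokens[i] for i in routed])
        for i, out in zip(routed, expert_out):
            outputs[i] += routes[i][e_idx] * out
    return outputs


tokens = [1.0, 2.0, 3.0]
# Every token strongly prefers experts 0 and 1 out of four.
router_logits = [[5.0, 4.0, 0.0, 0.0] for _ in tokens]


def run(calibrate_all_experts):
    experts = [CountingExpert(s) for s in (1.0, 2.0, 3.0, 4.0)]
    out = moe_forward(tokens, router_logits, experts, top_k=2,
                      calibrate_all_experts=calibrate_all_experts)
    return out, [e.tokens_seen for e in experts]
```

In this sketch, `run(True)` and `run(False)` produce the same outputs, but only the first call lets experts 2 and 3 see any tokens. That is the point of a calibration variant: experts that the router rarely selects still receive data from which quantization statistics can be collected.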