# llmcompressor.transformers.tracing

Modules:

- `debug`

Functions:

- `trace` – Debug traceability by tracing a pre-trained model into subgraphs
## trace

```python
trace(
    model_id: str,
    model_class: Type[PreTrainedModel],
    sequential_targets: list[str] | str | None = None,
    ignore: list[str] | str = DatasetArguments().tracing_ignore,
    modality: str = "text",
    trust_remote_code: bool = True,
    skip_weights: bool = True,
    device_map: str | dict = "cpu",
) -> Tuple[PreTrainedModel, list[Subgraph], dict[str, torch.Tensor]]
```
Debug traceability by tracing a pre-trained model into subgraphs
Parameters:

- `model_id` (str) – stub of the model to load
- `model_class` (Type[PreTrainedModel]) – class constructor of the pre-trained model. Can use either HF transformers classes or `Traceable` classes defined by LLM Compressor
- `sequential_targets` (list[str] | str | None, default: None) – targets for sequential tracing, defaults to automatic inference
- `ignore` (list[str] | str, default: `tracing_ignore`) – patterns to ignore during tracing
- `modality` (str, default: 'text') – data modality for dummy tracing data, defaults to 'text'
- `trust_remote_code` (bool, default: True) – trust remote model code
Example usage from the CLI:

```shell
llmcompressor.trace \
    --model_id Qwen/Qwen2-VL-2B-Instruct \
    --model_class Qwen2VLForConditionalGeneration \
    --sequential_targets Qwen2VLDecoderLayer \
    --ignore "lm_head" "re:visual.*" \
    --modality text
```
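The same debug run can be invoked from Python. The sketch below mirrors the CLI example above; it assumes `trace` is importable from the `debug` module listed at the top of this page, and it will download the referenced model from the Hugging Face Hub, so it is illustrative rather than something to run in a constrained environment:

```python
# Sketch: tracing Qwen2-VL into subgraphs via the Python API.
# Assumes `trace` lives in llmcompressor.transformers.tracing.debug,
# per the module listing above, and that the model weights are
# reachable on the Hugging Face Hub.
from transformers import Qwen2VLForConditionalGeneration

from llmcompressor.transformers.tracing.debug import trace

model, subgraphs, sample_input = trace(
    model_id="Qwen/Qwen2-VL-2B-Instruct",
    model_class=Qwen2VLForConditionalGeneration,
    sequential_targets=["Qwen2VLDecoderLayer"],
    ignore=["lm_head", "re:visual.*"],
    modality="text",
)

# `subgraphs` holds the traced partitions; inspecting their count and
# boundaries is the main point of a traceability debug session.
print(f"model traced into {len(subgraphs)} subgraphs")
```

Because `skip_weights` defaults to `True`, a call like this exercises tracing without materializing the full weight tensors, which keeps the debug loop fast.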