llmcompressor.pipelines.sequential.transformers_helpers
Classes:
- HFCacheProxy – Proxy that represents an instance of transformers.cache_utils.Cache.
- HFProxy – Proxy that uses metadata to handle data-dependent control-flow.
- HFProxyableClassMeta – Metaclass that creates a class with its main methods wrapped to be proxyable.
- HFTracer – Tracer that is able to symbolically trace models from the library. To do that, it uses HFProxy instead of the regular PyTorch torch.fx.Proxy.
Functions:
- gen_constructor_wrapper – Wraps target to be proxyable. Used for tensor creators like torch.ones, torch.arange and so on.
- symbolic_trace – Performs symbolic tracing on the model.
HFCacheProxy
HFProxy
Bases: Proxy
Proxy that uses metadata to handle data-dependent control-flow.
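To see why metadata is needed, consider that stock torch.fx cannot trace through branches whose outcome depends on tensor values: coercing a plain torch.fx.Proxy to bool raises a TraceError. The sketch below (a hypothetical Gate module, not from this library) demonstrates the failure mode that HFProxy's metadata is designed to avoid.

```python
import torch
import torch.fx


class Gate(torch.nn.Module):
    def forward(self, x):
        # Data-dependent branch: the truth value depends on the tensor's contents
        if x.sum() > 0:
            return x
        return -x


# Plain torch.fx proxies cannot be coerced to bool, so tracing fails here
try:
    torch.fx.symbolic_trace(Gate())
    traced_ok = True
except torch.fx.proxy.TraceError:
    traced_ok = False
```

By recording metadata (e.g. concrete shapes and example values) on each proxy, HFProxy can resolve such branches during tracing instead of raising.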
HFProxyableClassMeta
Bases: type
Metaclass that creates a class with its main methods wrapped to be proxyable.
HFTracer
Bases: Tracer
Tracer that is able to symbolically trace models from the library. To do that, it uses the HFProxy instead of the regular PyTorch torch.fx.Proxy.
Methods:
- keys – Called when a proxy object has the keys() method called.
- path_of_module – Helper method to find the qualified name of mod in the Module hierarchy of root. For example, if root has a submodule named foo, which has a submodule named bar, passing bar into this function will return "foo.bar".
- trace – Traces root and returns the corresponding FX torch.fx.Graph representation. root can either be a torch.nn.Module instance or a Python callable.
Source code in llmcompressor/pipelines/sequential/transformers_helpers.py
keys
Called when a proxy object has the keys() method called. This is what happens when ** is used on a proxy. This should return an iterator if ** is supposed to work in your custom tracer.
Source code in llmcompressor/pipelines/sequential/transformers_helpers.py
path_of_module
Helper method to find the qualified name of mod in the Module hierarchy of root. For example, if root has a submodule named foo, which has a submodule named bar, passing bar into this function will return the string "foo.bar".
Args: mod (torch.nn.Module): The Module to retrieve the qualified name for.
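The qualified names that path_of_module returns follow the same dotted convention as torch.nn.Module.named_modules. The sketch below (hypothetical Root/Foo/Bar modules) illustrates that convention without invoking a tracer.

```python
import torch


class Bar(torch.nn.Module):
    def forward(self, x):
        return x + 1


class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.bar = Bar()

    def forward(self, x):
        return self.bar(x)


class Root(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.foo = Foo()

    def forward(self, x):
        return self.foo(x)


root = Root()
# Map each submodule to its qualified name, as path_of_module would report it
names = {id(m): name for name, m in root.named_modules()}
qualified = names[id(root.foo.bar)]  # "foo.bar"
```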
Source code in llmcompressor/pipelines/sequential/transformers_helpers.py
trace
trace(
root: Module | Callable[..., Any],
concrete_args: dict[str, Any] | None = None,
dummy_inputs: dict[str, Any] | None = None,
complete_concrete_args_with_inputs_not_in_dummy_inputs: bool = True,
) -> Graph
Traces root and returns the corresponding FX torch.fx.Graph representation. root can either be a torch.nn.Module instance or a Python callable. Note that after this call, self.root may be different from the root passed in here. For example, when a free function is passed to trace(), we will create a torch.nn.Module instance to use as the root and add embedded constants to.
Args:
- root (torch.nn.Module or Callable): Either a torch.nn.Module or a function to be traced through. If root is not a transformers.PreTrainedModel, then dummy_inputs must be passed, otherwise tracing will fail.
- concrete_args (dict[str, Any], optional): Concrete arguments that should not be treated as Proxies.
- dummy_inputs (dict[str, Any], optional): The dummy inputs needed to handle data-dependent control-flow if root is not a transformers.PreTrainedModel. It can also be used when root is a transformers.PreTrainedModel to specify custom dummy inputs for a subset of, or all, the model inputs.
- complete_concrete_args_with_inputs_not_in_dummy_inputs (bool, optional, defaults to True): If True and dummy_inputs is specified, every argument that root can take that is not in dummy_inputs and not in concrete_args will be added to concrete_args; otherwise does nothing.
Returns: torch.fx.Graph: An FX torch.fx.Graph representing the semantics of the passed-in root.
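HFTracer.trace follows the contract of the base torch.fx.Tracer.trace it overrides: it returns a torch.fx.Graph, which must be wrapped in a GraphModule before it can be executed. A minimal sketch with the stock tracer (HFTracer additionally accepts dummy_inputs, as described above, when tracing HF models):

```python
import torch
import torch.fx


class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y


# trace() returns a Graph, not a runnable module
tracer = torch.fx.Tracer()
graph = tracer.trace(Add())

# tracer.root holds the module used as the tracing root; pair it with the
# graph to build an executable GraphModule
gm = torch.fx.GraphModule(tracer.root, graph)
out = gm(torch.ones(2), torch.full((2,), 2.0))
```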
Source code in llmcompressor/pipelines/sequential/transformers_helpers.py
gen_constructor_wrapper
Wraps target to be proxyable. Used for tensor creators like torch.ones, torch.arange and so on.
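The idea behind such a wrapper is that calls to tensor creators like torch.ones are intercepted and delegated, so that during symbolic tracing their results can be turned into proxies rather than concrete tensors. The sketch below (make_constructor_wrapper is a hypothetical name, not this module's implementation) only records the intercepted calls; the real wrapper routes them through the active tracer instead.

```python
import functools

import torch


def make_constructor_wrapper(target):
    """Simplified sketch: intercept calls to a tensor creator, then delegate.

    gen_constructor_wrapper does the interception for the same reason, but
    hands the call to the active tracer so the result becomes a proxy.
    """
    recorded = []

    @functools.wraps(target)
    def wrapper(*args, **kwargs):
        recorded.append((target.__name__, args, kwargs))
        return target(*args, **kwargs)

    return wrapper, recorded


ones, recorded = make_constructor_wrapper(torch.ones)
t = ones(2, 3)  # intercepted, then delegated to torch.ones
```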
Source code in llmcompressor/pipelines/sequential/transformers_helpers.py
symbolic_trace
symbolic_trace(
model: PreTrainedModel,
input_names: list[str] | None = None,
disable_check: bool = False,
tracer_cls: type[HFTracer] = HFTracer,
) -> GraphModule
Performs symbolic tracing on the model.
Args:
- model (transformers.PreTrainedModel): The model to trace.
- input_names (list[str], optional): The names of the inputs of the traced model. If unset, model.dummy_inputs.keys() are used instead.
- disable_check (bool, optional, defaults to False): If True, no check is done before trying to trace the model; this is mostly useful for debugging purposes.
- tracer_cls (type[HFTracer], optional, defaults to HFTracer): The tracer class to use for instantiating the tracer. If unset, HFTracer is used instead.
Returns: torch.fx.GraphModule: A GraphModule constructed by recording operations seen while tracing the model.
Example:
```python
from llmcompressor.pipelines.sequential.transformers_helpers import symbolic_trace
traced_model = symbolic_trace(model, input_names=["input_ids", "attention_mask", "token_type_ids"])
```
Source code in llmcompressor/pipelines/sequential/transformers_helpers.py