llmcompressor.utils.dist
Functions:
- greedy_bin_packing – Distribute items across bins using a greedy bin-packing heuristic.
- wait_for_comms – Block until all pending async distributed operations complete.
greedy_bin_packing
greedy_bin_packing(
    items: list[T],
    num_bins: int,
    item_weight_fn: Callable[[T], float] = lambda x: 1,
) -> tuple[list[T], list[list[T]], dict[T, int]]
Distribute items across bins using a greedy bin-packing heuristic.
Items are sorted by weight in descending order, then each item is assigned to the bin with the smallest current total weight. This approximates an even distribution of weight across bins.
Parameters:
- items (list[T]) – items to distribute. Sorted in-place by descending weight.
- num_bins (int) – number of bins to distribute items across.
- item_weight_fn (Callable[[T], float], default: lambda x: 1) – callable that returns the weight of an item. Defaults to a uniform weight of 1.
Returns:
- tuple[list[T], list[list[T]], dict[T, int]] – a 3-tuple of:
  - items: the input list, now sorted by descending weight.
  - bin_to_items: list of length num_bins where each element is the list of items assigned to that bin.
  - item_to_bin: mapping from each item to its assigned bin index.
Source code in llmcompressor/utils/dist.py
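The heuristic described above can be illustrated with a minimal sketch. This is not the library's source: the name greedy_bin_packing_sketch, the tie-breaking rule (lowest bin index wins), and the sample layer_sizes data are assumptions made for illustration only.

```python
from typing import Callable, Hashable, TypeVar

T = TypeVar("T", bound=Hashable)


def greedy_bin_packing_sketch(
    items: list[T],
    num_bins: int,
    item_weight_fn: Callable[[T], float] = lambda x: 1,
) -> tuple[list[T], list[list[T]], dict[T, int]]:
    """Hypothetical re-implementation of the documented behavior."""
    # Sort in place, heaviest first, mirroring the documented side effect.
    items.sort(key=item_weight_fn, reverse=True)

    bin_to_items: list[list[T]] = [[] for _ in range(num_bins)]
    item_to_bin: dict[T, int] = {}
    bin_weights = [0.0] * num_bins

    for item in items:
        # Pick the bin with the smallest running total.
        # Ties go to the lowest index (an assumption, not documented).
        target = min(range(num_bins), key=bin_weights.__getitem__)
        bin_to_items[target].append(item)
        item_to_bin[item] = target
        bin_weights[target] += item_weight_fn(item)

    return items, bin_to_items, item_to_bin


# Made-up weights: five named items split across two bins.
layer_sizes = {"a": 7, "b": 5, "c": 4, "d": 3, "e": 1}
names = list(layer_sizes)
sorted_names, bins, assignment = greedy_bin_packing_sketch(
    names, 2, layer_sizes.__getitem__
)
print(bins)  # [['a', 'd'], ['b', 'c', 'e']]
```

In this example both bins end with a total weight of 10. Because items are used as dictionary keys in item_to_bin, they need to be hashable.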
wait_for_comms
Block until all pending async distributed operations complete.
Calls wait() on each work handle, then clears the list in-place so it can be reused for the next batch of operations.
Parameters:
- pending_comms (list[Work]) – mutable list of async communication handles (returned by dist.reduce, dist.broadcast, etc. with async_op=True). The list is cleared after all operations have completed.
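The wait-then-clear pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the library's source: wait_for_comms_sketch is a hypothetical name, and FakeWork is a stand-in for the async handles so the example runs without an initialized process group.

```python
class FakeWork:
    """Hypothetical stand-in for an async communication handle."""

    def __init__(self) -> None:
        self.done = False

    def wait(self) -> bool:
        # A real handle would block here until the operation finishes.
        self.done = True
        return True


def wait_for_comms_sketch(pending_comms: list) -> None:
    """Hypothetical re-implementation of the documented behavior."""
    # Block on every outstanding handle.
    for work in pending_comms:
        work.wait()
    # Clear in place so every reference to the list sees it emptied.
    pending_comms.clear()


handles = [FakeWork() for _ in range(3)]
pending = list(handles)
alias = pending  # a second reference to the same list object

wait_for_comms_sketch(pending)
print(len(alias))  # 0
```

Clearing in place, rather than rebinding the name to a new list, means the caller's list object is the one that is emptied, so it can be reused for the next batch of operations.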