diff --git a/README.md b/README.md index c2e6b928..f6c03915 100644 --- a/README.md +++ b/README.md @@ -145,6 +145,7 @@ Optionally, you can use the following command-line flags: | `--flexgen` | Enable the use of FlexGen offloading. | | `--percent PERCENT [PERCENT ...]` | FlexGen: allocation percentages. Must be 6 numbers separated by spaces (default: 0, 100, 100, 0, 100, 0). | | `--compress-weight` | FlexGen: Whether to compress weight (default: False).| +| `--pin-weight [PIN_WEIGHT]` | FlexGen: whether to pin weights (setting this to False reduces CPU memory by 20%). | | `--deepspeed` | Enable the use of DeepSpeed ZeRO-3 for inference via the Transformers integration. | | `--nvme-offload-dir NVME_OFFLOAD_DIR` | DeepSpeed: Directory to use for ZeRO-3 NVME offloading. | | `--local_rank LOCAL_RANK` | DeepSpeed: Optional argument for distributed setups. |