Update ExLlama.md

oobabooga 2023-06-24 20:23:01 -03:00 committed by GitHub
parent b071eb0d4b
commit a70a2ac3be


@@ -1,10 +1,18 @@
 # ExLlama
-## About
+### About
+ExLlama is an extremely optimized GPTQ backend ("loader") for LLaMA models. It features much lower VRAM usage and much higher speeds due to not relying on unoptimized transformers code.
-ExLlama is an extremely optimized GPTQ backend for LLaMA models. It features much lower VRAM usage and much higher speeds due to not relying on unoptimized transformers code.
-## Installation:
+### Usage
+Configure text-generation-webui to use exllama via the UI or command line:
+- In the "Model" tab, set "Loader" to "exllama"
+- Specify `--loader exllama` on the command line
+### Manual setup
 No additional installation steps are necessary since an exllama package is already included in the requirements.txt. If this package fails to install for some reason, you can use the following manual procedure:
 1) Clone the ExLlama repository into your `text-generation-webui/repositories` folder:
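The `--loader exllama` flag introduced in this hunk is passed to the web UI's launcher. A minimal invocation might look like the following sketch; the model folder name is a placeholder, not something taken from this commit:

```shell
# Launch text-generation-webui with the ExLlama loader selected.
# Replace "your-gptq-model-folder" with a GPTQ model directory
# under text-generation-webui/models (illustrative placeholder).
python server.py --loader exllama --model your-gptq-model-folder
```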
@@ -14,8 +22,3 @@ cd repositories
 git clone https://github.com/turboderp/exllama
 ```
 2) Follow the remaining set up instructions in the official README: https://github.com/turboderp/exllama#exllama
-3) Configure text-generation-webui to use exllama via the UI or command line:
-- In the "Model" tab, set "Loader" to "exllama"
-- Specify `--loader exllama` on the command line
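The manual fallback procedure described above can be sketched as a short shell session. This is an assumption-laden sketch, not part of the commit: it assumes you start from the text-generation-webui root directory and have `git` available, and the final dependency-install step may differ from what the linked ExLlama README actually specifies.

```shell
# Manual ExLlama setup, following the numbered steps above.
# Assumes the current directory is the text-generation-webui root.
mkdir -p repositories
cd repositories
git clone https://github.com/turboderp/exllama
# Then complete the remaining setup per the official README at
# https://github.com/turboderp/exllama#exllama (exact steps may vary).
```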