Commit graph

3583 commits

Author SHA1 Message Date
Jonathan Sowards
e0d466ed14
Change openai models list result output to match the dummy model example.
2024-03-17 19:38:05 -06:00
Jonathan Sowards
0a67325809
Update script.py
Fixes bug #5675, in which the OpenAI extension's v1/models endpoint lists dummy models instead of the loaded models.
2024-03-17 19:08:37 -06:00
oobabooga
49b111e2dd Lint 2024-03-17 08:33:23 -07:00
oobabooga
d890c99b53 Fix StreamingLLM when content is removed from the beginning of the prompt 2024-03-14 09:18:54 -07:00
oobabooga
d828844a6f Small fix: don't save truncation_length to settings.yaml
It should derive from model metadata or from a command-line flag.
2024-03-14 08:56:28 -07:00
oobabooga
2ef5490a36 UI: make light theme less blinding 2024-03-13 08:23:16 -07:00
oobabooga
40a60e0297 Convert attention_sink_size to int (closes #5696) 2024-03-13 08:15:49 -07:00
oobabooga
edec3bf3b0 UI: avoid caching convert_to_markdown calls during streaming 2024-03-13 08:14:34 -07:00
oobabooga
8152152dd6 Small fix after 28076928ac 2024-03-11 19:56:35 -07:00
oobabooga
28076928ac
UI: Add a new "User description" field for user personality/biography (#5691) 2024-03-11 23:41:57 -03:00
oobabooga
63701f59cf UI: mention that n_gpu_layers > 0 is necessary for the GPU to be used 2024-03-11 18:54:15 -07:00
oobabooga
46031407b5 Increase the cache size of convert_to_markdown to 4096 2024-03-11 18:43:04 -07:00
oobabooga
9eca197409 Minor logging change 2024-03-11 16:31:13 -07:00
oobabooga
afadc787d7 Optimize the UI by caching convert_to_markdown calls 2024-03-10 20:10:07 -07:00
oobabooga
056717923f Document StreamingLLM 2024-03-10 19:15:23 -07:00
oobabooga
15d90d9bd5 Minor logging change 2024-03-10 18:20:50 -07:00
oobabooga
abcdd0ad5b API: don't use settings.yaml for default values 2024-03-10 16:15:52 -07:00
oobabooga
a102c704f5 Add numba to requirements.txt 2024-03-10 16:13:29 -07:00
oobabooga
b3ade5832b Keep AQLM only for Linux (fails to install on Windows) 2024-03-10 09:41:17 -07:00
oobabooga
67b24b0b88 Bump llama-cpp-python to 0.2.56 2024-03-10 09:07:27 -07:00
oobabooga
763f9beb7e Bump bitsandbytes to 0.43, add official Windows wheel 2024-03-10 08:30:53 -07:00
oobabooga
52a34921ef Installer: validate the checksum for the miniconda installer on Windows 2024-03-09 16:33:12 -08:00
oobabooga
cf0697936a Optimize StreamingLLM by over 10x 2024-03-08 21:48:28 -08:00
oobabooga
afb51bd5d6
Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669) 2024-03-09 00:25:33 -03:00
oobabooga
9271e80914 Add back AutoAWQ for Windows
https://github.com/casper-hansen/AutoAWQ/issues/377#issuecomment-1986440695
2024-03-08 14:54:56 -08:00
oobabooga
549bb88975 Increase height of "Custom stopping strings" UI field 2024-03-08 12:54:30 -08:00
oobabooga
238f69accc Move "Command for chat-instruct mode" to the main chat tab (closes #5634) 2024-03-08 12:52:52 -08:00
oobabooga
d0663bae31
Bump AutoAWQ to 0.2.3 (Linux only) (#5658) 2024-03-08 17:36:28 -03:00
oobabooga
0e6eb7c27a
Add AQLM support (transformers loader) (#5466) 2024-03-08 17:30:36 -03:00
oobabooga
2681f6f640
Make superbooga & superboogav2 functional again (#5656) 2024-03-07 15:03:18 -03:00
oobabooga
bae14c8f13 Right-truncate long chat completion prompts instead of left-truncating
Instructions are usually at the beginning of the prompt.
2024-03-07 08:50:24 -08:00
Bartowski
104573f7d4
Update cache_4bit documentation (#5649)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-07 13:08:21 -03:00
oobabooga
bef08129bc Small fix for cuda 11.8 in the one-click installer 2024-03-06 21:43:36 -08:00
oobabooga
303433001f Fix a check in the installer 2024-03-06 21:13:54 -08:00
oobabooga
bde7f00cae Change the exllamav2 version number 2024-03-06 21:08:29 -08:00
oobabooga
2ec1d96c91
Add cache_4bit option for ExLlamaV2 (#5645) 2024-03-06 23:02:25 -03:00
oobabooga
fa0e68cefd Installer: add back INSTALL_EXTENSIONS environment variable (for docker) 2024-03-06 11:31:06 -08:00
oobabooga
fcc92caa30 Installer: add option to install requirements for just one extension 2024-03-06 07:36:23 -08:00
oobabooga
2174958362
Revert gradio to 3.50.2 (#5640) 2024-03-06 11:52:46 -03:00
oobabooga
7eee9e9470 Add -k to curl command to download miniconda on windows (closes #5628) 2024-03-06 06:46:50 -08:00
oobabooga
03f03af535 Revert "Update peft requirement from ==0.8.* to ==0.9.* (#5626)"
This reverts commit 72a498ddd4.
2024-03-05 02:56:37 -08:00
oobabooga
d61e31e182
Save the extensions after Gradio 4 (#5632) 2024-03-05 07:54:34 -03:00
oobabooga
ae12d045ea Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2024-03-05 02:35:04 -08:00
dependabot[bot]
72a498ddd4
Update peft requirement from ==0.8.* to ==0.9.* (#5626) 2024-03-05 07:34:32 -03:00
oobabooga
1437f757a1 Bump HQQ to 0.1.5 2024-03-05 02:33:51 -08:00
oobabooga
63a1d4afc8
Bump gradio to 4.19 (#5522) 2024-03-05 07:32:28 -03:00
oobabooga
164ff2440d Use the correct PyTorch in the Colab notebook 2024-03-05 01:05:19 -08:00
oobabooga
3cfcab63a5 Update an installation message 2024-03-04 20:37:44 -08:00
oobabooga
907bda0d56 Move update_wizard_wsl.sh to update_wizard_wsl.bat 2024-03-04 19:57:49 -08:00
oobabooga
f697cb4609 Move update_wizard_windows.sh to update_wizard_windows.bat (oops) 2024-03-04 19:26:24 -08:00