Commit graph

89 commits

Author SHA1 Message Date
kalomaze
b6077b02e4
Quadratic sampling (#5403)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-04 00:20:02 -03:00
oobabooga
87dc421ee8
Bump exllamav2 to 0.0.12 (#5352) 2024-01-22 22:40:12 -03:00
oobabooga
e055967974
Add prompt_lookup_num_tokens parameter (#5296) 2024-01-17 17:09:36 -03:00
oobabooga
372ef5e2d8 Fix dynatemp parameters always visible 2024-01-08 19:42:31 -08:00
oobabooga
29c2693ea0
dynatemp_low, dynatemp_high, dynatemp_exponent parameters (#5209) 2024-01-08 23:28:35 -03:00
oobabooga
0d07b3a6a1
Add dynamic_temperature_low parameter (#5198) 2024-01-07 17:03:47 -03:00
kalomaze
48327cc5c4
Dynamic Temperature HF loader support (#5174)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
oobabooga
0e54a09bcb
Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
oobabooga
c727a70572 Remove redundancy from modules/loaders.py 2023-12-20 19:18:07 -08:00
oobabooga
de138b8ba6
Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga
0a299d5959
Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
oobabooga
f6d701624c UI: mention that QuIP# does not work on Windows 2023-12-18 18:05:02 -08:00
Water
674be9a09a
Add HQQ quant loader (#4888)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga
f1f2c4c3f4
Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
oobabooga
3bbf6c601d AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) 2023-12-15 06:46:13 -08:00
oobabooga
62d59a516f Add trust_remote_code to all HF loaders 2023-12-08 06:29:26 -08:00
oobabooga
98361af4d5
Add QuIP# support (#4803)
It has to be installed manually for now.
2023-12-06 00:01:01 -03:00
oobabooga
7fc9033b2e Recommend ExLlama_HF and ExLlamav2_HF 2023-12-04 15:28:46 -08:00
oobabooga
9940ed9c77 Sort the loaders 2023-11-29 15:13:03 -08:00
oobabooga
a7670c31ca Sort 2023-11-28 18:43:33 -08:00
oobabooga
6e51bae2e0 Sort the loaders menu 2023-11-28 18:41:11 -08:00
oobabooga
0589ff5b12
Bump llama-cpp-python to 0.2.19 & add min_p and typical_p parameters to llama.cpp loader (#4701) 2023-11-21 20:59:39 -03:00
oobabooga
e0ca49ed9c
Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637)
* Update requirements*.txt

* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga
9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga
8b66d83aa9 Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
oobabooga
923c8e25fb
Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
oobabooga
58c6001be9 Add missing exllamav2 samplers 2023-11-16 07:09:40 -08:00
oobabooga
af3d25a503 Disable logits_all in llamacpp_HF (makes processing 3x faster) 2023-11-07 14:35:48 -08:00
feng lui
4766a57352
transformers: add use_flash_attention_2 option (#4373) 2023-11-04 13:59:33 -03:00
oobabooga
aa5d671579
Add temperature_last parameter (#4472) 2023-11-04 13:09:07 -03:00
kalomaze
367e5e6e43
Implement Min P as a sampler option in HF loaders (#4449) 2023-11-02 16:32:51 -03:00
oobabooga
c0655475ae Add cache_8bit option 2023-11-02 11:23:04 -07:00
tdrussell
72f6fc6923
Rename additive_repetition_penalty to presence_penalty, add frequency_penalty (#4376) 2023-10-25 12:10:28 -03:00
oobabooga
ef1489cd4d Remove unused parameter in AutoAWQ 2023-10-23 20:45:43 -07:00
tdrussell
4440f87722
Add additive_repetition_penalty sampler setting. (#3627) 2023-10-23 02:28:07 -03:00
oobabooga
df90d03e0b Replace --mul_mat_q with --no_mul_mat_q 2023-10-22 12:23:03 -07:00
Johan
1d5a015ce7
Enable special token support for exllamav2 (#4314) 2023-10-21 01:54:06 -03:00
turboderp
8a98646a21
Bump ExLlamaV2 to 0.0.5 (#4186) 2023-10-05 19:12:22 -03:00
cal066
cc632c3f33
AutoAWQ: initial support (#3999) 2023-10-05 13:19:18 -03:00
oobabooga
ae4ba3007f
Add grammar to transformers and _HF loaders (#4091) 2023-10-05 10:01:36 -03:00
oobabooga
b6fe6acf88 Add threads_batch parameter 2023-10-01 21:28:00 -07:00
jllllll
41a2de96e5
Bump llama-cpp-python to 0.2.11 2023-10-01 18:08:10 -05:00
oobabooga
56b5a4af74 exllamav2 typical_p 2023-09-28 20:10:12 -07:00
StoyanStAtanasov
7e6ff8d1f0
Enable NUMA feature for llama_cpp_python (#4040) 2023-09-26 22:05:00 -03:00
oobabooga
d0d221df49 Add --use_fast option (closes #3741) 2023-09-25 12:19:43 -07:00
oobabooga
36c38d7561 Add disable_exllama to Transformers loader (for GPTQ LoRA training) 2023-09-24 20:03:11 -07:00
oobabooga
08cf150c0c
Add a grammar editor to the UI (#4061) 2023-09-24 18:05:24 -03:00
oobabooga
eb0b7c1053 Fix a minor UI bug 2023-09-24 07:17:33 -07:00
oobabooga
b227e65d86 Add grammar to llama.cpp loader (closes #4019) 2023-09-24 07:10:45 -07:00
saltacc
f01b9aa71f
Add customizable ban tokens (#3899) 2023-09-15 18:27:27 -03:00