Commit graph

19 commits

Author SHA1 Message Date
oobabooga
10601850d9 Fix after previous commit 2024-06-13 19:54:12 -07:00
oobabooga
0f3a423de1 Alternative solution to "get next logits" deadlock (#6106) 2024-06-13 19:34:16 -07:00
oobabooga
ae86292159 Fix getting Phi-3-small-128k-instruct logits 2024-05-21 10:35:00 -07:00
oobabooga
9f77ed1b98
--idle-timeout flag to unload the model if unused for N minutes (#6026) 2024-05-19 23:29:39 -03:00
wangshuai09
fd4e46bce2
Add Ascend NPU support (basic) (#5541) 2024-04-11 18:42:20 -03:00
oobabooga
2a1063eff5 Revert "Remove non-HF ExLlamaV2 loader (#5431)"
This reverts commit cde000d478.
2024-02-06 06:21:36 -08:00
oobabooga
cde000d478
Remove non-HF ExLlamaV2 loader (#5431) 2024-02-04 01:15:51 -03:00
kalomaze
48327cc5c4
Dynamic Temperature HF loader support (#5174)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
oobabooga
0e54a09bcb
Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
Kim Jaewon
e53f99faa0
[OpenAI Extension] Add 'max_logits' parameter in logits endpoint (#4916) 2023-12-15 00:22:43 -03:00
oobabooga
a2e6d00128 Use convert_ids_to_tokens instead of decode in logits endpoint
This preserves the llama tokenizer spaces.
2023-11-19 09:22:08 -08:00
oobabooga
0fa1af296c
Add /v1/internal/logits endpoint (#4650) 2023-11-18 23:19:31 -03:00
Abhilash Majumder
778a010df8
Intel Gpu support initialization (#4340) 2023-10-26 23:39:51 -03:00
oobabooga
b062d50c45 Remove exllama import that causes problems 2023-09-17 18:00:32 -07:00
oobabooga
ad8ac545a5 Tokenization improvements 2023-09-17 07:02:00 -07:00
saltacc
cd08eb0753
token probs for non HF loaders (#3957) 2023-09-17 10:42:32 -03:00
oobabooga
c0b119c3a3 Improve logit viewer format 2023-08-22 20:35:12 -07:00
oobabooga
8545052c9d Add the option to use samplers in the logit viewer 2023-08-22 20:18:16 -07:00
oobabooga
120fb86c6a
Add a simple logit viewer (#3636) 2023-08-20 20:49:21 -03:00