Commit graph

  • 6857daef51 Merge pull request #1 from oobabooga/dev Louis Del Valle 2024-05-03 16:37:40 -0500
  • cb31998605 Add a template for NVIDIA ChatQA models oobabooga 2024-05-03 08:19:04 -0700
  • e9c9483171 Improve the logging messages while loading models oobabooga 2024-05-03 08:10:44 -0700
  • e61055253c Bump llama-cpp-python to 0.2.69, add --flash-attn option oobabooga 2024-05-03 04:31:22 -0700
  • 22baacfd65 Update block_requests.py Louis Del Valle 2024-05-03 01:42:08 -0500
  • 0476f9fe70 Bump ExLlamaV2 to 0.0.20 oobabooga 2024-05-01 16:20:50 -0700
  • 9db81f74f9 Merge branch 'oobabooga:main' into main Guanghua Lu 2024-05-02 00:57:12 +0800
  • ae0f28530c Bump llama-cpp-python to 0.2.68 oobabooga 2024-05-01 08:40:50 -0700
  • 02beedd503 Update accelerate requirement from ==0.27.* to ==0.29.* dependabot[bot] 2024-05-01 12:57:14 +0000
  • 594eef433c Update optimum requirement from ==1.17.* to ==1.19.* dependabot[bot] 2024-05-01 12:57:14 +0000
  • 2cb4495df2 Update gradio requirement from ==4.26.* to ==4.28.* dependabot[bot] 2024-05-01 12:57:12 +0000
  • 8f12fb028d Merge pull request #5970 from oobabooga/dev oobabooga 2024-05-01 09:56:23 -0300
  • 1eba888af6 Update FUNDING.yml oobabooga 2024-05-01 05:54:21 -0700
  • 251361618c Merge branch 'oobabooga:dev' into dev Artificiangel 2024-04-30 11:01:26 -0400
  • 51fb766bea Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) oobabooga 2024-04-30 09:11:31 -0300
  • 8bd090e430 Revert "Remove obsolete --tensorcores references" oobabooga 2024-04-30 03:07:13 -0700
  • 049c3d4761 Bump wheels versions oobabooga 2024-04-30 03:06:53 -0700
  • 7e01776da2 Revert "Bump llama-cpp-python to 0.2.64, use official wheels (#5921)" oobabooga 2024-04-29 19:54:12 -0700
  • d48c519f4e Merge branch 'oobabooga:main' into main Guanghua Lu 2024-04-30 10:49:46 +0800
  • 81f603d09f Merge pull request #5959 from oobabooga/dev oobabooga 2024-04-29 15:45:48 -0300
  • 1b44204bd7 Use custom model/lora download folder in model downloader Artificiangel 2024-04-29 07:21:09 -0400
  • d64459e20d model/unload should also return "OK", consistent with lora/unload Artificiangel 2024-04-28 08:26:42 -0400
  • dce32fe2d9 Raise HTTPException to pass correct status code to the client Artificiangel 2024-04-28 08:26:15 -0400
  • 3ffb09d465 Run in executor for long blocking functions. Artificiangel 2024-04-28 08:24:45 -0400
  • d8a8a73bfb model/unload should also return "OK", consistent with lora/unload Artificiangel 2024-04-28 06:20:35 -0400
  • b424461e66 Raise HTTPException to pass correct status code to the client Artificiangel 2024-04-28 06:18:24 -0400
  • dde040c9de Run in executor for long blocking functions. Artificiangel 2024-04-28 06:14:39 -0400
  • 08878cd9f9 Add the missing "-" Touch-Night 2024-04-28 11:44:50 +0800
  • 5770e06c48 Add a retry mechanism to the model downloader (#5943) oobabooga 2024-04-27 12:25:28 -0300
  • e7a7e7e87c Minor change oobabooga 2024-04-27 08:24:09 -0700
  • dcb3cdcfe7 Update the log message oobabooga 2024-04-27 08:20:41 -0700
  • 38082d1a79 Minor changes oobabooga 2024-04-27 08:14:33 -0700
  • e1de71031c Change a log message oobabooga 2024-04-26 11:05:06 -0700
  • ca142ef8f5 Minor fix oobabooga 2024-04-26 09:45:26 -0700
  • 608d6ad5ab Make the wait time a power of 2 instead of linear oobabooga 2024-04-26 09:44:10 -0700
  • dfdb6fee22 Set llm_int8_enable_fp32_cpu_offload=True for --load-in-4bit oobabooga 2024-04-26 09:39:27 -0700
  • b3d8dbafb1 One session per retry per file oobabooga 2024-04-26 09:38:47 -0700
  • 2b334f6044 Minor change oobabooga 2024-04-26 07:42:01 -0700
  • f28a5c6389 Minor changes oobabooga 2024-04-26 07:41:27 -0700
  • b3f0a32bbf Update tqdm oobabooga 2024-04-26 07:35:05 -0700
  • 5332d3ebc9 Add a print oobabooga 2024-04-26 07:32:27 -0700
  • 272fc6a164 Add a retry mechanism to the model downloader oobabooga 2024-04-26 07:28:11 -0700
  • 70845c76fb Add back the max_updates_second parameter (#5937) oobabooga 2024-04-26 10:14:51 -0300
  • 6761b5e7c6 Improved instruct style (with syntax highlighting & LaTeX rendering) (#5936) oobabooga 2024-04-26 10:13:11 -0300
  • 8f4faf2096 Minor change for consistency oobabooga 2024-04-26 05:58:25 -0700
  • b037e7bc74 Lazy syntax highlighting oobabooga 2024-04-26 05:42:00 -0700
  • 0dc1d5322a wip oobabooga 2024-04-26 05:20:31 -0700
  • ae4a2d00e3 Merge branch 'oobabooga:main' into main Guanghua Lu 2024-04-26 12:30:42 +0800
  • 5808697545 Minor fix oobabooga 2024-04-25 17:46:35 -0700
  • faadeb0ea3 Revert "Update the button colors" oobabooga 2024-04-25 17:46:24 -0700
  • f177ce7dd2 Update the button colors oobabooga 2024-04-25 17:40:10 -0700
  • 1101a5aff9 Add the katex fonts oobabooga 2024-04-25 17:21:57 -0700
  • 5ffe6acb9b Add LaTeX rendering oobabooga 2024-04-25 17:16:40 -0700
  • 9c04365f54 Detect the airoboros-3_1-yi-34b-200k template oobabooga 2024-04-25 16:50:38 -0700
  • a2c2e1e8f4 Revive the max_updates_second parameter oobabooga 2024-04-25 13:36:31 -0700
  • 139becd4be fix: AutoAWQ from_quantized to from_pretrained Pedro Cavalcanti 2024-04-25 20:32:04 +0000
  • bde22a1e43 Add files oobabooga 2024-04-25 12:59:28 -0700
  • c8b7e497b9 wip oobabooga 2024-04-25 11:02:51 -0700
  • 80748f5505 wip oobabooga 2024-04-25 10:19:56 -0700
  • 19b7efc582 add one chalk Vasyanator 2024-04-25 20:02:04 +0400
  • 054c301173 translate that into English Vasyanator 2024-04-25 19:58:25 +0400
  • 76d73a145c Add a guide on converting to GGUF and quantization Vasyanator 2024-04-25 19:50:31 +0400
  • 8b1dee3ec8 Detect platypus-yi-34b, CausalLM-RP-34B, 34b-beta instruction templates oobabooga 2024-04-24 21:47:43 -0700
  • 4aa481282b Detect the xwin-lm-70b-v0.1 instruction template oobabooga 2024-04-24 17:02:20 -0700
  • 388eff042b add temporary workaround for problem when using Chromium marcel 2024-04-24 21:18:12 +0200
  • bcb4252322 Update gradio requirement from ==4.26.* to ==4.27.* dependabot[bot] 2024-04-24 16:59:39 +0000
  • 1eadc43a06 Bump aqlm[cpu,gpu] from 1.1.3 to 1.1.5 dependabot[bot] 2024-04-24 16:59:37 +0000
  • ad122361ea Merge pull request #5927 from oobabooga/dev snapshot-2024-04-28 oobabooga 2024-04-24 13:58:53 -0300
  • c9b0df16ee Lint oobabooga 2024-04-24 09:55:00 -0700
  • 4094813f8d Lint oobabooga 2024-04-24 09:53:41 -0700
  • 983c21ce20 Add meaningful error message when model url is empty Srivatsa Joshi 2024-04-24 21:49:14 +0530
  • 64e2a9a0a7 Fix the Phi-3 template when used in the UI oobabooga 2024-04-24 01:34:11 -0700
  • f0538efb99 Remove obsolete --tensorcores references oobabooga 2024-04-24 00:31:28 -0700
  • f3c9103e04 Revert walrus operator for params['max_memory'] (#5878) Colin 2024-04-24 00:09:14 -0400
  • c725d97368 nvidia docker: make sure gradio listens on 0.0.0.0 (#5918) Jari Van Melckebeke 2024-04-24 04:17:55 +0200
  • 9b623b8a78 Bump llama-cpp-python to 0.2.64, use official wheels (#5921) oobabooga 2024-04-23 23:17:05 -0300
  • c7c1a39992 Update README oobabooga 2024-04-23 19:16:06 -0700
  • b0d23bf46d Handle the cuda118 case oobabooga 2024-04-23 19:12:04 -0700
  • 2666a59b0e Bump llama-cpp-python to 0.2.64, use official wheels oobabooga 2024-04-23 18:51:36 -0700
  • 905706a827 Update Dockerfile Jari Van Melckebeke 2024-04-23 13:49:39 +0200
  • a1211f7620 update path for docker-compose.yaml in code example Jari Van Melckebeke 2024-04-23 13:31:46 +0200
  • 5d2e8f958b Update README.md Jari Van Melckebeke 2024-04-23 13:29:57 +0200
  • ba5001b3b1 Restore conditional assign of max_memory value Column01 2024-04-22 13:50:55 -0400
  • 5007367235 Merge pull request #1 from Column01/fix-model-params Colin 2024-04-22 13:47:06 -0400
  • 200e0dc197 Replace direct key access with .get methods Column01 2024-04-22 13:46:05 -0400
  • 4a388da98e Merge branch 'oobabooga:main' into main Colin 2024-04-22 13:40:19 -0400
  • fbf4d6996e fix handling of prefix with intentional space Eve 2024-04-22 04:26:06 +0000
  • af81277044 Merge 857a2ca5b3 into 0877741b03 Eve 2024-04-22 04:18:09 +0000
  • 857a2ca5b3 Merge remote-tracking branch 'origin/dev' into fix_trailing_spaces netrunnereve 2024-04-22 00:16:14 -0400
  • 81ea2783af update API documentation with examples to list/load models Joachim Chauveheid 2024-04-21 20:13:58 +0300
  • 55306aa4e1 feat: save chat template with instruction template A0nameless0man 2024-04-21 16:10:59 +0000
  • c93c867397 fix: grammar not support utf-8 A0nameless0man 2024-04-21 16:07:12 +0000
  • e5d608ed60 Update ChatML-format.json FartyPants (FP HAM) 2024-04-21 10:43:46 -0400
  • 7d40cb70cb Added ChatML-format.json in formats, since people are still puzzled FartyPants (FP HAM) 2024-04-21 10:41:14 -0400
  • b09badffea Merge 47a8d8b520 into 0877741b03 Stefan Daniel Schwarz 2024-04-21 10:26:19 -0400
  • baa4dde9b3 Remove BOS token in the template Touch-Night 2024-04-21 01:54:10 +0800
  • 8ad0222547 Correct Llama-v3 template Touch-Night 2024-04-21 01:12:33 +0800
  • b460d0d35b Add Llama 3 template Touch-Night 2024-04-20 20:11:50 +0800
  • 0877741b03 Bumped ExLlamaV2 to version 0.0.19 to resolve #5851 (#5880) Ashley Kleynhans 2024-04-20 00:04:40 +0200
  • f1e8f66a9a Bumped ExLlamaV2 to version 0.0.19 in the other requirements files to resolve #5851 Ashley Kleynhans 2024-04-19 17:40:36 +0200