Joey Hafner
What we have so far
- Ollama loads and serves a few models via API.
- Ollama itself doesn't have a UI. CLI and API only.
- The API can be accessed at https://api.ollama.jafner.net.
- Ollama running as configured supports ROCm (GPU acceleration).
- Configured models are described here, and
- Run Ollama with:
HSA_OVERRIDE_GFX_VERSION=11.0.0 OLLAMA_HOST=192.168.1.135:11434 OLLAMA_ORIGINS="app://obsidian.md*" OLLAMA_MAX_LOADED_MODELS=0 ollama serve
- Open-webui provides a pretty web interface for interacting with Ollama.
- The web UI can be accessed at https://ollama.jafner.net.
- The web UI is protected by Traefik's `lan-only` rule, as well as its own authentication layer.
- Run open-webui with:
cd ~/Projects/LLMs/open-webui && docker compose up -d && docker compose logs -f
- Then open the page and log in.
- Connect the frontend to the ollama instance by opening the settings (top-right), clicking "Connections", and setting "Ollama Base URL" to "https://api.ollama.jafner.net". Hit refresh and begin using.
- SillyTavern provides a powerful interface for building and using characters.
- Run SillyTavern with:
cd ~/Projects/LLMs/SillyTavern && ./start.sh
- Oobabooga provides a more powerful web UI than open-webui, but it's less pretty.
- Run Oobabooga with:
cd ~/Projects/LLMs/text-generation-webui && ./start_linux.sh
- Requires the following environment variables be set in `one_click.py` (right after import statements):
# Assumes one_click.py has already imported os (these lines go right after its imports).
os.environ["ROCM_PATH"] = '/opt/rocm'
os.environ["HSA_OVERRIDE_GFX_VERSION"] = '11.0.0'
os.environ["HCC_AMDGPU_TARGET"] = 'gfx1100'
# Python does not expand $PATH inside a string, so prepend to the current value instead.
os.environ["PATH"] = '/opt/rocm/bin:' + os.environ.get("PATH", "")
os.environ["LD_LIBRARY_PATH"] = '/opt/rocm/lib:' + os.environ.get("LD_LIBRARY_PATH", "")
os.environ["CUDA_VISIBLE_DEVICES"] = '0'
os.environ["HCC_SERIALIZE_KERNEL"] = '0x3'
os.environ["HCC_SERIALIZE_COPY"] = '0x3'
os.environ["HIP_TRACE_API"] = '0x2'
os.environ["HF_TOKEN"] = '<my-huggingface-token>'
- Requires the following environment variable be set in `start_linux.sh` for access to non-public model downloads:
# config
HF_TOKEN="<my-huggingface-token>"
That's where we're at.
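Since Ollama is API-only, here's a minimal sketch of poking it from Python with nothing but the standard library. It assumes the API base URL listed above and Ollama's `/api/tags` (list models) and `/api/generate` (single prompt) endpoints. The function names are made up for illustration, and the model name in the usage note is a placeholder, not one of the configured models.

```python
import json
import urllib.request

# Base URL taken from the notes above; swap in your own host if it differs.
OLLAMA_BASE_URL = "https://api.ollama.jafner.net"


def build_generate_request(model, prompt, base_url=OLLAMA_BASE_URL):
    """Build (but do not send) a non-streaming POST to Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_models(base_url=OLLAMA_BASE_URL, timeout=10):
    """Return the names of the models the server reports via /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return [m["name"] for m in json.load(resp)["models"]]


def generate(model, prompt, base_url=OLLAMA_BASE_URL, timeout=300):
    """Send one prompt and return the generated text."""
    req = build_generate_request(model, prompt, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

Something like `print(list_models())` is a quick way to check the server is up before pointing a frontend at it. After that, `generate("<model-name>", "Say hello.")` should return the reply as a plain string.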
Set Up Models Directory
- Navigate to the source directory with all models:
cd ~/"Nextcloud/Large Language Models/GGUF/"
- Hard-link each file into the docker project's models directory:
for model in ./*; do ln "$(realpath "$model")" ~/Git/docker-llm-amd/models/"$(basename "$model")"; done
- Note that these must be hard links, not symlinks, or they will not be passed properly into containers.
- Launch ollama:
docker compose up -d ollama
- Create models defined by the modelfiles:
docker compose exec -dit ollama /modelfiles/.loadmodels.sh
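The linking step can also be done from Python, which sidesteps the shell quoting around the space-filled Nextcloud path. This is a rough sketch, not part of the repo: `hardlink_models` and its `pattern` parameter are hypothetical names, and it assumes both directories live on the same filesystem (hard links can't cross filesystems).

```python
import os
from pathlib import Path


def hardlink_models(source_dir, target_dir, pattern="*"):
    """Hard-link every file matching `pattern` from source_dir into target_dir.

    Returns the list of newly created link paths. Existing targets are skipped.
    """
    source = Path(source_dir).expanduser().resolve()
    target = Path(target_dir).expanduser().resolve()
    target.mkdir(parents=True, exist_ok=True)

    # Hard links cannot cross filesystems, so fail early with a clear message.
    if source.stat().st_dev != target.stat().st_dev:
        raise OSError(f"{source} and {target} are on different filesystems")

    created = []
    for model in sorted(source.glob(pattern)):
        if not model.is_file():
            continue  # skip subdirectories
        link = target / model.name
        if link.exists():
            continue  # already linked on a previous run
        os.link(model, link)
        created.append(link)
    return created
```

With the paths from the steps above, that would be `hardlink_models("~/Nextcloud/Large Language Models/GGUF", "~/Git/docker-llm-amd/models")`. Re-running it is safe because files that already exist in the target are left alone.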
Roadmap
- Set up StableDiffusion-web-UI.
- Get characters in SillyTavern behaving as expected.
  - Repetition issues.
  - Obsession with certain parts of prompt.
  - Refusals.
- Set up something for character voices.
- Set up Extras for SillyTavern.
Notes
- So many of these projects use Python with its various versions and dependencies and shit.
- Always use a Docker container or virtual environment.
- It's like a condom.
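For the virtual environment route, a minimal sketch using Python's built-in `venv` module looks something like this. The helper name and the `.venv` directory name are assumptions for illustration, not something any of the projects above require.

```python
import venv
from pathlib import Path


def make_project_venv(project_dir, name=".venv", with_pip=True):
    """Create an isolated virtual environment inside project_dir and return its path."""
    env_dir = Path(project_dir).expanduser() / name
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

After something like `make_project_venv("~/Projects/LLMs/SillyTavern")`, activate it with `source .venv/bin/activate` from inside that directory before installing anything.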