This is my previous Local LLM Stack environment, which I wanted to share. It's built for Windows machines with an NVIDIA GPU and uses Docker Compose for containerization.
The stack provides a fully portable, GPU-accelerated local AI environment using the following services (a sketch of the Compose file follows the list):
- Ollama: The runtime for pulling, serving, and managing local large language models (LLMs) using your NVIDIA GPU. 📦
- Open WebUI: A feature-rich, self-hosted web interface to interact with the models served by Ollama. 🌐
- Caddy: A powerful reverse proxy that manages HTTPS for the entire stack. 🔒
- Watchtower: Configured for automatic updates of all services. 🔄
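Below is a minimal sketch of what a `docker-compose.yml` for this kind of stack might look like. It is not my exact configuration: the service names, volume names, update interval, and port mappings are assumptions chosen for illustration.

```yaml
# docker-compose.yml -- illustrative sketch of the stack described above.
# Service names, volumes, and the Watchtower interval are assumptions.
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama-data:/root/.ollama            # persistent model storage
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia                 # expose the NVIDIA GPU to the container
              count: all
              capabilities: [gpu]
    restart: unless-stopped

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434  # reach Ollama over the Compose network
    volumes:
      - open-webui-data:/app/backend/data
    depends_on:
      - ollama
    restart: unless-stopped

  caddy:
    image: caddy:latest
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro  # reverse-proxy configuration
      - caddy-data:/data                     # TLS certificates
    depends_on:
      - open-webui
    restart: unless-stopped

  watchtower:
    image: containrrr/watchtower
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock  # required to update other containers
    command: --cleanup --interval 86400            # check for new images once a day
    restart: unless-stopped

volumes:
  ollama-data:
  open-webui-data:
  caddy-data:
```

Caddy then needs a small `Caddyfile` to route incoming HTTPS traffic to Open WebUI. The hostname here is a placeholder; with `localhost`, Caddy issues a locally trusted certificate automatically.

```yaml
# Caddyfile (placeholder hostname) -- proxy all traffic to Open WebUI,
# which listens on port 8080 inside its container.
localhost {
    reverse_proxy open-webui:8080
}
```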