
@TheNando
Last active June 30, 2025 18:33
Thanks, Ollama!

First steps

We will be covering running prebuilt models and connecting for code assistance. We will not be covering training new models.

Hardware requirements

  • Depends on the model, but in general: bring the beef.
  • A good CPU and plenty of RAM can do alright, but a GPU will do much better.
    • Ryzen 795xx w/ 64GB DDR5-6000 => 1.5 tokens/sec
    • RTX 3090 w/ 24GB GDDR6X => 2.8 tokens/sec
  • A system that can run Docker

Get Linux working with a modern Nvidia GPU

Might be easy. Might be hard. Nvidia driver support is spotty. I had to run both the installer and the first boot in graphics safe mode until I could get the drivers updated, and I even had to drop to an older kernel version to get working GPU and network drivers.
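On Ubuntu-family distros (an assumption — package names differ elsewhere), the driver dance usually looks something like this:

```shell
# List detected GPUs and the recommended proprietary driver
ubuntu-drivers devices

# Install the recommended driver automatically
sudo ubuntu-drivers autoinstall

# After a reboot, confirm the driver loaded and the GPU is visible
nvidia-smi
```

If `nvidia-smi` errors out after a kernel update, that's often the driver/kernel mismatch described above.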

Setup Docker

Install Docker Engine & Docker Desktop

While it is possible to install Ollama through the OS package manager, we are going to use Docker. This just seems like the most universal method, and this way we don't have to worry about any of the dependency versions. We could use a container-based package manager like Flatpak or Snap, but I want to avoid any sandbox issues. This should work in any Linux distro, macOS, or Windows, though Windows adds an extra layer with WSL and there may be issues getting the GPU to work.

https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
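Condensed from the Docker docs linked above (Ubuntu apt-repository method — check the link for other distros or if these steps have changed):

```shell
# Add Docker's official GPG key and apt repository
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

# Install Docker Engine plus the compose plugin we'll use later
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```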

Configuring resources

Docker ships with some conservative defaults. Let's bump them up a bit. My PC has 32 GB of RAM, so we can use more than the meager ~7.5 GB Docker set by default. In fact, without increasing this limit, I was getting errors trying to run any models. I opened Docker's settings and upped the RAM to ~16 GB.

The next issue is that, by default, Docker will not use the GPU. We'll have to install some software to change that. I followed the instructions under Nvidia GPU from the Docker hub here: https://hub.docker.com/r/ollama/ollama. But rather than start the container directly, we'll create a compose file to manage Ollama and the WebUI together.
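The install steps from that page boil down to roughly this (apt variant — see the Ollama Docker Hub page for other package managers):

```shell
# Install the NVIDIA Container Toolkit so containers can use the GPU
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Register the NVIDIA runtime with Docker and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```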

Other issues

I was getting a permission denied while trying to connect to the Docker daemon socket... error. I resolved it following these steps: https://www.hostinger.com/tutorials/how-to-fix-docker-permission-denied-error
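The usual fix amounts to adding your user to the `docker` group so you don't need `sudo` for every command:

```shell
# Add the current user to the docker group
sudo usermod -aG docker "$USER"

# Pick up the new group without logging out (or just log out and back in)
newgrp docker

# This should now work without sudo
docker ps
```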

Setup folder for compose and volume

https://peter-nhan.github.io/posts/Ollama-intro/
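Here's a minimal compose file sketch for running Ollama and Open WebUI together. The image tags, host ports, and volume paths are assumptions — adjust them to taste:

```yaml
# docker-compose.yaml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ./ollama:/root/.ollama        # model storage persists here
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia          # requires the NVIDIA Container Toolkit
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"                   # WebUI at http://localhost:3000
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - ./open-webui:/app/backend/data
    depends_on:
      - ollama
```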

[Optional] Setup network access

https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-configure-ollama-server
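Per that FAQ, Ollama binds to localhost unless you set `OLLAMA_HOST`. In the compose file it would go under the ollama service's `environment:` block; outside Docker it's just an environment variable:

```shell
# Listen on all interfaces so other machines on the LAN can reach Ollama
# (only do this on a network you trust)
OLLAMA_HOST=0.0.0.0 ollama serve
```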

With the compose file in place, bring the stack up:

`docker compose up -d`


Introducing Ollama WebUI

Create an account and log in

Get a model

Let's try a popular one to start with - https://ollama.com/library/deepseek-r1

Others: `ollama run incept5/llama3.1-claude`, `ollama run qwen2.5-coder:7b`
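Since Ollama lives inside a container here, the `ollama` CLI runs through `docker exec` (the container name `ollama` matches the compose file — adjust if yours differs):

```shell
# Download a model into the container's volume
docker exec -it ollama ollama pull deepseek-r1

# Start an interactive chat session
docker exec -it ollama ollama run qwen2.5-coder:7b
```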

Understand model limitations

This is where it gets crazy: number of parameters, weight precision (4/8/16/32-bit quantization), and context window size all interact. A decent overview: https://www.reddit.com/r/SillyTavernAI/comments/1j9jkck/im_an_llm_idiot_confused_by_all_the_options_and/
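A rough rule of thumb for whether a model fits: the weights alone take roughly (parameters × bits per weight ÷ 8) bytes, plus overhead for the KV cache and runtime. A back-of-envelope sketch, with the 7B / Q4 / ~20%-overhead numbers being assumptions (real usage grows with context length):

```shell
# Estimate VRAM for a quantized model's weights plus overhead
params_b=7     # parameters, in billions (assumed 7B model)
bits=4         # bits per weight (Q4 quantization)

# GB = billions of params * bytes per weight, with ~20% overhead
vram_gb=$(awk -v p="$params_b" -v b="$bits" 'BEGIN { printf "%.1f", p * b / 8 * 1.2 }')
echo "$vram_gb"
```

So a 7B model at Q4 wants roughly 4 GB — comfortable on a 24 GB RTX 3090, tight on smaller cards.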

Coding Assistant

VS Code and Continue

Get the extension - https://marketplace.visualstudio.com/items?itemName=Continue.continue

Change the `apiBase` property on the object in the `"models"` array to point at wherever you serve Ollama (see https://docs.continue.dev/reference).
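A sketch of the relevant fragment of Continue's config — the title and hostname are assumptions; `apiBase` should match wherever your Ollama container is reachable:

```json
{
  "models": [
    {
      "title": "Qwen 2.5 Coder (local)",
      "provider": "ollama",
      "model": "qwen2.5-coder:7b",
      "apiBase": "http://localhost:11434"
    }
  ]
}
```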
