@vitorcalvi
Created December 30, 2025 07:00
Ollama Rocm Dokploy
version: "3.8"
services:
  ollama:
    image: ollama/ollama:rocm
    restart: unless-stopped
    privileged: true
    dns:
      - 8.8.8.8
      - 1.1.1.1
    deploy:
      resources:
        limits:
          cpus: "14" # Leaves 1 core for the host system
          memory: 24g # Keep high for APU shared memory
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    group_add:
      - video
    shm_size: "8gb"
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    environment:
      # -----------------------------------------------------
      # APU STABILITY
      # -----------------------------------------------------
      - LLAMA_HIP_UMA=1
      - HSA_OVERRIDE_GFX_VERSION=11.0.0
      - HSA_ENABLE_SDMA=0
      - GPU_MAX_HW_QUEUES=2
      # -----------------------------------------------------
      # SPEED OPTIMIZATIONS (for visible reasoning)
      # -----------------------------------------------------
      - OLLAMA_FLASH_ATTENTION=1 # CRITICAL: makes long reasoning chains generate faster
      - OLLAMA_KV_CACHE_TYPE=q8_0 # Uses less memory bandwidth than f16, keeping tokens/s high
      - OLLAMA_NUM_PARALLEL=4 # Allows multiple requests to utilize the available CPU cores
      - OLLAMA_MAX_LOADED_MODELS=1 # Prevents model swapping
    networks:
      - dokploy-network
volumes:
  ollama:
networks:
  dokploy-network:
    external: true
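Once the stack is deployed, the setup can be sanity-checked from the host. This is a hedged sketch assuming the service runs on the same machine and the port mapping above is in effect; the model name (llama3.2) is only an illustrative placeholder, substitute whatever model you actually use.

```shell
# Confirm the container sees the GPU device nodes passed through in the compose file
docker compose exec ollama ls /dev/kfd /dev/dri

# Pull an example model through the published Ollama API port
curl http://localhost:11434/api/pull -d '{"name": "llama3.2"}'

# Run a quick non-streaming generation to verify inference works end to end
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

If the first command fails, the `devices:` passthrough or the `video` group membership is usually the culprit; if generation is slow, check the container logs for whether ROCm (rather than CPU fallback) was actually initialized.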