Install Ollama with VS Code on Omarchy on a hybrid NVIDIA laptop

Local Copilot with Ollama on Omarchy (Arch Linux)

This document explains how to configure VS Code Copilot Chat to use a local LLM running on your NVIDIA GPU, optimized for Ruby on Rails development.


1. NVIDIA Driver Setup (Hybrid Intel + NVIDIA)

Run the following commands:

sudo pacman -S nvidia nvidia-utils nvidia-settings linux-headers

Create DRM config:

sudo vim /etc/modprobe.d/nvidia.conf

Add:

options nvidia_drm modeset=1

Rebuild and reboot:

sudo limine-mkinitcpio -P
sudo reboot

Verify:

nvidia-smi
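
If nvidia-smi lists the GPU and driver version, you can also confirm that kernel mode setting took effect. A quick check, assuming the nvidia_drm module is loaded:

# Should print "Y" when the modeset option from /etc/modprobe.d/nvidia.conf is active
cat /sys/module/nvidia_drm/parameters/modeset

# The NVIDIA kernel modules should all be listed
lsmod | grep nvidia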

2. Install CUDA-enabled Ollama

sudo pacman -S ollama-cuda
sudo systemctl enable ollama
sudo systemctl start ollama

Check version:

ollama --version
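
Beyond the version string, it is worth confirming the service is running and answering on its default port (a quick check, assuming a stock install listening on 11434):

systemctl status ollama --no-pager
curl http://localhost:11434/api/version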

3. Force Ollama to Use NVIDIA GPU

On hybrid systems, explicitly bind Ollama to the discrete GPU.

Edit the service:

sudo systemctl edit ollama

Add:

[Service]
Environment=CUDA_VISIBLE_DEVICES=0

Reload and restart:

sudo systemctl daemon-reload
sudo systemctl restart ollama
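
To confirm the drop-in took effect and the GPU was detected, inspect the service's effective environment and its startup log (exact log wording varies between Ollama releases):

# Should include CUDA_VISIBLE_DEVICES=0
systemctl show ollama --property=Environment

# Look for CUDA / GPU detection messages during startup
journalctl -u ollama -b --no-pager | tail -n 50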

4. Install a Rails-Capable Model

ollama pull qwen3:14b
# or lighter
ollama pull qwen3:8b
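
Once the pull finishes, verify what is installed and how large it is:

ollama list

# Inspect a model's parameters, context window and prompt template
ollama show qwen3:14b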

5. Verify GPU Inference

Terminal A:

nvidia-smi -l 1

Terminal B:

ollama run qwen3:14b

You should see the ollama process appear in nvidia-smi and claim several GB of VRAM while the model loads.
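
Ollama can report this directly as well: ollama ps shows which models are loaded and whether they are fully resident on the GPU.

# One-off prompt, no interactive session needed
ollama run qwen3:14b "Write a Ruby one-liner that reverses a string"

# While the model is still loaded, check where it lives (aim for 100% GPU)
ollama ps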


6. Install VS Code + Copilot

sudo pacman -S code
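
Note that the code package in the official repos is Code - OSS, which pulls extensions from Open VSX. If the Copilot extensions do not show up there, install Microsoft's build from the AUR instead (this assumes an AUR helper such as yay is available):

# Alternative: proprietary VS Code build from the AUR
yay -S visual-studio-code-bin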

In VS Code Extensions, install:

  • GitHub Copilot
  • GitHub Copilot Chat

Sign in to GitHub.
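
The same extensions can also be installed from the terminal with the code CLI (extension IDs as published on the Marketplace):

code --install-extension GitHub.copilot
code --install-extension GitHub.copilot-chat

# Verify both are present
code --list-extensions | grep -i copilot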


7. Configure Copilot to Use Ollama

In VS Code:

  • Open Copilot Chat
  • Model Dropdown → Manage Models
  • Provider: Ollama
  • Model: qwen3:14b
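
Copilot Chat reaches Ollama over its local HTTP API, so the model tag must match what ollama list reports exactly. A quick sketch to confirm the endpoint is reachable, assuming the default port 11434:

# Models the "Manage Models" dialog should offer
curl http://localhost:11434/api/tags

# Minimal chat round-trip against the same model
curl http://localhost:11434/api/chat -d '{
  "model": "qwen3:14b",
  "messages": [{"role": "user", "content": "Say hi"}],
  "stream": false
}'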

Final Architecture

Component       Runs on
OS / UI         Intel iGPU
Ollama / LLM    NVIDIA dGPU
Copilot Chat    Local, via Ollama

Sanity Checks

glxinfo | grep "OpenGL renderer"
nvidia-smi

Expected:

  • glxinfo reports an Intel renderer, i.e. the desktop stays on the integrated GPU
  • the ollama process shows up in nvidia-smi during inference
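
For a single combined check, run a prompt and sample the GPU's compute processes while it executes (a sketch; adjust the model tag to whatever you pulled):

# Fire a prompt in the background, then list processes holding VRAM
ollama run qwen3:14b "Explain Rails strong parameters in one sentence" &
sleep 5
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
wait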

Result:

Fully local, GPU-accelerated Copilot for Ruby on Rails.
