Intro

TODO: Logo: https://openai.com TODO: Logo: https://anthropic.com TODO: Logo: https://deepmind.google

Every prompt you send to ChatGPT, Claude, or Gemini runs on someone else's GPUs. Your data, your code, your company's secrets, all flowing through infrastructure you don't control. For most people, that's fine. But if you're in healthcare, finance, government, or anywhere that compliance actually matters, "fine" isn't good enough.

Here's the thing. Running AI models on your own infrastructure sounds like it should be straightforward. It's not. GPUs are brutally expensive. Get the setup wrong and you're burning hundreds of thousands of dollars on mistakes. Get it right, and you have Inference-as-a-Service that any team in your company can use, with your data never leaving your network.

Intro

I've spent the last few months building custom AI agents for internal teams, and I can tell you: the gap between using a general-purpose coding assistant and having an agent that truly understands your company is enormous. We're talking about the difference between an intern who's brilliant but knows nothing about your business, and a senior engineer who's been with you for years.

In this video, I'll walk you through the complete architecture for building your own AI agent. Every component: system context, tools, knowledge retrieval, multi-agent orchestration, security, observability, and cost optimization. By the end, you'll have a blueprint you can actually follow. Not just theory, but the real decisions you'll face and how to make them.

Why Build Your Own Agent?

Let's say you're working in a company with a bunch of software engineers, and everyone wants to use AI to help with coding, operations, debugging, you name it. The obvious move is to just hand everyone Claude Code, Cursor, or Windsu

You are analyzing documentation to identify all content that can be validated through testing. Your goal is to find every section containing factual claims, executable instructions, or verifiable information.

File to Analyze

File: {filePath} Session: {sessionId}

Core Testing Philosophy

Most technical documentation is testable through two validation approaches:

@vfarcic
vfarcic / vectordb.md
Created August 2, 2025 10:56
Vector Databases, Embeddings, and RAG: A Practical Guide - DevOps AI Toolkit

Vector Databases, Embeddings, and RAG: A Practical Guide

Introduction

  • What are Vector Databases, Embeddings, and RAG?
  • Why they matter in modern AI applications
  • How the DevOps AI Toolkit uses these technologies for intelligent pattern matching

Core Concepts

Manual (Without GitOps)

go test -tags unit

docker image build --tag ghcr.io/vfarcic/silly-demo:v1.2.3 --push .

yq --inplace ".spec.template.spec.containers[0].image = \"ghcr.io/vfarcic/silly-demo:v1.2.3\"" staging/deployment.yaml

kubectl --namespace staging apply --filename staging/deployment.yaml
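
The manual steps above can be tied together in a small helper. What follows is only a sketch under a few assumptions: `release_steps` is a hypothetical function name, the Go build-tag flag is `-tags`, tests run against all packages (`./...`), the image is built from the current directory, and each environment keeps its manifest in a directory named after its Namespace. It prints the commands instead of running them, so the sequence can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: prints the manual release steps for a tag and an environment.
# Nothing is executed here; the output is meant to be reviewed before it is run.
release_steps() {
    tag="$1"   # image tag, for example v1.2.3
    env="$2"   # assumed to be both the directory and the Namespace, for example staging
    image="ghcr.io/vfarcic/silly-demo:${tag}"

    # 1. Run the tests guarded by the "unit" build tag.
    printf '%s\n' "go test -tags unit ./..."
    # 2. Build the image from the current directory and push it.
    printf '%s\n' "docker image build --tag ${image} --push ."
    # 3. Point the manifest at the new image.
    printf '%s\n' "yq --inplace '.spec.template.spec.containers[0].image = \"${image}\"' ${env}/deployment.yaml"
    # 4. Apply the same manifest to the matching Namespace.
    printf '%s\n' "kubectl --namespace ${env} apply --filename ${env}/deployment.yaml"
}

release_steps v1.2.3 staging
```

Deriving both the directory and the Namespace from one argument keeps the edited manifest and the applied manifest from drifting apart, which is an easy mistake to make when the commands are typed by hand.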

Introduction

Imagine if you could create, for people in your company, a platform that would provide them with the same experience they have when working with AWS, Google Cloud, Azure, or any other public cloud provider. Imagine if there were a service for everything they do.

Do you need a database that works exactly as we expect it to work in this company with all the security, backup, compliance, and other policies we have?

Well...

There is a service for that.

Intro

Crossplane v2 is here!

We got some very cool features that I want to go through.

We'll see the changes to Crossplane Composition schemas, a shift to Namespace-scoped resources, direct composition of any resources without the need to rely only on Crossplane Managed resources, new API versions, removal of deprecated features, and more.

I'm sure that, by the end of this walkthrough, you'll see that the two most requested features are here and that authoring Compositions is now easier.

Intro

TODO: Volume is now lower

TODO: Show images conversation-* (6 of them) by adding them one on top or beside each other starting from when I say "like those" (it's like that one in the text) and keep them until the end of the paragraph. Show them starting from conversation-06 moving in descending order towards conversation-01 since that is the most important one.

In recent months, I have seen a lot of conversations and questions sparked by the release of kro, like the one in the picture. Many of them compare kro with Helm. Some people think that kro is, more or less, doing the same work as Helm. Others think that it is a different syntax that accomplishes the same result as Helm. Some are asking whether kro is a replacement for Helm.

Intro

TODO: Volume is now lower

"Crossplane is too complicated!" "Crossplane is only for infrastructure! I need something else for applications."

I hear those and other similar statements very often, so I decided to clarify a few things and, hopefully, eliminate some misconceptions.

To be more precise, today I want to debunk one misconception about Crossplane and, at the same time, show how Crossplane addressed one semi-legitimate complaint in the recent v2 release.