Skip to content

Instantly share code, notes, and snippets.

View nerdalert's full-sized avatar
🐈
🦀 🐿

Brent Salisbury nerdalert

🐈
🦀 🐿
View GitHub Profile
$ ./scripts/validate.sh  all
Discovering gateway address...
  Found LoadBalancer hostname: http://a38603e70f1d34daa841061646a16427-402819449.us-east-1.elb.amazonaws.com

==========================================
  Iteration 1: httpbin.org (no auth)
==========================================

Resources:

Baseline Benchmark Results - Feb 24

Run metadata

  • Executed at: 2026-02-24 06:42:10 UTC
  • Repo: ~/vanilla/subscription-maas-413/maas-benchmark-vanilla/maas-benchmark
  • Target host: maas.apps.rosa.vnthh-zgsnt-wuf.rrcb.p3.openshiftapps.com
  • Protocol: https
  • Model ID detected from MAAS API: facebook/opt-125m
  • Model path detected from MAAS API: /llm/facebook-opt-125m-simulated
  • k6 version: k6 v1.5.0 (commit/7961cefa12, go1.25.5, linux/amd64)

MAAS Benchmark - subscription PR

Run metadata

  • Executed at: 2026-02-24 05:13:54 UTC
  • Repo: ~/vanilla/subscription-maas-413/maas-benchmark-vanilla/maas-benchmark
  • Target host: maas.apps.rosa.uu2gf-j2mrj-mmg.iqgw.p3.openshiftapps.com
  • Protocol: https
  • Model ID detected from MAAS API: facebook/opt-125m
  • Model URL detected from MAAS API: http://maas.apps.rosa.uu2gf-j2mrj-mmg.iqgw.p3.openshiftapps.com/llm/facebook-opt-125m-simulated
  • k6 version: k6 v1.5.0 (commit/7961cefa12, go1.25.5, linux/amd64)

MaaS Baseline Benchmark Feb 20, 2026

Run metadata

  • Executed at: 2026-02-21 04:14:07 UTC
  • Repo: ~/perf-k6/maas-benchmark
  • Target host: maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com
  • Protocol: https
  • Model ID detected from MAAS API: facebook/opt-125m
  • k6 version: k6 v1.5.0
Msg @clusterbot in Slack run:
rosa create 4.20.6

models-as-a-service$ ./scripts/deploy.sh --operator-type odh
[INFO] ===================================================
[INFO]   Models-as-a-Service Deployment
[INFO] ===================================================
#### ODH MaaS Deploy Fix ####

#  Step 1: Run the deploy script (it will hang — kill it after it says "Waiting for operator webhook")

  ./scripts/deploy.sh --operator-type odh
  # Wait until you see: "Waiting for deployment/opendatahub-operator-controller-manager in opendatahub-operator-system..."
  # Then Ctrl+C — the operator is actually running in the "opendatahub" namespace

Setup

HOST="maas.apps.brent.pcbk.p1.openshiftapps.com"
TOKEN="eyJhbGciOiJSUzI1NiIsImtpZCI6IjVpZ0pFZGs4R0tFWExERnI2Nkg5bFExeWtwWUw5anhTd3M3ZXFqMFlFM1kifQ.eyJhdWQiOlsibWFhcy1kZWZhdWx0LWdhdGV3YXktc2EiXSwiZXhwIjoxNzcyMjU3NDk2LCJpYXQiOjE3NzEzOTM0OTYsImlzcyI6Imh0dHBzOi8vcmgtb2lkYy5zMy51cy1lYXN0LTEuYW1hem9uYXdzLmNvbS8yN3Bxa3F2ZnVxMG8zNXM5NmEwbWEyMnBzbzZjNDcxMyIsImp0aSI6ImZkZTBiYzM2LWE5YjAtNDA0NC1iZDBmLTc1Mzk4NTNkMmQ2YiIsImt1YmVybmV0ZXMuaW8iOnsibmFtZXNwYWNlIjoibWFhcy1kZWZhdWx0LWdhdGV3YXktdGllci1mcmVlIiwic2VydmljZWFjY291bnQiOnsibmFtZSI6ImNsdXN0ZXItYWRtaW4tYjA5MDY3YTYiLCJ1aWQiOiI3NzEwMjZiYy0xZTkwLTQ2NDctYTllZC1lNjkwNDI0OTE0ZGUifX0sIm5iZiI6MTc3MTM5MzQ5Niwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Om1hYXMtZGVmYXVsdC1nYXRld2F5LXRpZXItZnJlZTpjbHVzdGVyLWFkbWluLWIwOTA2N2E2In0.CPivrhAkXGxfyB46AT5EEje2hgTtoFqbr2u2Giy-63eNxKF0kW8r__cf1wIVqOb3r8HsW_tnk7Xn4gCjcaZ8ZTwo3TRLkrMsxaj_lqPFDkzHWdl3aO6bc5OrWjsRxhqKyhKaZqdMU2ZTIaTFO2BbzxhFB5WqZ351oCGOlXmLVxDtQJRqYJU7ttLFR8mdH_5Xu0SJMPyz9P-HyTriBaRfb9HOTWzAPGsP9ArbWBGeB_soTswiO

Validated Backend Control Plane Prototype #31

Output from deploying: kubernetes-sigs/wg-ai-gateway#31

$> curl -s http://172.18.255.240/v1/models | jq
{
  "object": "list",
  "data": [
    {

TokenRateLimitPolicy demo output

Applies TRLP to the MaaS gateway for the vSR https://${MAAS_HOST}/v1/chat/completions route.

$> export MAAS_HOST="maas.$(oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')"

  export ACCESS_TOKEN=$(curl -sSk --oauth2-bearer "$(oc whoami -t)" \
    --json '{"expiration": "10m"}' \
    "https://${MAAS_HOST}/maas-api/v1/tokens" | jq -r .token)

Classifier on GPU deploy support stdout

$ ./deploy/openshift/deploy-to-openshift.sh --kserve --simulator --classifier-gpu
[SUCCESS] Logged in as cluster-admin
[INFO] Creating namespace: vllm-semantic-router-system
namespace/vllm-semantic-router-system configured
[SUCCESS] Namespace ready
[INFO] Installing KServe and LLMInferenceService CRDs...
[INFO] InferenceService CRD already installed.