@nerdalert
Created February 23, 2026 16:53
MaaS Baseline Benchmark Feb 20, 2026

Run metadata

  • Executed at: 2026-02-21 04:14:07 UTC
  • Repo: ~/perf-k6/maas-benchmark
  • Target host: maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com
  • Protocol: https
  • Model ID detected from MAAS API: facebook/opt-125m
  • k6 version: k6 v1.5.0
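The model ID above was read from the MAAS API rather than hardcoded. As a hedged sketch of that step (the endpoint path and response shape are assumptions — an OpenAI-style `/v1/models` listing — and may differ on the real MaaS route), a canned response stands in for the live call:

```shell
# Canned stand-in for the live model-listing response; replace the cat with
# a curl against the real MaaS models endpoint when running for real.
cat > /tmp/models.json <<'EOF'
{"data":[{"id":"facebook/opt-125m","object":"model"}]}
EOF
# Pull the first model ID out of the listing.
jq -r '.data[0].id' /tmp/models.json
# → facebook/opt-125m
```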

Setup done

  • Created 5 benchmark tokens via:
    • POST https://$HOST/maas-api/v1/tokens using oc whoami -t
  • Wrote token file expected by k6 script:
    • tokens/all/all_tokens.json
    • Free tokens: 5
    • Premium tokens: 0

Commands run

# Baseline single-user
k6 run \
  -e HOST="maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="facebook/opt-125m" \
  -e BURST_VUS=1 \
  -e BURST_ITERATIONS=20 \
  --summary-export=results/feb20-baseline-burst-1vu-summary.json \
  k6/maas-performance-test.js

# Baseline small concurrency
k6 run \
  -e HOST="maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="facebook/opt-125m" \
  -e BURST_VUS=5 \
  -e BURST_ITERATIONS=100 \
  --summary-export=results/feb20-baseline-burst-5vu-summary.json \
  k6/maas-performance-test.js

Additional diagnostic run:

k6 run \
  -e HOST="maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="facebook/opt-125m" \
  -e BURST_VUS=3 \
  -e BURST_ITERATIONS=30 \
  --summary-export=results/feb20-baseline-burst-3vu-summary.json \
  k6/maas-performance-test.js

Results

| Test | HTTP reqs | Success rate | Failed req rate | p95 latency (ms) | Auth failure checks |
| --- | --- | --- | --- | --- | --- |
| 1 VU, 20 iters | 20 | 1.00 | 0.00 | 207.52 | 0 |
| 3 VUs, 30 iters | 30 | 0.4333 | 0.5667 | 33.03 | 17 |
| 5 VUs, 100 iters | 100 | 0.10 | 0.90 | 34.53 | 90 |
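The last column lines up with the failure rates: multiplying total requests by the failed-request rate reproduces the auth failure check counts for the two degraded rows.

```shell
# Cross-check the 3 VU and 5 VU rows: failed requests = http_reqs * http_req_failed.
# Values are copied from the table above; %d truncates the float products.
awk 'BEGIN{printf "%d %d\n", 30*0.5667, 100*0.90}'
# → 17 90
```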

Observations

  • Single-user run is healthy (100% success).
  • At 3 VUs and higher, auth-related failures appear quickly (free_auth_failure check failures in k6 output).
  • The 5 VU run completed but crossed the k6 thresholds (http_req_failed, success_rate), with heavy authentication failures.
  • Latency stayed low in failed runs because many requests failed fast (auth failure), so latency alone is not representative there.
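The per-check failure counts can be pulled straight out of the exported summaries. This is a sketch assuming the k6 v1.x `--summary-export` schema, where checks sit under `root_group.checks` keyed by name with `passes`/`fails` counters; a minimal sample document (values mirroring the 5 VU row) stands in for the real summary file here:

```shell
# Stand-in for results/feb20-baseline-burst-5vu-summary.json; point jq at the
# real export in practice. The root_group.checks layout is an assumption about
# the k6 summary schema — adjust the path if your export differs.
cat > /tmp/sample-summary.json <<'EOF'
{"root_group":{"checks":{"free_auth_failure":{"passes":10,"fails":90}}}}
EOF
jq -r '.root_group.checks.free_auth_failure | "passes=\(.passes) fails=\(.fails)"' /tmp/sample-summary.json
# → passes=10 fails=90
```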

Artifacts

  • feb20-maas-benchmark-baseline.md
  • tokens/all/all_tokens.json
  • results/feb20-baseline-burst-1vu-summary.json
  • results/feb20-baseline-burst-1vu.log
  • results/feb20-baseline-burst-3vu-summary.json
  • results/feb20-baseline-burst-3vu.log
  • results/feb20-baseline-burst-5vu-summary.json
  • results/feb20-baseline-burst-5vu.log

Re-run exact baseline (for PR comparison)

cd ~/perf-k6/maas-benchmark
set -euo pipefail

# 1) Resolve MAAS host and model
CLUSTER_DOMAIN=$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')
HOST="maas.${CLUSTER_DOMAIN}"
MODEL_NAME="facebook/opt-125m"

# 2) Recreate token set (5 free tokens)
mkdir -p tokens/free tokens/all results
rm -f tokens/free/*.json tokens/all/*.json

USER_TOKEN=$(oc whoami -t)
for i in 1 2 3 4 5; do
  RESP=$(curl -sSk \
    -H "Authorization: Bearer ${USER_TOKEN}" \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"expiration":"30m"}' \
    "https://${HOST}/maas-api/v1/tokens")

  TOK=$(echo "$RESP" | jq -r '.token // empty')
  [ -n "$TOK" ] || { echo "Failed token creation: $RESP"; exit 1; }

  cat > "tokens/free/benchuser-free-${i}.json" <<JSON
{
  "token": "${TOK}",
  "expiration": "30m",
  "expiresAt": 0,
  "user_id": "benchuser-free-${i}",
  "tier": "free"
}
JSON
done

jq -s '.' tokens/free/*.json > tokens/all/free_tokens.json
echo '[]' > tokens/all/premium_tokens.json
jq -s '{"free": .[0], "premium": .[1]}' \
  tokens/all/free_tokens.json \
  tokens/all/premium_tokens.json > tokens/all/all_tokens.json
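Before launching k6, a quick sanity check on the merged token file avoids chasing auth failures caused by a malformed token set. A sketch, assuming the k6 script expects exactly the free/premium layout written above; it is demonstrated on a stand-in file, so point FILE at tokens/all/all_tokens.json for the real run:

```shell
# Hedged sanity check: 5 free tokens, 0 premium, and every free entry carries
# a non-empty token string. jq -e sets the exit status, so the check
# cooperates with set -e.
FILE=/tmp/all_tokens_sample.json
cat > "$FILE" <<'EOF'
{"free":[{"token":"a","tier":"free"},{"token":"b","tier":"free"},
         {"token":"c","tier":"free"},{"token":"d","tier":"free"},
         {"token":"e","tier":"free"}],"premium":[]}
EOF
jq -e '(.free | length) == 5
       and (.premium | length) == 0
       and all(.free[]; .token | length > 0)' "$FILE"
# → true
```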

# 3) Run exact baseline tests
k6 run \
  -e HOST="$HOST" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="$MODEL_NAME" \
  -e BURST_VUS=1 \
  -e BURST_ITERATIONS=20 \
  --summary-export=results/feb20-baseline-burst-1vu-summary.json \
  k6/maas-performance-test.js | tee results/feb20-baseline-burst-1vu.log

k6 run \
  -e HOST="$HOST" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="$MODEL_NAME" \
  -e BURST_VUS=3 \
  -e BURST_ITERATIONS=30 \
  --summary-export=results/feb20-baseline-burst-3vu-summary.json \
  k6/maas-performance-test.js | tee results/feb20-baseline-burst-3vu.log || true

k6 run \
  -e HOST="$HOST" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="$MODEL_NAME" \
  -e BURST_VUS=5 \
  -e BURST_ITERATIONS=100 \
  --summary-export=results/feb20-baseline-burst-5vu-summary.json \
  k6/maas-performance-test.js | tee results/feb20-baseline-burst-5vu.log || true

# 4) Quick metric extract
for f in \
  results/feb20-baseline-burst-1vu-summary.json \
  results/feb20-baseline-burst-3vu-summary.json \
  results/feb20-baseline-burst-5vu-summary.json; do
  echo "=== $f ==="
  jq '{http_reqs:.metrics.http_reqs.count,success_rate:.metrics.success_rate.value,http_req_failed:.metrics.http_req_failed.value,p95_ms:.metrics.http_req_duration.values["p(95)"]}' "$f"
done

Follow-up: Measure token creation time (and estimate for 50)

Use this during PR validation to measure live token mint latency without creating many tokens.

cd ~/perf-k6/maas-benchmark
set -euo pipefail

CLUSTER_DOMAIN=$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')
HOST="maas.${CLUSTER_DOMAIN}"
URL="https://${HOST}/maas-api/v1/tokens"
AUTH="Authorization: Bearer $(oc whoami -t)"

# Small sample size (change to 10/20 as needed)
SAMPLES=10
OUT_FILE=/tmp/token_create_times_ms.txt
: > "$OUT_FILE"

for i in $(seq 1 $SAMPLES); do
  START=$(date +%s%3N)   # epoch milliseconds (%N requires GNU date)
  RESP=$(curl -sSk \
    -H "$AUTH" \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"expiration":"10m"}' \
    "$URL")
  END=$(date +%s%3N)

  TOK=$(echo "$RESP" | jq -r '.token // empty')
  [ -n "$TOK" ] || { echo "Token request failed: $RESP"; exit 1; }

  echo $((END-START)) >> "$OUT_FILE"
done

sort -n "$OUT_FILE" > /tmp/token_create_times_ms_sorted.txt
AVG=$(awk '{s+=$1} END {printf "%.2f", s/NR}' "$OUT_FILE")
P50=$(awk 'BEGIN{c=0} {a[++c]=$1} END{idx=int((c+1)/2); print a[idx]}' /tmp/token_create_times_ms_sorted.txt)
P90=$(awk 'BEGIN{c=0} {a[++c]=$1} END{idx=int(c*0.9); if(idx<1) idx=1; print a[idx]}' /tmp/token_create_times_ms_sorted.txt)

EST50_AVG_MS=$(awk -v a="$AVG" 'BEGIN{printf "%.0f", a*50}')
EST50_P50_MS=$((P50*50))
EST50_P90_MS=$((P90*50))

echo "HOST=$HOST"
echo "SAMPLES=$SAMPLES"
echo "TIMES_MS=$(tr '\n' ' ' < "$OUT_FILE" | sed 's/ $//')"
echo "AVG_MS=$AVG"
echo "P50_MS=$P50"
echo "P90_MS=$P90"
echo "EST_50_TOKENS_AT_AVG_MS=$EST50_AVG_MS"
echo "EST_50_TOKENS_AT_P50_MS=$EST50_P50_MS"
echo "EST_50_TOKENS_AT_P90_MS=$EST50_P90_MS"

Interpretation:

  • EST_50_TOKENS_AT_AVG_MS: expected sequential time for 50 token requests.
  • EST_50_TOKENS_AT_P90_MS: conservative estimate if requests are slower than usual.
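The estimates are plain multiplication of the measured per-token figure by 50; a worked example with a hypothetical AVG_MS of 120.50:

```shell
# Worked example of the estimate lines above (120.50 is a hypothetical value):
# sequential time for 50 token mints = per-token average * 50.
awk -v a=120.50 'BEGIN{printf "EST_50_TOKENS_AT_AVG_MS=%.0f\n", a*50}'
# → EST_50_TOKENS_AT_AVG_MS=6025
```

Note the projection assumes strictly sequential requests; minting concurrently would finish sooner but may trip rate limits on the tokens endpoint.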