@nerdalert
Created February 23, 2026 16:53
MaaS Baseline Benchmark Feb 20, 2026

Run metadata

  • Executed at: 2026-02-21 04:14:07 UTC
  • Repo: ~/perf-k6/maas-benchmark
  • Target host: maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com
  • Protocol: https
  • Model ID detected from MAAS API: facebook/opt-125m
  • k6 version: k6 v1.5.0
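The model ID above was read from the MAAS API rather than hardcoded. As a hedged sketch of that step (the endpoint path and response shape are assumptions — an OpenAI-style `/v1/models` listing — and may differ on the real MaaS route), a canned response stands in for the live call:

```shell
# Canned stand-in for the live model-listing response; replace the cat with
# a curl against the real MaaS models endpoint when running for real.
cat > /tmp/models.json <<'EOF'
{"data":[{"id":"facebook/opt-125m","object":"model"}]}
EOF
# Pull the first model ID out of the listing.
jq -r '.data[0].id' /tmp/models.json
# → facebook/opt-125m
```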

Setup done

  • Created 5 benchmark tokens via:
    • POST https://$HOST/maas-api/v1/tokens using oc whoami -t
  • Wrote token file expected by k6 script:
    • tokens/all/all_tokens.json
    • Free tokens: 5
    • Premium tokens: 0

Commands run

# Baseline single-user
k6 run \
  -e HOST="maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="facebook/opt-125m" \
  -e BURST_VUS=1 \
  -e BURST_ITERATIONS=20 \
  --summary-export=results/feb20-baseline-burst-1vu-summary.json \
  k6/maas-performance-test.js

# Baseline small concurrency
k6 run \
  -e HOST="maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="facebook/opt-125m" \
  -e BURST_VUS=5 \
  -e BURST_ITERATIONS=100 \
  --summary-export=results/feb20-baseline-burst-5vu-summary.json \
  k6/maas-performance-test.js

Additional diagnostic run:

k6 run \
  -e HOST="maas.apps.rosa.j7mgr-s39et-cf9.yd65.p3.openshiftapps.com" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="facebook/opt-125m" \
  -e BURST_VUS=3 \
  -e BURST_ITERATIONS=30 \
  --summary-export=results/feb20-baseline-burst-3vu-summary.json \
  k6/maas-performance-test.js

Results

| Test | HTTP reqs | Success rate | Failed req rate | p95 latency (ms) | Auth failure checks |
| --- | --- | --- | --- | --- | --- |
| 1 VU, 20 iters | 20 | 1.00 | 0.00 | 207.52 | 0 |
| 3 VUs, 30 iters | 30 | 0.4333 | 0.5667 | 33.03 | 17 |
| 5 VUs, 100 iters | 100 | 0.10 | 0.90 | 34.53 | 90 |
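The last column lines up with the failure rates: multiplying total requests by the failed-request rate reproduces the auth failure check counts for the two degraded rows.

```shell
# Cross-check the 3 VU and 5 VU rows: failed requests = http_reqs * http_req_failed.
# Values are copied from the table above; %d truncates the float products.
awk 'BEGIN{printf "%d %d\n", 30*0.5667, 100*0.90}'
# → 17 90
```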

Observations

  • Single-user run is healthy (100% success).
  • At 3 VUs and higher, auth-related failures appear quickly (free_auth_failure check failures in k6 output).
  • The 5 VU run completed but crossed the k6 thresholds (http_req_failed, success_rate), with heavy authentication failures.
  • Latency stayed low in failed runs because many requests failed fast (auth failure), so latency alone is not representative there.
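The per-check failure counts can be pulled straight out of the exported summaries. This is a sketch assuming the k6 v1.x `--summary-export` schema, where checks sit under `root_group.checks` keyed by name with `passes`/`fails` counters; a minimal sample document (values mirroring the 5 VU row) stands in for the real summary file here:

```shell
# Stand-in for results/feb20-baseline-burst-5vu-summary.json; point jq at the
# real export in practice. The root_group.checks layout is an assumption about
# the k6 summary schema — adjust the path if your export differs.
cat > /tmp/sample-summary.json <<'EOF'
{"root_group":{"checks":{"free_auth_failure":{"passes":10,"fails":90}}}}
EOF
jq -r '.root_group.checks.free_auth_failure | "passes=\(.passes) fails=\(.fails)"' /tmp/sample-summary.json
# → passes=10 fails=90
```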

Artifacts

  • feb20-maas-benchmark-baseline.md
  • tokens/all/all_tokens.json
  • results/feb20-baseline-burst-1vu-summary.json
  • results/feb20-baseline-burst-1vu.log
  • results/feb20-baseline-burst-3vu-summary.json
  • results/feb20-baseline-burst-3vu.log
  • results/feb20-baseline-burst-5vu-summary.json
  • results/feb20-baseline-burst-5vu.log

Re-run exact baseline (for PR comparison)

cd ~/perf-k6/maas-benchmark
set -euo pipefail

# 1) Resolve MAAS host and model
CLUSTER_DOMAIN=$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')
HOST="maas.${CLUSTER_DOMAIN}"
MODEL_NAME="facebook/opt-125m"

# 2) Recreate token set (5 free tokens)
mkdir -p tokens/free tokens/all results
rm -f tokens/free/*.json tokens/all/*.json

USER_TOKEN=$(oc whoami -t)
for i in 1 2 3 4 5; do
  RESP=$(curl -sSk \
    -H "Authorization: Bearer ${USER_TOKEN}" \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"expiration":"30m"}' \
    "https://${HOST}/maas-api/v1/tokens")

  TOK=$(echo "$RESP" | jq -r '.token // empty')
  [ -n "$TOK" ] || { echo "Failed token creation: $RESP"; exit 1; }

  cat > "tokens/free/benchuser-free-${i}.json" <<JSON
{
  "token": "${TOK}",
  "expiration": "30m",
  "expiresAt": 0,
  "user_id": "benchuser-free-${i}",
  "tier": "free"
}
JSON
done

jq -s '.' tokens/free/*.json > tokens/all/free_tokens.json
echo '[]' > tokens/all/premium_tokens.json
jq -s '{"free": .[0], "premium": .[1]}' \
  tokens/all/free_tokens.json \
  tokens/all/premium_tokens.json > tokens/all/all_tokens.json
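Before launching k6, a quick sanity check on the merged token file avoids chasing auth failures caused by a malformed token set. A sketch, assuming the k6 script expects exactly the free/premium layout written above; it is demonstrated on a stand-in file, so point FILE at tokens/all/all_tokens.json for the real run:

```shell
# Hedged sanity check: 5 free tokens, 0 premium, and every free entry carries
# a non-empty token string. jq -e sets the exit status, so the check
# cooperates with set -e.
FILE=/tmp/all_tokens_sample.json
cat > "$FILE" <<'EOF'
{"free":[{"token":"a","tier":"free"},{"token":"b","tier":"free"},
         {"token":"c","tier":"free"},{"token":"d","tier":"free"},
         {"token":"e","tier":"free"}],"premium":[]}
EOF
jq -e '(.free | length) == 5
       and (.premium | length) == 0
       and all(.free[]; .token | length > 0)' "$FILE"
# → true
```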

# 3) Run exact baseline tests
k6 run \
  -e HOST="$HOST" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="$MODEL_NAME" \
  -e BURST_VUS=1 \
  -e BURST_ITERATIONS=20 \
  --summary-export=results/feb20-baseline-burst-1vu-summary.json \
  k6/maas-performance-test.js | tee results/feb20-baseline-burst-1vu.log

k6 run \
  -e HOST="$HOST" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="$MODEL_NAME" \
  -e BURST_VUS=3 \
  -e BURST_ITERATIONS=30 \
  --summary-export=results/feb20-baseline-burst-3vu-summary.json \
  k6/maas-performance-test.js | tee results/feb20-baseline-burst-3vu.log || true

k6 run \
  -e HOST="$HOST" \
  -e PROTOCOL="https" \
  -e MODEL_NAME="$MODEL_NAME" \
  -e BURST_VUS=5 \
  -e BURST_ITERATIONS=100 \
  --summary-export=results/feb20-baseline-burst-5vu-summary.json \
  k6/maas-performance-test.js | tee results/feb20-baseline-burst-5vu.log || true

# 4) Quick metric extract
for f in \
  results/feb20-baseline-burst-1vu-summary.json \
  results/feb20-baseline-burst-3vu-summary.json \
  results/feb20-baseline-burst-5vu-summary.json; do
  echo "=== $f ==="
  jq '{http_reqs:.metrics.http_reqs.count,success_rate:.metrics.success_rate.value,http_req_failed:.metrics.http_req_failed.value,p95_ms:.metrics.http_req_duration.values["p(95)"]}' "$f"
done

Follow-up: Measure token creation time (and estimate for 50)

Use this during PR validation to measure live token mint latency without creating many tokens.

cd ~/perf-k6/maas-benchmark
set -euo pipefail

CLUSTER_DOMAIN=$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')
HOST="maas.${CLUSTER_DOMAIN}"
URL="https://${HOST}/maas-api/v1/tokens"
AUTH="Authorization: Bearer $(oc whoami -t)"

# Small sample size (change to 10/20 as needed)
SAMPLES=10
OUT_FILE=/tmp/token_create_times_ms.txt
: > "$OUT_FILE"

for i in $(seq 1 $SAMPLES); do
  START=$(date +%s%3N)   # epoch milliseconds (%N requires GNU date)
  RESP=$(curl -sSk \
    -H "$AUTH" \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"expiration":"10m"}' \
    "$URL")
  END=$(date +%s%3N)

  TOK=$(echo "$RESP" | jq -r '.token // empty')
  [ -n "$TOK" ] || { echo "Token request failed: $RESP"; exit 1; }

  echo $((END-START)) >> "$OUT_FILE"
done

sort -n "$OUT_FILE" > /tmp/token_create_times_ms_sorted.txt
AVG=$(awk '{s+=$1} END {printf "%.2f", s/NR}' "$OUT_FILE")
P50=$(awk 'BEGIN{c=0} {a[++c]=$1} END{idx=int((c+1)/2); print a[idx]}' /tmp/token_create_times_ms_sorted.txt)
P90=$(awk 'BEGIN{c=0} {a[++c]=$1} END{idx=int(c*0.9); if(idx<1) idx=1; print a[idx]}' /tmp/token_create_times_ms_sorted.txt)

EST50_AVG_MS=$(awk -v a="$AVG" 'BEGIN{printf "%.0f", a*50}')
EST50_P50_MS=$((P50*50))
EST50_P90_MS=$((P90*50))

echo "HOST=$HOST"
echo "SAMPLES=$SAMPLES"
echo "TIMES_MS=$(tr '\n' ' ' < "$OUT_FILE" | sed 's/ $//')"
echo "AVG_MS=$AVG"
echo "P50_MS=$P50"
echo "P90_MS=$P90"
echo "EST_50_TOKENS_AT_AVG_MS=$EST50_AVG_MS"
echo "EST_50_TOKENS_AT_P50_MS=$EST50_P50_MS"
echo "EST_50_TOKENS_AT_P90_MS=$EST50_P90_MS"

Interpretation:

  • EST_50_TOKENS_AT_AVG_MS: expected sequential time for 50 token requests.
  • EST_50_TOKENS_AT_P90_MS: conservative estimate if requests are slower than usual.
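The estimates are plain multiplication of the measured per-token figure by 50; a worked example with a hypothetical AVG_MS of 120.50:

```shell
# Worked example of the estimate lines above (120.50 is a hypothetical value):
# sequential time for 50 token mints = per-token average * 50.
awk -v a=120.50 'BEGIN{printf "EST_50_TOKENS_AT_AVG_MS=%.0f\n", a*50}'
# → EST_50_TOKENS_AT_AVG_MS=6025
```

Note the projection assumes strictly sequential requests; minting concurrently would finish sooner but may trip rate limits on the tokens endpoint.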