Last active
November 12, 2025 19:59
| $ ./deployment/scripts/deploy-openshift.sh | |
| ========================================= | |
| MaaS Platform OpenShift Deployment | |
| ========================================= | |
| Checking prerequisites... | |
| Required tools: | |
| - oc: Client Version: 4.8.11 | |
| - jq: jq-1.7 | |
| - kustomize: {v5.7.1 2025-07-23T12:45:29Z } | |
| - git: git version 2.43.0 | |
| ℹ️ Note: OpenShift Service Mesh should be automatically installed when GatewayClass is created. | |
| If the Gateway gets stuck in 'Waiting for controller', you may need to manually | |
| install the Red Hat OpenShift Service Mesh operator from OperatorHub. | |
| 1️⃣ Checking OpenShift version and Gateway API requirements... | |
| OpenShift version: 4.19.10 | |
| OpenShift 4.19.10 supports Gateway API via GatewayClass (no feature gates needed) | |
| 2️⃣ Creating namespaces... | |
| ℹ️ Note: If ODH/RHOAI is already installed, some namespaces may already exist | |
| namespace/opendatahub created | |
| namespace/kserve created | |
| namespace/kuadrant-system created | |
| namespace/llm created | |
| namespace/maas-api created | |
| 3️⃣ Installing dependencies... | |
| Checking for existing Kuadrant installation... | |
| No existing installation found, checking for leftover CRDs... | |
| Installing Kuadrant... | |
| Namespace kuadrant-system already exists | |
| Creating Kuadrant OperatorGroup... | |
| operatorgroup.operators.coreos.com/kuadrant-operator-group created | |
| Creating Kuadrant CatalogSource... | |
| catalogsource.operators.coreos.com/kuadrant-operator-catalog created | |
| Installing kuadrant (via OLM Subscription)... | |
| subscription.operators.coreos.com/kuadrant-operator created | |
| ⏳ Waiting for kuadrant-operator-controller-manager deployment to be created... (attempt 1/7) | |
| ⏳ Waiting for kuadrant-operator-controller-manager deployment to be created... (attempt 2/7) | |
| ⏳ Waiting for operators to be ready... | |
| deployment.apps/kuadrant-operator-controller-manager condition met | |
| deployment.apps/limitador-operator-controller-manager condition met | |
| deployment.apps/authorino-operator condition met | |
| Patching Kuadrant operator... | |
| clusterserviceversion.operators.coreos.com/kuadrant-operator.v1.3.0 patched | |
| Kuadrant operator patched | |
| Successfully installed kuadrant | |
| 4️⃣ Deploying Gateway infrastructure... | |
| Cluster domain: apps.rosa.j92cy-fqwow-wob.07gj.p3.openshiftapps.com | |
| Deploying Gateway and GatewayClass... | |
| gatewayclass.gateway.networking.k8s.io/openshift-default serverside-applied | |
| gateway.gateway.networking.k8s.io/openshift-ai-inference serverside-applied | |
| gateway.gateway.networking.k8s.io/maas-default-gateway serverside-applied | |
| 5️⃣ Checking for OpenDataHub/RHOAI KServe... | |
| ⚠️ KServe not detected. Deploying ODH KServe components... | |
| Installing odh... | |
| ========================================= | |
| OpenDataHub (ODH) Installation | |
| ========================================= | |
| 1️⃣ Installing ODH Operator from repository manifests... | |
| Using operator image: quay.io/opendatahub/opendatahub-operator:latest | |
| namespace/opendatahub-operator-system created | |
| /tmp/tmp.9eROJgy6ok ~/maas-prs/11-gsed/maas-billing | |
| /tmp/tmp.9eROJgy6ok/opendatahub-operator /tmp/tmp.9eROJgy6ok ~/maas-prs/11-gsed/maas-billing | |
| mkdir -p /tmp/tmp.9eROJgy6ok/opendatahub-operator/bin | |
| Downloading sigs.k8s.io/controller-tools/cmd/controller-gen@v0.17.3 | |
| /tmp/tmp.9eROJgy6ok/opendatahub-operator/bin/controller-gen rbac:roleName=controller-manager-role crd:ignoreUnexportedFields=true webhook paths="./..." output:crd:artifacts:config=config/crd/bases | |
| mkdir -p config/crd/external/tmp | |
| GOFLAGS="-mod=readonly" /tmp/tmp.9eROJgy6ok/opendatahub-operator/bin/controller-gen crd paths=/home/ubuntu/go/pkg/mod/github.com/openshift/api@v0.0.0-20230823114715-5fdd7511b790/route/v1/... output:crd:artifacts:config=config/crd/external/tmp | |
| mv config/crd/external/tmp/*.yaml config/crd/external/ | |
| rm -rf config/crd/external/tmp | |
| mkdir -p config/crd/external/tmp | |
| GOFLAGS="-mod=readonly" /tmp/tmp.9eROJgy6ok/opendatahub-operator/bin/controller-gen crd paths=/home/ubuntu/go/pkg/mod/github.com/openshift/api@v0.0.0-20230823114715-5fdd7511b790/user/v1/... output:crd:artifacts:config=config/crd/external/tmp | |
| mv config/crd/external/tmp/*.yaml config/crd/external/ | |
| rm -rf config/crd/external/tmp | |
| mkdir -p config/crd/external/tmp | |
| GOFLAGS="-mod=readonly" /tmp/tmp.9eROJgy6ok/opendatahub-operator/bin/controller-gen crd paths=/home/ubuntu/go/pkg/mod/github.com/openshift/api@v0.0.0-20230823114715-5fdd7511b790/config/v1/... output:crd:artifacts:config=config/crd/external/tmp | |
| find config/crd/external/tmp -type f -name '*_authentications.yaml' -exec mv {} config/crd/external/ \;; | |
| rm -rf config/crd/external/tmp | |
| namespace/opendatahub-operator-system configured | |
| customresourcedefinition.apiextensions.k8s.io/auths.services.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/dashboards.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/datascienceclusters.datasciencecluster.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/datasciencepipelines.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/dscinitializations.dscinitialization.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/feastoperators.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/featuretrackers.features.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/gatewayconfigs.services.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/hardwareprofiles.infrastructure.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/kserves.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/kueues.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/llamastackoperators.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/modelcontrollers.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/modelregistries.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/monitorings.services.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/rays.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/trainingoperators.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/trustyais.components.platform.opendatahub.io created | |
| customresourcedefinition.apiextensions.k8s.io/workbenches.components.platform.opendatahub.io created | |
| serviceaccount/opendatahub-operator-controller-manager created | |
| clusterrole.rbac.authorization.k8s.io/opendatahub-operator-controller-manager-role created | |
| clusterrole.rbac.authorization.k8s.io/opendatahub-operator-metrics-reader created | |
| clusterrolebinding.rbac.authorization.k8s.io/opendatahub-operator-controller-manager-rolebinding created | |
| service/opendatahub-operator-controller-manager-metrics-service created | |
| service/opendatahub-operator-webhook-service created | |
| deployment.apps/opendatahub-operator-controller-manager created | |
| mutatingwebhookconfiguration.admissionregistration.k8s.io/opendatahub-operator-mutating-webhook-configuration created | |
| validatingwebhookconfiguration.admissionregistration.k8s.io/opendatahub-operator-validating-webhook-configuration created | |
| /tmp/tmp.9eROJgy6ok ~/maas-prs/11-gsed/maas-billing | |
| ~/maas-prs/11-gsed/maas-billing | |
| Waiting for operator to be ready (this may take a few minutes)... | |
| deployment.apps/opendatahub-operator-controller-manager condition met | |
| 2️⃣ Creating DSCInitialization resource... | |
| dscinitialization.dscinitialization.opendatahub.io/default-dsci created | |
| Waiting for DSCInitialization to be ready... | |
| Waiting for DSCInitialization to be ready... (1/30) | |
| DSCInitialization is ready | |
| 3️⃣ Creating DataScienceCluster... | |
| datasciencecluster.datasciencecluster.opendatahub.io/default-dsc created | |
| Waiting for DataScienceCluster to be ready... | |
| Status: Phase=, Ready= (1/60) | |
| Status: Phase=Not Ready, Ready=False (2/60) | |
| Status: Phase=Not Ready, Ready=False (3/60) | |
| Status: Phase=Not Ready, Ready=False (4/60) | |
| Status: Phase=Not Ready, Ready=False (5/60) | |
| Status: Phase=Not Ready, Ready=False (6/60) | |
| DataScienceCluster is ready | |
| ========================================= | |
| Verification | |
| ========================================= | |
| DSCInitialization Status: | |
| NAME AGE PHASE CREATED AT | |
| default-dsci 71s Ready 2025-11-12T19:20:29Z | |
| DataScienceCluster Status: | |
| NAME READY REASON | |
| default-dsc True | |
| ========================================= | |
| ODH Installation Complete! | |
| ========================================= | |
| Next steps: | |
| 1. Deploy your models using KServe InferenceService | |
| If you encounter issues, check the logs: | |
| - ODH Operator: kubectl logs -n openshift-operators deployment/opendatahub-operator-controller-manager | |
| - DSCInitialization: kubectl describe dscinitializations default-dsci | |
| - DataScienceCluster: kubectl describe datasciencecluster default-dsc | |
| Selected components set up successfully for OpenShift! | |
| 6️⃣ Waiting for Kuadrant operators to be installed by OLM... | |
| ⏳ Waiting for CSV kuadrant-operator.v1.3.0 to succeed (timeout: 300s)... | |
| CSV kuadrant-operator.v1.3.0 succeeded | |
| ⏳ Waiting for CSV authorino-operator.v0.22.0 to succeed (timeout: 60s)... | |
| CSV authorino-operator.v0.22.0 succeeded | |
| ⏳ Waiting for CSV limitador-operator.v0.16.0 to succeed (timeout: 60s)... | |
| CSV limitador-operator.v0.16.0 succeeded | |
| ⏳ Waiting for CSV dns-operator.v0.15.0 to succeed (timeout: 60s)... | |
| CSV dns-operator.v0.15.0 succeeded | |
| Verifying Kuadrant CRDs are available... | |
| ⏳ Waiting for CRD kuadrants.kuadrant.io to appear (timeout: 30s)… | |
| CRD kuadrants.kuadrant.io detected, waiting for it to become Established... | |
| customresourcedefinition.apiextensions.k8s.io/kuadrants.kuadrant.io condition met | |
| ⏳ Waiting for CRD authpolicies.kuadrant.io to appear (timeout: 10s)… | |
| CRD authpolicies.kuadrant.io detected, waiting for it to become Established... | |
| customresourcedefinition.apiextensions.k8s.io/authpolicies.kuadrant.io condition met | |
| ⏳ Waiting for CRD ratelimitpolicies.kuadrant.io to appear (timeout: 10s)… | |
| CRD ratelimitpolicies.kuadrant.io detected, waiting for it to become Established... | |
| customresourcedefinition.apiextensions.k8s.io/ratelimitpolicies.kuadrant.io condition met | |
| ⏳ Waiting for CRD tokenratelimitpolicies.kuadrant.io to appear (timeout: 10s)… | |
| CRD tokenratelimitpolicies.kuadrant.io detected, waiting for it to become Established... | |
| customresourcedefinition.apiextensions.k8s.io/tokenratelimitpolicies.kuadrant.io condition met | |
| 7️⃣ Deploying Kuadrant configuration (now that CRDs exist)... | |
| kuadrant.kuadrant.io/kuadrant created | |
| 8️⃣ Deploying MaaS API... | |
| serviceaccount/maas-api created | |
| clusterrole.rbac.authorization.k8s.io/maas-api created | |
| clusterrolebinding.rbac.authorization.k8s.io/maas-api created | |
| configmap/tier-to-group-mapping created | |
| service/maas-api created | |
| deployment.apps/maas-api created | |
| httproute.gateway.networking.k8s.io/maas-api-route created | |
| authpolicy.kuadrant.io/maas-api-auth-policy created | |
| Restarting Kuadrant operator to apply Gateway API provider recognition... | |
| deployment.apps/kuadrant-operator-controller-manager restarted | |
| Waiting for Kuadrant operator to be ready... | |
| Waiting for deployment "kuadrant-operator-controller-manager" rollout to finish: 1 old replicas are pending termination... | |
| Waiting for deployment spec update to be observed... | |
| Waiting for deployment spec update to be observed... | |
| Waiting for deployment spec update to be observed... | |
| Waiting for deployment "kuadrant-operator-controller-manager" rollout to finish: 1 old replicas are pending termination... | |
| deployment "kuadrant-operator-controller-manager" successfully rolled out | |
| Waiting for Gateway to be ready... | |
| Note: This may take a few minutes if Service Mesh is being automatically installed... | |
| Service Mesh operator already detected | |
| Waiting for Gateway to become ready... | |
| gateway.gateway.networking.k8s.io/maas-default-gateway condition met | |
| 1️⃣1️⃣ Applying Gateway Policies... | |
| authpolicy.kuadrant.io/gateway-auth-policy serverside-applied | |
| ratelimitpolicy.kuadrant.io/gateway-rate-limits serverside-applied | |
| tokenratelimitpolicy.kuadrant.io/gateway-token-rate-limits serverside-applied | |
| 1️⃣3️⃣ Patching AuthPolicy with correct audience... | |
| Attempting to detect audience... | |
| Token created successfully | |
| JWT payload extracted | |
| Payload decoded successfully | |
| Detected audience: https://rh-oidc.s3.us-east-1.amazonaws.com/27bd6cg0vs7nn08mue83fbof94dj4m9a | |
| authpolicy.kuadrant.io/maas-api-auth-policy patched | |
| AuthPolicy patched | |
| 1️⃣4️⃣ Updating Limitador image for metrics exposure... | |
| limitador.limitador.kuadrant.io/limitador patched | |
| Limitador image updated | |
| ========================================= | |
| ⚠️ TEMPORARY WORKAROUNDS (TO BE REMOVED) | |
| ========================================= | |
| Applying temporary workarounds for known issues... | |
| 🔧 Restarting Kuadrant, Authorino, and Limitador operators to refresh webhook configurations... | |
| pod "authorino-76d7b84c9-z869s" deleted | |
| pod "kuadrant-operator-controller-manager-6464cc7dd4-6flqf" deleted | |
| pod "kuadrant-operator-controller-manager-79c5d79bcb-m2xnt" deleted | |
| pod "limitador-operator-controller-manager-84d8fbb794-6zd8m" deleted | |
| Kuadrant operator restarted | |
| deployment.apps/authorino-operator restarted | |
| Authorino operator restarted | |
| deployment.apps/limitador-operator-controller-manager restarted | |
| Limitador operator restarted | |
| Waiting for operators to be ready... | |
| Waiting for deployment "kuadrant-operator-controller-manager" rollout to finish: 0 of 1 updated replicas are available... | |
| deployment "kuadrant-operator-controller-manager" successfully rolled out | |
| deployment "authorino-operator" successfully rolled out | |
| Waiting for deployment "limitador-operator-controller-manager" rollout to finish: 1 old replicas are pending termination... | |
| Waiting for deployment "limitador-operator-controller-manager" rollout to finish: 1 old replicas are pending termination... | |
| deployment "limitador-operator-controller-manager" successfully rolled out | |
| ========================================= | |
| Deploying observability components... | |
| telemetrypolicy.extensions.kuadrant.io/user-group created | |
| servicemonitor.monitoring.coreos.com/limitador-metrics created | |
| Observability components deployed | |
| ========================================= | |
| Deployment Complete! | |
| ========================================= | |
| Status Check: | |
| Component Status: | |
| MaaS API pods running: 1 | |
| Kuadrant pods running: 8 | |
| KServe pods running: 2 | |
| Gateway Status: | |
| Accepted: True | |
| Programmed: True | |
| Policy Status: | |
| AuthPolicy: False | |
| TokenRateLimitPolicy: | |
| Policy Enforcement Status: | |
| AuthPolicy Enforced: | |
| RateLimitPolicy Enforced: | |
| TokenRateLimitPolicy Enforced: | |
| TelemetryPolicy Enforced: True | |
| ========================================= | |
| 🔧 Troubleshooting: | |
| ========================================= | |
| If policies show 'Not enforced' status: | |
| 1. Check if Gateway API provider is recognized: | |
| kubectl describe authpolicy gateway-auth-policy -n openshift-ingress | grep -A 5 'Status:' | |
| 2. If Gateway API provider is not installed, restart all Kuadrant operators: | |
| kubectl rollout restart deployment/kuadrant-operator-controller-manager -n kuadrant-system | |
| kubectl rollout restart deployment/authorino-operator -n kuadrant-system | |
| kubectl rollout restart deployment/limitador-operator-controller-manager -n kuadrant-system | |
| 3. Check if OpenShift Gateway Controller is available: | |
| kubectl get gatewayclass | |
| 4. If policies still show 'MissingDependency', ensure environment variable is set: | |
| kubectl get deployment kuadrant-operator-controller-manager -n kuadrant-system -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="ISTIO_GATEWAY_CONTROLLER_NAMES")]}' | |
| 5. If environment variable is missing, patch the deployment: | |
| kubectl -n kuadrant-system patch deployment kuadrant-operator-controller-manager --type='json' \ | |
| -p='[{"op": "add", "path": "/spec/template/spec/containers/0/env/-", "value": {"name": "ISTIO_GATEWAY_CONTROLLER_NAMES", "value": "openshift.io/gateway-controller/v1"}}]' | |
| 6. Restart Kuadrant operator after patching: | |
| kubectl rollout restart deployment/kuadrant-operator-controller-manager -n kuadrant-system | |
| kubectl rollout status deployment/kuadrant-operator-controller-manager -n kuadrant-system --timeout=60s | |
| 7. Wait for policies to be enforced (may take 1-2 minutes): | |
| kubectl describe authpolicy gateway-auth-policy -n openshift-ingress | grep -A 10 'Status:' | |
| If metrics are not visible in Prometheus: | |
| 1. Check ServiceMonitor: | |
| kubectl get servicemonitor limitador-metrics -n kuadrant-system | |
| 2. Check Prometheus targets: | |
| kubectl port-forward -n openshift-monitoring svc/prometheus-k8s 9090:9091 & | |
| # Visit http://localhost:9090/targets and look for limitador targets | |
| If webhook timeout errors occur during model deployment: | |
| 1. Restart ODH model controller: | |
| kubectl rollout restart deployment/odh-model-controller -n opendatahub | |
| 2. Temporarily bypass webhook: | |
| kubectl patch validatingwebhookconfigurations validating.odh-model-controller.opendatahub.io --type='json' -p='[{"op": "replace", "path": "/webhooks/1/failurePolicy", "value": "Ignore"}]' | |
| # Deploy your model, then restore: | |
| kubectl patch validatingwebhookconfigurations validating.odh-model-controller.opendatahub.io --type='json' -p='[{"op": "replace", "path": "/webhooks/1/failurePolicy", "value": "Fail"}]' | |
| If API calls return 404 errors (Gateway routing issues): | |
| 1. Check HTTPRoute status: | |
| kubectl get httproute -A | |
| kubectl describe httproute facebook-opt-125m-simulated-kserve-route -n llm | |
| 2. Check if model is accessible directly: | |
| kubectl get pods -n llm | |
| kubectl port-forward -n llm svc/facebook-opt-125m-simulated-kserve-workload-svc 8080:8000 & | |
| curl -k https://localhost:8080/health | |
| 3. Test model with correct name and HTTPS: | |
| curl -k -H "Content-Type: application/json" -d '{"model": "facebook/opt-125m", "prompt": "Hello", "max_tokens": 50}' https://localhost:8080/v1/chat/completions | |
| 4. Check Gateway status: | |
| kubectl get gateway -A | |
| kubectl describe gateway maas-default-gateway -n openshift-ingress | |
| If metrics are not generated despite successful API calls: | |
| 1. Verify policies are enforced: | |
| kubectl describe authpolicy gateway-auth-policy -n openshift-ingress | grep -A 5 'Enforced' | |
| kubectl describe ratelimitpolicy gateway-rate-limits -n openshift-ingress | grep -A 5 'Enforced' | |
| 2. Check Limitador metrics directly: | |
| kubectl port-forward -n kuadrant-system svc/limitador-limitador 8080:8080 & | |
| curl http://localhost:8080/metrics | grep -E '(authorized_hits|authorized_calls|limited_calls)' | |
| 3. Make test API calls to trigger metrics: | |
| # Use HTTPS and correct model name: facebook/opt-125m | |
| for i in {1..5}; do curl -k -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d '{"model": "facebook/opt-125m", "prompt": "Hello $i", "max_tokens": 50}' "https://${HOST}/llm/facebook-opt-125m-simulated/v1/chat/completions"; done | |
| ========================================= | |
| Next Steps: | |
| ========================================= | |
| 1. Deploy a sample model: | |
| kustomize build docs/samples/models/simulator | kubectl apply -f - | |
| 2. Get Gateway endpoint: | |
| CLUSTER_DOMAIN=$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}') | |
| HOST="maas.${CLUSTER_DOMAIN}" | |
| 3. Get authentication token: | |
| TOKEN_RESPONSE=$(curl -sSk -H "Authorization: Bearer $(oc whoami -t)" -H "Content-Type: application/json" -X POST -d '{"expiration": "10m"}' "${HOST}/maas-api/v1/tokens") | |
| TOKEN=$(echo $TOKEN_RESPONSE | jq -r .token) | |
| 4. Test model endpoint: | |
| MODELS=$(curl -sSk ${HOST}/maas-api/v1/models -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" | jq -r .) | |
| MODEL_NAME=$(echo $MODELS | jq -r '.data[0].id') | |
| MODEL_URL="${HOST}/llm/facebook-opt-125m-simulated/v1/chat/completions" # Note: This may be different for your model | |
| curl -sSk -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d "{\"model\": \"${MODEL_NAME}\", \"prompt\": \"Hello\", \"max_tokens\": 50}" "${MODEL_URL}" | |
| 5. Test authorization limiting (no token 401 error): | |
| curl -sSk -H "Content-Type: application/json" -d "{\"model\": \"${MODEL_NAME}\", \"prompt\": \"Hello\", \"max_tokens\": 50}" "${MODEL_URL}" -v | |
| 6. Test rate limiting (200 OK followed by 429 Rate Limit Exceeded after about 4 requests): | |
| for i in {1..16}; do curl -sSk -o /dev/null -w "%{http_code}\n" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d "{\"model\": \"${MODEL_NAME}\", \"prompt\": \"Hello\", \"max_tokens\": 50}" "${MODEL_URL}"; done | |
| 7. Run validation script (Runs all the checks again): | |
| ./deployment/scripts/validate-deployment.sh | |
| 8. Check metrics generation: | |
| kubectl port-forward -n kuadrant-system svc/limitador-limitador 8080:8080 & | |
| curl http://localhost:8080/metrics | grep -E '(authorized_hits|authorized_calls|limited_calls)' | |
| 9. Access Prometheus to view metrics: | |
| kubectl port-forward -n openshift-monitoring svc/prometheus-k8s 9090:9091 & | |
| # Open http://localhost:9090 in browser and search for: authorized_hits, authorized_calls, limited_calls | |
| # Validation: | |
| MODELS=$(curl -sSk ${HOST}/maas-api/v1/models -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" | jq -r .) | |
| MODEL_NAME=$(echo $MODELS | jq -r '.data[0].id') | |
| MODEL_URL="${HOST}/llm/facebook-opt-125m-simulated/v1/chat/completions" # Note: This may be different for your model | |
| curl -sSk -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d "{\"model\": \"${MODEL_NAME}\", \"prompt\": \"Hello\", \"max_tokens\": 50}" "${MODEL_URL}" | |
| {"id":"chatcmpl-4b6034d7-33c6-4052-bbf5-e2ee2960ad28","created":1762977259,"model":"facebook/opt-125m","usage":{"prompt_tokens":0,"completion_tokens":44,"total_tokens":44},"object":"chat.completion","do_remote_decode":false,"do_remote_prefill":false,"remote_block_ids":null,"remote_engine_id":"","remote_host":"","remote_port":0,"choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"The rest is silence. I am your AI assistant, how can I help you today? Testing@, #testing 1$ ,2%,3^, [4\u0026*5], 6~, 7-"}}]} |
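The "Patching AuthPolicy with correct audience" step in the log reports four stages: a token is created, its JWT payload is extracted and decoded, and the audience is read from it. The deploy script's own implementation does not appear in this output, so the following is only a minimal sketch of how such a decode could be done in shell. `decode_jwt_payload` is a hypothetical helper name, not something taken from the script.

```shell
# Hypothetical helper (not from the deploy script): print the decoded
# payload, i.e. the second dot-separated segment, of a JWT passed as $1.
decode_jwt_payload() {
  # Take the payload segment and map the base64url alphabet to standard base64.
  _payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops trailing padding; restore '=' up to a multiple of 4.
  while [ $(( ${#_payload} % 4 )) -ne 0 ]; do
    _payload="${_payload}="
  done
  printf '%s' "$_payload" | base64 -d
}
```

Assuming a service account token is acceptable for detection, something like `decode_jwt_payload "$(kubectl create token default -n default)" | jq -r '.aud[0]'` should print an issuer-style URL similar to the "Detected audience" line above. The service account and namespace here are placeholders.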