PR #365 (sso_user_id_updates by CapitolCoder) — 3,143 additions, 80 deletions, 16 files
PR #363 (feature/COM-18-ad-phase-1 by jordotech) — 3,124 additions, 264 deletions, 22 files
Both PRs target the same goal: make platform-api Auth0-aware so SSO users get proper UUIDs instead of Auth0 sub strings.
| Aspect | PR #365 | PR #363 |
|---|---|---|
| Base branch | feature/COM-18-ad-phase-1 (depends on #363) |
develop (standalone) |
| Auth discovery endpoint | Not included | POST /public/auth/discover (new) |
| Auth0 JWKS middleware | Not included (inherits from #363) | Added RS256 verification in check_org_access |
/organizations/me Bearer support |
Not included (inherits from #363) | Added Auth0 user from request.state.auth0_user |
/validate + /check Auth0 detection |
Yes — rewrites these endpoints | Yes — already implemented |
| User UUID resolution | New utils/user_resolution.py utility |
Inline in _validate_token_impl() |
| clj-pg-wrapper endpoint used | GET /admin/user/{email} (read-only lookup) + POST /api/v1/users/by-email (get-or-create) |
POST /api/v1/users/ensure (get-or-create) |
| JIT provisioning enhancement | Adds clj-pg-wrapper user creation in /organizations/me |
Not needed — /users/ensure is called from /validate |
AUTH0_TEST_MODE |
New env var + HS256 test token support | Not needed |
| Shell-based e2e tests | 420-line sso_e2e.just with bash scripts |
Not included |
| New Python files | 4 (user_resolution.py, 3 test files, 4 scripts) |
3 (auth_discover.py, schemas/auth_discover.py, 1 test file) |
PR #363 (ours): UUID resolution happens at /validate time via POST /users/ensure. Every Auth0 token validation creates-or-returns the user. Simple, single call site.
PR #365: Splits resolution into two separate utilities:
lookup_user_uuid_by_email()— read-onlyGET /admin/user/{email}, used inget_current_user(),check_org_access(), and/validateget_or_create_user_by_email()—POST /api/v1/users/by-email, used only in/organizations/meJIT
This means a first-time SSO user hitting /validate gets a 401 ("User not provisioned yet") because the read-only lookup returns None. They must hit /organizations/me first to trigger JIT creation. This creates an ordering dependency in the auth flow.
-
First-time SSO user gets 401 on
/validate— The read-onlylookup_user_uuid_by_email()returnsNonefor new users, and the endpoint returns401 "User not provisioned yet"instead of creating the user. The agentic-backend calls/validatefirst — SSO users would be blocked until they visit the frontend and trigger/organizations/meJIT. -
Depends on PR #363 — Base branch is
feature/COM-18-ad-phase-1, meaning it layers on top of our PR. It can't be merged independently. -
Depends on PR #26's endpoint —
user_resolution.pycallsPOST /api/v1/users/by-email(from PR #26 in clj-pg-wrapper), not our/users/ensure. These endpoints have different semantics (email_verified=TruevsFalse,EmailStrvalidation vs plain string). -
Rewrites code that already works —
auth.pyis rewritten from scratch, replacing our working_validate_token_impl()with_handle_token_validation(). The new version is async but introduces the 401 regression for new users. -
AUTH0_TEST_MODEis a security surface — Adds an env var that accepts HS256 self-signed tokens when enabled. While useful for e2e, this pattern risks accidental enablement in production. -
Dual-header forwarding in
/validate— AddsAuthorization: Bearerheader parsing directly in the/validateand/checkendpoints. This duplicates what agentic-backend already does (it forwards viaX-User-Token). The existing flow works: agentic-backend extracts the Bearer token and sends it asX-User-Token. -
Over-engineered test infrastructure — 420-line bash e2e justfile, 4 Python scripts, 1,063-line e2e test file, 946-line SSO auth test file. Total: ~2,800 lines of test code for ~300 lines of implementation.
-
verify_auth0_user_tokenreturnsuser_id=None— Changes the existing function to returnNoneforuser_idinstead of the Auth0sub. This breaks the existing contract — any code that relied onuser_idbeing non-None after successful verification now fails.
- The separation of read-only lookup vs get-or-create is conceptually clean
- JIT user creation in
/organizations/mealongside DynamoDB membership is a good idea - The e2e test coverage (if it worked) would be valuable
- The
capitol_user_idJWT claim future-proofing is forward-thinking
PR #363 should be the base. It's already merged-ready with working auth discovery, JWKS middleware, Bearer token support, and UUID resolution via /users/ensure. PR #365 layers on top of it but introduces a 401 regression for first-time SSO users and requires PR #26's /by-email endpoint.
If specific ideas from #365 are valuable (JIT user creation in /organizations/me, test infrastructure), they can be cherry-picked as follow-ups after #363 lands.
1.
First-time SSO user gets 401 on /validate — The read-only lookup_user_uuid_by_email() returns None for new users, and the endpoint returns 401 "User not provisioned yet" instead of creating the user. The agentic-backend calls /validate first — SSO users would be blocked until they visit the frontend and trigger /organizations/me JIT.Did this intentionally so that the member creation in DynamoDB and user creation in clj pg wrapper are in sync during JIT.
Reduces chances of data mismatch where user exists in clj but not in dynamo and vice versa
2.
AUTH0_TEST_MODE is a security surface — Adds an env var that accepts HS256 self-signed tokens when enabled. While useful for e2e, this pattern risks accidental enablement in production.This is a risk and did not like adding it either, but was the quickest to run end to end tests locally and can/should be removed
3.
verify_auth0_user_token returns user_id=None — Changes the existing function to return None for user_id instead of the Auth0 sub. This breaks the existing contract — any code that relied on user_id being non-None after successful verification now fails.This is very intentional, we do not want to use the auth zero
subfield as the user id anywhere in platform api code or other services, even as a fallback.This prevents agentic-backend from using the auth zero
subas user uuid in its database and avoids data corruption.We should only use the canonical user uuid from clj in all our services.