feat: Token exchange for tiled insertion #1342
base: main
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #1342 +/- ##
==========================================
+ Coverage 95.00% 95.15% +0.14%
==========================================
Files 42 43 +1
Lines 2765 2848 +83
==========================================
+ Hits 2627 2710 +83
Misses 138 138
tpoliaw
left a comment
This might take a while to understand properly but I've added a few comments from the first pass through.
From reading this and the keycloak docs I think I'm on board with using token exchange but I'm still trying to figure out how it all fits together.
| @@ -0,0 +1,4 @@ | |||
| CONTEXT_HEADER = "traceparent" | |||
| VENDOR_CONTEXT_HEADER = "tracestate" | |||
| AUTHORIZAITON_HEADER = "authorization" | |||
| AUTHORIZAITON_HEADER = "authorization" | |
| AUTHORIZATION_HEADER = "authorization" |
| ) | ||
| url: HttpUrl = HttpUrl("http://localhost:8407") | ||
| api_key: str | None = os.environ.get("TILED_SINGLE_USER_API_KEY", None) | ||
| token_exchange_secret: str = Field( |
Maybe protect the client secret in case the config ever makes its way into logging etc
| token_exchange_secret: str = Field( | |
| token_exchange_secret: SecretStr = Field( |
| description="Token exchange client secret", default="" | ||
| ) | ||
| token_url: str = Field(default="") | ||
| token_exchange_client_id: str = Field( |
This is the client_id of the tiled service? The keycloak docs use the terms requester-client and target-client - could we use something similar here to distinguish which client this should be?
| ) | ||
| url: HttpUrl = HttpUrl("http://localhost:8407") | ||
| api_key: str | None = os.environ.get("TILED_SINGLE_USER_API_KEY", None) | ||
| token_exchange_secret: str = Field( |
Are these three fields always going to be all present or all absent? Could we make it into a nested TokenExchange | None field that in turn has every field required?
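A rough sketch of what the nested, all-or-nothing shape could look like. It uses stdlib dataclasses rather than pydantic so that it runs without dependencies; `Secret` is a hypothetical stand-in for pydantic's `SecretStr`, and the names `TokenExchangeConfig`, `requester_client_id` and `requester_client_secret` are illustrative assumptions, not names from the PR.

```python
from dataclasses import dataclass
from typing import Optional


class Secret:
    """Hypothetical stand-in for pydantic's SecretStr: masks the value when printed."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __repr__(self) -> str:
        return "Secret('**********')"

    def __str__(self) -> str:
        return "**********"


@dataclass(frozen=True)
class TokenExchangeConfig:
    """Every field is required, so a partially filled block cannot exist."""

    token_url: str
    requester_client_id: str  # assumed name, following Keycloak's requester-client term
    requester_client_secret: Secret

    def __post_init__(self) -> None:
        # Reject empty strings up front instead of comparing against "" later.
        if not self.token_url or not self.requester_client_id:
            raise ValueError("token_url and requester_client_id must be non-empty")
        if not self.requester_client_secret.get_secret_value():
            raise ValueError("requester_client_secret must be non-empty")


@dataclass(frozen=True)
class TiledConfig:
    url: str = "http://localhost:8407"
    # None means "no token exchange", e.g. an unauthenticated local Tiled.
    token_exchange: Optional[TokenExchangeConfig] = None
```

With this shape, the "is token exchange configured?" question becomes a single `is None` check, and the secret does not appear if the config object is logged.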
| for client in "system-test-blueapi" "ixx-cli-blueapi"; do | ||
| for client in "system-test-blueapi" "ixx-cli-blueapi" "ixx-blueapi" "tiled" "tiled-cli"; do |
Does this have to be a for-loop? It looks like you're looping over the clients and then immediately using a case statement to do something different for each one.
| # ports: | ||
| # - 4181:4181 |
delete?
| match = re.search(r"(https?://\S+)", line) | ||
| if match: |
| match = re.search(r"(https?://\S+)", line) | |
| if match: | |
| if match := re.search(r"(https?://\S+)", line): |
| if configuration.oidc is None: | ||
| raise InvalidConfigError( | ||
| "Tiled has been configured but oidc configuration is missing " | ||
| "this field is required to make authorization decisions." | ||
| ) | ||
| if tiled_conf.token_exchange_secret == "": | ||
| raise InvalidConfigError( | ||
| "Tiled has been enabled but Token exchange secret has not been set " | ||
| "this field is required to enable tiled insertion." | ||
| ) | ||
| tiled_conf.token_url = configuration.oidc.token_endpoint |
Does this mean we can't run blueapi against unauthenticated local tiled instances for testing?
| if self._refresh_token is None: | ||
| raise Exception("Cannot refresh session as no refresh token available") | ||
| with self._sync_lock: | ||
| response = httpx.post( |
Do we have to use httpx? I think we're using requests for all other blueapi http stuff (including in this module) and keeping it to a single library would be good if we can. There's an issue to look at moving everything to httpx but having a split for different things is confusing.
| if pass_through_headers is None: | ||
| raise ValueError( | ||
| "Tiled config is enabled but no " | ||
| f"{AUTHORIZAITON_HEADER} header in request" | ||
| ) | ||
| authorization_header_value = pass_through_headers.get(AUTHORIZAITON_HEADER) |
This doesn't check that the pass through headers contains the auth header. Might as well try and get it and fail if it's not there. Might need to default pass_through_headers if it's None (or make it required?) - numtracker is already needing to do it above.
It would also be good if we could still run blueapi in testing using an unauthenticated instance of tiled. Could we make auth optional if it's not configured then make the error useful if we get a 401 back from tiled?
| if pass_through_headers is None: | |
| raise ValueError( | |
| "Tiled config is enabled but no " | |
| f"{AUTHORIZAITON_HEADER} header in request" | |
| ) | |
| authorization_header_value = pass_through_headers.get(AUTHORIZAITON_HEADER) | |
| if not (auth_header := pass_through_headers.get(AUTHORIZATION_HEADER)): | |
| raise ValueError( | |
| "Tiled config is enabled but no " | |
| f"{AUTHORIZATION_HEADER} header in request" |
| ) |
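One possible shape for the header handling discussed in this comment, as a minimal sketch only. The function name `bearer_token_from_headers` and the `auth_required` flag are hypothetical; the idea is to default missing headers to an empty mapping, fail only when auth is actually configured, and return `None` otherwise so an unauthenticated Tiled can still be used in testing.

```python
from collections.abc import Mapping
from typing import Optional

AUTHORIZATION_HEADER = "authorization"


def bearer_token_from_headers(
    pass_through_headers: Optional[Mapping[str, str]],
    auth_required: bool,
) -> Optional[str]:
    """Return the bearer token from the request headers, or None if auth is optional."""
    headers = pass_through_headers or {}
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get(AUTHORIZATION_HEADER)
    if not value:
        if auth_required:
            raise ValueError(
                f"Tiled auth is configured but no {AUTHORIZATION_HEADER} header in request"
            )
        return None
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise ValueError(f"{AUTHORIZATION_HEADER} header must be 'Bearer <token>'")
    return token.strip()
```

A caller would pass `auth_required=True` only when token exchange is configured. When it returns `None`, a 401 from Tiled can then be reported with a message pointing at the missing auth configuration.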
Option 1: Service account with write access
One approach is to use a service account that is allowed to write.
For example, we could create a service account (e.g. ixx-tiled-writer) that has permissions for the ixx beamline only. This service account would be able to write to all ixx beamline sessions exposed via ixx-blueapi.diamond.ac.uk, but would not be able to write to sessions belonging to other beamlines. We would need to add AuthZ in blueapi to make sure that the person writing to the session has access to it.
This would require creating a dedicated Keycloak client (ixx-tiled-writer) with its own client ID and secret. The identity and permissions would be added as hard-coded token claims:
{ "fedid": "ixx-tiled-writer", "permissions": ["ixx-admin"] }
Downside:
If the client ID and secret are leaked, a malicious actor would gain read/write access to all ixx beamline sessions (though still not delete access). While scoped, this is still a significant risk.
An alternative service-account approach is to not hard-code any beamline permissions into the token at all. In this model, a single service account has write access to every beamline session, and blueapi performs an explicit authorization check before allowing writes, ensuring the caller has access to the target beamline session. This is effectively equivalent to an API-key style integration with Tiled.
Security concern:
If this single service account is leaked, the attacker would gain read/write access to all beamline data (again, excluding delete). This is a much larger blast radius and therefore a serious vulnerability.
The core reason this problem exists is that, in this approach, we are not propagating end-user identity (fedid) through to Tiled. While this approach is simpler to implement, it is significantly less secure, as a malicious actor could read data that is not intended for them.
Note: I couldn't find an approach where we dynamically encode the fedid from the blueapi token into the service account token. (This might not be possible, because it would let the service account impersonate anyone.)
Option 2: Token exchange preserving user identity (fedid)
The second approach is more complex but significantly more secure: using Keycloak token exchange to preserve the original user identity and permissions.
In this model, ixx-blueapi uses its client secret to exchange a user access token for another token that is scoped for Tiled. Importantly, the exchanged token retains the same user identity and permissions as the original token.
This requires enabling the following settings on the ixx-blueapi client:
"standard.token.exchange.enabled": "true"
"standard.token.exchange.enableRefreshRequestedTokenType": "SAME_SESSION"
With token exchange:
If the ixx-blueapi client secret is leaked, it cannot be used on its own to gain additional access to Tiled data. An attacker would also need an access token with a valid session, and an access token only lasts for 5 minutes in identity-test and 1 minute in authn.
Implications for blueapi-cli
To support this, the blueapi-cli Keycloak client must behave like other clients (argocd, etc.) in the realm. In particular:
offline_access)
Because token exchange never creates a new user session, a service account alone cannot be used here. We therefore need a client that can establish a user session.
We will need a device flow client (ixx-blueapi-cli) with:
ixx-blueapi (the private client backing ixx-blueapi.diamond.ac.uk)
ixx-blueapi
I have verified that long scans continue to work correctly: the session remains active because the token is continually exchanged and the user is actively interacting with blueapi, even though this happens in a machine-to-machine style workflow.
Will this have any impact on the GDA side? As far as I know, GDA is per beamline, so it could easily have a device flow client per beamline.
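To make the token exchange in Option 2 more concrete, here is a minimal sketch of the form body for such a request. The parameter names and URNs come from RFC 8693, which Keycloak's standard token exchange follows; the helper name `build_token_exchange_form`, the `tiled` audience value and sending the client secret in the form body are assumptions for illustration, not necessarily what this PR does.

```python
from urllib.parse import urlencode

# URNs defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_form(
    subject_token: str,
    requester_client_id: str,
    requester_client_secret: str,
    audience: str,
    requested_token_type: str = ACCESS_TOKEN_TYPE,
) -> dict[str, str]:
    """Build the form fields to POST to the realm's token endpoint.

    subject_token is the user's access token received by blueapi; the
    requester client (assumed here to be ixx-blueapi) authenticates with
    its own secret and asks for a token aimed at the target audience.
    """
    return {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": requested_token_type,
        "audience": audience,
        "client_id": requester_client_id,
        "client_secret": requester_client_secret,
    }


if __name__ == "__main__":
    form = build_token_exchange_form(
        subject_token="<user access token>",
        requester_client_id="ixx-blueapi",
        requester_client_secret="<client secret>",
        audience="tiled",  # assumed target client name
    )
    # The body would be sent as application/x-www-form-urlencoded.
    print(urlencode(form))
```

Because the request needs both the client secret and a live user token, a leaked secret alone is not enough to obtain a token for Tiled, which is the property described above.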
Testing changes
System tests were updated to use device flow login via Playwright, which opens a browser and performs a real login. This works correctly in CI.
This change was necessary because service accounts do not have user sessions and therefore cannot perform token exchange.
When using Playwright, you must run:
Alternatively, you can comment out the Playwright login and perform a manual login using:
References
https://www.keycloak.org/securing-apps/token-exchange
Related PR in Tiled
There is also a related PR in Tiled adding support for custom authentication in the Tiled client:
bluesky/tiled#1269