Access Control
By default, every Transformer and Pipeline has access controls applied to it. As with other components in SDL, access control decisions are made by OPA according to a policy that is customizable for a particular use case, rather than being baked into the code directly. The policy controlling the Data Pipeline Engine is located here.
| Because all access control decisions are made by OPA, the associated policy is what determines the semantics described below. Any discussion of access controls in this document uses the default policy above as its reference. A team wanting to extend or modify the access control paradigm for Pipelines and Transformers should generally start with the built-in policy and modify it as needed, so that existing functionality is preserved to the extent possible. |
Overview
Each Pipeline and Transformer is annotated with three fields related to access control:
- created_by: This is filled in by the engine when a Pipeline or Transformer is created, according to the identity associated with the JWT used to authenticate with the API. By default, the entity referred to by created_by has both read and write permissions to the resource.
- access_controls: This is an array of complex objects whose structure is defined below. Essentially, this allows any entity with write access to share the resource with another user or group, as defined in Keycloak.
- security_markings: This is a free-text field intended to contain security markings for the resource (for example, classification level). In order to maintain generality across different use cases, the pipeline engine does not perform any validation on this field, allowing entities to write anything into it. Additionally, the default OPA policy does not perform any filtering by this field. This is intended to be modified by an individual use case to allow mandatory access controls beyond what exist as attributes and groups for an entity in Keycloak. For example, an OPA policy could be written to take advantage of a classification parser to perform ACCM filtering on a pipeline or transformer automatically.
The combination of these three fields is intended to support various access control scenarios, including:
- Basic access control, which restricts visibility of pipelines and transformers to the user that created them.
- Sharing, where a user grants another user or group read or read+write permissions to a Pipeline or Transformer.
- Creation on-behalf-of, where a service creates a Pipeline or Transformer but makes it visible (potentially read-only) to a user or group.
- Mandatory access control, where the security markings applied to a Pipeline or Transformer restrict its visibility beyond the attributes of a user or their membership in a particular group.
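As a rough illustration of the sharing scenario, the sketch below builds the access-control portion of a create or update request body. It assumes the access_controls entry shape described in the Data Model section (entity, read, write). The helper name grant and the variable payload_fragment are hypothetical and not part of the engine's API. All other Pipeline or Transformer fields are omitted, and created_by is left out because the engine fills it in.

```python
import json


def grant(access_controls, entity, read=True, write=False):
    """Return a new list in which `entity` has exactly the given flags.

    Hypothetical helper: any existing entry for the same entity URN is
    replaced, so the list never holds two entries for one entity.
    """
    kept = [entry for entry in access_controls if entry["entity"] != entity]
    return kept + [{"entity": entity, "read": read, "write": write}]


# Sharing: read+write for one user, read-only for one group.
controls = grant([], "urn:rdp:keycloak-user:user-a", read=True, write=True)
controls = grant(controls, "urn:rdp:keycloak-group:Group-A", read=True, write=False)

# Only the access-control related fields are shown here.
payload_fragment = {
    "access_controls": controls,
    "security_markings": "",  # free text; the engine does not validate it
}

print(json.dumps(payload_fragment, indent=2))
```

Replacing an existing entry rather than appending a duplicate is a design choice of this sketch; how the default policy treats duplicate entries is not specified here.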
Data Model
Entity URNs
Access control information is tied to users and/or groups as defined in Keycloak (or whatever OIDC provider is being used). The Pipeline Engine encodes each user or group as a URN namespaced by entity type. Three formats are allowed:
- urn:rdp:keycloak-user:<user>: The most common. This should map to the preferred_username of a user as referenced in Keycloak.
- urn:rdp:keycloak-group:<group>: A group as referenced in Keycloak. Note that it is assumed that all users and groups are in the same realm.
- urn:rdp:keycloak-svc-acct: Currently defined but unsupported. The intent would be to allow service accounts to create Pipelines or Transformers on behalf of users.
| While it is technically possible to create Keycloak Groups with special characters in their name (like spaces), it is not recommended if they will be used with the Data Pipelines. |
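The URN formats above could be parsed along the following lines. This is a minimal sketch, not the engine's own parser: the names EntityRef and parse_entity_urn are hypothetical, and rejecting the service-account form is an assumption based on it being described as unsupported.

```python
from dataclasses import dataclass

# Prefixes taken from the three URN formats listed above.
USER_PREFIX = "urn:rdp:keycloak-user:"
GROUP_PREFIX = "urn:rdp:keycloak-group:"
SVC_ACCT_PREFIX = "urn:rdp:keycloak-svc-acct"


@dataclass(frozen=True)
class EntityRef:
    """Hypothetical parsed form of an entity URN."""
    kind: str  # "user" or "group"
    name: str  # preferred_username or group name


def parse_entity_urn(urn: str) -> EntityRef:
    """Split an entity URN into its type and name, or raise ValueError."""
    if urn.startswith(USER_PREFIX):
        kind, name = "user", urn[len(USER_PREFIX):]
    elif urn.startswith(GROUP_PREFIX):
        kind, name = "group", urn[len(GROUP_PREFIX):]
    elif urn.startswith(SVC_ACCT_PREFIX):
        # Assumption: the form is defined but unsupported, so refuse it.
        raise ValueError("service account URNs are not supported")
    else:
        raise ValueError(f"unrecognized entity URN: {urn!r}")
    if not name:
        raise ValueError(f"entity URN has an empty name: {urn!r}")
    return EntityRef(kind, name)


print(parse_entity_urn("urn:rdp:keycloak-group:Group-A"))
```

The sketch takes everything after the prefix as the name without further validation, so a group name containing spaces would pass through unchanged.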
access_controls
By default, a Pipeline or Transformer is created with no access_controls, so only the user who created it is able to view and modify it. When creating or updating a Pipeline or Transformer, a user can grant additional permissions using the format below:
[
  {
    "entity": "urn:rdp:keycloak-user:user-a",
    "read": true,
    "write": true
  },
  {
    "entity": "urn:rdp:keycloak-group:Group-A",
    "read": true,
    "write": false
  }
]
Each entry names an entity URN and sets its read and write flags. With these access controls, the resource (Pipeline or Transformer) would be:
- Readable by both urn:rdp:keycloak-user:user-a and urn:rdp:keycloak-group:Group-A (i.e. GET and LIST requests will return the resource).
- Writable by urn:rdp:keycloak-user:user-a. Note that this would allow user-a to delete the resource.
Granting write permissions to another user or group does not change created_by: the original value remains, even if the resource is subsequently updated.
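The semantics described in this section can be approximated in a few lines of Python. This is only an illustrative sketch: the real decision is made by the OPA policy, not by code like this. It assumes that created_by holds a user URN in the same format as the entity field, that group membership is taken from the caller's token, and that the read and write flags are evaluated independently. The function names is_allowed and caller_urns are hypothetical.

```python
def caller_urns(username, groups):
    """Build the set of entity URNs that identify a caller."""
    urns = {f"urn:rdp:keycloak-user:{username}"}
    urns.update(f"urn:rdp:keycloak-group:{group}" for group in groups)
    return urns


def is_allowed(resource, username, groups, action):
    """Return True if the caller may perform `action` ("read" or "write")."""
    if action not in ("read", "write"):
        raise ValueError(f"unknown action: {action!r}")
    # Assumption: created_by is stored as a user URN; the creator keeps
    # read and write access regardless of access_controls.
    if resource.get("created_by") == f"urn:rdp:keycloak-user:{username}":
        return True
    urns = caller_urns(username, groups)
    return any(
        entry.get("entity") in urns and entry.get(action) is True
        for entry in resource.get("access_controls") or []
    )


resource = {
    "created_by": "urn:rdp:keycloak-user:owner",
    "access_controls": [
        {"entity": "urn:rdp:keycloak-user:user-a", "read": True, "write": True},
        {"entity": "urn:rdp:keycloak-group:Group-A", "read": True, "write": False},
    ],
}

print(is_allowed(resource, "user-b", ["Group-A"], "read"))
print(is_allowed(resource, "user-b", ["Group-A"], "write"))
```

With the example access controls, a member of Group-A who is not user-a can read the resource but cannot write to it, while the creator retains both permissions.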
|