Definitions
A connector is a link to a third-party content storage service. It allows the contents of that storage to be accessed from the Publisher UI.
This section defines the terms commonly used when discussing Publisher Connectors. These terms are used throughout the rest of this guide.
Service registry
A service that keeps a record of which other services exist in the cluster. This allows services within the cluster to find each other at run time. The service registry within Publisher implements the GA4GH Service Registry specification. Connector services are registered with the service registry when they are first deployed and deregistered when they are undeployed. Components within Publisher that need to know which Connectors are available consult the service registry to find out.
Security tokens
OAuth: OAuth is an open standard for requesting and using access tokens. OAuth defines several “flows” for obtaining tokens in different situations. Internal Publisher components such as Connectors use the “client credentials flow” to obtain tokens for calling endpoints on other services.
Access token: an unguessable, unforgeable string that can accompany a network request to prove that the caller is allowed to access that API. In Publisher, all OAuth access tokens are JWTs.
JWT: JSON Web Token is an open standard (RFC 7519) that defines a compact and self-contained way for securely transmitting information between parties as a JSON object. All Publisher services require a valid JWT that contains the correct claims to accompany each request.
JWT Claim: An entry in the JSON object embedded within the body of a JWT.
Scope: A claim in a JWT which specifies a limit on which actions a client is allowed to perform.
Resource: A fully-qualified URI that specifies a certain subset of data within a service. All Publisher Connectors share the same base resource URI, and each data source connection has a URI beneath that. Resources are used together with actions, scopes, and other JWT claims to determine what a given access token allows.
Actions: A claim in a JWT which specifies the actions that are permitted on a specific resource. In Publisher Connectors, actions allow operations such as defining connections, listing tables, listing blobs, accessing table data, and accessing blob data.
Authorization server: In OAuth 2, the authorization server is the only component that can create access tokens. Thus, all access policies within the system also lie within the authorization server. Each authorization server has a unique issuer URI and its own signing key for tokens to keep a proper boundary between security domains. In Publisher, the authorization server is called Wallet.
Resource server: In OAuth 2, a resource server is any server that hosts protected resources. A resource server requires a valid authorization token that was issued by a trusted authorization server to accompany each request. In Publisher, all services including Connectors are resource servers.
Connection types
The properties of a connection, including a unique ID, an icon, and the types of API the connection supports (i.e. “blobs”, “tables”).
Connection (Data source)
A connection to a third-party content storage service, such as Microsoft’s Azure Blob Storage, which allows the contents of that storage to be accessed from Publisher.
JSON patch
JSON Patch is a web standard format for describing changes to a JSON document. It is meant to be used together with the HTTP PATCH method, which allows existing HTTP resources to be modified. The patch request usually contains a version number (etag), which is compared against the version number of the resource on the server. If the versions do not match, the request is rejected and error code 409 is returned.
See RFC 6902 to learn more about JSON Patch.
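The version check described above can be sketched as follows. This is a minimal illustration and not Publisher code: the `VersionedConnectionStore` class, its method names, and the use of 200 as the success status are assumptions, and applying the individual JSON Patch operations is left out so that only the version comparison is shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store illustrating the version check; not Publisher code. */
final class VersionedConnectionStore {
    private final Map<String, String> documents = new HashMap<>();
    private final Map<String, Long> versions = new HashMap<>();

    /** Stores a new document at version 1. */
    void create(String id, String json) {
        documents.put(id, json);
        versions.put(id, 1L);
    }

    /** Current version of a stored document (the id is assumed to exist). */
    long versionOf(String id) {
        return versions.get(id);
    }

    /** Current content of a stored document. */
    String documentOf(String id) {
        return documents.get(id);
    }

    /**
     * Replaces the document only if the caller saw the current version.
     * Returns the status the server would answer with: 200 when applied,
     * 409 when the version sent by the client is stale.
     */
    int update(String id, long versionSeenByClient, String patchedJson) {
        long current = versions.get(id);
        if (current != versionSeenByClient) {
            return 409; // stale version: reject and leave the document untouched
        }
        documents.put(id, patchedJson);
        versions.put(id, current + 1);
        return 200;
    }
}
```

A client whose update is rejected would typically re-read the resource, pick up the new version number, and retry the patch.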
Blob sub-API
BlobList: Lists the blobs within a connection. If this connection only supports tables, this endpoint returns an empty BlobList.
Blob: An object styled after the GA4GH DRS API DrsObject.
BlobIdentifier: Identifies a blob within the scope of a given connection.
AccessUrl: A fully resolvable URL that can be used to fetch the actual object bytes.
Table sub-API
TablesList: Lists the tables within the requested connection. If this connection does not contain any tables, returns an empty TablesList.
Table: Describes a table of data.
TableData: A paginated collection of tabular data.
JSON Schema: A vocabulary that allows you to annotate and validate JSON documents.
SearchRequest: Request body containing an SQL query with zero or more positional parameters.
Connection health
To accurately reflect a connection's health against its storage backend, all implementers are required to log every API call made to external storage systems. For this reason, every method in the BlobService and TableService interfaces takes a ConnectionHealthLogger as its first parameter. This class provides two record methods, which log a success or error record depending on whether the external API call went through. It also provides logError and logSuccess methods, which can be used to record an error or success log directly.
Keep in mind that only calls to the external storage API should be logged. For example, if the connection details (such as its authentication details) have changed and the connector can no longer make calls to the storage API, the user sees the connection health as “Disconnected” and can work on fixing the connection.
In the Publisher UI, the health of a connection can be viewed from the Data Sources page. The overall health of a connection is determined by the outcome of the last five calls to the storage backend API:
UP: All five previous API calls succeeded.
DOWN: All five previous API calls failed.
INTERMITTENT: Some, but not all, of the previous five API calls failed.
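The rule above can be expressed in a few lines. The sketch below is illustrative only: the `ConnectionHealthSketch` class and the representation of outcomes as a list of booleans are assumptions, and the guide does not say how a connection with fewer than five recorded calls is treated (this sketch simply applies the same rule to whatever outcomes it is given).

```java
import java.util.List;

/** Sketch of deriving overall connection health from recent backend call outcomes. */
final class ConnectionHealthSketch {

    enum Health { UP, DOWN, INTERMITTENT }

    /**
     * @param recentOutcomes outcomes of the most recent storage API calls
     *                       (five in Publisher), where true means success
     */
    static Health overall(List<Boolean> recentOutcomes) {
        long failures = recentOutcomes.stream().filter(ok -> !ok).count();
        if (failures == 0) {
            return Health.UP;           // every call succeeded
        }
        if (failures == recentOutcomes.size()) {
            return Health.DOWN;         // every call failed
        }
        return Health.INTERMITTENT;     // a mix of successes and failures
    }

    public static void main(String[] args) {
        System.out.println(overall(List.of(true, true, false, true, true)));
    }
}
```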
Connection logs
Returns the activity logs of the given data source connection.
Learn how to access a connection log from the Publisher UI.
Backend call
Detailed status of an API call made for a connection.
Inbound security requirements
Every request to the connector must carry the correct scope in its JWT, depending on the type of request being made. The scope requirements are as follows:
dlcon:connection:write: For creating a new connection, updating the configuration details of an existing connection, or deleting an existing connection.
dlcon:connection:read: For listing existing connections or accessing the configuration details of an existing connection.
dlcon:connection:list: For listing the blobs of a connection (exclusive to blob storage).
dlcon:connection-type:read: For listing all connection types.
See the connector specifications for the scope requirement of each request.
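As an illustration of the scope requirements listed above, the following sketch maps each kind of request to the scope it needs and checks a token's scope claim against it. The `Operation` names are hypothetical labels, not taken from the connector specification, and the sketch assumes the scope claim is a space-delimited string as in standard OAuth 2; the actual claim format used by Wallet is not described in this guide.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.Map;

/** Sketch: which scope each connector request needs, per the list above. */
final class ConnectorScopes {

    /** Illustrative operation names; not part of the connector specification. */
    enum Operation {
        CREATE_CONNECTION, UPDATE_CONNECTION, DELETE_CONNECTION,
        LIST_CONNECTIONS, GET_CONNECTION, LIST_BLOBS, LIST_CONNECTION_TYPES
    }

    private static final Map<Operation, String> REQUIRED = new EnumMap<>(Operation.class);
    static {
        REQUIRED.put(Operation.CREATE_CONNECTION, "dlcon:connection:write");
        REQUIRED.put(Operation.UPDATE_CONNECTION, "dlcon:connection:write");
        REQUIRED.put(Operation.DELETE_CONNECTION, "dlcon:connection:write");
        REQUIRED.put(Operation.LIST_CONNECTIONS, "dlcon:connection:read");
        REQUIRED.put(Operation.GET_CONNECTION, "dlcon:connection:read");
        REQUIRED.put(Operation.LIST_BLOBS, "dlcon:connection:list");
        REQUIRED.put(Operation.LIST_CONNECTION_TYPES, "dlcon:connection-type:read");
    }

    static String requiredScope(Operation operation) {
        return REQUIRED.get(operation);
    }

    /** Assumes the scope claim is a space-delimited list of scope names. */
    static boolean isAllowed(Operation operation, String scopeClaim) {
        return Arrays.asList(scopeClaim.trim().split("\\s+"))
                .contains(requiredScope(operation));
    }
}
```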
The correct actions claim is also required to access a connector. The actions required vary depending on the connector. For example, the actions required by the azure-blobstore connector are dlcon:table:data and dlcon:table:info.
For the resource claim, use http://dlcon.local for local development and https://dlcon.${SPACE_DNS_NAME} for actual deployments.
All connectors share a common audience because components inside Publisher that need to access some connector need equal access to all connectors. (Data Consumers are granted access by collection, not by data source).
Security requirements for initiating HTTP calls within the cluster
In order to access resources provided by other services within a cluster, the service must initiate a token exchange. Using the OAuth 2 resource indicator extension, a service client must indicate the resource they wish to authorize for during an OAuth 2 flow. On successful authorization, the access token returned is a JWS containing the actions the relevant principal is authorized to perform on the requested resources, or a reference to that resource when the list of actions exceeds a certain size unsuitable for an access token.
A sample token exchange call is as follows:
POST oauth/token?grant_type=urn:ietf:params:oauth:grant-type:token-exchange&resource={resource}&scope={scope}&subject_token_type=urn:ietf:params:oauth:token-type:jwt&subject_token={subject_token}
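The request line above can be assembled programmatically. The sketch below only builds the URL, mirroring the query-string form shown in this guide, and percent-encodes each parameter value. The `TokenExchangeRequest` class is hypothetical and the token endpoint passed in is a placeholder; sending the request and authenticating the client are not covered.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Sketch: assembling the token exchange URL shown above. */
final class TokenExchangeRequest {

    static String buildUrl(String tokenEndpoint, String resource, String scope, String subjectToken) {
        // LinkedHashMap keeps the parameters in the same order as the sample call.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("grant_type", "urn:ietf:params:oauth:grant-type:token-exchange");
        params.put("resource", resource);
        params.put("scope", scope);
        params.put("subject_token_type", "urn:ietf:params:oauth:token-type:jwt");
        params.put("subject_token", subjectToken);

        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return tokenEndpoint + "?" + query;
    }

    public static void main(String[] args) {
        // Placeholder endpoint and token, for illustration only.
        System.out.println(buildUrl("https://wallet.example/oauth/token",
                "http://dlcon.local", "dlcon:connection:read", "abc.def.ghi"));
    }
}
```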
Here is an example of the decoded body of an access token containing actions directly:
{
  "sub": "max@dnastack.com",
  "iat": 1516239022,
  "exp": 1516249022,
  "azp": "data-lake-ui-client-id",
  "aud": "https://data-lake.mss.ng/collection",
  "actions": {
    "https://data-lake.mss.ng/collection/mssng-db6/ga4gh/search": [
      "search:ListTables",
      "search:PostQuery"
    ],
    "https://data-lake.mss.ng/collection/mssng-db6/ga4gh/drs": [
      "drs:GetObject",
      "drs:GetAccessMethod"
    ]
  }
}
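Once such a token has been verified and decoded, a resource server can look up whether a given action is permitted on a given resource. The sketch below assumes the actions claim has already been parsed into a map from resource URI to action names, and that resources are matched by exact URI; whether Publisher also matches by URI prefix is not stated here. The `ActionsClaimCheck` class is hypothetical, and signature verification is out of scope.

```java
import java.util.List;
import java.util.Map;

/** Sketch: checking the "actions" claim of an already verified and decoded token. */
final class ActionsClaimCheck {

    /**
     * @param actions  the decoded "actions" claim: resource URI to permitted action names
     * @param resource the resource being accessed (exact match is an assumption)
     * @param action   the action the caller wants to perform
     */
    static boolean isPermitted(Map<String, List<String>> actions, String resource, String action) {
        return actions.getOrDefault(resource, List.of()).contains(action);
    }

    public static void main(String[] args) {
        // Values taken from the sample token above.
        Map<String, List<String>> actions = Map.of(
                "https://data-lake.mss.ng/collection/mssng-db6/ga4gh/search",
                List.of("search:ListTables", "search:PostQuery"));
        System.out.println(isPermitted(actions,
                "https://data-lake.mss.ng/collection/mssng-db6/ga4gh/search", "search:PostQuery"));
    }
}
```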
Connection editor UI
Each connector must provide its own Data Source Configuration, which is unique to that connector. The custom configuration is rendered in an iframe inside the Add/Edit Data Source dialog.
Below is an example of the configuration page for the Azure Blob Storage connector.

The URL of the configuration screen should be in the following format.
For adding a new data source:
https://{connectorUrl}/frontend/datasource/new?frontend={publisherUrl}
For editing an existing data source:
https://{connectorUrl}/frontend/datasource/edit?id={datasourceId}&frontend={publisherUrl}
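The two URL formats above can be built with a small helper. In this sketch the `ConfigScreenUrls` class is hypothetical, `connectorUrl` is taken to be a host name as the templates suggest, and percent-encoding the `id` and `frontend` values is an assumption that the guide does not confirm.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Sketch: building the configuration screen URLs in the formats shown above. */
final class ConfigScreenUrls {

    static String forNew(String connectorUrl, String publisherUrl) {
        return "https://" + connectorUrl + "/frontend/datasource/new?frontend=" + encode(publisherUrl);
    }

    static String forEdit(String connectorUrl, String datasourceId, String publisherUrl) {
        return "https://" + connectorUrl + "/frontend/datasource/edit?id=" + encode(datasourceId)
                + "&frontend=" + encode(publisherUrl);
    }

    // Assumption: query parameter values are percent-encoded.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```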
The configuration screen is responsible for validating all credentials provided in its UI.
Change monitoring
The connector can submit change notifications to DNAStack’s synapse-pubsub service. This allows changes to the connector to be broadcast to other services that may be listening.
For a connector to create a change message, it must have the write:topic scope. Likewise, for a connector to read a notification, it must have the read:topic scope.
The endpoint for submitting a change message is POST {Pubsub URL}/topics/{topicName} and the message should contain the following properties:
Long id;
Long index;
String topicName;
Instant createdAt;
Map<String, String> headers;
Map<String, Object> body;
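A message with these properties could be modelled as a simple value class. The sketch below is an assumption about shape only: the `ChangeMessage` class is hypothetical, it is not the type used by synapse-pubsub, and the guide does not say which properties the connector fills in and which the service assigns, so the constructor simply accepts all of them.

```java
import java.time.Instant;
import java.util.Map;

/** Hypothetical value class mirroring the listed message properties. */
final class ChangeMessage {
    final Long id;
    final Long index;
    final String topicName;
    final Instant createdAt;
    final Map<String, String> headers;
    final Map<String, Object> body;

    ChangeMessage(Long id, Long index, String topicName, Instant createdAt,
                  Map<String, String> headers, Map<String, Object> body) {
        this.id = id;
        this.index = index;
        this.topicName = topicName;
        this.createdAt = createdAt;
        this.headers = headers;
        this.body = body;
    }

    /** Builds the submission URL in the form {Pubsub URL}/topics/{topicName}. */
    static String submitUrl(String pubsubUrl, String topicName) {
        return pubsubUrl + "/topics/" + topicName;
    }

    public static void main(String[] args) {
        // Header and body keys are made up for illustration.
        ChangeMessage message = new ChangeMessage(1L, 0L, "connections", Instant.now(),
                Map.of("source", "example-connector"), Map.of("connectionId", "abc"));
        System.out.println(submitUrl("https://pubsub.example", message.topicName));
    }
}
```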
Auditing
The connector can submit audit events to DNAStack’s audit-log-service. This allows changes made to the connector to be viewed and tracked by Publisher admins. Generally, whenever a connection is created, updated, or deleted, an audit event should be created to track the action.
For a connector to be able to access the auditing service, the action claim audit:create must be present in the JWT when requesting the resource {audit-service-url}/events.
An audit event message can be generated using DNAStack’s AuditEventBodyBuilder and consists of the following components:
resource: Should be /connections/, or /connections/{id} if modifying or deleting an existing connection.
outcome: created, modified, or deleted, depending on the action performed.
action: Should always be dlcon:connection:write.
timestamp: The instant when the log is generated.
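To make the four components concrete, the sketch below models them with a small stand-in class. It deliberately does not use AuditEventBodyBuilder, whose API is not described in this guide; the `ConnectionAuditEvent` class and its factory methods are hypothetical, and using /connections/ for the creation case is inferred from the list above.

```java
import java.time.Instant;

/** Hypothetical stand-in showing the four audit event components; not AuditEventBodyBuilder. */
final class ConnectionAuditEvent {
    static final String ACTION = "dlcon:connection:write";

    final String resource;
    final String outcome;
    final String action;
    final Instant timestamp;

    private ConnectionAuditEvent(String resource, String outcome, Instant timestamp) {
        this.resource = resource;
        this.outcome = outcome;
        this.action = ACTION; // always the same action, per the list above
        this.timestamp = timestamp;
    }

    /** Creation has no id yet, so the collection path is used (inferred). */
    static ConnectionAuditEvent created(Instant now) {
        return new ConnectionAuditEvent("/connections/", "created", now);
    }

    static ConnectionAuditEvent modified(String connectionId, Instant now) {
        return new ConnectionAuditEvent("/connections/" + connectionId, "modified", now);
    }

    static ConnectionAuditEvent deleted(String connectionId, Instant now) {
        return new ConnectionAuditEvent("/connections/" + connectionId, "deleted", now);
    }
}
```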