Protocol YAML Files
IMPORTANT
Starting with Tetra Data Platform (TDP) version 3.6.0 and the TetraScience Software Development Kit (SDK) 2.0 release, the protocol.yml file format replaces both the previous protocol.json and script.js file formats when creating self-service Tetra Data pipelines (SSPs).
Protocols define the business logic of your pipeline by specifying the steps and the functions within task scripts that run those steps. You can configure your own custom protocols for use in an SSP by creating a protocol.yml file.
NOTE
For instructions on how to create and deploy a custom protocol, see Create and Deploy a Protocol in the “Hello, World!” SSP Example.
protocol.yml File Example

    protocolSchema: "v3"
    name: "Example Protocol"
    description: "This is just to show a basic protocol.yml file"
    config:
      idsFileCategory:
        label: "IDS File Category"
        description: "Optional File Category"
        type: "string"
        required: false
        default: "IDS"
    steps:
      - id: raw-to-ids
        task:
          namespace: common
          slug: akta-raw-to-ids
          version: v3.1.6
          function: akta-raw-to-ids
        input:
          input_file_pointer: $( workflow.inputFile )
          ids_file_category: $( config.idsFileCategory )

Expressions
Instead of directly specifying everything in a protocol, you can use expressions to programmatically control the workflow at run time.
Expressions are specified by enclosing them in the following: $( ... )
Within an expression, you have access to several contexts that contain data and a limited list of functions. These contexts can be combined however you want within a single expression.
Expression Example
input_file_pointer: $( workflow.inputFile )
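As a minimal sketch of combining contexts within a single expression, the following fragment falls back to a constant when a config value is null. The constant DEFAULT_CATEGORY is hypothetical and is not part of the original example, and the sketch assumes that the coalesce function accepts values from any context.

```yaml
protocolSchema: "v3"
config:
  idsFileCategory:
    label: "IDS File Category"
    type: "string"
constants:
  # Hypothetical constant, used only as a fallback value in this sketch.
  DEFAULT_CATEGORY: "IDS"
steps:
  - id: raw-to-ids
    task:
      namespace: common
      slug: akta-raw-to-ids
      version: v3.1.6
      function: akta-raw-to-ids
    input:
      input_file_pointer: $( workflow.inputFile )
      # Combines the config and constants contexts in one expression.
      ids_file_category: $( coalesce(config.idsFileCategory, constants.DEFAULT_CATEGORY) )
```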
Contexts
The following contexts contain data and a limited list of functions that you can combine however you want within a single expression.
workflow Context
The workflow context contains data relevant to the current workflow.
workflow Context Examples
Value | Contents |
---|---|
workflow.id | ID of the current workflow |
workflow.inputFile | Pointer to the file that triggered the workflow |
workflow.startedAt | ISO 8601-formatted string of the date and time when the workflow started |
workflow.pipeline.id | ID of the pipeline that triggered the workflow |
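The following sketch shows how these workflow values might be passed to a task script. The input key names other than input_file_pointer are hypothetical, and whether a given task script accepts them is an assumption.

```yaml
steps:
  - id: stepOne
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
      # Hypothetical input keys that carry workflow metadata to the task script.
      workflow_id: $( workflow.id )
      started_at: $( workflow.startedAt )
      pipeline_id: $( workflow.pipeline.id )
```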
config Context
The config context contains data configured by the pipeline user.
config Context Example

    config:
      foo:
        label: Foo
        type: string
      bar:
        label: Hey Bear
        type: object

NOTE
For this config context example, you could write $( config.foo ) or $( config.bar ), but $( config.baz ) would result in an error.
constants Context
Contains all the constant values defined in the protocol. If the constant contains an expression, it will be evaluated before access.
constants Context Example

    constants:
      foo: "bar"
      baz:
        - 1
        - 2

NOTE
For this constants context example, you could write $( constants.foo ) or $( constants.baz ), but $( constants.qux ) would result in an error.
protocol Context
Contains the protocol's manifest fields. For example, you can reference $( protocol.name ) or $( protocol.version ). If a field isn't specified in the protocol.yml file, the expression returns a null value.
protocol Context Examples
For example protocol context values, see the Manifest Fields section.
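As a rough sketch, manifest fields set at the top of the file can be read back through the protocol context. The input key names below are hypothetical. Because slug isn't set in this fragment, $( protocol.slug ) would return a null value according to the rule described above.

```yaml
protocolSchema: "v3"
name: "Example Protocol"
version: v1.0.0
steps:
  - id: stepOne
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
      # Hypothetical input keys that pass manifest fields to the task script.
      protocol_name: $( protocol.name )
      protocol_version: $( protocol.version )
      # slug isn't specified in this file, so this value would be null.
      protocol_slug: $( protocol.slug )
```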
steps Context
Contains all of the step results. A step can be accessed in this context only after it has run or been skipped. Steps can be referenced either by their id or by their index within the steps list.
For example, if the first step has id: stepOne, then its output could be accessed by any of the following:
steps[0].output
steps['stepOne'].output
steps.stepOne.output
steps Context Examples
Value | Contents |
---|---|
steps.\<step_id>.output / steps[*].output | The output returned from the task script in the referenced step |
steps.\<step_id>.error / steps[*].error | Errors returned from the task script. Note: This value is returned only if the referenced step has continueOnError set to true. |
steps.\<step_id>.status / steps[*].status | Status of the referenced step, which is one of the following values: success, failed, or skipped |
steps.\<step_id>.isSuccess / steps[*].isSuccess | Contains true if the step was successful, or false if the step failed or was skipped |
steps.\<step_id>.isFailed / steps[*].isFailed | Contains true if the step failed, or false if the step was successful or was skipped |
steps.\<step_id>.isSkipped / steps[*].isSkipped | Contains true if the step was skipped (using if), or false if the step was either successful or failed |
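The following sketch shows a second step reading values from the steps context after the first step has run. The output field ids_file comes from the steps example in this document; the other input key names are hypothetical.

```yaml
steps:
  - id: stepOne
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
  - id: stepTwo
    task:
      namespace: common
      slug: push-to-eln
      version: v2.1.0
      function: "push-it"
    input:
      # Reference the first step by its id...
      ids_file: $( steps.stepOne.output.ids_file )
      # ...or by its index in the steps list (hypothetical input key).
      ids_file_by_index: $( steps[0].output.ids_file )
      # Hypothetical input key carrying the status of the first step.
      previous_status: $( steps.stepOne.status )
```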
Functions
You can use the following functions within expressions in a protocol.yml file to alter data.
IMPORTANT
Any functionality beyond the functions listed in the following table should be added to a task script, not a protocol.
Function | Description | Example |
---|---|---|
coalesce(...args) | Returns the first value that isn’t null, or null if all values are null | coalesce(null, null, "red", "blue") returns "red" |
not(value) | Returns the negation of the value | not(true) returns false |
isNumber(value) | Returns true if the value is a number | isNumber(5) returns true; isNumber("5") returns false |
isString(value) | Returns true if the value is a string | isString(5) returns false; isString("5") returns true |
isBoolean(value) | Returns true if the value is a boolean data type | isBoolean(false) returns true; isBoolean("false") returns false |
isNull(value) | Returns true if the value is null | isNull(null) returns true; isNull(4) returns false |
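As a sketch of how these functions might be nested, the following step runs only when an optional config value has been provided. The elnUrl config ID and the eln_url input key are hypothetical, and the sketch assumes that an unset optional config value evaluates to null.

```yaml
config:
  # Hypothetical optional config value.
  elnUrl:
    label: "ELN URL"
    type: string
steps:
  - id: pushToEln
    # Nested functions: run the step only when elnUrl isn't null.
    if: $( not(isNull(config.elnUrl)) )
    task:
      namespace: common
      slug: push-to-eln
      version: v2.1.0
      function: "push-it"
    input:
      eln_url: $( config.elnUrl )
```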
protocolSchema: "v3" (Required)
Type: String (must be "v3")
The protocolSchema: "v3" property is required to indicate that this protocol is in the v3 format.
Manifest Fields
The following manifest fields are optional, top-level metadata fields.
Field Name | Type | Description |
---|---|---|
type | String (must be protocol) | Artifact type |
namespace | String (must be common or a string starting with client- or private-) | Protocol namespace |
version | String (must be in the following form: vMAJOR.MINOR.PATCH, where MAJOR, MINOR, and PATCH are all numbers) | Protocol version number |
slug | String | Protocol slug/identifier |
name | String | Protocol name |
description | String | Protocol description |
catalog_keys | List of strings | Used to look up the appropriate labels so that they can be added to the labels array in the protocol’s manifest.json file |
labels | List of key-value pairs. Example: [{name: LABEL_NAME, value: LABEL_VALUE}, {name: LABEL_NAME2, value: LABEL_VALUE2}] | Used to annotate files and trigger pipelines without changing any file data. For more information, see Labels. |
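The manifest fields might appear at the top of a protocol.yml file as in the following sketch. The namespace, slug, and description values are hypothetical placeholders, and catalog_keys is omitted because valid key values aren't listed here.

```yaml
protocolSchema: "v3"
type: protocol
# Hypothetical namespace; it must be common or start with client- or private-.
namespace: private-example
# Hypothetical slug.
slug: example-protocol
version: v1.0.0
name: "Example Protocol"
description: "Shows the optional manifest fields"
labels:
  - name: LABEL_NAME
    value: LABEL_VALUE
  - name: LABEL_NAME2
    value: LABEL_VALUE2
```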
config
Type: Object
The config property is an object that maps each configuration ID to a config object. Each config ID must be unique so that it can be referenced in the steps property.
IMPORTANT
Config objects can’t contain expressions.
config Property Example

    config:
      numberConfig:
        label: "Enter a Number"
        description: "This number will be used by the task scripts"
        type: number
        default: 9000
      secretExample:
        label: "ELN Password"
        description: "The password to push data to some ELN"
        type: secret

config.<config_id>.label (Required)
Type: String
The config label that’s rendered on the Pipeline Manager page in the TDP.
config.<config_id>.type (Required)
Type: String
The config type determines how the TDP UI renders the configuration element and how it is passed into the workflow.
The config.<config_id>.type value must be one of the following:
"number"
"string"
"text"
"boolean"
"secret"
"object"
config.<config_id>.required
Type: Boolean
Determines whether the config is required. This requirement is reflected in the TDP UI and is also checked at workflow run time. If not set, this defaults to false.
config.<config_id>.description
Type: String
A short description of the config property. This description is what’s rendered on the Pipeline Manager page in the TDP.
config.<config_id>.default
Type: Any (must match the config's type)
The default value for the config property. This default is used if the config property isn’t configured in the TDP UI. The value's type should match the config's type.
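The following sketch combines the config properties described above. All config IDs, labels, and default values are hypothetical. Each default is chosen to match its config's type, as the default property requires.

```yaml
config:
  sampleCount:
    label: "Sample Count"
    description: "Number of samples expected in each file"
    type: number
    # Checked in the TDP UI and again at workflow run time.
    required: true
  notes:
    label: "Notes"
    type: text
    default: "None"
  pushToEln:
    label: "Push to ELN"
    type: boolean
    # required isn't set here, so it defaults to false.
    default: false
  elnPassword:
    label: "ELN Password"
    type: secret
```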
Constants
Type: Object
The constants property is an object that maps a constant ID to any value. This property allows complex protocols to define common values once, instead of having them listed multiple times throughout the protocol.yml file.
Each constant ID is unique so that it can be referenced in the steps property, or from other constants. Constants can contain expressions. However, those expressions don't have access to the steps context, because no steps have run when constants are evaluated.
IMPORTANT
Circular dependencies aren’t allowed and are detected by the TDP automatically.
constants Property Example

    constants:
      # a number
      REMOTE_PORT: 8080
      # a string
      REMOTE_URL: "http://example.com"
      # a list
      VALID_VALUES:
        - 80
        - 443
        - $( constants.REMOTE_PORT )
      # an object
      CONFIG_OBJECT:
        url: $( constants.REMOTE_URL )
        ports: $( constants.VALID_VALUES )

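As a sketch of referencing constants from the steps property, the following fragment defines two values once and reuses them in a step. The constant IDs and the target input key are hypothetical.

```yaml
constants:
  # Hypothetical constants that are defined once and reused in the step below.
  DEFAULT_TIMEOUT_SEC: 600
  ELN_TARGET:
    url: "http://example.com"
    port: 8080
steps:
  - id: pushToEln
    task:
      namespace: common
      slug: push-to-eln
      version: v2.1.0
      function: "push-it"
    input:
      # Passes the whole object constant to the task script.
      target: $( constants.ELN_TARGET )
    options:
      timeoutInSec: $( constants.DEFAULT_TIMEOUT_SEC )
```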
Steps (Required)
Type: List
The steps property defines the tasks that the protocol runs and their inputs. The steps property contains a list of step objects. Each step object runs a task script.
steps Property Example

    steps:
      - id: stepOne
        task:
          namespace: common
          slug: raw-to-ids
          version: v1.0.0
          function: main
        input:
          input_file_pointer: $( workflow.inputFile )
      - task:
          namespace: common
          slug: push-to-eln
          version: v2.1.0
          function: "push-it"
        input:
          ids_file: $( steps.stepOne.output.ids_file )
        options:
          timeoutInSec: $( config.pushToElnTimeout )

steps[*].id
Type: String
Defines a unique identifier for the step. This allows the step object to be referenced by its id in later steps. If no id is specified for a step, then later steps must refer to it by its index.
steps[*].if
Type: Boolean or expression
If this property is set, then the step will run only if the value is true, or if the expression evaluates to true. If the steps[*].if property isn’t set, then the step always runs.
NOTE
You can use the steps[*].if property to conditionally run steps based on the output of a previous step or a configuration value.
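The following sketch runs a second step only when a boolean configuration value is true. The pushToEln config ID is hypothetical.

```yaml
config:
  # Hypothetical boolean config that controls whether the second step runs.
  pushToEln:
    label: "Push to ELN"
    type: boolean
    default: false
steps:
  - id: stepOne
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
  - id: stepTwo
    # Skipped unless the pipeline user sets pushToEln to true.
    if: $( config.pushToEln )
    task:
      namespace: common
      slug: push-to-eln
      version: v2.1.0
      function: "push-it"
    input:
      ids_file: $( steps.stepOne.output.ids_file )
```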
steps[*].continueOnError
Type: Boolean or expression
If this property is set and the task fails, then the protocol continues to run only if the value is true, or the expression evaluates to true. If the steps[*].continueOnError property isn’t set, then the protocol always ends in failure if the task fails.
NOTE
You can use the steps[*].continueOnError property to handle errors generated by failing task scripts. For example, you can use this property to programmatically run a cleanup task after a failed task.
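A cleanup pattern of this kind might look like the following sketch. The example-cleanup task, its main function, and the input key names are hypothetical and don't refer to a real task script.

```yaml
steps:
  - id: pushToEln
    # Lets the protocol keep running if this task fails.
    continueOnError: true
    task:
      namespace: common
      slug: push-to-eln
      version: v2.1.0
      function: "push-it"
    input:
      ids_file: $( workflow.inputFile )
  - id: cleanup
    # Runs only when the previous step failed.
    if: $( steps.pushToEln.isFailed )
    task:
      namespace: common
      # Hypothetical cleanup task script.
      slug: example-cleanup
      version: v1.0.0
      function: main
    input:
      # The error value is available because continueOnError is true above.
      error: $( steps.pushToEln.error )
```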
steps[*].description
Type: String
A short description of the step.
steps[*].task (Required)
Type: Task object
Defines the task that the step runs.
steps[*].task.namespace (Required)
Type: String
Defines the step’s task namespace. The value can be any one of the following: common, client, or private.
steps[*].task.slug (Required)
Type: String
Defines the step’s task slug, such as akta-raw-to-ids.
steps[*].task.version (Required)
Type: String
Defines the step’s task version number, such as v3.1.6.
steps[*].task.function (Required)
Type: String
Identifies the function that the task script runs. Task scripts can define multiple functions, so this property is required and must match one of the function slugs defined by the task script.
steps[*].input
Type: Any
The input value that’s evaluated and passed directly into the task script function.
NOTE
The steps[*].input property can contain expressions.
steps[*].options
Type: Object
Specifies default options for the task script runner.
steps[*].options.memoryInMB
Type: Number or expression
Determines the default memory used to run this step’s task.
NOTE
The steps[*].options.memoryInMB property can be overridden on the Pipeline Manager page in the TDP UI.
steps[*].options.timeoutInSec
Type: Number or expression
Determines how long, in seconds, a task can run before it is considered failed, even if it hasn't completed.
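The following sketch sets both options on a single step. The memory value and the taskTimeout config ID are hypothetical.

```yaml
config:
  # Hypothetical config that lets the pipeline user adjust the timeout.
  taskTimeout:
    label: "Task Timeout (Seconds)"
    type: number
    default: 600
steps:
  - id: stepOne
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
    options:
      # A literal number; this default can be overridden on the Pipeline Manager page.
      memoryInMB: 1024
      # An expression that evaluates to a number.
      timeoutInSec: $( config.taskTimeout )
```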