Protocol YAML Files
IMPORTANT: Starting with Tetra Data Platform (TDP) version 3.6.0 and the TetraScience Software Development Kit (SDK) 2.0 release, the protocol.yml file format replaces both the previous protocol.json and script.js file formats when creating self-service Tetra Data pipelines (SSPs).
Protocols define the business logic of your pipeline by specifying the steps and the functions within task scripts that run those steps. You can configure your own custom protocols for use in an SSP by creating a protocol.yml file.
NOTE: For instructions on how to create and deploy a custom protocol, see Create and Deploy a Protocol in the “Hello, World!” SSP Example.
protocol.yml File Example
protocolSchema: "v3"
name: "Example Protocol"
description: "This is just to show a basic protocol.yml file"
config:
  idsFileCategory:
    label: "IDS File Category"
    description: "Optional File Category"
    type: "string"
    required: false
    default: "IDS"
steps:
  - id: raw-to-ids
    task:
      namespace: common
      slug: akta-raw-to-ids
      version: v3.1.6
      function: akta-raw-to-ids
    input:
      input_file_pointer: $( workflow.inputFile )
      ids_file_category: $( config.idsFileCategory )
Expressions
Instead of directly specifying everything in a protocol, you can use expressions to programmatically control the workflow at run time.
Expressions are specified by enclosing them in the following: $( ... )
Within an expression, you have access to several contexts that contain data and a limited list of functions. These contexts can be combined however you want within a single expression.
Expression Example
input_file_pointer: $( workflow.inputFile )
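To make the contrast between a fixed value and an expression concrete, here is a minimal sketch of a step input that mixes both. The two expression lines are taken from the example at the top of this page. The fixed_category key is a hypothetical name added only for comparison.

```yaml
input:
  # Fixed value written directly in the protocol (hypothetical key name)
  fixed_category: "IDS"
  # Resolved at run time from the workflow context
  input_file_pointer: $( workflow.inputFile )
  # Resolved at run time from the config context
  ids_file_category: $( config.idsFileCategory )
```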
Contexts
The following contexts contain data and a limited list of functions that you can combine however you want within a single expression.
workflow Context
The workflow context contains data relevant to the current workflow.
workflow Context Examples
| Value | Contents |
|---|---|
| workflow.id | Current workflow ID |
| workflow.inputFile | Pointer to the file that triggered the workflow |
| workflow.startedAt | ISO 8601-formatted string of the date and time when the workflow started |
| workflow.pipeline.id | ID of the pipeline that triggered the workflow |
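As a rough sketch, a step could pass several workflow context values to a task script as shown below. The step id, task slug, version, function, and all input key names other than input_file_pointer are hypothetical. Only the expressions themselves come from the preceding table.

```yaml
steps:
  - id: record-run              # hypothetical step id
    task:
      namespace: common
      slug: example-task        # hypothetical task slug
      version: v1.0.0           # hypothetical version
      function: main            # hypothetical function slug
    input:
      input_file_pointer: $( workflow.inputFile )
      # The key names below are hypothetical
      workflow_id: $( workflow.id )
      started_at: $( workflow.startedAt )
      pipeline_id: $( workflow.pipeline.id )
```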
config Context
The config context contains data configured by the pipeline user.
config Context Example
config:
  foo:
    label: Foo
    type: string
  bar:
    label: Hey Bear
    type: object
NOTE: For this config context example, you could write $( config.foo ) or $( config.bar ), but $( config.baz ) would result in an error.
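For illustration, a step that consumes the foo and bar config values might look like the following sketch. The task details and input key names are hypothetical. Only config.foo and config.bar come from the example.

```yaml
steps:
  - id: use-config              # hypothetical step id
    task:
      namespace: common
      slug: example-task        # hypothetical task slug
      version: v1.0.0           # hypothetical version
      function: main            # hypothetical function slug
    input:
      # Both IDs are defined in the config block, so these resolve
      foo_value: $( config.foo )
      bar_value: $( config.bar )
      # $( config.baz ) is not defined in the config block and would result in an error
```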
constants Context
The constants context contains all the constant values defined in the protocol. If a constant contains an expression, the expression is evaluated before access.
constants Context Example
constants:
  foo: "bar"
  baz:
    - 1
    - 2
NOTE: For this constants context example, you could write $( constants.foo ) or $( constants.baz ), but $( constants.qux ) would result in an error.
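The sketch below shows one way the foo and baz constants could be passed into a step. The task details and input key names are hypothetical.

```yaml
constants:
  foo: "bar"
  baz:
    - 1
    - 2
steps:
  - id: use-constants           # hypothetical step id
    task:
      namespace: common
      slug: example-task        # hypothetical task slug
      version: v1.0.0           # hypothetical version
      function: main            # hypothetical function slug
    input:
      # Key names are hypothetical; each expression refers to a constant defined above
      single_value: $( constants.foo )
      value_list: $( constants.baz )
```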
protocol Context
The protocol context contains the manifest fields. For example, you can write $( protocol.name ) or $( protocol.version ). If a value isn't specified in the protocol.yml file, the expression returns a null value.
protocol Context Examples
For example protocol context values, see the Manifest Fields section.
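As a small sketch of the behavior described here, the following protocol sets name but not version, then passes both to a step. The task details and input key names are hypothetical. The null result for the missing field is assumed from the description of the protocol context.

```yaml
protocolSchema: "v3"
name: "Example Protocol"
steps:
  - id: report-protocol         # hypothetical step id
    task:
      namespace: common
      slug: example-task        # hypothetical task slug
      version: v1.0.0           # hypothetical version
      function: main            # hypothetical function slug
    input:
      # Defined at the top of this file
      protocol_name: $( protocol.name )
      # Not defined in this file, so this is expected to be null
      protocol_version: $( protocol.version )
```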
steps Context
The steps context contains all of the step results. A step can be accessed in this context only after it has run or been skipped. Steps can be referenced either by their id or by their index within the steps list.
For example, if the first step has id: stepOne, then its output could be accessed by any of the following:
- steps[0].output
- steps['stepOne'].output
- steps.stepOne.output
steps Context Examples
| Value | Contents |
|---|---|
| | The output returned from the task script in the referenced step |
| | Errors returned from the task script |
| | Status of the referenced step, listed as one of the following values: |
| | Contains |
| | Contains |
| | Contains |
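The three ways of referencing a step's output can be sketched side by side, as in the following second step that reads from a first step named stepOne. The task slugs, versions, functions, and input key names are hypothetical.

```yaml
steps:
  - id: stepOne
    task:
      namespace: common
      slug: example-producer    # hypothetical task slug
      version: v1.0.0           # hypothetical version
      function: main            # hypothetical function slug
    input:
      input_file_pointer: $( workflow.inputFile )
  - id: stepTwo
    task:
      namespace: common
      slug: example-consumer    # hypothetical task slug
      version: v1.0.0           # hypothetical version
      function: main            # hypothetical function slug
    input:
      # All three expressions refer to the output of the first step
      by_index: $( steps[0].output )
      by_quoted_id: $( steps['stepOne'].output )
      by_dotted_id: $( steps.stepOne.output )
```

Because stepTwo comes after stepOne in the list, stepOne has already run (or been skipped) by the time these expressions are evaluated.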
Functions
You can use the following functions in a protocol.yml file to alter data within expressions.
IMPORTANT: Any functionality beyond the functions listed in the following table should be added to a task script, not a protocol.
| Function | Description | Example |
|---|---|---|
| | Returns the first value that isn’t | c |
| | Returns the negation of the value | |
| | Returns | |
| | Returns | |
| | Returns | |
| | Returns | |
protocolSchema: “v3” (Required)
Type: String (must be "v3")
The protocolSchema: "v3" property is required to indicate that this protocol is in the v3 format.
Manifest Fields
The following manifest fields are optional, top-level metadata fields.
| Field Name | Type | Description |
|---|---|---|
| | String (must be | Artifact type |
| | String (must be | Protocol namespace |
| version | String (must be in the following form: | Protocol version number |
| | String | Protocol slug/identifier |
| name | String | Protocol name |
| description | String | Protocol description |
| | List of strings | Used to look up the appropriate labels so that they can be added to the |
| | List of key-value pairs | Used to annotate files and trigger pipelines without changing any file data. For more information, see Labels. |
config
Type: Object
The config property is an object that maps a configuration ID to a config object. Each config ID is unique so that it can be referenced in the steps property.
IMPORTANT: Config objects can’t contain expressions.
config Property Example
config:
  numberConfig:
    label: "Enter a Number"
    description: "This number will be used by the task scripts"
    type: number
    default: 9000
  secretExample:
    label: "ELN Password"
    description: "The password to push data to some ELN"
    type: secret
config.<config_id>.label (Required)
Type: String
The config label that’s rendered in the TDP’s Pipeline Manager page.
config.<config_id>.type (Required)
Type: String
The config type determines how the TDP UI renders the configuration element and how it is passed into the workflow.
The config.<config_id>.type value must be one of the following:
- "number"
- "string"
- "text"
- "boolean"
- "secret"
- "object"
config.<config_id>.required (Required)
Type: Boolean
Determines if the config is required or not. This requirement is reflected in the TDP UI and is also checked at workflow run time. If not set, this defaults to false.
config.<config_id>.description
Type: String
A short description of the config property. This description is what’s rendered on the Pipeline Manager page in the TDP.
config.<config_id>.default
Type: String
The default value for the config property. This value is used if the config property isn’t configured in the TDP UI. The value's type should match the config's type.
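To show how required and default fit together with the other config properties, here is a minimal sketch with one required entry and one optional entry that falls back to a default. Both config IDs and all label and description text are hypothetical.

```yaml
config:
  sampleThreshold:              # hypothetical config ID
    label: "Sample Threshold"
    description: "Minimum value accepted by the task script"
    type: number
    # Must be set by the pipeline user; checked at workflow run time
    required: true
  reportTitle:                  # hypothetical config ID
    label: "Report Title"
    description: "Title to use in the generated report"
    type: string
    required: false
    # Used when the pipeline user leaves this field unconfigured
    default: "Untitled Report"
```

The default for reportTitle is a string because the entry's type is string, which follows the guidance that the default should match the config's type.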
Constants
Type: Object
The constants property is an object that maps a constant ID to any value. This property allows complex protocols to define common values once, instead of having them listed multiple times throughout the protocol.yml file.
Each constant ID is unique so that it can be referenced in the steps property, or from other constants. Constants can contain expressions. However, those expressions don't have access to the steps context, because no steps have run when constants are evaluated.
IMPORTANT: Circular dependencies aren’t allowed and are detected by the TDP automatically.
constants Property Example
constants:
  # a number
  REMOTE_PORT: 8080
  # a string
  REMOTE_URL: "http://example.com"
  # a list
  VALID_VALUES:
    - 80
    - 443
    - $( constants.REMOTE_PORT )
  # an object
  CONFIG_OBJECT:
    url: $( constants.REMOTE_URL )
    ports: $( constants.VALID_VALUES )
Steps (Required)
Type: List
The steps property defines the tasks that the protocol runs and their inputs. The steps property contains a list of step objects. Each step object runs a task script.
steps Property Example
steps:
  - id: stepOne
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
  - task:
      namespace: common
      slug: push-to-eln
      version: v2.1.0
      function: "push-it"
    input:
      ids_file: $( steps.stepOne.output.ids_file )
    options:
      timeoutInSec: $( config.pushToElnTimeout )
steps[*].id
Type: String
Defines a unique identifier for the step. This allows the step object to be referenced by its id in later steps. If no id is specified for a step, then later steps must refer to it by its index.
steps[*].if
Type: Boolean or expression
If this property is set, then the step will run only if the value is true, or if the expression evaluates to true. If the steps[*].if property isn’t set, then the step always runs.
NOTE: You can use the steps[*].if property to conditionally run steps based on the output of a previous step or a configuration value.
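As a sketch of a conditional step driven by a configuration value, the following protocol fragment runs its second step only when a boolean config is true. The pushToEln config ID and its label are hypothetical, and the task details reuse the values from the steps Property Example. It assumes that a boolean config value evaluates to true or false inside an expression.

```yaml
config:
  pushToEln:                    # hypothetical config ID
    label: "Push to ELN"
    type: boolean
    default: false
steps:
  - id: stepOne
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
  - id: pushStep                # hypothetical step id
    # Runs only when the pipeline user turns on the pushToEln config
    if: $( config.pushToEln )
    task:
      namespace: common
      slug: push-to-eln
      version: v2.1.0
      function: "push-it"
    input:
      ids_file: $( steps.stepOne.output.ids_file )
```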
steps[*].continueOnError
Type: Boolean or expression
If this property is set and the task fails, then the protocol continues to run only if the value is true, or the expression evaluates to true. If the steps[*].continueOnError property isn’t set, then the protocol always ends in failure if the task fails.
NOTE: You can use the steps[*].continueOnError property to handle errors generated by failing task scripts. For example, you can use this property to programmatically run a cleanup task after a failed task.
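The cleanup pattern mentioned in the note could be sketched as follows. The cleanup task slug, version, function, step ids, and input key names are hypothetical. The first task reuses the values from the steps Property Example.

```yaml
steps:
  - id: convert                 # hypothetical step id
    # If this task fails, the protocol keeps going instead of ending in failure
    continueOnError: true
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
  - id: cleanup                 # hypothetical step id
    task:
      namespace: common
      slug: example-cleanup     # hypothetical task slug
      version: v1.0.0           # hypothetical version
      function: main            # hypothetical function slug
    input:
      input_file_pointer: $( workflow.inputFile )
```

In this sketch the cleanup step has no if property, so it runs whether or not the convert step failed.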
steps[*].description
Type: String
A short description of the step.
steps[*].task (Required)
Type: Task object
Defines the task that the step runs.
steps[*].task.namespace (Required)
Type: String
Defines the step’s namespace. The value can be any one of the following: common, client, or private.
steps[*].task.slug (Required)
Type: String
Defines the step’s task slug, such as akta-raw-to-ids.
steps[*].task.version (Required)
Type: String
Defines the step’s task version number, such as v3.1.6.
steps[*].task.function (Required)
Type: String
Identifies the function that the task script runs. Task scripts can define multiple functions, so this property is required and must match one of the function slugs defined by the task script.
steps[*].input
Type: Any
The input value that’s evaluated and passed directly into the task script function.
NOTE: The steps[*].input property can contain expressions.
steps[*].options
Type: Object
Specifies default options for the task script runner.
steps[*].options.memoryInMB
Type: Number or expression
Determines the default memory used to run this step’s task.
NOTE: The steps[*].options.memoryInMB property can be overridden on the Pipeline Manager page in the TDP UI.
steps[*].options.timeoutInSec
Type: Number or expression
Determines how long, in seconds, a task can run before it's considered failed, even if it hasn't completed.
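Both options can be set on the same step, either as fixed numbers or as expressions. The following sketch uses a fixed memory value and a timeout read from a config entry. The convertTimeout config ID, its label and default, the step id, and the specific numbers are hypothetical.

```yaml
config:
  convertTimeout:               # hypothetical config ID
    label: "Conversion Timeout (seconds)"
    type: number
    default: 600
steps:
  - id: convert                 # hypothetical step id
    task:
      namespace: common
      slug: raw-to-ids
      version: v1.0.0
      function: main
    input:
      input_file_pointer: $( workflow.inputFile )
    options:
      # Fixed default; can still be overridden on the Pipeline Manager page
      memoryInMB: 2048
      # Evaluated at run time from the config context
      timeoutInSec: $( config.convertTimeout )
```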
