This topic shows examples of the pipeline and workflow objects, and defines the parameters in each object.
Pipeline Object and Parameters
Sample pipeline objects appear below.
{
"id": "8592187a-d3ed-4410-8ffb-ee47826bebf7",
"name": "HG-Covid-Data",
"description": "Reads in Covid USA states Data",
"triggerType": "custom",
"triggerCondition": {
"groupOperator": "AND",
"groupLevel": 1,
"groups": [
{
"groupLevel": 2,
"groupOperator": "AND",
"groups": [
{
"key": "sourceId",
"operator": "is",
"value": "4095be8f-40c6-487d-99ce-bef42682c36b"
}
]
},
{
"groupLevel": 2,
"groupOperator": "AND",
"groups": [
{
"key": "category",
"operator": "is",
"value": "raw"
}
]
}
]
},
"protocolSlug": "hg-prot-covid-data",
"protocolVersion": "v2.1.0",
"createdAt": "2021-11-08T19:00:02.701Z",
"updatedAt": "2022-02-03T17:54:46.962Z",
"pipelineConfig": {
"notificationsConfig": {
"sendOnSuccessful": true,
"sendOnFailed": true,
"notificationEmailAddresses": [
"[email protected]",
"[email protected]"
]
}
},
"masterScriptNamespace": "private-tetrascience",
"masterScriptSlug": "hg-prot-covid-data",
"masterScriptVersion": "v2.1.0",
"status": null,
"standby": 0,
"retryBehavior": null
}
{
"id": "28fd27a9-c931-49fe-9d21-661a7503cee5",
"name": "SDC Measurement RAW to IDS to Solace",
"description": null,
"triggerType": "custom",
"triggerCondition": {
"groupOperator": "AND",
"groupLevel": 1,
"groups": [
{
"groupLevel": 2,
"groupOperator": "AND",
"groups": [
{
"key": "tags",
"operator": "has a tag that is",
"value": "igor_sdc"
}
]
},
{
"groupLevel": 2,
"groupOperator": "AND",
"groups": [
{
"key": "category",
"operator": "is",
"value": "raw"
}
]
}
]
},
"protocolSlug": "sdc-measurement-raw-to-ids-push-to-solace",
"protocolVersion": "v1.0.0",
"createdAt": "2021-04-22T20:20:24.968Z",
"updatedAt": "2022-02-22T06:38:40.794Z",
"pipelineConfig": {
"solace-username": "solace-cloud-client",
"solace-password": {
"ssm": "/development/tetrascience/org-secrets/ea-solace",
"secret": true
},
"solace-url": "https://mr1oqbbo5q4w2z.messaging.solace.cloud",
"solace-port": "9443",
"solace-topic": "SDC.measurement.results",
"ignore-ssl": false,
"notificationsConfig": {}
},
"masterScriptNamespace": "common",
"masterScriptSlug": "sdc-measurement-raw-to-ids-push-to-solace",
"masterScriptVersion": "v1.0.0",
"status": "disabled",
"standby": 0,
"retryBehavior": null,
"priority": null,
"maxSlotLimit": null,
"taskScriptTimeoutMins": null
}
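
To make the nested structure of triggerCondition in the two samples above more concrete, here is a minimal sketch that walks the groups of a condition and checks them against a file's attributes. It is an illustration only, not the platform's implementation: the `matches` function, the shape of the `file_attrs` dictionary, the `ids` category value, and the interpretation of the `is` and `has a tag that is` operators are all assumptions inferred from the samples.

```python
# Hypothetical evaluator for the triggerCondition structure shown above.
# Operator semantics are assumed from the samples, not taken from the platform.

def matches(node, file_attrs):
    """Return True if file_attrs satisfies a triggerCondition node."""
    if "groups" in node:
        # Group node: combine the child results with groupOperator (AND / OR).
        results = [matches(child, file_attrs) for child in node["groups"]]
        if node["groupOperator"] == "AND":
            return all(results)
        if node["groupOperator"] == "OR":
            return any(results)
        raise ValueError(f"unknown groupOperator: {node['groupOperator']}")

    # Leaf node: a single key / operator / value comparison.
    actual = file_attrs.get(node["key"])
    if node["operator"] == "is":
        return actual == node["value"]
    if node["operator"] == "has a tag that is":
        return node["value"] in (actual or [])
    raise ValueError(f"unsupported operator: {node['operator']}")


# Condition copied from the second sample pipeline above.
trigger_condition = {
    "groupOperator": "AND",
    "groupLevel": 1,
    "groups": [
        {"groupLevel": 2, "groupOperator": "AND", "groups": [
            {"key": "tags", "operator": "has a tag that is", "value": "igor_sdc"}]},
        {"groupLevel": 2, "groupOperator": "AND", "groups": [
            {"key": "category", "operator": "is", "value": "raw"}]},
    ],
}

# Hypothetical file attributes used only for this illustration.
raw_file = {"tags": ["igor_sdc"], "category": "raw"}
ids_file = {"tags": ["igor_sdc"], "category": "ids"}

print(matches(trigger_condition, raw_file))
print(matches(trigger_condition, ids_file))
```

With these assumed semantics, the first file satisfies both level-2 groups and matches, while the second fails the category check and does not.
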
Pipeline Object Parameters
Pipeline object parameters are defined in the following table.
Field | Description |
---|---|
id | Globally unique identifier for this specific pipeline. |
name | User-defined and human-friendly name of the pipeline. |
description | Description of the pipeline. |
triggerType | Indicates the schema to be used for the triggerCondition. |
triggerCondition | Defines the trigger conditions of the pipeline, based on file source type, category, metadata, tags, and labels. When a pipeline is created or updated through the UI, the triggerExpression is used to generate the triggerCondition that the pipeline API uses. When a pipeline is created or updated directly through the API, the triggerCondition is required, and the triggerExpression is generated from it. |
triggerCondition.groupLevel | Indicates which hierarchical level the operator and group belong to. Starts with 1 and goes up to 2. |
triggerCondition.groupOperator | Logical operator applied between groups at the same level. Possible values: AND, OR. |
triggerCondition.groups | List of trigger conditions that the groupOperator is applied to. |
protocolSlug | Unique slug that identifies the protocol used in this pipeline. |
protocolVersion | Version of the protocol used by this pipeline. Must always be in the format vX.Y.Z, for example, v2.1.3. |
createdAt | Date/Time when this pipeline was created. |
updatedAt | Date/Time when a configuration parameter of this pipeline was last changed. |
pipelineConfig | Contains the pipeline configuration parameters. In addition to any protocol-specific settings (such as the solace-* values in the second sample above), it includes: - notificationsConfig: Object that controls email notifications about the success or failure of the pipeline. It contains: - sendOnSuccessful: Boolean (true/false). Indicates whether a notification is sent when the pipeline succeeds. - sendOnFailed: Boolean (true/false). Indicates whether a notification is sent when the pipeline fails. - notificationEmailAddresses: Array of strings. Email addresses to notify on success and/or failure of the pipeline, depending on the two settings above. |
masterScriptNamespace | Namespace of the protocol used in this pipeline. Can be any of the following: common, client, private. In the first sample above, the private namespace appears as private-tetrascience. |
masterScriptSlug | Deprecated; duplicates protocolSlug. |
masterScriptVersion | Deprecated; duplicates protocolVersion. |
status | Indicates whether the pipeline is active. Possible values: - disabled: The pipeline is inactive. - null: The pipeline is active (not disabled). |
standby | Indicates how many instances of resources should be kept on hot standby to process files for this protocol. Must be between 0 and 5. Keeping instances on hot standby can add financial cost. |
retryBehavior | Controls retries, which apply per step (not at the workflow level). Each step has a starting memory that defaults to 512 MB and can be overridden in the step's config.json. The maximum memory is 30 GB. Possible values: - null (API value), shown in the UI as "Always retry 3 times (default)": Retries the step on any type of error. Available memory is doubled with each retry. - oom_only (API value), shown in the UI as "Retry 3 times (after OOM error only)": Retries the step only on an out-of-memory (OOM) error. Available memory is doubled with each retry. - off (API value), shown in the UI as "No retry": Does not retry under any circumstances. |
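
The retryBehavior description implies a simple memory schedule: the initial run uses the step's starting memory, and each retry doubles it. The sketch below works through that arithmetic. It assumes that the 30 GB maximum acts as a hard cap (30 × 1024 MB) and that a step is retried at most three times; `memory_schedule` is a hypothetical helper, not a platform function.

```python
# Hypothetical helper illustrating the doubling described for retryBehavior.
# Assumption: the 30 GB maximum is treated as a hard cap of 30 * 1024 MB.

MAX_MEMORY_MB = 30 * 1024
MAX_RETRIES = 3


def memory_schedule(start_mb=512, retries=MAX_RETRIES):
    """Return the memory (in MB) for the initial run followed by each retry."""
    schedule = []
    memory = start_mb
    for _ in range(retries + 1):  # initial run plus the retries
        schedule.append(min(memory, MAX_MEMORY_MB))
        memory *= 2
    return schedule


print(memory_schedule())      # [512, 1024, 2048, 4096]
print(memory_schedule(8192))  # [8192, 16384, 30720, 30720]
```

Under these assumptions, a step that starts at the default 512 MB reaches 4096 MB on its third retry, while a step that starts at 8192 MB hits the cap on its second retry.
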
Workflow Object and Parameters
A sample workflow object appears below.
{
"id": "4d78df0e-6990-4d89-bca3-73261a99ca66",
"orgSlug": "tetrascience",
"pipelineId": "8592187a-d3ed-4410-8ffb-ee47826bebf7",
"masterScriptNamespace": "private-tetrascience",
"masterScriptSlug": "hg-prot-covid-data",
"masterScriptVersion": "v2.1.0",
"protocolSlug": "hg-prot-covid-data",
"protocolVersion": "v2.1.0",
"protocol": {
"protocolSchema": "v2",
"name": "HG Protocol COVID-JSON to IDS",
"description": "",
"steps": [
{
"slug": "first-step-covid-json-ids",
"description": "Generates COVID data row (IDS JSON) for each state.",
"type": "generator",
"script": {
"namespace": "private-tetrascience",
"slug": "hg-tscr-covid-data",
"version": "v2.0.1"
},
"functionSlug": "main"
}
],
"config": []
},
"pipelineConfig": {
"notificationsConfig": {
"sendOnSuccessful": true,
"sendOnFailed": true,
"notificationEmailAddresses": [
"[email protected]",
"[email protected]"
]
},
"pipelineName": "HG-Covid-Data"
},
"inputFile": {
"meta": {
"fileId": "18384a5f-9007-4028-85cd-e191a04a71d5",
"source": {
"box": {
"id": "56d9293c-2dc7-4133-b21b-c1468aefb41f",
"size": 2559900,
"filePath": "/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json",
"integrationSource": "4095be8f-40c6-487d-99ce-bef42682c36b"
},
"type": "box"
},
"traceId": "18384a5f-9007-4028-85cd-e191a04a71d5",
"sourceId": "4095be8f-40c6-487d-99ce-bef42682c36b",
"sourceName": "HG-Box-Covid-Data-States",
"sourceType": "box",
"integrationId": "56d9293c-2dc7-4133-b21b-c1468aefb41f",
"integrationType": "box"
},
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "18384a5f-9007-4028-85cd-e191a04a71d5",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/RAW/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json",
"version": "1kj7Fdl63JaVE_iBcSeS1EGKLX6UNdDR",
"customTags": [],
"customMetadata": {}
},
"tasks": [
{
"slug": "first-step-covid-json-ids",
"input": {
"input_file": {
"meta": {
"fileId": "18384a5f-9007-4028-85cd-e191a04a71d5",
"source": {
"box": {
"id": "56d9293c-2dc7-4133-b21b-c1468aefb41f",
"size": 2559900,
"filePath": "/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json",
"integrationSource": "4095be8f-40c6-487d-99ce-bef42682c36b"
},
"type": "box"
},
"traceId": "18384a5f-9007-4028-85cd-e191a04a71d5",
"sourceId": "4095be8f-40c6-487d-99ce-bef42682c36b",
"sourceName": "HG-Box-Covid-Data-States",
"sourceType": "box",
"integrationId": "56d9293c-2dc7-4133-b21b-c1468aefb41f",
"integrationType": "box"
},
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "18384a5f-9007-4028-85cd-e191a04a71d5",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/RAW/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json",
"version": "1kj7Fdl63JaVE_iBcSeS1EGKLX6UNdDR",
"customTags": [],
"customMetadata": {}
}
},
"retry": 0,
"events": [
{
"at": "2022-03-21T15:53:40.403+00:00",
"status": "pending"
},
{
"at": "2022-03-21T15:54:09.932+00:00",
"status": "in-progress"
},
{
"at": "2022-03-21T15:54:40.449+00:00",
"status": "completed"
}
],
"output": {
"type": "success",
"result": [
{
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "59c769b3-f73b-41f9-8d76-9dc04fb5da9d",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AK-2021-11-19.json",
"version": "2S3R7BGUAUGpRD8af7uxhp5lG56EAap_"
},
{
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "7ab509dc-8f1d-48b2-8e73-1c682296b0f0",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AL-2021-11-19.json",
"version": "PcB6tKOfnNds1ooPWQaVNieitqWSYiLd"
},
{
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "99eb82e0-e89d-4d69-8f34-b9df3114c589",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AR-2021-11-19.json",
"version": "5p4GQ6AhlIn_EPuSaMOivPRnxXYbO9Z1"
},
<...truncated for doc brevity>
{
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "944f7d30-3193-493f-bf05-0a1681df6c4a",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-WY-2021-11-19.json",
"version": "2P5hobBY8GwPKS8jkDPeELqPdc0zhz2M"
}
]
},
"script": {
"list": [
"requirements.txt",
"README-internal.md",
"config.json",
"README.md",
"Pipfile",
"main.py",
"Pipfile.lock",
"__test__/test_business_logic.py",
"__test__/test_config.py",
"__test__/test_integration.py",
"__test__/__pycache__/test_business_logic.cpython-37.pyc",
"__test__/__pycache__/test_config.cpython-37.pyc",
"__test__/__pycache__/test_integration.cpython-37.pyc",
"__test__/data/expected.json",
"__test__/data/input.json",
"__pycache__/main.cpython-37.pyc"
],
"slug": "hg-tscr-covid-data",
"docker": {
"image": "706717599419.dkr.ecr.us-east-2.amazonaws.com/ts-platform-development-task-private-tetrascience-hg-tscr-covid-data@sha256:c8efc76762d45bf74fe2ec9ae80f817224c715b8b2892b877862ce118b6980bb"
},
"version": "v2.0.1",
"language": "python",
"maxCount": 30,
"createdAt": "2022-03-16T14:25:41.393Z",
"functions": [
{
"slug": "main",
"function": "main.main"
}
],
"hasSource": true,
"namespace": "private-tetrascience",
"timestamp": 1647440600360,
"runnerType": "ecs",
"buildLogSaved": true,
"buildDurationMs": 78820
},
"status": "completed",
"taskId": "3ec7ffec-ed45-440d-9375-0288f835b25e",
"options": {},
"createdAt": "2022-03-21T15:53:40.387+00:00",
"containerId": "755298eb-da0a-47fb-a3a5-e02d02fab7d9",
"functionSlug": "main",
"lastUpdatedAt": "2022-03-21T15:54:40.455+00:00",
"taskMemoryInMB": 512,
"cloudWatchUrl": "https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#logs-insights:queryDetail=~(end~'2022-03-21T15*3a59*3a40.455Z~start~'2022-03-21T15*3a48*3a40.387Z~timeType~'ABSOLUTE~tz~'Local~editorString~'filter*20*60taskId*60*3d*273ec7ffec-ed45-440d-9375-0288f835b25e*27*20and*20*60containerId*60*3d*27755298eb-da0a-47fb-a3a5-e02d02fab7d9*27*0a*7c*20fields*20*40timestamp*2c*20*40message*0a*7c*20sort*20*40timestamp*20asc~source~(~'*2fecs*2ftaskscripts*2fts-platform*2fcontainers))"
}
],
"supersededTasks": [],
"status": "completed",
"events": [
{
"at": "2022-03-21T15:53:40.280+00:00",
"status": "in-progress"
},
{
"at": "2022-03-21T15:54:40.465+00:00",
"status": "completed"
}
],
"createdAt": "2022-03-21T15:53:40.010Z",
"lastUpdatedAt": "2022-03-21T15:54:40.467Z",
"restarted": false,
"masterScriptLogs": [],
"retryBehavior": null,
"orchestratorId": null,
"output": {
"type": "success",
"result": [
{
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "59c769b3-f73b-41f9-8d76-9dc04fb5da9d",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AK-2021-11-19.json",
"version": "2S3R7BGUAUGpRD8af7uxhp5lG56EAap_"
},
{
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "7ab509dc-8f1d-48b2-8e73-1c682296b0f0",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AL-2021-11-19.json",
"version": "PcB6tKOfnNds1ooPWQaVNieitqWSYiLd"
},
... <truncated>
{
"type": "s3file",
"bucket": "ts-platform-dev-datalake",
"fileId": "944f7d30-3193-493f-bf05-0a1681df6c4a",
"fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-WY-2021-11-19.json",
"version": "2P5hobBY8GwPKS8jkDPeELqPdc0zhz2M"
}
]
}
}
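
The events arrays record when each status change happened, so the time a task spent in each state can be derived by subtracting consecutive timestamps. The sketch below does this for the task-level events in the sample above. `state_durations` is a hypothetical helper written for this illustration; the timestamps are copied from the sample.

```python
from datetime import datetime

# Task-level events copied from the sample workflow above.
events = [
    {"at": "2022-03-21T15:53:40.403+00:00", "status": "pending"},
    {"at": "2022-03-21T15:54:09.932+00:00", "status": "in-progress"},
    {"at": "2022-03-21T15:54:40.449+00:00", "status": "completed"},
]


def state_durations(events):
    """Return the seconds spent in each status before the next event arrived."""
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["at"]))
    durations = {}
    for current, following in zip(ordered, ordered[1:]):
        start = datetime.fromisoformat(current["at"])
        end = datetime.fromisoformat(following["at"])
        durations[current["status"]] = (end - start).total_seconds()
    return durations


print(state_durations(events))
```

For the sample task, this reports roughly 29.5 seconds in pending and 30.5 seconds in in-progress. The final status has no following event, so it gets no duration.
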
Workflow Object Parameters
Workflow object parameters are defined in the following table.
Field | Description |
---|---|
id | Globally unique identifier for this specific workflow. |
orgSlug | Indicates which organization this workflow was generated from. |
pipelineId | Globally unique id of the pipeline that was triggered for this workflow. |
masterScriptNamespace | Namespace of the protocol used in this workflow. Can be any of the following: common, client, private. In the sample above, the private namespace appears as private-tetrascience. |
masterScriptSlug | Deprecated; duplicates protocolSlug. |
masterScriptVersion | Deprecated; duplicates protocolVersion. |
protocolSlug | Slug of the protocol used in this workflow. |
protocolVersion | Version of the protocol used by this workflow. |
protocol | Protocol definition that includes protocol parameter definitions and taskscript information for each step. |
protocol.protocolSchema | Version number of the schema. |
protocol.name | Name of the protocol. |
protocol.description | Description of the protocol. |
protocol.steps | List of the steps in the protocol. Each step includes its slug, description, type, script (namespace, slug, and version), and functionSlug. |
protocol.config | Configuration details. |
pipelineConfig | Pipeline configuration and pipeline parameter values that are passed to the workflow. These are: - notificationsConfig: Object that controls email notifications about the success or failure of the pipeline. It contains: - sendOnSuccessful: Boolean (true/false). Indicates whether a notification is sent when the pipeline succeeds. - sendOnFailed: Boolean (true/false). Indicates whether a notification is sent when the pipeline fails. - notificationEmailAddresses: Array of strings. Email addresses to notify on success and/or failure of the pipeline, depending on the two settings above. - pipelineName: Name of the pipeline. |
inputFile | Input parameter object that is passed into the taskscript as defined by the protocol parameters. (script.js passes it into the taskscript when calling workflow.runTask.) |
tasks[] | Array that contains information about the last execution of each step in a protocol. If a step ran multiple times (because of retries), this array contains information for only the last run of each step. See supersededTasks[] for previous runs and retries. Each element has the following fields: - slug: Slug for the given step (in the sample above, it matches the step slug in protocol.steps). - input: Input passed to the step. In the sample above, it contains an input_file object with the file's meta information (fileId, source, traceId, sourceId, sourceName, sourceType, integrationId, and integrationType), the file type (for example, s3file), the bucket where the file is stored, and its fileId, fileKey, version, customTags, and customMetadata. - retry: Indicates which retry run this entry represents. Because the items in tasks[] are the last run of each step, this is also the total number of retries (after the initial run) for the given task. The value ranges from 0 to 3: 0 indicates that the initial run of the script was successful, and 3 indicates that there was an initial run followed by 3 retries (4 runs in total). - events: List of events that track status changes for an individual taskscript execution (at the container level). Each event provides: - at: Timestamp of the event. - status: Event type. Possible values: pending (the step is waiting to be executed), in-progress (the step is currently executing), completed (the step completed successfully). - output: Taskscript result information for the given run. On successful completion, it contains whatever the taskscript returns on exit. For example, if the taskscript returns a list of files, output.result contains that list; if it returns a number, output.result is that number. |
tasks[].script | Detailed information about the taskscript: the list of files it contains, its slug, Docker image, version, language, maxCount, creation time (createdAt), functions, whether it has source (hasSource), namespace, timestamp, runnerType, whether the build log was saved (buildLogSaved), and build duration in milliseconds (buildDurationMs). |
tasks[].status | Status from the last event of the task. |
tasks[].taskId | Globally unique ID of the taskscript. |
tasks[].createdAt | Date/Time when the taskscript was started. |
tasks[].containerId | ID of the container that ran the taskscript. |
tasks[].functionSlug | Slug that identifies the taskscript function that was run. |
tasks[].lastUpdatedAt | Date/Time when this taskscript last had an event update. |
taskMemoryInMB | Starting memory for the taskscript container as defined by the taskscript configuration. |
cloudWatchUrl | CloudWatch location of logs for this run of the taskscript. |
supersededTasks[] | Contains run information for each retry of each step, excluding the last run of each step. The last (latest) run of each step is in the tasks[] field. |
supersededTasks[].retry | Indicates which retry this run represents. A number between 0 and 3. |
supersededTasks[].output | The taskscript result information for the given run. |
status | Final success/fail status of the script. |
createdAt | Date/Time when this workflow was started. |
lastUpdatedAt | Date/Time when this workflow had an event update. |
restarted | Indicates whether the workflow was restarted. |
masterScriptLogs | Indicates the location of the script logs. |
orchestratorId | Internal component processing id. |
output | Returned by script.js (in v3.1, this was called result). The standard output structure for root.output, supersededTasks[].output, and tasks[].output is as follows. Successful case: { "type": "success", "result": "anything that the taskscript or script.js returns; can be [], string, boolean, etc." } Error case: if a taskscript throws an exception, the output contains an error message: { "type": "error", "result": { "message": "error message details" } } |
events | List of events that track the status of the overall workflow (not specific to a taskscript). Each event provides: - at: Timestamp of the event. - status: Event type. Possible values: in-progress (the workflow is currently executing; generated just before container startup, so it includes container startup time), completed (the workflow completed successfully). |
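
Because root.output, tasks[].output, and supersededTasks[].output share the same type/result structure, a single small function can unwrap any of them. The sketch below is an illustration under stated assumptions: `unwrap_output` and `total_runs` are hypothetical helpers (not part of any SDK), and the trimmed `workflow` and `error_output` dictionaries only mimic the shapes described in the table above.

```python
# Hypothetical helpers for reading workflow output; not part of any SDK.

def unwrap_output(output):
    """Return output['result'] on success; raise on the error case."""
    if output["type"] == "success":
        return output["result"]
    if output["type"] == "error":
        raise RuntimeError(output["result"]["message"])
    raise ValueError(f"unexpected output type: {output['type']}")


def total_runs(task):
    """Initial run plus the retries recorded on the last run of a step."""
    return task["retry"] + 1


# Trimmed stand-in with the same shape as the sample workflow object.
workflow = {
    "tasks": [{
        "slug": "first-step-covid-json-ids",
        "retry": 0,
        "output": {"type": "success", "result": [{"type": "s3file"}]},
    }],
}

# Stand-in for the error case described in the output row.
error_output = {"type": "error", "result": {"message": "error message details"}}

for task in workflow["tasks"]:
    files = unwrap_output(task["output"])
    print(task["slug"], "runs:", total_runs(task), "files:", len(files))

try:
    unwrap_output(error_output)
except RuntimeError as err:
    print("failed:", err)
```

Raising an exception for the error case is a design choice of this sketch; a caller could just as well return the message and let the workflow consumer decide how to react.
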