Pipeline and Workflow Objects and Parameters

This topic shows examples of the pipeline and workflow objects, and defines the parameters in each object.

Pipeline Object and Parameters

Sample pipeline objects appear below.

{
    "id": "8592187a-d3ed-4410-8ffb-ee47826bebf7",
    "name": "HG-Covid-Data",
    "description": "Reads in Covid USA states Data",
    "triggerType": "custom",
    "triggerCondition": {
        "groupOperator": "AND",
        "groupLevel": 1,
        "groups": [
            {
                "groupLevel": 2,
                "groupOperator": "AND",
                "groups": [
                    {
                        "key": "sourceId",
                        "operator": "is",
                        "value": "4095be8f-40c6-487d-99ce-bef42682c36b"
                    }
                ]
            },
            {
                "groupLevel": 2,
                "groupOperator": "AND",
                "groups": [
                    {
                        "key": "category",
                        "operator": "is",
                        "value": "raw"
                    }
                ]
            }
        ]
    },
    "protocolSlug": "hg-prot-covid-data",
    "protocolVersion": "v2.1.0",
    "createdAt": "2021-11-08T19:00:02.701Z",
    "updatedAt": "2022-02-03T17:54:46.962Z",
    "pipelineConfig": {
        "notificationsConfig": {
            "sendOnSuccessful": true,
            "sendOnFailed": true,
            "notificationEmailAddresses": [
                "[email protected]",
                "[email protected]"
            ]
        }
    },
    "masterScriptNamespace": "private-tetrascience",
    "masterScriptSlug": "hg-prot-covid-data",
    "masterScriptVersion": "v2.1.0",
    "status": null,
    "standby": 0,
    "retryBehavior": null
}

{
    "id": "28fd27a9-c931-49fe-9d21-661a7503cee5",
    "name": "SDC Measurement RAW to IDS to Solace",
    "description": null,
    "triggerType": "custom",
    "triggerCondition": {
        "groupOperator": "AND",
        "groupLevel": 1,
        "groups": [
            {
                "groupLevel": 2,
                "groupOperator": "AND",
                "groups": [
                    {
                        "key": "tags",
                        "operator": "has a tag that is",
                        "value": "igor_sdc"
                    }
                ]
            },
            {
                "groupLevel": 2,
                "groupOperator": "AND",
                "groups": [
                    {
                        "key": "category",
                        "operator": "is",
                        "value": "raw"
                    }
                ]
            }
        ]
    },
    "protocolSlug": "sdc-measurement-raw-to-ids-push-to-solace",
    "protocolVersion": "v1.0.0",
    "createdAt": "2021-04-22T20:20:24.968Z",
    "updatedAt": "2022-02-22T06:38:40.794Z",
    "pipelineConfig": {
        "solace-username": "solace-cloud-client",
        "solace-password": {
            "ssm": "/development/tetrascience/org-secrets/ea-solace",
            "secret": true
        },
        "solace-url": "https://mr1oqbbo5q4w2z.messaging.solace.cloud",
        "solace-port": "9443",
        "solace-topic": "SDC.measurement.results",
        "ignore-ssl": false,
        "notificationsConfig": {}
    },
    "masterScriptNamespace": "common",
    "masterScriptSlug": "sdc-measurement-raw-to-ids-push-to-solace",
    "masterScriptVersion": "v1.0.0",
    "status": "disabled",
    "standby": 0,
    "retryBehavior": null,
    "priority": null,
    "maxSlotLimit": null,
    "taskScriptTimeoutMins": null
}
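
The triggerCondition in both samples is a two-level tree: an outer group whose groups are inner groups, each holding leaf conditions with a key, an operator, and a value. As a rough illustration of how such a tree could be read, the sketch below evaluates one against a dictionary of file attributes. The function names (matches, _leaf_matches), the shape of the file_attrs dictionary, and the meaning given to each operator are assumptions made for this example. Only the two operators that appear in the samples are handled, and this is not the platform's own matching logic.

```python
from typing import Any, Dict


def _leaf_matches(cond: Dict[str, Any], file_attrs: Dict[str, Any]) -> bool:
    """Evaluate one leaf condition of the form {"key", "operator", "value"}."""
    actual = file_attrs.get(cond["key"])
    operator = cond["operator"]
    if operator == "is":
        return actual == cond["value"]
    if operator == "has a tag that is":
        # Assumed semantics: the attribute holds a list of tag strings.
        return cond["value"] in (actual or [])
    raise ValueError(f"operator not covered by this sketch: {operator!r}")


def matches(node: Dict[str, Any], file_attrs: Dict[str, Any]) -> bool:
    """Recursively evaluate a group (has "groups") or a leaf condition."""
    if "groups" not in node:
        return _leaf_matches(node, file_attrs)
    results = [matches(child, file_attrs) for child in node["groups"]]
    if node["groupOperator"] == "AND":
        return all(results)
    if node["groupOperator"] == "OR":
        return any(results)
    raise ValueError(f"unknown groupOperator: {node['groupOperator']!r}")


# triggerCondition copied from the first sample pipeline object.
trigger_condition = {
    "groupOperator": "AND",
    "groupLevel": 1,
    "groups": [
        {
            "groupLevel": 2,
            "groupOperator": "AND",
            "groups": [
                {"key": "sourceId", "operator": "is",
                 "value": "4095be8f-40c6-487d-99ce-bef42682c36b"}
            ],
        },
        {
            "groupLevel": 2,
            "groupOperator": "AND",
            "groups": [{"key": "category", "operator": "is", "value": "raw"}],
        },
    ],
}

# Hypothetical file attributes, flattened into one dictionary for the example.
raw_file = {"sourceId": "4095be8f-40c6-487d-99ce-bef42682c36b", "category": "raw"}
other_file = dict(raw_file, category="not-raw")  # made-up category value

print(matches(trigger_condition, raw_file))    # True
print(matches(trigger_condition, other_file))  # False
```

Because both levels use AND here, the file must satisfy every leaf condition. Changing the outer groupOperator to OR would let either inner group trigger the pipeline on its own.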

Pipeline Object Parameters

Pipeline object parameters are defined in the following table.

Field: Description
id: Globally unique identifier for this specific pipeline.
name: User-defined, human-friendly name of the pipeline.
description: Description of the pipeline.
triggerType: Indicates the schema used for the triggerCondition.
triggerCondition: Defines the trigger conditions of the pipeline, based on file source type, category, metadata, tags, and labels. If a pipeline is created or updated through the UI, the triggerExpression generates the triggerCondition that the pipeline API uses. If a pipeline is created or updated directly through the API, the triggerCondition is required, and the triggerExpression is generated from it.
triggerCondition.groupLevel: Indicates which hierarchical level the operator and group belong to. Starts at 1 and goes up to 2.
triggerCondition.groupOperator: Logical operator applied between groups at the same level. Possible values: AND, OR.
triggerCondition.groups: List of trigger conditions that the groupOperator is applied to.
protocolSlug: Unique slug that identifies the protocol used in this pipeline.
protocolVersion: Version of the protocol used by this pipeline. Must always be in the format vX.Y.Z, for example v2.1.3.
createdAt: Date and time when this pipeline was created.
updatedAt: Date and time of the last change to a configuration parameter of this pipeline.
pipelineConfig: Contains the following pipeline configuration parameters:

- notificationsConfig: Email notification settings for the success or failure of the pipeline. Contains the following fields:
  - sendOnSuccessful: Boolean (true/false). Indicates whether a notification is sent if the pipeline succeeds.
  - sendOnFailed: Boolean (true/false). Indicates whether a notification is sent if the pipeline fails.
  - notificationEmailAddresses: Array of strings. List of email addresses to notify on success and/or failure of the pipeline, depending on the two settings above.
masterScriptNamespace: Namespace of the protocol used in this pipeline. Can be any of the following: common, client, private.
masterScriptSlug: Deprecated; duplicates protocolSlug.
masterScriptVersion: Deprecated; duplicates protocolVersion.
status: Indicates whether the pipeline is active. Possible values:

- disabled: Indicates that the pipeline is inactive.
- null: Indicates that the pipeline is active (enabled).

standby: Indicates how many instances of resources are kept on hot standby to process files for this protocol. Must be between 0 and 5. Keeping an instance on hot standby can add financial cost.
retryBehavior: Controls retries, which are applied per step (not at the workflow level). Each step has a starting memory that defaults to 512 MB but can be overridden in the step's config.json. The maximum memory is 30 GB. Possible values:

- null (API value), shown as "Always retry 3 times (default)" in the UI: Retries the step on any type of error. Available memory is doubled with each retry.
- oom_only (API value), shown as "Retry 3 times (after OOM error only)" in the UI: Retries the step ONLY on an out-of-memory (OOM) error. Available memory is doubled with each retry.
- off (API value), shown as "No retry" in the UI: Does not retry under any circumstance.
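
To make the memory doubling concrete, the sketch below lists the memory a step could receive on its initial run and on each retry for a given retryBehavior value. The function name memory_schedule, the conversion of 30 GB to 30 * 1024 MB, and the choice to clamp at the maximum rather than stop retrying are assumptions for this example; the table above states only the default, the doubling, and the maximum.

```python
from typing import List, Optional

MAX_MEMORY_MB = 30 * 1024  # 30 GB maximum; 1 GB = 1024 MB is assumed here
MAX_RETRIES = 3            # "retry 3 times" from the table above


def memory_schedule(retry_behavior: Optional[str], start_mb: int = 512) -> List[int]:
    """Return the memory (in MB) for the initial run and each possible retry."""
    if retry_behavior not in (None, "oom_only", "off"):
        raise ValueError(f"unknown retryBehavior: {retry_behavior!r}")
    # "off" means no retries; None and "oom_only" both allow up to 3 retries
    # (for "oom_only" a retry happens only after an out-of-memory error).
    retries = 0 if retry_behavior == "off" else MAX_RETRIES
    schedule = []
    memory = start_mb
    for _ in range(retries + 1):
        # Clamping at the maximum is an assumption of this sketch.
        schedule.append(min(memory, MAX_MEMORY_MB))
        memory *= 2
    return schedule


print(memory_schedule(None))                       # [512, 1024, 2048, 4096]
print(memory_schedule("off"))                      # [512]
print(memory_schedule("oom_only", start_mb=8192))  # [8192, 16384, 30720, 30720]
```

With the default starting memory, a step that exhausts all three retries ends at 4096 MB, far below the 30 GB maximum. The maximum only matters when config.json sets a large starting value.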

Workflow Object and Parameters

A sample workflow object appears below.

{
    "id": "4d78df0e-6990-4d89-bca3-73261a99ca66",
    "orgSlug": "tetrascience",
    "pipelineId": "8592187a-d3ed-4410-8ffb-ee47826bebf7",
    "masterScriptNamespace": "private-tetrascience",
    "masterScriptSlug": "hg-prot-covid-data",
    "masterScriptVersion": "v2.1.0",
    "protocolSlug": "hg-prot-covid-data",
    "protocolVersion": "v2.1.0",
    "protocol": {
        "protocolSchema": "v2",
        "name": "HG Protocol COVID-JSON to IDS",
        "description": "",
        "steps": [
            {
                "slug": "first-step-covid-json-ids",
                "description": "Generates COVID data row (IDS JSON) for each state.",
                "type": "generator",
                "script": {
                    "namespace": "private-tetrascience",
                    "slug": "hg-tscr-covid-data",
                    "version": "v2.0.1"
                },
                "functionSlug": "main"
            }
        ],
        "config": []
    },
    "pipelineConfig": {
        "notificationsConfig": {
            "sendOnSuccessful": true,
            "sendOnFailed": true,
            "notificationEmailAddresses": [
                "[email protected]",
                "[email protected]"
            ]
        },
        "pipelineName": "HG-Covid-Data"
    },
    "inputFile": {
        "meta": {
            "fileId": "18384a5f-9007-4028-85cd-e191a04a71d5",
            "source": {
                "box": {
                    "id": "56d9293c-2dc7-4133-b21b-c1468aefb41f",
                    "size": 2559900,
                    "filePath": "/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json",
                    "integrationSource": "4095be8f-40c6-487d-99ce-bef42682c36b"
                },
                "type": "box"
            },
            "traceId": "18384a5f-9007-4028-85cd-e191a04a71d5",
            "sourceId": "4095be8f-40c6-487d-99ce-bef42682c36b",
            "sourceName": "HG-Box-Covid-Data-States",
            "sourceType": "box",
            "integrationId": "56d9293c-2dc7-4133-b21b-c1468aefb41f",
            "integrationType": "box"
        },
        "type": "s3file",
        "bucket": "ts-platform-dev-datalake",
        "fileId": "18384a5f-9007-4028-85cd-e191a04a71d5",
        "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/RAW/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json",
        "version": "1kj7Fdl63JaVE_iBcSeS1EGKLX6UNdDR",
        "customTags": [],
        "customMetadata": {}
    },
    "tasks": [
        {
            "slug": "first-step-covid-json-ids",
            "input": {
                "input_file": {
                    "meta": {
                        "fileId": "18384a5f-9007-4028-85cd-e191a04a71d5",
                        "source": {
                            "box": {
                                "id": "56d9293c-2dc7-4133-b21b-c1468aefb41f",
                                "size": 2559900,
                                "filePath": "/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json",
                                "integrationSource": "4095be8f-40c6-487d-99ce-bef42682c36b"
                            },
                            "type": "box"
                        },
                        "traceId": "18384a5f-9007-4028-85cd-e191a04a71d5",
                        "sourceId": "4095be8f-40c6-487d-99ce-bef42682c36b",
                        "sourceName": "HG-Box-Covid-Data-States",
                        "sourceType": "box",
                        "integrationId": "56d9293c-2dc7-4133-b21b-c1468aefb41f",
                        "integrationType": "box"
                    },
                    "type": "s3file",
                    "bucket": "ts-platform-dev-datalake",
                    "fileId": "18384a5f-9007-4028-85cd-e191a04a71d5",
                    "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/RAW/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json",
                    "version": "1kj7Fdl63JaVE_iBcSeS1EGKLX6UNdDR",
                    "customTags": [],
                    "customMetadata": {}
                }
            },
            "retry": 0,
            "events": [
                {
                    "at": "2022-03-21T15:53:40.403+00:00",
                    "status": "pending"
                },
                {
                    "at": "2022-03-21T15:54:09.932+00:00",
                    "status": "in-progress"
                },
                {
                    "at": "2022-03-21T15:54:40.449+00:00",
                    "status": "completed"
                }
            ],
            "output": {
                "type": "success",
                "result": [
                    {
                        "type": "s3file",
                        "bucket": "ts-platform-dev-datalake",
                        "fileId": "59c769b3-f73b-41f9-8d76-9dc04fb5da9d",
                        "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AK-2021-11-19.json",
                        "version": "2S3R7BGUAUGpRD8af7uxhp5lG56EAap_"
                    },
                    {
                        "type": "s3file",
                        "bucket": "ts-platform-dev-datalake",
                        "fileId": "7ab509dc-8f1d-48b2-8e73-1c682296b0f0",
                        "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AL-2021-11-19.json",
                        "version": "PcB6tKOfnNds1ooPWQaVNieitqWSYiLd"
                    },
                    {
                        "type": "s3file",
                        "bucket": "ts-platform-dev-datalake",
                        "fileId": "99eb82e0-e89d-4d69-8f34-b9df3114c589",
                        "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AR-2021-11-19.json",
                        "version": "5p4GQ6AhlIn_EPuSaMOivPRnxXYbO9Z1"
                    },
 
<...truncated for doc brevity>

                    {
                        "type": "s3file",
                        "bucket": "ts-platform-dev-datalake",
                        "fileId": "944f7d30-3193-493f-bf05-0a1681df6c4a",
                        "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-WY-2021-11-19.json",
                        "version": "2P5hobBY8GwPKS8jkDPeELqPdc0zhz2M"
                    }
                ]
            },
            "script": {
                "list": [
                    "requirements.txt",
                    "README-internal.md",
                    "config.json",
                    "README.md",
                    "Pipfile",
                    "main.py",
                    "Pipfile.lock",
                    "__test__/test_business_logic.py",
                    "__test__/test_config.py",
                    "__test__/test_integration.py",
                    "__test__/__pycache__/test_business_logic.cpython-37.pyc",
                    "__test__/__pycache__/test_config.cpython-37.pyc",
                    "__test__/__pycache__/test_integration.cpython-37.pyc",
                    "__test__/data/expected.json",
                    "__test__/data/input.json",
                    "__pycache__/main.cpython-37.pyc"
                ],
                "slug": "hg-tscr-covid-data",
                "docker": {
                    "image": "706717599419.dkr.ecr.us-east-2.amazonaws.com/ts-platform-development-task-private-tetrascience-hg-tscr-covid-data@sha256:c8efc76762d45bf74fe2ec9ae80f817224c715b8b2892b877862ce118b6980bb"
                },
                "version": "v2.0.1",
                "language": "python",
                "maxCount": 30,
                "createdAt": "2022-03-16T14:25:41.393Z",
                "functions": [
                    {
                        "slug": "main",
                        "function": "main.main"
                    }
                ],
                "hasSource": true,
                "namespace": "private-tetrascience",
                "timestamp": 1647440600360,
                "runnerType": "ecs",
                "buildLogSaved": true,
                "buildDurationMs": 78820
            },
            "status": "completed",
            "taskId": "3ec7ffec-ed45-440d-9375-0288f835b25e",
            "options": {},
            "createdAt": "2022-03-21T15:53:40.387+00:00",
            "containerId": "755298eb-da0a-47fb-a3a5-e02d02fab7d9",
            "functionSlug": "main",
            "lastUpdatedAt": "2022-03-21T15:54:40.455+00:00",
            "taskMemoryInMB": 512,
            "cloudWatchUrl": "https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#logs-insights:queryDetail=~(end~'2022-03-21T15*3a59*3a40.455Z~start~'2022-03-21T15*3a48*3a40.387Z~timeType~'ABSOLUTE~tz~'Local~editorString~'filter*20*60taskId*60*3d*273ec7ffec-ed45-440d-9375-0288f835b25e*27*20and*20*60containerId*60*3d*27755298eb-da0a-47fb-a3a5-e02d02fab7d9*27*0a*7c*20fields*20*40timestamp*2c*20*40message*0a*7c*20sort*20*40timestamp*20asc~source~(~'*2fecs*2ftaskscripts*2fts-platform*2fcontainers))"
        }
    ],
    "supersededTasks": [],
    "status": "completed",
    "events": [
        {
            "at": "2022-03-21T15:53:40.280+00:00",
            "status": "in-progress"
        },
        {
            "at": "2022-03-21T15:54:40.465+00:00",
            "status": "completed"
        }
    ],
    "createdAt": "2022-03-21T15:53:40.010Z",
    "lastUpdatedAt": "2022-03-21T15:54:40.467Z",
    "restarted": false,
    "masterScriptLogs": [],
    "retryBehavior": null,
    "orchestratorId": null,
    "output": {
        "type": "success",
        "result": [
            {
                "type": "s3file",
                "bucket": "ts-platform-dev-datalake",
                "fileId": "59c769b3-f73b-41f9-8d76-9dc04fb5da9d",
                "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AK-2021-11-19.json",
                "version": "2S3R7BGUAUGpRD8af7uxhp5lG56EAap_"
            },
            {
                "type": "s3file",
                "bucket": "ts-platform-dev-datalake",
                "fileId": "7ab509dc-8f1d-48b2-8e73-1c682296b0f0",
                "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-AL-2021-11-19.json",
                "version": "PcB6tKOfnNds1ooPWQaVNieitqWSYiLd"
            },
         
 ... <truncated>
 
            {
                "type": "s3file",
                "bucket": "ts-platform-dev-datalake",
                "fileId": "944f7d30-3193-493f-bf05-0a1681df6c4a",
                "fileKey": "tetrascience/4095be8f-40c6-487d-99ce-bef42682c36b/IDS/All Files/TetraScience/coviddata/statesdata/covid_states_tseries_2021_11_19.json/covid-state-tseries-WY-2021-11-19.json",
                "version": "2P5hobBY8GwPKS8jkDPeELqPdc0zhz2M"
            }
        ]
    }
}
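
The events arrays in the sample record each status change with a timestamp, so the time a task spent waiting and running can be derived from them. The sketch below does this for the task events copied from the sample. The helper name status_times is hypothetical, and the code assumes each status appears at most once per events array, as it does in the sample.

```python
from datetime import datetime
from typing import Dict, List


def status_times(events: List[Dict[str, str]]) -> Dict[str, datetime]:
    """Map each status to the timestamp at which it was recorded."""
    return {event["status"]: datetime.fromisoformat(event["at"]) for event in events}


# tasks[0].events copied from the sample workflow object.
task_events = [
    {"at": "2022-03-21T15:53:40.403+00:00", "status": "pending"},
    {"at": "2022-03-21T15:54:09.932+00:00", "status": "in-progress"},
    {"at": "2022-03-21T15:54:40.449+00:00", "status": "completed"},
]

times = status_times(task_events)
wait_s = (times["in-progress"] - times["pending"]).total_seconds()
run_s = (times["completed"] - times["in-progress"]).total_seconds()

print(f"waited {wait_s:.3f}s, ran {run_s:.3f}s")  # waited 29.529s, ran 30.517s
```

The root-level createdAt and lastUpdatedAt values end in Z rather than +00:00. Python versions before 3.11 do not accept that suffix in datetime.fromisoformat, so those fields would need the Z replaced with +00:00 first.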

Workflow Object Parameters

Workflow object parameters are defined in the following table.

Field: Description
id: Globally unique identifier for this specific workflow.
orgSlug: Indicates which organization this workflow was generated from.
pipelineId: Globally unique ID of the pipeline that was triggered for this workflow.
masterScriptNamespace: Namespace of the protocol used in this workflow. Can be any of the following: common, client, private.
masterScriptSlug: Deprecated; duplicates protocolSlug.
masterScriptVersion: Deprecated; duplicates protocolVersion.
protocolSlug: Slug of the protocol used in this workflow.
protocolVersion: Version of the protocol used by this workflow.
protocol: Protocol definition that includes protocol parameter definitions and taskscript information for each step.
protocol.protocolSchema: Version number of the schema.
protocol.name: Name of the protocol.
protocol.description: Description of the protocol.
protocol.steps: List of the protocol's steps. Each step has a slug, description, type, script, and functionSlug.
protocol.config: Configuration details.
pipelineConfig: Pipeline configuration and pipeline parameter value settings that are passed to the workflow. These are:

- notificationsConfig: Email notification settings for the success or failure of the pipeline. Contains the following fields:
  - sendOnSuccessful: Boolean (true/false). Indicates whether a notification is sent if the pipeline succeeds.
  - sendOnFailed: Boolean (true/false). Indicates whether a notification is sent if the pipeline fails.
  - notificationEmailAddresses: Array of strings. List of email addresses to notify on success and/or failure of the pipeline, depending on the two settings above.
- pipelineName: Name of the pipeline.

inputFile: Input parameter object that is passed into the taskscript as defined by the protocol parameters. It is passed into the taskscript by script.js when calling workflow.runTask.
tasks[]: An array that contains information about the last execution of each step in a protocol. If a step was run multiple times (because of retries), this array contains information for ONLY the last run of each step. See supersededTasks[] for previous runs and retries. Each element in the array has the following fields:

- slug: The taskscript slug for the given step.
- input: Contains the input_file object. It includes the file metadata (fileId, source, source ID, size, file path, integration source, and source type, plus the traceId, sourceId, sourceName, sourceType, integrationId, and integrationType), the type of file (for example, s3file), the bucket it is stored in, and the fileId, fileKey, version, customTags, and customMetadata.
- retry: Indicates which retry run this represents. Because the items in tasks[] are the last run of each step, this also indicates the total number of retries (after the initial run) that were run for the given task. A number from 0 to 3. A value of 0 indicates that the initial run of the script was successful. A value of 3 means that there was an initial run of the script and 3 additional retries (4 runs in total).
- events: List of events that track status changes for individual taskscript executions (at the container level). Each event provides the following information:
  - at: Timestamp of the event.
  - status: Indicates the event type: pending (the step is waiting to be executed), in-progress (the step is currently executing), or completed (the step completed successfully).
- output: The taskscript result information for the given run. On successful completion, this contains whatever the taskscript returns on exit. For example, if the taskscript returns a list of files, then output.result contains a list of files. If the taskscript returns a number, then output.result is that number.

tasks[].script: Detailed information about the taskscript: the list of its files, the slug, Docker image information, version, language, maxCount, when the script was created, function information, whether it hasSource, the namespace, timestamp, runnerType, and the build duration in milliseconds. It also indicates whether the build log was saved.
tasks[].status: Last status event.
tasks[].taskId: Globally unique ID of the taskscript run.
tasks[].createdAt: Date and time when the taskscript was started.
tasks[].containerId: ID of the container that ran the taskscript.
tasks[].functionSlug: The taskscript code function name.
tasks[].lastUpdatedAt: Date and time when this taskscript last had an event update.
tasks[].taskMemoryInMB: Starting memory for the taskscript container, as defined by the taskscript configuration.
tasks[].cloudWatchUrl: CloudWatch location of the logs for this run of the taskscript.
supersededTasks[]: Contains run information for each retry of each step, excluding the last run of each step. The last (latest) run of each step is in the tasks[] field.
supersededTasks[].retry: Indicates which retry this run represents. A number between 0 and 3.
supersededTasks[].output: The taskscript result information for the given run.
status: Final success or failure status of the script.
createdAt: Date and time when this workflow was started.
lastUpdatedAt: Date and time when this workflow last had an event update.
restarted: Indicates whether the workflow was restarted.
masterScriptLogs: Indicates the location of the script logs.
orchestratorId: Internal component processing ID.
output: The value returned by script.js (in v3.1 this was called result). The standard output structure for the root output, supersededTasks[].output, and tasks[].output is as follows:

- Successful case: { "type": "success", "result": "anything that the taskscript or script.js returns; can be [], string, boolean, etc." }
- Error case: If a taskscript throws an exception, the output contains an error message: { "type": "error", "result": { "message": "error message details" } }

events: List of events that track the status of the workflow as a whole (not specific to a taskscript). Each event has the following fields:

- at: Timestamp of the event.
- status: Indicates the event type:
  - in-progress: The workflow is currently executing. This event is generated just before container startup, so it includes container startup time.
  - completed: The workflow completed successfully.
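
Because the root output, tasks[].output, and supersededTasks[].output all share the same success/error structure, a caller can handle them with one helper. The sketch below returns the result for a success and raises an exception for an error. The names unwrap_output and WorkflowOutputError are hypothetical, and the fallback message used when an error carries no message field is an assumption of this example.

```python
from typing import Any, Dict


class WorkflowOutputError(Exception):
    """Raised when an output object reports type "error"."""


def unwrap_output(output: Dict[str, Any]) -> Any:
    """Return output["result"] on success; raise on an error output."""
    kind = output.get("type")
    if kind == "success":
        # The result can be anything the taskscript or script.js returned.
        return output.get("result")
    if kind == "error":
        result = output.get("result")
        message = result.get("message") if isinstance(result, dict) else None
        raise WorkflowOutputError(message or "no error message provided")
    raise ValueError(f"unexpected output type: {kind!r}")


# Abbreviated success output in the shape used by the sample workflow.
ok = {
    "type": "success",
    "result": [{"type": "s3file", "fileId": "59c769b3-f73b-41f9-8d76-9dc04fb5da9d"}],
}
print(len(unwrap_output(ok)))  # 1

# Error output in the shape described above.
failed = {"type": "error", "result": {"message": "error message details"}}
try:
    unwrap_output(failed)
except WorkflowOutputError as exc:
    print(f"task failed: {exc}")  # task failed: error message details
```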