Set Up and Edit Pipelines

The Manage Pipelines page displays all of your pipelines in one place so that you can set them up, view them, and edit them. You can also create, enable, and disable pipelines on this page. For basic information on pipelines and pipeline terminology, see the pipeline overview and terminology page.

📘

NOTE:

Only administrators of an organization can set up, edit, disable, and enable pipelines.

Access the Manage Pipelines Page

To see the Manage Pipelines page, do the following.

  1. In the TDP, click the main menu in the upper left corner of the page.
  2. Select the Pipelines Design option. The Manage Pipelines page appears.

Manage Pipelines Page

The Manage Pipelines page shows:

  • The name of the pipeline
  • A description of the pipeline
  • Information about the trigger
  • A graphical representation of the steps in the pipeline
  • Which pipelines are active and which are disabled

Set up a Pipeline

Setting up a pipeline involves four steps.

  • Step 1: Define Trigger Conditions
  • Step 2: Select the Protocol
  • Step 3: Set Notifications
  • Step 4: Finalize the Details

Step 1: Define Trigger Conditions

Trigger conditions indicate the criteria a file must meet for pipeline processing to begin. Trigger conditions can be simple or complex.

  • For simple trigger conditions, files must meet just one criterion, like a data file that has a specific metadata tag.
  • Complex trigger conditions are a combination of several trigger conditions, such as a data file that has a certain metadata tag and is also in a specific file path. Conditions can be combined using standard Boolean operators (AND/OR) and can even be nested.
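The way simple and nested conditions combine can be sketched in a few lines of Python. This is only an illustration of the AND/OR logic described above, not the TDP's own implementation: the class names, the metadata keys (`file_category`, `tags`, `file_path`), and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    """A single criterion, such as 'File Category is RAW' (hypothetical model)."""
    source_type: str       # assumed metadata key, e.g. "file_category"
    value: str
    negate: bool = False   # False means "is", True means "is not"

    def matches(self, file_meta: dict) -> bool:
        actual = file_meta.get(self.source_type)
        # Multi-valued fields such as tags are treated as membership tests.
        if isinstance(actual, (list, set, tuple)):
            hit = self.value in actual
        else:
            hit = actual == self.value
        return hit != self.negate


@dataclass
class Group:
    """A set of conditions (or nested groups) joined by AND or OR."""
    operator: str
    children: list = field(default_factory=list)

    def __post_init__(self):
        if self.operator not in ("AND", "OR"):
            raise ValueError("operator must be 'AND' or 'OR'")

    def matches(self, file_meta: dict) -> bool:
        results = (child.matches(file_meta) for child in self.children)
        return all(results) if self.operator == "AND" else any(results)


# Complex trigger: a RAW file that has a given tag OR sits at a given path.
trigger = Group("AND", [
    Condition("file_category", "RAW"),
    Group("OR", [
        Condition("tags", "hplc"),
        Condition("file_path", "/lab-a/run1.raw"),
    ]),
])

sample = {"file_category": "RAW", "tags": ["hplc", "qc"], "file_path": "/other"}
print(trigger.matches(sample))  # True
```

Because a group can contain other groups, the same `matches` call handles any depth of nesting.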

To define trigger conditions, do the following.

  1. On the Manage Pipelines page, click the New Pipeline button. The New Pipeline page appears.

Defining a Trigger

  2. Select the trigger source type from the drop-down menu.
     • Trigger source types are divided into two categories: Platform Metadata and Custom Metadata.
     • Platform Metadata source types are available to all TDP users.
     • Custom Metadata source types are available to your organization.
To learn more about the different trigger source types, see the table below.

List of Trigger Source Types (Platform Metadata) and Descriptions

| Trigger Source Type | Description |
| --- | --- |
| Source Type | Indicates the instrument that generated the data. |
| Source | Source that loaded the data into the data lake (e.g. specific Agents, API upload, box upload). |
| Pipeline | Indicates the name of a pipeline. |
| IDS | Indicates the IDS schema. |
| IDS Type | Indicates the type of IDS (e.g. lcuv_empower). |
| File Path | The file path in the data lake. |
| File Category | File category. Options are: RAW (sourced directly from an instrument), IDS (harmonized JSON), and PROCESSED (auxiliary data extracted from a RAW file). |
| Tags | Tags available to the organization. |

  3. In the value field, type the name or select an option from the drop-down menu.
  4. Specify whether the item must match the value by selecting is or is not from the drop-down menu. Note that some source types have other options.

Adding a trigger condition

  5. To add another condition to your trigger (for example, that a certain pipeline has already run), click Add Field and repeat steps 2-4.
  6. If you want the pipeline to run only when the file meets both trigger conditions, select "Matches All-AND". If you want the pipeline to run when the file meets at least one trigger condition, select "Matches All-ANY".
  7. To nest trigger conditions, select Add Field Group, then repeat steps 2-6.
  8. When complete, click the Next button.
  9. Go to the next step: Step 2: Select the Protocol.

Step 2: Select the Protocol

After you have defined a trigger, select and configure the protocol.

Many different protocols are available for you to use. Protocols are divided into two basic categories.

To select a protocol, complete the following steps.

  1. In the Select Protocol section of the Manage Pipelines page, scroll down the list and select a protocol. You can also search by entering text in the Search field.

Select Protocol

  2. Enter the configuration options for the protocol, if there are any.

Sample Configuration Options for Empower to IDS, v3.3.0 Protocol

  3. If you want more details on the script, click the View Details button to see the protocol.json and script.js files. The protocol.json file defines the protocol and briefly describes the steps it runs and their configurations. The script.js file shows the workflow.

Protocol.json and Script.js files for the pipeline

  4. Click the Select this Protocol button.
  5. Click the Next button.
  6. Go to the next step: Step 3: Set Notifications.

Step 3: Set Notifications

After you have defined your trigger and selected the protocol, set notification options. You can determine when to send notifications and who you want to send them to.

  1. In the Set Notifications section of the Manage Pipelines page, decide whether you want to send email notifications when the pipeline runs successfully, when it fails, or both.

Set Notifications

  • If you want to send an email when the pipeline completes successfully, slide the Send on successful pipelines slider to the right.
  • If you want to get an email when the pipeline fails, slide the Send on failed pipelines slider to the right.

Notification sliders

  2. Indicate where you want notifications to be sent by adding an email address. Click "Add an e-mail address", then enter the email address in the text field that appears.

Email Address Added

📘

NOTE:

For ease of maintenance, use a group alias instead of individual email addresses.

  3. If you need to add more email addresses, repeat step 2.
  4. Click the Next button.
  5. Go to the next step: Step 4: Finalize the Details.
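The effect of the two sliders and the address list can be summarized as a small decision function. The sketch below is illustrative only; `NotificationSettings` and `recipients_for` are hypothetical names, and the address is a placeholder group alias.

```python
from dataclasses import dataclass, field


@dataclass
class NotificationSettings:
    """Hypothetical model of the Set Notifications section."""
    send_on_success: bool = False   # "Send on successful pipelines" slider
    send_on_failure: bool = False   # "Send on failed pipelines" slider
    recipients: list = field(default_factory=list)


def recipients_for(settings: NotificationSettings, run_succeeded: bool) -> list:
    """Return the addresses that should be emailed for one pipeline run."""
    wanted = settings.send_on_success if run_succeeded else settings.send_on_failure
    return list(settings.recipients) if wanted else []


# Notify a group alias on failures only.
settings = NotificationSettings(
    send_on_failure=True,
    recipients=["lab-pipelines@example.com"],
)
print(recipients_for(settings, run_succeeded=False))
print(recipients_for(settings, run_succeeded=True))
```

With only the failure slider on, a successful run produces an empty recipient list, so no email is sent.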

Step 4: Finalize the Details

After you have defined your trigger, selected the protocol, and set notifications, the last step is to provide details about the pipeline, such as its name, description, whether it should be active (enabled), and how many standby instances you want to include (if any).

To finalize the details, complete the following steps.

  1. Enter the name for the pipeline.
  2. Enter the pipeline description.
  3. Choose whether you want the pipeline to be available for processing. If you want the pipeline to start running as soon as files that meet the trigger conditions are ingested into the data lake, move the Enabled slider to the right. Otherwise, leave it as is (slid to the left).

Finalize Details

  4. If your schema supports it, you can also indicate that you want to have up to 5 standby container instances.
  5. When complete, click the Create Pipeline button.
  6. If the pipeline has been enabled, it will start when the trigger conditions are met.
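The details collected in this step can be pictured as a small record with one constraint from the text: at most 5 standby instances. The following sketch is a hypothetical model for illustration; the field names are assumptions and do not reflect the TDP's actual data format.

```python
from dataclasses import dataclass

MAX_STANDBY_INSTANCES = 5  # limit stated in this step


@dataclass
class PipelineDetails:
    """Hypothetical model of the Finalize the Details form."""
    name: str
    description: str = ""
    enabled: bool = False        # Enabled slider, off (left) by default
    standby_instances: int = 0   # 0 means no standby containers

    def __post_init__(self):
        if not self.name.strip():
            raise ValueError("a pipeline needs a name")
        if not 0 <= self.standby_instances <= MAX_STANDBY_INSTANCES:
            raise ValueError(
                f"standby_instances must be between 0 and {MAX_STANDBY_INSTANCES}"
            )


details = PipelineDetails(
    name="Empower to IDS",
    description="Harmonize RAW Empower files",
    enabled=True,
    standby_instances=2,
)
print(details)
```

Requesting six standby instances, or leaving the name blank, raises a `ValueError` in this model.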

Edit Pipelines

To edit a pipeline, complete the following steps.

  1. On the Manage Pipelines page, click the pipeline you want to edit. The Edit Pipeline page appears.

Edit Pipeline

  2. Click the edit button to the right of the section you want to edit: Trigger, Protocol, Notifications, or Details.
  3. When complete, click the Save button to save that section.
  4. Click Save to save the entire pipeline.

Enable an Existing Pipeline

For a pipeline to run if a file meets the trigger condition, you'll need to enable it. To do this, complete the following steps.

  1. On the Manage Pipelines page, click the pipeline. The Edit Pipeline page appears.
  2. Click the edit button to the right of the Details section.
  3. Move the Enabled slider to the right.
  4. When complete, click the Save button to save that section.
  5. Click Save to save the entire pipeline.

Disable an Existing Pipeline

To make a pipeline unavailable, disable it. To do this, complete the following steps.

  1. On the Manage Pipelines page, click the pipeline. The Edit Pipeline page appears.
  2. Click the edit button to the right of the Details section.
  3. Move the Enabled slider to the left.
  4. When complete, click the Save button to save that section.
  5. Click Save to save the entire pipeline.

Create Standby Instances to Speed Up Processing

📘

NOTE:

Additional charges might be incurred if you choose to use this feature.

Pipeline code runs in one or more containers. A container is a standalone package of software that includes everything needed to execute a program, such as code, system tools, system libraries, and configuration or runtime settings. Containers ensure that programs run the same way, even in different environments. Starting a container can take a minute or so.

To improve efficiency, a pipeline reuses the same container for multiple steps that use the same task script. Instead of initializing and starting a container each time a step runs, the pipeline initializes and starts one container per task script. However, if a container is idle for a while (about 15 minutes), it is automatically stopped. If that container is needed afterward, it must be started again.
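To make the reuse and idle-timeout behavior concrete, here is a rough simulation that counts how many container starts a sequence of steps would incur. It is a simplified sketch under stated assumptions: the 15-minute timeout is treated as exact, each task script gets one container, and standby is modeled as "the container is always already running". The function and script names are hypothetical.

```python
IDLE_TIMEOUT_MINUTES = 15  # "about 15 minutes" in the text, treated as exact here


def count_cold_starts(steps, standby=False):
    """Count container starts for a list of (start_minute, task_script) steps.

    A step needs a cold start when no container for its task script has been
    used yet, or when that container sat idle longer than the timeout.
    With standby=True, containers are assumed to be kept running.
    """
    last_used = {}   # task script -> minute its container was last used
    cold_starts = 0
    for minute, script in sorted(steps):
        previous = last_used.get(script)
        is_cold = previous is None or minute - previous > IDLE_TIMEOUT_MINUTES
        if is_cold and not standby:
            cold_starts += 1
        last_used[script] = minute
    return cold_starts


steps = [(0, "parse"), (2, "parse"), (5, "harmonize"), (30, "parse")]
print(count_cold_starts(steps))                # 3
print(count_cold_starts(steps, standby=True))  # 0
```

In this example the second "parse" step reuses a warm container, while the one at minute 30 arrives 28 minutes after the previous use and must wait for a restart.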

If you need to process data very quickly and you cannot wait the extra minute or two for containers to start again, consider using standby.

Standby allows you to start up to five containers and leave them running. Here is how to do this.

  1. On the Manage Pipelines page, click the pipeline. The Edit Pipeline page appears.
  2. Click the edit button to the right of the Details section.
  3. Enter the number of standby instances you want to allocate. Note that you can allocate up to 5 instances.
  4. When complete, click the Save button to save that section.
  5. Click Save to save the entire pipeline.

Destroy Standby Instances

To destroy standby instances, complete the following steps.

  1. On the Manage Pipelines page, click the pipeline. The Edit Pipeline page appears.
  2. Click the edit button to the right of the Details section.
  3. Set the number of standby instances to 0.
  4. When complete, click the Save button to save that section.
  5. Click Save to save the entire pipeline.