Monitor Pipeline Processing

Once pipelines are set up and enabled, they can process files when triggered. There are several ways that you can monitor pipeline processing:

  • View the TDP Pipeline Processing Page
  • View the TDP Dashboard
  • Use the Workflow Search API
  • Email Notifications

If you'd like some basic information on pipelines before you proceed, see the Pipeline Overview page.

Use the TDP Pipeline Processing Page to Monitor the Pipeline

The Pipeline Processing page lets you monitor pipeline processing, files, workflows, and logs. You can also view and download files from this page.

Viewing Pipeline Processing Details and Files

  1. In the TDP, click the main menu and select Pipelines, then Pipeline Processing.

Main Menu icon

  2. Pipelines are listed on the left side of the page. To find the pipeline you want, do one of the following:
    • To search for a specific pipeline, type the information in the search box in the upper left corner. Click the drop-down menu to toggle between searching by name or by protocol.
    • Browse the pipelines on the left side of the page and click on the one you want to see. Use the scrollbar if needed.

Select a Pipeline to View Processing Details

  3. To find a specific file, files with certain date stamps, or files of a certain status (All, Pending, In Progress, Completed, Failed), use the filters at the top of the page. You can also browse the list, using the scrollbar as necessary.

Pipeline Processing Filters

  4. Select files by clicking them one at a time. To select all of the files, click the check box next to the File field label.

File Field Label

For each file, the following information is shown:

  • File Name
  • Output Files, if any are generated
  • File Status (Pending, In Progress, Completed, or Failed)
  • Timestamp of the File Status
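
The file list and its filters can be pictured as a small data model. The following is a minimal sketch only: the `FileRow` class, the `filter_rows` function, and the sample file names are hypothetical and are not part of the TDP. It shows how filtering by status and by date range narrows the list, with no filter meaning "All".

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class FileStatus(Enum):
    PENDING = "Pending"
    IN_PROGRESS = "In Progress"
    COMPLETED = "Completed"
    FAILED = "Failed"


@dataclass
class FileRow:
    """One row of the file list (hypothetical shape, not a TDP type)."""
    file_name: str
    status: FileStatus
    status_timestamp: datetime
    output_files: list[str] = field(default_factory=list)


def filter_rows(rows, status=None, start=None, end=None):
    """Return rows matching the status and date range; None means no filter."""
    result = []
    for row in rows:
        if status is not None and row.status is not status:
            continue
        if start is not None and row.status_timestamp < start:
            continue
        if end is not None and row.status_timestamp > end:
            continue
        result.append(row)
    return result


# Made-up sample data for illustration.
rows = [
    FileRow("run-001.csv", FileStatus.COMPLETED, datetime(2023, 1, 5), ["run-001.json"]),
    FileRow("run-002.csv", FileStatus.FAILED, datetime(2023, 1, 6)),
    FileRow("run-003.csv", FileStatus.PENDING, datetime(2023, 1, 7)),
]

failed = filter_rows(rows, status=FileStatus.FAILED)
print([r.file_name for r in failed])  # ['run-002.csv']
```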

The right side of the window has two panels: Pipeline and Workflow.

  • In the Pipeline Panel, you can click a link that will take you to a page where you can edit your pipeline, scan for unprocessed files, and view high-level trigger, protocol, and step details.
  • In the Workflow Panel, you can click links to open workflows, files, and a file workflow history. You can also use it to reprocess files, view workflow logs, manage files, and more.

Pipeline Processing Pipeline and Workflow Panels

Using the Pipeline Panel to Access Pipeline Editor, Scan Files, and View Details

The following topics explain how to open the Edit Pipeline page, scan for unprocessed files, select unprocessed files for processing, and view high-level trigger, protocol, and step details.

Pipeline Processing Pipeline Panel

Editing a Pipeline

You can go to the page that will allow you to edit the pipeline by clicking the Edit Pipeline link in the Pipeline Panel. Detailed documentation on how to edit pipelines appears in this topic.

Scanning Unprocessed Files

To scan for unprocessed files in the Pipeline Panel, click the Scan for Unprocessed Files link. The date of the last scan is updated.

Processing Unprocessed Files

To process files in the Pipeline Panel, complete the following steps.

  1. Click Select Unprocessed Files. The files appear in a pop-up window.

Selecting Files for Processing

  2. Click the checkboxes to the left of the files you want to process, then click Process Selected Files. Or, click Process All Unprocessed Files to process everything that has not yet been processed.

Viewing Trigger, Protocol, and Step Details

In the Pipeline Panel, view the Trigger, Protocol, and Step details.

  • The Trigger shows the criteria that must be met for the pipeline to process a file.
  • The Protocol shows the following:
    • The namespace is a combination of which organization the file belongs to and who can use the file. Namespaces are discussed in detail in this topic.
    • The slug is the unique name of the protocol. Slugs are discussed in detail in this topic.
    • The version shows the version number of the pipeline used.
  • The Steps shows the name of each step that is part of the pipeline workflow.
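
The three protocol fields together identify which protocol a pipeline runs. The following is a minimal sketch of that idea. The `ProtocolRef` class, the `label` format, and the example values are all hypothetical and chosen only for illustration; they are not TDP identifiers or a TDP naming convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolRef:
    """Hypothetical container for the three fields shown in the Pipeline Panel."""
    namespace: str  # organization and who can use it
    slug: str       # unique name of the protocol
    version: str    # version number

    def label(self) -> str:
        # Display format invented for this sketch.
        return f"{self.namespace}/{self.slug}:{self.version}"


ref = ProtocolRef("example-namespace", "example-protocol", "v1.0.0")
print(ref.label())  # example-namespace/example-protocol:v1.0.0
```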

Using the Workflow Panel to View Workflows, Workflow Histories, Logs, File Properties, and to Manage Files

The Workflow panel provides details on files and workflows. It also allows you to view and download files, add attributes (metadata, labels, and tags) to them, and reprocess them. This panel becomes visible when you select one or more files from the list in the center of the page.

Pipeline Processing Workflow Panel

Viewing a Workflow

To view a workflow, complete the following steps.

  1. To open a workflow, click the workflow link.
  2. The workflow shows the pipeline, date completed, protocol, duration, workflow ID, status, other related workflows, the input file, output file (if any), and log information.

Specific information about workflows appears in this topic.

Viewing the Workflow History

To view the file workflow history, complete the following steps.

  1. View the Workflow History information near the bottom of the Workflow panel. The timestamp of when the workflow completed appears.
  2. Click the link that appears after the timestamp.
  3. The window that opens shows the same information as the workflow window, which is described in detail in this topic.

Viewing Logs

To view logs, click a file on the page, then click the View Logs link in the Workflow panel. The Workflow Logs popup window appears.

Pipeline Processing Workflow Logs

Viewing File Properties

To view file properties, select a file, then look at the File Properties in the Workflow panel. The panel shows the source type and name of the input file, as well as the same information for the output file, if there is one.

Viewing Files

To view a file, complete the following steps.

  1. Select a file, then in the Workflow panel, select the Open File link for the file you want to see.

Pipeline Processing Workflow Panel (Open File)

  2. The File Details page appears. The File Name, File ID, Date Created, and Integration Type appear, along with any attributes.

Once you open a file, you can upload a new file version, add attributes (metadata, labels, and tags), or perform operations such as preview, download, or delete the file.

Downloading a File

To download the file, click the Download link.

View File Details

You can also view more details by clicking the View More Details (JSON) link.

Uploading New File Versions

To upload a new file version, complete the following steps.

  1. Select a file, then in the Workflow panel, select the Open File link for the file you want to see.
  2. Click "Upload New Version".
  3. If desired, add a label. Labels are explained in detail in this topic. To add a label, click the Add New Label link, then add the Label Name and Label Value. For more information on accepted values, see this topic.
  4. If you want to add metadata or tags, click the Advanced Fields section.
  5. Add the metadata and/or tags.
    • To add metadata, click the Add Metadata link, then add the Metadata Field and Metadata Value. For the characters allowed, see this topic.
    • To add tags, click the Add Tag checkbox and enter the tag. For the characters allowed, see this topic.
  6. Either click the file box and select a file using your computer's file browser or drag a file to upload.
  7. When complete, click the Upload button.

Add Attributes (Metadata, Labels, Tags)

To add attributes (Metadata, Labels, or Tags) to a file, complete the following steps.

  1. Select a file, then in the Workflow panel, select the Open File link for the file you want to see.
  2. Click the Add Attributes link.
    • To add a label, click the Add New Label link, then add the Label Name and Label Value. For more information on accepted values, see this topic.
    • To add metadata, click the Add Metadata link, then add the Metadata Field and Metadata Value. For the characters allowed, see this topic.
    • To add tags, click the Add Tag checkbox and enter the tag. For the characters allowed, see this topic.
  3. Click Apply.
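
The three attribute types differ in shape: a label is a name and value pair, metadata is a field and value pair, and a tag is a single value. The following is a minimal sketch of that distinction. The `FileAttributes` class is hypothetical, and its choices (labels kept as a list of pairs, one value per metadata field, duplicate tags ignored) are assumptions made for illustration. It does not enforce the accepted characters, which are described in the linked topics.

```python
from dataclasses import dataclass, field


@dataclass
class FileAttributes:
    """Hypothetical container for the attributes added in the steps above."""
    labels: list[tuple[str, str]] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)
    tags: list[str] = field(default_factory=list)

    def add_label(self, name: str, value: str) -> None:
        self.labels.append((name, value))

    def add_metadata(self, field_name: str, value: str) -> None:
        # Assumption: a metadata field holds one value; a new value replaces it.
        self.metadata[field_name] = value

    def add_tag(self, tag: str) -> None:
        # Assumption: repeated tags are stored once.
        if tag not in self.tags:
            self.tags.append(tag)


attrs = FileAttributes()
attrs.add_label("instrument", "hplc-01")
attrs.add_metadata("operator", "jdoe")
attrs.add_tag("calibration")
attrs.add_tag("calibration")
print(attrs.labels, attrs.metadata, attrs.tags)
```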

Deleting a File

To delete the file, click the Delete link.

Reprocessing Files

To reprocess a file, select one or more files, then in the Workflow panel select the Reprocess link.

Note that retrying a file and reprocessing a file are two very different operations. For more information on both, see this topic.

Retrying and Reprocessing Files

Retry and Reprocess are two related, but different terms. By default, if there is an error or issue that prevents the processing of a file, the pipeline retries processing three more times before the pipeline fails. This is done automatically, but you can do it manually as well in the TDP Dashboard. Reprocess is starting the process again; it is as if you've re-uploaded the file.

Here is a table that compares both features and explains what happens if you edit the pipeline.

| Scenario | Retry | Reprocess |
| --- | --- | --- |
| If you configure the pipeline to use another protocol. | Still uses the old protocol. | Will use the new protocol. |
| If you force overwrite the protocol (and the protocol version stays the same). | Will use the force-written protocol. | Will use the force-written protocol. |
| If the protocol refers to a wildcard version of a task-script (for example, 1.x) and you upload a bumped minor version of the task-script. | Will use the new task-script version. | Will use the new task-script version. |
| If you force overwrite a task-script (and the script version stays the same). | Will use the force-written script. | Will use the force-written script. |
| If you update the pipeline config value directly in the pipeline design page. | Still uses the old pipeline config value. | Will use the new pipeline config value. |
| If you update a non-secret pipeline config value on the Shared Settings page. | Still uses the old pipeline config. | Will use the new pipeline config. |
| If you update a secret pipeline config value on the Shared Settings page. | Will use the new secret value in most cases (see note below). | Will use the new secret value. |

📘

NOTE:

If you update a secret pipeline config value on the Shared Settings page, the secret is resolved when a new task instance is created, and only the latest secret can be retrieved from SSM Parameter Store. If the workflow still uses an existing task instance, it uses the old secret. If the workflow creates a new task instance, it uses the new secret. This is because the secret is stored only in SSM Parameter Store or in the task instance; the secret value is not copied anywhere else. When a workflow is retried, it therefore has only a reference to the secret, not the actual value.
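
One way to picture the difference is that a workflow keeps the pipeline settings it was created with, while reprocessing starts a fresh workflow from the current settings. The following is a toy model of the protocol and pipeline config rows of the table only. It is not how the TDP is implemented: `Settings`, `Workflow`, `retry`, and `reprocess` are hypothetical names, and force-overwritten artifacts and secrets are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    protocol: str
    config_value: str


@dataclass
class Workflow:
    file_name: str
    settings: Settings  # snapshot taken when the workflow was created
    attempts: int = 1


def retry(workflow: Workflow) -> Workflow:
    """Run the same workflow again with the settings it was created with."""
    workflow.attempts += 1
    return workflow


def reprocess(workflow: Workflow, current: Settings) -> Workflow:
    """Start over as if the file were re-uploaded: new workflow, current settings."""
    return Workflow(workflow.file_name, current)


old = Settings(protocol="protocol-a", config_value="old")
new = Settings(protocol="protocol-b", config_value="new")  # pipeline edited later

failed = Workflow("run-002.csv", old)
retried = retry(failed)
reprocessed = reprocess(failed, new)

print(retried.settings.protocol)      # protocol-a
print(reprocessed.settings.protocol)  # protocol-b
```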

Use the TDP Dashboard to Monitor the Pipeline

Pipelines run at set intervals and new files appear automatically as a new row on the Dashboard page. Most output files will also be saved automatically to the data lake.

View Basic Information on Pipeline Progress

To view basic information on pipeline progress, complete the following steps.

  1. In the TDP, select Dashboard from the main menu. The Dashboard appears.

My Pipelines Page

  2. The Dashboard shows the pipelines that are currently running as well as those that have completed successfully or that have failed. Each entry displays the pipeline name and description, a button to view details, one or more icons showing the steps of the workflow, the status (pending, in-progress, completed, or failed), and the date.

  3. To filter by status (completed or failed), click the filters at the top of the page.

Filters

Refreshing the Dashboard

If you do not see your recent files on the Dashboard, you can either wait a few minutes or force a manual refresh via the Refresh button.

View More Information on Pipeline Status

There are two ways to view more details about the processing:

  • View pipeline status and an error log.
  • View workflow status, details, and an error log.

View Pipeline Status and an Error Log

  1. Click the down arrow next to the name of the pipeline to view a summary of details about the trigger, conditions, the name of the pipeline, and the log that shows processing detail.
  2. The date and time that processing started, as well as the duration of the processing, appear on the page.

Summary of Processing Details

View Workflow Status, Details, and an Error Log

  1. Click the View Details button in the pipeline processing entry to view the workflow.

Detailed Information and Errors

  2. The following table describes the fields shown.

| Field | Description |
| --- | --- |
| Pipeline | Pipeline name. |
| Workflow ID | Unique identifier of the workflow. |
| Input File | Name of the file that is submitted for processing. |
| Output File(s) | Name(s) of the files that are created as a result of processing. |
| Date Completed | Date the processing was completed. |
| Duration | How long it took for the processing to complete. |
| Status | Status of the processing: pending, in-progress, completed, failed. |
| Other Workflows | The name of other pipelines that call the same input file. |

  3. The log file appears in the bottom half of the screen. By default, the log is displayed at the “info” level, which provides general information about pipeline processing.
    a. If you want to troubleshoot a problem, change the log level to debug by moving the Display debug output slider to the right. More details about processing appear.
    b. To view task logs, click the View Task Logs in CloudWatch link. CloudWatch is a metrics repository in AWS that contains log data.
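
The difference between the "info" and "debug" levels can be illustrated with Python's standard `logging` module. This is only an analogy for what the Display debug output slider does. The logger name, the messages, and the `run_step` function are invented for the sketch and do not reflect actual TDP log output.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted log lines in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


logger = logging.getLogger("pipeline-demo")
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)


def run_step(display_debug_output: bool):
    """Log the same events and return what is visible at the chosen level."""
    handler.messages.clear()
    logger.setLevel(logging.DEBUG if display_debug_output else logging.INFO)
    logger.info("step started")
    logger.debug("read 128 rows from input")
    logger.info("step finished")
    return list(handler.messages)


print(run_step(False))  # info level: two lines
print(run_step(True))   # debug level: three lines
```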

Retry Processing

By default, if there is an error or issue that prevents the processing of a file, the pipeline tries three more times before the pipeline fails.

If the failure is due to an out of memory error, more memory is allocated with each retry:

  • Initial Amount of Memory Allocated: 512 MB
  • Memory allocated for the 1st Retry: 1 GB
  • Memory allocated for the 2nd Retry: 2 GB
  • Memory allocated for the 3rd Retry: 3 GB

If the out of memory error persists after the third retry, the pipeline fails.
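
The escalation above can be sketched as a small simulation. This is an illustration of the schedule only, assuming 1 GB = 1024 MB. The `run_with_retries` function and the idea of a task with a fixed memory requirement are hypothetical and are not TDP code.

```python
# Memory schedule from the list above, in MB (assuming 1 GB = 1024 MB).
MEMORY_SCHEDULE_MB = [512, 1024, 2048, 3072]


def run_with_retries(required_mb: int):
    """Simulate a task that needs `required_mb` of memory.

    Returns (succeeded, attempts_used, allocated_mb_on_last_attempt).
    """
    allocated = 0
    for attempt, allocated in enumerate(MEMORY_SCHEDULE_MB, start=1):
        if allocated >= required_mb:
            return True, attempt, allocated
        # Out of memory: move on to the next, larger allocation.
    return False, len(MEMORY_SCHEDULE_MB), allocated


print(run_with_retries(1500))  # (True, 3, 2048): succeeds on the 2nd retry
print(run_with_retries(4000))  # (False, 4, 3072): fails after the 3rd retry
```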

If a file does not go through the pipeline successfully, you can also click the Retry button that appears next to the workflow steps. The Retry button allows you to manually re-trigger the pipeline for that file.

Retry Processing

Use the API to Monitor Pipeline Processing

You can use the API to monitor pipeline processing. See Search Workflows API for more details.
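
Once workflow records have been retrieved, a short script can summarize them. The following is a minimal sketch that works on already-fetched data. The record shape and the field names (`workflowId`, `pipeline`, `status`) are assumptions for illustration, and the actual request and response formats are defined in the Search Workflows API documentation. The status values are the ones listed earlier on this page.

```python
from collections import Counter

# Hypothetical records; the real API response shape may differ.
workflows = [
    {"workflowId": "wf-1", "pipeline": "csv-to-json", "status": "completed"},
    {"workflowId": "wf-2", "pipeline": "csv-to-json", "status": "failed"},
    {"workflowId": "wf-3", "pipeline": "csv-to-json", "status": "in-progress"},
    {"workflowId": "wf-4", "pipeline": "csv-to-json", "status": "failed"},
]


def summarize(records):
    """Count workflows per status and list the IDs of the failed ones."""
    counts = Counter(r["status"] for r in records)
    failed_ids = [r["workflowId"] for r in records if r["status"] == "failed"]
    return dict(counts), failed_ids


counts, failed_ids = summarize(workflows)
print(counts)
print(failed_ids)  # ['wf-2', 'wf-4']
```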

Use the Notifications Feature to Get Emails about the Outcome of Pipeline Processing

You can have emails sent that indicate whether a pipeline is successful or has failed.
See Set Notifications for more details.

