Create Custom Protocols with Python

Protocols define the business logic of your Tetra Data Pipeline. If your custom logic fits in a Python script of fewer than 12,000 characters, you can configure your own protocol directly in the TDP user interface by using a custom Python script and the python-exec protocol.

NOTE

For more complex custom pipeline logic (scripts of 12,000 characters or more), we recommend that you create and manage your own self-service Tetra Data Pipelines (SSPs) instead.
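As a quick local pre-check before pasting a script into the TDP, you can count its characters. The following is a minimal sketch, assuming the limit applies to the raw character count of the script text. The helper name fits_python_exec and the constant are hypothetical and are not part of the TDP.

```python
# Hypothetical pre-check; the 12,000 figure comes from the guidance above.
PYTHON_EXEC_CHAR_LIMIT = 12_000


def fits_python_exec(script_text: str, limit: int = PYTHON_EXEC_CHAR_LIMIT) -> bool:
    """Return True when the script has fewer than `limit` characters."""
    return len(script_text) < limit


if __name__ == "__main__":
    sample_script = "print('hello')\n"
    # Report the character count and whether it is under the limit
    print(len(sample_script), fits_python_exec(sample_script))
```

To check a real script, read the file contents into a string first and pass that string to the helper.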

Example python-exec Use Cases

To view example Python scripts for common pipeline use cases, see the following resources in the TetraConnect Hub:

For access, see Access the TetraConnect Hub.

Create Custom Pipeline Logic by Using python-exec

To create custom pipeline logic by using the python-exec protocol, do the following.

Step 1: Create a Python Script

Create a Python script that contains your custom pipeline's business logic.

When you create the script, keep in mind the following:

  • The script can call all the functions in the Context API through the global context variable.
  • The script also receives a reference to the pipeline's input file at input['input_file']. For example, the following script prints the name of the file that triggered the pipeline:
result = context.get_file_name(input['input_file'])
print(result)
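Because context and input exist only inside a running pipeline, it can help to try a script locally against stand-ins first. The following is a rough sketch, not the TDP's actual implementation: FakeContext and run_script_locally are hypothetical names, and the stub assumes that get_file_name returns the last segment of the file pointer's fileKey, which might differ from the real Context API behavior.

```python
import io
from contextlib import redirect_stdout


class FakeContext:
    """Hypothetical stand-in for the TDP context object (local testing only)."""

    def get_file_name(self, file_pointer: dict) -> str:
        # Assumption: the file name is the last segment of the pointer's fileKey
        return file_pointer["fileKey"].rsplit("/", 1)[-1]


def run_script_locally(script_text: str, file_pointer: dict) -> str:
    """Run script text with stubbed `context` and `input` globals; return stdout."""
    script_globals = {
        "context": FakeContext(),
        "input": {"input_file": file_pointer},
    }
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        exec(script_text, script_globals)
    return buffer.getvalue()


SCRIPT = "result = context.get_file_name(input['input_file'])\nprint(result)\n"

# Hypothetical file pointer; real pointers are supplied by the pipeline
output = run_script_locally(SCRIPT, {"fileKey": "demo-org/raw/EQ123/run1/data.json"})
print(output, end="")
```

A stub like this only checks the script's own logic. The real Context API functions still need to be exercised in an actual pipeline run.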

Available Python Libraries

To view a list of the Python libraries that the latest python-exec protocol version supports, see the artifact's ReadMe tab on the Protocol Details page in the TDP. For instructions, see View Protocols and Their Details.

Python Script Example

The following example Python script programmatically reads a file's contents and then adds labels to the file based on the equipment number found in the file path:

# Pipeline input always consists of _input_ object (dict with 1 key: `input_file`)
# and context API object (see developers.tetrascience.com)

"""Set up"""
# You can use built-in python libraries and a short list of additional packages
from pathlib import Path
import simplejson as json

# Interact with TDP logging using the context.get_logger() function
logger = context.get_logger()

# Simple log message
logger.log(f"input object keys: {list(input.keys())}")

# Get the input file pointer
ids_file_pointer = input['input_file']

# Read the IDS file contents
ids_file_bytes = context.read_file(ids_file_pointer, form="body")["body"]
ids_file = json.loads(ids_file_bytes)

# Instantiate the list of labels to be added
labels = []

"""Begin retrieving label values"""
# Get the IDS file path, which contains the raw file path
ids_file_path = Path(ids_file_pointer['fileKey'])

# Get equipment number from file path
eq_path = ids_file_path.parents[1]
eq_num = eq_path.name.upper()
labels.append(
    {
        "name": "eq_number",
        "value": eq_num
    }
)

# Detailed log message
logger.log(
    {
        "message": f"eq_num: {eq_num}",
        "level": "info"
    }
)

"""Add labels to IDS file"""
context.add_labels(ids_file_pointer, labels)
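The equipment-number lookup in the script above depends on how pathlib indexes parent folders. The following standalone sketch shows that step in isolation. The file key is a made-up example, and the assumption that the equipment folder sits two levels above the file is illustrative only; adjust the index to match your own path layout.

```python
from pathlib import PurePosixPath

# Hypothetical file key; real keys depend on your organization and source layout
example_file_key = "demo-org/raw/instrument-a/eq123/run-01/result.json"

file_path = PurePosixPath(example_file_key)

# parents[0] is the folder that contains the file; parents[1] is one level above it
eq_path = file_path.parents[1]
eq_num = eq_path.name.upper()

# Same label structure that the example script passes to context.add_labels
labels = [{"name": "eq_number", "value": eq_num}]
print(labels)
```

With this example key, parents[1] resolves to the eq123 folder, so the label value is EQ123. If the file sits deeper or shallower in the hierarchy, the same index picks up a different folder name.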

Step 2: Create a python-exec Pipeline

To create a Tetra Data Pipeline that runs your custom Python script, follow the instructions in Set Up a Pipeline.

For Step 4: Select a Protocol in the pipeline configuration procedure, do the following:

  1. On the Pipeline Edit page, choose Select Protocol.
  2. In the Search field, enter Python Exec. Then, select the Python Exec (common/python-exec) protocol. A Configuration pane appears.
  3. In the Python Script (required) section, choose Edit. A String Configuration: Python Script dialog appears.
  4. Select the Python tab. Then, enter your custom Python script.
  5. Choose Update.
  6. (Optional) Configure email notifications about successful and failed pipeline executions.
  7. Choose Save Changes.