Create Custom Protocols with Python
Protocols define the business logic of your Tetra Data Pipeline. If your custom logic isn't complex (a Python script with fewer than 12,000 characters), you can configure your own protocol directly in the TDP user interface by using a custom Python script and the python-exec protocol.
NOTE
For custom pipeline setups that are more complex (12,000 characters or more), it's recommended that you create and manage your own self-service Tetra Data Pipelines (SSPs).
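Because the cutoff is stated in characters, a quick local check can show which route applies before you paste a script into the TDP. The following is a minimal sketch: the helper name `fits_python_exec` and the constant `MAX_SCRIPT_CHARS` are hypothetical, and it assumes that the limit counts characters the same way Python's `len()` does, which this page doesn't specify.

```python
# Hypothetical helper; not part of the TDP or the Context API.
MAX_SCRIPT_CHARS = 12_000  # threshold stated in the text above


def fits_python_exec(script_text: str, limit: int = MAX_SCRIPT_CHARS) -> bool:
    """Return True if the script has fewer than `limit` characters."""
    return len(script_text) < limit


# Example: a short script is well under the threshold.
print(fits_python_exec("print('hello')"))
```

If the check returns False, the note above suggests that a self-service pipeline is the better fit.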
Example python-exec Use Cases
To view example Python scripts for common pipeline use cases, see the following resources in the TetraConnect Hub:
- Python-exec: Decorating raw and IDS files with context from IDS
- Python-exec: Decorating files using Excel as a data source
- Python-exec: adding labels with Context API
For access, see Access the TetraConnect Hub.
Create Custom Pipeline Logic by Using python-exec
To create custom pipeline logic by using the python-exec protocol, do the following.
Step 1: Create a Python Script
Create a Python script that contains your custom pipeline's business logic.
When you create the script, keep in mind the following:
- The script can use all the functions in the Context API through the global context variable.
- The script also receives a reference to the pipeline's input file at input['input_file'].
For example, the following script prints the name of the file that triggered the pipeline:
result = context.get_file_name(input['input_file'])
print(result)
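To try the same logic outside the TDP, one option is to wrap it in a function and pass in stand-ins for the two globals. The sketch below is only a local dry-run aid: `StubContext`, `script_body`, and the sample pointer are hypothetical, and the stub's `get_file_name` behavior (returning the last segment of `fileKey`) is an assumption. The real Context API method may return something different.

```python
# Local stand-in for dry runs only. In python-exec, `context` and `input`
# are provided by the platform; nothing here is the real Context API.
class StubContext:
    def get_file_name(self, file_pointer: dict) -> str:
        # Assumption: treat the last segment of `fileKey` as the file name.
        return file_pointer["fileKey"].rsplit("/", 1)[-1]


def script_body(context, input):
    # Same logic as the two-line script above, wrapped so it can be called.
    result = context.get_file_name(input["input_file"])
    print(result)
    return result


# Hypothetical pointer; real pointers may carry more fields than `fileKey`.
sample_input = {"input_file": {"fileKey": "demo-org/raw/eq-0042/run-01/result.json"}}
script_body(StubContext(), sample_input)
```

When you move the logic into the TDP, paste only the body of the script, because the platform supplies `context` and `input` itself.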
Available Python Libraries
To view a list of the Python libraries that the latest python-exec protocol version supports, see the artifact's ReadMe tab on the Protocol Details page in the TDP. For instructions, see View Protocols and Their Details.
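While drafting a script, it can also help to confirm that the modules it imports resolve in the environment where you test it. The following sketch uses the standard library's `importlib.util.find_spec`; the helper name `missing_modules` is hypothetical. It only reports on the interpreter that runs it, so the ReadMe tab remains the reference for what python-exec itself provides.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


# Modules used by the example script on this page.
needed = ["json", "pathlib", "simplejson"]
print(missing_modules(needed))
```

Pass top-level module names only; dotted names make `find_spec` import the parent package first.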
Python Script Example
The following example Python script programmatically reads a file's contents and then adds labels to the file based on the equipment number found in the file path:
# Pipeline input always consists of _input_ object (dict with 1 key: `input_file`)
# and context API object (see developers.tetrascience.com)
"""Set up"""
# You can use built-in python libraries and a short list of additional packages
from pathlib import Path
import simplejson as json
# Interact with TDP logging using the context.get_logger() function
logger = context.get_logger()
# Simple log message
logger.log(f"input object keys: {list(input.keys())}")
# Get the input file pointer
ids_file_pointer = input['input_file']
# Read the IDS file contents
ids_file_bytes = context.read_file(ids_file_pointer, form="body")["body"]
ids_file = json.loads(ids_file_bytes)
# Instantiate the list of labels to be added
labels = []
"""Begin retrieving label values"""
# Get the IDS file path, which contains the raw file path
ids_file_path = Path(ids_file_pointer['fileKey'])
# Get equipment number from file path
eq_path = ids_file_path.parents[1]
eq_num = eq_path.name.upper()
labels.append(
    {
        "name": "eq_number",
        "value": eq_num
    }
)
# Detailed log message
logger.log(
    {
        "message": f"eq_num: {eq_num}",
        "level": "info"
    }
)
"""Add labels to IDS file"""
context.add_labels(ids_file_pointer, labels)
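The path handling in the script is easy to check on its own. The following standalone sketch shows what `parents[1]` selects, using a made-up file key: the key layout and the helper name `equipment_number` are assumptions for illustration, and `PurePosixPath` is used so that the result doesn't depend on the local operating system.

```python
from pathlib import PurePosixPath


def equipment_number(file_key: str) -> str:
    """Return the upper-cased name of the directory two levels above the file."""
    # parents[0] is the file's own directory; parents[1] is that directory's parent.
    return PurePosixPath(file_key).parents[1].name.upper()


# Hypothetical key; real file keys in your organization may be laid out differently.
sample_key = "demo-org/raw/eq-0042/run-01/result.json"
label = {"name": "eq_number", "value": equipment_number(sample_key)}
print(label)  # {'name': 'eq_number', 'value': 'EQ-0042'}
```

If your equipment folder sits at a different depth in the path, adjust the `parents` index to match.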
Step 2: Create a python-exec Pipeline
To create a Tetra Data Pipeline that runs your custom Python script, follow the instructions in Set Up a Pipeline.
For Step 4: Select a Protocol in the pipeline configuration procedure, do the following:
- On the Pipeline Edit page, choose Select Protocol.
- In the Search field, enter Python Exec. Then, select the Python Exec (common/python-exec) protocol. A Configuration pane appears.
- In the Python Script (required) section, choose Edit. A String Configuration: Python Script dialog appears.
- Select the Python tab. Then, enter your custom Python script.
- Choose Update.
- (Optional) Configure email notifications about successful and failed pipeline executions.
- Choose Save Changes.