Manage and Apply Attributes

You can use labels, metadata, and tags (attributes) to annotate files and trigger pipelines, without changing the actual data in your files.

You can add, edit, and delete attributes in multiple places in the TDP, such as when you do the following:

  • View search results
  • Add files from a Tetra Agent (you can add metadata, but not labels)
  • View pipeline processing details for a file
  • Add or edit a connector
  • Edit labels in bulk

For more information about the different types of attributes, see Attributes. For a list of recommended attributes to use when first onboarding data to the Tetra Data Platform (TDP), see Recommended Labels.

🚧

IMPORTANT

You can’t use a new attribute as a pipeline trigger until you add a file to the TDP that includes the attribute. For instructions on how to manually upload files, see Upload a New File or New Version of the File.

How to Manage Attributes

You can view, add, edit, and delete attributes on the Attribute Management page. Custom labels, metadata, and tags are available to all users. However, you can set more restrictive permissions for data integrity based on your organization's use case. For example, you can allow only admins to create or edit attributes.

📘

NOTE

To manage labels for more than one file at a time, see the Edit Labels in Bulk section of this topic.

View Attributes

To view your labels, metadata, and tags, do the following:

  1. Sign in to the TDP. Then, in the left navigation pane, choose Attribute Management. The Attribute Management page appears.
  2. Select the Labels, Metadata, or Tags tab to see a list of the available attributes of that type.

Create a New Label

To create a new label, do the following:

  1. On the Attribute Management page, select the Labels tab.
  2. Select the upper right Add Label Name button. The Create Label Name dialog appears.
  3. For NAME, enter a name for the new label.
  4. (Optional) For DESCRIPTION, enter a label description.
  5. Choose Save.

📘

Label Values

Label values must be fewer than 128 characters and can include only letters, numbers, spaces, and the following symbols: +, -, ., and _.
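As a rough illustration, the rule above could be checked locally before you enter a value in the TDP. This is a minimal Python sketch and not part of the TDP. The function name `is_valid_label_value` is hypothetical, and the sketch assumes that "letters" means ASCII letters and that an empty value isn't allowed.

```python
import re

# Assumption: "letters" means ASCII letters, and empty values are rejected.
_LABEL_VALUE_CHARS = re.compile(r"[A-Za-z0-9 +\-._]+")
_MAX_LENGTH_EXCLUSIVE = 128  # values must be fewer than 128 characters


def is_valid_label_value(value: str) -> bool:
    """Return True if value satisfies the documented label value rule."""
    if len(value) >= _MAX_LENGTH_EXCLUSIVE:
        return False
    # fullmatch requires every character to be in the allowed set.
    return _LABEL_VALUE_CHARS.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["HPLC run_01", "batch+2.5", "a/b", "x" * 128]:
        print(repr(candidate[:20]), is_valid_label_value(candidate))
```

A local check like this only mirrors the rule as written in this topic. The TDP remains the authority on which values it accepts.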

Add a New Metadata Field

There are two types of metadata used within the TDP:

  • Default metadata is collected automatically and is immutable within the TDP. Examples include File ID, Created Date, and Source Type. These fields describe the file as it exists in the TDP. As a result, the Created Date represents when the file was first uploaded to the Data Lake, not when the file was created by its original source.
  • Custom metadata is user-defined and editable. You can use custom metadata to enhance the default metadata by adding organization-specific information.

You can apply custom metadata to files automatically and manually. When setting up any new data source (for example, IoT or DataHub), you can specify a custom metadata field and value to attach to all incoming files from that source. After you complete this setup, the custom metadata is applied automatically to incoming files until you edit the source and remove the custom metadata.

Custom metadata is available to all users. However, you can set it to a more restrictive level for data integrity based on your use case (for example, only accessible to privileged users or admins).

To add a new metadata field, do the following:

  1. On the Attribute Management page, select the Metadata tab.
  2. Select the upper right Add Metadata button. The Create New Metadata dialog appears.
  3. Enter a name for the new metadata.
  4. Choose Save.

📘

Metadata Values

Metadata field names must match the following regular expression: ^[0-9a-zA-Z-_+ ]+$ (letters, numbers, hyphens, underscores, plus signs, and spaces).
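The pattern in the note above can be tried out locally. The following Python sketch is an illustration only and not a TDP API. The function name `is_valid_metadata_name` is hypothetical, and the sketch assumes that the TDP treats the pattern as a whole-string match.

```python
import re

# Pattern copied verbatim from the note above.
METADATA_NAME_PATTERN = re.compile(r"^[0-9a-zA-Z-_+ ]+$")


def is_valid_metadata_name(name: str) -> bool:
    """Return True if the whole name matches the documented pattern."""
    # fullmatch is used because "$" alone would also match just before
    # a trailing newline.
    return METADATA_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["Instrument ID-1", "site_code+2", "a.b", ""]:
        print(repr(candidate), is_valid_metadata_name(candidate))
```

Unlike label values and tag names, this pattern doesn't include a period, so a name such as `a.b` fails the check.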

Add a New Tag

You can use custom tags to organize datasets and perform batch operations. You can apply custom tags to files automatically and manually. When setting up any new data source (for example, IoT or DataHub), you can specify a custom tag to attach to all incoming files from that source. After you complete this setup, the custom tag is applied automatically to incoming files until you edit the source and remove the custom tag.

Custom tags are available to all users. However, you can set them to a more restrictive level for data integrity based on your use case (for example, only accessible to privileged users or admins).

To add a new tag, do the following:

  1. On the Attribute Management page, select the Tags tab.
  2. Select the upper right Add Tag button.
  3. Enter a name for the new tag.
  4. Choose Save.

📘

Tag Values

Tag names can consist of the following characters:

  • All alphanumeric characters
  • Spaces
  • Plus sign (+)
  • Dash (-)
  • Period (.)
  • Underscore (_)
  • Forward slash (/)
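When a tag name is rejected, it can help to see which characters caused the problem. The Python sketch below lists the characters in a name that fall outside the list above. It is an illustration only. The function name `invalid_tag_characters` is hypothetical, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
from typing import List

# Symbols allowed in addition to alphanumeric characters, per the list above.
ALLOWED_TAG_SYMBOLS = set(" +-._/")


def invalid_tag_characters(name: str) -> List[str]:
    """Return the distinct disallowed characters in name, in first-seen order."""
    found: List[str] = []
    for ch in name:
        # Assumption: "alphanumeric" is limited to ASCII letters and digits.
        allowed = (ch.isascii() and ch.isalnum()) or ch in ALLOWED_TAG_SYMBOLS
        if not allowed and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    print(invalid_tag_characters("project/alpha v1.2"))
    print(invalid_tag_characters("draft#2?"))
```

An empty result means that every character in the name is on the allowed list.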

Apply Attributes to Specific Files

To apply labels, metadata, and tags to specific files, do the following:

  1. In the left navigation menu, choose the hamburger icon. Then, choose Search Files.
  2. Search for the files that you’d like to apply attributes to. For instructions, see How to Search Files in the Data Lake.
  3. Hover over the file that you want to apply attributes to (or select the row). A list of menu icons appears on the right of the file's row.
  4. From the list of menu icons, choose More. Then, choose Add/Edit Attributes. The Edit Attributes dialog appears.
  5. Add attributes to the file by doing one of the following:

To Add a Label to a Specific File

  • In the LABELS section of the Edit Attributes dialog, select the upper right plus sign (+) icon.
  • For Label Name, either select a label from the drop-down list or choose + Add Label Name from the bottom of the list to add a new label name.
  • Choose Save.

To Add Metadata to a Specific File

  • In the Edit Attributes dialog, select Advanced Fields. The section expands to show the METADATA and TAGS fields.
  • In the METADATA section, for file name, either select metadata from the drop-down list or choose + Add Metadata from the bottom of the list to add new metadata.
  • Choose Save.

To Add Tags to a Specific File

  • In the Edit Attributes dialog, select Advanced Fields. The section expands to show the METADATA and TAGS fields.
  • In the TAGS section, select tags from the drop-down list or choose + Add Tag from the bottom of the list to add a new tag.
  • Choose Save.

Edit Labels in Bulk

You can use the Bulk Edit of Labels feature to add, remove, or update labels for more than one file at a time. This functionality can help you quickly fix a large number of files that have incorrect or incomplete labels, or enrich your data after it’s in the Tetra Scientific Data Cloud.

📘

NOTE

  • Bulk label edits don't trigger pipelines. To run a pipeline based on bulk label edits, you must run the pipeline manually after the bulk label edit job completes.
  • Before you run a bulk label edit operation on 500,000 or more files, you must contact your customer success manager (CSM) to verify the action.

To edit labels in bulk, do the following:

  1. Sign in to the TDP. To update or remove labels in bulk, you must sign in with an admin account. Any TDP user can add labels.
  2. In the left navigation menu, choose the hamburger icon. Then, choose Search Files.
  3. Search for the files that you’d like to edit. For instructions, see How to Search Files in the Data Lake.
  4. Decide whether you want to modify the labels on all of the files returned by your search or only on the files you select. To modify a subset of files, select the check box next to each file’s name.
  5. Choose the Bulk Actions button at the top of the page. A drop-down list appears.
  6. To edit labels on only the files you selected, choose Edit Labels on <#> Selected Files. To edit labels on all of the files returned by your search, choose Edit Labels on <#> Searched Files. The Bulk Edit of Labels dialog appears.
  7. In the left drop-down list (default setting is Add), choose the type of edit operation that you want to perform by selecting one of the following:
    • Add—creates a new label that’s applied to each file
    • Update—modifies labels or label values for each file
    • Remove Value—deletes a specific label value for each file
    • Remove Label—removes a label and its values from each file

📘

NOTE

You can run up to five bulk label edit operations on each search. To add another edit operation, choose the plus icon (+) in the upper right of the dialog. Then, select the type of edit operation that you want to perform from the drop-down list.

  8. Based on the option(s) you select, do one or more of the following:

To Add a New Label in Bulk

  • For Label Name, select an existing label name or enter a new one.
  • For Value, select an existing label value or enter a new one.

To Update a Label in Bulk

  • For from, select an existing label name and/or value that you want to update or enter it directly.
  • For to, select an existing label name and/or value that you want to make the label’s new name or value, or enter a new one.

To Remove a Label Value in Bulk

  • For Label Name, select an existing label name that you want to remove a value from or enter one directly.
  • For Value, select the label value that you want to remove or enter it directly.

To Remove a Label in Bulk

  • For Label Name, select the existing label name that you want to remove or enter it directly.
  9. Choose OK. A dialog appears notifying you that your bulk label edit job was created. To view the status of the job, either choose View Jobs in the dialog or follow the instructions in the Monitor a Bulk Label Edit Job's Status section of this topic.

📘

NOTE

The amount of time it takes to process a bulk label edit job depends on the number of files that you’re modifying and the size of those files. The more files you modify and the larger those files are, the longer the operation takes.
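To make the four edit operations in the procedure above easier to compare, the following Python sketch models them on in-memory data. It is a conceptual illustration only and doesn't call the TDP or reflect how the TDP implements bulk edits. The names `apply_operation` and `bulk_edit`, the tuple format for operations, and the representation of a file's labels as a mapping from label name to a set of values are all assumptions. The sketch also assumes that Update replaces one name and value pair with another, and that a label with no remaining values is dropped.

```python
from typing import Dict, List, Set, Tuple

MAX_OPERATIONS = 5  # up to five bulk label edit operations on each search

Labels = Dict[str, Set[str]]  # label name -> values (assumed shape)
Operation = Tuple[str, ...]   # for example: ("add", name, value)


def _drop_value(labels: Labels, name: str, value: str) -> None:
    """Remove one value; drop the label if no values remain (assumption)."""
    if name in labels:
        labels[name].discard(value)
        if not labels[name]:
            del labels[name]


def apply_operation(labels: Labels, op: Operation) -> None:
    """Apply one edit operation to a single file's labels, in place."""
    kind = op[0]
    if kind == "add":
        _, name, value = op
        labels.setdefault(name, set()).add(value)
    elif kind == "update":
        # Assumed reading: replace one (name, value) pair with another.
        _, old_name, old_value, new_name, new_value = op
        if old_value in labels.get(old_name, set()):
            _drop_value(labels, old_name, old_value)
            labels.setdefault(new_name, set()).add(new_value)
    elif kind == "remove_value":
        _, name, value = op
        _drop_value(labels, name, value)
    elif kind == "remove_label":
        labels.pop(op[1], None)
    else:
        raise ValueError(f"unknown operation: {kind}")


def bulk_edit(files: Dict[str, Labels], ops: List[Operation]) -> None:
    """Apply every operation, in order, to every file."""
    if len(ops) > MAX_OPERATIONS:
        raise ValueError("at most five operations per search")
    for labels in files.values():
        for op in ops:
            apply_operation(labels, op)


if __name__ == "__main__":
    files = {"run1.csv": {"project": {"alpha"}}, "run2.csv": {}}
    bulk_edit(files, [
        ("add", "status", "raw"),
        ("update", "project", "alpha", "project", "beta"),
    ])
    print(files)
```

In this model, Remove Value leaves a label's other values in place, while Remove Label deletes the label name together with all of its values.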

Monitor a Bulk Label Edit Job's Status

To monitor a bulk label edit job's status, do the following:

  1. In the left navigation pane, choose the hamburger icon. Then, choose Health Monitoring. The Health Monitoring page appears.
  2. Choose the Files tab. Then, choose the Files Reprocessing button.
  3. Select the Bulk Label tab. A list of all your active and inactive bulk label edit jobs appears, which includes the following information:
    • STATE—shows the job’s status (In progress, Success, or Failed)
    • JOB NAME—shows the job’s name
    • COMPLETION %—shows how much of the job has been processed
    • STARTED—shows the date and time the job started processing
    • LAST UPDATED—shows the date and time the job was last updated
    • JOB ID—shows the job ID
    • DETAILS—opens a dialog that shows the job’s label updates and a link that displays all of the files affected by the operation (View bulk label job details).