Empower Tetra Data Migration Script User Guide

The Empower Tetra Data Migration Script programmatically converts Empower data from an older Tetra Data Schema version to a newer schema version without re-uploading the original raw files to the Tetra Data Platform (TDP). Converting existing Empower data to the latest Tetra Data Schema version keeps the data searchable in the TDP, even for Empower projects that are no longer available for re-upload, such as Empower Historical Data.

The Empower Tetra Data Migration Script provides the following benefits:

  • The new IDS JSON files replace the existing IDS JSON files on the TDP as if they were produced from RAW JSON files uploaded by the latest Agent. If the Empower Agent uploads the same injection again, the new RAW JSON file overwrites the existing RAW JSON file so that the RAW-to-IDS pipeline runs as normal.
  • The original RAW JSON v1 files are not altered or compromised.
  • The IDS JSON file produced from a RAW JSON file from Agent v4.2.x or earlier will have a migrated_from: empower_raw_json_schema_v1 label.
  • With IDS v16 and protocol v9.1.2 and higher, a new top-level field named raw_json_schema_version in the IDS files lets you retrieve this value by using SQL queries.

To ensure data integrity, it's recommended that you contact your customer success manager (CSM) to help walk you through the following procedure.

📘

NOTE

After migration, the new IDS JSON contains only the data that is already available in the original RAW JSON file. Any new fields introduced in later Agent versions are not populated, and no new RAW JSON is created in the TDP. If you need fields that are available only with Tetra Empower Agent v5.x and higher, you must re-upload the same injection to the TDP by using the latest Tetra Empower Agent version.

When Is Empower Data Migration Needed?

If you used Empower Agent v4.2.x to produce RAW JSON v1 files and then upgraded to Empower Agent v5.x, which generates RAW JSON v2 files, and you want all IDS files to be on the same version, it's recommended that you use the Empower Tetra Data Migration Script to migrate the older Empower Tetra Data schemas to the latest version.

If you want to keep historical data separate and only need to query your latest data, there is no need to migrate your historical Empower data.

How It Works

Protocol empower-raw-to-ids v8.x.x supports only Empower Agent v5.1.x and higher, and re-uploading all Empower injection data to the TDP presents challenges. For these reasons, protocol empower-raw-to-ids v8.2.1 and v9.1.2 include a new feature that allows them to process RAW JSON v1 files (generated by Agent v4.2.x and earlier).

If this new feature is not manually deactivated in the protocol's config file, the protocol will run a migration script that does the following:

  • Detects RAW JSON v1 files generated by Empower Agent v4.2.x and earlier
  • Processes the historical data to the latest Tetra Data Schema version (v15.0.0 and higher)
  • Produces new IDS JSON v15.0.0 or higher files that include all of the metadata, tags, and labels inherited from the RAW JSON v1 files and are labeled with the following: migrated_from: empower_raw_json_schema_v1

📘

NOTE

To deactivate the Empower Tetra Data Migration Script, go to the protocol empower-raw-to-ids v8.2.1 or v9.1.2 config file and change the disableMigrateV1RawJsonSchema value to true.
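
The behavior described in this section can be pictured with a minimal Python sketch. It is not the protocol's actual code. The function name plan_processing, the raw_schema_version argument, and the returned dictionary shape are hypothetical, and the "skip" outcome for a deactivated migration is an assumption. Only the disableMigrateV1RawJsonSchema flag and the migrated_from label come from this guide.

```python
# Label that this guide says is applied to migrated IDS files.
MIGRATION_LABEL = {"migrated_from": "empower_raw_json_schema_v1"}


def plan_processing(raw_schema_version: int, config: dict) -> dict:
    """Decide how a RAW JSON file would be handled (illustrative only)."""
    # Migration stays active unless the flag is explicitly set to true.
    migration_disabled = bool(config.get("disableMigrateV1RawJsonSchema", False))

    if raw_schema_version >= 2:
        # RAW JSON v2 files (Agent v5.x) follow the normal RAW-to-IDS path.
        return {"action": "convert", "labels": {}}
    if migration_disabled:
        # Assumption: with migration turned off, v1 files are not converted.
        return {"action": "skip", "labels": {}}
    # RAW JSON v1 file with migration active: migrate and label the result.
    return {"action": "migrate", "labels": dict(MIGRATION_LABEL)}
```

With an empty config, plan_processing(1, {}) returns the "migrate" action together with the migrated_from label, which mirrors the default behavior described above.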

Run the Empower Tetra Data Migration Script

To run the Empower Tetra Data Migration Script, complete the prerequisites and then do the following steps.

Prerequisites

Make sure that you do the following before running the script:

  • Upgrade to the latest Tetra Empower Agent version.
  • Deactivate the Tetra Empower Agent, so that it isn't generating new data.
  • Deactivate any pipelines that use the empower-raw-to-ids protocol to prevent any new injections from being generated while the script runs.

📘

NOTE

The Empower Tetra Data Migration Script can migrate the following lcuv-empower IDS versions to IDS v15 and higher:

  • v7.x
  • v8.x
  • v9.x
  • v10.x

Step 1: Configure the Pipeline for Migration

Upgrade the empower-raw-to-ids protocol to produce the desired IDS version by doing the following:

  1. Determine what protocol to use by referring to the compatibility matrix.

  2. Make sure the pipeline config disableMigrateV1RawJsonSchema value is set to false. Then, save the protocol.

  3. Select the files to be reprocessed based on the date that they were generated.

Step 2: Reprocess Existing RAW Files to the Desired IDS Version

🚧

IMPORTANT

It is strongly recommended that you stop RAW data ingestion from all of your Tetra Empower Agents during the reconciliation job. When many workflows are created, a significant delay can occur between the creation of a workflow object and the time that the workflow runs. This delay can create a race condition where, by the time the workflow is processed, its input file is no longer the latest version of the RAW file object. Given the expected duration of the reprocessing job and the rate at which new Empower data is uploaded, the risk of this race condition can be significant if you have a large number of injections.
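
To make the race condition concrete, the following Python sketch flags queued workflows whose input is older than the current version of the same RAW file. It is illustrative only. The QueuedWorkflow class, its fields, and the latest_versions mapping are hypothetical and do not correspond to a TDP API.

```python
from dataclasses import dataclass


@dataclass
class QueuedWorkflow:
    file_id: str        # hypothetical identifier of the RAW file object
    input_version: int  # file version captured when the workflow was created


def find_stale_workflows(queued, latest_versions):
    """Return workflows whose input is older than the file's current version."""
    stale = []
    for wf in queued:
        # A missing entry means nothing newer is known, so treat it as current.
        latest = latest_versions.get(wf.file_id, wf.input_version)
        if wf.input_version < latest:
            stale.append(wf)
    return stale
```

Stopping ingestion before the job starts means no RAW file gains a newer version while workflows wait, so a check like this one would find nothing to report.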

Reprocess existing RAW files so that they convert to your desired IDS version by doing the following:

  1. Sign in to the TDP.

  2. In the left navigation menu, choose Pipelines. Then, choose File Processing. The File Processing page appears.

  3. Create a Bulk Pipeline Process Job from the File Processing page. For WORKFLOW STATE, make sure that you select Completed Successfully and Unprocessed.

  4. Update any downstream applications to point to the latest IDS version.

    [Screenshot: Empower Raw to IDS protocol]

Step 3: Verify That the Migration is Successful

Verify that all files have completed migration and that there are no failures by doing the following:

  1. Check that all bulk pipeline process jobs completed by doing the following:

    • In the left navigation menu, choose Bulk Actions.

    • Choose Pipelines. The Pipelines page appears and displays a list of all your active and inactive bulk pipeline process jobs.

    • In the NAME column, select the name of the pipeline (migration). Then, review the File Status section for file failures. For more information, see Monitor a Bulk Pipeline Process Job's Status.

      [Screenshot: Pipelines page]

  2. Verify that there is an IDS in the expected schema version for all of your Empower RAW files by searching for all common/lcuv-empower:v15.0.1 or common/lcuv-empower:v16.0.1 IDS files with metadata.migrated_from:"empower_raw_json_schema_v1". To run the search, do the following:

    • In the left navigation menu, choose Search. Then, choose Search (Classic).
    • Select Labels & Advanced Filters.
    • For the filter, select IDS, then is, and then common/lcuv-empower:v15.0.1 or common/lcuv-empower:v16.0.1.

  3. Make sure that downstream applications are successfully updated and online.
  4. Restart the Tetra Empower Agent. For instructions, see the Agent Status section of the Tetra Empower Agent User Manual (Version 5.2.x).
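
If you export the search results from the verification step, you can reconcile them against your list of RAW files offline. The following Python sketch shows one way to do that. The record shape (the source_raw, ids, and migrated_from keys) and the function name find_unmigrated are assumptions made for illustration and are not a TDP export format.

```python
# IDS versions and label value named in the verification step of this guide.
EXPECTED_IDS = {"common/lcuv-empower:v15.0.1", "common/lcuv-empower:v16.0.1"}
MIGRATION_LABEL = "empower_raw_json_schema_v1"


def find_unmigrated(raw_files, ids_records):
    """Return RAW file paths with no migrated IDS in an expected schema version."""
    migrated_sources = {
        rec["source_raw"]
        for rec in ids_records
        if rec.get("ids") in EXPECTED_IDS
        and rec.get("migrated_from") == MIGRATION_LABEL
    }
    # Anything left over has no matching migrated IDS and needs a closer look.
    return sorted(set(raw_files) - migrated_sources)
```

An empty result suggests that every RAW file in your list has a migrated IDS in one of the expected versions.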

FAQs

How do I know that I'm migrating all of my data to the latest Agent version, and that all of the data is still going to be there?

Verify that all files have completed migration and that there are no failures by following the instructions in Step 3: Verify That the Migration is Successful.

📘

NOTE

The RAW JSON v1 file will not be overwritten in the TDP unless you re-upload the RAW files and generate new injections using Agent v5.x or higher.

What is the downtime if I run the migration?

Migration downtime depends on the number of files being processed.

How long does the migration take?

Approximately 6,000 files can be processed per hour, based on average file sizes.
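
For a rough planning estimate, you can divide your file count by that rate. The following Python sketch does the arithmetic. The function name is hypothetical, and the result is only as good as the approximate throughput figure, which varies with file size.

```python
FILES_PER_HOUR = 6000  # approximate throughput quoted in this guide


def estimate_migration_hours(file_count: int, files_per_hour: int = FILES_PER_HOUR) -> float:
    """Estimate the migration duration in hours for a given number of files."""
    if file_count < 0 or files_per_hour <= 0:
        raise ValueError("file_count must be >= 0 and files_per_hour must be > 0")
    return file_count / files_per_hour
```

For example, estimate_migration_hours(45000) gives 7.5, so 45,000 files would take about seven and a half hours at that rate.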

How does this impact my current workflow?

It is strongly recommended that you stop RAW data ingestion from all of your Tetra Empower Agents during the reconciliation job. After the migration is complete, you can then activate the new Agent version.