Reprocess Files

To keep data consistent across the places where it's stored and accessed in the Tetra Data Platform (TDP), you can either reprocess selected files or create Reconciliation Jobs for a larger set of files. There are two ways to monitor and manage file processing in your TDP environment:

  • The Files health monitoring dashboard
  • The Reconciliation Jobs page

Data Storage Locations

Data within the TDP is currently stored in the following systems (locations):

  • Data Lake (S3)
  • System Properties (FileInfo) Service
  • Search Indices (Elasticsearch)
  • Athena

Reconciliation Job Options

Because TDP data storage locations are loosely integrated, data discrepancies may occur (for example, after an AWS database outage). To address these potential data discrepancies, the Reconciliation Jobs page provides clear reporting and monitoring capabilities that help you do the following:

  • Solve historical data inconsistencies or major service/platform failures and return to a consistent state in a timely manner (typically performed by an Enterprise IT Admin)
  • Ensure regular clean-up of data inconsistencies that may occur intermittently (typically performed by an IT Admin)

To perform a system file cleanup and return to a consistent data state, you can reconcile files and create jobs for the following data storage locations:

  • DL to File Info
  • FileInfo to ES
  • FileInfo to Athena

Jobs

A job consists of at most two phases: scan and reprocess. A job may also have only one phase (scan or reprocess). Jobs are unique for each organization.

🚧

IMPORTANT

You can cancel a job, but you can't pause and then resume a job.

Job Phases

A job phase is either a service scan or a service reprocess. During a service scan phase, the job generates a list of reprocessing events, which determines how the reprocess phase functions.
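The relationship between the two phases can be pictured with a minimal sketch. Everything here is hypothetical and for illustration only: the `Job` class, the `scan` and `reprocess` functions, and the idea that a reprocessing event is simply a file name are assumptions, not the TDP's actual data model or API. The sketch also assumes that, when a job has both phases, scan runs before reprocess.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    SCAN = "scan"
    REPROCESS = "reprocess"


# The only phase combinations the text allows: one phase, or scan then reprocess.
VALID_PHASES = {
    (Phase.SCAN,),
    (Phase.REPROCESS,),
    (Phase.SCAN, Phase.REPROCESS),
}


@dataclass
class Job:
    """Hypothetical job model; not the TDP's internal representation."""
    org: str
    phases: tuple
    events: list = field(default_factory=list)

    def __post_init__(self):
        self.phases = tuple(self.phases)
        if self.phases not in VALID_PHASES:
            raise ValueError(f"unsupported phase combination: {self.phases}")


def scan(job, files, needs_reprocessing):
    """Scan phase: record one reprocessing event per file that needs work."""
    if Phase.SCAN not in job.phases:
        raise ValueError("job has no scan phase")
    job.events = [f for f in files if needs_reprocessing(f)]
    return job.events


def reprocess(job, handler):
    """Reprocess phase: act only on the events already recorded on the job."""
    if Phase.REPROCESS not in job.phases:
        raise ValueError("job has no reprocess phase")
    return [handler(event) for event in job.events]
```

In this sketch the reprocess phase never looks at the full file list; it only consumes the events that the scan phase produced, which mirrors how the scan output determines what the reprocess phase does.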

Monitor and Reprocess Files by Using the Files Health Monitoring Dashboard

To monitor and manage file reprocessing from the Files health monitoring dashboard, see Monitor Files Health.

Monitor and Reprocess Files by Using the Reconciliation Jobs Page

To create a Reconciliation Job from the Reconciliation Jobs page, do the following:

  1. In the left navigation menu, select the hamburger menu icon.
  2. Choose Bulk Actions. Then, choose Reconciliation Jobs. The Reconciliation Jobs page appears.
  3. Select the upper right Create Reconciliation Job button. The Create Reconciliation Job dialog appears.
  4. In the Create Reconciliation Job dialog, enter the following:
    • For TYPE, enter the type of job that you want to create (DL to File Info, FileInfo to ES, or FileInfo to Athena).
    • For WHICH FILES, select either Specific Files or All files, depending on which files you want to reprocess.
    • (For FileInfo to ES and FileInfo to Athena jobs only) For ERROR CODE, select either the specific error code type to run the job on, or All Errors to run the job on all error code types.
    • For DATE RANGE, enter the date range for the files that you want to reprocess.
    • (For FileInfo to ES and FileInfo to Athena jobs only) For FILE CATEGORY, select the file category that you want the job to run for (All Categories, IDS, RAW, or PROCESSED).
    • (For DL to FileInfo jobs only) For CONCURRENCY, select either Low, Medium, or High.
    • (Optional) For NAME, enter a name for the job.

📘

NOTE

Reconciliation Job names can be no longer than 64 characters and must match the following pattern (letters, digits, hyphens, underscores, plus signs, periods, and spaces only):

/^[0-9a-zA-Z-_+. ]+$/

  5. Choose Reprocess Files. A dialog appears that confirms that the job was created. To check the job's status after it's created, go to the Reconciliation Jobs page.
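If you prepare job settings outside the UI (for example, in a runbook or checklist), the rules from the steps and note above can be checked ahead of time. The following is a minimal sketch, not a TDP API: the function `validate_job_request` and the field names `error_code`, `file_category`, and `concurrency` are hypothetical. Only the job type labels, the 64-character limit, the allowed character set, and which fields apply to which job types come from this page.

```python
import re

# Character set taken from the note above; the hyphen is escaped for clarity.
NAME_RE = re.compile(r"[0-9a-zA-Z\-_+. ]+")
MAX_NAME_LENGTH = 64

JOB_TYPES = ("DL to File Info", "FileInfo to ES", "FileInfo to Athena")

# Which type-specific dialog fields apply to which job types (from the steps above).
FIELD_APPLIES_TO = {
    "error_code": {"FileInfo to ES", "FileInfo to Athena"},
    "file_category": {"FileInfo to ES", "FileInfo to Athena"},
    "concurrency": {"DL to File Info"},
}


def validate_job_request(job_type, name=None, **fields):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if job_type not in JOB_TYPES:
        problems.append(f"unknown job type: {job_type!r}")
    if name is not None:  # NAME is optional in the dialog
        if len(name) > MAX_NAME_LENGTH:
            problems.append("name is longer than 64 characters")
        if not NAME_RE.fullmatch(name):
            problems.append("name contains unsupported characters")
    for field_name, value in fields.items():
        allowed_types = FIELD_APPLIES_TO.get(field_name)
        if allowed_types is None:
            problems.append(f"unknown field: {field_name}")
        elif value is not None and job_type not in allowed_types:
            problems.append(f"{field_name} does not apply to {job_type}")
    return problems
```

For example, `validate_job_request("FileInfo to Athena", concurrency="Low")` reports a problem because CONCURRENCY applies to DL to FileInfo jobs only. The sketch uses `fullmatch` rather than `^...$` so that a trailing newline in a name is also rejected.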

Reconciliation Jobs Page Contents

The Reconciliation Jobs page displays a list of all your active and inactive Reconciliation Jobs, which includes the following information:

  • STATE—shows the job’s status
  • NAME—shows the job’s name
  • COMPLETION—shows how much of the job has been processed (measured as a percentage)
  • STARTED—shows the date and time the job started processing
  • COMPLETED—shows the date and time the job completed
  • ERRORS—shows any errors that occurred during the job
  • COMPONENT—shows the storage type (fileinfoToAthena, fileInfoToEs, or s3ToFileInfo)
  • INFO—opens a Bulk Processing Job Details dialog that shows additional information about the job, including the JOB ID, TYPE, and FILE STATUS for each file the job processed.

📘

NOTE

If the Reconciliation Jobs page shows failed files for a job after it's run, contact your customer success manager (CSM) for troubleshooting support.