Reprocess Files
To keep data consistent across the places where it is stored and accessed in the Tetra Data Platform (TDP), you can either reprocess selected files or create Reconciliation Jobs for a larger set of files. There are two ways to monitor and manage file processing in your TDP environment: the Files Health Monitoring dashboard and the Data Reconciliation page.
Data Storage Locations
Data within the TDP is currently stored in the following systems (locations):
- Tetra Data Lake (Amazon S3)
- System Properties (FileInfo) Service
- Search Indices (OpenSearch)
- Amazon Athena
Reconciliation Job Options
Because TDP data storage locations are loosely integrated, data discrepancies may occur (for example, after an AWS database outage). To address these potential data discrepancies, the Reconciliation Jobs page provides clear reporting and monitoring capabilities that help you do the following:
- Resolve historical data inconsistencies, or recover from major service or platform failures, and return to a consistent state in a timely manner (typically performed by an Enterprise IT Admin)
- Ensure regular clean-up of data inconsistencies that may occur intermittently (typically performed by an IT Admin)
To perform a system file cleanup and return to a consistent data state, you can create jobs that reconcile files between the following pairs of data storage locations (DL is the Tetra Data Lake, and ES is the search indices):
- DL to File Info
- FileInfo to ES
- FileInfo to Athena
Jobs
A job consists of at most two phases: scan and reprocess. A job can also have only one of these phases (scan or reprocess). Jobs are unique to each organization.
IMPORTANT
You can cancel a job, but you can't pause and then resume a job.
Job Phases
A job phase can be a service scan or service reprocess. During a service scan phase, the job generates a list of reprocessing events, which determines how the reprocess phase functions.
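The relationship between jobs and phases can be sketched in a few lines of Python. This is only an illustrative local model, not a TDP API: the `Job`, `Phase`, `scan`, and `reprocess` names, and the `is_inconsistent` and `handle` callbacks, are all hypothetical and were chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List


class Phase(Enum):
    SCAN = "scan"
    REPROCESS = "reprocess"


@dataclass
class Job:
    """Hypothetical local model of a Reconciliation Job (not a TDP API type)."""

    phases: List[Phase]
    cancelled: bool = False

    def __post_init__(self) -> None:
        # A job has one or two phases, and each phase is used at most once.
        if len(self.phases) not in (1, 2) or len(set(self.phases)) != len(self.phases):
            raise ValueError("a job has one or two distinct phases")

    def cancel(self) -> None:
        # Cancelling is final: there is deliberately no pause() or resume().
        self.cancelled = True


def scan(files: Iterable[str], is_inconsistent: Callable[[str], bool]) -> List[str]:
    """Scan phase: build the list of reprocessing events (here, file paths)."""
    return [path for path in files if is_inconsistent(path)]


def reprocess(job: Job, events: Iterable[str], handle: Callable[[str], None]) -> int:
    """Reprocess phase: handle each event until done or the job is cancelled."""
    handled = 0
    for event in events:
        if job.cancelled:
            break
        handle(event)
        handled += 1
    return handled
```

In this sketch the output of `scan` is the only input that `reprocess` needs, which mirrors how the scan phase determines what the reprocess phase does.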
Monitor and Reprocess Files by Using the Files Health Monitoring Dashboard
To monitor and manage file reprocessing from the Files health monitoring dashboard, see Monitor Files Health.
Monitor and Reprocess Files by Using the Data Reconciliation Page
To create a Reconciliation Job from the Data Reconciliation page, do the following:
- In the left navigation menu, choose Bulk Actions.
- Choose Data Reconciliation. The Data Reconciliation page appears.
- Select the upper right Create Reconciliation Job button. The Create Reconciliation Job dialog appears.
- In the Create Reconciliation Job dialog, enter the following:
- For TYPE, enter the type of job that you want to create (DL to File Info, FileInfo to ES, or FileInfo to Athena).
- For WHICH FILES, select either Specific Files or All files, based on which files you want to reprocess.
- (For FileInfo to ES and FileInfo to Athena jobs only) For ERROR CODE, select either the specific error code type to run the job on, or All Errors to run the job on all error code types.
- For DATE RANGE, enter the date range for the files that you want to reprocess.
- (For FileInfo to ES and FileInfo to Athena jobs only) For FILE CATEGORY, select the file category that you want the job to run for (All Categories, IDS, RAW, or PROCESSED).
- (For DL to File Info jobs only) For CONCURRENCY, select Low, Medium, or High.
- (Optional) For NAME, enter a name for the job.
NOTE
Reconciliation Job names can be no more than 64 characters long and can contain only the characters matched by the following regular expression:
/^[0-9a-zA-Z-_+. ]+$/
- Choose Reprocess Files. A dialog appears that confirms that the job was created. To view the job's status after it's created, view the Reconciliation Jobs page.
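If you prepare job details outside the UI (for example, in a runbook or checklist), the field rules from the steps above can be checked before you fill in the dialog. The following is a minimal sketch, not part of TDP: the `validate_job_request` function and its parameter names are hypothetical, and the allowed values are copied from this page rather than from any API.

```python
import re
from typing import List, Optional

FILEINFO_TYPES = {"FileInfo to ES", "FileInfo to Athena"}
DL_TYPE = "DL to File Info"
FILE_CATEGORIES = {"All Categories", "IDS", "RAW", "PROCESSED"}
CONCURRENCY_LEVELS = {"Low", "Medium", "High"}

# Character class from the NOTE above; fullmatch() replaces the ^ and $ anchors.
NAME_CHARS = re.compile(r"[0-9a-zA-Z-_+. ]+")
MAX_NAME_LENGTH = 64


def validate_job_request(
    job_type: str,
    error_code: Optional[str] = None,
    file_category: Optional[str] = None,
    concurrency: Optional[str] = None,
    name: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems: List[str] = []

    if job_type in FILEINFO_TYPES:
        # CONCURRENCY is offered for DL to File Info jobs only.
        if concurrency is not None:
            problems.append("CONCURRENCY applies to DL to File Info jobs only")
        if file_category is not None and file_category not in FILE_CATEGORIES:
            problems.append(f"unknown FILE CATEGORY: {file_category!r}")
    elif job_type == DL_TYPE:
        # ERROR CODE and FILE CATEGORY are offered for FileInfo jobs only.
        if error_code is not None or file_category is not None:
            problems.append("ERROR CODE and FILE CATEGORY apply to FileInfo jobs only")
        if concurrency is not None and concurrency not in CONCURRENCY_LEVELS:
            problems.append(f"unknown CONCURRENCY: {concurrency!r}")
    else:
        problems.append(f"unknown TYPE: {job_type!r}")

    # NAME is optional; None means that no name was given.
    if name is not None:
        if len(name) > MAX_NAME_LENGTH:
            problems.append("NAME is longer than 64 characters")
        if not NAME_CHARS.fullmatch(name):
            problems.append("NAME is empty or contains unsupported characters")

    return problems
```

For example, `validate_job_request("DL to File Info", name="bad/name")` returns one problem because `/` is not in the allowed character set, while a name such as `nightly cleanup_v1.2` passes.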
Data Reconciliation Page Contents
The Data Reconciliation page displays a list of all your active and inactive Reconciliation Jobs, which includes the following information:
- STATE—shows the job’s status
- NAME—shows the job’s name
- COMPLETION—shows how much of the job has been processed (measured as a percentage)
- STARTED—shows the date and time the job started processing
- COMPLETED—shows the date and time the job completed
- ERRORS—shows any errors that occurred during the job
- COMPONENT—shows the storage type (fileinfoToAthena, fileInfoToEs, or s3ToFileInfo)
- INFO—opens a Bulk Processing Job Details dialog that shows additional information about the job, including the JOB ID, TYPE, and FILE STATUS for each file the job processed.
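When you cross-reference the COMPONENT column with the job TYPE chosen at creation, a small lookup can help. The mapping below is inferred only from the names that appear on this page, so treat it as an assumption rather than a documented contract. The `component_for` helper is hypothetical, and it accepts both the "File Info" and "FileInfo" spellings.

```python
from typing import Dict

# Inferred from the names only: TYPE label (normalized) -> COMPONENT value.
COMPONENT_BY_TYPE: Dict[str, str] = {
    "dl to fileinfo": "s3ToFileInfo",
    "fileinfo to es": "fileInfoToEs",
    "fileinfo to athena": "fileinfoToAthena",
}


def component_for(job_type: str) -> str:
    """Return the COMPONENT value assumed to correspond to a job TYPE label."""
    # Normalize case, whitespace, and the "File Info" / "FileInfo" variants.
    key = " ".join(job_type.lower().split()).replace("file info", "fileinfo")
    return COMPONENT_BY_TYPE[key]
```

An unknown label raises `KeyError`, which keeps a mistyped TYPE from being silently matched to the wrong component.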
NOTE
If the Data Reconciliation page shows failed files for a job after it's run, contact your customer success manager (CSM) for troubleshooting support.