Monitor Files Health

Data consistency is crucial for data stored and accessed within the Tetra Data Platform (TDP). You can use the Files Health dashboard to ensure consistency between places where data is stored and accessed in the TDP. Data is currently stored in these locations:

  • Data Lake (S3)
  • System Properties (FileInfo) Service
  • Search Indices (Elasticsearch)
  • Athena

The Files Health Dashboard provides clear reporting and monitoring capabilities on the health of those data files in the TDP.

To access and view the Files Health Dashboard:

  1. Log in to TDP using an Administrator user account.
  2. In the Tetra Data Platform, click the hamburger icon in the top-left corner of the page to expand the TDP menu options (or hover over the list of icons to display the menu options).
  3. Select Health Monitoring from the list of menu options that appears on the left side of the page.
  4. From the Health Monitoring page, click the Files tab to view the Files Health Dashboard:

Files Health Dashboard

As an Administrator, you can quickly identify whether any inconsistencies exist across the services. At the top of the page, you can view:

  • Total number of files in the Data Lake (S3) - The total number of files includes files in these categories: RAW, IDS, PROCESSED, and TMP (files or artifacts from a pipeline).
  • Health status of the Elasticsearch (ES) cluster - State provided by AWS CloudWatch indicating cluster health:
    • Green - Cluster is healthy, no action required
    • Yellow - One or more of the replica shards on the ES cluster are not allocated to a node
    • Red - At least one primary shard is not allocated to any node
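
The three states above can be captured in a small lookup, which is handy if you record the dashboard status in your own runbooks or scripts. This is a minimal sketch: the names `ClusterHealth`, `DESCRIPTIONS`, and `needs_escalation` are hypothetical and are not part of TDP or AWS.

```python
from enum import Enum


class ClusterHealth(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# Descriptions copied from the status list above.
DESCRIPTIONS = {
    ClusterHealth.GREEN: "Cluster is healthy, no action required",
    ClusterHealth.YELLOW: "One or more replica shards are not allocated to a node",
    ClusterHealth.RED: "At least one primary shard is not allocated to any node",
}


def needs_escalation(status: ClusterHealth) -> bool:
    """Return True for the states that call for contacting a CSM or AWS IT team."""
    return status in (ClusterHealth.YELLOW, ClusterHealth.RED)
```

For example, `needs_escalation(ClusterHealth("yellow"))` returns `True`, while the green state returns `False`.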


Yellow or Red Status?

If the ES Cluster Health status is yellow or red, please contact your TetraScience Customer Success Manager (CSM) and/or your AWS IT team.

To review free and used storage space details for the ES cluster, hover over the information icon next to the status:


Hover to view storage space details for ES cluster

The amount of free storage space of the ES cluster determines the color:

  • Green: Free storage space > 50%
  • Yellow: 30% <= Free storage space <= 50%
  • Red: Free storage space <= 30%
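
The storage thresholds above can be expressed as a short function. This is an illustrative sketch only, and `storage_color` is a hypothetical name. The list places exactly 30% under both yellow and red; the sketch assumes that exactly 30% counts as red.

```python
def storage_color(free_pct: float) -> str:
    """Map the free storage percentage of the ES cluster to a status color."""
    if not 0 <= free_pct <= 100:
        raise ValueError("free_pct must be between 0 and 100")
    if free_pct > 50:
        return "green"
    if free_pct > 30:
        return "yellow"
    # Assumption: exactly 30% is treated as red, because the documented
    # yellow and red ranges both include that value.
    return "red"
```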

The middle of the page shows an overview of the number of files processed for each of the TDP services:


Processed Files across TDP Services


Data Displayed

Data in the overview table refreshes on a regular basis. For each category, the Last Update column indicates when the statistics were last calculated. Please note that the calculation of totals does not include any files processed in the last four hours.
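
If you compare the dashboard totals with your own counts, apply the same four-hour exclusion first. The following sketch shows one way to do that, assuming each file record is a dictionary with a timezone-aware `processed_at` timestamp; the record shape and the name `counted_files` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Files processed within this window are left out of the totals.
EXCLUSION_WINDOW = timedelta(hours=4)


def counted_files(files, now=None):
    """Return only the files processed more than four hours before `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - EXCLUSION_WINDOW
    return [f for f in files if f["processed_at"] < cutoff]
```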

The Processed Files section includes:

  • Category: Data Lake file categories: TOTAL, RAW, IDS, PROCESSED, and TMP (files or artifacts from a pipeline). By default, the TOTAL category is selected and its corresponding files display below.
  • Last Update: Time/Date indicates when the statistics were last calculated.
  • Files Uploaded: Number of files that were uploaded to the category. Only the latest version of a file is counted, and files marked for deletion are excluded.
  • DL to FileInfo: Number of files uploaded from the Data Lake (S3) to the FileInfo Service (in green) and percentage of file discrepancies (in red).
  • FileInfo to ES: Number of files uploaded from the FileInfo Service to ES Indices (in green) and percentage of file discrepancies (in red).
  • FileInfo to Athena: Number of files uploaded from the FileInfo Service to Athena (in green) and percentage of file discrepancies (in red).

You can hover over the file results in the DL to FileInfo, FileInfo to ES, and FileInfo to Athena columns to view:

  • Number of processed files (displays in green)
  • Number of discrepancies (displays in red)
  • Number of expected files

The criticality of any file inconsistency is indicated by color:

  • Green - Indicates no errors
  • Orange - Indicates 1 to 10 errors
  • Red - Indicates more than 10 errors
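
The color rule above maps directly to a small function. This is a sketch for illustration, and `criticality` is a hypothetical name.

```python
def criticality(error_count: int) -> str:
    """Map a number of file errors to the color used for criticality."""
    if error_count < 0:
        raise ValueError("error_count cannot be negative")
    if error_count == 0:
        return "green"
    if error_count <= 10:
        return "orange"
    return "red"
```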

If any file discrepancies exist, you can click Files Reprocessing to perform a system file cleanup and return to a consistent data state. For more details about how to reprocess files, see the file reprocessing documentation.

The File Events section lists only the latest failures:


List of Files with Discrepancies

The Detailed List of Files section includes:

  • Category: Data Lake file categories: RAW, IDS, and TMP (files or artifacts from a pipeline).
  • File Name: Name of file. You can hover over the name to view its entire file path. To copy the file, click the copy file icon.
  • File ID: Unique identifier of a file in TDP (primary key). You can hover over the ID to view it entirely. To copy the unique ID for the file, click the copy file icon.
  • Event Timestamp: Date and time of when the failed event occurred.
  • Component: Source of what caused the file to have discrepancies.

To organize and narrow the list of files to display, you can:

  • Enter a Name or ID in the Search box to search the files
  • Sort files by: Name A-Z, Name Z-A, Date New - Old, Date Old - New, or Category
  • Set the number of files to display at a time (25 files is the default)
  • Filter the files by component type: All, FileInfo only, Athena only, or Elasticsearch only
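
The same search, sort, filter, and page-size controls can be applied to an exported list of file events. The sketch below makes several assumptions: `FileEvent` and `select_events` are hypothetical names, search is a case-insensitive substring match on the file name or ID, and the default sort order is newest first.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileEvent:
    category: str
    file_name: str
    file_id: str
    event_timestamp: datetime
    component: str  # "FileInfo", "Athena", or "Elasticsearch"


# Sort options named after the dashboard choices: (key function, reverse).
SORT_OPTIONS = {
    "Name A-Z": (lambda e: e.file_name.lower(), False),
    "Name Z-A": (lambda e: e.file_name.lower(), True),
    "Date New - Old": (lambda e: e.event_timestamp, True),
    "Date Old - New": (lambda e: e.event_timestamp, False),
    "Category": (lambda e: e.category, False),
}


def select_events(events, search="", component="All",
                  sort_by="Date New - Old", page_size=25, page=0):
    """Filter, sort, and page a list of FileEvent records."""
    needle = search.strip().lower()
    matches = [
        e for e in events
        if (component == "All" or e.component == component)
        and (not needle
             or needle in e.file_name.lower()
             or needle in e.file_id.lower())
    ]
    key, reverse = SORT_OPTIONS[sort_by]
    ordered = sorted(matches, key=key, reverse=reverse)
    start = page * page_size
    return ordered[start:start + page_size]
```

For example, `select_events(events, component="Athena", sort_by="Name A-Z")` returns up to 25 Athena failures in alphabetical order.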

To view the error details of a file, you can click a file in the list:


Selected File Showing Details

The additional file details include:

  • Date Updated: Date/Time when the error or failure occurred in the file.
  • Trace ID: If applicable, identifier that links related files together (foreign key).
  • Pipeline ID: If applicable, identifier for the pipeline used.
  • IDS Schema: If applicable, shows the name of the IDS Schema.
  • AWS link and Errors: Displays the error message and includes a link to query AWS CloudWatch based on the File ID. You can access CloudWatch to troubleshoot and refine your search using the ContextID. Access to the AWS CloudWatch link is based on your user privilege setting.
  • ContextID: An AWS CloudWatch specific ID you can use to navigate the logs to help locate the relevant section you want to refine.
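
If you prefer to search the logs yourself, a CloudWatch Logs Insights query can be narrowed by File ID and then by ContextID. The sketch below only builds the query text. It assumes that both identifiers appear in the log message, which may not match how your deployment logs them, and the function name `build_insights_query` is hypothetical. The log group to query is deployment specific and is not shown.

```python
from typing import Optional


def _quoted(value: str) -> str:
    """Wrap an identifier in double quotes, rejecting characters that need escaping."""
    if not value or '"' in value or "\\" in value:
        raise ValueError("identifier must be non-empty and contain no quotes or backslashes")
    return f'"{value}"'


def build_insights_query(file_id: str, context_id: Optional[str] = None,
                         limit: int = 50) -> str:
    """Build a Logs Insights query that matches a File ID and, optionally, a ContextID."""
    clauses = [
        "fields @timestamp, @message",
        # Assumption: the File ID appears somewhere in the log message.
        f"filter @message like {_quoted(file_id)}",
    ]
    if context_id:
        clauses.append(f"filter @message like {_quoted(context_id)}")
    clauses += ["sort @timestamp desc", f"limit {limit}"]
    return " | ".join(clauses)
```

You can paste the resulting string into the Logs Insights query editor after selecting the relevant log group.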

File Processing Failures

File processing failures can also be found in Health Monitoring. You can find FileInfo, Athena, and Elasticsearch file failures.

  1. In the Health Monitoring page, click the Files tab.

File Processing Failures

  2. Scroll down to see the File Processing Failures part of the page.
  3. You can filter by component type:
    • All
    • FileInfo
    • Athena
    • Elasticsearch

Files that have failed appear with the following details.

  • Category: Indicates the type of file, such as RAW, IDS, PROCESSED, or TMP. Files that are not one of these types are labeled as "UNKNOWN".
  • File Name: Lists the name of the file.
  • File ID: System-generated unique file identifier.
  • Failure Time: Time that the file failed.
  • Component: Indicates whether it is a FileInfo, Athena, or Elasticsearch file processing failure.
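
When you work with an exported list of failures, the category labeling and component breakdown can be reproduced in a few lines. This is a sketch under stated assumptions: the record shape (a dictionary with a `component` key) and the names `label_category` and `failures_by_component` are hypothetical, and the category names are matched without regard to case.

```python
from collections import Counter

KNOWN_CATEGORIES = {"RAW", "IDS", "PROCESSED", "TMP"}


def label_category(raw_category) -> str:
    """Return the category label, or "UNKNOWN" for anything unrecognized."""
    value = (raw_category or "").strip().upper()
    return value if value in KNOWN_CATEGORIES else "UNKNOWN"


def failures_by_component(failures) -> Counter:
    """Count failures per component (FileInfo, Athena, or Elasticsearch)."""
    return Counter(f["component"] for f in failures)
```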

Click the file name to view additional information about the file and the error. You can also view logs if you want to get more details about the error. To view more details about a log, click the arrow in the Raw Event column.


File Processing Details