TetraScience

Tetra Data Platform Documentation

Welcome to TetraScience Tetra Data Platform (TDP) documentation site. Here, you'll find Product Documentation, API Documentation, and Release Notes for TDP components.


Tetra Data Platform Health Monitoring

Health Monitoring dashboards provide an easy way for you to assess the health and performance of Tetra Data Platform components.

There are six dashboards that you can use to monitor the health of the different components:

  • An Overall dashboard (simply named "Dashboard") that gives you an end-to-end snapshot of the health of the components across the entire Tetra Data Platform ecosystem
  • Agent
  • Data Hub
  • Connector
  • Pipeline
  • File

Accessing the Health Monitoring Dashboards

To view the health monitoring dashboards, click the profile icon, then select Health Monitoring from the menu that appears.

Health Monitoring Dashboard

Dashboard Organization and Health Statuses

Each dashboard is divided into two major sections:
• A simple statistical visualization that shows you how many instances of a component are in a healthy, unhealthy, or critical condition.

Example of a Health Visualization

• A detailed list of component instances that you can filter. Some dashboards allow you to view historical status data as well.

Example of Component Instances

Assessing Component Health (Overview)

Component health is assessed using different methods. Each component is assigned one of three states.
• The Healthy state indicates that the component is operating optimally within specified parameters. What is considered optimal differs by component. For example, a healthy pipeline has a run time that is less than one standard deviation from the average run time for the last five runs, whereas a healthy connector makes a connection within three attempts.

• The Unhealthy state indicates that the component is not operating optimally, but has not failed. How this definition is applied also differs by component. For example, an unhealthy Data Hub has a memory usage value that is greater than 80% but less than or equal to 90% for the past five contiguous minutes.

• The Critical state indicates that the component has failed or is well outside the specified parameters. As with the Healthy and Unhealthy states, the exact conditions that contribute to the Critical state differ by component. For example, if the percentage of used disk space for a Windows Agent is greater than 90%, that component is in a critical state.

Component-specific Healthy, Unhealthy, and Critical states are described in further detail in the component visualizations section.

Component Visualizations

The following subsections detail what the Healthy, Unhealthy, and Critical statuses mean for each of the following components: Data Hub, Data Hub connector, Data Source connector, Windows Agent, and Pipeline.

Windows Agents

Shows the aggregate numbers of Healthy, Unhealthy, and Critical Tetra Windows-based Agents. Here is how those three statuses are defined.

Status | Event
Healthy | A Tetra Agent is in a Healthy state when:
  • Online: A status from the Agent was received within the past 5 minutes and/or a file was received in the past 40 minutes.
  • File Transmission: Files are being transmitted, because a status from the Agent was received within the past 5 minutes and/or a file was received in the past 40 minutes.
  • Environment: The percentage of disk space used is less than or equal to 80%, the percentage of memory used is less than or equal to 80%, and/or CPU usage is less than or equal to 80%.
Unhealthy | The Tetra Agent's status is Unhealthy when:
  • Online: A status from the Agent was not received within the past 5 minutes, but a file was received in the past 40 minutes.
  • File Transmission: The status from the Agent has been intermittent for more than three but less than five minutes, and a file has not been received in the past 20 minutes.
  • Environment: The percentage of disk space used is greater than 80% but less than or equal to 90%, the percentage of memory used is greater than 80% but less than or equal to 90%, and/or CPU usage is greater than 80% but less than or equal to 90%.
Critical | The Tetra Agent's status is Critical when:
  • Online: A status from the Agent was not received within the past 5 minutes and a file was not received in the past 40 minutes.
  • File Transmission: The upload error rate is greater than 70% of all upload events, and the scan access rate is greater than 70% within the past hour. The scan access rate is the ability of the Agent to access a particular folder or drive.
  • Environment: The percentage of disk space used is greater than 90%, the percentage of memory used is greater than 90%, and/or CPU usage is greater than 90%.
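
The Environment thresholds above can be read as a simple banding rule. The following is a minimal sketch of that rule, not TDP's implementation: the names `Health`, `classify_usage`, and `classify_environment` are hypothetical, and it assumes the worst of the three metrics decides the status, which is one possible reading of the "and/or" wording.

```python
from enum import IntEnum


class Health(IntEnum):
    """Hypothetical status levels, ordered so a higher value is worse."""
    HEALTHY = 0
    UNHEALTHY = 1
    CRITICAL = 2


def classify_usage(percent_used: float) -> Health:
    """Map one usage percentage onto the documented 80% / 90% bands."""
    if percent_used > 90:
        return Health.CRITICAL
    if percent_used > 80:
        return Health.UNHEALTHY
    return Health.HEALTHY


def classify_environment(disk: float, memory: float, cpu: float) -> Health:
    """Assumption: the worst of the three metrics decides the status."""
    return max(classify_usage(disk), classify_usage(memory), classify_usage(cpu))
```

For example, under these assumptions `classify_environment(50, 85, 40)` yields `Health.UNHEALTHY`, because memory usage falls in the 80% to 90% band.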

Data Hubs

Displays the numbers of Healthy, Unhealthy, and Critical Data Hub instances.

Status | Event
Healthy | A Data Hub is in a Healthy state when:
  • Online: The last status was received three or fewer minutes ago.
  • Environment: The percentage of disk space used is less than or equal to 80%, the percentage of memory used is less than or equal to 80%, and/or CPU usage is less than or equal to 80%.
Unhealthy | A Data Hub is in an Unhealthy state when:
  • Online: A status has not been received for more than three but less than five minutes.
  • Environment: The percentage of disk space used is greater than 80% but less than or equal to 90%, the percentage of memory used is greater than 80% but less than or equal to 90%, and/or CPU usage is greater than 80% but less than or equal to 90%.
Critical | A Data Hub is in a Critical state when:
  • Online: A status has not been received for more than 5 minutes.
  • Environment: The percentage of disk space used is greater than 90%, the percentage of memory used is greater than 90%, and CPU usage is greater than 90%.
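
The Online rule for a Data Hub depends only on how long ago the last status arrived. The sketch below illustrates that reading; the function name `classify_online` is hypothetical, and because the table does not say how an age of exactly five minutes is treated, the sketch assumes it counts as Unhealthy.

```python
from datetime import datetime, timedelta, timezone


def classify_online(last_status_at: datetime, now: datetime) -> str:
    """Classify a Data Hub by the age of its most recent status message."""
    age = now - last_status_at
    if age <= timedelta(minutes=3):
        return "Healthy"
    # Assumption: an age of exactly five minutes is still Unhealthy.
    if age <= timedelta(minutes=5):
        return "Unhealthy"
    return "Critical"


# Usage: a status last seen four minutes ago falls in the Unhealthy band.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(classify_online(now - timedelta(minutes=4), now))
```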

Hub Connectors

Displays the numbers of Healthy, Unhealthy, and Critical Connector instances.

Status | Event
Healthy | The Hub Connector status is Healthy when:
  • Online: The last status was received three or fewer minutes ago.
  • Environment: The percentage of memory used is less than or equal to 80%.
Unhealthy | The Hub Connector status is Unhealthy when:
  • Online: A status has not been received for more than three but less than or equal to five minutes.
  • Environment: The percentage of memory used is greater than 80% but less than or equal to 90% for the past five contiguous minutes.
Critical | The Hub Connector status is Critical when:
  • Online: A status has not been received in the past five minutes.
  • Environment: The percentage of memory used is greater than 90% for the past five contiguous minutes.
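
The "past five contiguous minutes" condition means a single spike in memory usage should not change the status. Here is a minimal sketch of that idea. It assumes one memory sample per minute (the page does not state the sampling rate), uses the hypothetical name `classify_memory`, and treats a window that is entirely above 80% but not entirely above 90% as Unhealthy.

```python
from typing import Sequence

WINDOW = 5  # minutes; assumes one memory sample per minute


def classify_memory(samples: Sequence[float]) -> str:
    """Classify from per-minute memory percentages, ordered oldest first."""
    recent = samples[-WINDOW:]
    if len(recent) < WINDOW:
        # Assumption: too little history to raise a status.
        return "Healthy"
    if all(s > 90 for s in recent):
        return "Critical"
    if all(s > 80 for s in recent):
        return "Unhealthy"
    return "Healthy"
```

With these assumptions, `classify_memory([95, 95, 70, 95, 95])` stays Healthy, because the dip to 70% breaks the contiguous run.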

Data Source Connectors

Shows the numbers of Healthy, Unhealthy, and Critical Connector instances.

Status | Event
Healthy | The Data Source Connector status is Healthy when a connection is made within three attempts.
Unhealthy | The Data Source Connector status is Unhealthy when:
  • Idle integrations have a waiting time of 5 times the polling interval and/or active integrations have a processing time that matches the polling interval.
Critical | The Data Source Connector status is Critical when:
  • Connection: A connection cannot be made after three consecutive attempts.
  • Waiting/Processing Time: Idle integrations have a waiting time of more than 5 times the polling interval and/or active integrations have a processing time that exceeds the polling interval.
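
Read literally, the Data Source Connector rules compare waiting and processing times against the polling interval. The sketch below applies that literal reading and is only illustrative: `classify_data_source` and its parameters are hypothetical names, times are whole seconds so that "matches" can be tested with equality, and any one Critical condition is assumed to be sufficient.

```python
from typing import Optional


def classify_data_source(
    consecutive_failures: int,
    polling_interval_s: int,
    idle_wait_s: Optional[int] = None,
    active_processing_s: Optional[int] = None,
) -> str:
    """Classify a connector; pass None for a time that does not apply."""
    wait_limit = 5 * polling_interval_s
    # Critical: three consecutive failed connection attempts.
    if consecutive_failures >= 3:
        return "Critical"
    # Critical: waiting or processing time beyond the documented limits.
    if idle_wait_s is not None and idle_wait_s > wait_limit:
        return "Critical"
    if active_processing_s is not None and active_processing_s > polling_interval_s:
        return "Critical"
    # Unhealthy: waiting or processing time sits exactly at the limit.
    if idle_wait_s is not None and idle_wait_s == wait_limit:
        return "Unhealthy"
    if active_processing_s is not None and active_processing_s == polling_interval_s:
        return "Unhealthy"
    return "Healthy"
```

For a 60-second polling interval, an idle integration that has waited 300 seconds is Unhealthy under this reading, and one that has waited 301 seconds is Critical.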

Pipelines

Displays the numbers of Healthy, Unhealthy, and Critical Pipeline instances.

Status | Event
Healthy | A pipeline is considered Healthy when:
  • Failures: Less than 20% of the workflows have failed in the past 24 hours.
  • Run Time: The run time is less than one standard deviation from the average run time for the last 5 runs. For example, if the mean of the past 5 runs is 32 seconds and the standard deviation is 5.7 seconds, the run time should be between 26.3 and 37.7 seconds to be considered healthy.
Unhealthy | A pipeline is considered Unhealthy when:
  • Failures: More than or equal to 20%, but less than 60%, of the workflows have failed in the past 24 hours.
  • Run Time: The run time is greater than one standard deviation but less than two standard deviations from the average run time for the last 5 runs.
Critical | A pipeline has a Critical status when:
  • Failures: All pipelines have failed in the past hour and/or more than 60% of the workflows have failed in the past 24 hours.
  • Run Time: The run time is greater than two standard deviations from the average run time for the last 5 runs.
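
The two pipeline checks above reduce to a failure-rate band and a distance from the average run time measured in standard deviations. The following sketch shows both under stated assumptions: the function names are hypothetical, the average and standard deviation of the last five runs are passed in rather than computed, and boundary values the table leaves open (exactly 60% failures, exactly one or two standard deviations) are assigned to the more severe status.

```python
def classify_failures(failed: int, total: int) -> str:
    """Classify by the share of workflows that failed in the past 24 hours."""
    rate = failed / total if total else 0.0
    if rate < 0.20:
        return "Healthy"
    if rate < 0.60:
        return "Unhealthy"
    return "Critical"  # assumption: exactly 60% counts as Critical


def classify_run_time(run_time: float, avg: float, std_dev: float) -> str:
    """Classify by distance from the average run time of the last five runs."""
    deviation = abs(run_time - avg)
    if deviation < std_dev:
        return "Healthy"
    if deviation < 2 * std_dev:
        return "Unhealthy"
    return "Critical"


# Usage with the figures from the Run Time example (mean 32 s, std dev 5.7 s):
print(classify_run_time(30, 32, 5.7))  # inside 26.3 to 37.7 seconds
print(classify_run_time(40, 32, 5.7))  # between one and two deviations
print(classify_run_time(45, 32, 5.7))  # more than two deviations
```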

Viewing the Overall TDP Health Dashboard

The Overall TDP Health Dashboard provides high-level, end-to-end details about the health of each component of the TDP. You can view quick health visualizations for:
• Files
• Data Hubs
• Connectors
• Windows Agents
• Data Sources
• Pipelines

To view the Health Monitoring Dashboard, do the following.

  1. Log in to TDP using an Administrator account.
  2. Click the profile icon, which is in the upper right corner of the screen.

Example of a Profile Icon

  3. Select Health Monitoring from the menu that appears.

Profile Menu

  4. The Health Monitoring screen appears. The Dashboard tab should appear by default. If it does not, click the Dashboard tab to view it.

Overall Health Dashboard

File Statistics

This section of the dashboard gives an aggregate count of:
• Files Uploaded: The number of files that were uploaded by all members of the organization.
• Workflows Triggered: The number of workflows that were launched.
• Files Indexed: The number of files that were indexed.
• Files in Athena: The number of files that are in Athena. Athena files can be queried using SQL.
• Files Failed Indexing: The number of files that have failed indexing.

Overall File Health

Indicates whether the overall File Health status is Healthy, Unhealthy, or Critical.

Problems

This is an aggregate list of all components that have critical issues. If you want to see information about components that are not in the critical state, you’ll need to look at the Windows Agents, Data Hubs, Data Source Connectors, Pipelines, or Files dashboards.

You can filter issues using the All, Windows Agents, Datahubs, Connectors, Data Sources, and Pipelines buttons at the top of this section.

The critical status list shows the following information for each component.

Field | Description
Health | Indicates the status. For this dashboard, only critical issues are shown.
Name | Indicates the name of the component instance that is currently in the critical state. Hover over the name to see additional details about the component, such as CPU, memory, and disk usage. The details that appear depend on the type of component. For example, connectors show memory usage and the last time that a status was received (last contact), whereas agents show CPU, memory, and disk usage statistics as well as the last contact. Click the copy file icon to copy the unique ID for that component instance.
Health Description | Indicates why the component has been assigned the critical state. By default, only one issue is shown. If there is more than one issue, a link that indicates the number of other issues (for example, 1 More) appears. Click the link to see the other issues.
Link | Provides a link to the place in TDP where you can see configuration details for the component.

To view even more information about individual components, see the component-specific dashboards.

Viewing the Windows Agents' Health Dashboard

The Windows Agents Health Dashboard provides statistics on Windows-based agents. Windows-based agents include the following:
• Tetra Chromeleon Agent
• Tetra Empower Agent
• Tetra File-Log Agent
• Tetra LabX Agent
• Tetra Unicorn Agent

NOTE: More details about each of these agents can be found in the Tetra Agents section of the documentation.

In the Health Monitoring screen, click the Windows Agents tab to see the Windows Agents Health Dashboard.

Windows Agents Dashboard

Overall Status

The aggregate status of the agents appears as a graphic near the top of the screen. For more information about the status, see https://developers.tetrascience.com/docs/monitoring-tdp-health#component-visualizations.

Searching the Windows Agents List and Applying Filters

To search for an agent, enter all or a portion of the agent’s name, its unique identifier (UID), or the type of agent in the search text box. You can also apply a filter by selecting the All, Critical, Unhealthy, or Healthy buttons next to the search box.

Viewing the Windows Agent Individual Component List

The Windows Agent individual components are listed along with other information indicated in the following table.

Field | Description
Health | Indicates the status. For this dashboard, only critical issues are shown.
Name | Indicates the name of the component instance that is currently in the critical state. Hover over the name to see additional details about the component, such as CPU, memory, and disk usage. The details that appear depend on the type of component. For example, connectors show memory usage and the last time that a status was received (last contact), whereas agents show CPU, memory, and disk usage statistics as well as the last contact. Click the copy file icon to copy the unique ID for that component instance.
Latest Status | Indicates when the latest status was assigned. To see a history of the status from the past month, click the View History link that appears beneath the status.
Health Description | Indicates why the component has been assigned the critical state. By default, only one issue is shown. If there is more than one issue, a link that indicates the number of other issues (for example, 1 More) appears. Click the link to see the other issues.
Link

Viewing Status History

To view a history of the status of a component over the past month, click the View History link in the Latest Status field for the agent you are interested in. The following fields appear in the table.

Field | Description
Time | Indicates the time that the status was recorded for historical purposes.
Change | Indicates the status when a change has occurred.
Errors | Shows the errors or issues that are the reason for the status.
Warnings | Shows the errors or warnings that are the reason for the status. Typically, warnings appear when the system moves from a Healthy to an Unhealthy state.

Viewing the Data Hubs Health Dashboard

The Data Hubs Dashboard provides statistical information on the health of the data hubs (and the connectors installed on them) that are part of the Tetra Data Platform.

In the Health Monitoring screen, click the Data Hubs tab to see the Data Hubs Health Dashboard.

Data Hubs Health Dashboard

Overall Status

The aggregate status of the TDP data hubs and their associated connectors appears as a graphic near the top of the screen. For more information about the status, see https://developers.tetrascience.com/docs/monitoring-tdp-health#component-visualizations.

Searching the Data Hubs Health List and Applying Filters

Search for the name of the component by entering all or a portion of the data hub’s name or unique identifier (UID) in the search text box.

You can also apply a filter by selecting the All, Critical, Unhealthy, or Healthy buttons next to the search box.

Underneath the search box and filters, a table provides health details for each data hub or connector. The following describes each of the fields in the table.

Field | Description
Health | Indicates the status. For this dashboard, only critical issues are shown.
Name | Indicates the name of the component instance that is currently in the critical state. Hover over the name to see additional details about the component, such as CPU, memory, and disk usage. The details that appear depend on the type of component. For example, connectors show memory usage and the last time that a status was received (last contact), whereas agents show CPU, memory, and disk usage statistics as well as the last contact. Click the copy file icon to copy the unique ID for that component instance.
Latest Status | Indicates when the latest status was assigned. To see a history of the status from the past month, click the View History link that appears beneath the status.
Health Description | Indicates why the component has been assigned the critical state. By default, only one issue is shown. If there is more than one issue, a link that indicates the number of other issues (for example, 1 More) appears. Click the link to see the other issues.
Link | Provides a link to the place in TDP where you can see configuration details for the component.

Viewing the Data Source Connectors Health Dashboard

The Data Source Connectors Dashboard provides statistical information on the health of the data source connectors that are part of the Tetra Data Platform.

In the Health Monitoring screen, click the Data Source Connectors tab to see the Data Source Connectors Health Dashboard.

Data Source Connectors Health Dashboard

Overall Status

The overall state of all TDP data source connectors appears as a graphic near the top of the screen. For more information about the status, see https://developers.tetrascience.com/docs/monitoring-tdp-health#component-visualizations.

Searching the Data Source Health List and Applying Filters

To search for a component, enter all or a portion of the data source's name or unique identifier (UID) in the search text box. You can also apply a filter by selecting the All, Critical, Unhealthy, or Healthy buttons next to the search box.

Underneath the search box and filters, a table provides health details for each data source connector. The following describes each of the fields in the table.

Field | Description
Health | Indicates the status. For this dashboard, only critical issues are shown.
Name | Indicates the name of the component instance that is currently in the critical state. Hover over the name to see additional details about the component, such as CPU, memory, and disk usage. The details that appear depend on the type of component. For example, connectors show memory usage and the last time that a status was received (last contact), whereas agents show CPU, memory, and disk usage statistics as well as the last contact. Click the copy file icon to copy the unique ID for that component instance.
Latest Status | Indicates when the latest status was assigned. To see a history of the status from the past month, click the View History link that appears beneath the status.
Health Description | Indicates why the component has been assigned the critical state. By default, only one issue is shown. If there is more than one issue, a link that indicates the number of other issues (for example, 1 More) appears. Click the link to see the other issues.
Link | Provides a link to the place in TDP where you can see configuration details for the component.

Viewing the Pipelines Health Dashboard

The Pipelines Dashboard provides statistical information on the health of the pipelines that are part of the Tetra Data Platform.

In the Health Monitoring screen, click the Pipelines tab to see the Pipelines Health Dashboard.

Overall Status

The aggregate status of the TDP pipelines appears as a graphic near the top of the screen. For more information about the status, see https://developers.tetrascience.com/docs/monitoring-tdp-health#component-visualizations.

Searching the Pipelines Health List and Applying Filters

To search for a pipeline, enter all or a portion of the pipeline’s name or unique identifier (UID) in the search text box. You can also apply a filter by selecting the All, Critical, Unhealthy, or Healthy buttons next to the search box.

Pipelines Health Dashboard

Underneath the search box and filters, a table provides health details for each pipeline. The following describes each of the fields in the table.

Field | Description
Health | Indicates the status. For this dashboard, only critical issues are shown.
Name | Indicates the name of the component instance that is currently in the critical state. Hover over the name to see additional details about the component, such as CPU, memory, and disk usage. The details that appear depend on the type of component. For example, connectors show memory usage and the last time that a status was received (last contact), whereas agents show CPU, memory, and disk usage statistics as well as the last contact. Click the copy file icon to copy the unique ID for that component instance.
Latest Status | Indicates when the latest status was assigned. To see a history of the status from the past month, click the View History link that appears beneath the status.
Average Runtime | Indicates the average runtime for the past five pipeline runs.
Health Description | Indicates why the component has been assigned the critical state. By default, only one issue is shown. If there is more than one issue, a link that indicates the number of other issues (for example, 1 More) appears. Click the link to see the other issues.
Link | Provides a link to the place in TDP where you can see configuration details for the component.

Viewing the File Health Dashboard

The Files Dashboard provides statistical information on the health of the files that are part of the Tetra Data Platform.

In the Health Monitoring screen, click the Files tab to see the Files Health Dashboard.

File Health Dashboard

Overall Status

File statistics and the overall file health status appear near the top of the screen. For more information about the status, see https://developers.tetrascience.com/docs/monitoring-tdp-health#component-visualizations.

Searching the Files Health List and Applying Filters

To search for a file, enter all or a portion of the file's name in the search text box. You can also specify that you want to see files that have been uploaded to the data lake (S3 bucket) or indexed during a certain date range.

Underneath the search box and date filters, a table provides health details for each file. The following describes each of the fields in the table.

Field | Description
Name | Indicates the name of the file.
Date Uploaded | Indicates the date and time the file was uploaded.
Date Indexed | Indicates the date and time the file was indexed.
Source | Indicates the source of the file.
Link | Indicates the unique ID for the file. Clicking the pages icon next to the file copies the file ID to your clipboard so that you can use it elsewhere.

Updated about a month ago

