Data Sync Utility

The Data Sync Utility transfers primary scientific data from laboratory instruments and control software to the desktop analytical applications that biopharmaceutical companies commonly use. It supports specialized scientific applications, such as flow cytometry software, as well as widely used analytical tools, such as Excel, JMP, and GraphPad Prism. Analysis results are uploaded to the Tetra Scientific Data and AI Cloud with comprehensive scientific metadata. This makes the results easy to discover, ready for automated integration with laboratory information management systems (LIMS) and electronic lab notebook (ELN) systems, and suitable for advanced analytics.

🚧

IMPORTANT

The Data Sync Utility is currently available through an early adopter program (EAP) and is activated for customers through coordination with TetraScience. For more information, or to activate the Data Sync Utility in your environment, contact your customer success manager (CSM) or account executive.

Prerequisites

The Data Sync Utility requires the following:

  • Tetra Data Platform (TDP) v4.2.3 or higher (TDP v4.1.3 through v4.2.2 also work, but require a manual workaround for SSO login)
  • Administrator privileges (required for macOS installation)
  • Supported operating systems:
    • Windows 7 or higher
    • macOS 10.15 or higher
  • Minimum hardware requirements:
    • 1.6 GHz or faster processor
    • 1 GB of RAM
    • 1 GB of storage and additional space for synchronized data

Installation

macOS

  1. Download one of the following .dmg installers:
  2. Install the application through the Applications folder.
  3. Configure the TDP instance connection by opening the application and then entering your TDP server URL (for example, tdp.tetrascience-demo.com).

Windows

  1. Download the .exe Windows installer.
  2. Run the installer locally.
  3. Configure the TDP instance connection by opening the application and then entering your TDP server URL (for example, tdp.tetrascience-demo.com).

File Synchronization

Ignored Files

The Data Sync Utility automatically ignores the following files and patterns:

  • Partial downloads and temporary files:

    • **/*.part - Partial downloads
    • **/*.tmp - Temporary files
    • **/~*.* - Files starting with ~
    • **/$*.* - Files starting with $
    • Files that start with a dot (.)
  • macOS specific:

    • .DS_Store - Stores custom folder attributes
    • .AppleDouble - Stores additional file resources
    • .LSOverride - Contains the absolute path to the app to be used
    • Icon\r - Custom Finder icon
    • ._* - Thumbnails
    • .Spotlight-V100 - Directory that might appear on external disk
    • .Trashes - File that might appear on external disk
    • __MACOSX - Resource fork
  • Linux specific:

    • ~$ - Backup files
  • Windows specific:

    • Thumbs.db - Image file cache
    • ehthumbs.db - Folder config file
    • Desktop.ini - Stores custom folder attributes
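
The ignore rules above can be approximated with shell-style pattern matching. The following is a minimal sketch, not the utility's actual implementation: the function name is_ignored and the constant IGNORED_NAME_PATTERNS are hypothetical, the match is case-sensitive by assumption, and the Linux "~$" entry is left out because its exact matching behavior isn't specified here. Checking every path component stands in for the "**/" prefix.

```python
import fnmatch
from pathlib import PurePosixPath

# Patterns transcribed from the list above. How the utility matches them
# internally is an assumption of this sketch.
IGNORED_NAME_PATTERNS = [
    "*.part",       # partial downloads
    "*.tmp",        # temporary files
    "~*.*",         # files starting with ~
    "$*.*",         # files starting with $
    ".*",           # dot files (.DS_Store, .AppleDouble, .Trashes, ...)
    "Icon\r",       # custom Finder icon (name ends in a carriage return)
    "__MACOSX",
    "Thumbs.db",
    "ehthumbs.db",
    "Desktop.ini",
]


def is_ignored(relative_path: str) -> bool:
    """Return True if any component of the path matches an ignore pattern."""
    parts = PurePosixPath(relative_path).parts
    return any(
        fnmatch.fnmatchcase(part, pattern)
        for part in parts
        for pattern in IGNORED_NAME_PATTERNS
    )
```

For example, is_ignored("results/~$plate.xlsx") is true because the file name starts with ~, while is_ignored("run1/plate.xlsx") is false.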

Download Process

  • Automatic retry mechanism for failed downloads
  • Checksum verification:
    • Computed on both download and upload
    • Downloads are retried up to three times if checksums don't match
    • Error raised after three failed attempts
  • Parallel download processing:
    • Uses threaded workers for improved performance
    • Number of workers scales with available CPU cores
    • Minimum: two workers
    • Maximum: six workers
  • File limit:
    • Maximum of 10,000 files can be downloaded at once
    • For saved searches containing more than 10,000 files, only the first 10,000 files are downloaded
    • Files are sorted by creation date (newest first) when applying this limit
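
The worker limits and checksum retries described above can be sketched as follows. This is an illustration under stated assumptions: the hash algorithm (SHA-256 here), the one-worker-per-core scaling, and the names worker_count, download_verified, fetch, and expected_sha256 are all hypothetical, and "three times" is read as three attempts in total.

```python
import hashlib
import os
from typing import Callable, Optional

MIN_WORKERS = 2   # documented minimum
MAX_WORKERS = 6   # documented maximum
MAX_ATTEMPTS = 3  # assumed to mean three attempts in total


def worker_count(cpu_cores: Optional[int] = None) -> int:
    """Scale workers with CPU cores, clamped to the range of two to six."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or MIN_WORKERS)
    return max(MIN_WORKERS, min(MAX_WORKERS, cores))


def download_verified(fetch: Callable[[], bytes], expected_sha256: str) -> bytes:
    """Call fetch() until the payload matches the checksum, or give up."""
    for _ in range(MAX_ATTEMPTS):
        data = fetch()
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    raise RuntimeError(f"checksum mismatch after {MAX_ATTEMPTS} attempts")
```

In this sketch, a machine with one core still gets two workers, and a machine with 32 cores is capped at six.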

Upload Process

  • Periodic scanning for new/modified files:
    • Scans synced folders for changes
    • Compares file modified timestamps against last sync time
    • Files with newer timestamps are marked as staged
  • Automatic upload processing:
    • Staged files are processed at one-minute intervals
    • Checksum comparison prevents duplicate uploads
    • If checksum matches existing TDP file, upload is skipped
  • Metadata handling:
    • Computes the intersection of metadata, tags, and labels (MTL) from source files
    • Attaches the common MTL to uploaded files
    • Adds ts-analysis:true label to uploaded files
    • Sets sourceType to data-sync-utility for new files
    • Preserves original sourceType for file version updates
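
The staging and metadata steps above can be sketched as follows. This is a simplified illustration, not the utility's code: stage_modified and common_metadata are hypothetical names, and the sketch assumes that "intersection" means key/value pairs that are identical across every source file. Tags and labels could be intersected the same way as plain sets.

```python
from typing import Dict, Iterable, List


def stage_modified(mtimes: Dict[str, float], last_sync: float) -> List[str]:
    """Return the paths whose modified timestamp is newer than the last sync."""
    return sorted(path for path, mtime in mtimes.items() if mtime > last_sync)


def common_metadata(sources: Iterable[Dict[str, str]]) -> Dict[str, str]:
    """Keep only the key/value pairs shared by every source file."""
    sources = list(sources)
    if not sources:
        return {}
    shared = set(sources[0].items())
    for metadata in sources[1:]:
        shared &= set(metadata.items())
    return dict(shared)
```

With two source files that both carry assay=elisa but differ in their plate value, only assay=elisa would be attached to the uploaded result.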

Path Management

Long Path Handling

⚠️

IMPORTANT

To ensure compatibility with analysis software that can't handle long file paths (more than 260 characters), the Data Sync Utility implements path shortening for local storage.

The utility uses "trivial folder" detection to shorten paths:

  • Identifies folders that only contain a single subfolder (no files)
  • Maintains root folder and immediate parent folder of synced files
  • Condenses intermediate trivial folders into a 16-character hash
  • Hash is surrounded by ~ characters

Example

TetraScience/Downloads/Nested/DSU/DSU-1/example.xlsx

is shortened to:

TetraScience/~xxxx~/DSU-1/example.xlsx
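
The shortening step can be sketched as follows. This is a minimal illustration under assumptions: the function name shorten is hypothetical, the hash is taken as the first 16 hexadecimal characters of a SHA-256 digest (the utility's actual hash function isn't documented here), and every folder between the root and the file's parent is assumed to have already been identified as trivial.

```python
import hashlib
from pathlib import PurePosixPath


def shorten(path: str) -> str:
    """Condense the folders between the root and the file's parent into ~hash~."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4:
        # root/parent/file or shorter: there are no intermediate folders.
        return path
    root, middle, parent, name = parts[0], parts[1:-2], parts[-2], parts[-1]
    # Assumption: a 16-character hash derived from the condensed folder names.
    digest = hashlib.sha256("/".join(middle).encode("utf-8")).hexdigest()[:16]
    return "/".join([root, f"~{digest}~", parent, name])
```

Applied to the example path, this keeps TetraScience, DSU-1, and example.xlsx, and replaces Downloads/Nested/DSU with a single 18-character segment (16 hash characters plus the two ~ characters).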

Security and Configuration

Security Features

  • User authentication through TDP login
  • Data encryption:
    • HTTPS for data transfer
    • AES-256-CBC for local authentication storage
  • Access controls enforced through TetraScience API
  • Role-based access and user permissions
  • Compliance with industry regulations (GDPR, HIPAA)

Configuration Options

  • Proxy settings: follows system-level proxy configuration
  • Synchronization schedules: configurable (minimum 5 minutes)
  • Local storage management with warnings for disk space

Limitations

Windows OneDrive Integration

When syncing files to OneDrive on Windows, file size counts may be unreliable. This is caused by OneDrive's file virtualization and placeholder system, which can affect how file sizes are reported to the Data Sync Utility. As a workaround, you can do the following:

  • Use a local folder outside of OneDrive for synchronization
  • Ensure files are fully downloaded from OneDrive before synchronization
  • Use the "Always keep on this device" option in OneDrive for synced folders

Relative Time Filtering

The Data Sync Utility supports relative time filtering, which isn't currently supported in the TDP web client. This means that any relative time filters applied in the Data Sync Utility will not be reflected in the standard search page in the TDP.

Organization Switching

When you switch between organizations, your saved searches are cleared. To avoid this, set a default organization in your TDP account settings. You are then automatically placed in the correct organization when you log back in to the Data Sync Utility.

Role Changes

After a user's role changes, the user must log out of the Data Sync Utility and log back in for the changes to take effect. Logging in again refreshes the user's permissions and access rights.

Frequently Asked Questions

How does the utility handle synchronization of large scientific datasets?

  • Downloads copies of data from Tetra Data Platform
  • Uploads new/changed files to Tetra Data Platform
  • Prevents downloads if there is insufficient disk space
  • Shows data size before initial download
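
A disk space check like the one described above could look like the following sketch. The function names has_room and free_space and the optional reserve_bytes margin are hypothetical. Only shutil.disk_usage is a real standard-library call, and the utility's actual check may differ.

```python
import shutil


def free_space(folder: str) -> int:
    """Return the free bytes on the volume that holds the given folder."""
    return shutil.disk_usage(folder).free


def has_room(download_bytes: int, free_bytes: int, reserve_bytes: int = 0) -> bool:
    """Return True if the download plus an optional reserve fits in free space."""
    return download_bytes + reserve_bytes <= free_bytes
```

A caller would compare the total size reported for the saved search against free_space() for the sync folder, and skip the download when has_room() returns False.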

What data formats are supported?

  • All file formats supported by the local OS
    • Temporary files from common desktop applications (for example, Excel) are excluded from upload

Are there any size limitations for data files?

  • Download: No specific size restrictions (limited by available disk space and network speed)
  • Upload: 200 MB per file limit

How are updates and patches managed?

  • Currently distributed via new installers
  • Future releases will support auto-deployment through IT management systems

What is the procedure for reporting issues or requesting new features?

  • During prototype and EAP phases: Report through regular meetings, email, or Zendesk
  • After the general availability release: Report through Zendesk