How to Search Files in the Data Lake

📘

Tetra Data Platform (TDP) Versions

  • For TDP versions >= 3.2, please continue with this page.
  • For TDP versions < 3.2, please review this page.

This page describes how to search files in the Data Lake using the Tetra Data Platform (TDP) web-based user interface. For details about using Search in the Tetra Web API, click here.

You can easily apply filters to create complex searches and fine-tune queries. The Search feature on TDP behaves similarly to a common website search. Each field you enter is analyzed as both keyword (default) and text (field.text). The default behavior for Search sorts results by relevance or score.

Use the Search feature to:

  • View and sort search results by name, source type, or last modified date
  • Browse files in specific folders and subfolders
  • Filter search results by entering text in the Search box
  • Save a query or grouping of results as a collection using the List view
  • Save file path locations as shortcuts using the Browse view
  • View files from specific file categories, sources, or pipelines
  • Upload, download, preview, and delete files
  • Open the file page to view its details
  • View JSON files
  • Add or edit attributes (metadata, tags, or labels)

Access the Search Feature

To access the Search feature:

  1. In the Tetra Data Platform, click the Hamburger icon at the top left corner of the page to expand the TDP menu options (or hover over the list of icons to display the menu options):

Hamburger icon

  2. Select Search Files from the list of menu options that appears on the left side of the page.

Search Files option

The Search page displays and enables you to:

  • (In List view) Create, save, and manage search queries (grouping of results) as a collection to display at the top of the Search page
  • (In Browse view) Create, save, and manage file path locations as a shortcut to display at the top of the My Home page
  • Use quick filters where you can:
    • List files by category (RAW, PROCESSED, or IDS), source, or pipeline.
    • Browse files by your organization's folders in the Tetra Lake.
  • Easily search files by entering any text in a Search box and view which filters have been applied.
  • Conduct advanced searching and file uploading using additional filters and features.
  • Review the file search results in a display area sorted by relevance (by default).

Search Panel

This table provides a list of the Search page items and their descriptions:

Search Item | Description
All Files button | Click to display all of the available files. All files are displayed by default.
Save button | From List view, click to create and save a search query as a collection. From Browse view, click to save a file path location as a shortcut to reuse. For details, see How to Save Collections and Shortcuts.
List button | Click to display the files using a list format.
Browse button | Click to browse files within your organization's folders and subfolders.
Search box with Search button | Enter any term or field that you want to search, and then click Search. The Search feature filters and searches on terms similarly to a popular website's Search. To search and match for an exact phrase, enclose the text with "double quotes". These are possible search examples:
    - MyOrgTestFile traceId:bad94687-5cf1-4a55-9454 category:IDS
    - labels.value:name NOT labels.value:nameone
    - (_exists:metdata.name:country) empowr
    - fileId: abcdef-1589*
    For basic examples, see Basic Search Examples. For more examples and their results, see Search Query Examples.
Highlight | Highlights matches of terms in yellow. Note that turning this on might slow down your query.
Options | Select to implement advanced search methods. You can search using basic filters, search on attributes, Data (IDS) filters, search IDS files by schema field, and search by RAW.
Upload File | Click to upload a file.
File Category | Indicates the type of files to show: RAW, PROCESSED, or IDS.
Source Type | When you click List, the available file sources display based on File Category (RAW, PROCESSED, or IDS), Source Type, or Pipeline. You can expand or collapse these source types to view the files. When you click an item in the Source Type list, that selection is added to the Search string as an AND item. To remove the item from the Search string, click Clear.
My Home | Shows the files in your folder. This is available when you click Browse.
Tetra Lake | When you click Browse, you can search for files based on your organization's folders and subfolders.
Name | Displays the list of file names. Additionally, you can:
    - Select the box next to Name to delete the selected files in bulk.
    - Click a file in the list to toggle the display of its summary details.
Source Type | Indicates the source of the file, for example: Log File Watcher.
Last Modified | Indicates the date and time when the file was last modified. You can click Last Modified to sort the files chronologically from earliest to latest, or vice versa.
Show Match Details | Click to toggle details about where any matched terms display in the file.
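
As a rough illustration of how the search examples above are put together, the following Python sketch assembles a search string from free-text terms, exact phrases, and field:value filters. The build_query helper and its parameters are hypothetical and are not part of TDP. The sketch only produces text that you could paste into the Search box, and it assumes that space-separated parts are combined with AND, which this page describes as the default Boolean operator.

```python
def build_query(terms=(), filters=None, exact_phrases=()):
    """Compose a Search box string (hypothetical helper, not a TDP API)."""
    parts = list(terms)
    # Exact phrases are wrapped in double quotes so they match as a whole.
    parts.extend(f'"{phrase}"' for phrase in exact_phrases)
    # Field filters use the field:value form shown in the examples above.
    parts.extend(f"{field}:{value}" for field, value in (filters or {}).items())
    # Parts are separated by spaces; AND is the default operator between them.
    return " ".join(parts)


query = build_query(
    terms=["MyOrgTestFile"],
    filters={"traceId": "bad94687-5cf1-4a55-9454", "category": "IDS"},
)
print(query)  # MyOrgTestFile traceId:bad94687-5cf1-4a55-9454 category:IDS
```

The printed string reproduces the first search example in the table. Passing exact_phrases=["humidity sensor"] would add the quoted phrase "humidity sensor" to the string.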

Perform a Basic File Search

To perform a basic file search:

  1. From the Search page, click List. The file name, source type, and date and time that each file was last modified are displayed.
  2. Enter terms and fields that you want to search in the Search box. To search and match for an exact phrase, enclose the text with "double quotes". The Search feature filters and searches on terms similar to a common website search. You can enter both full-text queries (search all of the text associated with a file) and filtered queries. Filters are case-insensitive and AND is the default Boolean operator.

📘

Fuzzy Search

A fuzzy search uses fuzzy matching to return results ranked by likely relevance, even when the search terms and spellings do not match exactly. If you add a ~ after a keyword, the search still returns relevant results when the keyword contains a typo of up to two characters.

However, do not add a ~ to the end of a filter keyword or value, or to a keyword that contains any of these wildcard characters (*, ?, or !), because the query will fail.

  3. Click Search. Files that match the search criteria you entered in the Search box display as results in the file list. The default behavior for Search is to sort results by relevance (or score) instead of by Last Modified. However, you can organize the result set chronologically by sorting on Last Modified. Click here for basic search examples.
  4. To perform additional filtering, you can select a source or pipeline from the list. To avoid unnecessary scrolling, you can expand and collapse the Source Types and Pipelines from the side panel of the Search page. Additionally, you can search within the filter or facet values. For example, you can click Source Type and start typing “humidity sensor”. You can also create and save a search query as a collection. For more details, see How to Save Collections and Shortcuts.
  5. To further filter your search results, you can select filters from the Options tabs.
  6. To clear the existing search criteria you entered in the Search box, click Clear.
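
The Fuzzy Search rule in the note above can be sketched as a small Python helper that appends ~ only where this page says it is safe. The with_fuzzy function is hypothetical and is not part of TDP. It assumes that any term containing a colon is a field:value filter, which is a simplification made for illustration.

```python
WILDCARDS = set("*?!")  # wildcard characters listed in the Fuzzy Search note


def with_fuzzy(term: str) -> str:
    """Append ~ to a plain keyword; leave filters and wildcard terms unchanged."""
    is_filter = ":" in term  # assumption: field:value filters contain a colon
    has_wildcard = any(ch in WILDCARDS for ch in term)
    if is_filter or has_wildcard or term.endswith("~"):
        return term
    return term + "~"


terms = ["empowr", "category:IDS", "abcdef-1589*"]
print(" ".join(with_fuzzy(t) for t in terms))
# empowr~ category:IDS abcdef-1589*
```

Only the plain keyword receives the ~ suffix. The filter and the wildcard term are returned unchanged so that the resulting query does not fail.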

View a File Summary

To view a summary of the file details, click the file from the list of files.


File Summary

This table describes the list of File Summary items:

Field | Description
Related Files | Lists:
    - Input Files: Files from which the current file was derived. For example, for an IDS file, the RAW file would be the input file.
    - Output Files: Files that this current file produced. For example, the IDS file typically produces a JSON file.
Date Created | Lists the date and time when the file was created.
Integration Type | Lists the integration (for example, datapipeline) that was used to ingest the file into the Data Lake.
Protocol | Lists the steps and configurations used to process data for the pipeline, if any. The protocol consists of two files: protocol.json and script.js. The protocol is the "heart" of the pipeline.
Task Script | Lists the task script related to this file, if any. Task scripts contain the code for the business logic needed to process the data.
Schema | Lists the schema (structure of the data) related to this file, if any. If a schema exists, you can click View Schema to open the Data Schema page.
Namespace | Lists the namespace for the schema.
Version | Lists the version of the schema (for example, v3.0.0).
Metadata, Labels, Tags | Displays relevant metadata, labels, or tags, if any.

At the upper right corner of the File Summary panel, these icons and the More option display:


File Summary icons and More option

View Additional File Details

To view additional file details, click Open File Page for the selected file. The File Details page displays:


File Details page

This table describes the sections on the File Details page:

Section | Description
File Versions | Lists the total number of file versions in chronological order, with the most recent file displayed at the top of the list. You can:
    - Click a version to display its details in the File Details section.
    - Hover over the file to display its full name, the date and time when it was uploaded, and its full ID.
    - Copy the file ID to the clipboard.
File Actions | You can:
    - Click Download to download the file to your computer.
    - Click View File Info Details to open a preview of the JSON file details.
    - Click Add New Version to upload a new version of the file.
    - Click Add Attributes to add or edit attributes (such as metadata, tags, and labels) for the file.
    - Click Remove to remove the file and its subsequent versions. This action is only available for the most recent file version.
File Details | Displays these file details:
    - Version number
    - File Name, File ID, Date/Time when the file was uploaded, and File Path. You can copy the file name, file ID, and file path to the clipboard.
    - Source Type and Source Name
    - Size of file
Attributes | Displays any associated file attributes, such as metadata, tags, and labels.
Workflow History | If the pipelineID was provided in the URL, you can click Workflow History to expand or collapse any related workflow data. When an input file triggers a pipeline, a workflow is created to process the input file. A workflow typically contains multiple steps that execute one or more Task scripts.
Related Files | Displays:
    - Input Files: Files from which the current file was derived. For example, for an IDS file, the RAW file would be the input file.
    - Output Files: Files that this current file produced. For example, the IDS file typically produces a JSON file.
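
To make the File Versions behavior described above concrete, here is a minimal Python sketch of two rules from the table: versions are listed with the most recent first, and Remove is available only for the most recent version. The FileVersion class, its fields, and both functions are hypothetical and exist only for illustration. They do not reflect how TDP stores or exposes file versions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileVersion:
    """Hypothetical record for one version of a file."""
    file_id: str
    uploaded_at: datetime


def newest_first(versions):
    """Order versions so the most recent upload comes first."""
    return sorted(versions, key=lambda v: v.uploaded_at, reverse=True)


def can_remove(version, versions) -> bool:
    """Remove is offered only for the most recent version."""
    ordered = newest_first(versions)
    return bool(ordered) and ordered[0] == version


history = [
    FileVersion("a", datetime(2023, 1, 1)),
    FileVersion("b", datetime(2023, 3, 1)),
    FileVersion("c", datetime(2023, 2, 1)),
]
print([v.file_id for v in newest_first(history)])  # ['b', 'c', 'a']
print(can_remove(history[1], history))  # True
```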

Download a File

To download a file to your computer, you can click:

  • The Download File icon from the File Summary page.
  • The Download File Action from the File Details page.

View JSON File Details

To preview the JSON file details, you can click:

  • The View JSON icon from the File Summary page.
  • The View File Info Details File Action from the File Details page.

From the JSON preview window, you can view details such as: total number of items, source type, when the file was created, the location of the file (bucket), source, category, and so on.

Add and Edit Attributes

To add or edit attributes (such as metadata, tags, and labels), you can click:

  • More from the File Summary page, and then select Add/Edit Attributes.
  • The Add Attributes File Action from the File Details page.

Follow the instructions in this topic. When you have finished adding or editing attributes, click Apply.

Upload a New File or New Version of the File

To upload a new file or a new version of a file, you can click:

  • Upload File at the top right of the Search page.
  • More from the File Summary page, and then select Upload New Version.
  • The Add New Version File Action from the File Details page.
  1. For a brand new file, you must select a source type for the uploaded file. Each newly uploaded file needs to be attributed to a source type.
  2. You can add a new label, if desired. Labels are applied to an existing file without creating new versions. For details, see this topic. To add metadata or tags, click Advanced Fields. These fields create new file versions and trigger new workflows when modified. Be aware that the contents of these files may be versioned across edits.
  3. Click the file upload box to select a file to upload, or drag and drop the file into the box.
  4. When complete, click Upload.

📘

File Size Limitation

The maximum file size you can upload through the TDP UI is 200 MB. To upload larger files, use the TDP API or a Tetra Agent or Connector.
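
Before you start an upload, a quick local check can tell you whether a file fits within the UI limit. The Python sketch below is illustrative only. The function names are hypothetical, and it assumes that 1 MB means 1024 * 1024 bytes, which this page does not specify.

```python
import os

# Assumption: 1 MB = 1024 * 1024 bytes. The page states only "200 MB".
MAX_UI_UPLOAD_BYTES = 200 * 1024 * 1024


def fits_ui_upload(size_bytes: int) -> bool:
    """Return True if a file of this size is within the TDP UI upload limit."""
    return size_bytes <= MAX_UI_UPLOAD_BYTES


def upload_route(path: str) -> str:
    """Suggest how to upload the local file at the given path, based on its size."""
    if fits_ui_upload(os.path.getsize(path)):
        return "TDP UI"
    return "TDP API or a Tetra Agent or Connector"


print(fits_ui_upload(150 * 1024 * 1024))  # True
print(fits_ui_upload(250 * 1024 * 1024))  # False
```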

Preview a File

To preview a file stored in TDP, click the Preview icon on the Search Files page or on the File Details page.

Preview is available for these valid files and file types:

  • Images with file type: .png, .jpeg, .gif, or .bmp (including 360-degree images)
  • .pdf
  • .csv
  • .xlsx
  • .docx
  • Video with file type: .mp4 or .webm
  • Audio with file type: .mp3

If any valid file is greater than 50 MB, then a warning message displays indicating that the file size is too large to display in a preview.
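
The preview rules above can be summarized in a short Python sketch that classifies a file by extension and size. The preview_status function and its return values are hypothetical. The sketch assumes that 1 MB means 1024 * 1024 bytes and that extensions are matched without regard to case, neither of which is stated on this page.

```python
from pathlib import PurePosixPath

# File types listed on this page as valid for preview.
PREVIEW_EXTENSIONS = {
    ".png", ".jpeg", ".gif", ".bmp",  # images
    ".pdf", ".csv", ".xlsx", ".docx",  # documents
    ".mp4", ".webm",  # video
    ".mp3",  # audio
}
# Assumption: 1 MB = 1024 * 1024 bytes.
PREVIEW_LIMIT_BYTES = 50 * 1024 * 1024


def preview_status(file_name: str, size_bytes: int) -> str:
    """Classify a file as 'ok', 'too_large', or 'unsupported' for preview."""
    extension = PurePosixPath(file_name).suffix.lower()
    if extension not in PREVIEW_EXTENSIONS:
        return "unsupported"
    if size_bytes > PREVIEW_LIMIT_BYTES:
        return "too_large"  # the UI shows a warning instead of a preview
    return "ok"


print(preview_status("plate_reader.csv", 2 * 1024 * 1024))  # ok
print(preview_status("run_video.mp4", 60 * 1024 * 1024))  # too_large
print(preview_status("instrument.raw", 1024))  # unsupported
```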

Delete a File

To delete a file, you can:

  • Select one or more files from the list of files on the Search page, click Bulk Actions, and then click Delete Selected.
  • Click More from the File Summary page, and then select Delete.
  • Click the Remove File Action from the File Details page. This action is only available for the most recent file version.

Click OK to confirm deletion of the file(s) and their subsequent versions, or click Cancel to retain the file(s).

Browse Files in Folders

To browse files in folders:

  1. From the Search page, click Browse. The Tetra Lake folders display as the source instead of the Source Type and Pipeline.

Tetra Lake folders

  2. Select the folder to browse the Tetra Lake's folder hierarchy based on your organization. You can continue to select subfolders until the files you are searching for display in the results section of the page.

Your current file path location displays at the top of the page. Additionally, you can save file path searches and add them as shortcuts to the top of the My Home page. To quickly return to your home directory (and your shortcuts), click My Home at the top of the folder list. For more details, see How to Save Collections and Shortcuts. Any files that you removed display under the Removed Sources section at the end of the folder list, indicated with the red trash icon.

  3. To further filter your search results, you can select filters from the Options tabs, or manually create a search query.