Search Files (GET)

Description

Returns information on files that meet the search query. The search query can contain one or more parameters. Keep the following in mind.

The search query only retrieves the latest version of a file.
The search query applies Boolean AND logic. Every parameter you add narrows the results further, because a file must match all parameters to be returned. This applies to custom tags and custom metadata as well, including each additional tag or metadata key.
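
The AND behavior can be illustrated by building a query string from several parameters. The following is a minimal Python sketch, not an official client. The helper name build_search_url is hypothetical; the endpoint and parameter names are copied from the curl examples on this page.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl examples on this page.
BASE_URL = "https://api.tetrascience.com/v1/fileinfo/search"


def build_search_url(**params):
    """Return the search URL for the given query parameters.

    Hypothetical helper: parameters set to None are skipped. Each
    remaining parameter is one more AND condition on the search.
    """
    query = {key: value for key, value in params.items() if value is not None}
    if not query:
        return BASE_URL
    # safe="/:" keeps timestamps and IDS names readable in the URL.
    return BASE_URL + "?" + urlencode(query, safe="/:")


# Matches only files that are in the IDS category AND tagged "pending".
url = build_search_url(category="IDS", tags="pending")
print(url)
```

Adding a further keyword argument, such as fromDate, appends another condition that every returned file must satisfy.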

Authentication

Pass authentication information in the header. You will need the JSON Web Token (JWT) and the name of the organization to authenticate. For more information, see https://developers.tetrascience.com/reference/authentication.
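
As a rough illustration of passing both values in the header, the sketch below prepares a request with Python's standard library. The header names ts-auth-token and x-org-slug come from the curl examples on this page; the function name build_request is hypothetical, and the request is only constructed here, not sent.

```python
from urllib.request import Request


def build_request(url, jwt_token, org_slug):
    """Prepare a GET request carrying the two authentication headers.

    Hypothetical helper. The header names are taken from the curl
    examples on this page.
    """
    headers = {
        "ts-auth-token": jwt_token,  # the JSON Web Token (JWT)
        "x-org-slug": org_slug,      # the name of the organization
    }
    return Request(url, headers=headers, method="GET")


request = build_request(
    "https://api.tetrascience.com/v1/fileinfo/search?category=IDS",
    "{JWT_TOKEN}",   # placeholder, replace with a real token
    "tetrascience",
)
# To send it: urllib.request.urlopen(request), then decode the JSON body.
```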

Responses

Responses are in JSON format.

200 OK

The request was successful. The response can include the fields in the following table.

Parameter	Description	Type
from	Results were retrieved starting with the array index specified. By default, files are retrieved from the start of the array (index number 0). Pairing this with the size parameter allows you to see a subset of results.	number
size	Indicates the number of results to return. By default, information for all files matching the query is returned. Pairing this with the from parameter allows you to see a subset of results.	number
sort	Sorts the results by file creation date. Values are desc (descending), asc (ascending).	string
hasNext	Indicates whether other files matched the search query but were not included in the search results. This usually happens when you’ve used from and size to limit the number of results returned. Values are true, false.	boolean
hits	Contains the search results.	mixed (string, uuid)
orgSlug	Name of your organization.	string
fileId	UUID for the file in the data lake.	uuid
traceId	UUID generated when the RAW file is first uploaded to the data lake. All files derived from the RAW file, such as an IDS file, have the same traceId.	uuid
rawFileId	UUID for the RAW file that this file was generated from.	uuid
category	File category. Values are RAW, IDS, and PROCESSED.	string
ids	Name, type, and version of the schema used to create the IDS file. Format is {namespace}/{idsType}:{idsVersion}	string
filePath	Path to the IDS JSON file.	string
createdAt	Date and time that the file was created. Note that the date/timestamp is in Zulu (UTC) time.	date
integration	TetraScience integrates with many different products, such as Box, Dotmatics, and DataHub. The response contains the integration type, the file type that is produced, the source, the source type, the integration ID, and the source ID.	mixed (uuid, string)
integration.id	ID for the integration.	uuid
integration.type	Integration type. Values are EGNYTE, BOX, DOTMATICS, HRB CELLARIO, DATAHUB, PIPELINE, RAW, and API.	string
integration.name	Name of the integration.	string
integration.workflowId (Added version TDP 3.1.0)	Workflow ID for the pipeline.	string
integration.masterScript (Added version TDP 3.1.0)	Name of the pipeline protocol used.	string
integration.taskScript (Added version TDP 3.1.0)	Name of the task script for the pipeline.	string
integration.taskSlug (Added version TDP 3.1.0)	Name of the slug for the task.	string
integration.taskExecutionId (Added version TDP 3.1.0)	Execution ID.	string
source	Indicates the source of the file. This is typically the source ID, the name of an instrument that generated the data, and a brief description of the file data.	mixed (string, uuid)
source.id	ID for the source.	uuid
source.type	Type of source, which is typically the name of the instrument or other lab equipment used to generate the data or report.	string
source.name	Name of the source.	string
file	Provides file location information including the bucket, path, checksum, size, type, and version.	mixed (string, number)
file.bucket	Name of the data lake S3 bucket the file is assigned to.	string
file.path	Path to the data lake S3 bucket where the file is stored, starting with the organization’s root directory.	string
file.checksum	Checksum (value used to verify file integrity) assigned by the data lake S3 bucket.	string
file.size	Size of the file.	number
file.type	Type of file.	string
file.version	File version’s ID.	uuid
metadata	Lists metadata, if present. Metadata could include items like the instrument or other lab equipment used to create the file and/or custom metadata the user added to the file (if any).	array of strings
tags	Lists tags that the user added to the file.	array
deleted	Indicates whether the file was deleted. Values are true, false.	boolean
outdated	Indicates whether there is at least one newer version of the file available. If a newer version is available, outdated is set to true. Values are true, false.	boolean
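
To show how these fields might be read once the JSON body is decoded, here is a small Python sketch. It assumes that hits is a list of file records and that fileId, category, and createdAt sit at the top level of each record; that nesting is an assumption, not something this page confirms. The sample response is hand-written with placeholder values and is not real API output, and the function name summarize_hits is hypothetical.

```python
def summarize_hits(response):
    """Return (rows, has_next) for a decoded search response.

    Assumes "hits" is a list of file records with top-level fileId,
    category, and createdAt keys (an assumption about the layout).
    """
    rows = [
        {
            "fileId": hit.get("fileId"),
            "category": hit.get("category"),
            "createdAt": hit.get("createdAt"),
        }
        for hit in response.get("hits", [])
    ]
    return rows, bool(response.get("hasNext", False))


# Hand-written placeholder data, NOT real API output.
sample_response = {
    "hasNext": True,
    "hits": [
        {"fileId": "<file-uuid-1>", "category": "RAW",
         "createdAt": "2020-11-01T00:00:00.000Z"},
        {"fileId": "<file-uuid-2>", "category": "IDS",
         "createdAt": "2020-11-02T00:00:00.000Z"},
    ],
}

rows, has_next = summarize_hits(sample_response)
for row in rows:
    print(row["fileId"], row["category"], row["createdAt"])
if has_next:
    print("More matching files exist beyond this page of results.")
```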

401 Unauthorized

There is a problem with authorization. See https://developers.tetrascience.com/reference/authentication for more details.

500 Internal Server Error

There is a problem with the website’s server or there is a network issue.

Additional Examples

Search for Files by Category

Files in the data lake are categorized as RAW, IDS, or PROCESSED. Adding the category parameter to your search query can speed up your search and provide more targeted results. Note that when RAW files are processed, the source, custom tags, and metadata remain the same for the resulting IDS file. The following fetches IDS files.

curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'

Search for Files Created after a Specific Date

Use the fromDate parameter to search for all files created on or after a certain date. The following fetches all RAW files created on or after November 1, 2020.

curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=RAW&fromDate=2020-11-01T00:00:00.000Z' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'

Search for Files by Date Range to Narrow Search Results

A search can return many results. To narrow the results and speed up searching, use the fromDate and toDate parameters to fetch information only about files created within a specific date range. The following example retrieves information about IDS files that were created between November 1 and November 15, 2020.

curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS&fromDate=2020-11-01T00:00:00.000Z&toDate=2020-11-15T00:00:00.000Z' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
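
The fromDate and toDate values in these examples are UTC timestamps with millisecond precision and a trailing Z. The following Python sketch produces that shape from a timezone-aware datetime. The function name to_api_timestamp is hypothetical, and whether the API accepts other timestamp formats is not stated on this page, so the sketch simply mirrors the examples.

```python
from datetime import datetime, timezone


def to_api_timestamp(moment):
    """Format a timezone-aware datetime like 2020-11-01T00:00:00.000Z.

    Hypothetical helper that mirrors the timestamp shape used in the
    curl examples. Pass an aware datetime; it is converted to UTC.
    """
    utc = moment.astimezone(timezone.utc)
    milliseconds = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{milliseconds:03d}Z"


from_date = to_api_timestamp(datetime(2020, 11, 1, tzinfo=timezone.utc))
to_date = to_api_timestamp(datetime(2020, 11, 15, tzinfo=timezone.utc))
print(from_date, to_date)
```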

Search for Pending Unprocessed Files (Pre-Tagged)

To search for files that will be processed by a third-party system, use the tags parameter to search for files tagged as pending. The following example searches for IDS files created on or after November 1, 2020 that have the pending tag.

curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS&tags=pending&fromDate=2020-11-01T00:00:00.000Z' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'

Search for Files Without a Specific Tag

To search for files without a specific tag, use the excludeTags parameter. The following command finds files created on or after November 1, 2020 that do not have the complete tag, that is, files that have not completed processing.

curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?excludeTags=complete&fromDate=2020-11-01T00:00:00.000Z' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'

Paginating Results

One way to chunk a large result set is to paginate with the from and size parameters. from is the zero-based index at which the returned results begin. Note that the result set is ordered by file creation date. size indicates how many results to fetch per request. The following command retrieves information about 15 IDS files, starting at index 10 (the 11th result) of the complete set of matching IDS files.

curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS&from=10&size=15' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
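
Paging through a full result set amounts to advancing from until hasNext is false. The Python sketch below shows that loop in isolation. fetch_page is a hypothetical caller-supplied function that performs the HTTP request for a given from and size and returns the decoded JSON; here it is replaced by a stub over 40 fake items so the loop can run without network access.

```python
def iter_hits(fetch_page, size=15):
    """Yield every hit by advancing `from` until hasNext is false.

    fetch_page(from_index, size) is a hypothetical callable that
    returns the decoded JSON response for one page.
    """
    from_index = 0
    while True:
        page = fetch_page(from_index, size)
        hits = page.get("hits", [])
        yield from hits
        # Stop when the server reports no more results (or sends none).
        if not page.get("hasNext") or not hits:
            break
        from_index += len(hits)


def fake_fetch_page(from_index, size):
    """Stand-in for the real request: serves slices of 40 fake items."""
    data = list(range(40))
    chunk = data[from_index:from_index + size]
    return {"hits": chunk, "hasNext": from_index + size < len(data)}


all_hits = list(iter_hits(fake_fetch_page, size=15))
print(len(all_hits))  # 40, fetched as pages of 15, 15, and 10
```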

Search by IDS Type and Version

To find files for a specific IDS type and version, use the ids parameter. The format is {namespace}/{slug}:{version}. The following example retrieves information for files with the IDS type and version common/test-empower:v3.0.0.

curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?ids=common/test-empower:v3.0.0' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'

Note: To find valid values for the ids parameter, use the data lake schemas API: https://developers.tetrascience.com/reference/list-schemas.
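
The {namespace}/{slug}:{version} format can be assembled and taken apart with plain string handling, as in the Python sketch below. Both function names are hypothetical, and the sketch checks only the shape of the string, not whether the schema exists.

```python
def format_ids(namespace, slug, version):
    """Build an ids value in the {namespace}/{slug}:{version} format."""
    return f"{namespace}/{slug}:{version}"


def parse_ids(value):
    """Split an ids value into (namespace, slug, version).

    Raises ValueError if any of the three parts is missing.
    """
    namespace, _, rest = value.partition("/")
    slug, _, version = rest.rpartition(":")
    if not (namespace and slug and version):
        raise ValueError(f"expected namespace/slug:version, got {value!r}")
    return namespace, slug, version


ids_value = format_ids("common", "test-empower", "v3.0.0")
print(ids_value)
print(parse_ids(ids_value))
```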

Find All Files Generated by a Specific Pipeline Workflow

(Added version TDP 3.1.0) To find files generated by a specific workflow, use the workflowId parameter. The following example retrieves information for IDS files with the workflowId eeac26b1-5034-4a1d-a97f-0c8b8cd2ea1e that were generated from August 21, 2021 to the present date.

curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS&fromDate=2021-08-21T00:00:00.000Z&workflowId=eeac26b1-5034-4a1d-a97f-0c8b8cd2ea1e' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'