Description
Returns information on files that meet the search query. The search query can contain one or more parameters. Keep the following in mind.
- The search query only retrieves the latest version of a file.
- The search query applies Boolean AND logic. Each parameter you add is combined with the others using AND, so a file must match every parameter to be returned. This also applies to custom tags and custom metadata, including each additional tag or metadata key.
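The AND behavior described above can be illustrated with a minimal Python sketch that assembles a query string from several parameters. The endpoint URL and the parameter names (category, tags, fromDate) are taken from the curl examples in this document; the helper name build_search_url is hypothetical and is not part of any TetraScience library.

```python
from urllib.parse import urlencode

# Base URL taken from the curl examples in this document.
SEARCH_URL = "https://api.tetrascience.com/v1/fileinfo/search"


def build_search_url(**params):
    """Return the search URL with every non-None parameter in the query string.

    Hypothetical helper: each key/value pair added here narrows the search,
    because the API combines all parameters with AND.
    """
    query = {key: value for key, value in params.items() if value is not None}
    return f"{SEARCH_URL}?{urlencode(query)}" if query else SEARCH_URL


# A file must be an IDS file AND tagged "pending" AND created on or after the date.
url = build_search_url(
    category="IDS",
    tags="pending",
    fromDate="2020-11-01T00:00:00.000Z",
)
print(url)
```

Note that urlencode percent-encodes the colons in the timestamp (as %3A), which is standard URL encoding for query string values.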
Authentication
Pass authentication information in the header. You will need the JSON Web Token (JWT) and the name of the organization to authenticate. For more information, see https://developers.tetrascience.com/reference/authentication.
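As a rough sketch of how the two authentication values might be attached in Python, the following builds a request object with the standard library. The header names ts-auth-token and x-org-slug are the ones used in the curl examples in this document; the function name build_request is hypothetical, and the token value is a placeholder.

```python
import urllib.request


def build_request(url, jwt_token, org_slug):
    """Build a GET request carrying the JWT and organization name as headers.

    Hypothetical helper; header names are copied from the curl examples.
    """
    headers = {
        "ts-auth-token": jwt_token,  # the JSON Web Token (JWT)
        "x-org-slug": org_slug,      # the name of the organization
    }
    return urllib.request.Request(url, headers=headers, method="GET")


request = build_request(
    "https://api.tetrascience.com/v1/fileinfo/search?category=IDS",
    jwt_token="{JWT_TOKEN}",  # placeholder, replace with a real token
    org_slug="tetrascience",
)

# Sending the request is left out of this sketch. With a valid token it would be:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

urllib stores header names with normalized capitalization (for example, Ts-auth-token). HTTP header names are case-insensitive, so this does not change the request's meaning.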
Responses
Responses are in JSON format.
200 OK
The request was successful. The response can include the fields in the following table.
Parameter | Description | Type |
---|---|---|
from | Results were retrieved starting with the array index specified. By default, files are retrieved from the start of the array (index number 0). Pairing this with the size parameter allows you to see a subset of results. | number |
size | Indicates the number of results to return. By default, information for all files matching the query is returned. Pairing this with the from parameter allows you to see a subset of results. | number |
sort | Sorts the results by file creation date. Values are desc (descending), asc (ascending). | string |
hasNext | Indicates whether other files that matched the search query were found but not included in the search results. This usually happens when you’ve used from and size to restrict the size of the query. Values are true, false. | boolean |
hits | Contains the search results. | mixed (string, uuid) |
orgSlug | Name of your organization. | string |
fileId | UUID for the file in the data lake. | uuid |
traceId | UUID generated when the RAW file is first uploaded to the data lake. All files derived from the RAW file, such as an IDS file, have the same traceId. | uuid |
rawFileId | UUID for the RAW file that this file was generated from. | uuid |
category | File category. Values are RAW, IDS, and PROCESSED. | string |
ids | Name, type, and version of the schema used to create the IDS file. Format is {namespace}/{idsType}:{idsVersion} | string |
filePath | Path to the IDS JSON file. | string |
createdAt | Date and time that the file was created. Note that the date/timestamp is in Zulu (UTC) time. | date |
integration | TetraScience integrates with many different products, such as Box, Dotmatics, and DataHub. The response contains the integration type, the file type that is produced, the source, the source type, the integration ID, and the source ID. | mixed (uuid, string) |
integration.id | ID for the integration. | uuid |
integration.type | Integration type. Values are EGNYTE, BOX, DOTMATICS, HRB CELLARIO, DATAHUB, PIPELINE, RAW, and API. | string |
integration.name | Name of the integration. | string |
integration.workflowId (Added version TDP 3.1.0) | Workflow ID for the pipeline. | string |
integration.masterScript (Added version TDP 3.1.0) | Name of the pipeline protocol used. | string |
integration.taskScript (Added version TDP 3.1.0) | Name of the task script for the pipeline. | string |
integration.taskSlug (Added version TDP 3.1.0) | Name of the slug for the task. | string |
integration.taskExecutionId (Added version TDP 3.1.0) | Execution ID. | string |
source | Indicates the source of the file. This is typically the source ID, the name of an instrument that generated the data, and a brief description of the file data. | mixed (string, uuid) |
source.id | ID for the source. | uuid |
source.type | Type of source, which is typically the name of the instrument or other lab equipment used to generate the data or report. | string |
source.name | Name of the source. | string |
file | Provides file location information including the bucket, path, checksum, size, type, and version. | mixed (string, number) |
file.bucket | Name of the data lake S3 bucket the file is assigned to. | string |
file.path | Path to the data lake S3 bucket where the file is stored, starting with the organization’s root directory. | string |
file.checksum | Checksum (a value computed from the file contents and used to verify file integrity) assigned by the data lake S3 bucket. | string |
file.size | Size of the file. | number |
file.type | Type of file. | string |
file.version | File version’s ID. | uuid |
metadata | Lists metadata, if present. Metadata could include items like the instrument or other lab equipment used to create the file and/or custom metadata the user added to the file (if any). | array of strings |
tags | Lists tags that the user added to the file. | array |
deleted | Indicates whether the file was deleted. Values are true, false. | boolean |
outdated | Indicates whether there is at least one newer version of the file available. If a newer version is available, outdated is set to true. Values are true, false. | boolean |
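To show how the fields in the table might be read from a parsed response, here is a minimal Python sketch. It assumes that hits is a list of objects, one per file, and that nested fields such as file.path appear as nested JSON objects; the table does not spell out the exact nesting, so treat this shape as an assumption. All values in the sample payload are made-up placeholders, and summarize_hits is a hypothetical helper.

```python
import json

# Placeholder payload; the values are invented and the nesting is an assumption.
SAMPLE_RESPONSE = json.loads("""
{
  "from": 0,
  "size": 1,
  "hasNext": false,
  "hits": [
    {
      "fileId": "00000000-0000-0000-0000-000000000001",
      "category": "IDS",
      "createdAt": "2020-11-02T10:15:00.000Z",
      "file": {"path": "tetrascience/example/path.json", "size": 2048},
      "tags": ["pending"],
      "outdated": false
    }
  ]
}
""")


def summarize_hits(response):
    """Return (fileId, category, file.path) for each hit, tolerating missing keys."""
    rows = []
    for hit in response.get("hits", []):
        file_info = hit.get("file", {})
        rows.append((hit.get("fileId"), hit.get("category"), file_info.get("path")))
    return rows


for file_id, category, path in summarize_hits(SAMPLE_RESPONSE):
    print(file_id, category, path)
```

Using .get() rather than direct indexing keeps the sketch from failing when a field is absent, since the table describes what results can include rather than what they always include.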
401 Unauthorized
There is a problem with authorization. See https://developers.tetrascience.com/reference/authentication for more details.
500 Internal Server Error
There is a problem with the website’s server or there is a network issue.
Additional Examples
Search for Files by Category
Files in the data lake are categorized as RAW, IDS, or PROCESSED. Adding the category parameter to your search query can speed up your search and provide more targeted results. Note that when RAW files are processed, the source, custom tags, and metadata remain the same for the resulting IDS file. The following fetches IDS files.
curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
Search for Files Created After a Specific Date
Use the fromDate parameter to search for all files created on or after a certain date. The following fetches all RAW files created on or after November 1, 2020.
curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=RAW&fromDate=2020-11-01T00:00:00.000Z' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
Search for Files by Date Range to Narrow Search Results
A search can return many results. To better target your results and speed up searching, use the fromDate and toDate parameters to fetch information only about files created within a specific date range. The following example retrieves information about IDS files that were created between November 1 and November 15, 2020.
curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS&fromDate=2020-11-01T00:00:00.000Z&toDate=2020-11-15T00:00:00.000Z' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
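The fromDate and toDate values in these examples use a UTC timestamp with millisecond precision and a trailing Z. The following Python sketch shows one way to produce that format from a datetime object; the helper name to_api_timestamp is hypothetical, and the format is inferred from the example URLs rather than from a formal specification.

```python
from datetime import datetime, timedelta, timezone


def to_api_timestamp(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SS.mmmZ in UTC."""
    if moment.tzinfo is None:
        # A naive datetime is ambiguous, so refuse it instead of guessing a zone.
        raise ValueError("expected a timezone-aware datetime")
    utc = moment.astimezone(timezone.utc)
    milliseconds = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{milliseconds:03d}Z"


from_date = to_api_timestamp(datetime(2020, 11, 1, tzinfo=timezone.utc))
to_date = to_api_timestamp(datetime(2020, 11, 15, tzinfo=timezone.utc))

# A local time is converted to UTC first: 19:00 at UTC-5 is midnight UTC the next day.
local = datetime(2020, 10, 31, 19, 0, tzinfo=timezone(timedelta(hours=-5)))
converted = to_api_timestamp(local)

print(from_date, to_date, converted)
```

Converting to UTC before formatting matters because the createdAt values in the response are in Zulu (UTC) time, so a local timestamp sent unconverted could shift the range by several hours.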
Search for Pending Unprocessed Files (Pre-Tagged)
To search for files that will be processed by a third-party system, use the tags parameter to search for files tagged as pending. The following example searches for IDS files created on or after November 1, 2020 that have the pending tag.
curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS&tags=pending&fromDate=2020-11-01T00:00:00.000Z' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
Search for Files Without a Specific Tag
To search for files without a specific tag, use the excludeTags parameter. The following command finds files created on or after November 1, 2020 that do not have the complete tag, that is, files that have not completed processing.
curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?excludeTags=complete&fromDate=2020-11-01T00:00:00.000Z' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
Paginating Results
One way to chunk a large result set is to paginate with the from and size parameters. The from parameter indicates the zero-based index at which the returned results begin, and size indicates how many results to fetch per request. Note that the result set is ordered by date. The following command retrieves information about 15 IDS files, starting at index 10 (the 11th result) of the complete set of matching files.
curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS&from=10&size=15' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
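The from, size, and hasNext fields can be combined into a loop that walks through every page. The Python sketch below illustrates the idea under stated assumptions: fetch_page is a hypothetical callable that performs the HTTP request for a given from and size and returns the parsed JSON response, and fake_fetch_page is a stand-in that serves placeholder records so the loop can be run without network access.

```python
def iter_all_hits(fetch_page, page_size=15):
    """Yield every hit by requesting consecutive pages until hasNext is false."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        hits = page.get("hits", [])
        yield from hits
        # Stop when the server reports no more results, or when a page comes
        # back empty (a guard against looping forever on unexpected responses).
        if not page.get("hasNext") or not hits:
            break
        start += len(hits)


def fake_fetch_page(start, size, total=40):
    """Stand-in for the real HTTP call: serves `total` numbered placeholder records."""
    stop = min(start + size, total)
    return {
        "from": start,
        "size": size,
        "hits": [{"fileId": f"file-{index}"} for index in range(start, stop)],
        "hasNext": stop < total,
    }


all_hits = list(iter_all_hits(fake_fetch_page))
print(len(all_hits))
```

Advancing start by the number of hits actually received, rather than by page_size, keeps the loop correct if the server returns fewer results than requested.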
Search by IDS Type and Version
To find files for a specific IDS type and version, use the ids parameter. The format is {namespace}/{slug}:{version}. The following example retrieves information for files with the IDS type and version common/test-empower:v3.0.0.
curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?ids=common/test-empower:v3.0.0' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'
Note: To find valid values for the ids parameter, use the data lake schemas API, which is documented here: https://developers.tetrascience.com/reference/list-schemas.
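The {namespace}/{slug}:{version} format can be split into its parts or assembled from them with a few lines of Python. This is a minimal sketch; parse_ids and format_ids are hypothetical helper names, and the validation is limited to checking that all three parts are present.

```python
def parse_ids(value):
    """Split '{namespace}/{slug}:{version}' into a (namespace, slug, version) tuple."""
    namespace, _, rest = value.partition("/")
    slug, _, version = rest.partition(":")
    if not (namespace and slug and version):
        raise ValueError(f"expected namespace/slug:version, got {value!r}")
    return namespace, slug, version


def format_ids(namespace, slug, version):
    """Assemble the value expected by the ids search parameter."""
    return f"{namespace}/{slug}:{version}"


parts = parse_ids("common/test-empower:v3.0.0")
print(parts)
print(format_ids(*parts))
```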
Find All Files Generated by a Specific Pipeline Workflow
(Added version TDP 3.1.0) To find files generated by a specific workflow, use the workflowId parameter. The following example retrieves information for IDS files with the workflowId eeac26b1-5034-4a1d-a97f-0c8b8cd2ea1e that were generated from August 21, 2021 to the present date.
curl --location --request GET 'https://api.tetrascience.com/v1/fileinfo/search?category=IDS&fromDate=2021-08-21T00:00:00.000Z&workflowId=eeac26b1-5034-4a1d-a97f-0c8b8cd2ea1e' \
--header 'ts-auth-token: {JWT_TOKEN}' \
--header 'x-org-slug: tetrascience'