Tetra Box Connector

Box is a service that offers secure file sharing. Tetra Data Platform has a built-in integration that allows users to pull raw data files from their secure Box storage and into the Tetra Data Lake.

How does the connector work

TetraScience uses Box's API to continuously detect file change events in your Box account, upload the affected files into our Data Lake, and then trigger Data Pipelines.

Our Box Connector currently tracks file creation events, including new versions of the same file. If you remove a file from Box, the TetraScience Data Lake does not mirror the deletion: files that were already collected are kept.
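
The retention behavior described above can be pictured with a small in-memory model. This is only a sketch under assumptions: the DataLakeSketch class, its method, and the "create" and "delete" labels are hypothetical and are not part of the Tetra Data Platform or the Box API.

```python
from collections import defaultdict


class DataLakeSketch:
    """Hypothetical in-memory stand-in for a store that keeps every file version."""

    def __init__(self):
        # Maps a file path to the list of versions collected so far.
        self.versions = defaultdict(list)

    def handle_event(self, event_type, path, content=None):
        if event_type == "create":
            # A new file or a new version is appended; nothing is overwritten.
            self.versions[path].append(content)
        # Any other event (for example a delete in Box) is ignored,
        # so files that were already collected stay in place.


path = "Shared/instruments-data/plate-reader-1/run.csv"
lake = DataLakeSketch()
lake.handle_event("create", path, b"first upload")
lake.handle_event("create", path, b"second version")
lake.handle_event("delete", path)
print(len(lake.versions[path]))  # both versions are still present
```

The point of the model is that collection is append-only: removing the file at the source does not shrink what was already stored.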

The Box integration tracks (listens for) three types of events in your Box account:

  • File has been uploaded (create event)
  • File has been changed (update event)
  • File has been copied from another Box location (copy event)

The integration checks Box for newly created files every 60 seconds.
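
The polling behavior can be sketched as a loop that fetches events at a fixed interval and keeps only the tracked kinds. This is an illustration under assumptions, not the connector's implementation: the event labels, the dictionary shape, and the fetch_events and handle callables are all hypothetical, and no call to Box is made.

```python
import time

# The three event kinds listed above. These labels are illustrative,
# not the identifiers used by Box or TetraScience.
TRACKED_EVENTS = {"create", "update", "copy"}
POLL_INTERVAL_SECONDS = 60


def select_tracked(events):
    """Keep only events whose type is one of the tracked kinds."""
    return [e for e in events if e["type"] in TRACKED_EVENTS]


def poll(fetch_events, handle, cycles, interval=POLL_INTERVAL_SECONDS, sleep=time.sleep):
    """Call fetch_events() once per cycle and pass tracked events to handle()."""
    for i in range(cycles):
        for event in select_tracked(fetch_events()):
            handle(event)
        if i < cycles - 1:
            sleep(interval)  # wait before the next check


# Usage with a fake event source and no real waiting.
sample = [
    {"type": "create", "path": "folder/a.csv"},
    {"type": "delete", "path": "folder/b.csv"},
    {"type": "copy", "path": "folder/c.csv"},
]
seen = []
poll(lambda: sample, seen.append, cycles=1, sleep=lambda seconds: None)
print([e["type"] for e in seen])
```

In this run the delete event is dropped and only the create and copy events reach the handler.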

When you first create a Box integration, it pulls all existing files that match the provided file pattern into our Data Lake.
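
As a rough picture of the initial pull, the sketch below filters a list of existing paths against a pattern. The pattern syntax accepted by the connector is not specified here, so glob-style matching is an assumption, and initial_sync and the example file names are hypothetical.

```python
from fnmatch import fnmatchcase


def initial_sync(existing_paths, pattern):
    """Return the existing files that match the pattern, in their original order."""
    # Assumption: glob-style patterns, where "*" matches any run of characters.
    return [p for p in existing_paths if fnmatchcase(p, pattern)]


existing = [
    "Shared/instruments-data/plate-reader-1/run-001.csv",
    "Shared/instruments-data/plate-reader-1/notes.txt",
    "Shared/instruments-data/plate-reader-2/run-002.csv",
]
matched = initial_sync(existing, "*.csv")
print(matched)
```

With the pattern "*.csv", the two CSV files are selected and the text file is skipped.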

How to configure the connector

Set up your Box account

First, create an API user for this integration. For production use, the best practice is a standard user dedicated to this integration. We recommend naming it [email protected].

After the user is created, share the Box folder that you want the integration to track with the API user, granting read-only permissions.

Organize your Box folders

It is a good idea to use the folder structure to organize your data. The best practice is to include details such as your study number, project name or ID, and instrument name or ID in the folder path. For example:

Shared/instruments-data/plate-reader-1
Shared/instruments-data/plate-reader-2

If you are organizing data from your CRO, you can consider something like the following:

Shared/study-1/CRO-A/assay-x
Shared/study-1/CRO-A/assay-y
Shared/study-1/CRO-A/assay-z
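
One benefit of a consistent folder layout is that each path level can be read as a label. The sketch below shows the idea; labels_from_path and the field names are hypothetical, and nothing here implies that the platform parses paths this way.

```python
def labels_from_path(path, field_names):
    """Pair each folder level below the top-level folder with a field name."""
    parts = path.strip("/").split("/")
    # Skip the first segment ("Shared" in the examples above).
    return dict(zip(field_names, parts[1:]))


cro_labels = labels_from_path("Shared/study-1/CRO-A/assay-x", ["study", "cro", "assay"])
print(cro_labels)
```

Because every CRO path follows the same study/CRO/assay order, the same field list works for all three example folders.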

Create and configure a Box integration

Begin by clicking the "hamburger" menu icon on the top left to open the sidebar menu.

Select "Box" under the "Data Sources" menu category. You will be presented with the Box Source Management page.

This is the landing page for all existing Box integrations. In order to add any folders, a Box account must be linked and added to the TetraScience Platform.
TetraScience supports connecting multiple disparate Box accounts, and each account can have multiple directories and folders to integrate.

A modal will pop up and request authorization to access the Box account, and will then load the Box page itself.

If you are not logged in to Box, you will be prompted to log in. Once you are logged in, Box will present an authorization modal. Click "Grant access to box" to complete the access request.

Once a Box account is linked, you can begin to add folders, or "sources", and choose specific file masks. Click the "Add source" button.

This is the Box folder picker. Navigate to the highest-level directory in which you want to track files, and select it. Press Next to continue.

You are required to enter a unique name for the source. It can be helpful to include a brief description to stay organized if there are many tracked folders.

Lastly, you have the option to include specific metadata and tags that are attributed to all files which come from this source. If there are specific metadata fields attributed to your organization, they will be presented. Tags can be defined on this page and are useful markers for searching or filtering this source's files.
Once you are done, click "Save" to proceed.
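
To illustrate how source-level metadata and tags apply to every file from a source, here is a minimal sketch. SourceSketch, label_file, filter_by_tag, and the example values are hypothetical names chosen for illustration; they do not describe the platform's data model.

```python
from dataclasses import dataclass, field


@dataclass
class SourceSketch:
    """Hypothetical model of a tracked folder with source-level labels."""
    name: str
    folder: str
    metadata: dict = field(default_factory=dict)
    tags: list = field(default_factory=list)

    def label_file(self, path):
        """Return a record for one file carrying the source's metadata and tags."""
        return {
            "path": path,
            "source": self.name,
            "metadata": dict(self.metadata),
            "tags": list(self.tags),
        }


def filter_by_tag(records, tag):
    """Keep only the records that carry the given tag."""
    return [r for r in records if tag in r["tags"]]


reader = SourceSketch(
    name="plate-reader-1",
    folder="Shared/instruments-data/plate-reader-1",
    metadata={"instrument": "plate-reader-1"},
    tags=["plate-reader"],
)
records = [reader.label_file(reader.folder + "/run-001.csv")]
print(filter_by_tag(records, "plate-reader"))
```

Every file record inherits the same labels from its source, which is what makes tags usable as a search or filter key.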

You will then be presented with the updated Box Source Management page. The Box account integration can be deleted by clicking the hamburger menu enclosed in green. If you need to change the specific directory or folders you selected, use the hamburger menu boxed in blue.