Self-Service Data Apps FAQs

Frequently asked questions about Self-Service Data Apps

The following are answers to frequently asked questions about Self-Service Data Apps. If you can't find an answer to your question, contact your customer account leader.

For more information, see Self-Service Data Apps in the TetraConnect Hub. For access, see Access the TetraConnect Hub.

What platform versions support Self-Service Data Apps?

TDP v4.3.2 and later support Self-Service Data Apps.

What are the different types of authentication tokens available in data apps, and when should you use each?

Data apps support two types of authentication tokens:

  • User Token (ts-auth-token from cookies): Retrieved through st.context.cookies.get("ts-auth-token") in Streamlit apps. This token represents the currently logged-in user's permissions. Use it when your app should respect individual user permissions and data access rights (for example, a data review app where different users should see different datasets based on their TDP permissions).
  • App Token (from environment variables): Also referred to as the "connector token." This token represents the data app's service account permissions. App tokens are JWTs issued to the data app's service account, which has the Member role only. Because the Member role includes file write permissions, app tokens can write files. Use this token when the app needs consistent access regardless of which user is logged in (for example, a dashboard that aggregates data across all accessible datasets).

Many apps use a hybrid approach: user tokens for the TDP API calls that should respect user permissions (file retrieval, search) and app tokens for backend services or system-level operations.

🚧

IMPORTANT

Using the user's token is the best practice for maintaining an audit trail. When you use the app token, you lose all traceability on who performed what action within the data app. There is no way to distinguish which user performed an action to upload or write files back to the TDP or trigger other operations. Use user tokens for all operations that should be attributed to specific users (file uploads, data modifications, label changes). Only use app tokens for system-level operations that are truly independent of user identity.
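
The hybrid approach can be sketched as a small helper that picks a token per operation. This is a minimal sketch under stated assumptions, not a TetraScience implementation: the function names `select_token` and `build_headers` and the `user_attributed` flag are hypothetical. Only the ts-auth-token cookie name and the ts-auth-token and x-org-slug header names come from this page.

```python
from typing import Dict, Mapping, Optional


def select_token(
    cookies: Mapping[str, str],
    app_token: Optional[str],
    user_attributed: bool,
) -> str:
    """Pick the token for one operation (hypothetical helper)."""
    if user_attributed:
        token = cookies.get("ts-auth-token")
        if not token:
            # No silent fallback to the app token: that would drop the audit trail.
            raise PermissionError("No user token found in cookies")
        return token
    if not app_token:
        raise RuntimeError("App token has not been retrieved")
    return app_token


def build_headers(token: str, org_slug: str) -> Dict[str, str]:
    """Build the request headers used for TDP API calls."""
    return {"ts-auth-token": token, "x-org-slug": org_slug}
```

In a Streamlit app, `st.context.cookies` could be passed as `cookies`. Raising an error instead of falling back to the app token keeps user-attributed operations from being recorded under the service account.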

How do you obtain and use the app token?

The app token is not directly available as an environment variable. Instead, the JWT_TOKEN_PARAMETER environment variable contains the SSM Parameter Store path where the app token is stored.

To retrieve the token, do the following:

  1. Read the JWT_TOKEN_PARAMETER environment variable to get the SSM path.
  2. Use AWS SSM to fetch the actual token value from that path.
  3. Use the token in the ts-auth-token header when you call the TDP APIs.

import os
import boto3

# Get the SSM path
jwt_param_path = os.getenv("JWT_TOKEN_PARAMETER")

# Fetch the token from SSM
ssm = boto3.client('ssm')
response = ssm.get_parameter(Name=jwt_param_path, WithDecryption=True)
jwt_token = response['Parameter']['Value']

# Use in API calls
headers = {
    "ts-auth-token": jwt_token,
    "x-org-slug": os.getenv("ORG_SLUG")
}
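
The same retrieval can be wrapped in a function that fails with a clear message when the environment variable is missing. This is a sketch only: the name `fetch_app_token` is hypothetical, and the SSM client is passed in as an argument (an assumption made here so the logic can be exercised without AWS access).

```python
import os
from typing import Any, Mapping, Optional


def fetch_app_token(ssm_client: Any, env: Optional[Mapping[str, str]] = None) -> str:
    """Read the SSM path from JWT_TOKEN_PARAMETER and fetch the app token."""
    env = os.environ if env is None else env
    path = env.get("JWT_TOKEN_PARAMETER")
    if not path:
        raise RuntimeError("JWT_TOKEN_PARAMETER is not set")
    # WithDecryption=True asks SSM to return the decrypted parameter value.
    response = ssm_client.get_parameter(Name=path, WithDecryption=True)
    return response["Parameter"]["Value"]
```

Inside a running data app, this would typically be called as `fetch_app_token(boto3.client("ssm"))`.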

What data in the TDP is visible from within a data app for a given user?

Data visibility depends on how your app retrieves data:

  • Using the user token (from cookies): The user can access the same data within the app that they can see within the TDP.
  • Using the app token (from environment variables): The app has access to files based on the service account permissions, which are not tied to individual user permissions.
  • Using SQL queries: The data app uses the container task role or the provider secrets, which are not tied to user permissions.

📘

NOTE

As noted above, using the app token eliminates the audit trail of which specific user performed the action.

Can a data app identify which user on the TDP is interacting with the app?

Yes. You can call the v1/users/me endpoint with the user's token to retrieve information about the currently logged-in user.
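
A minimal sketch of that call using only the Python standard library follows. It assumes that the endpoint lives under the TDP_ENDPOINT base URL at /v1/users/me, that it expects the same ts-auth-token and x-org-slug headers as other TDP API calls, and that it returns JSON. The function names are hypothetical, and the response fields are not listed on this page, so the sketch returns the parsed body as-is.

```python
import json
import urllib.request


def build_current_user_request(
    tdp_endpoint: str, user_token: str, org_slug: str
) -> urllib.request.Request:
    """Build a GET request for the currently logged-in user."""
    url = tdp_endpoint.rstrip("/") + "/v1/users/me"
    return urllib.request.Request(
        url,
        headers={"ts-auth-token": user_token, "x-org-slug": org_slug},
        method="GET",
    )


def fetch_current_user(request: urllib.request.Request) -> dict:
    """Send the request and parse the JSON body (assumed response format)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Pass the user token from the ts-auth-token cookie, not the app token, because the app token identifies the service account and not the person using the app.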

What is the recommended best practice for setting an access token for a data provider as an environment variable?

Use TDP Providers to securely store and share secrets with your data app by doing the following:

  1. Go to the Providers tab in the TDP under the Data and AI workspace page.
  2. Create a custom provider (for example, SERVICE_USER) and add a field (for example, TOKEN).
  3. Attach the provider to your data app when you enable it.
  4. Access the environment variable within your app's code by using the naming convention <PROVIDER_NAME>_<FIELD_NAME>.

For example, a custom provider called Service User with a field called Token is injected into the app as SERVICE_USER_TOKEN. You can access it in Python through os.getenv('SERVICE_USER_TOKEN'). For more information, see Data App Providers in the TetraConnect Hub. For access, see Access the TetraConnect Hub.

📘

NOTE

Provider secrets are injected as environment variables and are available to your app code regardless of whether you use user tokens or app tokens for the TDP API authentication. They are distinct from the default environment variables that are automatically injected into all data apps.
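
The <PROVIDER_NAME>_<FIELD_NAME> convention can be sketched as follows. The normalization rule (uppercase, with runs of non-alphanumeric characters replaced by an underscore) is an assumption inferred from the single Service User / Token example above, and both function names are hypothetical. Confirm the exact variable name for your provider before relying on it.

```python
import os
import re
from typing import Mapping, Optional


def provider_env_name(provider_name: str, field_name: str) -> str:
    """Derive the environment variable name (assumed normalization rule)."""

    def normalize(part: str) -> str:
        return re.sub(r"[^A-Za-z0-9]+", "_", part.strip()).upper()

    return f"{normalize(provider_name)}_{normalize(field_name)}"


def read_provider_secret(
    provider_name: str,
    field_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Read a provider secret, failing loudly if it was not injected."""
    env = os.environ if env is None else env
    name = provider_env_name(provider_name, field_name)
    value = env.get(name)
    if value is None:
        raise KeyError(f"{name} is not set; is the provider attached to the app?")
    return value
```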

How does authentication and authorization work for embedded apps?

Embedded apps do not automatically authenticate your session. Developers must explicitly choose which authentication token to use, and the two options have meaningfully different behaviors:

  • User Token: Developers must explicitly retrieve the user token from cookies by calling st.context.cookies.get("ts-auth-token") in Streamlit apps. This token inherits the identity and permissions of the logged-in user, which preserves the audit trail for all actions the app performs on their behalf.
  • App Token: The app token is not automatically available. Developers must explicitly retrieve it from AWS SSM by using the JWT_TOKEN_PARAMETER environment variable (see How do you obtain and use the app token? above). The app token has the Member role only and provides consistent access, but loses all traceability of which user performed a given action.

If a developer does not retrieve either token, the app has no authentication token. There is no passive fallback.

Authorization (accessing the app): Access to the app itself is governed by the TDP's Access Rules:

  • Labels (ABAC): When you create or configure an app, you can assign labels to it through the Attribute Management section of the app settings.
  • ACL/Access Rules: These labels are then used in Data Access Rules (under Organization Settings) to control which users or groups can see and launch the app. For example, an admin can create a rule that says "Users in Group A can access apps labeled 'Department: Chemistry'." This applies the same Attribute-Based Access Control (ABAC) logic used for files to the apps themselves.

🚧

IMPORTANT

The choice of token type directly affects both the audit trail and the data access behavior of your app. Use user tokens for all operations that should be attributed to specific users. See What are the different types of authentication tokens available in data apps, and when should you use each? for a full comparison.

Do embedded apps have default access to all data, and do they respect RBAC and Data Access Rules?

No, embedded apps do not have default access to all data. However, the behavior depends on which token the app uses:

  • Using the user token: The app respects the logged-in user's permissions, including Data Access Rules (DARs). If a user does not have permission to view a certain dataset through the standard TDP Search UI, the app will receive a 403 Forbidden or empty results when it attempts to access that same data through the API. Data Access Controls based on labels and attributes are fully enforced.
  • Using the app token: The app operates under the service account's permissions (Member role), which are not tied to the individual user's Data Access Rules. The app token does not enforce per-user DARs, so the app may access data that the logged-in user would not otherwise be able to see.

🚧

IMPORTANT

If your app must respect per-user Data Access Rules, you must use the user token. The app token bypasses per-user DAR enforcement.
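
When the app calls the API with the user token, it needs to handle both outcomes described above: a 403 Forbidden response or an empty result set. The following sketch shows one way to separate them. The function name and the `fetch` callable are hypothetical, and the sketch assumes the HTTP client raises `urllib.error.HTTPError` on a 403, as the standard library does.

```python
import urllib.error
from typing import Callable, List, Optional


def fetch_visible_records(fetch: Callable[[], List[dict]]) -> Optional[List[dict]]:
    """Return the records, [] when nothing is visible, or None when access is denied."""
    try:
        return fetch()
    except urllib.error.HTTPError as err:
        if err.code == 403:
            # The logged-in user's Data Access Rules do not allow this data.
            return None
        raise  # Other HTTP errors are real failures and should surface.
```

The app can then show a "no access" message for `None` and a "no results" message for an empty list, instead of treating either case as a crash.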

What environment variables are available within data apps?

The following default environment variables are automatically available when a data app is provisioned and enabled:

Connector-level defaults available to all data apps

| Variable | Description |
| --- | --- |
| ORG_SLUG | Organization identifier |
| CONNECTOR_ID | Unique connector/data app instance ID |
| TDP_ENDPOINT | External TDP API base URL |
| AWS_REGION | AWS region for the deployment |
| DATALAKE_BUCKET | S3 bucket for the data lake |
| STREAM_BUCKET | S3 bucket for streaming/ingestion |
| JWT_TOKEN_PARAMETER | SSM path to the app's auth token |
| KMS_KEY_ID | KMS key for encryption |

Data app-specific defaults

| Variable | Description |
| --- | --- |
| DATA_APP_ID | Unique data app instance ID |
| ATHENA_BUCKET | S3 bucket for Athena tables |
| ATHENA_S3_OUTPUT_LOCATION | S3 location for Athena query results |
| TDP_INTERNAL_ENDPOINT | Internal TDP API endpoint |
| INFRASTRUCTURE_NAME | Deployment stack name |
| ENVIRONMENT | Environment identifier (for example, production, staging) |

Custom environment variables for third-party integrations are not automatically added. Use Providers to add secrets to the TDP and share them with your data app. For more information, see Data App Providers in the TetraConnect Hub. For access, see Access the TetraConnect Hub.
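
One way to consume these defaults is to read them once at startup and fail early if any are missing. The sketch below is illustrative only: the `DataAppConfig` class, the `load_config` function, and the choice of which four variables to treat as required are assumptions, not platform requirements.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DataAppConfig:
    org_slug: str
    tdp_endpoint: str
    jwt_token_parameter: str
    data_app_id: str


def load_config(env: Optional[Mapping[str, str]] = None) -> DataAppConfig:
    """Load a subset of the default variables, reporting all missing ones at once."""
    env = os.environ if env is None else env
    required = {
        "org_slug": "ORG_SLUG",
        "tdp_endpoint": "TDP_ENDPOINT",
        "jwt_token_parameter": "JWT_TOKEN_PARAMETER",
        "data_app_id": "DATA_APP_ID",
    }
    missing = sorted(var for var in required.values() if not env.get(var))
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return DataAppConfig(**{field: env[var] for field, var in required.items()})
```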

For local development, should you use a login token or a service user token?

Either approach works, but consider the following:

  • Login token: Best for apps that need to respect user-specific permissions. This approach allows you to test permission-based behavior locally. Set this as an environment variable (for example, TS_AUTH_TOKEN) for local development.
  • Service user token: Best for apps that need consistent access regardless of the user. This approach is simpler for local development but may not reflect production permission boundaries.
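
For the login-token approach, a common pattern is to prefer the cookie and fall back to the environment variable only when no cookie is present, which is the situation on a local machine. This is a sketch: the function name is hypothetical, and TS_AUTH_TOKEN is the example variable name used above, not a platform default.

```python
import os
from typing import Mapping, Optional


def resolve_user_token(
    cookies: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the user token from cookies, or from TS_AUTH_TOKEN when running locally."""
    token = cookies.get("ts-auth-token")
    if token:
        return token
    env = os.environ if env is None else env
    # Local development only: no TDP session cookie exists outside the platform.
    return env.get("TS_AUTH_TOKEN")
```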

What is the difference between TDP_ENDPOINT and TDP_INTERNAL_ENDPOINT?

These two environment variables serve different purposes:

  • TDP_ENDPOINT: The external, tenant-facing TDP API base URL (for example, https://api.tetrascience.com). Use this for standard TDP API calls that are documented in the TetraScience API Reference.
  • TDP_INTERNAL_ENDPOINT: The internal TDP API endpoint for service-to-service calls within the TetraScience network. Use this for internal APIs that are not exposed on the tenant-facing endpoint.

What do you need to know about manifest.json files to publish a data app?

Data apps require catalog_keys to be defined with an existing key. The recommended setting is catalog_keys: ["any"] in your manifest.json. The requirements field is optional and must follow specific rules when it is defined, so the recommended approach is to omit requirements from manifest.json.

For more information, see Data App manifest.json Files in the TetraConnect Hub. For access, see Access the TetraConnect Hub.
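
The two recommendations above can be checked before publishing with a small script. This is a sketch, not an official validator: `check_manifest` is a hypothetical helper, and it checks only the two points named in this answer, not the full manifest.json schema.

```python
import json
from typing import List


def check_manifest(manifest_text: str) -> List[str]:
    """Return a list of warnings for the two recommendations in this FAQ."""
    manifest = json.loads(manifest_text)
    problems: List[str] = []
    keys = manifest.get("catalog_keys")
    if not isinstance(keys, list) or not keys:
        problems.append('catalog_keys is missing or empty; consider ["any"]')
    if "requirements" in manifest:
        problems.append("requirements is optional; consider omitting it")
    return problems
```

An empty list means the manifest follows both recommendations. To check a file, pass its contents, for example `check_manifest(open("manifest.json").read())`.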

Can data apps run SELECT or UPSERT queries on Delta Tables managed by other apps?

Data apps can run SELECT queries on Delta Tables managed by other apps. However, UPSERT queries on Delta Tables managed by other apps are not supported.

Can a data app run SELECT queries on all IDS tables?

Yes. Data apps can query any table in the Lakehouse.

How many tables can a data app create?

Data apps can create any number of tables.

Why does my app's session state sometimes get reset even when I don't close the app?

AWS performs regular maintenance of the clusters where data apps run. If a cluster is restarted while an app is running on it, all session information is lost. Store any information needed long-term outside of the app. In TDP v4.3.2 and later, data apps can save data back to the platform.

What gets deleted in the TDP when you install, delete, or upgrade a data app?

Any state that is stored on the running data app container is lost during a delete (and most likely during version changes, depending on how the data is stored). Data apps have access to a connector key/value store, which is a table that preserves data across data app version changes.

What determines whether an app artifact gets a tile?

The platform uses the slug, namespace, and version fields in the app's manifest.json file to determine the tile under which data apps are grouped.

  • To rename an app or change its icon: Change the name field or icon file, publish a new app version, and then update the tile to the new version.
  • To create a new app tile: Change the slug field (and whatever else you want to change, such as the name or icon), publish the app, and then install the app from the gallery.

For more information, see Data App manifest.json Files in the TetraConnect Hub. For access, see Access the TetraConnect Hub.

Can users bookmark a data app with a URL?

Yes. The URLs for data apps persist across sessions, even when the app is upgraded.

How do you resolve errors during app installation?

If you publish a data app and can see it in the Data & AI Workspace Gallery but receive an error message when you try to install it, there are several possible causes. One common cause is the use of special characters in the application name. App names must contain only alphanumeric characters and spaces. Characters such as apostrophes or parentheses in the name cause installation to fail.
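
The naming rule can be checked before publishing with a short function. This sketch assumes "alphanumeric" means ASCII letters and digits. Whether the platform also accepts non-ASCII letters is not stated here, and `is_valid_app_name` is a hypothetical helper.

```python
import re

# ASCII letters, digits, and spaces only (assumed interpretation of the rule).
_APP_NAME_PATTERN = re.compile(r"[A-Za-z0-9 ]+")


def is_valid_app_name(name: str) -> bool:
    """Return True when the name contains only letters, digits, and spaces."""
    return _APP_NAME_PATTERN.fullmatch(name) is not None
```

For example, a name containing an apostrophe or parentheses fails this check, matching the installation failures described above.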

How long does a simple SQL read or write take?

SQL query performance depends on the data warehouse state. As a baseline reference, Athena-based queries take approximately two to three seconds. Self-Service Data Apps queries should be faster than Athena, assuming the warehouse is already running.

What is the cold start time for a data app cluster?

A cold start takes approximately seven minutes, whether it is the first start or a subsequent one. With heavier Databricks usage, the warehouse is more likely to be running already, so users encounter cold starts less often.

How does observability work for data apps, and can developers see logs without AWS CloudWatch access?

Data apps automatically capture stdout and stderr from the application container, so standard startup logs and Python tracebacks are recorded even without custom developer logging. These container logs are visible on the data app details page in the TDP for developers and administrators.

📘

NOTE

Container logs are available to developers and administrators through the data app details page, but they are not automatically surfaced to end users within the app. If you need to expose log information to end users, you must implement custom logging within your application.
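
Because the platform captures stdout and stderr, routing Python's standard logging module to stdout is enough for messages to appear in the container logs. The following is a minimal sketch. The logger name data_app and the message format are arbitrary choices, not platform requirements.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Send log records to stdout so the container log capture picks them up."""
    logger = logging.getLogger("data_app")  # hypothetical logger name
    if not logger.handlers:
        # Guard against duplicate handlers when the script is re-run in one process.
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```

The handler guard matters in frameworks such as Streamlit, which re-execute the script on each interaction. Without it, every rerun would add another handler and duplicate each log line.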

Where is the Docker registry for images published through ts-cli, and can a custom registry be specified for vulnerability scanning?

Images published through ts-cli are pushed to a TetraScience-managed private Elastic Container Registry (ECR) associated with your specific TDP instance and organization. The platform does not currently allow you to specify a custom registry for deployment.

To ensure image security and detect vulnerabilities before the app becomes active, integrate scanning tools (such as Snyk or Trivy) into your local development workflow or CI/CD pipeline. Scan the image immediately after the local build but before running the ts-cli publish command to ensure only clean images are pushed to the platform's registry.
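
A scan step of this kind can be sketched with Trivy as follows. The sketch assumes Trivy is installed and on the PATH, and that the image has already been built locally under the tag you pass in. The function names are hypothetical, and the publish step is left out because its exact invocation depends on your ts-cli setup.

```python
import subprocess
from typing import List


def build_scan_command(image: str, severities: str = "HIGH,CRITICAL") -> List[str]:
    """Build a Trivy command that exits non-zero when matching vulnerabilities are found."""
    return ["trivy", "image", "--exit-code", "1", "--severity", severities, image]


def image_is_clean(image: str) -> bool:
    """Run the scan and report whether it passed (requires Trivy to be installed)."""
    result = subprocess.run(build_scan_command(image), check=False)
    return result.returncode == 0
```

In a CI/CD pipeline, the publish step would run only when `image_is_clean` returns True for the freshly built image.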
