Run an Inference (AI Services v1.0.x)

This guide shows how to run inferences on installed Scientific AI Workflows using TetraScience AI Services versions 1.0.x.

Run an Inference

After you've installed a Scientific AI Workflow and have its inference URL, you can run inferences programmatically in either of the following ways:

  • Run an online (synchronous) inference
  • Run an offline (asynchronous) inference

For AI workflows that require large files, such as model weights, configuration files, or reference data, you can also upload AI Workflow assets before running an inference.

Run an Online (Synchronous) Inference

Online (synchronous) inference performs real-time ML inference and returns results immediately. Use online inference for real-time predictions, chat and conversation workflows, and low-latency requirements (less than 60 seconds). For batch processing, large files, or long-running inference (more than 60 seconds), use an offline (asynchronous) inference instead.

📘

NOTE

Online inference is only available for AI workflows that have a Model Serving Endpoint configured.

Required Policies

To submit an online inference request, users must have one of the following policy permissions:

| Operation | Endpoint | Policies with Required Permissions |
| --- | --- | --- |
| Submit inference request | POST /v1/inference/online | |
| Get inference metrics | GET /v1/inference/online/{requestId}/metrics | |

API Endpoint

POST /v1/inference/online

Request Headers

| Header | Required | Description |
| --- | --- | --- |
| x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
| ts-auth-token | Yes | Authentication token (JWT) |
| Content-Type | Yes | Must be application/json |

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| aiWorkflow | string | Yes | The AI workflow identifier (for example, cell-analysis) |
| version | string | No | Specific version of the AI workflow (for example, v1.0.0). If not specified, the latest version is used. |
| inferenceOptions | object | No | Configuration options for the inference request |
| inferenceInput | object | Yes | Input data for the model (format varies by model type) |

Inference Options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| stream | boolean | false | Enable streaming response mode (Server-Sent Events) |
| metrics | boolean | false | Enable metrics collection for this request |
| timeout | integer | 300000 | Request timeout in milliseconds (1000-300000, which is 1 second to 5 minutes) |

Inference Input Formats

The inferenceInput field supports multiple formats, depending on the model serving endpoint requirements:

| Format | Description | Example Field |
| --- | --- | --- |
| Tabular (array) | Array of arrays for tabular data | data |
| Tabular (records) | Array of objects with named fields | dataframe_records |
| Images | Base64-encoded image data | images |
| Chat messages | Conversation format with role/content | messages |

Request Examples

Tabular Data (Buffered Response)

{
  "aiWorkflow": "lead-clone-selection",
  "version": "v2.1.0",
  "inferenceInput": {
    "dataframe_records": [
      { "feature1": 1.5, "feature2": "category_a", "feature3": 42 },
      { "feature1": 2.3, "feature2": "category_b", "feature3": 37 }
    ]
  },
  "inferenceOptions": {
    "stream": false,
    "metrics": true,
    "timeout": 30000
  }
}
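
For reference, here's a minimal curl sketch for submitting a request body like the one above (saved locally as request.json). The host shown is a placeholder; use the inference URL for your installed AI workflow.

# Submit a buffered online inference request; the host is a placeholder,
# and request.json holds a body like the example above
curl -X POST "https://api.tetrascience.com/v1/inference/online" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d @request.json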

Chat Conversation (Streaming Response)

{
  "aiWorkflow": "coding-assistant",
  "version": "v1.0.0",
  "inferenceInput": {
    "messages": [
      { "role": "user", "content": "Write a Python function to calculate factorial" }
    ]
  },
  "inferenceOptions": {
    "stream": true,
    "timeout": 30000
  }
}

Image Analysis

{
  "aiWorkflow": "cell-image-analysis",
  "version": "v1.5.0",
  "inferenceInput": {
    "images": [
      { "b64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk..." }
    ]
  },
  "inferenceOptions": {
    "stream": false,
    "metrics": true
  }
}
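
If you're building the images payload from a local file, the following shell sketch base64-encodes an image and assembles the request body with jq. The file name cells.png is hypothetical.

# Base64-encode a local image (cells.png is hypothetical); -w0 disables
# line wrapping in GNU coreutils (omit it on macOS)
IMG_B64=$(base64 -w0 ./cells.png)

# Assemble the request body shown above into request.json
jq -n --arg b64 "$IMG_B64" '{
  aiWorkflow: "cell-image-analysis",
  version: "v1.5.0",
  inferenceInput: { images: [ { b64: $b64 } ] },
  inferenceOptions: { stream: false, metrics: true }
}' > request.json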

Response

Buffered Response (JSON)

When stream is false (default), the response is returned as a complete JSON object:

| Field | Type | Description |
| --- | --- | --- |
| requestId | string | Unique identifier for this inference request (for example, sync_123e4567-e89b-12d3-a456-426614174000) |
| status | string | Status of the inference request: success or failed |
| inferenceOutput | any | Model output (format varies by model) |
| metrics | object | Optional metrics about the inference execution |
| timestamp | string | ISO 8601 timestamp of the response |
| errorMessage | string | Error message if status is failed |

Example Response (Tabular Predictions)

{
  "requestId": "sync_123e4567-e89b-12d3-a456-426614174000",
  "status": "success",
  "inferenceOutput": {
    "predictions": [0.85, 0.72]
  },
  "metrics": {
    "runDuration": 245,
    "executionDuration": 245
  },
  "timestamp": "2025-09-29T10:30:00Z"
}

Streaming Response (Server-Sent Events)

When stream is true, results are streamed as Server-Sent Events (SSE) with Content-Type: text/event-stream:

data: {"chunk": "partial result", "done": false}

data: {"chunk": "more content", "done": false}

data: [DONE]
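
One way to consume the stream from the command line is curl with output buffering disabled. This is a sketch; the host is a placeholder, as in the other examples.

# -N disables curl's output buffering so SSE events print as they arrive
curl -N -X POST "https://api.tetrascience.com/v1/inference/online" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"aiWorkflow": "coding-assistant", "inferenceInput": {"messages": [{"role": "user", "content": "Write a Python function to calculate factorial"}]}, "inferenceOptions": {"stream": true}}'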

Error Responses

| Status Code | Description | Example |
| --- | --- | --- |
| 400 | Bad request - validation failed | AI workflow not found, invalid input format, invalid timeout value |
| 401 | Authentication failed | Invalid or missing authentication token |
| 403 | Access denied | Organization doesn't have access, insufficient role permissions |
| 404 | AI workflow not found | Specified workflow doesn't exist |
| 422 | Unprocessable entity | Model returned validation error for input data |
| 500 | Internal server error | Model serving endpoint failed, inference timeout |
| 503 | Service unavailable | Model serving endpoint is temporarily unavailable |

Get Inference Metrics

To retrieve metrics for a completed online inference request:

GET /v1/inference/online/{requestId}/metrics

Path Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| requestId | string | Yes | The online inference request ID (for example, sync_123e4567-e89b-12d3-a456-426614174000) |
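
Example Request

A minimal sketch; the host is a placeholder:

curl -X GET "https://api.tetrascience.com/v1/inference/online/sync_123e4567-e89b-12d3-a456-426614174000/metrics" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN"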

Response

| Field | Type | Description |
| --- | --- | --- |
| requestId | string | Unique identifier for the inference request |
| aiWorkflow | string | AI workflow identifier |
| organization | string | Organization slug |
| status | string | Status: success, failed, or timeout |
| processingTime | number | Time taken to process the request (milliseconds) |
| timestamp | string | ISO 8601 timestamp |
| inputSize | integer | Size of input data in bytes |
| errorMessage | string | Error message if inference failed |

Example Metrics Response

{
  "requestId": "sync_123e4567-e89b-12d3-a456-426614174000",
  "aiWorkflow": "lead-clone-selection:v2.1.0",
  "organization": "acme-corp",
  "status": "success",
  "processingTime": 1245,
  "timestamp": "2025-09-29T10:30:00Z",
  "inputSize": 2048
}

Run an Offline (Asynchronous) Inference

Offline (asynchronous) inference submits file-based inference requests for batch processing. Use offline inference to process large files or datasets, run batch inference jobs, or handle tasks expected to take more than 60 seconds, as well as when files are already uploaded to the platform. For real-time predictions or low-latency requirements, use an online (synchronous) inference instead.

📘

NOTE

Keep in mind the following when running offline inferences:

  • Offline inference supports partial success, where optional files can fail without blocking the inference job. Critical files (primary data files) must succeed for inference to proceed.
  • Offline inference can be automated using the ts-inference-protocol. Configure a file processing pipeline with this protocol to automatically trigger inference requests when new files are uploaded to the platform. The protocol accepts configuration for the inferenceUrl, aiWorkflow, and authentication token, and then automatically submits matching files for inference processing.

Required Policies

To submit an offline inference request, you must have one of the following policy permissions:

| Operation | Endpoint | Policies with Required Permissions |
| --- | --- | --- |
| Submit inference request | POST /v1/inference | |
| Get inference status | GET /v1/inference/{inferenceId}/status | |

API Endpoint

POST /v1/inference

Request Headers

| Header | Required | Description |
| --- | --- | --- |
| x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
| ts-auth-token | Yes | Authentication token (JWT) |
| Content-Type | Yes | Must be application/json |

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| aiWorkflow | string | Yes | The AI workflow identifier (for example, cell-analysis). You can get a list of available AI workflows using the GET /v1/install endpoint. |
| version | string | No | Specific version of the AI workflow (for example, v1.0.0). If not specified, the latest version is used. |
| inputFiles | array | Yes | List of input files for inference processing (minimum 1 file) |

Input File Object

Each file in the inputFiles array must include:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| fileUuid | string | Yes | — | UUID of the file in TDP |
| role | string | No | file | Role of the file in processing: file, instructions, or parameters |
| processingOrder | integer | No | 1 | Order in which the file should be processed (minimum 1) |
| dependencies | array | No | [] | List of file UUIDs this file depends on |

Request Examples

Basic Inference Request

{
  "aiWorkflow": "cell-analysis",
  "inputFiles": [
    {
      "fileUuid": "550e8400-e29b-41d4-a716-446655440000",
      "role": "file",
      "processingOrder": 1,
      "dependencies": []
    }
  ]
}
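
A matching curl sketch, where request.json contains the body above and the host is a placeholder:

# Submit an offline inference request for the file above
curl -X POST "https://api.tetrascience.com/v1/inference" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d @request.json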

Multi-File Inference with Dependencies

{
  "aiWorkflow": "lead-clone-selection",
  "version": "v2.1.0",
  "inputFiles": [
    {
      "fileUuid": "550e8400-e29b-41d4-a716-446655440001",
      "role": "file",
      "processingOrder": 1,
      "dependencies": []
    },
    {
      "fileUuid": "550e8400-e29b-41d4-a716-446655440002",
      "role": "instructions",
      "processingOrder": 2,
      "dependencies": ["550e8400-e29b-41d4-a716-446655440001"]
    },
    {
      "fileUuid": "550e8400-e29b-41d4-a716-446655440003",
      "role": "parameters",
      "processingOrder": 3,
      "dependencies": []
    }
  ]
}

Response

The API returns immediately with a 202 Accepted status and provides a request ID for tracking progress.

| Field | Type | Description |
| --- | --- | --- |
| requestId | string | Unique identifier for the inference request (for example, req-20250929-abc123def456) |
| statusUrl | string | URL to check the status of the inference request |
| partialSuccess | boolean | Indicates whether some optional files failed to stage but inference can still proceed |
| stagingSummary | object | Detailed breakdown of file staging results (only present for partial success) |

Example Response (All Files Staged Successfully)

{
  "requestId": "req-20250929-abc123def456",
  "statusUrl": "/v1/inference/req-20250929-abc123def456"
}

Example Response (Partial Staging Success)

{
  "requestId": "req-20250929-abc123def456",
  "statusUrl": "/v1/inference/req-20250929-abc123def456",
  "partialSuccess": true,
  "stagingSummary": {
    "totalFiles": 3,
    "successfulFiles": 2,
    "failedFiles": 1,
    "criticalFailures": 0,
    "optionalFailures": 1,
    "warnings": ["Optional file failed to stage: metadata.json (Access denied)"]
  }
}

Error Responses

| Status Code | Description | Example |
| --- | --- | --- |
| 400 | Bad request - validation failed | Invalid AI workflow, no valid files, critical files failed to stage |
| 401 | Authentication failed | Invalid or missing authentication token |
| 403 | Access denied | No access to AI workflow, file doesn't belong to organization |
| 404 | Resource not found | AI workflow or files not found |
| 409 | Conflict | AI workflow is disabled or inactive |
| 422 | Unprocessable entity | Invalid file format or empty files |
| 500 | Internal server error | Staging, queue, or database service errors |
| 503 | Service unavailable | FileInfo service timeout or external dependency unavailable |

Get Inference Status

To check the status of an offline inference request:

GET /v1/inference/{inferenceId}/status

Path Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| inferenceId | string | Yes | The inference request ID |

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| includeFiles | boolean | false | Include detailed file information in the response |
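
Example Request

A minimal sketch; the host is a placeholder:

curl -X GET "https://api.tetrascience.com/v1/inference/req-20250929-abc123def456/status?includeFiles=true" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN"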

Response

| Field | Type | Description |
| --- | --- | --- |
| requestId | string | Unique identifier for the inference request |
| dbxRunId | number | Databricks job run ID (when processing has started) |
| status | string | Status: pending, processing, completed, or failed |
| aiWorkflow | string | AI workflow identifier |
| organization | string | Organization slug |
| user | string | User who submitted the request |
| createdAt | string | ISO 8601 timestamp when the request was created |
| updatedAt | string | ISO 8601 timestamp when the request was last updated |
| completedAt | string | ISO 8601 timestamp when inference completed (only for completed/failed) |
| inputFiles | array | Original input file specifications |
| stagingLocation | object | S3 staging location information |
| outputLocation | object | S3 output location information |
| manifestPath | string | S3 path to the processing manifest |
| errorMessage | string | Error message if inference failed |
| partialSuccess | boolean | Whether this request had partial staging success |
| stagingSummary | object | Summary of file staging results |
| files | array | Detailed file information (only if includeFiles=true) |

Example Status Response (Processing)

{
  "requestId": "req-20250929-abc123def456",
  "status": "processing",
  "aiWorkflow": "cell-analysis",
  "organization": "acme-corp",
  "user": "user123",
  "createdAt": "2025-09-29T10:30:00Z",
  "updatedAt": "2025-09-29T10:35:00Z",
  "stagingLocation": {
    "bucket": "inference-staging-bucket",
    "prefix": "tenant=acme/org=acme-corp/2025/09/29/req-20250929-abc123def456/",
    "fullPath": "s3://inference-staging-bucket/tenant=acme/org=acme-corp/2025/09/29/req-20250929-abc123def456/"
  },
  "outputLocation": {
    "bucket": "inference-output-bucket",
    "prefix": "tenants/acme/orgs/acme_corp/schemas/ai_assets/2025/09/29/req-20250929-abc123def456/",
    "fullPath": "s3://inference-output-bucket/tenants/acme/orgs/acme_corp/schemas/ai_assets/2025/09/29/req-20250929-abc123def456/"
  }
}

Example Status Response (Completed)

{
  "requestId": "req-20250929-abc123def456",
  "status": "completed",
  "aiWorkflow": "cell-analysis",
  "organization": "acme-corp",
  "user": "user123",
  "createdAt": "2025-09-29T10:30:00Z",
  "updatedAt": "2025-09-29T10:45:00Z",
  "completedAt": "2025-09-29T10:45:00Z"
}
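
Because offline inference is asynchronous, clients typically poll the status endpoint until the request reaches a terminal state. The following is a minimal polling sketch; the host and the 30-second interval are assumptions.

# Poll until the request reaches a terminal state (completed or failed)
REQUEST_ID="req-20250929-abc123def456"
while true; do
  STATUS=$(curl -s "https://api.tetrascience.com/v1/inference/$REQUEST_ID/status" \
    -H "x-org-slug: acme-corp" \
    -H "ts-auth-token: YOUR_JWT_TOKEN" | jq -r '.status')
  echo "status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    break
  fi
  sleep 30  # polling interval is an arbitrary choice
done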

Upload AI Workflow Assets

Some AI workflows require large files, such as model weights, configuration files, or reference data, to be uploaded before inference can be performed. The Assets API provides endpoints for listing existing assets and uploading new files to an AI workflow's assets directory.

Use the Assets API when you need to do any of the following:

  • Upload model weights or checkpoints for inference notebooks
  • Provide configuration files that customize workflow behavior
  • Store reference datasets used during inference
  • Upload any large files that cannot be passed inline with inference requests

Required Policies

To manage AI workflow assets, users must have one of the following policy permissions:

| Operation | Endpoint | Policies with Required Permissions |
| --- | --- | --- |
| List assets | GET /v1/assets | |
| Upload assets | POST /v1/assets | |

List Assets

Retrieve a hierarchical listing of files and folders in an AI workflow's assets directory:

GET /v1/assets

Request Headers

| Header | Required | Description |
| --- | --- | --- |
| x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
| ts-auth-token | Yes | Authentication token (JWT) |

Query Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| namespace | string | Yes | — | Namespace of the AI workflow (for example, common) |
| aiWorkflow | string | Yes | — | Slug of the AI workflow (for example, cell-analysis) |
| path | string | No | / | Path within the assets folder to list |

Example Request

curl -X GET "https://api.tetrascience.com/v1/assets?namespace=common&aiWorkflow=cell-analysis&path=/models" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN"

Example Response

{
  "name": "models",
  "type": "folder",
  "path": "/models",
  "children": [
    {
      "name": "weights.pkl",
      "type": "file",
      "path": "/models/weights.pkl",
      "size": 104857600,
      "lastModified": "2025-09-29T10:30:00.000Z"
    },
    {
      "name": "config.json",
      "type": "file",
      "path": "/models/config.json",
      "size": 1024,
      "lastModified": "2025-09-29T09:15:00.000Z"
    }
  ]
}

Upload Assets

Uploading files to an AI workflow's assets directory is a two-step process:

  1. Get a presigned URL: Call the Assets API to get a presigned Amazon Simple Storage Service (Amazon S3) URL for uploading.
  2. Upload the file: Use the presigned URL to upload your file directly to Amazon S3.

Step 1: Get Presigned Upload URL

POST /v1/assets

Request Headers

| Header | Required | Description |
| --- | --- | --- |
| x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
| ts-auth-token | Yes | Authentication token (JWT) |

Query Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| namespace | string | Yes | — | Namespace of the AI workflow |
| aiWorkflow | string | Yes | — | Slug of the AI workflow |
| fileName | string | Yes | — | Name of the file to upload (no path separators allowed) |
| path | string | No | / | Path within the assets folder where the file will be stored |

Example Request

curl -X POST "https://api.tetrascience.com/v1/assets?namespace=common&aiWorkflow=cell-analysis&path=/models&fileName=weights.pkl" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN"

Example Response

{
  "url": "https://s3.amazonaws.com/ai-assets-bucket/path/to/weights.pkl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Signature=...",
  "expiresIn": 86400
}

| Field | Type | Description |
| --- | --- | --- |
| url | string | Presigned S3 URL for uploading the file |
| expiresIn | integer | URL expiration time in seconds (24 hours) |

Step 2: Upload File to Presigned URL

Use the presigned URL from Step 1 to upload your file directly to Amazon S3:

curl -X PUT "$PRESIGNED_URL" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @./weights.pkl

Complete Upload Example

The following example shows the complete two-step process for uploading a model weights file:

# Step 1: Get presigned URL
PRESIGNED_URL=$(curl -s -X POST "https://api.tetrascience.com/v1/assets?namespace=common&aiWorkflow=cell-analysis&path=/models&fileName=model_weights.pkl" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN" | jq -r '.url')

# Step 2: Upload file using presigned URL
curl -X PUT "$PRESIGNED_URL" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @./model_weights.pkl
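
To confirm the upload, you can list the assets directory again (see List Assets above), for example:

# Optional: verify the uploaded file appears in the assets listing
curl -X GET "https://api.tetrascience.com/v1/assets?namespace=common&aiWorkflow=cell-analysis&path=/models" \
  -H "x-org-slug: acme-corp" \
  -H "ts-auth-token: YOUR_JWT_TOKEN"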

Error Responses

| Status Code | Description | Example |
| --- | --- | --- |
| 400 | Bad request - invalid fileName or path | Invalid fileName: must be a filename without path separators |
| 401 | Authentication failed | Invalid or missing authentication token |
| 404 | AI workflow not found | AI workflow 'cell-analysis' not found in namespace 'common' |
| 500 | Internal server error | Failed to list assets or create presigned URL |

Documentation Feedback

Do you have questions about our documentation or suggestions for how we can improve it? Start a discussion in TetraConnect Hub. For access, see Access the TetraConnect Hub.

📘

NOTE

Feedback isn't part of the official TetraScience product documentation. TetraScience doesn't warrant or make any guarantees about the feedback provided, including its accuracy, relevance, or reliability. All feedback is subject to the terms set forth in the TetraConnect Hub Community Guidelines.