Run an Inference (AI Services v1.2.x)
This guide shows how to run inferences on installed Scientific AI Workflows using TetraScience AI Services versions 1.2.x.
Run an Inference
After you've installed a Scientific AI Workflow and have its inference URL, you can run inferences programmatically by doing any of the following:
- Run a real-time (synchronous) inference to invoke the appropriate LLM, AI agent, or custom model to return real-time predictions from text-based inputs.
- Run a batch (asynchronous) inference to provide inputs through JSON files or specify datasets for batch predictions and asynchronous processing.
- Invoke a training notebook to trigger model training, data prefetching, or other custom notebook tasks through the API.
For AI workflows that require large files, such as model weights, configuration files, or reference data, you can also upload AI Workflow assets before running an inference.
You can also manage knowledge bases to create and query vectorized knowledge stores for retrieval-augmented generation (RAG) use cases.
Choosing Between Real-Time and Batch Inference
Use the following table to determine which inference type is appropriate for your use case:
| Consideration | Real-Time Inference | Batch Inference |
|---|---|---|
| Input size | Small — rows, short text, single image | Large files, many files |
| Expected runtime | Less than 60 seconds | Minutes to hours |
| Input is a TDP file UUID | No — send data inline | Yes |
| GxP audit trail | Metrics endpoint available | Full history in System Log |
Prerequisites
To run an inference, the following is required:
- Tetra Data Platform v4.4.1 or later
- TetraScience AI Services activated with at least one Scientific AI Workflow installed in your TDP environment
- A JSON Web Token (JWT) created either from your My Account page for development and testing purposes, or for a Service User account for automation and production setups
Authentication
Every inference API call requires two headers:
ts-auth-token: <your JWT token>
x-org-slug: <your organization slug>
You can locate your organization slug on the Organization Details page.
To get a JWT, do either of the following:
- For development and testing, use the personal token from your My Account page. The token expiration is configurable up to 24 hours.
-or-
- For automation and production setups, create a Service User account and generate a token for it.
IMPORTANT: Never hardcode token values in scripts or commit them to source control. Use environment variables or a secrets manager instead.
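For example, a script can read these values from environment variables and reuse them as request headers. The following is a minimal Python sketch; the TS_AUTH_TOKEN, TS_ORG_SLUG, and TS_GATEWAY_URL variable names are illustrative placeholders, not a platform convention:
import os

# Read credentials from the environment rather than hardcoding them.
TOKEN = os.environ["TS_AUTH_TOKEN"]    # JWT from My Account or a Service User
ORG_SLUG = os.environ["TS_ORG_SLUG"]   # for example, acme-corp
BASE_URL = os.environ.get("TS_GATEWAY_URL", "https://<your-gateway>/ai-platform")

HEADERS = {
    "ts-auth-token": TOKEN,
    "x-org-slug": ORG_SLUG,
    "Content-Type": "application/json",
}
# Every request in the later examples can reuse these headers, for example:
# requests.get(f"{BASE_URL}/v1/workflows", headers=HEADERS)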
Run a Real-Time (Synchronous) Inference
Real-Time (synchronous) inference runs the model immediately and returns results in the same response. Use Real-Time Inference for real-time predictions, chat and conversation workflows, and low-latency requirements (less than 60 seconds). For batch processing, large files, or long-running inference (more than 60 seconds), use a batch (asynchronous) inference instead.
NOTE: Real-Time Inference is only available for AI workflows that have a Model Serving Endpoint configured.
Required Policies
To submit a Real-Time Inference request, users must have one of the following policy permissions:
| Operation | Endpoint | Policies with Required Permissions |
|---|---|---|
| Submit inference request | POST /v1/inference/online | |
| Get inference metrics | GET /v1/inference/online/{requestId}/metrics | |
Real-Time Inference Quickstart
Follow these steps to quickly run a real-time inference (approximately 5 minutes):
Step 1: Verify a Workflow is Installed
curl -s "https://<your-gateway>/ai-platform/v1/workflows" \
-H "ts-auth-token: $TOKEN" -H "x-org-slug: $ORG_SLUG" \
| jq '.[] | select(.status=="installed") | {slug, version}'
Step 2: Prepare Your Input Data
Determine the format your workflow expects (tabular, chat, or image). For example, for a chat-based workflow:
{
"messages": [
{ "role": "user", "content": "Summarize the following clinical note..." }
]
}
Step 3: Submit the Inference Request
curl -s -X POST "https://<your-gateway>/ai-platform/v1/inference/online" \
-H "ts-auth-token: $TOKEN" -H "x-org-slug: $ORG_SLUG" \
-H "Content-Type: application/json" \
-d '{
"aiWorkflow": "your-workflow-slug",
"inferenceInput": {
"messages": [{"role": "user", "content": "Your prompt here"}]
},
"inferenceOptions": {"stream": false, "metrics": true}
}'
Step 4: Process the Response
The response returns immediately:
{
"requestId": "sync_123e4567-e89b-12d3-a456-426614174000",
"status": "success",
"inferenceOutput": { ... },
"timestamp": "2025-09-29T10:30:00Z"
}
NOTE: For streaming responses, set "stream": true and handle Server-Sent Events (SSE) in your client.
Real-Time (Synchronous) Inference API Reference
Endpoint
POST /v1/inference/online
Request Headers
| Header | Required | Description |
|---|---|---|
x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
ts-auth-token | Yes | Authentication token (JWT) |
Content-Type | Yes | Must be application/json |
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
aiWorkflow | string | Yes | The AI workflow identifier (for example, cell-analysis) |
version | string | No | Specific version of the AI workflow (for example, v1.0.0). If not specified, the latest version is used. |
inferenceOptions | object | No | Configuration options for the inference request |
inferenceInput | object | Yes | Input data for the model (format varies by model type) |
Inference Options
| Parameter | Type | Default | Description |
|---|---|---|---|
stream | boolean | false | Enable streaming response mode (Server-Sent Events) |
metrics | boolean | false | Enable metrics collection for this request |
timeout | integer | 300000 | Request timeout in milliseconds (1000-300000, which is 1 second to 5 minutes) |
Inference Input Formats
The inferenceInput field supports multiple formats, depending on the model serving endpoint requirements:
| Format | Description | Example Field |
|---|---|---|
| Tabular (array) | Array of arrays for tabular data | data |
| Tabular (records) | Array of objects with named fields | dataframe_records |
| Images | Base64-encoded image data | images |
| Chat messages | Conversation format with role/content | messages |
Request Examples
Tabular Data (Buffered Response)
{
"aiWorkflow": "lead-clone-selection",
"version": "v2.1.0",
"inferenceInput": {
"dataframe_records": [
{ "feature1": 1.5, "feature2": "category_a", "feature3": 42 },
{ "feature1": 2.3, "feature2": "category_b", "feature3": 37 }
]
},
"inferenceOptions": {
"stream": false,
"metrics": true,
"timeout": 30000
}
}
Chat Conversation (Streaming Response)
{
"aiWorkflow": "coding-assistant",
"version": "v1.0.0",
"inferenceInput": {
"messages": [
{ "role": "user", "content": "Write a Python function to calculate factorial" }
]
},
"inferenceOptions": {
"stream": true,
"timeout": 30000
}
}
Image Analysis
{
"aiWorkflow": "cell-image-analysis",
"version": "v1.5.0",
"inferenceInput": {
"images": [
{ "b64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk..." }
]
},
"inferenceOptions": {
"stream": false,
"metrics": true
}
}
Response
Buffered Response (JSON)
When stream is false (default), the response is returned as a complete JSON object:
| Field | Type | Description |
|---|---|---|
requestId | string | Unique identifier for this inference request (for example, sync_123e4567-e89b-12d3-a456-426614174000) |
status | string | Status of the inference request: success or failed |
inferenceOutput | any | Model output (format varies by model) |
metrics | object | Optional metrics about the inference execution |
timestamp | string | ISO 8601 timestamp of the response |
errorMessage | string | Error message if status is failed |
Example Response (Tabular Predictions)
{
"requestId": "sync_123e4567-e89b-12d3-a456-426614174000",
"status": "success",
"inferenceOutput": {
"predictions": [0.85, 0.72]
},
"metrics": {
"runDuration": 245,
"executionDuration": 245
},
"timestamp": "2025-09-29T10:30:00Z"
}
Streaming Response (Server-Sent Events)
When stream is true, results are streamed as Server-Sent Events (SSE) with Content-Type: text/event-stream:
data: {"chunk": "partial result", "done": false}
data: {"chunk": "more content", "done": false}
data: [DONE]
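A streaming client needs to read the response incrementally instead of waiting for a complete JSON body. The sketch below shows one way to do this with the requests library, assuming the SSE event format shown above; it is illustrative only, and the prompt and workflow slug are placeholders:
import json
import os
import requests

BASE_URL = "https://<your-gateway>/ai-platform"
HEADERS = {
    "ts-auth-token": os.environ["TS_AUTH_TOKEN"],
    "x-org-slug": os.environ["TS_ORG_SLUG"],
    "Content-Type": "application/json",
}

payload = {
    "aiWorkflow": "coding-assistant",
    "inferenceInput": {"messages": [{"role": "user", "content": "Write a Python function to calculate factorial"}]},
    "inferenceOptions": {"stream": True},
}

# stream=True tells requests not to buffer the whole response body.
with requests.post(f"{BASE_URL}/v1/inference/online", headers=HEADERS,
                   json=payload, stream=True, timeout=310) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue                      # skip keep-alive blank lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break                         # end-of-stream sentinel
        chunk = json.loads(data)
        print(chunk.get("chunk", ""), end="", flush=True)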
Error Responses
| Status Code | Description | Example |
|---|---|---|
400 | Bad request - validation failed | AI workflow not found, invalid input format, invalid timeout value |
401 | Authentication failed | Invalid or missing authentication token |
403 | Access denied | Organization doesn't have access, insufficient role permissions |
404 | AI workflow not found | Specified workflow doesn't exist |
422 | Unprocessable entity | Model returned validation error for input data |
500 | Internal server error | Model serving endpoint failed, inference timeout |
503 | Service unavailable | Model serving endpoint is temporarily unavailable |
Get Inference Metrics
To retrieve metrics for a completed Real-Time Inference request, call the following endpoint.
Endpoint
GET /v1/inference/online/{requestId}/metrics
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
requestId | string | Yes | The Real-Time Inference request ID (for example, sync_123e4567-e89b-12d3-a456-426614174000) |
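For example, a metrics lookup for a completed request might be scripted as follows. This is a minimal Python sketch using the requests library; the gateway URL and environment variable names match the earlier sketches and are placeholders:
import os
import requests

BASE_URL = "https://<your-gateway>/ai-platform"
HEADERS = {
    "ts-auth-token": os.environ["TS_AUTH_TOKEN"],
    "x-org-slug": os.environ["TS_ORG_SLUG"],
}

request_id = "sync_123e4567-e89b-12d3-a456-426614174000"  # from the inference response

# Fetch execution metrics for a completed Real-Time Inference request.
resp = requests.get(f"{BASE_URL}/v1/inference/online/{request_id}/metrics",
                    headers=HEADERS, timeout=30)
resp.raise_for_status()
metrics = resp.json()
print(metrics["status"], metrics["processingTime"], "ms")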
Response
| Field | Type | Description |
|---|---|---|
requestId | string | Unique identifier for the inference request |
aiWorkflow | string | AI workflow identifier |
organization | string | Organization slug |
status | string | Status: success, failed, or timeout |
processingTime | number | Time taken to process the request (milliseconds) |
timestamp | string | ISO 8601 timestamp |
inputSize | integer | Size of input data in bytes |
errorMessage | string | Error message if inference failed |
Example Metrics Response
{
"requestId": "sync_123e4567-e89b-12d3-a456-426614174000",
"aiWorkflow": "lead-clone-selection:v2.1.0",
"organization": "acme-corp",
"status": "success",
"processingTime": 1245,
"timestamp": "2025-09-29T10:30:00Z",
"inputSize": 2048
}
Run a Batch (Asynchronous) Inference
Batch (asynchronous) inference submits file-based requests for asynchronous processing. Use Batch Inference for processing large files or datasets, for tasks expected to take more than 60 seconds, and when files are already uploaded to the platform. For real-time predictions or low-latency requirements, use a real-time (synchronous) inference instead.
NOTE: Keep in mind the following when running Batch Inferences:
- Batch Inference supports partial success, where optional files can fail without blocking the inference job. Critical files (primary data files) must succeed for inference to proceed.
- Batch Inference can be automated using the ts-inference-protocol. Configure a file processing pipeline with this protocol to automatically trigger inference requests when new files are uploaded to the platform. The protocol accepts configuration for the inferenceUrl, aiWorkflow, and authentication token, and then automatically submits matching files for inference processing.
Required Policies
To submit a Batch Inference request, you must have one of the following policy permissions:
| Operation | Endpoint | Policies with Required Permissions |
|---|---|---|
| Submit inference request | POST /v1/inference | |
| Get inference status | GET /v1/inference/{inferenceId}/status | |
Batch Inference Quickstart
Follow these steps to quickly run your first batch inference (approximately 10 minutes):
Step 1: Verify a Workflow is Installed
curl -s "https://<your-gateway>/ai-platform/v1/workflows" \
-H "ts-auth-token: $TOKEN" -H "x-org-slug: $ORG_SLUG" \
| jq '.[] | select(.status=="installed") | {slug, version}'
Step 2: Find a File UUID in TDP
Navigate to a file in the TDP, open its Details view, and then copy the UUID from the URL or metadata panel.
Example file UUID: 550e8400-e29b-41d4-a716-446655440000
Step 3: Submit the Inference Request
RESPONSE=$(curl -s -X POST "https://<your-gateway>/ai-platform/v1/inference" \
-H "ts-auth-token: $TOKEN" -H "x-org-slug: $ORG_SLUG" \
-H "Content-Type: application/json" \
-d '{"aiWorkflow": "your-workflow-slug", "inputFiles": [{"fileUuid": "550e8400-e29b-41d4-a716-446655440000", "role": "file"}]}')
REQUEST_ID=$(echo $RESPONSE | jq -r '.requestId')
echo "Request ID: $REQUEST_ID"Step 4: Poll Until Complete
while true; do
STATUS=$(curl -s "https://<your-gateway>/ai-platform/v1/inference/$REQUEST_ID/status" \
-H "ts-auth-token: $TOKEN" -H "x-org-slug: $ORG_SLUG" | jq -r '.status')
echo "Status: $STATUS"
if [[ "$STATUS" == "completed" || "$STATUS" == "failed" ]]; then break; fi
sleep 10
done
Step 5: Download Results
curl -s "https://<your-gateway>/ai-platform/v1/inference/$REQUEST_ID/status?includeFiles=true" \
-H "ts-auth-token: $TOKEN" -H "x-org-slug: $ORG_SLUG" \
| jq '.outputFiles[] | {name: .fileName, url: .preSignedUrl}'
NOTE: When running batch inferences, keep in mind the following:
- Pre-signed URLs expire in approximately 1 hour. Download them immediately after retrieval.
- Batch inference requests are logged in the System Log as an AI Workflow. Retain the requestId from every API call for audit linkage.
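The full submit-and-poll flow from the quickstart can also be scripted end to end. The following Python sketch assumes the requests library and the same placeholder gateway URL, environment variables, workflow slug, and file UUID used above:
import os
import time
import requests

BASE_URL = "https://<your-gateway>/ai-platform"
HEADERS = {
    "ts-auth-token": os.environ["TS_AUTH_TOKEN"],
    "x-org-slug": os.environ["TS_ORG_SLUG"],
    "Content-Type": "application/json",
}

# Step 1: submit the batch inference request with one input file.
payload = {
    "aiWorkflow": "your-workflow-slug",
    "inputFiles": [{"fileUuid": "550e8400-e29b-41d4-a716-446655440000", "role": "file"}],
}
resp = requests.post(f"{BASE_URL}/v1/inference", headers=HEADERS, json=payload, timeout=60)
resp.raise_for_status()
request_id = resp.json()["requestId"]

# Step 2: poll the status endpoint until the request completes or fails.
while True:
    status_resp = requests.get(f"{BASE_URL}/v1/inference/{request_id}/status",
                               headers=HEADERS, timeout=30)
    status_resp.raise_for_status()
    status = status_resp.json()["status"]
    print("Status:", status)
    if status in ("completed", "failed"):
        break
    time.sleep(10)

# Step 3: list output files; pre-signed URLs expire quickly, so download promptly.
files_resp = requests.get(f"{BASE_URL}/v1/inference/{request_id}/status",
                          headers=HEADERS, params={"includeFiles": "true"}, timeout=30)
files_resp.raise_for_status()
for f in files_resp.json().get("outputFiles", []):
    print(f["fileName"], f["preSignedUrl"])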
Run a Batch (Asynchronous) Inference API Reference
Endpoint
POST /v1/inference
Request Headers
| Header | Required | Description |
|---|---|---|
x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
ts-auth-token | Yes | Authentication token (JWT) |
Content-Type | Yes | Must be application/json |
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
aiWorkflow | string | Yes | The AI workflow identifier (for example, cell-analysis). You can get a list of available AI workflows using the GET /v1/install endpoint. |
version | string | No | Specific version of the AI workflow (for example, v1.0.0). If not specified, the latest version is used. |
inputFiles | array | Yes | List of input files for inference processing (minimum 1 file) |
Input File Object
Each file in the inputFiles array must include:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
fileUuid | string | Yes | — | UUID of the file in TDP |
role | string | No | file | Role of the file in processing: file, instructions, or parameters |
processingOrder | integer | No | 1 | Order in which file should be processed (minimum 1) |
dependencies | array | No | [] | List of file UUIDs this file depends on |
Request Examples
Basic Inference Request
{
"aiWorkflow": "cell-analysis",
"inputFiles": [
{
"fileUuid": "550e8400-e29b-41d4-a716-446655440000",
"role": "file",
"processingOrder": 1,
"dependencies": []
}
]
}
Multi-File Inference with Dependencies
{
"aiWorkflow": "lead-clone-selection",
"version": "v2.1.0",
"inputFiles": [
{
"fileUuid": "550e8400-e29b-41d4-a716-446655440001",
"role": "file",
"processingOrder": 1,
"dependencies": []
},
{
"fileUuid": "550e8400-e29b-41d4-a716-446655440002",
"role": "instructions",
"processingOrder": 2,
"dependencies": ["550e8400-e29b-41d4-a716-446655440001"]
},
{
"fileUuid": "550e8400-e29b-41d4-a716-446655440003",
"role": "parameters",
"processingOrder": 3,
"dependencies": []
}
]
}
Response
The API returns immediately with a 202 Accepted status and provides a request ID for tracking progress.
| Field | Type | Description |
|---|---|---|
requestId | string | Unique identifier for the inference request (for example, req-20250929-abc123def456) |
statusUrl | string | URL to check the status of the inference request |
partialSuccess | boolean | Indicates if some optional files failed to stage but inference can still proceed |
stagingSummary | object | Detailed breakdown of file staging results (only present for partial success) |
Example Response (All Files Staged Successfully)
{
"requestId": "req-20250929-abc123def456",
"statusUrl": "/v1/inference/req-20250929-abc123def456"
}
Example Response (Partial Staging Success)
{
"requestId": "req-20250929-abc123def456",
"statusUrl": "/v1/inference/req-20250929-abc123def456",
"partialSuccess": true,
"stagingSummary": {
"totalFiles": 3,
"successfulFiles": 2,
"failedFiles": 1,
"criticalFailures": 0,
"optionalFailures": 1,
"warnings": ["Optional file failed to stage: metadata.json (Access denied)"]
}
}
Error Responses
| Status Code | Description | Example |
|---|---|---|
400 | Bad request - validation failed | Invalid AI workflow, no valid files, critical files failed to stage |
401 | Authentication failed | Invalid or missing authentication token |
403 | Access denied | No access to AI workflow, file doesn't belong to organization |
404 | Resource not found | AI workflow or files not found |
409 | Conflict | AI workflow is disabled or inactive |
422 | Unprocessable entity | Invalid file format or empty files |
500 | Internal server error | Staging, queue, or database service errors |
503 | Service unavailable | FileInfo service timeout or external dependency unavailable |
Get Inference Status
To check the status of a Batch Inference request, call the following endpoint.
Endpoint
GET /v1/inference/{inferenceId}/status
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
inferenceId | string | Yes | The inference request ID |
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
includeFiles | boolean | false | Include detailed file information in the response |
Response
| Field | Type | Description |
|---|---|---|
requestId | string | Unique identifier for the inference request |
dbxRunId | number | Databricks job run ID (when processing has started) |
status | string | Status: pending, processing, completed, or failed |
aiWorkflow | string | AI workflow identifier |
organization | string | Organization slug |
user | string | User who submitted the request |
createdAt | string | ISO 8601 timestamp when request was created |
updatedAt | string | ISO 8601 timestamp when request was last updated |
completedAt | string | ISO 8601 timestamp when inference completed (only for completed/failed) |
inputFiles | array | Original input file specifications |
stagingLocation | object | S3 staging location information |
outputLocation | object | S3 output location information |
manifestPath | string | S3 path to the processing manifest |
errorMessage | string | Error message if inference failed |
partialSuccess | boolean | Whether this request had partial staging success |
stagingSummary | object | Summary of file staging results |
files | array | Detailed file information (only if includeFiles=true) |
Example Status Response (Processing)
{
"requestId": "req-20250929-abc123def456",
"status": "processing",
"aiWorkflow": "cell-analysis",
"organization": "acme-corp",
"user": "user123",
"createdAt": "2025-09-29T10:30:00Z",
"updatedAt": "2025-09-29T10:35:00Z",
"stagingLocation": {
"bucket": "inference-staging-bucket",
"prefix": "tenant=acme/org=acme-corp/2025/09/29/req-20250929-abc123def456/",
"fullPath": "s3://inference-staging-bucket/tenant=acme/org=acme-corp/2025/09/29/req-20250929-abc123def456/"
},
"outputLocation": {
"bucket": "inference-output-bucket",
"prefix": "tenants/acme/orgs/acme_corp/schemas/ai_assets/2025/09/29/req-20250929-abc123def456/",
"fullPath": "s3://inference-output-bucket/tenants/acme/orgs/acme_corp/schemas/ai_assets/2025/09/29/req-20250929-abc123def456/"
}
}
Example Status Response (Completed)
{
"requestId": "req-20250929-abc123def456",
"status": "completed",
"aiWorkflow": "cell-analysis",
"organization": "acme-corp",
"user": "user123",
"createdAt": "2025-09-29T10:30:00Z",
"updatedAt": "2025-09-29T10:45:00Z",
"completedAt": "2025-09-29T10:45:00Z"
}
Upload AI Workflow Assets
Some AI workflows require large files, such as model weights, configuration files, or reference data, to be uploaded before inference can be performed. The Assets API provides endpoints for listing existing assets and uploading new files to an AI workflow's assets directory.
Use the Assets API when you need to do any of the following:
- Upload model weights or checkpoints for inference notebooks
- Provide configuration files that customize workflow behavior
- Store reference datasets used during inference
- Upload any large files that cannot be passed inline with inference requests
Required Policies
To manage AI workflow assets, users must have one of the following platform roles:
| Operation | Endpoint | Policies with Required Permissions |
|---|---|---|
| List assets | GET /v1/assets | |
| Upload assets | POST /v1/assets | |
List Assets
To retrieve a hierarchical listing of files and folders in an AI workflow's assets directory, call the following endpoint.
Endpoint
GET /v1/assets
Request Headers
| Header | Required | Description |
|---|---|---|
x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
ts-auth-token | Yes | Authentication token (JWT) |
Query Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
namespace | string | Yes | — | Namespace of the AI workflow (for example, common) |
aiWorkflow | string | Yes | — | Slug of the AI workflow (for example, cell-analysis) |
path | string | No | / | Path within the assets folder to list |
Example Request
curl -X GET "https://api.tetrascience.com/v1/assets?namespace=common&aiWorkflow=cell-analysis&path=/models" \
-H "x-org-slug: acme-corp" \
-H "ts-auth-token: YOUR_JWT_TOKEN"Example Response
{
"name": "models",
"type": "folder",
"path": "/models",
"children": [
{
"name": "weights.pkl",
"type": "file",
"path": "/models/weights.pkl",
"size": 104857600,
"lastModified": "2025-09-29T10:30:00.000Z"
},
{
"name": "config.json",
"type": "file",
"path": "/models/config.json",
"size": 1024,
"lastModified": "2025-09-29T09:15:00.000Z"
}
]
}
Upload Assets
Uploading files to an AI workflow's assets directory is a two-step process:
1. Get a presigned URL: Call the Assets API to get a presigned Amazon Simple Storage Service (Amazon S3) URL for uploading.
2. Upload the file: Use the presigned URL to upload your file directly to Amazon S3.
Multipart Upload Support: For large files (such as model weights or knowledge base files), the AI Asset Files API supports multipart uploads with pre-signed URLs. This enables reliable upload of files that exceed single-request size limits. Orphaned upload sessions are automatically recovered to prevent organizations from being blocked.
Step 1: Get Presigned Upload URL
POST /v1/assets
Request Headers
| Header | Required | Description |
|---|---|---|
x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
ts-auth-token | Yes | Authentication token (JWT) |
Query Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
namespace | string | Yes | — | Namespace of the AI workflow |
aiWorkflow | string | Yes | — | Slug of the AI workflow |
fileName | string | Yes | — | Name of the file to upload (no path separators allowed) |
path | string | No | / | Path within the assets folder where the file will be stored |
Example Request
curl -X POST "https://api.tetrascience.com/v1/assets?namespace=common&aiWorkflow=cell-analysis&path=/models&fileName=weights.pkl" \
-H "x-org-slug: acme-corp" \
-H "ts-auth-token: YOUR_JWT_TOKEN"Example Response
{
"url": "https://s3.amazonaws.com/ai-assets-bucket/path/to/weights.pkl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Signature=...",
"expiresIn": 86400
}
| Field | Type | Description |
|---|---|---|
url | string | Presigned S3 URL for uploading the file |
expiresIn | integer | URL expiration time in seconds (24 hours) |
Step 2: Upload File to Presigned URL
Use the presigned URL from Step 1 to upload your file directly to Amazon S3:
curl -X PUT "$PRESIGNED_URL" \
-H "Content-Type: application/octet-stream" \
--data-binary @./weights.pkl
Complete Upload Example
The following example shows the complete two-step process for uploading a model weights file:
# Step 1: Get presigned URL
PRESIGNED_URL=$(curl -s -X POST "https://api.tetrascience.com/v1/assets?namespace=common&aiWorkflow=cell-analysis&path=/models&fileName=model_weights.pkl" \
-H "x-org-slug: acme-corp" \
-H "ts-auth-token: YOUR_JWT_TOKEN" | jq -r '.url')
# Step 2: Upload file using presigned URL
curl -X PUT "$PRESIGNED_URL" \
-H "Content-Type: application/octet-stream" \
--data-binary @./model_weights.pkl
Error Responses
| Status Code | Description | Example |
|---|---|---|
400 | Bad request - invalid fileName or path | Invalid fileName: must be a filename without path separators |
401 | Authentication failed | Invalid or missing authentication token |
404 | AI workflow not found | AI workflow 'cell-analysis' not found in namespace 'common' |
500 | Internal server error | Failed to list assets or create presigned URL |
Supported File Types
TetraScience AI Services supports the following file types:
- Images: JPEG, PNG, GIF, WebP
- Documents: PDF, CSV, JSON
- Compressed: GZ, GZIP
- Any other binary or text files your AI workflow can process
Each Scientific AI Workflow supports different file types, which are documented in each AI Workflow's README.md file.
Invoke a Training Notebook
TetraScience AI Services v1.2.0 introduces the ability to invoke any training notebook within an AI Workflow directly through the API. This enables model training, data prefetching, and other custom notebook tasks without leaving the platform.
Invoke a Training Notebook Quickstart
Follow these steps to quickly invoke a notebook (approximately 5 minutes):
Step 1: Verify a Workflow is Installed
curl -s "https://<your-gateway>/ai-platform/v1/workflows" \
-H "ts-auth-token: $TOKEN" -H "x-org-slug: $ORG_SLUG" \
| jq '.[] | select(.status=="installed") | {slug, version}'
Step 2: Identify the Notebook Path
Review the AI Workflow's README or source to find the notebook path you want to invoke. For example, a training notebook might be located at training/train_model.
Step 3: Submit the Invoke Request
curl -s -X POST "https://<your-gateway>/ai-platform/v1/inference/invoke/training/train_model" \
-H "ts-auth-token: $TOKEN" -H "x-org-slug: $ORG_SLUG" \
-H "Content-Type: application/json" \
-d '{
"aiWorkflow": "molecule-property-predictor",
"version": "v2.0.0",
"parameters": {
"epochs": 100,
"learning_rate": 0.001,
"batch_size": 32
}
}'
Step 4: Process the Response
The notebook execution begins and the API returns a request ID for tracking:
{
"requestId": "invoke_123e4567-e89b-12d3-a456-426614174000",
"status": "accepted",
"timestamp": "2025-09-29T10:30:00Z"
}
NOTE: The wildcard (*) path in /v1/inference/invoke/* maps to the notebook path within the AI Workflow. You can also reference S3 input files by including an inputFiles array in the request body.
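As a Python alternative to the curl call above, the invocation might be scripted like this. It is a sketch using the requests library; the notebook path, workflow slug, and parameter names come from the example above and would differ for your workflow:
import os
import requests

BASE_URL = "https://<your-gateway>/ai-platform"
HEADERS = {
    "ts-auth-token": os.environ["TS_AUTH_TOKEN"],
    "x-org-slug": os.environ["TS_ORG_SLUG"],
    "Content-Type": "application/json",
}

# The path after /invoke/ maps to the notebook path inside the AI Workflow.
notebook_path = "training/train_model"
payload = {
    "aiWorkflow": "molecule-property-predictor",
    "version": "v2.0.0",
    "parameters": {"epochs": 100, "learning_rate": 0.001, "batch_size": 32},
}

resp = requests.post(f"{BASE_URL}/v1/inference/invoke/{notebook_path}",
                     headers=HEADERS, json=payload, timeout=60)
resp.raise_for_status()
result = resp.json()
print(result["requestId"], result["status"])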
Invoke a Training Notebook API Reference
Endpoint
POST /v1/inference/invoke/*
The wildcard (*) path maps to the notebook path within the AI Workflow. For example, POST /v1/inference/invoke/training/train_model invokes the training/train_model notebook.
Request Headers
| Header | Required | Description |
|---|---|---|
x-org-slug | Yes | Organization slug identifier (for example, acme-corp) |
ts-auth-token | Yes | Authentication token (JWT) |
Content-Type | Yes | Must be application/json |
Request Body
The invoke endpoint accepts an arbitrary JSON payload, which is passed directly to the target notebook. You can also reference S3 input files using file IDs.
Request Example
{
"aiWorkflow": "molecule-property-predictor",
"version": "v2.0.0",
"parameters": {
"epochs": 100,
"learning_rate": 0.001,
"batch_size": 32
},
"inputFiles": [
{
"fileId": "550e8400-e29b-41d4-a716-446655440000"
}
]
}
Required Policies
To invoke a training notebook, users must have one of the following policy permissions:
Manage Knowledge Bases
TetraScience AI Services v1.2.0 supports the creation and management of vectorized knowledge bases, powered by Databricks Vector Search. This enables AI use cases that require retrieval-augmented generation (RAG) and semantic search over enterprise knowledge bases.
Create a Vector Store
Create a new vector store scoped to your organization:
POST /v1/vector-stores
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
name | string | Yes | Name of the vector store |
description | string | No | Description of the vector store's purpose |
aiWorkflow | string | Yes | The AI workflow identifier associated with this knowledge base |
Request Example
{
"name": "research-papers-kb",
"description": "Knowledge base for published research papers",
"aiWorkflow": "research-assistant"
}
Upload Knowledge Base Files
Upload content to a vector store. Files are automatically vectorized and indexed:
POST /v1/vector-stores/{vectorStoreId}/files
Use the AI Asset Files API with multipart upload support for large knowledge base files.
Query a Vector Store
Query a vector store using natural language text:
POST /v1/vector-stores/{vectorStoreId}/query
Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
query | string | Yes | — | Natural language query text |
maxResults | integer | No | 10 | Maximum number of results to return |
Request Example
{
"query": "What are the optimal conditions for cell culture growth?",
"maxResults": 5
}
Required Policies
To manage knowledge bases and vector stores, users must have one of the following policy permissions:
Error Handling Best Practices
When integrating with AI Services, use the following guidance for handling errors:
| Code | Meaning | Recommended Action |
|---|---|---|
400 | Malformed payload | Check JSON syntax; verify aiWorkflow and fileUuid are present |
401 | Token invalid/expired | Generate a new token from My Account → API Tokens |
403 | Insufficient permissions | Ask Org Admin to grant ML Engineer or Developer policy |
404 | Workflow not found | Check slug spelling; confirm workflow is installed for your org |
409 | Workflow disabled | Re-activate the version in AI Services UI under Actions |
500/503 | Transient error | Retry with exponential backoff; include requestId in support ticket if persistent |
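For transient 500/503 errors, a simple exponential backoff wrapper is often enough. The following sketch shows one illustrative approach in Python using the requests library; the retry counts and delays are arbitrary starting points, not platform recommendations:
import time
import requests

def post_with_retries(url, headers, payload, max_attempts=5, base_delay=2.0):
    """POST with exponential backoff on transient 5xx errors."""
    for attempt in range(1, max_attempts + 1):
        resp = requests.post(url, headers=headers, json=payload, timeout=60)
        if resp.status_code not in (500, 503):
            resp.raise_for_status()        # surface 4xx errors immediately
            return resp.json()
        if attempt == max_attempts:
            resp.raise_for_status()        # give up after the final attempt
        delay = base_delay * (2 ** (attempt - 1))
        print(f"Transient error {resp.status_code}; retrying in {delay:.0f}s "
              f"(attempt {attempt}/{max_attempts})")
        time.sleep(delay)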
GxP NOTE: Every inference is logged in the TDP System Log (Artifacts → System Log, Artifact Type: AI Workflow). Retain the requestId from every API call for audit linkage.
Automate Inference with Pipelines
Use the ts-inference-protocol task script to wire a TDP pipeline step to AI Services. New files trigger inference automatically, and results are written back to TDP. Example minimal configuration:
{
"aiWorkflow": "ms-peak-picking",
"version": "v2.1.0",
"inputRole": "file",
"waitForCompletion": true,
"outputTag": "ai-results"
}
For more information about configuring pipelines, see Set Up and Edit Pipelines.
Python LLM Wrapper for Task Scripts
For Python-based task scripts, you can use the ts_ai_services_utils library to call Databricks-hosted LLMs:
from ts_ai_services_utils.llm import LLMClient
client = LLMClient(
    ai_workflow="clinical-doc-extractor",
    version="v1.0.0",
    gateway_url="https://<gateway>",
    org_slug="<org>",
    auth_token="<token>"
)
# Buffered response
response = client.complete(messages=[
    {"role": "system", "content": "Extract adverse events as JSON."},
    {"role": "user", "content": "Patient presented with Grade 2 nausea..."}
])
print(response.content)
# Streaming response
for chunk in client.stream(messages=[{"role": "user", "content": "Summarise: ..."}]):
    print(chunk, end="", flush=True)
Documentation Feedback
Do you have questions about our documentation or suggestions for how we can improve it? Start a discussion in TetraConnect Hub. For access, see Access the TetraConnect Hub.
NOTEFeedback isn't part of the official TetraScience product documentation. TetraScience doesn't warrant or make any guarantees about the feedback provided, including its accuracy, relevance, or reliability. All feedback is subject to the terms set forth in the TetraConnect Hub Community Guidelines.