IDS Conventions - elasticsearch.json

❗️
Location Changed
The content of this page has been moved to TetraScience Confluence and it's not public-facing anymore. We will revise this page once self-service IDS is available.

The elasticsearch.json file is a component of the IDS artifact, and is used to define:

IDS attributes that will be of type nested in the Elasticsearch document
IDS attributes that will not be included in the Elasticsearch document
Additional dynamic mappings that can be included

How to Generate the File

TetraScience has created an elasticsearch.json generating script. Click here for our public repository.
To create elasticsearch.json, run the generator on your schema.json. By default, all arrays of objects are created as type nested, and datacubes are excluded from search. To customize after generation, you can:

Remove the properties from the mapping for those properties that you do not want to be of type nested.
Add paths to the "nonSearchableFields" for those fields you want to exclude from the Elasticsearch document.
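The default behavior described above (arrays of objects become nested, datacubes are excluded from search) can be sketched as follows. This is an illustrative sketch only, not the TetraScience generator script: the function names `build_es_config` and `nested_properties` are hypothetical, and the sketch assumes schema.json is a plain JSON Schema whose `type` is either a string or a list of strings.

```python
import json


def _has_type(node: dict, name: str) -> bool:
    # JSON Schema allows "type" to be a string or a list of strings.
    declared = node.get("type")
    return declared == name or (isinstance(declared, list) and name in declared)


def nested_properties(node: dict, skip=()) -> dict:
    """Collect a mapping entry for every array of objects below `node`."""
    props = {}
    for name, sub in node.get("properties", {}).items():
        if name in skip:
            continue
        items = sub.get("items")
        if _has_type(sub, "array") and isinstance(items, dict) and _has_type(items, "object"):
            entry = {"type": "nested"}
            children = nested_properties(items)
            if children:
                entry["properties"] = children
            props[name] = entry
        elif _has_type(sub, "object"):
            # Plain objects are not nested, but may contain arrays of objects.
            children = nested_properties(sub)
            if children:
                props[name] = {"properties": children}
    return props


def build_es_config(schema: dict) -> dict:
    """Hypothetical helper: build an elasticsearch.json-style dict."""
    has_datacubes = "datacubes" in schema.get("properties", {})
    non_searchable = ["datacubes"] if has_datacubes else []
    return {
        "mapping": {
            "properties": nested_properties(schema, skip=non_searchable),
            "dynamic_templates": [],
        },
        "nonSearchableFields": non_searchable,
    }


# Minimal example schema (an assumption made for illustration).
peak = {"type": "object", "properties": {"number": {"type": "number"}, "name": {"type": "string"}}}
schema = {
    "type": "object",
    "properties": {
        "results": {
            "type": "array",
            "items": {"type": "object", "properties": {"peaks": {"type": "array", "items": peak}}},
        },
        "datacubes": {"type": "array", "items": {"type": "object"}},
    },
}

config = build_es_config(schema)
print(json.dumps(config, indent=2))
```

After generation, you would remove entries from `config["mapping"]["properties"]` or append paths to `config["nonSearchableFields"]` to customize the result.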

Platform Requirements

#	Rule	Checked by IDS Validator
1	`datacubes`, if defined in schema.json, should be in `nonSearchableFields`	Yes
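The platform requirement above can be checked with a few lines of code. The following is a minimal sketch, not the IDS Validator itself; the function name `check_datacubes_rule` and the error message are hypothetical.

```python
def check_datacubes_rule(schema: dict, es_config: dict) -> list:
    """Return a list of error messages; an empty list means the rule passes."""
    errors = []
    defines_datacubes = "datacubes" in schema.get("properties", {})
    excluded = es_config.get("nonSearchableFields", [])
    if defines_datacubes and "datacubes" not in excluded:
        errors.append("datacubes is defined in schema.json but missing from nonSearchableFields")
    return errors


# Example inputs (assumptions made for illustration).
schema = {"properties": {"datacubes": {"type": "array"}, "results": {"type": "array"}}}
bad_config = {"mapping": {"properties": {}, "dynamic_templates": []}, "nonSearchableFields": []}
good_config = {"mapping": {"properties": {}, "dynamic_templates": []}, "nonSearchableFields": ["datacubes"]}

print(check_datacubes_rule(schema, bad_config))
print(check_datacubes_rule(schema, good_config))
```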

TS Convention

#	Rule	Checked by IDS Validator (only for IDS designed by TS)
1	All arrays of objects in schema.json should be defined as `nested`	Yes, warning if not followed

In-Depth Explanation

The elasticsearch.json file is composed of three main parts:

IDS attributes that will be of type "nested" in the Elasticsearch document
IDS attributes that will not be included in the Elasticsearch document
Additional dynamic mappings that can be included

For an example elasticsearch.json, see the following nested example:

Nested - Example

If you have the following expected.json:

{
  "results": [{
    "peaks": [{
      "number": 1,
      "name": "a" 
    },{
      "number": 2,
      "name": "b"
    }]
  }],
  "my_really_long_array": [{
    "foo": "hello",
    "bar": "world"
  }]
}

Then, your elasticsearch.json should look like the following:

{
  "mapping": {
    "properties": {
      "results": {
        "type": "nested",
        "properties": {
          "peaks": {
            "type": "nested"
          }
        }
      }
    },
    "dynamic_templates": []
  },
  "nonSearchableFields": ["my_really_long_array"]
}

Mapping and Defining `nested` Attributes

Click here for the Elasticsearch official documentation describing the nested type.

You should define all arrays of objects in schema.json as nested. Use the "mapping" property of elasticsearch.json to define nested types. To see how properties within an array of objects are assigned as nested in elasticsearch.json, review the previous example.
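The reason arrays of objects are usually defined as nested is that, without the nested type, Elasticsearch flattens the array so that the link between values in the same object is lost. The sketch below simulates that behavior in plain Python using the `peaks` data from the previous example. It does not call Elasticsearch, and the function names are hypothetical.

```python
peaks = [
    {"number": 1, "name": "a"},
    {"number": 2, "name": "b"},
]


def matches_flattened(objects: list, criteria: dict) -> bool:
    # Without "nested", values are indexed per field across all objects:
    # {"number": [1, 2], "name": ["a", "b"]}
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return all(value in flat.get(key, []) for key, value in criteria.items())


def matches_nested(objects: list, criteria: dict) -> bool:
    # With "nested", every object is matched on its own.
    return any(
        all(obj.get(key) == value for key, value in criteria.items())
        for obj in objects
    )


# No single peak has number 1 and name "b".
criteria = {"number": 1, "name": "b"}
print(matches_flattened(peaks, criteria))  # True: a false positive
print(matches_nested(peaks, criteria))     # False: the expected answer
```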

Do not define a field as nested if any of these conditions apply:

Only a few users search the data, so nested types may not be needed
The array contains only one item
You need to save storage space in Elasticsearch
You rely on ignore_malformed, which does not work for the nested data type
You rely on Elasticsearch query string queries, which do not work for the nested data type

`nonSearchableFields`

Fields defined here will not be indexed into Elasticsearch. Exclude a field when:

You will most likely not search this field
You want to save storage space
You want Elasticsearch indexing to be faster
Your file is larger than 100 MB. Elasticsearch indexing currently has a 100 MB limit.

nonSearchableFields uses lodash omit to exclude fields. Specify paths using the official lodash syntax: https://lodash.com/docs/4.17.15#omit
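To illustrate the effect of path-based exclusion, here is a simplified Python emulation of omitting paths from a document. It is not lodash and not the platform's implementation: the function name `omit_paths` is hypothetical, and the sketch only handles dot-separated object keys such as `a.b`, not the full lodash path syntax.

```python
import copy


def omit_paths(document: dict, paths: list) -> dict:
    """Return a copy of `document` without the given dot-separated paths."""
    result = copy.deepcopy(document)  # leave the input untouched
    for path in paths:
        keys = path.split(".")
        parent = result
        # Walk down to the object that holds the last key of the path.
        for key in keys[:-1]:
            parent = parent.get(key) if isinstance(parent, dict) else None
            if parent is None:
                break
        if isinstance(parent, dict):
            parent.pop(keys[-1], None)
    return result


document = {
    "results": [{"peaks": [{"number": 1, "name": "a"}]}],
    "my_really_long_array": [{"foo": "hello", "bar": "world"}],
    "system": {"raw": "large blob", "name": "instrument"},  # hypothetical field
}

searchable = omit_paths(document, ["my_really_long_array", "system.raw"])
print(sorted(searchable))
print(searchable["system"])
```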

Dynamic Mapping

It is theoretically possible to include a dynamic type mapping based on either path pattern or data type. Currently, this feature is not used extensively. For details, please contact your Customer Success representative.
