ElasticSearch field with different types in one single index - elasticsearch

We have a scenario in a service that accepts multiple types of data and we want to store in ElasticSearch so we can benefit from its search capabilities.
Data could be a String, Number, Object or an Array of objects as the following:
POST my-index/_doc/1
{
"additionalData": [
{
"values": {
"some-field": "some-value",
"some-other-field": "some-value"
}
}
]
}
POST my-index/_doc/1
{
"additionalData": [
{
"values": [12345, 9875]
}
]
}
POST my-index/_doc/1
{
"additionalData": [
{
"values": "Some text"
}
]
}
Is there a way to store that in elasticSearch? or better to store in other NoSQL Databases like Mongodb?
PS: we are using Es 7.x, and would like to keep using ES.

If you don't need to search on those values, it's possible with a disabled field (i.e. not indexed, not stored)
However, if you want to search on those value, it's not possible. Each field must have a specific type (object, numeric, text, etc) and then you can only store values of that type in the field.

Related

How to enrich data in elastic search when the data to be enriched is inside an array

I have used information from the below link to create a pipeline for enriching the data using lookup from other indices.
Enriching the Data in Elastic Search
The problem that I am facing is :
my payload has this structure:
{
field1: value1,
field2:value2,
field3[
{
field3.1.1: value1,
field3.1.2: value2
},
{
field3.2.1: value1,
field3.2.2: value2
}
]
}
I created ingest pipeline for this and I am able to enrich the data correctly at the parent level, i.e. field1, field2.
However since field3 is an array element, enrichment doesn't work straight away. So I applied foreach processor in the pipeline.
and the processor of the pipeline is enrich processor.
PUT _ingest/pipeline/test-data-lookup
{
"processors": [
{
"foreach": {
"field": "field3",
"processor": {
"enrich": {
"policy_name": "field3-policy",
"field": "_ingest._value.field3.1.1",
"target_field": "{{{_ingest._value.field3.1.1}}}"
}
}
}
}
]
}
the target field is generated correctly however if I have to set field3.1.1 with the look up value defined in target_field. I have to use set processor like this.
{
"set": {
"if": "_ingest._value.field3.1.1 != null",
"field": "field1",
"value": "{{_ingest._value.field3.1.1.codevalue}}"
}
}
The problem is if condition here doesn't like _ingest._value and so gives compilation error, and because of this I am not able to compare the value of the target field with the incoming value and so all the elements of the array end up having the same codevalue.
I am new to elastic and have read almost all the documentation that I am able to understand right now. What I am trying to do, is it even possible or not?

Elasticsearch nested objects with query_string as first class attributes

I'm trying to index a nested field as a first-class attribute in my document so that I can search them using query_string without dot syntax.
For example, if I have a document like
"data": { "name": "Bob" }
instead of searching for data.name:Bob I would like to be able to search for name:Bob
The root of my issue is that we index a jsonb column that may have varying attributes. In some instances the data property may contain a data.business attribute, etc. I would like users to be able to search on these attributes without needing to "dig" into the object.
The data field does not have to be indexed as a nested type unless necessary; I was indexing it as an object previously.
I have tried to leverage the _all field as suggested in this post.
I have also tried to use include_in_parent:true and set the datatype as nested for my data field as suggested in this post.
I have also looked into the inner_hits feature to no avail.
Here's an example of my mapping for the data attribute.
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"data": {
"type": "object"
}
}
}
}
}
Example document
PUT my_index/_doc/1
{
"data": {
name: "bob",
business: "None of yours"
}
}
And how my query currently looks:
GET my_index/_search
{
"query": {
"query_string": {
"query": "name:bob",
"fields": ["data.*"]
}
}
}
With the current setup I almost get my desired results. I can search on individual properties like data.name:bob and data.business:"None of yours" and get back the correct documents.
However I want to be able to get the exact same results with business:"None of yours" or name:bob.
Thanks in advance for any help!
I figured it out using dynamic templates. For anyone coming across this in the future, here is how I solved the issue:
I used path_match to match the data object (data.*).
Then using copy_to and {name} I dynamically created top-level fields on my parent object.
{
"dynamic_templates":[
{"template_1":
{"mapping":
{"copy_to":"{name}"},
"path_match":"data.*"
}
}
]
}

Index main-object, sub-objects, and do a search on sub-objects (that return sib-objects)

I've an object like it (simplified here), Each strain have many chromosomes, that have many locus, that have many features, that have many products, ... Here I just put 1 of each.
The structure in json is:
{
"name": "my strain",
"public": false,
"authorized_users": [1, 23, 51],
"chromosomes": [
{
"name": "C1",
"locus": [
{
"name": "locus1",
"features": [
{
"name": "feature1",
"products": [
{
"name": "product1"
//...
}
]
}
]
}
]
}
]
}
I want to add this object in Elasticsearch, for the moment I've add objects separatly: locus, features and products. It's okay to do a search (I want type a keyword, watch in name of locus, name of features, and name of products), but I need to duplicate data like public and authorized_users, in each subobject.
Can I register the whole object in elasticsearch and just do a search on each locus level, features and products ? And get it individually ? (no return the Strain object)
Yes you can search at any level (ie, with a query like "chromosomes.locus.name").
But as you have arrays at each level, you will have to use nested objects (and nested query) to get exactly what you want, which is a bit more complex:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.3/query-dsl-nested-query.html
For your last question, no, you cannot get subobjects individually, elastic returns the whole json source object.
If you want only data from subobjects, you will have to use nested aggregations.

Query against array of long types returns nothing

I have a cluster with documents, with one field being an array of long types. Below is an example value of this field:
"request_categories": [
150848602323501540,
150847029425938900
],
When I query the field, it does not return anything. Below is the query.
GET service_alias/service/_search
{
"query": {
"term": {
"request_categories" : 150848602323501540
}
}
}
This data field is indexed. I have no problems querying other data fields. Anything that I may have missed? Thanks!

NEST: How to query against multiple indices and handle different subclasses (document types)?

I’m playing around with ElasticSearch in combination with NEST in my C# project. My use case includes several indices with different document types which I query separately so far. Now I wanna implement a global search function which queries against all existing indices, document types and score the result properly.
So my question: How do I accomplish that by using NEST?
Currently I’m using the function SetDefaultIndex but how can I define multiple indices?
Maybe for a better understanding, this is the query I wanna realize with NEST:
{
"query": {
"indices": {
"indices": [
"INDEX_A",
"INDEX_B"
],
"query": {
"term": {
"FIELD": "VALUE"
}
},
"no_match_query": {
"term": {
"FIELD": "VALUE"
}
}
}
}
}
TIA
You can explicitly tell NEST to use multiple indices:
client.Search<MyObject>(s=>s
.Indices(new [] {"Index_A", "Index_B"})
...
)
If you want to search across all indices
client.Search<MyObject>(s=>s
.AllIndices()
...
)
Or if you want to search one index (thats not the default index)
client.Search<MyObject>(s=>s.
.Index("Index_A")
...
)
Remember since elasticsearch 19.8 you can also specify wildcards on index names
client.Search<MyObject>(s=>s
.Index("Index_*")
...
)
As for your indices_query
client.Search<MyObject>(s=>s
.AllIndices()
.Query(q=>q
.Indices(i=>i
.Indices(new [] { "INDEX_A", "INDEX_B"})
.Query(iq=>iq.Term("FIELD","VALUE"))
.NoMatchQuery(iq=>iq.Term("FIELD", "VALUE"))
)
)
);
UPDATE
These tests show off how you can make C#'s covariance work for you:
https://github.com/Mpdreamz/NEST/blob/master/src/Nest.Tests.Integration/Search/SubClassSupport/SubClassSupportTests.cs
In your case if all the types are not subclasses of a shared base you can still use 'object'
i.e:
.Search<object>(s=>s
.Types(typeof(Product),typeof(Category),typeof(Manufacturer))
.Query(...)
);
This will search on /yourdefaultindex/products,categories,manufacturers/_search and setup a default ConcreteTypeSelector that understands what type each returned document is.
Using ConcreteTypeSelector(Func<dynamic, Hit<dynamic>, Type>) you can manually return a type based on some json value (on dynamic) or on the hit metadata.

Resources