Is it possible to retrieve an object in an array that matches my query using Elasticsearch?

Given a document like this:
{
"id": "12345",
"elements": [
{
"type": "configure",
"time": 3000
}
]
}
Is it possible to query for documents that have an object in the elements array with a type of configure, and also retrieve that specific object so that I can get the time value associated with it (in this case 3000)?

You can use nested inner_hits to retrieve the nested objects that matched a nested query. Note that elements needs to be mapped as a nested field.
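A minimal sketch of what that could look like, written as plain Python dicts so the request bodies can be inspected without a cluster. Mapping type as keyword and time as integer is my assumption; the original question does not show a mapping.

```python
import json

# Mapping body: "elements" is nested so each array entry is indexed separately.
# The keyword/integer sub-field types are assumptions for this sketch.
mapping = {
    "mappings": {
        "properties": {
            "elements": {
                "type": "nested",
                "properties": {
                    "type": {"type": "keyword"},
                    "time": {"type": "integer"},
                },
            }
        }
    }
}

# Search body: the nested query selects parent documents,
# and inner_hits asks for the matching array entries to be returned too.
search_body = {
    "query": {
        "nested": {
            "path": "elements",
            "query": {"term": {"elements.type": "configure"}},
            "inner_hits": {},
        }
    }
}

print(json.dumps(mapping, indent=2))
print(json.dumps(search_body, indent=2))
```

The mapping body would go in the index creation request and the search body in a _search request. Each hit should then carry an inner_hits section for elements, where the _source of each matching entry includes its time value.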

Related

What data structure is used for the Elasticsearch flattened type?

I was trying to find out how the flattened type in Elasticsearch works under the hood. The documentation specifies that all leaf values are indexed into a single field as keywords, so there will be a dedicated index for all those flattened keywords.
From documentation:
By default, Elasticsearch indexes all data in every field and each indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indices, and numeric and geo fields are stored in BKD trees.
The specific case that I am trying to understand:
If I have a flattened field and index an object containing nested objects, it is possible to query a specific nested key in the flattened object. See how to query by labels.release:
PUT bug_reports
{
"mappings": {
"properties": {
"labels": {
"type": "flattened"
}
}
}
}
POST bug_reports/_doc/1
{
"labels": {
"priority": "urgent",
"release": ["v1.2.5", "v1.3.0"]
}
}
POST bug_reports/_search
{
"query": {
"term": {"labels.release": "v1.3.0"}
}
}
Does a flattened field have the same index structure as a keyword field, and how is it able to reference a specific child key of the flattened object?
The initial design and implementation of the flattened field type is described in this issue. The leaf keys are indexed along with the leaf values, which is what allows searching on a specific sub-field.
There are some ongoing improvements to the flattened field type, and Elastic would also like to support numeric values, but that's not yet released.
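To make the "keys indexed along with values" idea concrete, here is a toy model in Python. It is only a sketch: the two dictionaries stand in for inverted indices, and the separator character and all function names are assumptions for illustration, not Elasticsearch internals.

```python
from collections import defaultdict

SEPARATOR = "\0"  # assumed key/value separator, for illustration only


def leaf_pairs(obj, prefix=""):
    """Yield (dotted_key, value) for every leaf in a nested dict."""
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from leaf_pairs(value, path)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    yield from leaf_pairs(item, path)
                else:
                    yield path, str(item)
        else:
            yield path, str(value)


def index_flattened(docs):
    """Build two toy inverted indices: values only, and key + value."""
    by_value = defaultdict(set)
    by_key_value = defaultdict(set)
    for doc_id, obj in docs.items():
        for key, value in leaf_pairs(obj):
            by_value[value].add(doc_id)
            by_key_value[key + SEPARATOR + value].add(doc_id)
    return by_value, by_key_value


# The "labels" object from the bug_reports example above.
docs = {1: {"priority": "urgent", "release": ["v1.2.5", "v1.3.0"]}}
by_value, by_key_value = index_flattened(docs)


def term_on_root(value):
    """Like a term query on 'labels': matches the value under any key."""
    return by_value.get(value, set())


def term_on_key(key, value):
    """Like a term query on 'labels.<key>': matches only under that key."""
    return by_key_value.get(key + SEPARATOR + value, set())


print(term_on_root("v1.3.0"))
print(term_on_key("release", "v1.3.0"))
print(term_on_key("priority", "v1.3.0"))
```

In this model a lookup on labels.release is just a keyword-style term lookup where the term carries the key as a prefix, which is why the same value under a different key does not match.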

Is there a way to define the attribute type as keyword for an array field in Elasticsearch?

I am working on indexing a large data set which has multiple name fields for a particular entity. I have defined the name field as an array and I am adding around 4 names to it. Some of the names contain spaces and they are getting tokenized. Can I avoid that?
I know that for strings we have the text and keyword types in Elasticsearch, but how do I define the type as keyword when my data is an array? By default all the array fields are treated as text. I want them treated as keyword so they don't get tokenized while indexing.
Expected: if I store "Hello World" in an array, I should be able to search for "Hello World".
Current behavior: it stores hello and world as separate terms because the value is tokenized.
There is no array data type in Elasticsearch. Whenever you send an array as the value of a property of type x, that property becomes an array accepting only values of type x.
So for example you created a property as below:
{
"tagIds": {
"type": "integer"
}
}
And you index a document with values as below:
{
"tagIds": [124, 452, 234]
}
Then tagIds automatically becomes an array of integers.
For your case, all you need to do is create a field, say name, with type keyword. Make sure you always pass an array to this field, even if it holds a single value, so that it is always an array. Below is what you need:
Mapping:
PUT test
{
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "keyword"
}
}
}
}
}
Indexing document:
PUT test/_doc/1
{
"name" : ["name one"]
}
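To illustrate why the keyword mapping fixes the "Hello World" case, here is a rough Python model of the terms each field type would produce for an array of names. The text analyzer below (lowercase, split on whitespace) is a deliberately simplified stand-in for the real standard analyzer, and the function names are made up for this sketch.

```python
def analyze_text(value):
    """Very rough stand-in for a text analyzer: lowercase, split on whitespace."""
    return value.lower().split()


def analyze_keyword(value):
    """A keyword field indexes the whole value as one term, unchanged."""
    return [value]


def index_terms(values, analyzer):
    """Collect the terms produced for every value in an array field."""
    terms = set()
    for value in values:
        terms.update(analyzer(value))
    return terms


names = ["Hello World", "name one"]

text_terms = index_terms(names, analyze_text)
keyword_terms = index_terms(names, analyze_keyword)

print(sorted(text_terms))
print(sorted(keyword_terms))

# An exact lookup for the full name only succeeds against the keyword terms.
print("Hello World" in text_terms, "Hello World" in keyword_terms)
```

Each array entry is analyzed on its own, so with keyword every name stays a single searchable term regardless of how many names the array holds.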

Multiple filters on an array of objects in Elastic 6.*

I need help building a query over an array in Elasticsearch 6. I have documents that represent some property units with a number of attributes:
{
"Unit":{
"Attributes":{
"Attribute":[
{
"Name":"Elevator",
"Text":"No"
},
{
"Name":"Pet Friendly",
"Text":"Yes"
}
...
]
}
}
}
How can I filter my documents to find all pet friendly units or all units without an elevator?
P.S. I am using NEST.
Map Attribute as a nested type, probably with Text mapped as keyword for term level matching. To query, use a bool query with filter clauses, where the clauses will be nested queries.
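A sketch of the query shape this describes, built as plain Python dicts so it can be printed as JSON. It assumes Unit.Attributes.Attribute is mapped as nested and that both Name and Text are mapped as keyword; the helper names are hypothetical. The same structure should translate to NEST, but that is not shown here.

```python
import json

# Assumed nested path, taken from the document structure in the question.
ATTRIBUTE_PATH = "Unit.Attributes.Attribute"


def attribute_filter(name, text):
    """Nested query matching one Attribute entry with the given Name and Text."""
    return {
        "nested": {
            "path": ATTRIBUTE_PATH,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {ATTRIBUTE_PATH + ".Name": name}},
                        {"term": {ATTRIBUTE_PATH + ".Text": text}},
                    ]
                }
            },
        }
    }


def units_with(*name_text_pairs):
    """Search body requiring every (Name, Text) pair to match some attribute."""
    return {
        "query": {
            "bool": {
                "filter": [attribute_filter(n, t) for n, t in name_text_pairs]
            }
        }
    }


pet_friendly = units_with(("Pet Friendly", "Yes"))
no_elevator = units_with(("Elevator", "No"))
both = units_with(("Pet Friendly", "Yes"), ("Elevator", "No"))

print(json.dumps(both, indent=2))
```

Each nested clause checks Name and Text against the same array entry, so a unit with Elevator "Yes" and some other attribute set to "No" would not be picked up by the no_elevator filter.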

Elasticsearch "self-join" type operation

I have an index containing documents that look something like this (unnecessary fields omitted)
{
_id: String,
...
relatedIds: [ String ]
}
The entries in relatedIds refer back to the _id values of documents in the same index.
I want to write a query that returns only the IDs in the relatedIds array that are not the _id of any document.
In the abstract, I want to grab all of these identifiers and perform some computation so that, in the end, every ID in relatedIds refers to a document in the index.
If your id field is also contained in the source document, you can do it like this:
POST index/_search
{
"script_fields": {
"relatedIdsLessId": {
"script": {
"inline": "doc.relatedIds.values - doc.id.value"
}
}
}
}
This will compute a new field named relatedIdsLessId which will only contain the related IDs which are not the ID of the document itself.
Note: you need to make sure to enable dynamic scripting if not already done.
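For clarity, here is the per-document computation that script field performs, modelled in plain Python. The second helper is not part of the answer: it is an assumed client-side step for the broader goal in the question (IDs that match no document at all), working on documents that have already been fetched.

```python
def related_ids_less_id(doc):
    """Mirror the script field: drop the document's own id from its relatedIds."""
    return [rid for rid in doc["relatedIds"] if rid != doc["id"]]


def dangling_ids(docs):
    """IDs referenced in any relatedIds that are not the id of any fetched document."""
    known = {doc["id"] for doc in docs}
    referenced = {rid for doc in docs for rid in doc["relatedIds"]}
    return referenced - known


# Hypothetical sample documents, with the id also present in the source.
docs = [
    {"id": "a", "relatedIds": ["a", "b", "x"]},
    {"id": "b", "relatedIds": ["a", "y"]},
]

for doc in docs:
    print(doc["id"], related_ids_less_id(doc))

print(sorted(dangling_ids(docs)))
```

The script field only ever sees one document at a time, so anything that compares against the IDs of other documents has to happen outside it, for example along the lines of dangling_ids.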

Should I be using a nested object or a normal field?

I've been taking baby steps into using Elasticsearch, and while researching a separate issue I ran into this question. Here, swatkins asked about querying nested objects, and a responder pointed out that nested objects weren't necessary given his model. I've copied the model here, and made some changes to reflect my particular question:
[{
id:4635,
description:"This is a test description",
author:"John",
author_id:51421,
meta: {
title:"This is a test title for a video",
description:"This is my video description",
url:"/url_of_video",
awesomeness-level: "pretty-awesome",
kung-fu: true
}
},
{
id:4636,
description:"This is a test description 2",
author:"John",
author_id:51421,
meta: {
title:"This is an example title for a video",
description:"This is my video description2",
url:"/url_of_video2",
kung-fu: false,
monsters: true,
monsters-present: "Dracula, The Mummy"
}
}]
Our application allows users to define custom metadata, so we're using a nested object to represent that data. At first glance, it looks similar to swatkins' model, so I thought that maybe we shouldn't be using a nested object.
The big difference is that each object's meta might be different: note that the second video has meta specifically about "monster movies", while the first video references an "awesomeness-level". So, should I be using a nested object, or just mapping metadata as a normal field? If we do the latter, will the first video have empty metadata fields? Does that really even matter? Thanks in advance!
Assuming that your example represents two Elasticsearch documents, it doesn't look like you need to make meta a nested object. It makes sense to use nested objects when one parent object has multiple nested objects and your searches involve several fields of the nested objects. For example, if you have a record like this:
{
"name": "apple",
"attributes": [
{
"color": "yellow",
"size": "big"
},
{
"color": "red",
"size": "small"
}
]
}
and you want this record to be found when you search for color:yellow AND size:big or color:red AND size:small, but don't want it returned when you search for color:yellow AND size:small, it makes sense to make attributes a nested object. That allows you to index and search each attribute independently and then get the parent object of the matching attribute.
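The difference can be modelled in a few lines of Python, using the color and size fields from the apple record. This is a simplified simulation of the matching semantics, not how Elasticsearch is implemented, and the function names are made up for the sketch.

```python
apple = {
    "name": "apple",
    "attributes": [
        {"color": "yellow", "size": "big"},
        {"color": "red", "size": "small"},
    ],
}


def matches_as_object(doc, color, size):
    """Plain object arrays are flattened: each field becomes a bag of values
    across all entries, so the pairing between color and size is lost."""
    colors = {a["color"] for a in doc["attributes"]}
    sizes = {a["size"] for a in doc["attributes"]}
    return color in colors and size in sizes


def matches_as_nested(doc, color, size):
    """Nested entries are matched one at a time, so both conditions
    must hold within the same attribute."""
    return any(
        a["color"] == color and a["size"] == size for a in doc["attributes"]
    )


for color, size in [("yellow", "big"), ("red", "small"), ("yellow", "small")]:
    print(color, size,
          matches_as_object(apple, color, size),
          matches_as_nested(apple, color, size))
```

Only the nested variant rejects the yellow and small combination, which is the cross-object match the nested type exists to prevent.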
