Getting all nested data objects using facets with Elasticsearch - elasticsearch

Greetings,
I have done some searching around but I have not come across a solid answer. We are using Elasticsearch and trying to use facets to group together nested objects. Here is what the data looks like:
{
  "id": 1,
  "name": "abc",
  "object": {
    "tag": 5,
    "name": "test1",
    "set": true
  }
},
{
  "id": 2,
  "name": "def",
  "object": {
    "tag": 2,
    "name": "test2",
    "set": false
  }
}
and I want to use facets to get a count of the nested object. I can get one field from the object using something like this:
{"facets":{"tags":{"terms":{"field":"object.name"}}}}
but that just gets me the name and a count from the parent object it's in. I want all the properties within object: tag, name, and set should all come back within the facet.
Is this possible? All signs seem to point to no, or to using something like a script, where I can create a composite field of all three, but that would require post-processing, which I was hoping to avoid.
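For reference, a sketch of that script-based composite approach: a terms facet whose terms are generated by a script that concatenates the three sub-fields into one key (the script_field parameter name and the '|' delimiter are assumptions that may vary by ES version; you would split the key back apart client-side):
{
  "facets": {
    "tags": {
      "terms": {
        "script_field": "doc['object.tag'].value + '|' + doc['object.name'].value + '|' + doc['object.set'].value"
      }
    }
  }
}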
Any help would surely be appreciated. Thank you in advance.

Related

In Elasticsearch, is there a way to combine many completion types under the same context?

I tried to create a suggest index that filters by context (based on the examples at https://www.elastic.co/guide/en/elasticsearch/reference/6.8/suggester-context.html).
I want to filter by the same context for every document, but I don't want to store a duplicate of the context per field. My question: is there a better way to do that?
For Example:
I have the document :
{
  "Name": "Lupidon",
  "Father": "Zeus",
  "context": {
    "favorite_food": [
      "pizza"
    ]
  }
}
where Name and Father are completion types, so that I could look for a prefix of Name or Father among documents that contain some favorite food.
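For what it's worth, one way to avoid duplicating the context values per suggest field is the category context's path parameter, which reads the context from a regular field of the document. A minimal mapping sketch (the _doc type level and the exact path value are assumptions based on the example document above):
{
  "mappings": {
    "_doc": {
      "properties": {
        "Name": {
          "type": "completion",
          "contexts": [
            { "name": "favorite_food", "type": "category", "path": "context.favorite_food" }
          ]
        },
        "Father": {
          "type": "completion",
          "contexts": [
            { "name": "favorite_food", "type": "category", "path": "context.favorite_food" }
          ]
        }
      }
    }
  }
}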

How to sort by totalCount in Gatsby?

I'm displaying a list of tags in Gatsby using this query:
query={graphql`
  query {
    tags: allMarkdownRemark {
      group(field: frontmatter___tags) {
        fieldValue
        totalCount
      }
    }
  }
`}
At the moment it includes all the tags ever used. I want to limit the list to the top 10 most popular tags. In order to do this I need to sort them first by totalCount. How can I accomplish this?
I do not think you can sort and then limit them using just a GraphQL query. My approach would be to query the entire list of tags just as you have it, then use JavaScript to manipulate the resulting list of objects. There is a post that covers how to sort a list just like the one you will get back:
Sorting Objects by Property Values
That returns a sorted array, and you can then loop over the first ten objects.
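As a sketch of that post-processing step (assuming the result shape produced by the query above, i.e. data.tags.group):
const topTags = data.tags.group
  .slice()                                     // copy so the query result itself is not mutated
  .sort((a, b) => b.totalCount - a.totalCount) // most-used tags first
  .slice(0, 10);                               // keep the ten most popular tags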

Unable to loop through array field ES 6.1

I'm facing a problem in Elasticsearch 6.1 that I cannot solve, and I don't know why. I have read the docs several times and maybe I'm missing something.
I have a scripted query that needs to do some calculation before deciding whether a record is available.
Here is the following script:
https://gist.github.com/dunice/a3a8a431140ec004fdc6969f77356fdf
What I'm doing is trying to loop through an array field with the following source:
"unavailability": [
{
"starts_at": "2018-11-27T18:00:00+00:00",
"local_ends_at": "2018-11-27T15:04:00",
"local_starts_at": "2018-11-27T13:00:00",
"ends_at": "2018-11-27T20:04:00+00:00"
},
{
"starts_at": "2018-12-04T18:00:00+00:00",
"local_ends_at": "2018-12-04T15:04:00",
"local_starts_at": "2018-12-04T13:00:00",
"ends_at": "2018-12-04T20:04:00+00:00"
},
]
When the script is executed it throws the error: No field found for [unavailability] in mapping with types [aircraft]
Any clue how to make it work?
Thanks
UPDATE
Query:
https://gist.github.com/dunice/3ccd7d83ca6ddaa63c11013b84e659aa
UPDATE 2
Mapping:
https://gist.github.com/dunice/f8caee114bbd917115a21b8b9175a439
Data example:
https://gist.github.com/dunice/8ad0602bc282b4ca19bce8ae849117ad
You cannot access an array from the source document via doc values (i.e. doc). You need to access the source document directly via the _source variable instead, like this:
for (int i = 0; i < params._source['unavailability'].size(); i++) {
Note that depending on your ES version, you might need to try ctx._source or just _source instead of params._source.
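Put together, a script query along these lines is one sketch of the approach (6.x syntax; the loop body is a placeholder, and as the update further down notes, _source access may be disabled in this context):
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "lang": "painless",
            "source": "def slots = params._source['unavailability']; for (def slot : slots) { /* inspect slot['starts_at'] and slot['ends_at'] here */ } return true;"
          }
        }
      }
    }
  }
}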
I solved my use case with a different approach.
Instead of having a single field holding an array of objects, as unavailability was, I decided to create two fields, each an array of datetimes:
unavailable_from
unavailable_to
My script walks through the first field and checks the second at the same position.
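A sketch of that walk in Painless (field names as above). One caveat worth flagging as an assumption to verify: multi-valued doc values come back sorted, so the positional pairing only holds if the windows do not interleave.
def froms = doc['unavailable_from'];
def tos = doc['unavailable_to'];
for (int i = 0; i < froms.size(); i++) {
  // froms.get(i) and tos.get(i) together describe the i-th unavailability window
}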
UPDATE
The direct access to _source is disabled by default:
https://github.com/elastic/elasticsearch/issues/17558

Kibana 4.1 - use JSON input to create an Hour Of Day field from @timestamp for histogram

Edit: I found the answer; see below for Logstash <= 2.0.
Plugin created for Logstash 2.0
For whoever is interested in doing this with Logstash 2.0 or above, I created a plugin that makes it dead simple.
The gem is here:
https://rubygems.org/gems/logstash-filter-dateparts
Here is the documentation and source code:
https://github.com/mikebski/logstash-datepart-plugin
I've got a bunch of data in Logstash with an @timestamp spanning a range of a couple of weeks. I have a duration field that is a number field, and I can do a date histogram, but I would like a histogram over hour of day rather than a linear histogram over dates. I would like the x axis to be 0 -> 23 instead of date x -> date y.
I think I can use the JSON Input advanced text input to add a field to the result set which is the hour of day of the @timestamp. The help text says:
"Any JSON formatted properties you add here will be merged with the elasticsearch aggregation definition for this section. For example shard_size on a terms aggregation."
which leads me to believe it can be done, but it does not give any examples.
Edited to add:
I have tried setting up an entry in the scripted fields based on the link below, but it will not work like the examples on their blog with 4.1. The following script gives an error when I try to add a field named test_day_of_week with format number: Integer.parseInt("1234")
The problem looks like the scripting is not very robust. Oddly enough, I want to do exactly what they are doing in the examples (add fields for day of month, day of week, etc.). I can get the field to work if the script is just doc['@timestamp'], but I cannot manipulate the timestamp.
The docs say Lucene expressions are allowed and show some trig and GCD examples for GIS-type stuff, but nothing for dates...
There is this update to the blog:
UPDATE: As a security precaution, starting with version 4.0.0-RC1, Kibana scripted fields default to Lucene Expressions, not Groovy, as the scripting language. Since Lucene Expressions only support operations on numerical fields, the example below dealing with date math does not work in Kibana 4.0.0-RC1+ versions.
There is no suggestion for how to actually do this now. I guess I could go off and enable the Groovy plugin...
Any ideas?
EDIT - THE SOLUTION:
I added a filter using Ruby to do this, and it was pretty simple:
Basically, in a Ruby script you can access event['field'] and you can create new fields. I use the Ruby time methods to create new fields based on the @timestamp of the event.
ruby {
  code => "
    ts = event['@timestamp']
    event['weekday'] = ts.wday
    event['hour']    = ts.hour
    event['minute']  = ts.min
    event['second']  = ts.sec
    event['mday']    = ts.day
    event['yday']    = ts.yday
    event['month']   = ts.month
  "
}
This no longer appears to work in Logstash 1.5.4: the Ruby date methods appear to be unavailable, so it throws a "rubyexception" and does not add the fields to the Logstash events.
I've spent some time searching for a way to recover the functionality we had in the Groovy scripted fields, which are no longer available for dynamic scripting, to provide fields such as "hourofday", "dayofweek", et cetera. What I've done is add these as Groovy script files directly on the Elasticsearch nodes themselves, like so:
/etc/elasticsearch/scripts/
hourofday.groovy
dayofweek.groovy
weekofyear.groovy
... and so on.
Those script files each contain a single line of Groovy, like so:
Integer.parseInt(new Date(doc["@timestamp"].value).format("d")) (dayofmonth)
Integer.parseInt(new Date(doc["@timestamp"].value).format("u")) (dayofweek)
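Following the same pattern, hourofday.groovy would presumably contain the line below (the "H" hour-of-day format specifier is my assumption, not from the original post):
Integer.parseInt(new Date(doc["@timestamp"].value).format("H")) (hourofday)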
To reference these in Kibana, first create a new search and save it, or choose one of your existing saved searches (take a copy of the existing JSON before you change it, just in case) in the "Settings -> Saved Objects -> Searches" page. Then modify the query to add "Script Fields", so you get something like this:
{
  "query": {
    ...
  },
  "script_fields": {
    "minuteofhour": { "script_file": "minuteofhour" },
    "hourofday": { "script_file": "hourofday" },
    "dayofweek": { "script_file": "dayofweek" },
    "dayofmonth": { "script_file": "dayofmonth" },
    "dayofyear": { "script_file": "dayofyear" },
    "weekofmonth": { "script_file": "weekofmonth" },
    "weekofyear": { "script_file": "weekofyear" },
    "monthofyear": { "script_file": "monthofyear" }
  }
}
As shown, the "script_fields" object should sit outside the "query" itself, or you will get an error. Also ensure the script files are available on all your Elasticsearch nodes.

Use one field to compare to another field and filter in OData

Let's say I have data like this (lots of it):
{
  "name": "Coffee",
  "quantity": 100,
  "restock": 10
}
I want to use an OData $filter to show me ONLY items where the quantity is LESS than the restock number.
Is it possible to do something like $filter=quantity lt restock?
I know that specific example fails. Is there a way to do this, or do I need to fetch everything and post-process it?
That query should absolutely be possible in most (all?) versions of OData: see http://services.odata.org/V4/OData/OData.svc/Products?$filter=Rating lt Price for a working example.
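Applied to the sample data above, a request would look something like this (the service root and entity set name are hypothetical):
GET https://example.com/odata/Items?$filter=quantity lt restock
The filter compares each entity's quantity property against its own restock property and returns only the matches.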
