Elasticsearch: How to calculate the yield (percentage of success)? - elasticsearch

My purpose is to calculate the yield of each benchId. Which means: For each bench, what is the percentage of team that have isPassed=True the first time they pass the test. I would like to have a visualization of each yield for each bench.
My Elasticsearch mapping is:
"test-logs" : {
"mappings" : {
"log" : {
"properties" : {
"benchGroup" : {
"type" : "keyword"
},
"benchId" : {
"type" : "keyword"
},
"date" : {
"type" : "date",
"format" : "yyyy/MM/dd HH:mm:ss"
},
"duration" : {
"type" : "float"
},
"finalStatus" : {
"type" : "keyword"
},
"isCss" : {
"type" : "boolean"
},
"isPassed" : {
"type" : "boolean"
},
"machine" : {
"type" : "keyword"
},
"sha1" : {
"type" : "keyword"
},
"uuid" : {
"type" : "keyword"
},
"team" : {
"type" : "keyword"
}
I tried to divide this issue in several sub-issues. I think I need to aggregate the documents by benchId then sub-aggregate them by team, ordering them by date then taking the first document. Then I think need to use a script to calculate isPassed=True/all first attemps.
No idea how to visualize the result on Kibana though.
I manage to create aggregations with this search:
GET _search
{
"size" : 0,
"aggs": {
"benchId": {
"terms": {
"field": "benchId"
},
"aggs": {
"teams": {
"terms": {
"script": "doc['uut'].join(' & ')",
"size": 10
}
}
}
}
}
}
I get the result I want but I have difficulties to include order by date ascending with limitation to one document by uut

Related

ElasticSearch - how to get aggregation of aggregation

I am very new to elasticsearch
I work for a dating website that has data as follows:
Single - with fields: name, signUpDate, state, and other data fields.
Encounter - with fields: state, encounterDate, singlesInvolved, and other data fields.
These are my 2 indexes
Now I have to write a query that returns as follows:
For every state, how many singles, how many encounters, the longest time a single has been part of our website, and the average time a single has been part of our website
And also return one result that is that same average for all states
Like this example:
[
{ //this one is the average of all states
"singles": 45,
"dates": 18,
"minWaitingTime": 1644677979530,
"avgWaitingTime": 15603
},
{ //these are the averages of each state
"state": "MA",
"singles": 50,
"dates": 23,
"minWaitingTime": 1644677979530,
"avgWaitingTime": 15603
},
{
"state": "NY",
"singles": 39,
"dates": 13,
"minWaitingTime": 1644850558872,
"avgWaitingTime": 6033
}
]
I've been working on the query for each state individually but i dont know how to get an average of all states
so far what i have is this:
GET /single,encounter/_search
{
"size": 0,
"aggs": {
"bystate": {
"terms": {
"field": "state",
"size": 59
},
"aggs": {
"group-by-index": {
"terms": {
"field": "_index"
}
},
"min_date": {
"min": {
"field": "signedUpAt"
}
},
"avg_date": {
"avg": {
"field": "signedUpAt"
}
}
}
}
}
}
I don't know if there is a better way to do this, likewise I don't know how to calculate the average (singles, encounters, min_date and average_date average) for all states using this result
Every result of the previous query looks like this:
{
"key" : "MA",
"doc_count" : 164,
"avg_date" : {
"value" : 1.6457900076508965E12,
"value_as_string" : "2022-02-25T11:53:27.650"
},
"min_date" : {
"value" : 1.64467797953E12,
"value_as_string" : "2022-02-12T14:59:39.530"
},
"group-by-index" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "single",
"doc_count" : 135
},
{
"key" : "encounter",
"doc_count" : 29
}
]
}
},
I would really appreciate help on this one
Addition: index mapping.
Encounter:
{
"encounter" : {
"aliases" : { },
"mappings" : {
"properties" : {
"_class" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"avgAge" : {
"type" : "integer",
"index" : false,
"doc_values" : false
},
"application" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"createdAt" : {
"type" : "date",
"format" : "date_hour_minute_second_millis"
},
"encounterId" : {
"type" : "keyword"
},
"locationType" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"singleOneId" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"singleTwoId" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"serviceLine" : {
"type" : "keyword"
},
"state" : {
"type" : "keyword"
},
"rating" : {
"type" : "keyword"
}
}
},
"settings" : {
"index" : {
"refresh_interval" : "1s",
"number_of_shards" : "1",
"provided_name" : "encounter",
"creation_date" : "1643704661932",
"number_of_replicas" : "1",
"uuid" : "MliXQL_bRBKDN7_d8G_BYw",
"version" : {
"created" : "7100299"
}
}
}
}
}
And Single:
{
"single" : {
"aliases" : { },
"mappings" : {
"properties" : {
"_class" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"id" : {
"type" : "keyword"
},
"singleId" : {
"type" : "keyword"
},
"state" : {
"type" : "keyword"
},
"preferedGender" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"settings" : {
"index" : {
"refresh_interval" : "1s",
"number_of_shards" : "1",
"provided_name" : "single",
"creation_date" : "1643704662136",
"number_of_replicas" : "1",
"uuid" : "Js_tqZfRRx-IxbjVRRN4wQ",
"version" : {
"created" : "7100299"
}
}
}
}
}
You can use avg bucket aggregation, where you can provide bucket_path and based on value it will calculate avg of entire aggregation.
Below is sample query:
{
"size": 0,
"aggs": {
"bystate": {
"terms": {
"field": "state",
"size": 59
},
"aggs": {
"group-by-index": {
"terms": {
"field": "_index"
}
},
"min_date": {
"min": {
"field": "signedUpAt"
}
},
"avg_date": {
"avg": {
"field": "signedUpAt"
}
}
}
},
"avg_all_state": {
"avg_bucket": {
"buckets_path": "bystate>avg_date"
}
}
}
}

How to split object (nested) into multiple columns in Elasticsearch / Kibana data table visualization

I have a nested object indexed in elasticsearch (7.10) and I need to visualize it with a kibana table. The problem is that kibana throws in the values from the nested field which have the same name in one column.
Part of the index:
{
"index" : {
"mappings" : {
"properties" : {
"data1" : {
"type" : "keyword"
},
"Details" : {
"type" : "nested",
"properties" : {
"Amount" : {
"type" : "float"
},
"Currency" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"DetailType" : {
"type" : "keyword"
},
"Price" : {
"type" : "float"
},
"Quantity" : {
"type" : "float"
},
"TotalAmount" : {
"type" : "float"
.......
The problem in the table:
How can I get three rows named Details each with one split term (e.g DetailType: "start_fee")?
Update:
I could query the nested object in the console:
GET _search
{
"query": {
"nested": {
"path": "Details",
"query": {
"bool": {
"must": [
{ "match": { "Details.DetailType": "energybased_fee" }}
]
}
},
"inner_hits": {
}
}}}
But how can I visualize in the table only the "inner_hits" value?

Elasticsearch multi_match + nested search

I am trying to execute a multi_match + nested search in ElasticSearch 6.4. I have the following mappings:
"name" : {
"type" : "text"
},
"status" : {
"type" : "short"
},
"user" : {
"type" : "nested",
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"type" : "nested",
"properties" : {
"location" : {
"type" : "nested",
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},
And this is the html_strip analyzer:
"html_strip" : {
"filter" : [
"lowercase",
"stop",
"snowball"
],
"char_filter" : [
"html_strip"
],
"type" : "custom",
"tokenizer" : "standard"
}
And my current query is this one:
"query": {
"bool": {
"must": {
"multi_match": {
"query": 'Paris',
"fields": ['name', 'user.profile.location.name']
},
},
"filter": {
"term": {
"status": 1
}
}
}
}
Obviously searching for "Paris" in user.profile.location.name doesn't work. I was trying to adapt my code to following this answer https://stackoverflow.com/a/48836012/12007123 but without any success.
What I am basically trying to achieve, is to be able to search for a value in multiple fields, this may or may not be nested.
I was also checking this discussion https://discuss.elastic.co/t/multi-match-query-string-with-nested-and-non-nested-fields/118652/5 but everything I tried wasn't successful.
If I just search for name, the search is working fine.
Any tips on how can I achieve this the right way, would be much appreciated.
EDIT:
While I didn't get an answer to my initial question, I was following Nikolay's (#nikolay-vasiliev) comment and changed th mappings to Object instead of Nested.
At least now I am able to search in user.profile.location.name. This is how the new mapping for user looks like:
"user" : {
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"properties" : {
"location" : {
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},

Kibana - Pie-Chart with sum over two different fields

In an index I have two mappings.
"mappings" : {
"deliveries" : {
"properties" : {
"#timestamp": { "type" : "date", "format": "yyyy-MM-dd" },
"receiptName" : { "type" : "text" },
"amountDelivered" : { "type" : "integer" },
"amountSold" : { "type" : "integer" },
"sellingPrice" : { "type" : "float" },
"earned" : { "type" : "float" }
}
},
"expenses" : {
"properties" : {
"#timestamp": { "type" : "date", "format": "yyyy-MM-dd" },
"description": { "type" : "text" },
"amount": { "type": "float" }
}
}
}
Now I wanted to create a simple Pie Chart in Kibana for sumarize up deliveries.earned and expenses.amount.
Is this possible or do I have to switch to an client application? The number of documents (2 or 3 a month) is really to less to start some development here xD
You can create a simple scripted_field through Kibana which maps amount and earned fields to the same field, called transaction_amount.
Painless script:
if(doc['earned'].length > 0) { return doc['earned']; } else { return doc['amount']; }
Then you can create a Pie Chart with "Slice Size" configured as the sum of transaction_amount and "Split Slices" configured as a Terms Aggregation on _type.

elasticsearch: search within text

For example if name as P.Shanmukha Sharma and if user searches for Shanmukha will not be available for search result. its returning only for P.Shanmukha and Sharma, is there any way if i will search Shanmukha and it will return result?
"user" : {
"properties" : {
"city" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
},
"created" : {
"type" : "date",
"format" : "strict_date_optional_time||epoch_millis"
},
"id" : {
"type" : "long"
},
"latitude" : {
"type" : "double"
},
"longitude" : {
"type" : "double"
},
"profile_image" : {
"type" : "string"
},
"state" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
},
"super_verification" : {
"type" : "string"
},
"type" : {
"type" : "string"
},
"username" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
}
}
}
username is defined as a search analyzer
and search query is
def EsSearch(self, index, page, size, searchTerm):
body = {
'query': {
'match': searchTerm
},
'sort': {
'created': {
'order': 'desc'
}
},
'filter': {
'term': {
'super_verification': 'verified'
}
}
}
res = self.conn.search(index=index, body=body)
output = []
for doc in res['hits']['hits']:
output.append(doc['_source'])
return output
so doing so much of research on ES i Got this solution with wildcard. Thanks EveryOne
{
"query": {
"wildcard": {
"username": {
"value": "*Shanmukha*"
}
}
}
}
Basically, 2 way in to do so,
By GET method and URL:
http://localhost:9200/your_index/your_type/_search?q=username:*Shanmukha*&pretty=true
By Fuzzy Query
as given by #krrish this one:

Resources