Statistics with the Elasticsearch suggest API

I'm trying to use the suggest API in Elasticsearch, and I would like to get the total number of hits, as you do with a regular query.
As it is now, if I send this request to Elasticsearch:
/_suggest
{
"name_suggest": {
"text": "derp",
"completion": {
"size": 10,
"field": "name.sugest"
}
}
}
I get 10 answers, but no information on how many other matches there are.
So the question is: is there a way to get hold of this information using the suggest feature, for example with facets? (I have tried, but could not get anything working.)

You can take your 10 answers and build an aggregation query like this:
{
"query" : {
"query_string" : {
"query" : "answer1 OR answer2 OR ...",
"fields" : [ "name.sugest" ]
}
},
"aggregations" : {
"name_sugest" : {
"terms" : {
"field" : "name.sugest"
}
}
}
}
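A minimal sketch of how that request body could be assembled from the suggestion texts, written in Python. The helper name `build_count_query` and the `size: 0` setting are assumptions added for illustration, not part of the answer above. It also assumes `name.sugest` is searchable and aggregatable; if it is mapped as a completion field, a separate sub-field may be needed for the query and the aggregation.

```python
import json


def build_count_query(suggestions, field="name.sugest", size=10):
    """Build a search body that counts documents per suggested text.

    `suggestions` is the list of texts returned by the suggester.
    """
    # Quote each suggestion so multi-word texts are treated as phrases.
    # Backslashes and embedded double quotes are escaped first.
    quoted = [
        '"{}"'.format(s.replace("\\", "\\\\").replace('"', '\\"'))
        for s in suggestions
    ]
    return {
        "size": 0,  # assumption: only totals and buckets are wanted, not hits
        "query": {
            "query_string": {
                "query": " OR ".join(quoted),
                "fields": [field],
            }
        },
        "aggregations": {
            "name_sugest": {"terms": {"field": field, "size": size}}
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_count_query(["derp", "derpina"]), indent=2))
```

The total hit count of the response then covers every document matching any of the suggestions, and the buckets give a count per term.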

Related

Restructuring Elasticsearch model for fast aggregations

My business domain is real estate listings, and I'm trying to build a faceted UI. So I need to do aggregations to know how many listings have 1 bed, 2 beds, how many fall in a given price range, how many have a pool, and so on. Pretty standard stuff.
Currently my model looks like this:
{
"beds": 1,
"baths": 1,
"price": 100000,
"features": ["pool", "aircon"],
"inspections": [{
"startsOn": "2019-01-20"
}]
}
To build my faceted UI, I'm doing multiple aggregations, e.g.:
{
"aggs" : {
"beds" : {
"terms" : { "field" : "beds" }
},
"baths" : {
"terms" : { "field" : "baths" }
},
"features" : {
"terms" : { "field" : "features" }
}
}
}
You get the idea. If I've got 10 fields, I'm doing 10 aggregations.
But after seeing this article, I'm thinking I should just restructure my model to be like this:
{
"beds": 1,
"baths": 1,
"price": 100000,
"features": ["pool", "aircon"],
"attributes": ["bed_1", "bath_1", "price_100000-200000", "has_pool", "has_aircon", "has_inspection_tomorrow"]
}
Then I only need one aggregation:
{
"aggs": {
"attributes": {
"terms": {
"field": "attributes"
}
}
}
}
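For illustration, here is a rough Python sketch of how the `attributes` tokens could be derived from a listing at index time. The function name `derive_attributes` and the fixed price step of 100000 are assumptions chosen to reproduce the tokens in the example above. Time-dependent tokens such as `has_inspection_tomorrow` are left out because they would need re-indexing as dates change.

```python
def derive_attributes(listing, price_step=100000):
    """Flatten a listing into facet tokens like 'bed_1' or 'has_pool'."""
    attributes = [
        "bed_{}".format(listing["beds"]),
        "bath_{}".format(listing["baths"]),
    ]
    # Bucket the price into [lower, lower + step), e.g. 100000 -> 100000-200000.
    lower = (listing["price"] // price_step) * price_step
    attributes.append("price_{}-{}".format(lower, lower + price_step))
    # One token per feature, in the order the features are listed.
    attributes.extend("has_{}".format(f) for f in listing.get("features", []))
    return attributes


if __name__ == "__main__":
    doc = {"beds": 1, "baths": 1, "price": 100000, "features": ["pool", "aircon"]}
    print(derive_attributes(doc))
```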
So I've got a couple of questions.
Is the only drawback of this approach that logic is moved to the client? If so, I'm happy with that trade-off for the sake of performance, since I don't see this logic changing very often.
Can I leverage this field in my queries too? For example, what if I wanted to match all documents with 1 bedroom and price = 100000 and with a pool, etc.? Terms queries work on an 'any' match, but how can I find documents where the array of values contains all the provided terms?
Alternatively, if you can think of a better structure for modelling for search speed, please let me know!
Thanks
For the second point you can use the terms set query (see the documentation).
This query is like a terms query, but it gives you control over how many terms must match.
You can configure that through a script like this:
GET /my-index/_search
{
"query": {
"terms_set": {
"attributes" : {
"terms" : ["bed_1","bath_1","price_100000-200000"],
"minimum_should_match_script": {
"source": "params.num_terms"
}
}
}
}
}
Because params.num_terms is the number of terms supplied, this requires all of the listed terms to match.
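As a sketch, the same "match all terms" request can be built programmatically. The Python helper names below are hypothetical. The second function shows an alternative that should give the same matches without a script: a `bool` query with one `term` clause per value in its `filter` list, all of which must match.

```python
def all_terms_set_query(field, terms):
    """terms_set query that requires every supplied term to be present."""
    return {
        "query": {
            "terms_set": {
                field: {
                    "terms": list(terms),
                    # params.num_terms is the number of terms supplied above.
                    "minimum_should_match_script": {"source": "params.num_terms"},
                }
            }
        }
    }


def all_terms_bool_query(field, terms):
    """Alternative: one term filter per value; every filter clause must match."""
    return {
        "query": {
            "bool": {"filter": [{"term": {field: t}} for t in terms]}
        }
    }


if __name__ == "__main__":
    wanted = ["bed_1", "bath_1", "price_100000-200000"]
    print(all_terms_set_query("attributes", wanted))
    print(all_terms_bool_query("attributes", wanted))
```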

Sum of all the values of a field in Kibana using elasticsearch query DSL

I have seen quite a few similar questions answered but they are all for older versions of Kibana, or do not actually help with my particular question.
I want to find the sum of all values in a specific field. The Kibana docs give the following example code for computing the sum of a field:
POST /sales/_search?size=0
{
"query" : {
"constant_score" : {
"filter" : {
"match" : { "type" : "hat" }
}
}
},
"aggs" : {
"hat_prices" : { "sum" : { "field" : "price" } }
}
}
Based on this, the following should sum all the values in the field "tweetSentiment.polarity"
(POST /sales/_search?size=0 was removed because the UI gives an "unexpected 'p'" error with that line in.)
{
"query" : {
"constant_score" : {
"filter" : {
"match" : { "type" : "number" }
}
}
},
"aggs" : {
"hat_prices" : { "sum" : { "field" : "tweetSentiment.polarity" } }
}
}
Changing around the values for "type" and "field" between all the possible combinations of things they could be did not solve the issue either. My best guess is that this is not actually the code I want, especially after digging deep into how to create the query I am looking for.
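One thing that may be worth trying, sketched here in Python: the `match` filter on `type` in the docs example restricts the sum to hat documents, so if the tweet index has no matching `type` field the filter would match nothing. Dropping the query entirely sums over every document. The helper names and the aggregation name `polarity_sum` are assumptions, and the field is assumed to be mapped as a numeric type.

```python
def sum_body(field, agg_name="polarity_sum"):
    """Request body that sums `field` over all documents in the index."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "aggs": {agg_name: {"sum": {"field": field}}},
    }


def extract_sum(response, agg_name="polarity_sum"):
    """Read the sum from a search response parsed into a dict."""
    return response["aggregations"][agg_name]["value"]


if __name__ == "__main__":
    print(sum_body("tweetSentiment.polarity"))
```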

Elasticsearch tags aggregation with specific keys

I have an array field with tags and a fixed list of the 10 most popular tags (obtained from a previous terms aggregation call).
Can I get document counts for the current search for exactly these keys (the tags from my list)? Like a terms aggregation, but for specific keys only.
Thanks!
Take a look at filtering values in terms aggregations, especially the include parameter. It would be easier to show you if you provided a specific example of your problem, but here is the example from the docs, which should help you work out a solution:
{
"aggs" : {
"JapaneseCars" : {
"terms" : {
"field" : "make",
"include" : ["mazda", "honda"]
}
},
"ActiveCarManufacturers" : {
"terms" : {
"field" : "make",
"exclude" : ["rover", "jensen"]
}
}
}
}
You can use the include or exclude parameters inside a terms aggregation to filter your keys.
{
"size": 0,
"aggs": {
"my_agg": {
"terms": {
"field": "agg_field",
"include": ["key1", "key2", "key3"]
}
}
}
}
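A small Python sketch of that idea: build the aggregation from the fixed tag list, then read the buckets back into a tag-to-count mapping. The function names and the aggregation name `tag_counts` are hypothetical. Tags that produce no bucket are reported as 0, which is an assumption about how missing keys should be displayed.

```python
def tag_counts_body(field, tags):
    """Terms aggregation restricted to an explicit list of keys."""
    tags = list(tags)
    if not tags:
        raise ValueError("at least one tag is required")
    return {
        "size": 0,
        "aggs": {
            "tag_counts": {
                # size matches the list length so no listed key is cut off.
                "terms": {"field": field, "include": tags, "size": len(tags)}
            }
        },
    }


def parse_tag_counts(response, tags):
    """Map each requested tag to its doc_count, defaulting to 0."""
    buckets = response["aggregations"]["tag_counts"]["buckets"]
    found = {b["key"]: b["doc_count"] for b in buckets}
    return {tag: found.get(tag, 0) for tag in tags}


if __name__ == "__main__":
    print(tag_counts_body("tags", ["python", "java"]))
```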

Elastic(search): How to structure nested queries correctly?

I'm currently quite confused about how queries are structured in Elasticsearch. Let me explain what I mean with the following template, which works fine for me:
{
"template" : {
"query" : {
"filtered" : {
"query" : {
"bool" : {
"must" : [
{ "match" : {
"user" : "{{param_user}}"
} },
{ "match" : {
"session" : "{{param_session}}"
} },
{ "range" : {
"date" : {
"gte" : "{{param_from}}",
"lte" : "{{param_to}}"
}
} }
]
}
}
}
}
}
}
OK, so I want to get the entries of a specific session of a user in a certain time period. Now if you take a look at this link, http://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html, you can find the following query:
{
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"price" : 20}},
{ "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
],
"must_not" : {
"term" : {"price" : 30}
}
}
}
}
}
}
In this example the "filter" keyword comes right after "filtered". However, if I replace my second "query" with "filter" as in the example, my template doesn't work anymore. This is really counterintuitive, and I spent a lot of time figuring it out. A̶l̶s̶o̶ ̶I̶ ̶d̶o̶n̶'̶t̶ ̶u̶n̶d̶e̶r̶s̶t̶a̶n̶d̶ ̶w̶h̶y̶ ̶w̶e̶ ̶n̶e̶e̶d̶ ̶t̶o̶ ̶p̶u̶t̶ ̶e̶v̶e̶r̶y̶ ̶f̶i̶l̶t̶e̶r̶ ̶i̶n̶ ̶s̶e̶p̶a̶r̶a̶t̶e̶ ̶̶{̶ ̶}̶̶ ̶e̶v̶e̶n̶ ̶t̶h̶o̶u̶g̶h̶ ̶t̶h̶e̶y̶ ̶a̶r̶e̶ ̶a̶l̶r̶e̶a̶d̶y̶ ̶s̶e̶p̶a̶r̶a̶t̶e̶d̶ ̶b̶y̶ ̶t̶h̶e̶ ̶a̶r̶r̶a̶y̶ ̶s̶y̶n̶t̶a̶x̶.̶
Another issue: I assumed that to match several fields I could just write something like:
{
"query" : {
"match" : {
"user" : "{{param_user}}",
"session" : "{{param_session}}"
}
}
}
but it turned out that I have to use a bool query, which I didn't know about, so I searched for 'elastic multi match' and got something completely different.
My question: where can I find out how to structure a query properly (something like a PEG grammar)? The documentation only gives basic examples but doesn't state what we can actually do and how.
Best regards,
Jan
Edit: OK, I just found out by accident that I cannot replace "query" with "filter" because "match" is a query and not a filter. But then what about "range"? It seems to be a query as well as a filter... Is there a summary of keywords specifying in which context they can be used?
Is there a summary of keywords specifying in which context they can be used?
I wouldn't think of them as keywords. Some queries and filters simply share the same name (but not all of them do).
The reference documentation lists everything you need. For example, there is both a range query and a range filter. All you need is to understand the difference between filters and queries.
For example, if you want to move the range section from the query to a filter, you can do it as shown in the code below (not tested). Since your code already uses the filtered query type, you can just add a filter section right after the query section.
{
"template": {
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"user": "{{param_user}}"
}
},
{
"match": {
"session": "{{param_session}}"
}
}
]
}
},
"filter": {
"range": {
"date": {
"gte": "{{param_from}}",
"lte": "{{param_to}}"
}
}
}
}
}
}
}
Just remember that filters such as term match exact, unanalyzed values, so they are best used on not_analyzed fields.
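Note that the `filtered` query was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the usual replacement is a `bool` query with a `filter` clause. Below is a minimal Python sketch of the same search in that form; the function name `session_query` is hypothetical and the parameters mirror the template placeholders.

```python
def session_query(user, session, date_from, date_to):
    """bool query: scored match clauses plus a non-scoring range filter."""
    return {
        "query": {
            "bool": {
                # must clauses run in query context and contribute to the score
                "must": [
                    {"match": {"user": user}},
                    {"match": {"session": session}},
                ],
                # filter clauses only include or exclude documents
                "filter": [
                    {"range": {"date": {"gte": date_from, "lte": date_to}}}
                ],
            }
        }
    }


if __name__ == "__main__":
    print(session_query("jan", "abc123", "2015-01-01", "2015-01-31"))
```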

match or term query on a long property for exact match?

My document has the following mapping property:
"sid" : {"type" : "long", "store": "yes", "index": "not_analyzed"},
This property has only one long value for each record. I would like to query this property. I tried the following two queries:
{
"query" : {
"term" : {
"sid" : 10
}
}
}
{
"query" : {
"match" : {
"sid" : 10
}
}
}
Both queries work and return the target document. My question: which one is more efficient? And why?
You want to use a term query, and if you want to be even more efficient, use a filtered query so your results get cached.
GET index1/test/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"sid": 10
}
}
}
}
}
Both return the same result, as you mentioned. Unlike the match query, the term query looks up the exact term and does not analyze the input. So in my opinion the term query is more efficient in your case, because no analysis has to be done. See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
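On Elasticsearch versions where the `filtered` query is no longer available (it was removed in 5.0), a comparable exact-match lookup in filter context can be written with `constant_score`. A short Python sketch, with a hypothetical helper name:

```python
def exact_sid_query(sid):
    """Exact match on the numeric `sid` field, run as a non-scoring filter."""
    return {
        "query": {
            "constant_score": {
                "filter": {"term": {"sid": sid}}
            }
        }
    }


if __name__ == "__main__":
    print(exact_sid_query(10))
```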
