I have a question about searching with a suggester.
My documents look like this:
`
{
"name": {
"input": [
"user1"
]
},
"usertype": 1
}
{
"name": {
"input": [
"user2"
]
},
"usertype": 2
}`
I want to search the data with a suggester; the query looks like this:
`{
"suggest": {
"person_suggest": {
"text": "us",
"completion": {
"field": "name"
}
}
}
}`
And the result looks like this:
`{
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"person_suggest": [
{
"text": "word",
"offset": 0,
"length": 4,
"options": [
{
"name": "user1",
"usertype": 1,
"score": 1
},
{
"text": "user2",
"usertype": 2,
"score": 1
}
]
}
]
} `
But I only want results where usertype = 1, like adding a WHERE condition in MySQL. Can anybody help me? I want a DSL query. Thanks a lot.
You can't filter in completion suggester queries. One solution is to create a separate completion field for each usertype; another is to use standard queries with nGram analyzers.
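A minimal sketch of the first option, assuming every document is indexed into a completion field that is specific to its usertype. The naming scheme `name_usertype_<n>` and the helper names are hypothetical, and each such field would have to be mapped as a completion field. The code only builds request bodies and does not contact a cluster.

```python
def completion_field(usertype: int) -> str:
    # Hypothetical naming scheme: one completion field per usertype.
    return f"name_usertype_{usertype}"


def build_document(name: str, usertype: int) -> dict:
    # The suggestion input goes into the field that belongs to this usertype.
    return {completion_field(usertype): {"input": [name]}, "usertype": usertype}


def build_suggest_query(prefix: str, usertype: int) -> dict:
    # Same shape as the query in the question, but pointed at the
    # usertype-specific field, so other usertypes can never match.
    return {
        "suggest": {
            "person_suggest": {
                "text": prefix,
                "completion": {"field": completion_field(usertype)},
            }
        }
    }


doc = build_document("user1", 1)
query = build_suggest_query("us", 1)
print(doc)
print(query)
```

Because the query only reads the field for usertype 1, documents of usertype 2 are excluded without any filter clause.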
We use enterprise search indexes to store items that can be tagged by multiple tenants.
e.g
[
{
"id": 1,
"name": "document 1",
"tags": [
{ "company_id": 1, "tag_id": 1, "tag_name": "bla" },
{ "company_id": 2, "tag_id": 1, "tag_name": "bla" }
]
}
]
I'm looking for a way to retrieve all documents with only the tags of company 1.
This request:
{
"query": "",
"facets": {
"tags": {
"type": "value"
}
},
"sort": {
"created": "desc"
},
"page": {
"size": 20,
"current": 1
}
}
comes back with:
...
"facets": {
"tags": [
{
"type": "value",
"data": [
{
"value": "{\"company_id\":1,\"tag_id\":1,\"tag_name\":\"bla\"}",
"count": 1
},
{
"value": "{\"company_id\":2,\"tag_id\":1,\"tag_name\":\"bla\"}",
"count": 1
}
]
}
],
}
...
Can I modify the request so that I get no tags with "company_id" = 2?
I have a solution that strips the extra data from the results after they are retrieved, but I'm looking for a better one.
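For reference, the post-processing workaround mentioned above could look roughly like this. It is a sketch that assumes the facet values are JSON-encoded strings exactly as in the response shown; the function name `filter_tag_facets` is hypothetical.

```python
import json


def filter_tag_facets(facets: dict, company_id: int) -> dict:
    """Return a copy of `facets` whose "tags" buckets only belong to `company_id`."""
    filtered_groups = []
    for group in facets.get("tags", []):
        kept = [
            bucket
            for bucket in group.get("data", [])
            # Each facet value is a JSON string, so decode it before comparing.
            if json.loads(bucket["value"]).get("company_id") == company_id
        ]
        filtered_groups.append({**group, "data": kept})
    return {**facets, "tags": filtered_groups}


facets = {
    "tags": [
        {
            "type": "value",
            "data": [
                {"value": '{"company_id":1,"tag_id":1,"tag_name":"bla"}', "count": 1},
                {"value": '{"company_id":2,"tag_id":1,"tag_name":"bla"}', "count": 1},
            ],
        }
    ]
}
result = filter_tag_facets(facets, company_id=1)
print(result)
```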
I'm using the Elasticsearch term suggester for spell correction. My index contains a huge list of ads. Each ad has subject and body fields. I've found a problematic example for which the suggester does not return the correct suggestions.
I have lots of ads whose subject contains the word "soffa" and only 5 ads whose subject contains the word "sofa". Ideally, when I send "sofa" (wrong spelling) as text to the suggester, it should return "soffa" (correct spelling) as a suggestion, since most ads contain "soffa" and only a few contain "sofa".
Here is my suggester query body:
{
"suggest": {
"text": "sofa",
"subjectSuggester": {
"term": {
"field": "subject",
"suggest_mode": "popular",
"min_word_length": 1
}
}
}
}
When I send the above query, I get the response below:
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"suggest": {
"subjectSuggester": [
{
"text": "sof",
"offset": 0,
"length": 4,
"options": [
{
"text": "soff",
"score": 0.6666666,
"freq": 298
},
{
"text": "sol",
"score": 0.6666666,
"freq": 101
},
{
"text": "saf",
"score": 0.6666666,
"freq": 6
}
]
}
]
}
}
As you can see in the above response, it returned "soff" but not "soffa", although I have lots of docs whose subject contains "soffa".
I even played with parameters like suggest_mode and string_distance, but still no luck.
I also tried the phrase suggester instead of the term suggester, with the same result. Here is my phrase suggester query:
{
"suggest": {
"text": "sofa",
"subjectSuggester": {
"phrase": {
"field": "subject",
"size": 10,
"gram_size": 3,
"direct_generator": [
{
"field": "subject.trigram",
"suggest_mode": "always",
"min_word_length":1
}
]
}
}
}
}
I somehow think it doesn't work when a character is missing rather than misplaced. In the "soffa" example, one "f" is missing,
while it works fine for transposed characters, e.g. "vovlo".
When I send "vovlo", it gives me "volvo".
Any help would be hugely appreciated.
Try changing the "string_distance".
{
"suggest": {
"text": "sof",
"subjectSuggester": {
"term": {
"field": "title",
"min_word_length":2,
"string_distance":"ngram"
}
}
}
}
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters.html#term-suggester
I've found a workaround myself.
I added a shingle filter with max_shingle_size 3 (word shingles of up to three words) and a custom analyzer named trigram that uses it, then added a subfield with that analyzer and ran the suggester query on that subfield instead of the actual field, and it worked.
Here are the mapping changes:
{
"settings": {
"analysis": {
"filter": {
"shingle": {
"type": "shingle",
"min_shingle_size": 2,
"max_shingle_size": 3
}
},
"analyzer": {
"trigram": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"shingle"
],
"char_filter": [
"diacritical_marks_filter"
]
}
}
}
},
"mappings": {
"properties": {
"subject": {
"type": "text",
"fields": {
"trigram": {
"type": "text",
"analyzer": "trigram"
}
}
}
}
}
}
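To make clear what the shingle filter in this mapping emits, here is a rough plain-Python imitation. It assumes the filter's default of also emitting single tokens (unigrams) and a space as separator; it does not reproduce Elasticsearch's exact token order or positions, and it ignores the diacritical_marks_filter char filter, whose definition is not shown above. The sample tokens are made up.

```python
def shingles(tokens, min_size=2, max_size=3, output_unigrams=True):
    """Build word-level shingles, i.e. runs of adjacent tokens joined by a space."""
    out = list(tokens) if output_unigrams else []
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[start:start + size]))
    return out


# Lowercased, whitespace-split stand-in for the standard tokenizer + lowercase filter.
tokens = "Brun Soffa Ikea".lower().split()
result = shingles(tokens)
print(result)
```

Note that these are shingles of whole words, not character n-grams, so the single word "soffa" is still indexed as its own term in subject.trigram.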
And here is my corrected query :
{
"suggest": {
"text": "sofa",
"subjectSuggester": {
"term": {
"field": "subject.trigram",
"suggest_mode": "popular",
"min_word_length": 1,
"string_distance": "ngram"
}
}
}
}
Note that I'm running the suggester on subject.trigram instead of subject itself.
Here is the result:
{
"suggest": {
"subjectSuggester": [
{
"text": "sofa",
"offset": 0,
"length": 4,
"options": [
{
"text": "soffa",
"score": 0.8,
"freq": 282
},
{
"text": "soffan",
"score": 0.6666666,
"freq": 5
},
{
"text": "som",
"score": 0.625,
"freq": 102
},
{
"text": "sol",
"score": 0.625,
"freq": 82
},
{
"text": "sony",
"score": 0.625,
"freq": 50
}
]
}
]
}
}
As you can see above, soffa appears as the first suggestion.
There is something weird in your term suggester result for the word sofa; take a look at the text that is being corrected:
"suggest": {
"subjectSuggester": [
{
"text": "sof",
"offset": 0,
"length": 4,
"options": [
{
"text": "soff",
"score": 0.6666666,
"freq": 298
},
{
"text": "sol",
"score": 0.6666666,
"freq": 101
},
{
"text": "saf",
"score": 0.6666666,
"freq": 6
}
]
}
]
}
As you can see, it's sof and not sofa, which means the correction is computed for sof rather than for sofa. So I suspect this issue is related to the analyzer you were using on this field, especially since the result is soff instead of soffa: the trailing a is being removed.
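The "missing character" theory from the question can be checked offline by computing edit distances. The sketch below uses a textbook optimal string alignment distance (insert, delete, substitute, adjacent transposition). It is only an approximation of what the suggester does: the term suggester's default distance is documented as being based on Damerau-Levenshtein, but this is not Elasticsearch's implementation.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "vl" <-> "lv".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


pairs = [("sofa", "soffa"), ("vovlo", "volvo"), ("sof", "soff"), ("sof", "soffa")]
for typed, wanted in pairs:
    print(typed, wanted, osa_distance(typed, wanted))
```

Under this measure "sofa" and "vovlo" are both a single edit away from their targets, so a missing character alone should not prevent the suggestion. That is consistent with the input having been reduced to "sof" before the suggester compared it against the indexed terms.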
Well.. I am quite a "newb" regarding ES, so regarding aggregations... there are no words in the dictionary to describe my level :p
Today I am facing an issue where I am trying to create a query that does something similar to a SQL DISTINCT, but with filters applied. Given this document (an abstraction of the real situation, of course):
{
"id": "1",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 1,
"name": "a_name_for_id_1"
},
"structure": {
"material": "cartoon",
"thickness": 5
},
"shared": true,
"objective": "stackoverflow"
}
All the data of the above document can vary, but some values can repeat across documents, such as classification.id, kind, structure.material.
So, in order to fulfill my requirements, I would like to "group by" these 3 fields to get each unique combination. Going further, with the following data, I should get the following possibilities:
[{
"id": "1",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 1,
"name": "a_name_for_id_1"
},
"structure": {
"material": "cartoon",
"thickness": 5
},
"shared": true,
"objective": "stackoverflow"
},
{
"id": "2",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 2,
"name": "a_name_for_id_2"
},
"structure": {
"material": "iron",
"thickness": 3
},
"shared": true,
"objective": "linkedin"
},
{
"id": "3",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": false,
"kind": "document",
"classification": {
"id": 2,
"name": "a_name_for_id_2"
},
"structure": {
"material": "paper",
"thickness": 1
},
"shared": false,
"objective": "tiktok"
},
{
"id": "4",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 3,
"name": "a_name_for_id_3"
},
"structure": {
"material": "cartoon",
"thickness": 5
},
"shared": false,
"objective": "snapchat"
},
{
"id": "5",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 3,
"name": "a_name_for_id_3"
},
"structure": {
"material": "paper",
"thickness": 1
},
"shared": true,
"objective": "twitter"
},
{
"id": "6",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": false,
"kind": "document",
"classification": {
"id": 3,
"name": "a_name_for_id_3"
},
"structure": {
"material": "iron",
"thickness": 3
},
"shared": true,
"objective": "facebook"
}
]
Based on the above, I should get the following results in the "buckets":
document 1 cartoon
document 2 iron
document 2 paper
document 3 cartoon
document 3 paper
document 3 iron
Of course, for the sake of this example (and to keep it simple), I don't have any duplicates yet.
However, on top of that, I need some "pre-filters", as I only want:
Documents that are available: isAvailable=true
Documents whose structure thickness is between 2 and 4 inclusive: 2 <= structure.thickness <= 4
Documents that are shared: shared=true
I should then get only the following combinations compared to the first set of results:
document 1 cartoon -> not a valid result, thickness > 4
document 2 iron
document 2 paper -> not a valid result, isAvailable != true
document 3 cartoon -> not a valid result, thickness > 4
document 3 paper -> not a valid result, thickness < 2
document 3 iron -> not a valid result, isAvailable != true
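The expected outcome can be cross-checked in plain Python, independently of Elasticsearch. This is only a sketch of the intended semantics (filter, then take distinct combinations), with the six sample documents flattened to the fields that matter; the flattened key names are a simplification, not the real mapping.

```python
documents = [
    {"kind": "document", "classification_id": 1, "material": "cartoon", "thickness": 5, "isAvailable": True, "shared": True},
    {"kind": "document", "classification_id": 2, "material": "iron", "thickness": 3, "isAvailable": True, "shared": True},
    {"kind": "document", "classification_id": 2, "material": "paper", "thickness": 1, "isAvailable": False, "shared": False},
    {"kind": "document", "classification_id": 3, "material": "cartoon", "thickness": 5, "isAvailable": True, "shared": False},
    {"kind": "document", "classification_id": 3, "material": "paper", "thickness": 1, "isAvailable": True, "shared": True},
    {"kind": "document", "classification_id": 3, "material": "iron", "thickness": 3, "isAvailable": False, "shared": True},
]


def distinct_combinations(docs):
    """Apply the three pre-filters, then collect the unique (kind, id, material) tuples."""
    combos = {
        (d["kind"], d["classification_id"], d["material"])
        for d in docs
        if d["isAvailable"] and d["shared"] and 2 <= d["thickness"] <= 4
    }
    return sorted(combos)


result = distinct_combinations(documents)
print(result)
```

Only the second sample document passes all three filters, which matches the single valid combination listed above.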
If you're still reading, well.. thanks! xD
So, as you can see, I need all possible combinations of the pattern kind <> classification_id <> structure_material that match the filters on isAvailable, thickness, and shared.
Regarding the output, the hits don't matter to me, as I don't need the documents but only the combinations kind <> classification_id <> structure_material :)
Thanks for any help :)
Max
You can go with the cardinality aggregation together with your existing filters. Please check this URL and let me know if you have any questions.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html
Thanks to a colleague, I could finally get it working as expected!
QUERY
GET index-latest/_search
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"term": {
"isAvailable": true
}
},
{
"range": {
"structure.thickness": {
"gte": 2,
"lte": 4
}
}
},
{
"term": {
"shared": true
}
}
]
}
},
"aggs": {
"my_agg_example": {
"composite": {
"size": 10,
"sources": [
{
"kind": {
"terms": {
"field": "kind.keyword",
"order": "asc"
}
}
},
{
"classification_id": {
"terms": {
"field": "classification.id",
"order": "asc"
}
}
},
{
"structure_material": {
"terms": {
"field": "structure.material.keyword",
"order": "asc"
}
}
}
]
}
}
}
}
The given result is then:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"my_agg_example": {
"after_key": {
"kind": "document",
"classification_id": 2,
"structure_material": "iron"
},
"buckets": [
{
"key": {
"kind": "document",
"classification_id": 2,
"structure_material": "iron"
},
"doc_count": 1
}
]
}
}
}
So, as we can see, we get the following bucket:
{
"key": {
"kind": "document",
"classification_id": 2,
"structure_material": "iron"
},
"doc_count": 1
}
Note: Be careful with the type of your field. Putting .keyword on classification.id resulted in no buckets at all. .keyword should only be used on string fields (as far as I understand; correct me if I am wrong).
As expected, we have the following result (compared to the initial question):
document 2 iron
Note: Be careful, the order of the elements within the aggs.<name>.composite.sources does play a role in the returned results.
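The after_key in the response is what the composite aggregation offers for paging: sending it back as `after` asks for the buckets that follow it. Below is a minimal sketch of building that follow-up request; the helper name `next_page_request` is hypothetical, the request is trimmed to one source for brevity, and nothing is sent to a cluster.

```python
import copy


def next_page_request(request: dict, response: dict, agg_name: str):
    """Return a copy of `request` asking for the next composite page, or None if done."""
    after_key = response["aggregations"][agg_name].get("after_key")
    if after_key is None:
        return None  # no after_key in the response: nothing left to page through
    follow_up = copy.deepcopy(request)  # leave the caller's request untouched
    follow_up["aggs"][agg_name]["composite"]["after"] = after_key
    return follow_up


request = {
    "size": 0,
    "aggs": {
        "my_agg_example": {
            "composite": {
                "size": 10,
                "sources": [{"kind": {"terms": {"field": "kind.keyword", "order": "asc"}}}],
            }
        }
    },
}
response = {
    "aggregations": {
        "my_agg_example": {
            "after_key": {"kind": "document", "classification_id": 2, "structure_material": "iron"},
            "buckets": [],
        }
    }
}
follow_up = next_page_request(request, response, "my_agg_example")
print(follow_up)
```

A caller would repeat this until the helper returns None or a page comes back with an empty buckets list.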
Thanks!
I'm currently playing with suggesters and wonder why the result set always contains multiple identical objects.
Example request:
{"suggest": {
"test" : {
"text": "holz",
"term" : {
"field":"title"
}
}
}}
Result:
{"suggest": {
"test": [
{
"text": "holz",
"offset": 0,
"length": 4,
"options": [...]
},
{
"text": "holz",
"offset": 0,
"length": 4,
"options": [...]
},
{
"text": "holz",
"offset": 0,
"length": 4,
"options": [...]
},
{
"text": "holz",
"offset": 0,
"length": 4,
"options": [...]
}
]
}}
Even the objects in options are exactly the same. It's always the same, no matter what text I want suggestions for. Is there any explanation for this?
ES version is 2.3.4
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html#skip_duplicates
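You have to add the skip_duplicates parameter. Note that it is documented for the completion suggester, so it may not apply to the term suggester request shown in the question.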
Have a nice day,
Daniel
Have you tried adding payloads to your docs ?
https://www.elastic.co/guide/en/elasticsearch/reference/2.1/search-suggesters-completion.html
curl -X PUT 'localhost:9200/music/song/1?refresh=true' -d '{
"name" : "Nevermind",
"suggest" : {
"input": [ "Nevermind", "Nirvana" ],
"output": "Nirvana - Nevermind",
"payload" : { "artistId" : 2321 },
"weight" : 34
}
}'
When I request the suggester with
{
"my-title-suggestions-1": {
"text": "tücher ",
"term": {
"field": "name"
}
},
"my-title-suggestions-2": {
"text": "tüchers ",
"term": {
"field": "name"
}
}
}
it returns
{
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"my-title-suggestions-1": [
{
"text": "tücher",
"offset": 0,
"length": 6,
"options": []
}
],
"my-title-suggestions-2": [
{
"text": "tüchers",
"offset": 0,
"length": 7,
"options": [
{
"text": "tücher",
"score": 0.8333333,
"freq": 6
}
]
}
]
}
I wonder why the first suggester does not return the exact match.
The second suggester obviously has that result.
Can I add other options to change this behavior?
Edit:
The minimal mapping is just this:
{
"name" : {
"analyzer" : "standard",
"type" : "string"
}
}
To add to what @ChintanShah25 said: according to https://www.elastic.co/guide/en/elasticsearch/reference/2.0/search-suggesters-term.html (see suggest_mode), the term suggester will by default:
Only provide suggestions for suggest text terms that are not in the index.
I don't think you can do that, and I am not sure why you want an exact match in suggestions; after all, they are "suggestions".
Normally they are used to catch misspellings. The suggester gives you candidate suggestions that are similar to the word you entered and fall within an edit distance of 2 of it.
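For completeness, the suggest_mode mentioned above can be set explicitly. The sketch below builds a request body in the same shape as the one in the question; the helper name `term_suggest_request` is hypothetical. The values missing, popular and always are the ones listed in the term suggester documentation. Even with always, the suggester is meant to propose alternative terms, so it should not be expected to echo the input term itself.

```python
ALLOWED_MODES = {"missing", "popular", "always"}


def term_suggest_request(name: str, text: str, field: str, suggest_mode: str = "missing") -> dict:
    """Build a single named term suggestion with an explicit suggest_mode."""
    if suggest_mode not in ALLOWED_MODES:
        raise ValueError(f"unknown suggest_mode: {suggest_mode!r}")
    return {
        name: {
            "text": text,
            "term": {"field": field, "suggest_mode": suggest_mode},
        }
    }


body = term_suggest_request("my-title-suggestions-1", "tücher", "name", suggest_mode="always")
print(body)
```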