How to search multiple fields using * by using Common Terms - elasticsearch

I have the following mapping:
"mappings": {
"mydoctype": {
....
"properties": {
"title": {
"properties": {
"en": {
...
},
"zh_CN": {
...
},
"zh_TW": {
...
}
...
}
},
...
}
}
}
I would like to run a Common Terms query on the title.* fields, but the following query returns neither results nor an error message.
"common" : {
"title.*" : {
"query" : "sleep",
"cutoff_frequency" : 0.001
}
}
However, if I change "title.*" above to "title.en", I do get results.
How can I run the "title.*" search with Common Terms? Is it possible at all?

If you really want to use a common terms query, just know that it only works on a single field, i.e. not several and not wildcarded ones.
Otherwise, you can use a multi_match query with the cutoff_frequency like in your other question.
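A minimal sketch of the multi_match alternative mentioned above, written as a Python dict that could be sent as a search request body. The helper name build_title_query is hypothetical; the field pattern title.* and the cutoff value come from the question. It assumes an Elasticsearch version that still accepts cutoff_frequency (as far as I know, the parameter was deprecated in 7.3 and removed in 8.0).

```python
import json


def build_title_query(text, cutoff_frequency=0.001):
    """Build a multi_match body that searches every title.* sub-field.

    Hypothetical helper for illustration; not part of any client library.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                # Wildcard field pattern, meant to cover title.en, title.zh_CN, ...
                "fields": ["title.*"],
                # Assumption: the target cluster still supports this parameter.
                "cutoff_frequency": cutoff_frequency,
            }
        }
    }


body = build_title_query("sleep")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a _search request.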

Does this work for you? I use it to search for a wildcard value (with *) across several fields:
{
"query" : {
"dis_max" : {
"tie_breaker" : 0,
"boost" : 1,
"queries" : [
{"wildcard" : {"title.en" : "sic*"}},
{ "wildcard" : { "title.zh_CN" : "sic*"}},
{ "wildcard" : { "title.zh_TW" : "*sic*" }}
]
}
}
}
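dis_max lets you run several queries at once: a document is returned if it matches any of them, and its score is taken from the best-matching query (with tie_breaker set to 0, the other matching queries add nothing).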
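To make the scoring rule concrete, here is a small sketch of how dis_max is documented to combine per-clause scores: the best score, plus tie_breaker times the scores of the other matching clauses. The function name and the sample scores are hypothetical, and the boost parameter is left out for brevity.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Combine per-clause scores for one document.

    clause_scores holds one entry per wrapped query; None means the
    document did not match that query.
    """
    matching = sorted((s for s in clause_scores if s is not None), reverse=True)
    if not matching:
        return None  # matched no clause, so the document is not returned
    # Best clause wins; the remaining matches only count via tie_breaker.
    return matching[0] + tie_breaker * sum(matching[1:])


# Hypothetical scores for the three wildcard clauses on one document.
scores = [1.0, None, 0.4]
print(dis_max_score(scores, tie_breaker=0))
print(dis_max_score(scores, tie_breaker=0.5))
```

With tie_breaker at 0, as in the query above, only the best-matching field decides the score.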

Related

ElasticSearch - Fuzzy search in list elements

I've got some documents stored in ElasticSearch like this:
{
"tag" : ["tag1", "tag2", "tag3"]
...
}
I want to search through the "tag" field. I know that it works with a query like:
{
"query":
{
"match" : {"tag" : "tag1"}
}
}
But, I don't want to use a match, I want to use a fuzzy search through the list, for example, something like:
{
"query":
{
"fuzzy" : {"tag" : "tagg1"}
}
}
The problem is, the above query doesn't return anything. What should I use instead?
What is the type of the tag field in your Elasticsearch mapping?
I tried the following mapping for the tag field on Elasticsearch 7.2:
"tag" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
This works well for me.
The fuzzy query is then:
{
"query":
{
"fuzzy": {"tag" : "tagg1"}
}
}
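To see why tagg1 is expected to match tag1 but not the other tags, the sketch below computes the edit distance by hand and compares it with the AUTO fuzziness rule (no edits for terms of 1 or 2 characters, one edit for 3 to 5, two edits for longer terms). It assumes the fuzzy query runs with the default AUTO fuzziness, and it uses plain Levenshtein distance, which ignores the transposition shortcut Elasticsearch also allows.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def auto_fuzziness(term):
    """Maximum edits allowed for a term under the AUTO rule described above."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


query_term = "tagg1"
for indexed in ["tag1", "tag2", "tag3"]:
    distance = levenshtein(query_term, indexed)
    print(indexed, distance, distance <= auto_fuzziness(query_term))
```

Only tag1 is within one edit of tagg1, so a document holding that tag should be the one returned.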

Query documents with access control filter

Each document in my Elasticsearch index has two access control lists containing user ids. One is an allow list, the other is a deny list. I am trying to add a filter to a given query that considers these ACLs. I thought I could use a bool query with a must clause for the given query, a filter clause for the allow list, and a must_not clause for the deny list. What I have so far (example for user 1):
{
"bool" : {
"must" : {
[given query]
},
"filter" : [ {
"match" : {
"acl.allow" : {
"query" : "/user/1",
"type" : "boolean"
}
}
}],
"must_not" : [ {
"match" : {
"acl.deny" : {
"query" : "/user/1",
"type" : "boolean"
}
}
}]
}
}
Unfortunately, this query does not return the desired result. It returns objects that have not listed user 1 in their allow list (a behavior I don't understand). Also, it (obviously) ignores objects with empty access control lists (which should be visible to anyone). Any suggestions to fix that?
I figured it out. First of all, match isn't really a good fit for this kind of query because its input is analyzed. Using term instead left me puzzled as to why I did not get any results: term queries look up the exact, unanalyzed value, so here they only return results if the corresponding field is set to not_analyzed. Thus I changed my mapping:
"acl": {
"properties": {
"allow": {
"type": "string",
"index": "not_analyzed"
},
"deny": {
"type": "string",
"index": "not_analyzed"
}
}
}
My second problem, treating objects with empty ACLs as visible to anyone, was solved by nesting exists in must_not inside a bool. This is the recommended substitute for the deprecated missing query. My final query looks like this and passed all the ACL-related tests I could think of.
{
"bool" : {
"must" : {
[given query]
},
"filter" : {
"bool" : {
"should" : [ {
"terms" : {
"acl.allow" : [ "/user/1" ]
}
}, {
"bool" : {
"must_not" : {
"exists" : {
"field" : "acl.allow"
}
}
}
} ]
}
},
"must_not" : {
"terms" : {
"acl.deny" : [ "/user/1" ]
}
}
}
}
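The final query can be generated per user with a small helper, sketched here in Python. The names build_acl_query and is_visible are hypothetical; is_visible restates the intended rule in plain Python (visible when the allow list is empty or contains the user, and the deny list does not contain the user), which is handy as a reference when writing tests against the real index.

```python
def build_acl_query(user, inner_query):
    """Wrap inner_query with the allow/deny logic of the final query above."""
    return {
        "bool": {
            "must": inner_query,
            "filter": {
                "bool": {
                    "should": [
                        # Either the user is on the allow list ...
                        {"terms": {"acl.allow": [user]}},
                        # ... or the document has no allow list at all.
                        {"bool": {"must_not": {"exists": {"field": "acl.allow"}}}},
                    ]
                }
            },
            "must_not": {"terms": {"acl.deny": [user]}},
        }
    }


def is_visible(doc, user):
    """Reference rule in plain Python, independent of Elasticsearch."""
    acl = doc.get("acl", {})
    allow = acl.get("allow") or []
    deny = acl.get("deny") or []
    return (not allow or user in allow) and user not in deny


# Made-up sample documents for illustration.
docs = [
    {"id": 1, "acl": {"allow": ["/user/1"], "deny": []}},
    {"id": 2, "acl": {"allow": ["/user/2"], "deny": []}},
    {"id": 3, "acl": {"allow": [], "deny": []}},
    {"id": 4, "acl": {"allow": [], "deny": ["/user/1"]}},
]
visible_ids = [d["id"] for d in docs if is_visible(d, "/user/1")]
print(visible_ids)  # [1, 3]

query = build_acl_query("/user/1", {"match_all": {}})
```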

Elastic(search): How to structure nested queries correctly?

I'm currently quite confused about how queries are structured in Elasticsearch. Let me explain what I mean with the following template, which works fine for me:
{
"template" : {
"query" : {
"filtered" : {
"query" : {
"bool" : {
"must" : [
{ "match" : {
"user" : "{{param_user}}"
} },
{ "match" : {
"session" : "{{param_session}}"
} },
{ "range" : {
"date" : {
"gte" : "{{param_from}}",
"lte" : "{{param_to}}"
}
} }
]
}
}
}
}
}
}
OK, so I want to get the entries of a specific session of a user in a certain time period. Now if you take a look at this link http://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html you can find the following query:
{
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"price" : 20}},
{ "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
],
"must_not" : {
"term" : {"price" : 30}
}
}
}
}
}
}
In this example, the "filter" keyword comes right after "filtered". However, if I exchange my second "query" with a "filter" as in the example, my template won't work anymore. This is really counterintuitive and I spent a lot of time figuring it out. (Struck out by the author: Also, I don't understand why we need to put every filter in separate { } even though they are already separated by the array syntax.)
Another issue I had was that I assumed that to match several fields I could just type something like:
{
"query" : {
"match" : {
"user" : "{{param_user}}",
"session" : "{{param_session}}"
}
}
}
but it seems that I have to use a bool query, which I didn't know about, so I searched for 'elastic multi match' but got something completely different.
My question: where can I find out how to structure a query properly (something like a PEG)? The documentation only gives basic examples but doesn't state what we can actually do and how.
Best regards,
Jan
Edit: Ok I just found by accident that I cannot exchange "query" with "filter" as "match" is a query and not a filter. But then again what about "range"? It seems to be a query as well as a filter... Is there a summary of keywords specifying in which context they can be used?
Is there a summary of keywords specifying in which context they can be used?
I wouldn't call them keywords. It's just that some queries and filters share the same name (though not all of them do).
Here is everything you need. For example, range exists both as a query and as a filter. All you need is to understand the difference between filters and queries.
For example, if you want to move the range section from the query to the filter, you can do it as shown in the code below (not tested). Since your code already uses a filtered query, you can simply add a filter section right after the query section.
{
"template": {
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"user": "{{param_user}}"
}
},
{
"match": {
"session": "{{param_session}}"
}
}
]
}
},
"filter": {
"range": {
"date": {
"gte": "{{param_from}}",
"lte": "{{param_to}}"
}
}
}
}
}
}
}
Just remember that filters such as term match the exact indexed value, so they only behave predictably on not_analyzed fields.
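For readers on newer versions: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, and the same split between scoring clauses and non-scoring clauses is expressed with a bool query that has both must and filter. A minimal sketch as a Python dict follows; the helper name and the sample values are made up for illustration.

```python
import json


def build_session_query(user, session, date_from, date_to):
    """bool query: match clauses score, the range clause only filters."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"user": user}},
                    {"match": {"session": session}},
                ],
                "filter": [
                    {"range": {"date": {"gte": date_from, "lte": date_to}}},
                ],
            }
        }
    }


# Hypothetical parameter values standing in for the template variables.
body = build_session_query("jan", "abc123", "2015-01-01", "2015-01-31")
print(json.dumps(body, indent=2))
```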

Trouble with has_parent query containing scripted function_score

I have two document types, in a parent-child relationship:
"myParent" : {
"properties" : {
"weight" : {
"type" : "double"
}
}
}
"myChild" : {
"_parent" : {
"type" : "myParent"
},
"_routing" : {
"required" : true
}
}
The weight field is to be used for custom scoring/sorting. This query directly against the parent documents works as intended:
{
"query" : {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
}
}
However, when trying to do similar scoring for the child documents with a has_parent query, I get an error:
{
"query" : {
"has_parent" : {
"query" : {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
},
"parent_type" : "myParent",
"score_type" : "score"
}
}
}
The error is:
QueryPhaseExecutionException[[myIndex][3]: query[filtered(ParentQuery[myParent](filtered(function score (ConstantScore(:),function=script[_score * doc['weight'].value], params [null]))->cache(_type:myParent)))->cache(_type:myChild)],from[0],size[10]: Query Failed [failed to execute context rewrite]]; nested: ElasticSearchIllegalArgumentException[No field found for [weight] in mapping with types [myChild]];
It seems like instead of applying the scoring function to the parent and then passing its result to the child, ES is trying to apply the scoring function itself to the child, causing the error.
If I don't use score for score_type, the error doesn't occur, although the resulting scores are then all 1.0, as documented.
What am I missing here? How can I query these child documents with custom scoring based on a parent field?
I would say this is a bug: it is using the myChild mapping as the default context, even though you are inside a has_parent query. But I'm not sure how easy the bug would be to fix properly.
However, you can work around it by including the type name in the full field name:
curl -XGET "http://localhost:9200/t/myChild/_search" -d'
{
"query": {
"has_parent": {
"query": {
"function_score": {
"script_score": {
"script": "_score * doc[\"myParent.weight\"].value"
}
}
},
"parent_type": "myParent",
"score_type": "score"
}
}
}'
I've opened an issue to see if we can get this fixed #4914
I think the problem is that you are trying to score child documents based on a field in the parent document and that the function score should really be the other way round.
To solve the problem my idea would be to store the parent/child relation and the score with the child documents. Then you would filter for child documents and score them according to the weight in the child document.
An example:
"myParent" : {
"properties" : {
"name" : {
"type" : "string"
}
}
}
"myChild" : {
"_parent" : {
"type" : "myParent"
},
"_routing" : {
"required" : true
},
"properties": {
"weight" : {
"type" : "double"
}
}
}
Now you could use a has_parent filter to select all child documents that have a certain parent and then score them using the function score:
{
"query": {
"filtered": {
"query": {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
},
"filter": {
"has_parent": {
"parent_type": "myParent",
"query": {
"term": {
"name": "something"
}
}
}
}
}
}
}
So if the parent documents were blog posts and the children comments, you could filter on the posts and score the comments based on weight. I doubt that scoring children based on parents is possible, though I might be wrong :)
Disclaimer: 1st post to stack overflow...

elasticsearch: can I define synonyms with boost?

Let's say A, B and C are synonyms. I want to define that B is "closer" to A than C is,
so that when I search for the keyword A, A comes first in the search results, B comes second and C comes last.
Any help?
There is no search-time mechanism (as of yet) to differentiate between matches on synonyms and source field. This is because, when indexed, a field's synonyms are placed into the inverted index alongside the original term, leaving all words equal.
This is not to say however that you cannot do some magic at index time to glean the information you want.
Create an index with two analyzers: one with a synonym filter, and one without.
PUT /synonym_test/
{
"settings" : {
"analysis" : {
"analyzer" : {
"no_synonyms" : {
"tokenizer" : "lowercase"
},
"synonyms" : {
"tokenizer" : "lowercase",
"filter" : ["synonym"]
}
},
"filter" : {
"synonym" : {
"type" : "synonym",
"format" : "wordnet",
"synonyms_path" : "prolog/wn_s.pl"
}
}
}
}
}
Use a multi-field mapping so that the field of interest is indexed twice:
PUT /synonym_test/mytype/_mapping
{
"properties":{
"mood": {
"type": "multi_field",
"fields" : {
"syn" : {"type" : "string", "analyzer" : "synonyms"},
"no_syn" : {"type" : "string", "analyzer" : "no_synonyms"}
}
}
}
}
Index a test document:
POST /synonym_test/mytype/1
{
"mood" : "elated"
}
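At search time, query both sub-fields, so that a hit on the original term, which matches mood.syn and mood.no_syn, outscores a hit on a synonym, which matches mood.syn only: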
GET /synonym_test/mytype/_search
{
"query" : {
"bool" : {
"should" : [
{ "match" : { "mood.syn" : { "query" : "gleeful", "boost" : 3 } } },
{ "match" : { "mood.no_syn" : "gleeful" } }
]
}
}
}
Results in: "_score": 0.2696457
Searching for the original term returns a better score:
GET /synonym_test/mytype/_search
{
"query" : {
"bool" : {
"should" : [
{ "match" : { "mood.syn" : { "query" : "elated", "boost" : 3 } } },
{ "match" : { "mood.no_syn" : "elated" } }
]
}
}
}
Results in: "_score": 0.6558018
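The reason the original term wins can be shown with a toy model of the two sub-fields. This is only a sketch of which should clauses match, not of Elasticsearch's actual scoring: the synonym table and function names are hypothetical, whereas the answer loads WordNet synonyms. In a bool query the scores of all matching should clauses are added together, so matching two clauses beats matching one.

```python
# Hypothetical synonym table; the answer above uses a WordNet file instead.
SYNONYMS = {"elated": {"gleeful", "joyful"}}


def index_document(text):
    """Return the tokens each sub-field would hold for a one-word document."""
    token = text.lower()
    return {
        "mood.no_syn": {token},
        "mood.syn": {token} | SYNONYMS.get(token, set()),
    }


def matching_fields(indexed, term):
    """List the sub-fields whose tokens contain the search term."""
    return sorted(f for f, tokens in indexed.items() if term.lower() in tokens)


doc = index_document("elated")
print(matching_fields(doc, "gleeful"))  # ['mood.syn']
print(matching_fields(doc, "elated"))   # ['mood.no_syn', 'mood.syn']
```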
