Problem searching domain with elastic search - elasticsearch

I have registered the following document
"ownDomainValue":"catalogonuevo1.com"
When I perform the following query the document is found, value is "catalogonuevo1"
[
{
"query": {
"bool": {
"filter": [
{
"term": {
"valor_dominio_propio": "catalogonuevo1"
}
}
]
}
},
"from": 0,
"size": 1
}
]
However, when the search value is "catalogonuevo1.com"
[
{
"query": {
"bool": {
"filter": [
{
"term": {
"valor_dominio_propio": "catalogonuevo1.com"
}
}
]
}
},
"from": 0,
"size": 1
}
]
it does not return any value, using MatchQueries the opposite happens, it always finds a wrong document, such as one with the value "catalogonuevo2.com" which is not what I am looking for since I need the search to be exact

It sounds like the problem is that the "term" query in Elasticsearch is not matching the exact value "catalogonuevo1.com" when it is included in the query.
This is likely because the "term" query is tokenizing the input string at the "." character, so it is matching on the token "catalogonuevo1" rather than the entire string "catalogonuevo1.com".
You can resolve this issue by using the "match_phrase" query instead of "term" query, as "match_phrase" query matches on the exact phrase rather than individual tokens.
Additionally, you can use keyword fields to store the domain values; this way, the values are not tokenized and the match phrase will work as expected.

Related

Elasticsearch Boolean query with Constant score wrapper

When using elasticsearch-7 I'm confused by es compound queries syntax.
Though reading es documents repeatedly but i just find standard syntax of Boolean or Constant score seperately.
As it illuminate,i understand what is 'query context' and what is 'filter context'.But when combining these two query type in a single query i don't know what it mean.
Let's see a example:
GET /classes_test/_search
{
"size": "21",
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"match": {
"class_name": "29386556"
}
}
],
"should": [
{
"term": {
"master": "7033560"
}
},
{
"term": {
"assistant": "7033560"
}
},
{
"term": {
"students": "7033560"
}
}
],
"minimum_should_match": 1,
"must_not": [
{
"term": {
"class_id": 0
}
}
],
"filter": [
{
"term": {
"class_status": "1"
}
}
]
}
}
}
}
}
This query can be executed and response well.Each item in response content has a '_score' value with 1.0.
So,is it mean that the sub bool query as a entirety is in a filter context though it has a 'must' and 'should'?
Also i found boolean query can have a constant score sub query.
Why es allow these syntax but has no more words to explain?
If you use a constant_score query, you'll never get scores different than 1.0, unless you specify boost parameters in which case the score will match those.
If you need scoring you obviously need to ditch constant_score.
In your case, your match query on class_name cannot yield any other score than 1 or 0 since this is basically a yes/no filter, not a matching based on full-text search.
To sum up, all your query executes in a filter context (hence score 0 or 1) since you don't rely on full-text search. So you get scoring whenever you use full-text search, not because you use a match query. In your case, you can merge all must constraints into filter, it won't make any difference since you only have filters (yes/no matches) and no full-text search.

ElasticSearch must-terms does not return data

My ElasticSearch must-terms does not work, the data has clientId value "08d71bc7-c4ab-6e1d-f858-cf3448242e8b" but the result is empty. I am using elasticsearch:6.7.1. Do you know the problem here?
{
"from": 0,
"size": 20,
"query": {
"bool": {
"must": [
{ "terms": { "clientId": ["08d71bc7-c4ab-6e1d-f858-cf3448242e8b", "08d71bc7-c4ab-6e1d-f858-cf3448242e8c"] } },
{
"query_string": {
"query": "*d*",
"fields": ["name", "description", "title"]
}
},
{ "query_string": { "query": "1", "fields": ["type"] } }
]
}
}
}
I share sample data
I haven't worked enough with "query_string"... But if you don't put them and run your query, I'm sure it should at least give you some results. If so, your "query_string"s are the ones that are giving you this bad time
I first recommend you to use "filter" instead of "must".
Consider using the Regexp query your first "query_string". I found here how to query multiple fields with Regexp.
For the second, it would be enough to use "term" instead of "query_string".
Hope this is helpful! :D
The search results depends on the analysis type of clientId . If clientId is a 'keyword' your query should work as expected, but if the type of clientId is 'text' then the value might get tokenized to smaller parts (break at the dash).
You can check the clientId fields type in the index mappings, and also run the analyze API to check the tokenization: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html

Return matched input from an elasticsearch query as they were typed

Here is an example:
"query": {
"bool": {
"should": [
{
"match_phrase_prefix": {
"keyword":{
"query": "pokemno", //approximative term
"fuzziness": 1
}
}
},
//... multiple should closures there
]
}
}
This query match a document with the "Pokémon" keyword, however i want to get back the exact input search term(s) that has matched for each document (here, "pokemno").
AFAIK the highlight feature only return the matched values as they are saved in the document, but not the ones from the search input.
Is there any way to get those values back from the search?

Must match multiple values

I have a query that works fine when I need the property of a document
to match just one value.
However I also need to be able to search with must with two values.
So if a banana has id 1 and a lemon has id 2 and I search for yellow
I will get both if I have 1 and 2 in the must clause.
But if i have just 1 I will only get the banana.
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [
{ "match":
{ "fruit.color": "yellow" }}
],
"must" : [
{ "match": { "fruit.id" : "1" } }
]
}
}
}
I haven´t found a way to search with two values with must.
is that possible?
If the document "must" be returned only if the id is 1 or 2, that sounds like another should clause. If I'm understanding your question properly, you want documents with either id 1 OR id 2. Additionally, if the color is yellow, give it a higher score.
Here's one way you might achieve what you're looking for:
{
"query": {
"bool": {
"should": {
"match": {
"fruit.color": "yellow"
}
},
"must": {
"bool": {
"should": [
{
"match": {
"fruit.id": "1"
}
},
{
"match": {
"fruit.id": "2"
}
}
]
}
}
}
}
}
Here I put the two match queries in the should clause of a separate bool query. This achieves the OR behavior you are looking for.
Have another look at the Bool Query documentation and take note of the nuances of should. It behaves differently by default depending on whether or not there is a sibling must clause and whether or not the bool query is being executed in filter context.
Another key option that is adjustable and can help you achieve your expected results is the minimum_should_match parameter. Have a look at this documentation page.
Instead of a match query, you could simply try the terms query for ORing between multiple terms.
Match queries are generally used for analyzed fields. For exact matching, you should use term queries
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [
{ "match": { "fruit.color": "yellow" } }
],
"must" : [
{ "terms": { "fruit.id": ["1","2"] } }
]
}
}
}
term or terms query is the perfect way to fetch the exact text or id, using match query result in search inside the id or text
Ex:
id = '4'
id = '44'
Search using match query with id = 4 return both 4 & 44 since it matches 4 in both. This is where terms query come into play.
same search using terms query will return 4 only.
So the accepted is absolutely wrong. Use the #Rahul answer. Just one more thing you need to do, Instead of text you need to analyse the field as a keyword
Example for indexing a field both as a text and keyword (mapping is for flat level for nested change it accordingly).
{
"index_patterns": [ "test" ],
"mappings": {
"kb_mapping_doc": {
"_source": {
"enabled": true
},
"properties": {
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
}
using #Rahul's answer doesn't worked because you might be analysed as a text.
id - access a text field
id.keyword - access a keyword field
it would be
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [{
"match": {
"color": "yellow"
}
}],
"must": [{
"terms": {
"id.keyword": ["1", "2"]
}
}]
}
}
}
So I would say accepted answer will return falsy results Please use #Rahul's answer with the corresponding mapping.

Elasticsearch case-insensitive query_string query with wildcards

In my ES mapping I have an 'uri' field which is currently set to not_analysed and I'm not allowed to change the mapping.I wanted to search for uri parts with a query_string query like this (this ES query is autogenerated, that is why it is a bit complicated but let's just focus on the query_string part)
{
"sort": [{"updated": {"order": "desc"}}],
"query": {
"bool": {
"must":[{
"query_string": {
"query":"*w3\\.org\\/2014\\/01\\/a*",
"lowercase_expanded_terms": true,
"default_field": "uri"
}
}],
"minimum_number_should_match": 1
}
}, "size": 50}
Now it is usually working, but I've the following url stored (fictional url): http://w3.org/2014/01/Abc.html and this query does not bring it back because of the A-a difference. Setting the expanded terms to false also not solves this. What should I do for this query to be case insensitive?
Thanks for the help in advance.
From the docs, it seems like you need a new analyzer that first transforms to lowercase and then can run the search. Have you tried that?
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/sorting-collations.html
As I read it, your pattern, lowercase_expanded_terms, only applies to expansions, not to regular words
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
lowercase_expanded_terms
Whether terms of wildcard, prefix, fuzzy, and range queries are to be automatically lower-cased or not (since they are not analyzed). Default it true
Try to use match query instead of query string.
{
"sort": [
{
"updated": {
"order": "desc"
}
}
],
"query": {
"bool": {
"must": [
{
"match": {
"uri": "*w3\\.org\\/2014\\/01\\/a*"
}
}
]
}
},
"size": 50
}
Query string queries are not analyzed and but match queries are analyzed.

Resources