Multiple Match Phrase Prefixes Return Zero Results In Elasticsearch - elasticsearch

I have the following query, run against Elasticsearch 2.3, which returns zero results:
{
"query": {
"bool": {
"must": [
{
"match_phrase_prefix": {
"phone": "123"
}
},
{
"match_phrase_prefix": {
"firstname": "First"
}
}
]
}
}
}
Output from the above query:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
Output of the above query with _explain:
{
"_index": "index_name",
"_type": "doc_type",
"_id": "_explain",
"_version": 4,
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": false
}
However, when I run either of the following single-clause queries, I get results, including the one document that matches both parts of the query above. If I include the full phone number, the document also appears in the results.
Phone numbers are stored as strings without any formatting, e.g. "1234567890".
Is there any reason why the query with two prefix clauses returns zero results?
{
"query": {
"bool": {
"must": [
{
"match_phrase_prefix": {
"phone": "123"
}
}
]
}
}
}
{
"query": {
"bool": {
"must": [
{
"match_phrase_prefix": {
"firstname": "First"
}
}
]
}
}
}

I was able to get the results I wanted by changing the phone number query to a regexp query instead of a match_phrase_prefix query.
{
"query": {
"bool": {
"must": [
{
"regexp": {
"phone": "123[0-9]+"
}
},
{
"match_phrase_prefix": {
"firstname": "First"
}
}
]
}
}
}
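For anyone building this request from code, here is a minimal Python sketch of the working query above. It is not part of the original answer: `build_prefix_search` is a hypothetical helper name, and it only constructs the request body, leaving the actual call to whichever client you use. It assumes, as stated in the question, that phone numbers are stored as unformatted digit strings, so a digits-only prefix needs no escaping inside the regular expression.

```python
import json


def build_prefix_search(phone_prefix, firstname_prefix):
    """Build the bool query from the answer above for arbitrary prefixes.

    Hypothetical helper: the field names "phone" and "firstname" come from
    the question; everything else is illustrative.
    """
    # Assumption: phones are stored as plain digit strings ("1234567890"),
    # so we reject anything that would need regexp escaping.
    if not (phone_prefix.isascii() and phone_prefix.isdigit()):
        raise ValueError("phone prefix must contain ASCII digits only")
    return {
        "query": {
            "bool": {
                "must": [
                    # Prefix followed by one or more digits, as in the answer.
                    {"regexp": {"phone": phone_prefix + "[0-9]+"}},
                    {"match_phrase_prefix": {"firstname": firstname_prefix}},
                ]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_prefix_search("123", "First"), indent=2))
```

Note that the pattern `123[0-9]+` requires at least one digit after the prefix, so a stored value of exactly "123" would not match.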

Related

Elasticsearch - How do i search on 2 fields. 1 must be null and other must match search text

I am trying to run a search on Elasticsearch 6.8.
I don't have control over the Elasticsearch instance, meaning I cannot control how the data is indexed.
My data is structured like this when I run a match_all search:
{ "took": 4,
"timed_out": false,
"_shards": {
"total": 13,
"successful": 13,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 15.703552,
"hits": [ {
"_index": "(removed index)",
"_type": "_doc",
"_id": "******** (Removed id)",
"_score": 15.703552,
"_source": {
"VCompany": {
"cvrNummer": 12345678,
"penheder": [
{
"pNummer": 1234567898,
"periode": {
"gyldigFra": "2013-04-10",
"gyldigTil": "2014-09-30"
}
}
],
"vMetadata": {
"nyesteNavn": {
"navn": "company1",
"periode": {
"gyldigFra": "2013-04-10",
"gyldigTil": "2014-09-30"
}
},
}
}
}
}
}]
The JSON might not be complete because I removed some unneeded data. What I am trying to do is search for documents where "vCompany.vMetaData.nyesteNavn.gyldigTil" is null and where "vCompany.vMetaData.nyesteNavn.navn" matches a text string.
I tried something like this:
{
"query": {
"bool": {
"must": [
{"match": {"Vrvirksomhed.virksomhedMetadata.nyesteNavn.navn": "company1"}}
],
"should": {
"terms": {
"Vrvirksomhed.penheder.periode.gyldigTil": null
}
}
}
}
}
You need to use must_not with an exists query, as shown below, to check whether a field is null. The query below returns documents where the name matches company1 and the Vrvirksomhed.penheder.periode.gyldigTil field is null or missing.
{
"query": {
"bool": {
"must": [
{
"match": {
"Vrvirksomhed.virksomhedMetadata.nyesteNavn.navn": "company1"
}
}
],
"must_not": [
{
"exists": {
"field": "Vrvirksomhed.penheder.periode.gyldigTil"
}
}
]
}
}
}
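The same pattern generalizes to any pair of fields. The following Python sketch is an illustration only: `build_match_and_missing` is a hypothetical helper, not an Elasticsearch API, and it just assembles the request body shown above from its three inputs.

```python
import json


def build_match_and_missing(match_field, text, missing_field):
    """Match `text` on one field and require another field to be absent.

    Hypothetical helper; it mirrors the bool query from the answer above.
    """
    return {
        "query": {
            "bool": {
                "must": [{"match": {match_field: text}}],
                # exists only matches documents holding an indexed, non-null
                # value, so negating it selects null or missing fields.
                "must_not": [{"exists": {"field": missing_field}}],
            }
        }
    }


if __name__ == "__main__":
    body = build_match_and_missing(
        "Vrvirksomhed.virksomhedMetadata.nyesteNavn.navn",
        "company1",
        "Vrvirksomhed.penheder.periode.gyldigTil",
    )
    print(json.dumps(body, indent=2))
```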

Is it possible to use a query result into another query in ElasticSearch?

I have two queries that I want to combine. The first one returns a document with some fields.
Now I want to use one of these fields in a second query, without creating two separate ones.
Is there a way to combine them in order to accomplish my task?
This is the first query
{
"_source": {
"includes": [
"data.session"
]
},
"query": {
"bool": {
"must": [
{
"match": {
"field1": "9419"
}
},
{
"match": {
"field2": "5387"
}
}
],
"filter": [
{
"range": {
"timestamp": {
"time_zone": "+00:00",
"gte": "2020-10-24 10:16",
"lte": "2020-10-24 11:16"
}
}
}
]
}
},
"size" : 1
}
And this is the response returned:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 109,
"relation": "eq"
},
"max_score": 3.4183793,
"hits": [
{
"_index": "file",
"_type": "_doc",
"_id": "UBYCkgsEzLKoXh",
"_score": 3.4183793,
"_source": {
"data": {
"session": "123456789"
}
}
}
]
}
}
I want to pass that "data.session" value from the first query's result into another query, instead of rewriting the value of the field by hand:
{
"_source": {
"includes": [
"data.session"
]
},
"query": {
"bool": {
"must": [
{
"match": {
"data.session": "123456789"
}
}
]
}
},
"sort": [
{
"timestamp": {
"order": "asc"
}
}
]
}
If you mean using the result of the first query as input to the second query, that is not possible in Elasticsearch. But if you share your query and use case, we might be able to suggest a better way.
Elasticsearch does not allow subqueries or inner queries.
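Since the chaining cannot happen inside Elasticsearch, the usual fallback is to do it in the client: run the first query, read the field from the response, then build the second query. Below is a minimal Python sketch of that glue code, assuming the response shape shown in the question. `extract_session` and `build_session_query` are hypothetical names, and sending the requests is left to your client of choice.

```python
def extract_session(first_response):
    """Return data.session from the first hit, or None if there were no hits."""
    hits = first_response.get("hits", {}).get("hits", [])
    if not hits:
        return None
    return hits[0].get("_source", {}).get("data", {}).get("session")


def build_session_query(session):
    """Build the second query from the question for a given session value."""
    return {
        "_source": {"includes": ["data.session"]},
        "query": {"bool": {"must": [{"match": {"data.session": session}}]}},
        "sort": [{"timestamp": {"order": "asc"}}],
    }


if __name__ == "__main__":
    # Trimmed copy of the response shown in the question.
    sample = {
        "hits": {
            "hits": [
                {"_id": "UBYCkgsEzLKoXh",
                 "_source": {"data": {"session": "123456789"}}}
            ]
        }
    }
    session = extract_session(sample)
    if session is not None:
        print(build_session_query(session))
```

This costs two round trips, but it keeps each query simple and makes the empty-result case explicit.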

ElasticSearch - search for any nested field that is in range

I have the following field as part of an Elasticsearch document:
"PayPlan" : {
"ActivePlans" : {
"plan1" : {
"startsOn" : "1",
"endsOn" : "999999"
}
},
"someOtherData" : [
NumberLong(0), 0]
},
Plan names follow no pattern (they can be 'plan2323a' or 'plan_hh_jj' and so on).
How can I search for ALL the documents that have ANY plan whose startsOn is smaller than X and whose endsOn is bigger than X?
I am unable to do this with query_string, or with a range query using the format "PayPlan.ActivePlans.*.startsOn" (the asterisk did not work as a wildcard in range).
Thank you all
This is the Elasticsearch query I have working now, but I want to change 'plan1' into '*' so that it searches any sub-plan:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"match_all": {}
},
{
"or": {
"filters": [
{
"bool": {
"must": [
{
"range": {
"PayPlan.ActivePlans.plan1.startsOn": {
"lte": "1234"
}
}
},
{
"range": {
"PayPlan.ActivePlans.plan1.endsOn": {
"gte": "1236"
}
}
}
]
}
}
]
}
}
]
}
}
}
}
}
You could start with a query string like:
GET test1/_search
{
"query": {
"query_string": {
"default_field": "PayPlan.ActivePlans.plan*.startsOn",
"query": ">0"
}
}
}
The output (with a quick test run):
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test1",
"_type": "plan",
"_id": "AVq000G1mKJs7uLU8liY",
"_score": 1,
"_source": {
"PayPlan": {
"ActivePlans": {
"plan2": {
"startsOn": "2",
"endsOn": "999998"
}
}
}
}
},
{
"_index": "test1",
"_type": "plan",
"_id": "AVq00p0pmKJs7uLU8liW",
"_score": 1,
"_source": {
"PayPlan": {
"ActivePlans": {
"plan1": {
"startsOn": "1",
"endsOn": "999999"
}
}
}
}
}
]
}
}
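One caveat with the wildcard approach, as far as I can tell: a field pattern is expanded per field, so a startsOn condition and an endsOn condition combined this way can be satisfied by two different plans. Also, if the values are indexed as strings (as the quoted "1" and "999999" suggest), range comparisons may be lexicographic rather than numeric. A cautious option is to re-check each returned `_source` on the client. The Python sketch below does that; `has_plan_covering` is a hypothetical helper and assumes the document layout from the question.

```python
def has_plan_covering(source, x):
    """True if a single plan satisfies startsOn < x < endsOn (numerically)."""
    plans = source.get("PayPlan", {}).get("ActivePlans", {})
    if not isinstance(plans, dict):
        return False
    for plan in plans.values():
        try:
            # Values are stored as strings in the question, so convert them.
            starts = int(plan["startsOn"])
            ends = int(plan["endsOn"])
        except (KeyError, TypeError, ValueError):
            continue  # skip plans with missing or non-numeric bounds
        if starts < x < ends:
            return True
    return False


if __name__ == "__main__":
    doc = {"PayPlan": {"ActivePlans": {"plan1": {"startsOn": "1", "endsOn": "999999"}}}}
    print(has_plan_covering(doc, 1235))
```

Applied to the hits of the wildcard query, this keeps only the documents where both bounds come from the same plan.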

Elasticsearch Cardinality Aggregation giving completely wrong results

I am saving each page view of a website in an ES index, where each page is recognized by an entity_id.
I need to get the total count of unique page views since a given point in time.
I have the following mapping:
{
"my_index": {
"mappings": {
"page_views": {
"_all": {
"enabled": true
},
"properties": {
"created": {
"type": "long"
},
"entity_id": {
"type": "integer"
}
}
}
}
}
}
According to the Elasticsearch docs, the way to do that is using a cardinality aggregation.
Here is my search request:
GET my_index/page_views/_search
{
"filter": {
"bool": {
"must": [
[
{
"range": {
"created": {
"gte": 9999999999
}
}
}
]
]
}
},
"aggs": {
"distinct_entities": {
"cardinality": {
"field": "entity_id",
"precision_threshold": 100
}
}
}
}
Note that I have used a timestamp in the future, so no results are returned.
And the result I'm getting is:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
},
"aggregations": {
"distinct_entities": {
"value": 116
}
}
}
I don't understand how the unique page visits could be 116, given that there are no page visits at all matching the search query. What am I doing wrong?
Your aggregation is returning the global value for the cardinality: a top-level filter only narrows the returned hits and is applied after the aggregations are computed, so it has no effect on them. If you want only the cardinality of the filtered set, one way to do that is to use a filter aggregation and nest your cardinality aggregation inside it. Leaving out the filtered query for clarity (you can add it back in easily enough), the query I tried looks like:
curl -XPOST "http://localhost:9200/my_index/page_views/_search" -d'
{
"size": 0,
"aggs": {
"filtered_entities": {
"filter": {
"bool": {
"must": [
[
{
"range": {
"created": {
"gte": 9999999999
}
}
}
]
]
}
},
"aggs": {
"distinct_entities": {
"cardinality": {
"field": "entity_id",
"precision_threshold": 100
}
}
}
}
}
}'
which returns:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": 0,
"hits": []
},
"aggregations": {
"filtered_entities": {
"doc_count": 0,
"distinct_entities": {
"value": 0
}
}
}
}
Here is some code you can play with:
http://sense.qbox.io/gist/bd90a74839ca56329e8de28c457190872d19fc1b
I used Elasticsearch 1.3.4, by the way.
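For completeness, here is a small Python sketch of the same idea: one function that builds the nested filter aggregation and one that reads the count back out of the response. Both function names are hypothetical, the aggregation names are taken from the answer, and the doubly nested `must` list from the original request is flattened to a single list here, which is my simplification.

```python
def build_filtered_cardinality(field, since, precision_threshold=100):
    """Build a filter aggregation wrapping a cardinality aggregation."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "filtered_entities": {
                "filter": {
                    "bool": {"must": [{"range": {"created": {"gte": since}}}]}
                },
                "aggs": {
                    "distinct_entities": {
                        "cardinality": {
                            "field": field,
                            "precision_threshold": precision_threshold,
                        }
                    }
                },
            }
        },
    }


def read_distinct_count(response):
    """Extract the nested cardinality value from a search response."""
    return response["aggregations"]["filtered_entities"]["distinct_entities"]["value"]


if __name__ == "__main__":
    # Aggregations section of the response shown above.
    sample = {
        "aggregations": {
            "filtered_entities": {"doc_count": 0, "distinct_entities": {"value": 0}}
        }
    }
    print(read_distinct_count(sample))
```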

Is there any method in Elastic Search to get result in case of misspelling?

I want to know if it's possible to search the data despite a misspelling, the way Google does.
Currently this query returns thousands of results:
{
"query": {
"query_string": {
"query": "obama"
}
}
}
but when I change it to:
{
"query": {
"query_string": {
"query": "omama"
}
}
}
With "obama" replaced by "omama" there are no results. Is it possible to get results despite the wrong spelling?
I think what you are looking for is the fuzzy query.
{
"query": {
"fuzzy": {
"field_name" : "omama"
}
}
}
If you are running this on a single field, you can use the fuzzy_like_this_field query:
{
"fuzzy_like_this_field" : {
"name.first" : {
"like_text" : "omama",
"max_query_terms" : 12
}
}
}
You can also check phonetic matching:
https://github.com/elasticsearch/elasticsearch-analysis-phonetic
Simply use a fuzzy query (see the documentation):
{
"query": {
"fuzzy": {
"name": "omama"
}
}
}
You should get your result:
{
"took": 12,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 2.7917595,
"hits": [
{
"_index": "test",
"_type": "obama",
"_id": "D_ovfcHkQwODdftWM4_z1Q",
"_score": 2.7917595,
"_source": {
"name": "obama"
}
}
]
}
}
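To see why "omama" can find "obama", it helps to count the edits between the two words. The Python sketch below is an illustration, not Elasticsearch's own implementation: it computes a plain Levenshtein distance and builds a fuzzy query body. The helper names are hypothetical, and the explicit `fuzziness` setting is an assumption about what your Elasticsearch version accepts, so check it against the documentation for your release.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def build_fuzzy_query(field, value, fuzziness="AUTO"):
    """Build a fuzzy query body; `fuzziness` is an assumed, tunable setting."""
    return {"query": {"fuzzy": {field: {"value": value, "fuzziness": fuzziness}}}}


if __name__ == "__main__":
    print(edit_distance("omama", "obama"))
    print(build_fuzzy_query("name", "omama"))
```

The two words differ by a single substitution (m for b), which is within the small edit budget a fuzzy query typically allows.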
