Must match query is not working in elastic search - elasticsearch

I am tying to find all Videos with the name "The Shining"
but also with a parent_id = 189, and parent_type = "folder"
My query seams to connect all of the match statements with "OR" instead of "AND"
What am I doing wrong?
{
"fields": ["name","parent_id","parent_type"],
"query": {
"and": {
"must":[
{
"match": {
"name": "The Shining"
}
},
{
"match": {
"parent_id": 189
}
},
{
"match": {
"parent_type": "folder"
}
}
]
}
}
}
Mapping:
{"video" : {
"mappings" : {
"video" : {
"properties" : {
"homepage_tags" : {
"type" : "nested",
"properties" : {
"id" : {
"type" : "integer"
},
"metaType" : {
"type" : "string"
},
"tag_category_name" : {
"type" : "string"
},
"tag_category_order" : {
"type" : "integer"
},
"tag_name" : {
"type" : "string"
}
}
},
"id" : {
"type" : "integer"
},
"name" : {
"type" : "string"
},
"parent_id" : {
"type" : "integer"
},
"parent_type" : {
"type" : "string"
},
"provider" : {
"type" : "string"
},
"publish" : {
"type" : "string"
},
"query" : {
"properties" : {
"bool" : {
"properties" : {
"must" : {
"properties" : {
"match" : {
"properties" : {
"name" : {
"type" : "string"
},
"parent_id" : {
"type" : "long"
}
}
}
}
}
}
}
}
},
"source_id" : {
"type" : "string"
},
"subtitles" : {
"type" : "nested",
"include_in_root" : true,
"properties" : {
"content" : {
"type" : "string",
"store" : true,
"analyzer" : "no_stopwords"
},
"end_time" : {
"type" : "float"
},
"id" : {
"type" : "integer"
},
"parent_type" : {
"type" : "string"
},
"start_time" : {
"type" : "float"
},
"uid" : {
"type" : "integer"
},
"video_id" : {
"type" : "string"
},
"video_parent_id" : {
"type" : "integer"
},
"video_parent_type" : {
"type" : "string"
}
}
},
"tags" : {
"type" : "nested",
"properties" : {
"content" : {
"type" : "string"
},
"end_time" : {
"type" : "string"
},
"id" : {
"type" : "integer"
},
"metaType" : {
"type" : "string"
},
"parent_type" : {
"type" : "string"
},
"start_time" : {
"type" : "string"
}
}
},
"uid" : {
"type" : "integer"
},
"vid_url" : {
"type" : "string"
}
}
}
}}}

I was able to solve the problem by reading Volodymryrs answer above.
Aditional matches were being found because they were matching "the". I tried to add the operator argument, but that did not work unfortunately. What I did instead was to use "match_phrase" and also switched my two other match fields to "term" - see my answer below –
[
'query' => [
'bool' => [
'must' => [
[
'match_phrase' => [
'name' => $searchTerm
]
],
[
'term' => [
'parent_id' => intVal($parent_id)
]
],
[
'term' => [
'parent_type' => strtolower($parent_type)
]
]
]
]
]
];

You will get matches on the or shining because of match query, and depending on matches you will get score. One of the easiest fixes would be to add operator and:
{
"match": {
"name": "The Shining",
"operator": "and"
}
}
But it's not what you need since this will also match names "shining The" or "The sun is shining".
Other option is that if you need to do exact matches on name, then you would need to make field name as non-analyzed. In ES 5 you can set field type as a keyword
In addition I would recommend you to use bool query with term queries since they will do exact match.
{
"fields": ["name","parent_id","parent_type"],
"query": {
"bool": {
"must":[{
"term": {
"name": "The Shining"
}
},
{
"term": {
"parent_id": 189
}
},
{
"term": {
"parent_type": "folder"
}
}
]
}
}
}

Related

How to query in parent-child relation in elasticsearch

I have parent-child relation for Customer(Parent) and PromotionCustomer(Child).
Customer data:
{
"id": "b7818d4d-566e-24f5-89d2-3995bb97fd5e",
**"externalId": "9200191",**
"name": "LOBLAW NFLD",
"fullName": "LOBLAW NFLD",
"businessSystem": "EBJ/001"
}
PromotionCustomer data:
{
"id" : "31f2e065-a046-9c3a-808b-83545ddb07d1",
"externalId" : "T-000195542",
"businessSystem" : "EBJ/001",
"promotionDescription" : "PM-RT-LOBLAW NFLD-BB",
"promotionType" : "Bill Back",
"promotionStatus" : "Approved",
**"promotionCustomer" : "9200191",**
"validFrom" : "02/28/2019",
"validTo" : "03/20/2019",
"promotionDateTypeCode" : "1"
}
This the mapping details(schema)
{
"promotionsearch" : {
"mappings" : {
"properties" : {
"_class" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"customer" : {
"type" : "nested",
"include_in_parent" : true,
"properties" : {
"businessSystem" : {
"type" : "keyword"
},
"externalId" : {
"type" : "keyword"
},
"fullName" : {
"type" : "text"
},
"id" : {
"type" : "text"
},
"name" : {
"type" : "text"
}
}
},
"id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"promotioncustomer" : {
"type" : "nested",
"include_in_parent" : true,
"properties" : {
"businessSystem" : {
"type" : "keyword"
},
"externalId" : {
"type" : "keyword"
},
"id" : {
"type" : "text"
},
"promotionCustomer" : {
"type" : "text"
},
"promotionDateTypeCode" : {
"type" : "keyword"
},
"promotionDescription" : {
"type" : "text"
},
"promotionProducts" : {
"type" : "text"
},
"promotionStatus" : {
"type" : "keyword"
},
"promotionType" : {
"type" : "keyword"
},
"validFrom" : {
"type" : "date",
"format" : "MM/dd/yyyy"
},
"validTo" : {
"type" : "date",
"format" : "MM/dd/yyyy"
}
}
},
"promotionjoin" : {
"type" : "join",
"eager_global_ordinals" : true,
"relations" : {
"customer" : "promotioncustomer"
}
},
"routing" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
I need to get the get the promotionCustomer data based on the externalId which is the customerId which maps to promotionCustomer along with the date range.
I've written a query as below.
GET /promotionsearch/_search
{
"query": {
"has_parent": {
"parent_type": "customer",
"inner_hits": {},
"query": {
"has_child": {
"type": "promotioncustomer",
"query": {
"bool": {
"must": [
{
"match": {
"customer.externalId": "9200191"
}
},
{
"range": {
"promotioncustomer.validFrom": {
"gte": "02/28/2019"
}
}
},
{
"range": {
"promotioncustomer.validTo": {
"lte": "03/20/2019"
}
}
}
]
}
}
}
}
}
}
}
But it's not yielding the result. I know the reason as well. In the has_child clause i'm making use of parent field i.e "customer.externalId". Is there a way to include/add this condition in the parent_type query and then based on the result apply range condition inside has_child
I was able to get the solution. Hope it helps others as well. No need make multiple query call to ES inorder get the desired result.
GET /promotionsearch/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"promotioncustomer.promotionDescription": "PM-RT"
}
},
{
"range": {
"promotioncustomer.validFrom": {
"gte": "02/28/2019"
}
}
},
{
"range": {
"promotioncustomer.validTo": {
"lte": "03/28/2019"
}
}
},
{
"has_parent": {
"parent_type": "customer",
"query": {
"match": {
"customer.fullName": "sdjhfb"
}
}
}
}
]
}
}
}

Elasticsearch multi_match + nested search

I am trying to execute a multi_match + nested search in ElasticSearch 6.4. I have the following mappings:
"name" : {
"type" : "text"
},
"status" : {
"type" : "short"
},
"user" : {
"type" : "nested",
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"type" : "nested",
"properties" : {
"location" : {
"type" : "nested",
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},
And this is the html_strip analyzer:
"html_strip" : {
"filter" : [
"lowercase",
"stop",
"snowball"
],
"char_filter" : [
"html_strip"
],
"type" : "custom",
"tokenizer" : "standard"
}
And my current query is this one:
"query": {
"bool": {
"must": {
"multi_match": {
"query": 'Paris',
"fields": ['name', 'user.profile.location.name']
},
},
"filter": {
"term": {
"status": 1
}
}
}
}
Obviously searching for "Paris" in user.profile.location.name doesn't work. I was trying to adapt my code to following this answer https://stackoverflow.com/a/48836012/12007123 but without any success.
What I am basically trying to achieve, is to be able to search for a value in multiple fields, this may or may not be nested.
I was also checking this discussion https://discuss.elastic.co/t/multi-match-query-string-with-nested-and-non-nested-fields/118652/5 but everything I tried wasn't successful.
If I just search for name, the search is working fine.
Any tips on how can I achieve this the right way, would be much appreciated.
EDIT:
While I didn't get an answer to my initial question, I was following Nikolay's (#nikolay-vasiliev) comment and changed th mappings to Object instead of Nested.
At least now I am able to search in user.profile.location.name. This is how the new mapping for user looks like:
"user" : {
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"properties" : {
"location" : {
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},

elasticsearch: search within text

For example if name as P.Shanmukha Sharma and if user searches for Shanmukha will not be available for search result. its returning only for P.Shanmukha and Sharma, is there any way if i will search Shanmukha and it will return result?
"user" : {
"properties" : {
"city" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
},
"created" : {
"type" : "date",
"format" : "strict_date_optional_time||epoch_millis"
},
"id" : {
"type" : "long"
},
"latitude" : {
"type" : "double"
},
"longitude" : {
"type" : "double"
},
"profile_image" : {
"type" : "string"
},
"state" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
},
"super_verification" : {
"type" : "string"
},
"type" : {
"type" : "string"
},
"username" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
}
}
}
username is defined as a search analyzer
and search query is
def EsSearch(self, index, page, size, searchTerm):
body = {
'query': {
'match': searchTerm
},
'sort': {
'created': {
'order': 'desc'
}
},
'filter': {
'term': {
'super_verification': 'verified'
}
}
}
res = self.conn.search(index=index, body=body)
output = []
for doc in res['hits']['hits']:
output.append(doc['_source'])
return output
so doing so much of research on ES i Got this solution with wildcard. Thanks EveryOne
{
"query": {
"wildcard": {
"username": {
"value": "*Shanmukha*"
}
}
}
}
Basically, 2 way in to do so,
By GET method and URL:
http://localhost:9200/your_index/your_type/_search?q=username:*Shanmukha*&pretty=true
By Fuzzy Query
as given by #krrish this one:

Mysteriously wrong values of numerical fields in ElasticSearch

I've spent the last 2 days investigating this mind-bending issue:I have an index with custom mappings on which I perform some aggregations. The problem is that in the results of the aggregation on numerical fields,it returns values that do not appear in the database from which the data was imported, even though the number of results is the same.
I found a similar issue here where the problem was inconsistent mapping of a field across an index, but in my case it is mapped as the same type. The problem happens with the fields: award.value.amount, award.value.x_amountEur, tender.value.x_amountEur as far as I have checked.This is my current mapping as stated by curl -XGET 'http://localhost:9200/documents/_mappings?pretty&human'
(the part that contains the target fields):
{
"documents" : {
"mappings" : {
"document" : {
"properties" : {
"additionalIdentifiers" : {
"type" : "string",
"index" : "not_analyzed"
},
"award" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"contract_number" : {
"type" : "string",
"index" : "not_analyzed"
},
"date" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"x_day" : {
"type" : "integer"
},
"x_month" : {
"type" : "integer"
},
"x_year" : {
"type" : "integer"
}
}
},
"description" : {
"type" : "string"
},
"initialValue" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"amount" : {
"type" : "float"
},
"currency" : {
"type" : "string"
},
"x_vat" : {
"type" : "float"
}
}
},
"minValue" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"amount" : {
"type" : "float"
},
"x_amountEur" : {
"type" : "float"
}
}
},
"title" : {
"type" : "string"
},
"value" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"amount" : {
"type" : "float"
},
"currency" : {
"type" : "string"
},
"x_amountEur" : {
"type" : "float"
},
"x_vat" : {
"type" : "float"
},
"x_vatbool" : {
"type" : "boolean"
}
}
},
"x_initialValue" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"x_amountEur" : {
"type" : "float"
},
"x_vatbool" : {
"type" : "boolean"
}
}
}
}
},
"awardCriteria" : {
"type" : "string"
},
"contract_number" : {
"type" : "string"
},
"document_id" : {
"type" : "string",
"index" : "not_analyzed"
},
"numberOfTenderers" : {
"type" : "string"
},
"procurementMethod" : {
"type" : "string"
},
"procuring_entity" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"address" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"country" : {
"type" : "string"
},
"countryName" : {
"type" : "string",
"index" : "not_analyzed"
},
"email" : {
"type" : "string"
},
"locality" : {
"type" : "string"
},
"postalCode" : {
"type" : "string"
},
"streetAddress" : {
"type" : "string"
},
"telephone" : {
"type" : "string"
},
"x_url" : {
"type" : "string"
}
}
},
"name" : {
"type" : "string"
},
"x_slug" : {
"type" : "string",
"index" : "not_analyzed"
}
}
},
"suppliers" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"address" : {
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"email" : {
"type" : "string"
},
"locality" : {
"type" : "string"
},
"postalCode" : {
"type" : "string"
},
"streetAddress" : {
"type" : "string"
},
"telephone" : {
"type" : "string"
},
"x_url" : {
"type" : "string"
}
}
},
"name" : {
"type" : "string"
},
"x_slug" : {
"type" : "string",
"index" : "not_analyzed"
}
}
},
"tender" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"value" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"amount" : {
"type" : "float"
},
"currency" : {
"type" : "string"
},
"x_amountEur" : {
"type" : "float"
},
"x_vat" : {
"type" : "float"
},
"x_vatbool" : {
"type" : "boolean"
}
}
}
}
}
This is the aggregation I am using in order to get the values of contracts between each pair of supplier - procuring_entity:
Document.es.search({
"search_type": "count" ,
"body":{
"aggregations": {
"entities":{
"nested": {
"path": "procuring_entity"
},
"aggs": {
"procuring_entity_names": {
"terms": {
"field": "procuring_entity.x_slug",
"size": 0
},
"aggs": {
"suppliers": {
"nested": {
"path": "suppliers"
},
"aggs": {
"suppliers_names": {
"terms":{
"field": "suppliers.x_slug",
"size": 0
},
"aggs": {
"awards": {
"nested": {
"path": "award.value"
},
"aggs": {
"award_amounts": {
"terms":{
"field": "award.value.x_amountEur",
"size": 0
}
}
}
}
}
}
}
}
}
}
}
}
}
}})
The result with type float is :
{"entities"=>
{"doc_count"=>24300,
"procuring_entity_names"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>"vsia-bernu-kliniska-universitates-slimnica",
"doc_count"=>1360,
"suppliers"=>
{"doc_count"=>1360,
"suppliers_names"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>"recipe-plus-as",
"doc_count"=>388,
"awards"=>
{"doc_count"=>388,
"awards"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>3679.086669921875, "doc_count"=>373},
{"key"=>0.0, "doc_count"=>12},
{"key"=>73610.3203125, "doc_count"=>1},
{"key"=>244000.0, "doc_count"=>1},
{"key"=>342348.9375, "doc_count"=>1}]}}}
The problem is that in MongoDB the same query returns 388 documents that all have award.value.x_amountEur = 3679.08661250056 , as presented by Mongoid query:
Document.where(:"procuring_entity.x_slug" => "vsia-bernu-kliniska-universitates-slimnica")
.keep_if{|doc| doc.suppliers.first.x_slug == "recipe-plus-as"}
.map{|doc| doc.award.value.x_amountEur}.uniq
=>[3679.08661250056]
A query directly into MongoDB returns the same.
I have also tried to map the targeted fields as double, which gave the same result and as long which returned the following (even more incorrect result):
{"entities"=>
{"doc_count"=>24300,
"procuring_entity_names"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>"vsia-bernu-kliniska-universitates-slimnica",
"doc_count"=>1360,
"suppliers"=>
{"doc_count"=>1360,
"suppliers_names"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>"recipe-plus-as",
"doc_count"=>388,
"awards"=>
{"doc_count"=>388,
"awards"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>3679, "doc_count"=>371},
{"key"=>0, "doc_count"=>12},
{"key"=>44300, "doc_count"=>1},
{"key"=>80472, "doc_count"=>1},
{"key"=>331636, "doc_count"=>1},
{"key"=>342348, "doc_count"=>1},
{"key"=>1658805, "doc_count"=>1}]}}}
I'm using Elasticsearch 2.0, mongoid 5.0.1 and mongoid-elasticsearch for indexing. I can't think of anything else to do so any suggestion is welcomed and appreciated.
I tried to test your scenario with ES 2.0 and there is something that I'm missing. I cannot make it create buckets for the award.value.x_amountEur unless I use a reverse_nested aggregation to "get out" from one nested path and into another.
So, instead of the awards aggregation that you have I'm using the same aggregation but "wrapped" in a reverse_nested aggregation:
"aggs": {
"getting_back": {
"reverse_nested": {},
"aggs": {
"awards": {
"nested": {
"path": "award.value"
},
"aggs": {
"award_amounts": {
"terms": {
"field": "award.value.x_amountEur"
}
}
}
}
}
}
}
And for this one I am seeing something ok.
Later edit: following mine and more general #Val's suggestion, the complete solution was to use reverse_nested on both awards and suppliers aggregations.

Is it possible to define default mapping for an inner object in ElasticSearch?

Say I have a document like this:
{
"events" : [
{
"event_id" : 123,
"props" : {
"version": "33"
},
{
"event_id" : 124,
"props" : {
"version": "44a"
}
]
}
Is it possible to specify that the events.props.version be mapped to some type?
I've tried:
{
"template" : "logstash-*",
...
"mappings" : {
"_default_" : {
"properties" : {
"events.props.version" : { "type" : "string" }
}
}
}
}
But that doesn't seem to work.
Please have a look at mapping API in elasticsearch Mapping API.
To set any analyzer in the inner element we need to consider each and every inner field as a separate properties set. try the following
{
"mappings": {
"properties": {
"events": {
"properties": {
"event_id": {
"type": "string",
"analyzer": "keyword"
},
"props": {
"properties": {
"version": {
"type": "string"
}
}
}
}
}
}
}
}
if this not works please provide me you mapping.
Sure, but you need to use the "object" type:
From the doc ( https://www.elastic.co/guide/en/elasticsearch/reference/1.5/mapping-object-type.html ) if you want to map
{
"tweet" : {
"person" : {
"name" : {
"first_name" : "Shay",
"last_name" : "Banon"
},
"sid" : "12345"
},
"message" : "This is a tweet!"
}
}
you can write:
{
"tweet" : {
"properties" : {
"person" : {
"type" : "object",
"properties" : {
"name" : {
"type" : "object",
"properties" : {
"first_name" : {"type" : "string"},
"last_name" : {"type" : "string"}
}
},
"sid" : {"type" : "string", "index" : "not_analyzed"}
}
},
"message" : {"type" : "string"}
}
}
}

Resources