Query issue with parent-children relation in Elasticsearch

Given the following parent-child mapping:
curl -XPUT 'localhost:9200/my_index' -d '{
"mappings": {
"my_parent": {
"dynamic": "strict",
"properties" : {
"title" : { "type": "string" },
"body" : { "type": "string" },
"source_id" : { "type": "integer" }
}
},
"my_child": {
"_parent": {"type": "my_parent" },
"properties" : {
"user_id" : { "type": "string" }
}}}}'
... these two parents with ids 10 and 11:
curl -X PUT 'localhost:9200/my_index/my_parent/10' -d '{
"title" : "Microsiervos - Discos duros de 10TB",
"body" : "Empiezan a sacar DD de 30GB en el mercado",
"source_id" : "27"
}'
curl -X PUT 'localhost:9200/my_index/my_parent/11' -d '{
"title" : "Microsiervos - En el 69 llegamos a la luna",
"body" : "Se cumplen 3123 anos de la llegada a la luna",
"source_id" : "27"
}'
... and these two children:
curl -XPUT 'localhost:9200/my_index/my_child/1234_10?parent=10' -d '{
"user_id": "1234"
}'
curl -XPUT 'localhost:9200/my_index/my_child/1234_11?parent=11' -d '{
"user_id": "1234"
}'
With the following query, I want to get the _id of every parent that has a child with user_id = 1234.
curl -XGET 'localhost:9200/my_index/my_parent/_search?pretty=true' -d '{
"_source" : "_id",
"query": {
"has_child": {
"type": "my_child",
"query" : {
"query_string" : {
"default_field" : "user_id",
"query" : "1234"
}}}}}'
This returns both ids, 10 and 11.
Now I want to search the parents, restricted to those specific ids, something like this:
curl -XGET 'localhost:9200/my_index/my_parent/_search?pretty=true' -d '{
"query": {
"bool": {
"must": [
{
"terms": {
"_id": ["10", "11"]
}},
{
"query_string": {
"default_field": "body",
"query": "mercado"
}}]}}}'
As you can see, the "_id": ["10", "11"] part is written by hand. I would like to know if there is a way to combine these two queries into a single query, so that the ids returned by the first query are automatically used in the second one.
The output should then be:
},
"hits" : {
"total" : 1,
"max_score" : 0.69177496,
"hits" : [ {
"_index" : "my_index",
"_type" : "my_parent",
"_id" : "10",
"_score" : 0.69177496,
"_source":{
"title" : "Microsiervos - Discos duros de 10TB",
"body" : "Empiezan a sacar DD de 30GB en el mercado",
"source_id" : "27"
}}]}}

Use a bool query and put both conditions in must:
curl -XGET "http://localhost:9200/my_index/my_parent/_search" -d'
{
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "body",
"query": "mercado"
}
},
{
"has_child": {
"type": "my_child",
"query": {
"query_string": {
"default_field": "user_id",
"query": "1234"
}
}
}
}
]
}
}
}'
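The same request body can be assembled programmatically. The following is a minimal Python sketch, not part of any Elasticsearch client: the helper name parent_search_body is hypothetical, and it only builds the JSON body, which would still have to be sent to my_index/my_parent/_search.

```python
import json


def parent_search_body(body_text, child_type, child_field, child_value):
    """Build a bool query: parent text match AND has_child match.

    Hypothetical helper for illustration; it only assembles the JSON body.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    # Condition on the parent document itself.
                    {"query_string": {"default_field": "body", "query": body_text}},
                    # Condition on the children of that parent.
                    {
                        "has_child": {
                            "type": child_type,
                            "query": {
                                "query_string": {
                                    "default_field": child_field,
                                    "query": child_value,
                                }
                            },
                        }
                    },
                ]
            }
        }
    }


if __name__ == "__main__":
    body = parent_search_body("mercado", "my_child", "user_id", "1234")
    print(json.dumps(body, indent=2))
```

Because both clauses sit in must, a parent is returned only if its own body matches and at least one of its children matches, which removes the need to copy ids between two requests.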

Related

Elasticsearch children-father issue with 'has_parent'

Having the following mapping...
curl -XPUT 'localhost:9200/myindex' -d '{
"mappings": {
"my_parent": {},
"my_child": {
"_parent": {
"type": "my_parent"
}}}}'
... the following parent:
curl -X PUT 'localhost:9200/myindex/my_parent/1?pretty=true' -d '{
"title" : "Microsiervos - Discos duros de 10TB",
"body" : "Empiezan a sacar DD de 30GB en el mercado"
}'
... and the following child:
curl -XPUT 'localhost:9200/myindex/my_child/2?parent=1' -d '{
"user": "Pepe"
}'
If I do the following has_child query:
curl -XGET 'localhost:9200/myindex/my_parent/_search?pretty=true' -d '{
"query": {
"has_child": {
"type": "my_child",
"query" : {
"query_string" : {
"default_field" : "user",
"query" : "Pepe"
}}}}}'
I get the desired output. Pepe is found, and his parent is shown:
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "myindex",
"_type" : "my_parent",
"_id" : "2",
"_score" : 1.0,
"_source":{
"title" : "Microsiervos - En el 69 llegamos a la luna",
"body" : "Se cumplen 3123 anos de la llegada a la luna"
}
But if I try the reverse, getting the children with has_parent:
curl -XGET 'localhost:9200/myindex/my_parent/_search?pretty=true' -d '{
"query": {
"has_parent": {
"parent_type": "my_parent",
"query" : {
"query_string" : {
"default_field" : "body",
"query" : "mercado"
}}}}}'
I don't get any hits. I expected to get the child Pepe as output. What am I missing or doing wrong?
PS: I'm using Elasticsearch 2.1.1
You are searching in the my_parent type. A has_parent query returns child documents, so to fetch children through their parent you have to search the child type.
Change the query to:
curl -XGET 'localhost:9200/myindex/my_child/_search?pretty=true' -d '{
"query": {
"has_parent": {
"parent_type": "my_parent",
"query" : {
"query_string" : {
"default_field" : "body",
"query" : "mercado"
}
}
}
}
}'
Please note I have used
curl -XGET 'localhost:9200/myindex/my_child/_search?pretty=true'
instead of
curl -XGET 'localhost:9200/myindex/my_parent/_search?pretty=true'
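To see why the type in the URL matters, here is a toy Python model of the two documents from the question. It is only an illustration under simplifying assumptions (whole-word matching on a lowercased body, hypothetical function names), not how Elasticsearch is implemented.

```python
# Toy in-memory model of the two documents from the question.
DOCS = [
    {"_type": "my_parent", "_id": "1",
     "body": "Empiezan a sacar DD de 30GB en el mercado"},
    {"_type": "my_child", "_id": "2", "_parent": "1", "user": "Pepe"},
]


def has_parent(docs, parent_type, word):
    """Return the documents whose parent has parent_type and contains word."""
    matching_parents = {
        d["_id"]
        for d in docs
        if d["_type"] == parent_type
        and word in d.get("body", "").lower().split()
    }
    # Only documents that point at a matching parent qualify, i.e. children.
    return [d for d in docs if d.get("_parent") in matching_parents]


def search(docs, searched_type, parent_type, word):
    """Apply has_parent, then keep only hits of the type named in the URL."""
    return [
        d for d in has_parent(docs, parent_type, word)
        if d["_type"] == searched_type
    ]


if __name__ == "__main__":
    print(search(DOCS, "my_parent", "my_parent", "mercado"))  # no hits
    print(search(DOCS, "my_child", "my_parent", "mercado"))   # the child Pepe
```

The query itself only ever yields children, so restricting the search to my_parent filters every hit away.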

How to see which of the queries in boolean is matched?

I have given multiple queries using the bool query. Now it can happen that some of them might have matches and some queries might not have matches in the database. How can I know which of the queries had a match?
For example, here I have a bool query with two should conditions against the field landMark.
{
"query": {
"bool": {
"should": [
{
"match": {
"landMark": "wendys"
}
},
{
"match": {
"landMark": "starbucks"
}
}
]
}
}
}
How can I know which one of them matched in the above query if only one of them matches the documents?
You can use named queries for this purpose. Try this:
{
"query": {
"bool": {
"should": [
{
"match": {
"landMark": {
"query": "wendys",
"_name": "wendy match"
}
}
},
{
"match": {
"landMark": {
"query": "starbucks",
"_name": "starbucks match"
}
}
}
]
}
}
}
You can use any value for _name. In the response, each hit will contain something like this:
"matched_queries": ["wendy match"]
so you will be able to tell which query matched that specific document.
Named queries are certainly the way to go.
Link: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-named-queries-and-filters.html
The idea of a named query is simple: you tag each of your queries with a name, and the result shows, per document, which tags matched. For example, index these documents:
curl -XPOST 'http://localhost:9200/data/data' -d ' { "landMark" : "wendys near starbucks" }'
curl -XPOST 'http://localhost:9200/data/data' -d ' { "landMark" : "wendys" }'
curl -XPOST 'http://localhost:9200/data/data' -d ' { "landMark" : "starbucks" }'
Then build your query like this:
curl -XPOST 'http://localhost:9200/data/_search?pretty' -d '{
"query": {
"bool": {
"should": [
{
"match": {
"landMark": {
"query": "wendys",
"_name": "wendy_is_a_match"
}
}
},
{
"match": {
"landMark": {
"query": "starbucks",
"_name": "starbuck_is_a_match"
}
}
}
]
}
}
}'
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 0.581694,
"hits" : [ {
"_index" : "data",
"_type" : "data",
"_id" : "AVMCNNCY3OZJfBZCJ_tO",
"_score" : 0.581694,
"_source": { "landMark" : "wendys near starbucks" },
"matched_queries" : [ "starbuck_is_a_match", "wendy_is_a_match" ]
}, {
"_index" : "data",
"_type" : "data",
"_id" : "AVMCNS0z3OZJfBZCJ_tQ",
"_score" : 0.1519148,
"_source": { "landMark" : "starbucks" },
"matched_queries" : [ "starbuck_is_a_match" ]
}, {
"_index" : "data",
"_type" : "data",
"_id" : "AVMCNRsF3OZJfBZCJ_tP",
"_score" : 0.04500804,
"_source": { "landMark" : "wendys" },
"matched_queries" : [ "wendy_is_a_match" ]
} ]
}
}
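Once the response is parsed, the matched_queries arrays can be turned into a per-query summary. The following Python sketch assumes the response has already been decoded into a dict; the function name hits_by_matched_query is hypothetical, and the sample data is a trimmed copy of the response above.

```python
from collections import defaultdict


def hits_by_matched_query(response):
    """Map each query name to the _id values of the hits it matched."""
    grouped = defaultdict(list)
    for hit in response["hits"]["hits"]:
        # A hit that matched no named query may lack the key entirely.
        for name in hit.get("matched_queries", []):
            grouped[name].append(hit["_id"])
    return dict(grouped)


# Trimmed copy of the response shown above.
response = {
    "hits": {
        "hits": [
            {"_id": "AVMCNNCY3OZJfBZCJ_tO",
             "matched_queries": ["starbuck_is_a_match", "wendy_is_a_match"]},
            {"_id": "AVMCNS0z3OZJfBZCJ_tQ",
             "matched_queries": ["starbuck_is_a_match"]},
            {"_id": "AVMCNRsF3OZJfBZCJ_tP",
             "matched_queries": ["wendy_is_a_match"]},
        ]
    }
}

if __name__ == "__main__":
    for name, ids in hits_by_matched_query(response).items():
        print(name, ids)
```

A query name that is missing from the resulting dict did not match any returned document.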

ElasticSearch: Searching fields in nested arrays

I'm fairly new to ES and am using it for a new project of mine. Starting off, I have a simple mapping for a customer, which has a first and last name, and a list of payment information objects. If I were doing this in SQL, it would be something like a customer table, and a payment info table with a 1:many relationship.
Here's a simple example of what I'm trying to do: https://gist.github.com/anonymous/6109593
I'm hoping to find any customer based on any match in the nested array of paymentInfos, i.e. finding any users who've had a paymentInfo with billingZip 10101. This query returns no results, and I'm not sure why. Can anyone point me in the right direction as to why this query doesn't work, and if there are any changes I can make to either my query or mapping to have it return the user properly?
Thanks!
Nested fields should be searched using a nested query:
echo "Deleting old ElasticSearch index..."
curl -XDELETE 'localhost:9200/arrtest'
echo
echo "Creating new ElasticSearch index..."
curl -XPUT 'localhost:9200/arrtest/?pretty=1' -d '{
"mappings" : {
"cust2" : {
"properties" : {
"firstName" : {
"type" : "string",
"analyzer" : "string_lowercase"
},
"lastName" : {
"type" : "string",
"analyzer" : "string_lowercase"
},
"paymentInfos": {
"properties": {
"billingZip": {
"type": "string",
"analyzer": "string_lowercase"
},
"paypalEmail": {
"type": "string",
"analyzer": "string_lowercase"
}
},
"type": "nested"
}
}
}
},
"settings" : {
"analysis" : {
"analyzer" : {
"uax_url_email" : {
"filter" : [ "standard", "lowercase" ],
"tokenizer" : "uax_url_email"
},
"string_lowercase": {
"tokenizer" : "keyword",
"filter" : "lowercase"
}
}
}
}
}
'
echo
echo "Index recreation finished"
echo "Inserting one record..."
curl -XPUT 'localhost:9200/arrtest/cust2/1' -d '{
"firstName": "john",
"lastName": "smith",
"paymentInfos": [{
"billingZip": "10101",
"paypalEmail": "foo@bar.com"
}, {
"billingZip": "20202",
"paypalEmail": "foo2@bar2.com"
}]
}
'
echo
echo "Refreshing index to make new records searchable"
curl -XPOST 'localhost:9200/arrtest/_refresh'
echo
echo "Searching for record..."
curl -XGET 'localhost:9200/arrtest/cust2/_search?pretty=1' -d '{
"sort": [],
"query": {
"bool": {
"should": [],
"must_not": [],
"must": [{
"nested": {
"query": {
"query_string": {
"fields": ["paymentInfos.billingZip"],
"query": "10101"
}
},
"path": "paymentInfos"
}
}]
}
},
"facets": {},
"from": 0,
"size": 25
}'
echo

Searching in array by elasticsearch

I am searching in Elasticsearch but getting a hit even though the condition does not match.
For example, given this document:
{
"tweet": [
{
"firstname": "Lav",
"lastname": "byebye"
},
{
"firstname": "pointto",
"lastname": "ihadcre"
},
{
"firstname": "letssearch",
"lastname": "sarabhai"
}
]
}
Now consider the following conditions:
1)
must: firstname: Lav
must: lastname: byebye
expected: a hit
actual: a hit
2)
must: firstname: Lav
must: lastname: ihadcre
expected: no hit
actual: a hit
The second condition should not produce a hit, and that is the problem.
Thanks for the help.
To achieve the behavior that you are describing, tweets should be indexed as nested objects and searched using a nested query or filter. For example:
curl -XDELETE localhost:9200/test-idx
curl -XPUT localhost:9200/test-idx -d '{
"settings": {
"index.number_of_shards": 1,
"index.number_of_replicas": 0
},
"mappings": {
"doc": {
"properties": {
"tweet": {"type": "nested"}
}
}
}
}'
curl -XPUT "localhost:9200/test-idx/doc/1" -d '{
"tweet": [{
"firstname": "Lav",
"lastname": "byebye"
}, {
"firstname": "pointto",
"lastname": "ihadcre"
}, {
"firstname": "letssearch",
"lastname": "sarabhai"
}]
}
'
echo
curl -XPOST "localhost:9200/test-idx/_refresh"
echo
curl "localhost:9200/test-idx/doc/_search?pretty=true" -d '{
"query": {
"nested" : {
"path" : "tweet",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{
"match" : {"tweet.firstname" : "Lav"}
},
{
"match" : {"tweet.lastname" : "byebye"}
}
]
}
}
}
}
}'
echo
curl "localhost:9200/test-idx/doc/_search?pretty=true" -d '{
"query": {
"nested" : {
"path" : "tweet",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{
"match" : {"tweet.firstname" : "Lav"}
},
{
"match" : {"tweet.lastname" : "ihadcre"}
}
]
}
}
}
}
}'
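The difference between the two mappings can be illustrated outside Elasticsearch. Without the nested type, an array of objects is flattened into one multi-valued field per property, so the link between a firstname and its lastname is lost. The Python sketch below models both behaviours with plain exact matching; the function names are hypothetical and text analysis (such as lowercasing) is deliberately ignored.

```python
tweets = [
    {"firstname": "Lav", "lastname": "byebye"},
    {"firstname": "pointto", "lastname": "ihadcre"},
    {"firstname": "letssearch", "lastname": "sarabhai"},
]


def matches_flattened(objects, firstname, lastname):
    """Plain object array: each property becomes one multi-valued field."""
    firstnames = {o["firstname"] for o in objects}
    lastnames = {o["lastname"] for o in objects}
    # The two conditions are checked independently of each other.
    return firstname in firstnames and lastname in lastnames


def matches_nested(objects, firstname, lastname):
    """Nested objects: both conditions must hold within the same object."""
    return any(
        o["firstname"] == firstname and o["lastname"] == lastname
        for o in objects
    )


if __name__ == "__main__":
    for last in ("byebye", "ihadcre"):
        print(last,
              matches_flattened(tweets, "Lav", last),
              matches_nested(tweets, "Lav", last))
```

The flattened check accepts Lav together with ihadcre because both values occur somewhere in the document, which is the unwanted hit from the question; the nested check rejects that pair.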

Multiple properties in facet (elasticsearch)

I have the following index:
curl -XPUT "http://localhost:9200/test/" -d '
{
"mappings": {
"files": {
"properties": {
"name": {
"type": "string",
"index": "not_analyzed"
},
"owners": {
"type": "nested",
"properties": {
"name": {
"type":"string",
"index":"not_analyzed"
},
"mail": {
"type":"string",
"index":"not_analyzed"
}
}
}
}
}
}
}
'
With sample documents:
curl -XPUT "http://localhost:9200/test/files/1" -d '
{
"name": "first.jpg",
"owners": [
{
"name": "John Smith",
"mail": "js@example.com"
},
{
"name": "Joe Smith",
"mail": "joes@example.com"
}
]
}
'
curl -XPUT "http://localhost:9200/test/files/2" -d '
{
"name": "second.jpg",
"owners": [
{
"name": "John Smith",
"mail": "js@example.com"
},
{
"name": "Ann Smith",
"mail": "as@example.com"
}
]
}
'
curl -XPUT "http://localhost:9200/test/files/3" -d '
{
"name": "third.jpg",
"owners": [
{
"name": "Kate Foo",
"mail": "kf@example.com"
}
]
}
'
And I need to find all owners that match some query, let's say "mit":
curl -XGET "http://localhost:9200/test/files/_search" -d '
{
"facets": {
"owners": {
"terms": {
"field": "owners.name"
},
"facet_filter": {
"query": {
"query_string": {
"query": "*mit*",
"default_field": "owners.name"
}
}
},
"nested": "owners"
}
}
}
'
This gives me the following result:
{
"facets" : {
"owners" : {
"missing" : 0,
"_type" : "terms",
"other" : 0,
"total" : 4,
"terms" : [
{
"count" : 2,
"term" : "John Smith"
},
{
"count" : 1,
"term" : "Joe Smith"
},
{
"count" : 1,
"term" : "Ann Smith"
}
]
}
},
"timed_out" : false,
"hits" : {...}
}
And that is fine.
But what I actually need is to get the owners together with their email addresses (for each entry in the facet I need an additional field in the results).
Is this achievable?
I don't think this is possible directly. Depending on your needs, I would do one of the following:
1) Create a composite field containing both name and email, and run the facet on that field.
2) Run the query in addition to the facet and extract the emails from the query results, although this obviously does not scale.
3) Use a two-step operation: get the facet, build the follow-up queries it requires, and merge the results.
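The composite-field option can be sketched in plain Python to show the shape of the result. Everything here is an assumption made for illustration: the "|" separator, the function names, and the case-insensitive substring test that stands in for the *mit* wildcard. In practice the composite value would be written into its own not_analyzed field at index time and the facet would run on that field.

```python
from collections import Counter

SEPARATOR = "|"  # assumed delimiter; choose one that cannot occur in a name

# The three sample documents from the question.
files = [
    {"name": "first.jpg", "owners": [
        {"name": "John Smith", "mail": "js@example.com"},
        {"name": "Joe Smith", "mail": "joes@example.com"}]},
    {"name": "second.jpg", "owners": [
        {"name": "John Smith", "mail": "js@example.com"},
        {"name": "Ann Smith", "mail": "as@example.com"}]},
    {"name": "third.jpg", "owners": [
        {"name": "Kate Foo", "mail": "kf@example.com"}]},
]


def composite_owner_counts(docs, needle):
    """Count composite 'name|mail' terms for owners whose name contains needle."""
    counts = Counter()
    for doc in docs:
        for owner in doc["owners"]:
            if needle.lower() in owner["name"].lower():
                counts[owner["name"] + SEPARATOR + owner["mail"]] += 1
    return counts


def split_composite(term):
    """Recover (name, mail) from a composite facet term."""
    name, mail = term.split(SEPARATOR, 1)
    return name, mail


if __name__ == "__main__":
    for term, count in composite_owner_counts(files, "mit").most_common():
        print(split_composite(term), count)
```

Each facet term then carries the email along with the name, at the cost of indexing one extra field per owner.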

Resources