How to perform nested queries on Elasticsearch?

I am trying to run a nested query in Elasticsearch: I have two queries, and the output of the first must be used as input to the second. I went through the Elasticsearch documentation but couldn't find a way to do this.
The first query is:
GET index1/_search
{
"query": {
"query_string": {
"query": "(imageName: xyz.jpg)"
}
}
}
The output of this query is JSON, for example:
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.2682955,
"hits" : [
{
"_index" : "index1",
"_type" : "_doc",
"_id" : "1",
"_score" : 2.2682955,
"_source" : {
"assetId" : "0",
"descriptor" : "randomString",
"bucketId" : [randomArray],
"imageName" : "xyz.jpg"
}
}
]
}
}
The second query is:
GET index2/_search
{
"query": {
"function_score": {
"boost_mode": "replace",
"query": {
"constant_score": {
"filter": {
"terms": {
"bucketId": [randomArray that came as an output of the first query]
}
}
}
},
"pqcode_score": {
"descriptors": [
{
"descriptor": "randomString that came as an output of the first query"
}
]
}
}
}
}
How can we use the output of the first query inside the second query?
Can anyone help me in this regard?

This is not possible within Elasticsearch itself. You need to implement it on the application side.
Run the first query, read its result, and then run the second query with the values taken from that result. That is the only option.
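A minimal sketch of that application-side step, assuming the first response has the shape shown in the question. The helper name build_second_query is hypothetical, and pqcode_score is copied from the question as-is (it is not a built-in Elasticsearch clause as far as I know). The function only builds the request body; sending it to index2/_search is left to whatever client you use.

```python
def build_second_query(first_response):
    """Build the body of the second search from the first search's response.

    Hypothetical helper: it assumes the first hit carries the 'bucketId'
    and 'descriptor' fields shown in the question.
    """
    hits = first_response["hits"]["hits"]
    if not hits:
        # Nothing matched in index1, so there is nothing to look up in index2.
        return None
    source = hits[0]["_source"]
    return {
        "query": {
            "function_score": {
                "boost_mode": "replace",
                "query": {
                    "constant_score": {
                        "filter": {"terms": {"bucketId": source["bucketId"]}}
                    }
                },
                # Copied from the question; assumed to come from a plugin.
                "pqcode_score": {
                    "descriptors": [{"descriptor": source["descriptor"]}]
                },
            }
        }
    }
```

The returned dictionary can be serialized to JSON and posted to index2/_search; a None result means the first query had no hits.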

Related

Query for value in object

I have multiple documents like:
{
labels: {
label1Key: "label1Value",
label2Key: "label2Value",
...
},
...
}
The keys of the labels object are arbitrary. I would like to query for the existence of specific values in the labels object without knowing the key, e.g. I want all data that contain label2Value as a value in the labels object.
I've tried to solve this via an exists query, but this way I can only access the key of an object. Is there a way to query for values?
With a multi_match query you can use wildcards in the field names.
Ingest data
POST test_bene/_doc
{
"labels": {
"label1Key": "label1Value",
"label2Key": "label2Value"
}
}
Query
POST test_bene/_search
{
"query": {
"multi_match": {
"query": "label1Value",
"fields": ["labels.*"]
}
}
}
Response
{
"took" : 24,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "test_bene",
"_type" : "_doc",
"_id" : "RtBd_ncB46EpgstaHy3Y",
"_score" : 0.2876821,
"_source" : {
"labels" : {
"label1Key" : "label1Value",
"label2Key" : "label2Value"
}
}
}
]
}
}

document_missing_exception while performing ElasticSearch update

I went through several questions about the same "document_missing_exception" problem, but they don't seem to match my case. I can query the document, but the request fails when I try to update it.
My query:
# search AuthEvent by sessionID
GET events-*/_search
{
"size": "100",
"query": {
"bool": {
"must": [{
"term": {
"type": {
"value": "AuthEvent"
}
}
},
{
"term": {
"client.sessionID.raw": {
"value": "067d660a1504Y67FOuiiRIEkVNG8uYIlnK87liuZGLBcSmEW0aHoDXAHfu"
}
}
}
]
}
}
}
Query result:
{
"took" : 18,
"timed_out" : false,
"_shards" : {
"total" : 76,
"successful" : 76,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 6.705622,
"hits" : [
{
"_index" : "events-2020.10.06",
"_type" : "doc",
"_id" : "2c675295b27a225ce243d2f13701b14222074eaf",
"_score" : 6.705622,
"_routing" : "067d660a1504Y67FOuiiRIEkVNG8uYIlnK87liuZGLBcSmEW0aHoDXAHfu",
"_source" : {
# some data
}
}
]
}
}
Update request:
POST events-2020.10.06/_doc/2c675295b27a225ce243d2f13701b14222074eaf/_update
{
"doc" : {
"custom" : {
"testField" : "testData"
}
}
}
And update result:
{
"error" : {
"root_cause" : [
{
"type" : "document_missing_exception",
"reason" : "[_doc][2c675295b27a225ce243d2f13701b14222074eaf]: document missing",
"index_uuid" : "5zhQy6W6RnWscDz7Av4_bA",
"shard" : "1",
"index" : "events-2020.10.06"
}
],
"type" : "document_missing_exception",
"reason" : "[_doc][2c675295b27a225ce243d2f13701b14222074eaf]: document missing",
"index_uuid" : "5zhQy6W6RnWscDz7Av4_bA",
"shard" : "1",
"index" : "events-2020.10.06"
},
"status" : 404
}
I'm quite new to ElasticSearch and couldn't find any reason for this behaviour. I use ElasticSearch 6.7.1 (oss version) plus Kibana to work with the data. I also tried a bulk update but ended up with the same error.
As you can see in the query results, your document has been indexed with a routing value and you're missing it in your update request.
Try this instead:
POST events-2020.10.06/_doc/2c675295b27a225ce243d2f13701b14222074eaf/_update?routing=067d660a1504Y67FOuiiRIEkVNG8uYIlnK87liuZGLBcSmEW0aHoDXAHfu
{
"doc" : {
"custom" : {
"testField" : "testData"
}
}
}
If a document is indexed with a routing value, all subsequent get, update and delete operations need to happen with that routing value as well.
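A small sketch of how an application could carry the routing value over from a search hit to the update call. The helper name update_path_for_hit and the doc_type default are assumptions for illustration; the path layout follows the 6.x-style request used in the answer above.

```python
from urllib.parse import quote, urlencode


def update_path_for_hit(hit, doc_type="_doc"):
    """Return the update path for a search hit, keeping its routing value.

    Hypothetical helper: 'hit' is one element of hits.hits from a search
    response, and the path follows the /index/type/id/_update form above.
    """
    path = "/{}/{}/{}/_update".format(
        quote(hit["_index"], safe=""),
        quote(doc_type, safe=""),
        quote(hit["_id"], safe=""),
    )
    routing = hit.get("_routing")
    if routing is not None:
        # Without this parameter the update can be sent to the wrong shard.
        path += "?" + urlencode({"routing": routing})
    return path
```

Hits without a _routing entry produce a path with no query string, so the same helper works for unrouted documents.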

Elastic Search 7 - how to query against multiple options of a field order email has 1. customer email, 2.shipping email, 3.billing email

I am indexing orders from a database. When searching for a customer I want to be able to provide something like the following in the query string:
email:tom#test.com
The problem I have is that my model has multiple emails against different related models. For example:
order.customer.email
order.shipping.email
order.billing.email
I want to be able to combine all these into a single email field.
I have tried creating a new key on the order document root called email and filled it with a list, such as:
order
|- email: "tom#test.com,tom#test2.com,tomsCompany#test.com"
This does work, but it causes issues when returning the data in areas such as highlighting the results.
Is there a way I can do something like:
order
|- email
|-tom#test.com
|-tom#test2.com
|-tomsCompany#test.com
Where I can still search for email:tom#test.com but instead of being returned with a hit of
value of "tom#test.com,tom#test2.com,tomsCompany#test.com" I just get the one value "tom#test.com"
EDIT
An alternative would be to pre-process my query string before its submitted to ES so "email:tom#test.com" is changed to (customer.email:tom#test.com OR billing.email:tom#test.com OR shipping.email:tom#test.com) but that feels quite messy too and requires an extra processing step.
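For reference, the pre-processing step described in the edit could look roughly like this. It is a naive sketch: the function name expand_email_terms is hypothetical, the field list is taken from the example above, and the pattern does not handle quoted values or values followed directly by a closing parenthesis.

```python
import re

# Field names taken from the example in the question.
EMAIL_FIELDS = ("customer.email", "billing.email", "shipping.email")

# Match a bare "email:<value>" term, but not "customer.email:<value>".
_EMAIL_TERM = re.compile(r"(?<![\w.])email:(\S+)")


def expand_email_terms(query_string, fields=EMAIL_FIELDS):
    """Rewrite each bare email:<value> term into an OR over the real fields."""
    def _expand(match):
        value = match.group(1)
        clauses = ["{}:{}".format(field, value) for field in fields]
        return "(" + " OR ".join(clauses) + ")"

    return _EMAIL_TERM.sub(_expand, query_string)
```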
If you collect all the email addresses into a single field and want only the value that matched to be returned, the way to do that is the nested datatype.
Elasticsearch does not return just a part of a field's value in a search response. With the nested datatype, however, every value in the list is indexed as a document in itself.
So when a user searches for a value, the response can indicate which inner document was hit.
Below are a sample mapping, document, query and response.
Mapping:
PUT my_nested_index
{
"mappings": {
"properties": {
"order":{
"type": "nested", <---- Note this
"properties": {
"email":{
"type": "keyword"
}
}
}
}
}
}
Sample Document:
POST my_nested_index/_doc/1
{
"order":[
{
"email": "tom#test.com"
},
{
"email": "tom#test2.com"
},
{
"email": "tomsCompany#test.com"
}
]
}
Nested Query:
POST my_nested_index/_search
{
"query": {
"bool": {
"must": [
{
"nested": { <---- Note this. Nested Query
"path": "order",
"query": {
"term": {
"order.email": "tom#test.com"
}
},
"inner_hits": { <---- Inner hits field
"_source": "order.email"
}
}
}
]
}
}
}
Note that I've used a nested query to find the document you are looking for. The response contains a separate inner_hits section, and it is this section that tells you which of the nested documents was hit.
Response:
{
"took" : 24,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.9808291,
"hits" : [
{
"_index" : "my_nested_index",
"_type" : "_doc",
"_id" : "1", <---- The original document
"_score" : 0.9808291,
"_source" : {
"order" : [
{
"email" : "tom#test.com"
},
{
"email" : "tom#test2.com"
},
{
"email" : "tomsCompany#test.com"
}
]
},
"inner_hits" : { <---- Inner hits
"order" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.9808291,
"hits" : [
{
"_index" : "my_nested_index",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "order",
"offset" : 0
},
"_score" : 0.9808291,
"_source" : { <---- The Exact Hit.
"email" : "tom#test.com"
}
}
]
}
}
}
}
]
}
}
Look at the section where I've marked inner_hits: it tells you which nested document was hit.
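On the application side, reading the matched value out of inner_hits could be sketched as follows. The function name matched_nested_values is hypothetical; it assumes a response shaped like the one above.

```python
def matched_nested_values(response, path, field):
    """Collect (document id, nested offset, value) for every inner hit.

    Hypothetical helper: 'path' is the nested path used in the query
    ("order" above) and 'field' is the key inside each inner hit's _source.
    """
    matches = []
    for hit in response["hits"]["hits"]:
        inner = hit.get("inner_hits", {}).get(path, {})
        for inner_hit in inner.get("hits", {}).get("hits", []):
            value = inner_hit.get("_source", {}).get(field)
            if value is not None:
                offset = inner_hit["_nested"]["offset"]
                matches.append((hit["_id"], offset, value))
    return matches
```

Called as matched_nested_values(response, "order", "email") on the response above, this yields one entry for document "1" at offset 0.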
Hope this helps!

How to achieve searching on unstructured data in Spring boot Elastic Search integration with MongoDB

I am new to Elasticsearch and want to know whether the following case can work for me.
I want to provide search over unstructured data. By that I mean I don't know in advance which fields a model has: as you can see in the image below, my model has a data property that can hold any kind of data.
I know how to connect MongoDB and Elasticsearch using mongo-connect, but I don't know whether this requirement can be met.
This answer is based on your last comment.
Let's say for example that your data field mappings look like:
PUT my_index
{
"mappings": {
"properties": {
"data": {
"type": "nested"
}
}
}
}
As you can see, we did not define any sub-fields in the mapping; Elasticsearch will add them for us when we index the first document.
Insert a new document:
POST my_index/_doc/1
{
"data" : {
"adType" : "SELL",
"price" : "2000",
"numberOfRooms" : 20,
"isNegotiable" : "true",
"area" : 200
}
}
If we want to search for a value such as 2000 but don't know which field holds it, we can use the following query:
GET my_index/_search
{
"query": {
"nested": {
"path": "data",
"query": {
"multi_match": {
"query": "2000",
"fields": [],
"type": "best_fields"
}
}
}
}
}
We used a multi_match query with fields set to [], which according to the documentation means:
If no fields are provided, the multi_match query defaults to the index.query.default_field index settings, which in turn defaults to *. * extracts all fields in the mapping that are eligible to term queries and filters the metadata fields. All extracted fields are then combined to build a query.
The results we get:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"data" : {
"adType" : "SELL",
"price" : "2000",
"numberOfRooms" : 20,
"isNegotiable" : "true",
"area" : 200
}
}
}
]
}
}
UPDATE
Insert a document
POST my_index/_doc/1
{
"data" : "SELL 2000 20 true 200"
}
Then your query:
GET my_index/_search
{
"query": {
"match":
{
"data":"SELL 2000"
}
}
}
In Spring, using QueryBuilders:
QueryBuilder qb = QueryBuilders.matchQuery("data", "SELL 2000");
I hope this is what you were looking for.

range query not working as intended [elasticsearch]

I am executing a simple range query, but an empty result is returned, even though I know there are many records/documents that satisfy the query.
Below are the three queries I have tried
(the third one is the intended query).
1)
"query": {
"range" : {
"endTime" : {
"gte" : 1559076400.0
}
}
}
2)
"query": {
"bool": {
"must": [
{"range" : {
"endTime" : {
"gte" : 1559076401.0
}
}
}
]
}
}
3)
"query": {
"bool": {
"filter": [
{"range" : {
"startTime" : {
"gt" : 1356873300.0
}
}
},
{"range" : {
"endTime" : {
"gte" : 1559076401.0
}
}
}
]
}
}
All 3 queries return an empty response.
Hope you people can help. Thank you.
Before inserting data into an Elasticsearch index, you need to define the field mappings as date or numeric types so that range searches can be applied.
Alternatively, keep dynamic mapping on so that Elasticsearch can detect the field types automatically from the inserted data.
In the latter case, do check the auto-generated mappings on your index.
Also check the date/timestamp format.
Steps to check mappings
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-mapping.html
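Once you have the get-mapping response, checking how a field was mapped can be sketched like this. The helper name field_type is hypothetical; it assumes the usual response layout keyed by index name and tries to cope with both the typed (6.x) and typeless (7.x) forms.

```python
def field_type(mapping_response, index, field):
    """Return the mapped type of a top-level field, or None if it is unmapped.

    Hypothetical helper: 'mapping_response' is the parsed JSON returned by
    GET <index>/_mapping.
    """
    mappings = mapping_response[index]["mappings"]
    if "properties" not in mappings:
        # Typed layout (6.x): mappings -> <type name> -> properties.
        for type_mapping in mappings.values():
            if isinstance(type_mapping, dict) and "properties" in type_mapping:
                mappings = type_mapping
                break
    return mappings.get("properties", {}).get(field, {}).get("type")
```

If this returns a string type such as "text" or "keyword" for endTime, numeric range comparisons will not behave as intended.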
Since you are using epoch time, you need to declare that in the mapping. Both the mapping and the way the data is stored matter here. I am not sure whether data can be stored in any format and then queried in any other format; I will do some more research and update the answer if that turns out to be possible. This is what I did:
1) Created the mapping, to show how the endTime mapping is done
2) Inserted a few sample documents
3) Queried the documents using epoch time, the way you wanted
Mapping
PUT so_test24
{
"mappings" : {
"_doc" : {
"properties" : {
"id" : {
"type" : "long"
},
"endTime" : {
"type" : "date",
"format": "epoch_second"
}
}
}
}
}
Inserting the documents
POST /so_test24/_doc
{
"id": 1,
"endTime": "1546300800"
}
POST /so_test24/_doc
{
"id": 2,
"endTime": "1514764800"
}
POST /so_test24/_doc
{
"id": 3,
"endTime": "1527811200"
}
POST /so_test24/_doc
{
"id": 4,
"endTime": "1535760000"
}
The search Query
GET /so_test24/_search
{
"query": {
"range": {
"endTime": {"gte": "1532883892"}
}
}
}
The result
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "so_test24",
"_type" : "_doc",
"_id" : "uFIq42sB4TH56W1h-jGu",
"_score" : 1.0,
"_source" : {
"id" : 1,
"endTime" : "1546300800"
}
},
{
"_index" : "so_test24",
"_type" : "_doc",
"_id" : "u1Iq42sB4TH56W1h-zEK",
"_score" : 1.0,
"_source" : {
"id" : 4,
"endTime" : "1535760000"
}
}
]
}
}
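Whichever date format the mapping declares has to match the unit the values are stored in. As a rough illustration in plain Python (no Elasticsearch involved, and the function names are only for this sketch), here is how differently one of the sample values reads as seconds versus milliseconds:

```python
from datetime import datetime, timezone


def as_epoch_second(value):
    """Interpret the number as seconds since 1970-01-01 UTC."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def as_epoch_millis(value):
    """Interpret the same number as milliseconds since 1970-01-01 UTC."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


sample = 1546300800  # endTime of the first sample document above
print(as_epoch_second(sample).isoformat())  # 2019-01-01T00:00:00+00:00
print(as_epoch_millis(sample).date())       # 1970-01-18
```

Range comparisons still work as long as the stored values and the query value are read in the same unit, but the dates they represent differ by decades.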
