ElasticSearch: update by query from a different index - elasticsearch

I have the following problem with ElasticSearch. Let's say I have one index called "products". In general, its documents have the following fields:
productId
productPackId
productName
price
And then (for reasons that I cannot explain here, but let's say it wasn't my decision) I have another index called "productPacks" with:
productPackId
name
imageUrl
Now, I need to copy the imageUrl field from the "productPacks" index into the "products" index, according to the "productPackId" of each document in the "products" index. To clarify: let's say that in "productPacks" the document with
"productPackId" = 1
has as
imageUrl: "https://mywebsite.com/image1.jpg",
what I need is that all documents on the "products" index that have "productPackId" === 1 get then
imageUrl: "https://mywebsite.com/image1.jpg"
I can't find a way of doing it.
Thanks in advance!
(This, of course, would be super easy on a SQL database.)

What you basically want to do is join the two indices on "productPackId".
Elasticsearch has no concept of joins across indices, so this cannot be done in a single query over two different indices.
There is a simple workaround:
Iterate over every document in the index that holds the image URLs ("productPacks") and, for each one, run an update by query against the "products" index, using the productPackId as the query. That way you can copy the imageUrl into the matching "products" documents.
HTH.
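The workaround described in this answer could be sketched roughly as follows. This only builds the request bodies; it does not talk to a cluster. The helper names (build_update_by_query, plan_updates) are made up for illustration, and it assumes productPackId is mapped as a numeric or keyword field so that a term query matches it exactly.

```python
import json


def build_update_by_query(product_pack_id, image_url):
    """Build an _update_by_query body that copies one pack's imageUrl
    onto every product document with the same productPackId."""
    return {
        "query": {"term": {"productPackId": product_pack_id}},
        "script": {
            "lang": "painless",
            # The value is passed as a parameter instead of being
            # concatenated into the script source.
            "source": "ctx._source.imageUrl = params.imageUrl",
            "params": {"imageUrl": image_url},
        },
    }


def plan_updates(product_packs):
    """Yield one (endpoint, body) pair per "productPacks" document."""
    for pack in product_packs:
        body = build_update_by_query(pack["productPackId"], pack["imageUrl"])
        yield "POST /products/_update_by_query", body


if __name__ == "__main__":
    # Hypothetical sample; in practice these documents would be read
    # from the "productPacks" index (for example with a scroll).
    packs = [
        {
            "productPackId": 1,
            "name": "Pack 1",
            "imageUrl": "https://mywebsite.com/image1.jpg",
        }
    ]
    for endpoint, body in plan_updates(packs):
        print(endpoint)
        print(json.dumps(body, indent=2))
```

Each printed body can then be sent to the listed endpoint with whatever client you already use. One request is issued per pack, so this is fine for a modest number of packs but may be slow if there are very many.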

The result you expect can only be achieved with an SQL request:
https://www.elastic.co/guide/en/elasticsearch/reference/master/xpack-sql.html

Related

How to retrieve and list the first element of a field using an Elasticsearch query (to compare, find, and delete duplicated documents in the same index)?

In my Elasticsearch index, all logs have a field called RES, and its structure looks like this:
Number:"12131", amount:8, referenceNumber:"140102129728883", expire:"1365", securityControl:0
I want to compare Number across all indexed documents and delete the duplicated documents.
Can anybody help me?

Elasticsearch get sibling documents for a document matching a query

I have parent and child documents in my Elasticsearch index related through a join: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/parent-join.html.
I would like to be able to submit a query which matches on child documents and returns the siblings of the matching child documents.
My situation is that I have students divided into groups. Each student in my index is a separate child document, and all students in the same group have the same parentId. The parent document contains no meaningful fields other than a groupId. I want to get the list of all the students who are in the same group as student X with a single query.
For example my query would look similar to:
{
"query": {
"match": {
"studentName": "Bob"
}
}
}
And my response would list all the students who are in the same group as "Bob"
NOTE: I realize this problem could easily be solved by nesting the children who are in a group together into a single document. However, for my use case I cannot do this, as I need to support a second query: searching for a student by name and returning the results sorted by relevancy. If I nest the student documents inside the same document then, to my understanding, I can no longer achieve this second query.
Does anyone know if the search for siblings query is possible?
Or more broadly does anyone know of any ES construct that would allow me to achieve both searching for students in the group with student X with a single query AND searching for student by name in a single query?
It looks like this can be achieved by nesting a has_child query inside a has_parent query. Still, you can't sort by the child documents' properties, and this query is going to be slow depending on your index size.
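A minimal sketch of that nesting, built as a plain Python dict. The relation names "group" and "student" are assumptions (the question does not say what the join field's relations are called), and siblings_query is just an illustrative helper name.

```python
import json


def siblings_query(student_name, parent_type="group", child_type="student"):
    """Match child documents whose parent has at least one child
    with the given studentName (the matching student is included)."""
    return {
        "query": {
            "has_parent": {
                "parent_type": parent_type,  # assumed relation name
                "query": {
                    "has_child": {
                        "type": child_type,  # assumed relation name
                        "query": {"match": {"studentName": student_name}},
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(siblings_query("Bob"), indent=2))
```

The inner has_child selects the groups that contain a matching student, and the outer has_parent then returns every student belonging to those groups. If several students are named "Bob", students from all of their groups come back together.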

How to project a new field in response in ElasticSearch?

I am using Elasticsearch 6.2.
I have an index products with index_type productA having data with following structure:
{
"id": 1,
"parts": ["part1", "part2",...]
.....
.....
}
Now during the query time, I want to add or project a field parts_count to the response which simply represents the number of parts i.e the length of parts array. Also, if possible, I would also like to sort the documents of productA based on the generated field parts_count.
I have gone through most of the docs but haven't found a way to achieve this.
Note:
I don't want to update the mapping and add dynamic fields. I am not sure if Elasticsearch allows it. I just wanted to mention it.
Did you read about Script Fields and Script Based Sorting?
I think you should be able to achieve both things with them, and this does not require any mapping updates.
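Something along these lines might work as a starting point. It is only a sketch: it assumes parts has a keyword sub-field named parts.keyword (what default dynamic mapping usually creates for strings), and the field name parts_count and the helper parts_count_search are just the names chosen here. Note that doc values for keyword fields are deduplicated, so the count is the number of distinct parts.

```python
import json

# Assumed field; adjust to whatever keyword field holds the parts.
PARTS_FIELD = "parts.keyword"


def parts_count_search(order="desc"):
    """Build a search body that returns parts_count as a script field
    and sorts the hits by the same computed value."""
    script = {
        "lang": "painless",
        "source": "doc['" + PARTS_FIELD + "'].size()",
    }
    return {
        "query": {"match_all": {}},
        # Ask for _source explicitly so the original fields are
        # returned alongside the script field.
        "_source": True,
        "script_fields": {"parts_count": {"script": script}},
        "sort": [
            {"_script": {"type": "number", "script": script, "order": order}}
        ],
    }


if __name__ == "__main__":
    print(json.dumps(parts_count_search(), indent=2))
```

The script runs for every matching document at query time, so on a large index it may be noticeably slower than sorting on a stored field.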

Messages aggregation in elasticsearch

For example, I have the following documents.
{sourceIP:1.1.1.1, destIP:2.2.2.2}
{sourceIP:1.1.1.1, destIP:3.3.3.3}
{sourceIP:1.1.1.1, destIP:4.4.4.4}
Is there any way to automatically aggregate them into one document containing the following data?
{sourceIP:1.1.1.1, destIP:{2.2.2.2,3.3.3.3,4.4.4.4}}
So it looks like GROUP BY in SQL, but it generates new documents in Elasticsearch in place of the old ones.
I don't think there is any way to auto-merge documents at indexing time.
However, whatever result you are planning to query should be achievable with one of the querying options Elasticsearch offers, while still indexing the documents separately.
For example:
You can index separate documents, query by sourceIP, and use aggregations to get the destIP values.
Take the count of documents if it's just to find the destIPs for a sourceIP.
Also, if you want to avoid duplicate sourceIP + destIP combinations, you can concatenate the two values and use the result as the _id of the document.
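A rough sketch of the first and last suggestions, as plain Python that only builds the request body and the document id. It assumes sourceIP and destIP are mapped as keyword or ip fields; the aggregation names (by_source, dest_ips), the helper names, and the "_" separator are arbitrary choices for illustration.

```python
import json


def dest_ips_by_source(source_ip=None, max_buckets=100):
    """Build a search body that groups documents by sourceIP and lists
    the destIP values seen for each one, similar to GROUP BY in SQL."""
    body = {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "by_source": {
                "terms": {"field": "sourceIP", "size": max_buckets},
                "aggs": {
                    "dest_ips": {
                        "terms": {"field": "destIP", "size": max_buckets}
                    }
                },
            }
        },
    }
    if source_ip is not None:
        # Optionally restrict the aggregation to a single source.
        body["query"] = {"term": {"sourceIP": source_ip}}
    return body


def pair_id(source_ip, dest_ip):
    """Derive a deterministic _id so that indexing the same
    source/destination pair twice overwrites instead of duplicating."""
    return f"{source_ip}_{dest_ip}"


if __name__ == "__main__":
    print(json.dumps(dest_ips_by_source("1.1.1.1"), indent=2))
    print(pair_id("1.1.1.1", "2.2.2.2"))
```

Each dest_ips bucket also carries a doc_count, which covers the "take the count of documents" suggestion without a separate request.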
Hope this helps.

How to retrieve all document ids matching a search, in elastic search?

I'm working on a simple side project, and have a tech stack that involves both a SQL database and ElasticSearch. I only have ElasticSearch because I assumed that as my project grows, my full text searching would be most efficiently performed by ES. My ES schema is very simple - documents that I insert into ES have 2 fields, one being the id and the other being the field with the body of text to search. The id being inserted into ES corresponds to that document's primary key id from the SQL database.
insert record into SQL -> insert record into ES using PK from SQL
Searching would be the reverse of that. Query ES and grab all the matching ids, and then turn around and use those ids to get records from SQL.
search ES and get all PK ids -> use those ids to get documents from SQL
The problem that I am facing though, is that ES can only return documents in a paginated manner. This is a problem because I also have a WHERE clause on my SQL query, beyond just the ids. My SQL query might look like this ...
SELECT * FROM foo WHERE id IN (1,2,3,4,5) AND bar != 'baz'
Well, with ES paginating the results, my WHERE clause will always only be querying a subset of the full results from ES. Even if I utilize ES' skip and take, I'm still only querying SQL using a subset of document ids.
Is there a way to get Elastic Search to only return the entire list of matching document ids? I realize this is here to not allow me to shoot myself in the foot, because doing this across all shards and many many documents is not efficient. Is there no way, though?
After putting in some hours on this project, I've only now realized that I've poorly engineered this, unless I can get all of these ids from ES. Some alternative implementations that I've thought of would be to store the things that I'm filtering on, in SQL, in ES as well. A problem there is that I'd have to update the ES document every time I update the document in SQL. This would require a pretty big rewrite to some of my data access code. I could scrap ElasticSearch all together and just perform searching in Postgres, for now, until I can think of a better way to structure this.
Elasticsearch does not return every document matching your query in a single response, because that would overload the system. Instead, use the scroll API in Elasticsearch. It is like the cursor concept in databases.
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scan-scroll.html
For more examples refer the Github repo. https://github.com/sidharthancr/elasticsearch-java-client
Hope it helps.
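As an illustration of the scroll idea, here is a minimal sketch of the paging loop. It assumes a client object exposing search(index=..., body=..., scroll=...) and scroll(scroll_id=..., scroll=...) in the style of the official Python client (keyword names vary between client versions), and that the SQL primary key is used as the document _id. FakeClient is a hypothetical stand-in so the loop can be run without a cluster.

```python
def collect_all_ids(client, index, query, page_size=1000, keep_alive="1m"):
    """Walk a scroll cursor page by page and return every matching _id."""
    body = {
        "query": query,
        "size": page_size,
        "_source": False,  # only the ids are needed
        "sort": ["_doc"],  # cheapest order when relevance does not matter
    }
    page = client.search(index=index, body=body, scroll=keep_alive)
    ids = []
    while page["hits"]["hits"]:
        ids.extend(hit["_id"] for hit in page["hits"]["hits"])
        page = client.scroll(scroll_id=page["_scroll_id"], scroll=keep_alive)
    return ids


class FakeClient:
    """Hypothetical in-memory stand-in used only to exercise the loop."""

    def __init__(self, ids, page_size):
        self._pages = [
            ids[i:i + page_size] for i in range(0, len(ids), page_size)
        ]
        self._next = 0

    def _page(self):
        # Serve the next page, or an empty page once all are consumed.
        hits = self._pages[self._next] if self._next < len(self._pages) else []
        self._next += 1
        return {
            "_scroll_id": "fake-scroll-id",
            "hits": {"hits": [{"_id": doc_id} for doc_id in hits]},
        }

    def search(self, index, body, scroll):
        return self._page()

    def scroll(self, scroll_id, scroll):
        return self._page()


if __name__ == "__main__":
    fake = FakeClient(["1", "2", "3", "4", "5"], page_size=2)
    print(collect_all_ids(fake, "foo", {"match_all": {}}, page_size=2))
```

The collected ids can then go into the SQL IN (...) clause, in batches if the list is long. With a real cluster it is also polite to clear the scroll context once the loop finishes.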
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
Please have a look at the Elasticsearch documentation, where you can specify that only particular fields are returned from the matching documents.
Hope this resolves your problem.
{
"fields" : ["user", "postDate"],
"query" : {
"term" : { "user" : "kimchy" }
}
}
