Merge Documents based on field value? - elasticsearch

I have multiple documents within an index, each with the following fields:

id  serviceName  Type

Now, stupidly, id is not unique and I want to change that. I want to use Kibana/Elasticsearch to query the data so that id becomes unique. The behaviour I want: if I have the following docs:

id  serviceName  Type
1   A            T1
1   B            T2
1   C            T3

I want a query so that I get this result:

1   A,B,C   T1,T2,T3

Is there a way to do this?

You cannot do this with just Elasticsearch/Kibana; you have to write some code. You can use the scroll API to iterate through all the documents in the index, and then use an upsert request to index them into a new index. I think your upsert request will look something like this:
POST test/type1/1/_update
{
  "script" : {
    "inline": "ctx._source.serviceName.add(params.serviceName); ctx._source.Type.add(params.Type)",
    "lang": "painless",
    "params" : {
      "serviceName" : "A",
      "Type": "T1"
    }
  },
  "upsert" : {
    "serviceName": ["A"],
    "Type": ["T1"]
  }
}
This means: in case id 1 doesn't exist yet, it is created with the "upsert" value for the document; otherwise the script runs, appending the serviceName and Type values to the existing doc.
This would be pretty straightforward to do with very little code using elasticsearch-py; check out the scan helper and the bulk helper, as in the sketch below.
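A minimal sketch of that loop, assuming elasticsearch-py, a source index old_index, a target index new_index, and a local default client (the index names and client setup are my assumptions, not from the question):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan, bulk

es = Elasticsearch()

def merge_actions():
    # scan() pages through every document in the source index.
    for hit in scan(es, index="old_index", query={"query": {"match_all": {}}}):
        src = hit["_source"]
        yield {
            "_op_type": "update",
            "_index": "new_index",
            "_id": src["id"],  # the (non-unique) id becomes the document _id
            # On 5.x clusters the script key is "inline" rather than "source".
            "script": {
                "source": "ctx._source.serviceName.add(params.serviceName);"
                          " ctx._source.Type.add(params.Type)",
                "lang": "painless",
                "params": {"serviceName": src["serviceName"], "Type": src["Type"]},
            },
            # First time an id is seen: create the doc with list-valued fields.
            "upsert": {"serviceName": [src["serviceName"]], "Type": [src["Type"]]},
        }

# bulk() batches the update actions into _bulk requests.
bulk(es, merge_actions())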

Related

Python OpenSearch retrieve records based on element in a list

So I need to retrieve records based on a field called "cash_transfer_ids" which is a python list.
I want to retrieve all records whose cash_transfer_ids contain a specific id value (a string).
What should the query look like? Should I use a match or a term query?
Example: I want to retrieve any record whose cash_transfer_ids field contains 'abc'
Then I may get records such as:
record 1: cash_transfer_ids:['abc']
record 2: cash_transfer_ids:['dfdfd', 'abc']
etc...
Thanks very much for any help!
If cash_transfer_ids is of type keyword, I would try filtering with a term query:
term = "abc"
query = {
    "query": {
        "term": {
            "cash_transfer_ids": {
                "value": term
            }
        }
    }
}
response = get_client_es().search(index="idx_test", body=query)
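A side note (my addition, not from the original answer): term queries match array fields element-wise, so the query above returns any record whose list contains "abc". If cash_transfer_ids were mapped as text instead of keyword, the values are analyzed at index time and a match query is the usual choice:

# Only if the field is mapped as "text" rather than "keyword";
# get_client_es() and idx_test are reused from the snippet above.
query = {"query": {"match": {"cash_transfer_ids": "abc"}}}
response = get_client_es().search(index="idx_test", body=query)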

How to combine 2 fields with different names but the same value in Kibana

I set up an ELK environment with 2 different index streams. Both streams have a field with the same value, but the field name is different.
Is there a possibility to merge them, or something like that, so that when I use the Kibana filter it shows me the value from both fields?
I can set up visualizations, but when I filter on stream 1, the visualization of stream 2 is empty.
I also tried to name the indexes the same, but that did not help.
Example:

index1 field name: information.ID = 123
index2 field name: ID = 123

I want to use the filter on both streams.
You can create an alias which contains both indexes, then write a query against the alias using script fields. In script fields you can define the new field and its source logic from the underlying documents:
"script_fields": {
"my_script_field": {
"script": {
"lang": "painless",
"inline": "doc['some_field'].value + doc['another_field'].value"
}
In this way the resultset will have single field “my_script_field”
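Sketched end to end with elasticsearch-py (the alias name both_streams and the containsKey guard are my assumptions; the field names come from the question):

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Point a single alias at both indexes.
es.indices.update_aliases(body={
    "actions": [
        {"add": {"index": "index1", "alias": "both_streams"}},
        {"add": {"index": "index2", "alias": "both_streams"}},
    ]
})

# The script picks whichever field the underlying document's index defines.
response = es.search(index="both_streams", body={
    "script_fields": {
        "my_script_field": {
            "script": {
                "lang": "painless",
                "source": "doc.containsKey('information.ID')"
                          " ? doc['information.ID'].value : doc['ID'].value",
            }
        }
    }
})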

Select and Update all matching Documents

We are trying to do the following, and any help would be appreciated.
Say you make a search and 100,000 documents match.
We would like to increment a counter in each document that matched, and at the same time select the first page, say the first 50.
Can this be done in one operation, or maybe in a parallel scenario?
You could try a multi search query for this kind of maneuver:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-multi-search.html
Basically you add several queries in a single multi search, which is parallelized on the ES side and returns a list of responses, one per query.
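A small sketch of what that looks like with elasticsearch-py (the index name and query bodies are made up for illustration):

from elasticsearch import Elasticsearch

es = Elasticsearch()

# msearch takes alternating header/body lines, one pair per query.
body = [
    {"index": "my_index"},
    {"query": {"match": {"status": "open"}}, "size": 50},  # first page
    {"index": "my_index"},
    {"query": {"match": {"status": "open"}}, "size": 0},   # count only
]
responses = es.msearch(body=body)["responses"]  # one response per query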
You can use Update By Query using NEST. Let me know if you are still facing any issues.
You can use update by query to do this, where fieldName is the counter field you want to increment in each matching document:

{
  "script": {
    "source": "ctx._source.fieldName += params.val",
    "lang": "painless",
    "params": {
      "val": 1
    }
  },
  "query": {
    // your match condition
  }
}
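The same thing via elasticsearch-py, with a follow-up search for the first page (the index, counter field, and match condition are illustrative, and note these are still two separate, non-atomic operations):

from elasticsearch import Elasticsearch

es = Elasticsearch()

match_condition = {"match": {"status": "open"}}

# Increment the counter on every matching document.
es.update_by_query(index="my_index", body={
    "script": {
        "source": "ctx._source.counter += params.val",
        "lang": "painless",
        "params": {"val": 1},
    },
    "query": match_condition,
})

# Then fetch the first page of 50 with a normal search.
page = es.search(index="my_index", body={"query": match_condition, "size": 50})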

elasticsearch - query between document types

I have a production_order document type, i.e.:

{
  "part_number": "abc123",
  "start_date": "2018-01-20"
},
{
  "part_number": "1234",
  "start_date": "2018-04-16"
}
I want to create a commodity document type, i.e.:

{
  "part_number": "abc123",
  "commodity": "1 meter machining"
},
{
  "part_number": "1234",
  "commodity": "small flat & form"
}
Production orders are data-warehoused every week and are immutable.
Commodities, on the other hand, could change over time: abc123 could change from "1 meter machining" to "5 meter machining", so I don't want to store this data with the production_order records.
If a user searches for "small flat & form" in the commodity document type, I want to pull all matching records from the production_order document type, the match being on part number.
Obviously I can do this in a relational database with a join. Is it possible to do the same in Elasticsearch?
If it helps, we have about 500k part numbers that will be commoditized, and our production order data warehouse currently holds 20 million records.
I have found that you can indeed query between indices in Elasticsearch; however, you have to ensure your data is stored correctly. Here is an example from the Elasticsearch 6.3 docs:

Terms lookup twitter example: first we index the information for the user with id 2, specifically its followers, then we index a tweet from the user with id 1. Finally we search on all the tweets that match the followers of user 2.
PUT /users/user/2
{
  "followers" : ["1", "3"]
}

PUT /tweets/tweet/1
{
  "user" : "1"
}

GET /tweets/_search
{
  "query" : {
    "terms" : {
      "user" : {
        "index" : "users",
        "type" : "user",
        "id" : "2",
        "path" : "followers"
      }
    }
  }
}
Here is the link to the original page
https://www.elastic.co/guide/en/elasticsearch/reference/6.1/query-dsl-terms-query.html
In my case above I need to set up my storage so that commodity is a field and its values are an array of part numbers, i.e.:

{
  "1 meter machining": ["abc123", "1234"]
}

I can then look up the "1 meter machining" part numbers against my production_order documents.
I have tested this and it works; a sketch of the lookup follows below.
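For reference, a hedged sketch of that lookup with elasticsearch-py, keeping the storage shape above (the commodities index name, doc id, and doc type are my assumptions):

from elasticsearch import Elasticsearch

es = Elasticsearch()

# One doc whose field name is the commodity and whose value is the part list.
es.index(index="commodities", doc_type="commodity", id="1",
         body={"1 meter machining": ["abc123", "1234"]})

# Terms lookup: pull the part numbers out of that doc, then match the orders.
response = es.search(index="production_orders", body={
    "query": {
        "terms": {
            "part_number": {
                "index": "commodities",
                "type": "commodity",
                "id": "1",
                "path": "1 meter machining",
            }
        }
    }
})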
There are no joins supported in Elasticsearch.
You can query twice: first get all the part numbers matching "small flat & form", then use those part numbers to query the other index.
Or try to find a way to merge these into a single index; that would be better. Updating the commodities would not cause you any problems once the two are combined.

Group by field in found document

The best way to explain what I want to accomplish is by example.
Let us say that I have an object with the fields name, color, and transaction_id. I want to search for documents where name and color match the specified values, which I can accomplish easily with boolean queries.
But I do not want only the documents found by the search query. I also want the transaction to which those documents belong, which is specified by transaction_id. For example, if a document has been found with transaction_id equal to 123, I want my query to return all documents with transaction_id equal to 123.
Of course, I can do that with two queries: the first fetches all documents that match the criteria, and the second returns all documents that have one of the transaction_id values found in the first query.
But is there any way to do it in a single query?
You can use a parent-child relationship between the transaction and your object, or nest or denormalize your data to include the objects in the transactions. Otherwise you'll have to do an application-side join, meaning 2 queries.
Try an index mapping similar to the following, and include a parent id when indexing the objects.
{
  "mappings": {
    "transaction": {},
    "object": {
      "_parent": {
        "type": "transaction"
      }
    }
  }
}
Further reading:
https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child-mapping.html
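With that mapping in place, one way to get the whole transaction's objects back in a single query is to nest has_child inside has_parent: match objects whose parent transaction has at least one child matching the criteria. A hedged sketch with elasticsearch-py (the index name and the example name/color values are assumptions):

from elasticsearch import Elasticsearch

es = Elasticsearch()

response = es.search(index="my_index", body={
    "query": {
        "has_parent": {
            "parent_type": "transaction",
            "query": {
                "has_child": {
                    "type": "object",
                    "query": {
                        "bool": {
                            "must": [
                                {"term": {"name": "foo"}},
                                {"term": {"color": "red"}},
                            ]
                        }
                    },
                }
            },
        }
    }
})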
