Elasticsearch mapping boolean with value "0" and "1"

Elasticsearch version 7.13.
The index already exists and I want to reindex it with a new mapping in which the field is a boolean. But in the source documents the field contains "1" and "0" (strings).
How can I evaluate the field during reindexing so that "1" becomes true and "0" becomes false?
I have read about runtime fields, but I can't figure out how they work.
My mapping:
{
  "mappings": {
    "OPTIONS": {
      "type": "nested",
      "properties": {
        "COMBINABLE": {
          "type": "boolean"
        }
      }
    }
  }
}
And a document:
{
  "options": [
    {
      "COMBINABLE": "0"
    }
  ]
}

You might consider using an ingest pipeline to convert the number to a boolean value. You can do something like this:
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "convert to boolean",
    "processors": [
      {
        "script": {
          "source": "def options = ctx.options;def pairs = new ArrayList();for (def pair : options) {def k = false;if (pair[\"COMBINABLE\"] == \"1\" || pair[\"COMBINABLE\"] == 1) {k = true;}pair[\"COMBINABLE\"] = k;}ctx.options = options;"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "options": [
          {
            "COMBINABLE": 1
          }
        ]
      }
    }
  ]
}
The painless script above is pretty simple:
def options = ctx.options;
def pairs = new ArrayList();
for (def pair : options) {
  def k = false;
  if (pair["COMBINABLE"] == "1" || pair["COMBINABLE"] == 1) {
    k = true;
  }
  pair["COMBINABLE"] = k;
}
ctx.options = options;
It simply loops through every option under options; if COMBINABLE is 1 or "1", the value is converted to true, otherwise to false. You can register the pipeline and set it as the index's default ingest pipeline, or reference it directly from the reindex request.
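For reference, a sketch of how the pipeline could be registered and then applied during the reindex; the pipeline id combinable-to-boolean and the index names source-index / dest-index are placeholders, and the script body is the same one used in the simulate call above (it assumes every document has an options array):
PUT _ingest/pipeline/combinable-to-boolean
{
  "description": "convert to boolean",
  "processors": [
    {
      "script": {
        "source": "def options = ctx.options;def pairs = new ArrayList();for (def pair : options) {def k = false;if (pair[\"COMBINABLE\"] == \"1\" || pair[\"COMBINABLE\"] == 1) {k = true;}pair[\"COMBINABLE\"] = k;}ctx.options = options;"
      }
    }
  ]
}

POST _reindex
{
  "source": { "index": "source-index" },
  "dest": { "index": "dest-index", "pipeline": "combinable-to-boolean" }
}
Setting index.default_pipeline on the destination index would have the same effect for future writes as well.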

Related

GraphQL query that excludes results where an array relationship is empty

I have this query (in Hasura in case that matters):
query MyQuery {
  records(distinct_on:[recordId], where: { modelId: {_eq: "2f1f70b8-cb7b-487c-9e4c-ca03624ce926"}}) {
    recordId
    inboundEdges(where: {fromModelId: {_eq: "f0e19461-6d38-4148-8041-54eba6451293"}}) {
      fromRecord {
        property_path_values(where:{stringValue:{_eq:"2021-08-26"}}) {
          stringValue
        }
      }
    }
  }
}
I get this result back:
{
  "data": {
    "records": [
      {
        "recordId": "2fbe37b1-78db-4b22-b713-2388cfb52597",
        "inboundEdges": [
          {
            "fromRecord": {
              "property_path_values": [
                {
                  "stringValue": "2021-08-26"
                }
              ]
            }
          },
          {
            "fromRecord": {
              "property_path_values": [
                {
                  "stringValue": "2021-08-26"
                },
                {
                  "stringValue": "2021-08-26"
                }
              ]
            }
          }
        ]
      },
      {
        "recordId": "7b34e85d-f4e1-4099-89d9-02483128a6cd",
        "inboundEdges": [
          {
            "fromRecord": {
              "property_path_values": [
                {
                  "stringValue": "2021-08-26"
                }
              ]
            }
          }
        ]
      },
      {
        "recordId": "840f52e2-0f2e-4591-810d-19f9e8840a49",
        "inboundEdges": []
      }
    ]
  }
}
I do not want the third result in the response, because its inboundEdges array is empty.
What I am trying to say is: find all records that have at least one inboundEdge with a fromRecord that has at least one property_path_value with a stringValue equal to 2021-08-26. I do not want to have to post-process the response to exclude results where inboundEdges === [].
It seems I was confusing the selection set with the place to state the filter. The right way to do what I wanted is:
query MyQuery {
  records(where: {inboundEdges: {fromModelId: {_eq: "f0e19461-6d38-4148-8041-54eba6451293"}, fromRecord: {propertyPathValues: {stringValue: {_eq: "2021-08-26"}}}}, modelId: {_eq: "2f1f70b8-cb7b-487c-9e4c-ca03624ce926"}}) {
    recordId
  }
}
i.e. put the filter in the where clause, like a normal person, not in the selection set.

Selecting items from a hash based on sub-hash values

I have the following JSON output from an API:
{
  "Objects": [
    {
      "FieldValues": [
        {
          "Field": {
            "Name": "Nuix Field"
          },
          "Value": "Primary Date"
        },
        {
          "Field": {
            "Name": "Field Type"
          },
          "Value": {
            "Name": "Nuix"
          }
        },
        {
          "Field": {
            "Name": "Field Category"
          },
          "Value": {
            "Name": "Generic"
          }
        }
      ]
    }
  ]
}
I want to be able to select all Objects where "Field" has a "Name" of "Field Type" and its "Value" has a "Name" of "Nuix".
This is my attempt, but I feel like there is a better way to do it?
json = JSON.parse(response)
results = []
json["Objects"].each do |obj|
  obj["FieldValues"].each do |fv|
    if fv["Field"]["Name"] == "Field Type" && fv["Value"]["Name"] == "Nuix"
      results << obj
    end
  end
end
One option is not to loop over all FieldValues but only until the expected one is found, using the any? method.
Then you can simplify the code with the select method, which creates a new array containing only the matching objects.
objects_with_required_fields = json.fetch("Objects", []).select do |obj|
  obj.fetch("FieldValues", []).any? do |fv|
    name = fv.dig("Field", "Name")
    value = fv["Value"]
    name == "Field Type" && value.is_a?(Hash) && value["Name"] == "Nuix"
  end
end
Here's a more minimal Ruby solution:
json = JSON.parse(response, symbolize_names: true)
target = [ 'Field Type', { Name: 'Nuix' } ]
# For each of the entries in Objects...
results = json[:Objects].flat_map do |obj|
  # ...filter out those that...
  obj[:FieldValues].select do |fv|
    # ...match the target criteria.
    [ fv.dig(:Field, :Name), fv[:Value] ] == target
  end
end
This uses symbolized keys and simply filters through an array of arrays looking for matching entries, then returns them in one (flat) array.

Elasticsearch: Update/upsert an array field inside a document but ignore certain existing fields

GET _doc/1
"_source": {
"documents": [
{
"docid": "ID001",
"added_vals": [
{
"code": "123",
"label": "Abc"
},
{
"code": "113",
"label": "Xyz"
}
]
},
{
"docid": "ID002",
"added_vals": [
{
"code": "123",
"label": "Abc"
}
]
}
],
"id": "1"
}
POST /_bulk
{ "update": { "_id": "1"}}
{ "doc": { "documents": [ { "docid": "ID001", "status" : "cancelled" } ], "id": "1" }, "doc_as_upsert": true }
The problem is that when I run my bulk update above, it replaces the whole documents field and removes the added_vals list. Would I be able to achieve the partial update using a Painless script? Thank you.
Using Elasticsearch Painless scripting:
POST /_bulk
{ "update": { "_id": "1"} }
{ "scripted_upsert":true, "script" :{ "source": "if(ctx._version == null) { ctx._source = params; } else { def param = params; def src = ctx._source; for(s in src.documents) { boolean found = false; for(p in param.documents) { if (p.docid == s.docid) { found = true; if(s.added_vals != null) { p.added_vals = s.added_vals; } } } if(!found) param.documents.add(s); } ctx._source = param; }", "lang": "painless", "params" : { "documents": [ { "docid": "ID001", "status" : "cancelled" } ], "id": "1" } }, "upsert" : { } }
Well, this one worked for me. I still need to tweak a few things, but I will leave it here for anyone who may need it. I didn't know it was this simple. If there is an easier answer, please do submit it. Thanks.
"script" :
if(ctx._version == null)
{
ctx._source = params;
}
else
{
def param = params;
def src = ctx._source;
for(s in src.documents)
{
boolean found = false;
for(p in param.documents)
{
if (p.docid == s.docid)
{
found = true;
if(s.added_vals != null)
{
p.added_vals = s.added_vals;
}
}
}
if(!found) param.documents.add(s);
}
ctx._source = param;
}
I am not sure whether I should modify params directly, so I copied params into the param variable. I also used scripted_upsert: true together with a ctx._version null check.
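If it helps with testing a single document, the same body can also be sent through the single-document Update API instead of _bulk; this is only a sketch, with my-index as a placeholder index name and the script copied from the bulk request above:
POST my-index/_update/1
{
  "scripted_upsert": true,
  "script": {
    "lang": "painless",
    "source": "if(ctx._version == null) { ctx._source = params; } else { def param = params; def src = ctx._source; for(s in src.documents) { boolean found = false; for(p in param.documents) { if (p.docid == s.docid) { found = true; if(s.added_vals != null) { p.added_vals = s.added_vals; } } } if(!found) param.documents.add(s); } ctx._source = param; }",
    "params": {
      "documents": [ { "docid": "ID001", "status": "cancelled" } ],
      "id": "1"
    }
  },
  "upsert": {}
}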

Elasticsearch partial update of Object(multi=True)

How do I update a document whose field is mapped as Object(multi=True), when a document can contain either a single value (a dictionary) or multiple values (a list of dictionaries)?
Example of documents in the same index:
A single value in items:
{
  "title": "Some title",
  "items": {
    "id": 123,
    "key": "foo"
  }
}
Multiple values in items:
{
  "title": "Some title",
  "items": [{
    "id": 456,
    "key": "foo"
  }, {
    "id": 789,
    "key": "bar"
  }]
}
You can try the following script.
I intentionally formatted the inline attribute across multiple lines to show what's inside.
POST index_name/_update_by_query
{
  "query": {
    "term": {
      "items.key": "foo"
    }
  },
  "script": {
    "inline": "
      if (ctx._source.items instanceof List) {
        for (item in ctx._source.items) {
          if (item.key == params.old_value) {
            item.key = params.new_value;
            break;
          }
        }
      } else {
        ctx._source.items.key = params.new_value;
      }
    ",
    "params": {"old_value": "foo", "new_value": "bar"},
    "lang": "painless"
  }
}
And to make it actually work, replace the inline attribute with a single-line value:
"inline": "if (ctx._source.items instanceof List) {for (item in ctx._source.items) {if (item.key == params.old_value) {item.key = params.new_value;break;}}} else {ctx._source.items.key = params.new_value;}"

elasticsearch script to check if field exists and create it

I've created a script which records the history of the tags that are applied to my documents in elastic. The names of the tags are dynamic, so when I try to move the current tag to the history field, it fails for tags that do not already have a history field.
This is my script to copy the current tags, to the tag history field:
script:"ctx._source.tags[params.tagName.toString()].history.add(ctx._source.tags[params.tagName.toString()].current)"
This is what the documents look like:
"tags": {
"relevant": {
"current": {
"tagDate": 1501848372292,
"taggedByUser": "dev",
"tagActive": true
},
"history": [
{
"tagDate": 1501841137822,
"taggedByUser": "admin",
"tagActive": true
},
{
"tagDate": 1501841334127,
"taggedByUser": "admin",
"tagActive": true
},
}}}}
The users can add new tags dynamically, so what I want to do is create the history object if it does not exist and then I can populate it.
There is very little documentation available for Elasticsearch scripting, so I'm hoping someone wise will know the answer, as I'm sure that checking for a field and creating it are fundamental things in the Elasticsearch scripting languages.
Update
So, having rethought the structure of this index, what I want to achieve is the following:
tags: [
  {hot:
    {current: {tagDate: 1231231233, taggedbyUser: user1, tagStatus: true},
     history: [
       {tagDate: 123444433, taggedbyUser: user1, tagStatus: true},
       {tagDate: 1234412433, taggedbyUser: user1, tagStatus: true}
     ]
    }
  },
  {interesting:
    {current: {tagDate: 1231231233, taggedbyUser: user1, tagStatus: true},
     history: [
       {tagDate: 123444433, taggedbyUser: user1, tagStatus: true},
       {tagDate: 1234412433, taggedbyUser: user1, tagStatus: true}
     ]
    }
  }
]
The tag names in this example are "hot" and "interesting", however the user will be able to enter any tag name they want, so these are in no way predefined. When a user tags a document in Elasticsearch and the applied tag already exists, it should move the "current" tag to the "history" array and then overwrite the "current" tag with the new values.
Thank you for the responses so far; however, the example code does not work for me.
The problem, I think, is that the code first needs to loop through all of the tags and get each name. I then want to compare each of these to the name I am supplying in the params. I think this is where the first issue arises.
I then need to move the "current" object to the "history" array. There also appears to be an issue here: I'm trying to use ctx._source.tags[i].history.add(params.param1), however nothing is added.
Any thoughts?
Thanks!
It's a bit more complicated because you need to do three things in the script:
if history does not already exist, initialize the array
move current tag to history
replace old current tag with the new one
Assuming that your initial document looks like this (note no history yet):
{
  "_id": "AV2uvqCUfGXyNt1PjTbb",
  "tags": {
    "relevant": {
      "current": {
        "tagDate": 1501848372292,
        "taggedByUser": "dev",
        "tagActive": true
      }
    }
  }
}
To be able to execute these three steps, you need to run the following script:
curl -X POST \
  http://127.0.0.1:9200/script/test/AV2uvqCUfGXyNt1PjTbb/_update \
  -d '{
  "script": {
    "inline": "if (ctx._source.tags.get(param2).history == null) ctx._source.tags.get(param2).history = new ArrayList(); ctx._source.tags.get(param2).history.add(ctx._source.tags.get(param2).current); ctx._source.tags.get(param2).current = param1;",
    "params" : {
      "param1" : {
        "tagDate": 1501848372292,
        "taggedByUser": "my_user",
        "tagActive": true
      },
      "param2": "relevant"
    }
  }
}'
And I get as a result:
{
  "_id": "AV2uvqCUfGXyNt1PjTbb",
  "_source": {
    "tags": {
      "relevant": {
        "current": {
          "tagActive": true,
          "tagDate": 1501848372292,
          "taggedByUser": "my_user"
        },
        "history": [
          {
            "tagDate": 1501848372292,
            "taggedByUser": "dev",
            "tagActive": true
          }
        ]
      }
    }
  }
}
Running the same script with new content in param1 (a new tag) gives:
{
  "_id": "AV2uvqCUfGXyNt1PjTbb",
  "_source": {
    "tags": {
      "relevant": {
        "current": {
          "tagActive": true,
          "tagDate": 1501841334127,
          "taggedByUser": "admin"
        },
        "history": [
          {
            "tagDate": 1501848372292,
            "taggedByUser": "dev",
            "tagActive": true
          },
          {
            "tagActive": true,
            "tagDate": 1501848372292,
            "taggedByUser": "my_user"
          }
        ]
      }
    }
  }
}
Update - if `tags` is a list
If tags is a list of "inner json objects", for example:
{
  "tags": [
    {
      "relevant": {
        "current": {
          "tagDate": 1501841334127,
          "taggedByUser": "dev",
          "tagActive": true
        }
      }
    },
    {
      "new_tag": {
        "current": {
          "tagDate": 1501848372292,
          "taggedByUser": "admin",
          "tagActive": true
        }
      }
    }
  ]
}
you have to iterate over the list to find the index of the right element. Let's say you want to update the element new_tag. First, you need to check whether this tag exists: if so, get its index; if not, return from the script. With the index in hand, just get the right element and proceed almost the same way as before. The script looks like this:
int num = -1;
for (int i = 0; i < ctx._source.tags.size(); i++) {
  if (ctx._source.tags.get(i).get(param2) != null) {
    num = i;
    break;
  };
};
if (num == -1) {
  return;
};
if (ctx._source.tags.get(num).get(param2).history == null)
  ctx._source.tags.get(num).get(param2).history = new ArrayList();
ctx._source.tags.get(num).get(param2).history.add(ctx._source.tags.get(num).get(param2).current);
ctx._source.tags.get(num).get(param2).current = param1;
And the whole query:
curl -X POST \
  http://127.0.0.1:9200/script/test/AV29gAnpqbJMKVv3ij7U/_update \
  -d '{
  "script": {
    "inline": "int num = -1; for (int i = 0; i < ctx._source.tags.size(); i++) {if (ctx._source.tags.get(i).get(param2) != null) {num = i; break;};}; if (num == -1) {return;}; if (ctx._source.tags.get(num).get(param2).history == null) ctx._source.tags.get(num).get(param2).history = new ArrayList(); ctx._source.tags.get(num).get(param2).history.add(ctx._source.tags.get(num).get(param2).current); ctx._source.tags.get(num).get(param2).current = param1;",
    "params" : {
      "param1" : {
        "tagDate": 1501848372292,
        "taggedByUser": "my_user",
        "tagActive": true
      },
      "param2": "new_tag"
    }
  }
}'
Result:
{
  "tags": [
    {
      "relevant": {
        "current": {
          "tagDate": 1501841334127,
          "taggedByUser": "dev",
          "tagActive": true
        }
      }
    },
    {
      "new_tag": {
        "current": {
          "tagActive": true,
          "tagDate": 1501848372292,
          "taggedByUser": "my_user"
        },
        "history": [
          {
            "tagDate": 1501848372292,
            "taggedByUser": "admin",
            "tagActive": true
          }
        ]
      }
    }
  ]
}
I think you can do something like this with Groovy scripting:
{
  "script": "if( ctx._source.containsKey(\"field_name\") ){ ctx.op = \"none\"} else{ctx._source.field_name= field_value;}"
}
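On recent Elasticsearch versions, where Painless is the default scripting language, roughly the same idea (check whether the tag exists, initialize history if it is missing, then move current into it) can be sketched like this; the index name my-index, document id 1, and the parameter values are placeholders, and the tags structure is assumed to match the question:
POST my-index/_update/1
{
  "script": {
    "lang": "painless",
    "source": "def tag = ctx._source.tags[params.tagName]; if (tag != null) { if (tag.history == null) { tag.history = new ArrayList(); } tag.history.add(tag.current); tag.current = params.newCurrent; }",
    "params": {
      "tagName": "relevant",
      "newCurrent": {
        "tagDate": 1501848372292,
        "taggedByUser": "my_user",
        "tagActive": true
      }
    }
  }
}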
