Best way to get field level exclusion/security on Elastic Search - elasticsearch

I have data that looks like this:
[{
"id":"1",
"name": "Champ",
"lastName": "Camp",
"address": "1922 Breaking Change",
"postalCode": "34802",
"businessLine": "line1"
}, {
"id":"2",
"name": "Kamp",
"lastName": "Mamp",
"address": "1922 Breaking Change 1",
"postalCode": "138283",
"businessLine": "line2"
}, {
"id":"3",
"name": "Jamp",
"lastName": "Tamp",
"address": "1922 Breaking Change 2",
"postalCode": "18941",
"businessLine": "line1"
}, {
"id":"4",
"name": "Kamp",
"lastName": "Mamp",
"address": "1922 Breaking Change 1",
"postalCode": "138283",
"businessLine": "line2"
}, {
"id":"5",
"name": "Gamp",
"lastName": "Damp",
"address": "1922 Breaking Change 12",
"postalCode": "121333",
"businessLine": "line1"
}]
The problem is that I have to enforce security on the searchable fields based on businessLine. Certain fields are protected by businessLine-level security: fields like name, postalCode and address are only searchable if you belong to a businessLine that has access to them.
A scenario:
line1 has access to the field "name" and line2 has access to the field "postalCode".
So for a user with access to the businessLines line1 and line2, the field "name" will be searchable only in the documents with ids 1, 3 and 5.
But the postal code will not be searchable in these documents.
The field "postalCode" will only be searchable in the documents with ids 2 and 4.
I tried the following, and it works:
{
"query": {
"query_string" : {
"default_field" : "content",
"query" : "(name:amp AND businessLine:(line1 OR line3)) OR (postalCode:amp AND businessLine:(line2 OR line3))"
}
}
}
This gives the desired results, but is there a better way of handling this?
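One possible alternative to a hand-written query string is to generate a bool query from a permission map. The sketch below is only an illustration: the `FIELD_ACCESS` map and the `build_query` helper are hypothetical names, and it assumes `businessLine` is indexed so that a `terms` filter matches its values exactly (for example as a keyword field).

```python
from typing import Dict, List

# Hypothetical permission map: which business lines may search which field.
FIELD_ACCESS: Dict[str, List[str]] = {
    "name": ["line1"],
    "postalCode": ["line2"],
}


def build_query(text: str, user_lines: List[str]) -> dict:
    """Build one clause per protected field, restricted to the permitted business lines."""
    should = []
    for field, allowed in FIELD_ACCESS.items():
        # Keep only the business lines that both the field and the user share.
        lines = [line for line in allowed if line in user_lines]
        if not lines:
            continue
        should.append({
            "bool": {
                "must": [{"match": {field: text}}],
                "filter": [{"terms": {"businessLine": lines}}],
            }
        })
    if not should:
        # The user may not search any protected field.
        return {"query": {"match_none": {}}}
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


query = build_query("amp", ["line1", "line2"])
```

Each field is then only matched inside documents whose `businessLine` grants access to it, which mirrors the two parenthesised groups of the query string above.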

Related

Turn a JSON array of key/value pairs into object properties

I'm trying to use JSONata to convert arrays of "key/value" objects into properties of the parent object. My input looks like this:
[
{
"city": "Ottawa",
"properties": [
{
"name": "population",
"value": 37
},
{
"name": "postalCode",
"value": 10001
},
{
"name": "founded",
"value": 1826
}
]
},
{
"city": "Toronto",
"properties": [
{
"name": "population",
"value": 54
},
{
"name": "postalCode",
"value": 10002
}
]
}
]
I'm struggling to generate the output I need. I've seen examples that reference explicit elements, like in this answer, but I need the properties to be converted "dynamically" since I don't know them in advance. I think I need something like this, but I'm missing some particular function:
$[].{
"city": city,
properties.name: properties.value
}
This is the output I need to generate:
[
{
"city": "Ottawa",
"population": 37,
"postalCode": 10001,
"founded": 1826
},
{
"city": "Toronto",
"population": 54,
"postalCode": 10002
}
]
The properties arrays don't always contain the same keys, but the city attributes are always present.
You can use the reduce operator, as described in the Grouping docs here:
$[].(
$city := city;
properties{ "city": $city, name: value }
)
You can play with it live: https://stedi.link/uUANwtE
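To check the expected output outside JSONata, the same reshaping can be sketched in Python. This is not a JSONata replacement, just an illustration of the transformation; `flatten_properties` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def flatten_properties(cities: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Turn each entry's name/value pairs into properties of the entry itself."""
    result = []
    for entry in cities:
        flat: Dict[str, Any] = {"city": entry["city"]}
        # Property names are read from the data, so they need not be known in advance.
        for prop in entry.get("properties", []):
            flat[prop["name"]] = prop["value"]
        result.append(flat)
    return result


cities = [
    {"city": "Ottawa", "properties": [
        {"name": "population", "value": 37},
        {"name": "postalCode", "value": 10001},
        {"name": "founded", "value": 1826},
    ]},
    {"city": "Toronto", "properties": [
        {"name": "population", "value": 54},
        {"name": "postalCode", "value": 10002},
    ]},
]
flattened = flatten_properties(cities)
```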
Please try this expression.
$[].{
"city": $.city,
$.properties[0].name: $.properties[0].value,
$.properties[1].name: $.properties[1].value,
$.properties[2].name: $.properties[2].value,
$.properties[3].name: $.properties[3].value
}
https://try.jsonata.org/s1Ea4kUvo

How to merge fields into an existing document instead of replacing it, using the Update API (Painless) in Elasticsearch

Scenario: I have a document A in Elasticsearch and I want to add to it some fields that exist in document B.
Document A:
{
"id": 121423,
"name": "Sample Name",
"timestamp": "2020-10-01T00:12:00",
"age": 24
}
Document B:
{
"city": "New York",
"job": "programmer"
}
To write to this document I can use the ES write path with a body like this:
{"update":{"_index":"test-index","_id":"121423"}}
{"script": {"source":"if( ctx._source.containsKey('id') ){ ctx._source = params.param1; }","lang":"painless","params":{"param1":{"city": "New York", "job": "programmer"}}},"upsert":{"id":121423,"name": "Sample Name","timestamp": "2020-10-01T00:12:00","age": 24}}
But if document A already exists in ES, as the check (if( ctx._source.containsKey('id') )) detects, the document is completely overwritten. I can change the script to assign each field individually, like:
{ ctx._source.city = params.param1.city; ctx._source.job = params.param1.job }
That would solve my problem, but it creates another one: the logic can't be static, because in the real world the document has a great many fields and the application would be hard to maintain. The desired document after the last update should look like this:
{
"id": 121423,
"name": "Sample Name",
"timestamp": "2020-10-01T00:12:00",
"age": 24,
"city": "New York",
"job": "programmer"
}
So the question is: how can I update the document by appending the new fields in only a few steps, or with a single operation?
You should probably just add "doc_as_upsert" : true
I have never tried it with a script, but it should work.
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
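If the script form from the question is kept, one option worth trying is to copy every entry of the parameter map into the source instead of assigning the whole map. The sketch below builds such a bulk body in Python; `build_bulk_update` is a hypothetical helper, and it assumes the Painless call `ctx._source.putAll(params.param1)` is acceptable in your cluster.

```python
import json
from typing import Any, Dict


def build_bulk_update(index: str, doc_id: str,
                      new_fields: Dict[str, Any],
                      upsert: Dict[str, Any]) -> str:
    """Return a two-line bulk body that merges new_fields into an existing document."""
    action = {"update": {"_index": index, "_id": doc_id}}
    payload = {
        "script": {
            # Assumption: putAll copies each key of param1 into the stored
            # source, leaving the other fields of the document untouched.
            "source": "ctx._source.putAll(params.param1)",
            "lang": "painless",
            "params": {"param1": new_fields},
        },
        # Used as the new document only when no document with this id exists.
        "upsert": upsert,
    }
    # The bulk format expects one JSON object per line and a trailing newline.
    return json.dumps(action) + "\n" + json.dumps(payload) + "\n"


body = build_bulk_update(
    "test-index",
    "121423",
    {"city": "New York", "job": "programmer"},
    {"id": 121423, "name": "Sample Name",
     "timestamp": "2020-10-01T00:12:00", "age": 24},
)
```

Because the field names come from the parameter map, the script stays the same however many fields document B carries.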

Parameterize POST body in JMeter HTTP POST

I am using Apache JMeter to run a few performance tests against a RESTful API for an application that we have developed. I have an endpoint "api/create/empListJob", which adds one or more employee records to MongoDB. The payload for the POST call looks like this:
{
"employeeList": [
{
"first_name": "josh",
"last_name": "don",
"age": "25",
"address": {
"street1": "xyz",
"street2": "apt-10",
"city" : "def",
"state" : "CA",
"zip" : "95055"
},
"deptType": {
"deptID": "1",
"deptName": "R&D"
}
},
{
"first_name": "mary",
"last_name": "jane",
"age": "22",
"address": {
"street1": "zzz",
"street2": "apt-15",
"city" : "yyy",
"state" : "CA",
"zip" : "95054"
},
"deptType": {
"deptID": "2",
"deptName": "HR"
}
}
]
}
As you can see, the payload takes a list of employee data, and it must contain at least one employee record. I have a requirement in which I want a JMeter thread group to have 10 threads, each of which makes a concurrent POST to "api/create/empListJob" with a body containing 10 unique employee records, thus creating a total of 100 records. What is the best way to parameterize the payload?
Take a look at JMeter Functions, like:
__threadNum() - returns the number of current thread (virtual user)
__Random() - returns a random number in a given range
__RandomString() - returns a random string from specified input characters
__UUID() - returns random GUID structure
So for example if you change your JSON payload to look like:
{
"employeeList": [
{
"first_name": "josh-${__threadNum}",
"last_name": "don-${__threadNum}",
"age": "25",
"address": {
"street1": "xyz",
"street2": "apt-10",
"city" : "def",
"state" : "CA",
"zip" : "95055"
},
"deptType": {
"deptID": "1",
"deptName": "R&D"
}
},
{
"first_name": "mary-${__threadNum}",
"last_name": "jane-${__threadNum}",
"age": "22",
"address": {
"street1": "zzz",
"street2": "apt-15",
"city" : "yyy",
"state" : "CA",
"zip" : "95054"
},
"deptType": {
"deptID": "2",
"deptName": "HR"
}
}
]
}
JMeter will create:
- `josh-1` for 1st virtual user
- `josh-2` for 2nd virtual user
- etc.
See Apache JMeter Functions - An Introduction to get familiarized with JMeter Functions concept.
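The payload above yields names that are unique per thread, but the question asks for 10 unique records inside each request. Outside JMeter, the naming scheme can be sketched in Python, for instance to pre-generate request bodies. This is only an illustration: `build_payload` is a hypothetical helper, and combining the thread number with a record index is an assumption about what counts as unique.

```python
from typing import Any, Dict


def build_payload(thread_num: int, records_per_request: int = 10) -> Dict[str, Any]:
    """Build one request body whose records are unique across threads and within the request."""
    employees = []
    for i in range(1, records_per_request + 1):
        # Thread number plus record index makes every name distinct.
        suffix = f"{thread_num}-{i}"
        employees.append({
            "first_name": f"josh-{suffix}",
            "last_name": f"don-{suffix}",
            "age": "25",
            "address": {
                "street1": "xyz",
                "street2": "apt-10",
                "city": "def",
                "state": "CA",
                "zip": "95055",
            },
            "deptType": {"deptID": "1", "deptName": "R&D"},
        })
    return {"employeeList": employees}


# One body per thread: 10 threads x 10 records = 100 records.
payloads = [build_payload(n) for n in range(1, 11)]
```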

Elastic Search. Search by sub-collection value

I need help with a specific ES query.
I have objects in an Elasticsearch index. Here is an example of one of them (Participant):
{
"_id": null,
"ObjectID": 6008,
"EventID": null,
"IndexName": "crmws",
"version_id": 66244,
"ObjectData": {
"PARTICIPANTTYPE": "2",
"STATE": "ACTIVE",
"EXTERNALID": "01010111",
"CREATORID": 1006,
"partAttributeList":
[
{
"SYSNAME": "A",
"VALUE": "V1"
},
{
"SYSNAME": "B",
"VALUE": "V2"
},
{
"SYSNAME": "C",
"VALUE": "V2"
}
],
....
I need to find only the entities that match on a single partAttributeList element. For example, the whole Participant entity that has SYSNAME=A and VALUE=V1 in the same element of partAttributeList.
If I use the usual matches:
{"match": {"ObjectData.partAttributeList.SYSNAME": "A"}},
{"match": {"ObjectData.partAttributeList.VALUE": "V1"}}
Of course I will find more objects than I really need. An example of a redundant object that would be found:
...
{
"SYSNAME": "A",
"VALUE": "X"
},
{
"SYSNAME": "B",
"VALUE": "V1"
}..
As I understand it, you are trying to search multiple fields of the same object for exact matches of a piece of text, so please try this:
https://www.elastic.co/guide/en/elasticsearch/guide/current/multi-query-strings.html
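Requiring both conditions to hold inside the same array element is usually what a nested query is for. The sketch below builds one in Python; `build_nested_query` is a hypothetical helper, and it assumes `ObjectData.partAttributeList` is mapped with the `nested` type, which the question does not state. With a plain object mapping the array elements are flattened and this query will not work.

```python
def build_nested_query(sysname: str, value: str) -> dict:
    """Match documents where a single partAttributeList element has both values."""
    path = "ObjectData.partAttributeList"
    return {
        "query": {
            "nested": {
                # Assumption: this path is mapped as type "nested".
                "path": path,
                "query": {
                    "bool": {
                        # Both clauses are evaluated against the same element.
                        "must": [
                            {"match": {f"{path}.SYSNAME": sysname}},
                            {"match": {f"{path}.VALUE": value}},
                        ]
                    }
                },
            }
        }
    }


query = build_nested_query("A", "V1")
```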

OpenRefine: using Templating to export JSON as records

I have been working with OpenRefine for the last few days, trying to figure out how to export a Google Sheet to a JSON file.
I have the following data that I want to export to a JSON file.
id | first name | last name | friends first name | friends last name | family first name | family last name
1  | James      | Brown     | Judy               | Garland           | Mary              | Brown
   |            |           | John               | Neverland         | Marlene           | Brown
   |            |           | Paul               | Garland           | Judy              | Brown
2  | John       | Buller    | Amy                | Garland           | Francis           | Buller
   |            |           | Peter              | Flake             | John              | Buller
   |            |           | Jules              | Peter             | Judy              | Buller
The JSON that I'm expecting is:
{
"results": [
{
"id": 1,
"firstName": "James",
"lastName": "Brown",
"has": {
"friends": [
{
"firstName": "Judy",
"lastName": "Garland"
},
{
"firstName": "John",
"lastName": "Neverland"
},
{
"firstName": "Paul",
"lastName": "Garland"
}
],
"family": [
{
"firstName": "Mary",
"lastName": "Brown"
},
{
"firstName": "Marlene",
"lastName": "Brown"
},
{
"firstName": "Judy",
"lastName": "Brown"
}
]
}
},
{
"id": 2,
"firstName": "John",
"lastName": "Buller",
"has": {
"friends": [
{
"firstName": "Amy",
"lastName": "Garland"
},
{
"firstName": "Peter",
"lastName": "Flake"
},
{
"firstName": "Jules",
"lastName": "Peter"
}
],
"family": [
{
"firstName": "Francis",
"lastName": "Buller"
},
{
"firstName": "John",
"lastName": "Buller"
},
{
"firstName": "Judy",
"lastName": "Buller"
}
]
}
}
]
}
So far I have tried several approaches:
1) using excel-to-json, but it's limited to single nesting and it has some limitations as to column names
2) using OpenRefine and the Templating tool, but I have encountered several issues:
- Although they are detected as records in OpenRefine, you export rows and not records, so it will export 6 rows to JSON, 4 of them containing empty data
- If I try filling down columns it will also export 6 rows to JSON, 4 of them with duplicates, thus losing the relations between the person and their family members and friends
Any help would be much appreciated, as I'm trying to export about 150,000 records of this type that have to be in this JSON format.
OpenRefine only supports one level of nesting. You might need to go with a programming language or an ETL solution to get nested elements.
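As a rough illustration of the programming-language route, the Python sketch below groups rows into records. It assumes the sheet has been exported as rows of seven columns in the order shown in the question, and that a non-empty id marks the start of a new record; `rows_to_records` is a hypothetical helper.

```python
from typing import Any, Dict, List, Optional, Sequence


def rows_to_records(rows: Sequence[Sequence[str]]) -> Dict[str, Any]:
    """Group continuation rows under the most recent row that carries an id."""
    results: List[Dict[str, Any]] = []
    current: Optional[Dict[str, Any]] = None
    for rid, first, last, fr_first, fr_last, fa_first, fa_last in rows:
        if rid:
            # A non-empty id starts a new record.
            current = {
                "id": int(rid),
                "firstName": first,
                "lastName": last,
                "has": {"friends": [], "family": []},
            }
            results.append(current)
        if current is None:
            raise ValueError("first row must carry an id")
        if fr_first or fr_last:
            current["has"]["friends"].append(
                {"firstName": fr_first, "lastName": fr_last})
        if fa_first or fa_last:
            current["has"]["family"].append(
                {"firstName": fa_first, "lastName": fa_last})
    return {"results": results}


rows = [
    ("1", "James", "Brown", "Judy", "Garland", "Mary", "Brown"),
    ("", "", "", "John", "Neverland", "Marlene", "Brown"),
    ("", "", "", "Paul", "Garland", "Judy", "Brown"),
    ("2", "John", "Buller", "Amy", "Garland", "Francis", "Buller"),
    ("", "", "", "Peter", "Flake", "John", "Buller"),
    ("", "", "", "Jules", "Peter", "Judy", "Buller"),
]
output = rows_to_records(rows)
```

The result can be written out with `json.dump(output, f, indent=2)`; for a large sheet the rows could be read with the standard `csv` module instead of being listed inline.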
