Nested keys in Redis using Spring Boot

I want to run a job in Spring Boot using Quartz, where multiple threads will execute the method.
What I want is to save the result in Redis for every execution, so I can get an idea of how well the job is working.
I want to save the data in Redis in this form:
{
  "2020-04-20": [
    {
      "item_1": {
        "success": "true",
        "message": ""
      }
    },
    {
      "item_2": {
        "success": "true",
        "message": ""
      }
    }
  ]
}
I want to insert all the items under the date key.
Since multiple threads are working, every thread works on some item, so all items should be inserted under a single key (the date).
Is it possible?
One solution is to overwrite the data of the date key again and again: first getting the data from Redis, appending the item to it, and saving the key back to Redis.
Is there another way, perhaps using annotations like @Cacheable, @CachePut, etc., so that I can create a nested key and have each item appended to the date key automatically?

Have you considered RedisJSON?
Something like this (I haven't tested it, I don't have RedisJSON handy):
JSON.SET "2020-04-20" . '[]'        // create the array once
JSON.ARRAPPEND "2020-04-20" . '{    // every thread issues a command like this
  "item": {
    "success": "true",
    "message": "thread 123"
  }
}'
JSON.ARRAPPEND "2020-04-20" . '{
  "item": {
    "success": "true",
    "message": "thread 456"
  }
}'
JSON.ARRAPPEND is supposed to be atomic.

I solved it using Redis set functionality.
I am using the Jedis client in my project.
It has very useful functions such as:
1) sadd => insert an element into a set, O(1)
2) srem => remove an element from a set, O(1)
3) smembers => get all members of a set, O(N)
This is what I needed.
In my case the date is the key, and the other details (one JSON object) are the members of the set. So I convert my JSON data to a string when adding a member to the set, and when reading the data back I convert it from string to JSON.
This solved my problem.
Note: there is also list functionality that could be used, but the time complexities for lists are not O(1). In my case I am sure I will not have duplicates, so a set works for me.
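For illustration, here is a minimal sketch of that approach using Jedis and Jackson (the class name, item fields, and connection details are assumptions, not from the original post):

import java.util.Set;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import redis.clients.jedis.Jedis;

public class JobResultStore {

    private final ObjectMapper mapper = new ObjectMapper();

    // Each worker thread calls this with the result of the item it processed.
    public void saveResult(String date, String itemName, boolean success, String message) throws Exception {
        ObjectNode item = mapper.createObjectNode();
        item.putObject(itemName)
            .put("success", String.valueOf(success))
            .put("message", message);

        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // SADD is atomic, so concurrent threads can safely add to the same date key.
            jedis.sadd(date, mapper.writeValueAsString(item));
        }
    }

    // Read everything back and convert each member from string to JSON.
    public void printResults(String date) throws Exception {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Set<String> members = jedis.smembers(date);
            for (String member : members) {
                System.out.println(mapper.readTree(member).toPrettyString());
            }
        }
    }
}

In a real multi-threaded job you would borrow connections from a JedisPool instead of creating a Jedis instance per call.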

Related

Azure Data Factory REST API paging with Elasticsearch

While developing a pipeline that will use Elasticsearch as a source, I faced an issue related to paging. I am using the Elasticsearch SQL API. Basically, I started by making the request in Postman, and it works well. The request body looks like the following:
{
  "query": "SELECT Id,name,ownership,modifiedDate FROM \"core\" ORDER BY Id",
  "fetch_size": 20,
  "cursor": ""
}
After the first run, the response body contains a cursor string, which is a pointer to the next page. If I send the request in Postman and provide the cursor value from the previous request, it returns the data for the second page, and so on. I am trying to achieve the same result in Azure Data Factory. For this I am using a copy activity, which stores the response to Azure Blob storage. The setup for the source is the following:
[screenshot: copy activity source configuration]
This is the expression for the body:
{
  "query": "SELECT Id,name,ownership,modifiedDate FROM \"#{variables('TableName')}\" ORDER BY Id",
  "fetch_size": #{variables('Rows')},
  "cursor": ""
}
I have no idea how to correctly set up the pagination rule. The pipeline works properly, but only for the first request. I've tried setting up Headers.cursor with the expression $.cursor, but this setup leads to an infinite loop and the pipeline fails with the Elasticsearch restriction.
I've also tried to read the documentation at https://learn.microsoft.com/en-us/azure/data-factory/connector-rest#pagination-support but it seems pretty limited in terms of usage examples and is difficult to understand.
Could somebody help me understand how to build the pipeline with paging?
The response with the cursor looks like:
{
  "columns": [
    {
      "name": "companyId",
      "type": "integer"
    },
    {
      "name": "name",
      "type": "text"
    },
    {
      "name": "ownership",
      "type": "keyword"
    },
    {
      "name": "modifiedDate",
      "type": "datetime"
    }
  ],
  "rows": [
    [
      2,
      "mic Inc.",
      "manufacture",
      "2021-03-31T12:57:51.000Z"
    ]
  ],
"cursor": "g/WuAwFaAXNoRG5GMVpYSjVWR2hsYmtabGRHTm9BZ0FBQUFBRUp6VGxGbUpIZWxWaVMzcGhVWEJITUhkbmJsRlhlUzFtWjNjQUFBQUFCQ2MwNWhaaVIzcFZZa3Q2WVZGd1J6QjNaMjVSVjNrdFptZDP/////DwQBZgljb21wYW55SWQBCWNvbXBhbnlJZAEHaW50ZWdlcgAAAAFmBG5hbWUBBG5hbWUBBHRleHQAAAABZglvd25lcnNoaXABCW93bmVyc2hpcAEHa2V5d29yZAEAAAFmDG1vZGlmaWVkRGF0ZQEMbW9kaWZpZWREYXRlAQhkYXRldGltZQEAAAEP"
}
I finally found the solution; hopefully it will be useful for the community.
Basically, the solution needs to be split into the following steps.
Step 1: Make the first request as in the question description and stage the file to blob storage.
Step 2: Read the blob file, get the cursor value, and set it to a variable.
Step 3: Keep requesting data with a changed body:
{"cursor" : "#{variables('cursor')}" }
The pipeline looks like this:
[screenshot: pipeline]
The pagination configuration looks like the following:
[screenshot: pagination rule]. It is a workaround, as the server ignores this header, but we need something that allows sending the request in a loop.
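For reference, the cursor loop that the pipeline reproduces can be sketched outside ADF. Below is a minimal Java illustration of the same paging mechanism against the Elasticsearch SQL endpoint (the URL, index name, and the naive regex-based cursor extraction are assumptions made for the sketch):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EsSqlCursorPager {

    // Naive cursor extraction; a real client would use a JSON parser.
    private static final Pattern CURSOR = Pattern.compile("\"cursor\"\\s*:\\s*\"([^\"]+)\"");

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String endpoint = "http://localhost:9200/_sql?format=json"; // assumed endpoint

        // First request: the full SQL query plus fetch_size, as in the question.
        String body = "{\"query\":\"SELECT Id,name,ownership,modifiedDate FROM \\\"core\\\" ORDER BY Id\",\"fetch_size\":20}";

        while (true) {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(endpoint))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();
            String response = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
            System.out.println(response); // in ADF, this page is what the copy activity stages to blob

            Matcher m = CURSOR.matcher(response);
            if (!m.find()) {
                break; // no cursor in the response means the last page was reached
            }
            // Follow-up requests send only the cursor, as in Step 3 above.
            body = "{\"cursor\":\"" + m.group(1) + "\"}";
        }
    }
}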

Add an object value to a field in Elasticsearch during ingest and drop empty-valued fields during ingest

I am ingesting CSV data into Elasticsearch using the append processor. I already have two fields that are objects (object1 and object2), and I want to append them both into an array in a different field (mainlist), so it would come out as mainlist: [ {object1}, {object2} ]. I have tried the set processor with the copy_from parameter, and I am getting an error that I am missing the required property "value", even though the Elasticsearch documentation clearly doesn't use the "value" property when it uses copy_from: {"set": {"field": "mainlist", "copy_from": ["object1", "object2"]}}. My syntax is copied exactly from the documentation. Please help.
Furthermore, I need to drop empty fields at the ingest level so they are not returned; I don't wish to have "fieldname": "" returned to the user. What is the best way to do that? I am new to Elasticsearch and it has not been going well.
As to dropping the empty fields at ingest level -- set up a pipeline:
PUT _ingest/pipeline/no_empty_fields
{
  "description": "Removes empty-ish fields from a doc",
  "processors": [
    {
      "script": {
        "source": """
          def keys_to_remove = ctx.keySet()
                                  .stream()
                                  .filter(field -> ctx[field] == null ||
                                                   ctx[field] == "")
                                  .collect(Collectors.toList());
          for (key in keys_to_remove) {
            ctx.remove(key);
          }
        """
      }
    }
  ]
}
and apply it upon indexing
POST myindex/_doc?pipeline=no_empty_fields
{
  "fieldname23": 123,
  "fieldname": null,
  "fieldname123": ""
}
You can of course extend the conditions to ditch other fields such as "undefined", "Infinity" and others.

How to correlate a drop-down list

I have a response like the one below:
"distributionChannelList":[
{
"id":1,
"description":"Agency1"
},
{
"id":5,
"description":"Agency2"
},
{
"id":4,
"description":"Agency3"
},
{
"id":3,
"description":"Agency4"
}
],
"marketingTypeList":[
{
"id":1,
"description":"Type1".......
There are many 'id' and 'description' values in my response. Agency1, Agency2, etc. are drop-down options in my application.
So I want JMeter to pick a different agency every time and pass it in subsequent requests.
How can I achieve this?
Use a JSON Extractor or a Regular Expression Extractor to fetch all the descriptions, with Match No. set to 0 for a random match. Pass the created variable to the next request as ${varDescription}. On every run, a random value will be fetched and provided to the next request.
The snapshot below is an example for regex, but prefer JSON in your case. For fetching with JSON, use $..description as the JSON path expression. Repeat the same for the other fields if required.
Hope this helps.
Update:
Please check the config below; it extracts 2 values in sync, but ${cnt} should be the same value for both. I have used a counter just for the demo. You can use the random function to generate a value between 1 and 4 and pass that variable as ${rnd};${rnd}.

What is query_hash in Instagram?

I was working with GraphQL for the first time, and I saw that Instagram hashes its queries.
I searched around, but I don't know if my understanding is correct: is the hash like a persisted query stored in a cache?
Or am I wrong?
Example: this is my request payload:
{
  "operationName": "user",
  "variables": {},
  "query": "query user {\n users {\n username\n createdAt\n _id\n }\n}\n"
}
and this is Instagram's:
query_hash: 60b755363b5c230111347a7a4e242001
variables: %7B%22only_stories%22%3Atrue%7D
(it is URL-encoded).
Now, how could I hash my query? I'm using NodeJS as the backend and React as the frontend.
I would like to understand how it works! Thank you guys!
The persisted query is used to improve GraphQL network performance by reducing the request size.
Instead of sending a full query which could be very long, you send a hash to the GraphQL server which will retrieve the full query from the key-value store using the hash as the key.
The key-value store can be Memcached, Redis, etc.
Apollo Server comes with automatic persisted queries out of the box; I recommend giving it a try. They have published a blog post about it: https://blog.apollographql.com/automatic-persisted-queries-and-cdn-caching-with-apollo-server-2-0-bf42b3a313de
If you want to build your own solution, you can use this package to do the hashing yourself: https://www.npmjs.com/package/hash.js
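As an illustration of the general idea (not Instagram's or Apollo's exact scheme), a persisted-query store boils down to hashing the query text and using the digest as the lookup key. Here is a minimal Java sketch, where the class name, the choice of SHA-256, and the in-memory map standing in for Redis/Memcached are all assumptions:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

public class PersistedQueryStore {

    // Stand-in for a real key-value store such as Redis or Memcached.
    private final Map<String, String> store = new HashMap<>();

    // Hash the full query text and remember the query under that hash.
    public String persist(String query) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] bytes = digest.digest(query.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        String hash = hex.toString();
        store.put(hash, query);
        return hash;
    }

    // A client later sends only the hash; the server looks up the full query.
    public String lookup(String hash) {
        return store.get(hash);
    }

    public static void main(String[] args) throws Exception {
        PersistedQueryStore queries = new PersistedQueryStore();
        String hash = queries.persist("query user { users { username createdAt _id } }");
        System.out.println(hash + " -> " + queries.lookup(hash));
    }
}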
query_hash (or query_id) does not hash the variables or the parameters, it hashes the payload.
Let's say your actual path is /graphql and your payload is:
{
  "user": {
    "profile": [
      "username",
      "user_id",
      "profile_picture"
    ],
    "feed": {
      "posts": {
        "data": [
          "image_url"
        ],
        "page_size": "{{variables.max_count}}"
      }
    }
  }
}
Then this GraphQL payload is hashed and becomes d4d88dc1500312af6f937f7b804c68c3. Now, instead of posting to /graphql, you call /graphql/query/?query_hash=d4d88dc1500312af6f937f7b804c68c3. This way you have hashed the payload, that is, the "keys" that are requested from GraphQL. So when you pass variables as a parameter, the payload does not actually change, because the variable placeholders are constant as well; you change their values on the backend, not in the payload.

ServiceNow REST API: return single column

Is there a way to perform a call to the ServiceNow REST API that returns a single column of a table? I would like to query the server table for only the names of the servers, and not have the entire record with its 50-plus fields returned.
The latest REST Table API (as of Eureka, I think) supports the parameter sysparm_fields, which allows you to specify a comma-delimited list of fields to include in the response:
This URL template:
https://YOURINSTANCENAME.service-now.com/api/now/v1/table/incident?sysparm_fields=number,short_description,caller_id.name
Would give you a result with something like:
{
  "result": [
    {
      "caller_id.name": "",
      "short_description": "Unable to get to network file shares",
      "number": "INC0000002"
    }
  ]
}
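For completeness, here is a minimal sketch of issuing that call from Java; the instance name, the credentials, and the cmdb_ci_server table and name field are placeholders, not taken from the answer:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class ServiceNowFieldQuery {

    public static void main(String[] args) throws Exception {
        // Placeholders: replace with your instance and credentials.
        String instance = "YOURINSTANCENAME";
        String auth = Base64.getEncoder()
                .encodeToString("username:password".getBytes());

        // Ask the Table API for a single column of the server table.
        String url = "https://" + instance
                + ".service-now.com/api/now/v1/table/cmdb_ci_server?sysparm_fields=name";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Accept", "application/json")
                .header("Authorization", "Basic " + auth)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The result array will contain only the requested field for each record.
        System.out.println(response.body());
    }
}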
