Why JSON.parse inserts a backslash before # - Ruby

I have a JSON file like the following:
[
{"testid" : 1, "desc" : "with valid email", "rescode" : 200, "request" : {"my_user" : {"email" : "#{$user.email}", "password" : "abcde"}, "source" : "android"}},
{"testid" : 2, "desc" : "with valid phone", "rescode" : 200, "request" : {"my_user" : {"phone" : "#{$user.phone}", "password" : "abcde"}, "source" : "android"}}
]
Now, when I try to parse this JSON into a Hash using JSON.parse(), the content I get has a backslash before the # as follows:
{"my_user"=>{"email"=>"\#{$user.email}", "password"=>"abcde"}, "source"=>"android"}
This prevents me from using the value of $user.email; instead the literal \#{$user.email} is used.
How can I prevent this?
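For what it's worth, #{...} is Ruby string interpolation, which only happens inside double-quoted string literals in Ruby source code, never inside data read from a file. JSON.parse therefore keeps the placeholder as literal text, and the backslash is just how String#inspect displays a literal #{ sequence. A minimal sketch of one possible workaround (not from the original post; it assumes $user responds to the attribute names used in the placeholders, and the file path is hypothetical) is to substitute the placeholders before parsing:

require 'json'

raw = File.read('tests.json') # hypothetical path to the JSON file shown above

# Replace each #{$user.<attr>} placeholder with the attribute's value,
# then parse the result as ordinary JSON.
interpolated = raw.gsub(/\#\{\$user\.(\w+)\}/) { $user.public_send(Regexp.last_match(1)) }
tests = JSON.parse(interpolated)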

Related

Number of records processed in Logstash

We're using Logstash to sync to Elasticsearch and we have around 3 million documents. It takes 3 to 4 hours to sync. Currently, all we can tell is that it has started and stopped. Is there any way to see how many records have been processed in Logstash?
If you're using Logstash 5 or higher, the Logstash Monitoring API can help you: it lets you see and monitor what's happening inside Logstash as it processes events. If you hit the pipeline stats API you'll get the total number of processed events per stage and plugin (input/filter/output):
curl -XGET 'localhost:9600/_node/stats/pipelines?pretty'
You'll get this type of response in which you can clearly see at any time how many events have been processed:
{
  "pipelines" : {
    "test" : {
      "events" : {
        "duration_in_millis" : 365495,
        "in" : 216485,
        "filtered" : 216485,
        "out" : 216485,
        "queue_push_duration_in_millis" : 342466
      },
      "plugins" : {
        "inputs" : [ {
          "id" : "35131f351e2dc5ed13ee04265a8a5a1f95292165-1",
          "events" : {
            "out" : 216485,
            "queue_push_duration_in_millis" : 342466
          },
          "name" : "beats"
        } ],
        "filters" : [ {
          "id" : "35131f351e2dc5ed13ee04265a8a5a1f95292165-2",
          "events" : {
            "duration_in_millis" : 55969,
            "in" : 216485,
            "out" : 216485
          },
          "failures" : 216485,
          "patterns_per_field" : {
            "message" : 1
          },
          "name" : "grok"
        }, {
          "id" : "35131f351e2dc5ed13ee04265a8a5a1f95292165-3",
          "events" : {
            "duration_in_millis" : 3326,
            "in" : 216485,
            "out" : 216485
          },
          "name" : "geoip"
        } ],
        "outputs" : [ {
          "id" : "35131f351e2dc5ed13ee04265a8a5a1f95292165-4",
          "events" : {
            "duration_in_millis" : 278557,
            "in" : 216485,
            "out" : 216485
          },
          "name" : "elasticsearch"
        } ]
      },
      "reloads" : {
        "last_error" : null,
        "successes" : 0,
        "last_success_timestamp" : null,
        "last_failure_timestamp" : null,
        "failures" : 0
      },
      "queue" : {
        "type" : "memory"
      }
    }
  }
}
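To avoid eyeballing that blob repeatedly, the same endpoint can be polled from a small script. A minimal Ruby sketch, assuming the monitoring API is reachable on localhost:9600 and the pipeline is named "test" as in the sample above:

require 'json'
require 'net/http'

# Print the pipeline's event counters every 10 seconds.
uri = URI('http://localhost:9600/_node/stats/pipelines')
loop do
  stats  = JSON.parse(Net::HTTP.get(uri))
  events = stats.dig('pipelines', 'test', 'events') || {}
  puts "in=#{events['in']} filtered=#{events['filtered']} out=#{events['out']}"
  sleep 10
end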

Sample outputs of Rumen or Sample input to Gridmix

I am quite new to using big data tools like Hadoop. I want to run a publicly available cluster trace (https://github.com/google/cluster-data) on YARN or the YARN simulator.
One way to do this is to feed input into YARN via GridMix.
The format in which GridMix (https://hadoop.apache.org/docs/r2.8.3/hadoop-gridmix/GridMix.html) takes input is basically the output of Rumen.
And Rumen (https://hadoop.apache.org/docs/r2.8.3/hadoop-rumen/Rumen.html) takes the JobHistory log generated by a MapReduce cluster as input.
The Google trace is not a MapReduce trace. However, I was wondering whether I can transform it into the format that GridMix takes as input, so that I can then use GridMix.
Can anyone here point me to the input format of GridMix (or the output of Rumen)?
Or suggest another way to do what I want?
Thanks.
The output of Rumen consists of two files:
1. a job-trace file
2. a cluster-topology file
Both files are in JSON format. The job-trace file has the following format:
{
  "jobID" : "job_1546949851050_53464",
  "user" : "mammut",
  "computonsPerMapInputByte" : -1,
  "computonsPerMapOutputByte" : -1,
  "computonsPerReduceInputByte" : -1,
  "computonsPerReduceOutputByte" : -1,
  "submitTime" : 1551801585141,
  "launchTime" : 1551801594958,
  "finishTime" : 1551801630228,
  "heapMegabytes" : 200,
  "totalMaps" : 2,
  "totalReduces" : 1,
  "outcome" : "SUCCESS",
  "jobtype" : "JAVA",
  "priority" : "NORMAL",
  "directDependantJobs" : [ ],
  "mapTasks" : [ {
    "inputBytes" : 25599927,
    ...}]
  ...
}
And the cluster-topology file looks like:
{
  "name" : "<root>",
  "children" : [ {
    "name" : "rack-01",
    "children" : [ {
      "name" : "",
      "children" : null
    }, {
      "name" : "",
      "children" : null
    }, {
      "name" : "",
      "children" : null
    } ]
  }, {
    "name" : "default-rack",
    "children" : [ {
      "name" : "x",
      "children" : null
    } ]
  } ]
}
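If it helps to see the trace format in use, here is a minimal Ruby sketch (an illustration only: the job-trace.json filename is hypothetical, and it assumes the jobs are stored as a JSON array shaped like the sample above, which may differ from what Rumen actually emits) that loads a trace and summarizes each job:

require 'json'

jobs = JSON.parse(File.read('job-trace.json')) # hypothetical filename
jobs.each do |job|
  # Print one line per job: id, task counts and outcome.
  puts format('%-30s maps=%-4d reduces=%-3d outcome=%s',
              job['jobID'], job['totalMaps'], job['totalReduces'], job['outcome'])
end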

How to get data from a JSON response as a variable

I tried to get the value of a variable named version (the first one) using JSONPath, but apparently my solution didn't work at all.
I tried expressions like $..version or $.container..version.
My response is below:
{
  "container" : {
    "version" : 8,
    "updatedBy" : "user111",
    "updatedOn" : "2017-08-17T16:00:24Z",
    "id" : 16,
    "dataEnt" : {
      "dataEntid" : "dataEntid-000032",
      "dataEnttype" : "21"
    },
    "impact" : [ ],
    "operationalFocus" : false,
    "periodicity" : {
      "version" : 0,
      "updatedBy" : "unknown",
      "updatedOn" : "2017-03-31T16:44:08Z",
      "step" : 1,
      "period" : 31084132,
      "_VALIDATION" : {
        "valid" : true,
        "saveAll" : true,
        "reasons" : [ ],
        "details" : {
          "period" : {
            "valid" : true,
            "saveAll" : true,
            "risks" : [ ],
            "rmiCode" : null,
            "rmiMessage" : null
          },
          "version" : {
            "valid" : true,
            "saveAll" : true,
            "risks" : [ ],
            "rmiCode" : null,
            "rmiMessage" : null
          },
          "step" : {
            "valid" : true,
            "saveAll" : true,
            "risks" : [ ],
            "rmiCode" : null,
            "rmiMessage" : null
          }
        },
        "rmiCode" : null,
        "rmiMessage" : null
      },
      "_META" : { }
    }
First of all, the JSON you pasted is invalid: it's missing 2 curly brackets at the end (the root and container objects are not closed). If this is not a copy/paste error on SO but an actual data problem, you may need to correct that first.
If I understood correctly, you want the value from this field in the variable:
"version" : 8
If so, the JSON path should be:
$.container.version
or
container.version
if you prefer a relative path to an absolute one.
A path like $..version or $.container..version will select multiple version fields (the "version" : 0 in the periodicity property, and the one that is an object inside _VALIDATION).
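If you are reading the response in Ruby rather than with a JSONPath evaluator, the same absolute path maps onto a plain hash lookup. A minimal sketch, assuming response is whatever HTTP response object holds the (corrected) JSON document above:

require 'json'

# $.container.version expressed as a hash lookup; dig returns nil instead
# of raising if any key along the path is missing.
version = JSON.parse(response.body).dig('container', 'version')
# => 8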
The following expression will get you the desired result.
Variable: ContainerVersion
JSON Expression: $..container.version
Now the stored version value can be referenced using: ${ContainerVersion}
If there are multiple "version" tags, you can load all of the "version" values with the following expression:
$..container.version[*]
You can then reference the variables as ${Var_1}, ${Var_2}, etc.
Add a Debug Sampler to see the loaded variable names and their corresponding values.
Hope the above helps.

How can I parse a multiline JSON file in Logstash to a hash?

I have a valid multiline JSON file. I want to parse it and assign its keys as field names and its values as field values.
Is it possible to do this automatically?
input {
  file {
    path => "/home/logstash/xunit.json"
    codec => json
  }
}
output {
  stdout {}
  elasticsearch {
    protocol => "http"
    codec => "json"
    host => "kibana.dev"
    port => "9200"
  }
}
After using this config, I see that something was added, but I can't see the fields from my JSON appearing. Is it possible to grab name, severity, status, and the start & stop dates?
My JSON example:
[
  {
    "uid" : "441d1d1dd296fe60",
    "name" : "test_buylinks",
    "title" : "Test buylinks",
    "time" : {
      "start" : 1419621623182,
      "stop" : 1419621640491,
      "duration" : 17309
    },
    "severity" : "NORMAL",
    "status" : "FAILED"
  },
  {
    "uid" : "a88c89b377aca0c9",
    "name" : "test_buylinks",
    "title" : "Test buylinks",
    "time" : {
      "start" : 1419621623182,
      "stop" : 1419621640634,
      "duration" : 17452
    },
    "severity" : "NORMAL",
    "status" : "FAILED"
  },
  {
    "uid" : "32c3f8b52386c85c",
    "name" : "test_buylinks",
    "title" : "Test buylinks",
    "time" : {
      "start" : 1419621623185,
      "stop" : 1419621640826,
      "duration" : 17641
    },
    "severity" : "NORMAL",
    "status" : "FAILED"
  }
]
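For reference, the fields the question asks about (name, severity, status, start and stop dates) are easy to pull out once the array is parsed; here is a minimal plain-Ruby sketch, outside Logstash, against the file path from the config above:

require 'json'

JSON.parse(File.read('/home/logstash/xunit.json')).each do |test|
  start = Time.at(test['time']['start'] / 1000) # epoch milliseconds -> Time
  stop  = Time.at(test['time']['stop'] / 1000)
  puts "#{test['name']} #{test['severity']} #{test['status']} start=#{start} stop=#{stop}"
end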

How to create map from JSON response in Ruby on Rails 3?

I need to create a map/array for autocomplete from a JSON response, and I am looking for the best, most efficient way to do it in Ruby and Rails 3. A portion of the response is below, and the working code I have is before it. What is the one line of code I need to create locations for me?
# Need help making this more efficient
response_fields = JSON.parse(response.body)
predictions = response_fields['predictions']
locations = []
predictions.each do |prediction|
  locations << prediction['description']
end
Sample response from API:
{
  "predictions" : [
    {
      "description" : "Napa, CA, United States",
      "id" : "cf268f9fb9a1b46aed72d59ab85ed40f982763c6",
      "matched_substrings" : [
        {
          "length" : 4,
          "offset" : 0
        }
      ],
      "reference" : "CjQvAAAAqZWNGzqtJf3awNuQNQdnZpl4dBVVXFPrPdz29r1jo1GMWYFuz3KRlK9HgdgszOThEhDeYz_vYgcOPJTaYehF11bUGhR8yH9zqMGV9kenZIo9OTBrSwftgg",
      "terms" : [
        {
          "offset" : 0,
          "value" : "Napa"
        },
        {
          "offset" : 6,
          "value" : "CA"
        },
        {
          "offset" : 10,
          "value" : "United States"
        }
      ],
      "types" : [ "locality", "political", "geocode" ]
    },
You can shorten your code like this:
locations = JSON.parse(response.body)['predictions'].map { |p| p['description'] }
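If the API can respond without a predictions key, a slightly more defensive variant of the same idea (just a sketch, not required by the question) avoids calling map on nil:

locations = JSON.parse(response.body).fetch('predictions', []).map { |p| p['description'] }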
