sh file to construct mongo document - bash

My document structure looks like this:
{"name":"John", "age":32, ...other fields}
All other fields need not be initialized, only name and age. I want to make a script that takes in a name and a number:
./client.sh John 32
and in the script, it will do something like
db.client.insert({"name":$1,"age":$2});
How can I achieve this?

Here is a simple example script that works
#!/bin/bash
if [ $# -lt 2 ]
then
echo "USAGE: $0 name age"
exit 1
fi
mongo <<EOF
use test
db.client.insert({"name":"$1", "age":$2})
db.client.find()
EOF
It assumes that mongo is installed and in your path and that you have the client collection in the test database.
A sample run below
hduser@localhost:~/temp$ ./mongo_ins.sh "overflow" 20
MongoDB shell version: 2.6.10
connecting to: test
switched to db test
WriteResult({ "nInserted" : 1 })
{ "_id" : ObjectId("56b35d134c24bf7c1190cfb3"), "name" : "Stack", "age" : 1 }
{ "_id" : ObjectId("56b35dabe23223802ea3fa61"), "name" : { }, "age" : 10 }
{ "_id" : ObjectId("56b35ebbd69a0abeeb817fe3"), "name" : "overflow", "age" : 20 }
bye
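If you want to guard a bit against malformed input, one option is to validate the age before it is interpolated into the JavaScript. A minimal sketch along the same lines (assuming the same legacy mongo shell and test database as above):
#!/bin/bash
if [ $# -lt 2 ]
then
    echo "USAGE: $0 name age"
    exit 1
fi
# Reject non-numeric ages so the interpolated value is always a valid JS number.
case "$2" in
    ''|*[!0-9]*) echo "age must be a whole number" >&2; exit 1 ;;
esac
mongo <<EOF
use test
db.client.insert({"name":"$1", "age":$2})
EOF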
Hope this helps

Related

Loop through JSON file in bash script

I have a bash script where the user enters the name of a database, and then my script needs to loop through that database's config file. My JSON looks like this:
{
    "_id":1,
    "db_name":"postgres",
    "username":"postgres_user",
    "password":"postgres_pass",
    "daily" : {
        "days_to_backup" : [1,2],
        "delete_after" : 14,
        "compressed" : true,
        "path" : "localhost"
    },
    "monthly" : {
        "days_to_backup" : [2,5,30],
        "delete_after" : 7,
        "compressed" : false,
        "path" : "localhost"
    },
    "yearly" : {
        "days_to_backup" : [100],
        "delete_after" : 14,
        "compressed" : true,
        "path" : "localhost"
    }
}
{
    "_id":2,
    "db_name":"test",
    "username":"test_user",
    "password":"test_pass",
    "daily" : {
        "days_to_backup" : [1,7],
        "delete_after" : 14,
        "compressed" : true,
        "path" : "localhost"
    },
    "monthly" : {
        "days_to_backup" : [2,5,25],
        "delete_after" : 7,
        "compressed" : false,
        "path" : "localhost"
    },
    "yearly" : {
        "days_to_backup" : [50],
        "delete_after" : 14,
        "compressed" : true,
        "path" : "localhost"
    }
}
Now, once the user runs the script with a particular database name, I want to loop through the "days_to_backup" numbers and, for each day, check whether it equals today's date. The problem is that I don't know how to fetch daily/monthly/yearly.days_to_backup and then loop through the days.
I tried with jq:
#! /bin/sh
# Monday is the first day of the week
export Today=$(date +%d)
while getopts d: flag
do
    case "${flag}" in
        d) database=${OPTARG};;
    esac
done
if jq .db_name file.json | grep $database
then
    jq 'select(.db_name=="'$database'")' file.json
    ## this one returns the whole document of the specified database
    ## loop here for daily/monthly/yearly.days_to_backup???
else
    echo "database not found"
fi
Edit: my input is the name of the database ("postgres"), and then I will iterate through the daily/monthly/yearly arrays; if a number matches the current date (today is the 8th of the month), the output will be: echo "backup today".
Edit2: sorry, my bad. For daily it means the day of the week, so (1,7) are Sunday and Saturday, and if the current date is a Sunday or Saturday I want an echo "backup today". For monthly it means the day of the month, so if I have monthly (01,08) it will echo "backup today" because today is the 8th of the month. And for yearly I want days since the first day of the year.
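(For reference, each of those three interpretations corresponds to a different date format specifier; a rough sketch, assuming a date implementation that supports the usual strftime codes. Note that %u counts 1 = Monday through 7 = Sunday, while the convention above has 1 = Sunday; %w gives 0 = Sunday through 6 = Saturday instead.)
dow=$(date +%u)   # day of week (1 = Monday .. 7 = Sunday)
dom=$(date +%d)   # day of month, zero-padded (01..31)
doy=$(date +%j)   # day of year, zero-padded (001..366)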
OK, so I succeeded in looping through the array, but the problem now is that it runs over the arrays of both top-level objects (_id 1 and 2):
for i in $(jq -r ".daily.days_to_backup | .[]" file.json)
do
echo $i "im number"
done
How do I restrict it so that it runs only over _id: 1 / db_name: "postgres"?
An efficient approach would be based on using any, e.g. along the lines of the following:
jq --arg db "$database" --argjson Today "${Today#0}" '
  select(.db_name == $db and
         any(.daily.days_to_backup[], .monthly.days_to_backup[], .yearly.days_to_backup[];
             . == $Today))
' file.json
Note that both parameters ($database and $Today) should be passed in as command-line arguments rather than spliced into the filter with shell string interpolation. Passing $Today with --argjson makes it a JSON number rather than a string, and the ${Today#0} expansion strips the leading zero that date +%d produces (e.g. 08), which would otherwise not be valid JSON.
Note also that the above query will return a stream of 0 or more values. You might want to guarantee the names are distinct, e.g. by using the jq built-in filter unique (defined for arrays), or the stream-oriented filter uniques defined in the jq Cookbook.
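For completeness, here is one way the script could use that filter to decide whether to print "backup today". This is a rough sketch only: it reuses the same file.json and, like the answer above, compares the single $Today value against all three arrays, ignoring the day-of-week / day-of-year distinction from Edit2. With -e, jq exits non-zero when the select produces no output.
#! /bin/sh
Today=$(date +%d)
database=postgres
if jq -e --arg db "$database" --argjson Today "${Today#0}" '
     select(.db_name == $db and
            any(.daily.days_to_backup[], .monthly.days_to_backup[], .yearly.days_to_backup[];
                . == $Today))
   ' file.json > /dev/null
then
    echo "backup today"
fi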

Executing a perl script in bash, need to parse its JSON output and put it into a variable [duplicate]

This question already has answers here:
Parsing JSON with Unix tools
(45 answers)
Closed 3 years ago.
I have an automation script that I'm writing in Bash. I need to execute a perl script from within it, capture/parse one specific part of its output, and use it as a variable in order to complete the bash script's tasks.
For example:
echo "Creating database for "$custname
perl /home/dib/testing/addCustomer.cgi name=$custname prefix=$customerno
The perl script "addCustomer.cgi" will return JSON output of:
Content-Type: application/json; charset=UTF-8
{
    "recordcount" : 1,
    "status" : "OK",
    "statusText" : "Add Customer: 40:Testing",
    "customer" : [
        {
            "address" : "",
            "city" : "",
            "country" : "US",
            "description" : "",
            "email" : "",
            "endDate" : "0000-00-00 00:00:00",
            "frontendurl" : "",
            "phone" : "",
            "prefix" : "testing",
            "startDate" : "0000-00-00 00:00:00",
            "state" : "",
            "customerID" : "40",
            "subscriberName" : "Testing",
            "url" : "",
            "zip" : ""
        }
    ],
    "timestamp" : 1559163419
}
What I need to capture is the customerID number, stick it into a variable, and use it to finish the bash script. Is something like this possible? I always see parsing or passing from bash to perl but not the other way around.
Append this to your Perl command:
| sed 1d | jq -r '.customer[].customerID'
Output:
40
I assume there's one header line.
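To land the value in a Bash variable, the whole pipeline can go inside a command substitution. A minimal sketch, assuming the same addCustomer.cgi invocation and variables from the question:
# Capture only the customerID from the CGI-style output (header line stripped by sed 1d).
custid=$(perl /home/dib/testing/addCustomer.cgi name="$custname" prefix="$customerno" | sed 1d | jq -r '.customer[].customerID')
echo "Created customer with ID $custid"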

How to setup Sublime Text 3 to build script with MSYS bash?

Hi, I would like to know how to set up a Sublime Text 3 build system to execute a bash script through MSYS bash.
I have tried the following build system:
{
    "cmd" : ["bash", "$file"],
    "selector" : "source.shell",
    "path" : "C:/MinGW/msys/1.0/bin"
}
It does seem to work, but it can't find gcc. I have tried to add multiple paths this way:
"path" : "C:/MinGW/msys/1.0/bin:C:/MinGW/bin"
But the C: seems to break things.
I have also tried this:
"path" : ["C:/MinGW/msys/1.0/bin", "C:/MinGW/bin"]
But it also fails, since path expects a single string.
"path" :"C\:/MinGW/msys/1.0/bin:C\:/MinGW/bin"
"path" :"C:/MinGW/msys/1.0/bin\:C:/MinGW/bin"
"path" :"${C:/MinGW/msys/1.0/bin}:${C:/MinGW/bin}"
These also fail.
"path" : "/c/MinGW/msys/1.0/bin:/c/MinGW/bin"
"path" : "/MinGW/msys/1.0/bin:/MinGW/bin"
Same for those.
Any working suggestion would be greatly appreciated.
To use MSYS bash with Sublime Text 3, you just have to copy this into a new build system:
{
    "cmd" : ["bash", "$file"],
    "selector" : "source.shell",
    "windows": {
        "path" : "$path;c:\\mingw\\bin;c:\\mingw\\msys\\1.0\\bin",
        "cmd" : ["bash", "--login", "-i", "-c", "cd \"$file_path\"; \"$file\""]
    },
    "osx": {
        "path" : "$PATH:/usr/local/bin:/usr/X11R6/bin:/opt/X11/bin:/usr/texbin"
    }
}
Thanks to macite:
https://github.com/macite/sublimebashbuildsystem/blob/master/Bash.sublime-build
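As a usage note (standard Sublime Text 3 behaviour rather than anything specific to this snippet): save the JSON above via Tools > Build System > New Build System as something like Bash.sublime-build in your User package, select it under Tools > Build System, and Ctrl+B should then run the current file through MSYS bash.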

Check existence of a field in HDFS avro format using Pig/Python

I have a set of files in HDFS stored in Avro format. Some of them have a column named id:int as follows
{
    "type" : "record",
    "name" : "metric",
    "fields" : [ {
        "name" : "timestamp",
        "type" : "long"
    }, {
        "name" : "id",
        "type" : "long"
    }, {
        "name" : "metric",
        "type" : "string"
    }, {
        "name" : "value",
        "type" : "double"
    } ]
}
I need to flag the files (output the file names) that have the "id" column. Is there a way to get this done using Pig / a Python UDF / Pig streaming, or embedded Pig in Python? I have used Python UDFs with Pig, but I am not sure how to check for the existence of a field. I would appreciate it if anybody could post a small sample.
Thanks in advance.
If Hadoop Streaming will work for you, you can use AvroAsTextInputFormat, which sends one Avro datum (record) in JSON format to the map tasks (http://avro.apache.org/docs/1.7.4/api/java/org/apache/avro/mapred/AvroAsTextInputFormat.html).
$ hadoop fs -ls avro-test
Found 1 items
-rw-r--r-- 3 brandon.bell hadoop 548 2015-06-17 12:13 avro-test/twitter.avro
The following:
$ hadoop jar hadoop-streaming-2.0.0-mr1-cdh4.3.0.jar -D mapred.reduce.tasks=0 -files avro-1.7.7.jar,avro-mapred-1.7.7-hadoop2.jar -libjars avro-1.7.7.jar,avro-mapred-1.7.7-hadoop2.jar -input avro-test -output avro-test-output -mapper org.apache.hadoop.mapred.lib.IdentityMapper -inputformat org.apache.avro.mapred.AvroAsTextInputFormat
outputs the JSON:
$ hadoop fs -cat avro-test-output/part-*
{"username": "miguno", "tweet": "Rock: Nerf paper, scissors is fine.", "timestamp": 1366150681}
{"username": "BlizzardCS", "tweet": "Works as intended. Terran is IMBA.", "timestamp": 1366154481}
The input files I tested on were found here.
A simple Python script as the mapper will be able to test the keys/values for what you are looking for. To output the filename, you can use the environment variable that is set up for Streaming jobs (this should work unless it has changed in more recent versions):
file_name = os.getenv('map_input_file')
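To make that concrete, here is a rough sketch of a streaming mapper. The answer suggests Python; this sketch uses a shell script instead (any executable works as a streaming mapper), and the field check and environment variable names are assumptions based on the description above rather than something tested against a specific Hadoop version:
#!/bin/bash
# Streaming mapper sketch: records arrive on stdin as one JSON object per line
# (the AvroAsTextInputFormat output). Emit the input file name once if any
# record carries an "id" field.
emitted=0
while IFS= read -r record; do
    if [ "$emitted" -eq 0 ] && printf '%s\n' "$record" | grep -q '"id"[[:space:]]*:'; then
        # map_input_file is set by Hadoop Streaming for each map task
        # (mapreduce_map_input_file on newer releases).
        echo "${map_input_file:-$mapreduce_map_input_file}"
        emitted=1
    fi
done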

ElasticSearch Unassigned Shard

I have 2 nodes in an Elasticsearch cluster, each with 8 CPUs and 16 GB RAM. I have set ES_HEAP_SIZE to 10 GB.
In my yml configuration file on both machines I have set:
index.number_of_shards: 5
index.number_of_replicas: 1
Both machines are allowed as master/data true. Now the problem is that the 0th shard on node 1 is unassigned after a restart. I tried:
for shard in $(curl -XGET http://localhost:9201/_cat/shards | grep UNASSIGNED | awk '{print $2}'); do
    echo "processing $shard"
    curl -XPOST 'localhost:9201/_cluster/reroute' -d '{
        "commands" : [ {
            "allocate" : {
                "index" : "inxn",
                "shard" : '$shard',
                "node" : "node1",
                "allow_primary" : true
            }
        } ]
    }'
done
It does not give any error; it says acknowledged: true and shows the shard status as initializing, but when I view the shard it is still not initialized.
Am I doing anything wrong in the settings? Should I make both nodes master/data true and set shards: 5 and replicas: 1 on both machines?
Any help or suggestion would be greatly appreciated.
Thanks
I used a trick to solve the same problem: I renamed the 0 folder under indices on node1 and then did a forceful assignment of the 0th shard to node1, and it worked for me.
curl -XPOST 'localhost:9201/_cluster/reroute' -d '{
    "commands" : [ {
        "allocate" : {
            "index" : "inxc",
            "shard" : 0,
            "node" : "node1",
            "allow_primary" : true
        }
    } ]
}'
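After issuing the reroute, it can help to watch whether shard 0 actually moves out of the UNASSIGNED/INITIALIZING state. A small check along these lines (assuming the same index name inxc and port 9201 as above):
# Print the rows for shard 0 of the index, including their current state.
curl -s 'localhost:9201/_cat/shards/inxc' | awk '$2 == 0'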
