Bash loop is not working when word contains space

I am using jq to parse some of the data, and then looping over the result to extract a few more fields.
cluster_list=`databricks --profile hq_dev clusters list --output JSON | jq 'select(.clusters != null) | .clusters[] | [.cluster_name,.autotermination_minutes,.state,.cluster_id] | @csv' | grep -v "job-"`
for cluster in ${cluster_list[@]}
do
cluster_id=`echo $cluster| cut -d "," -f 4 | sed 's/\"//g' | sed 's/\\\//g'`
cluster_name=`echo "${cluster}"| cut -d "," -f 1| sed 's/\"//g' | sed 's/\\\//g'`
echo $cluster_name
done
cluster_list contains the following values:
"\"Test Space Cluster\",15,\"TERMINATED\",\"ddd-dese23-can858\""
"\"GatewayCluster\",15,\"TERMINATED\",\"ddd-ddsd-ddsds\""
"\"delete_later\",15,\"TERMINATED\",\"1120-195800-93839\""
"\"GatewayCluster_old\",15,\"TERMINATED\",\"0108-2y7272-393893\""
It prints the following:
Test
Space
Cluster
GatewayCluster
delete_later
GatewayCluster_old
Desired output:
The name should not be split across lines when it contains a space, because I perform a few more actions with the name I get here.
Test Space Cluster
GatewayCluster
delete_later
GatewayCluster_old

Your script is more complex than it needs to be. The unquoted expansion in the for loop is split on every space, which is what breaks the name apart. It is better to use read to store each field in a separate variable, with the input field separator IFS set to a comma:
databricks --profile hq_dev clusters list --output JSON |
jq 'select(.clusters != null) | .clusters[] |
[.cluster_name,.autotermination_minutes,.state,.cluster_id] | @csv' |
grep -v "job-" |
sed 's/\\\?"//g' |
while IFS=, read -r name autotermination_minutes state id ; do
echo "$name"
done
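Note: I didn't touch your jq command. The sed line I added removes the quotes, whether backslash-escaped or not. With -r, jq itself omits the outer JSON quoting and the backslash escapes (the quotes that the CSV formatting puts around string fields remain), as described in the man page: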
INVOKING JQ
[...]
--raw-output / -r::
With this option, if the filter's result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. This can be useful for making jq filters talk to non-JSON-based systems.
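To see why the original for loop breaks on spaces while the read loop does not, here is a minimal sketch that needs neither databricks nor jq. The sample records are an assumption standing in for the CSV lines after the quotes have been stripped, and names_via_for and names_via_read are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sample records standing in for the cleaned-up CSV lines (assumed data).
records='Test Space Cluster,15,TERMINATED,ddd-dese23-can858
GatewayCluster,15,TERMINATED,ddd-ddsd-ddsds'

# Unquoted expansion: the shell splits on spaces, tabs and newlines,
# so the first record becomes three separate loop items.
names_via_for() {
  local item
  for item in $records; do
    printf '%s\n' "${item%%,*}"
  done
}

# read consumes one whole line at a time; IFS=, splits only on commas.
names_via_read() {
  local name minutes state id
  while IFS=, read -r name minutes state id; do
    printf '%s\n' "$name"
  done <<< "$records"
}

echo "for loop:";  names_via_for
echo "read loop:"; names_via_read
```

The for loop prints four items for two records, while the read loop prints one name per record with the spaces intact.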

Related

Escaping Bash $ with JQ

I have a number of ec2 instances running in AWS, and I've extracted this information into a file.
aws ec2 describe-instances > instances.json
I also have another file ipAddressList
cat ipAddressList
10.100.39.4
10.100.56.20
10.100.78.11
10.100.78.12
I would like to extract the ImageId for these 4 instances.
I'm able to get the ImageId for individual ip addresses using this command
cat instances.json | jq '.Reservations[] | .Instances[] | select(.PrivateIpAddress == "10.100.39.41") | .ImageId'
But I would like to put this into a bash loop to extract the ImageId's for all 4 instances at once.
I've tried
for i in `cat ipAddressList` ; do jq '.Reservations[] | .Instances[] | select(.PrivateIpAddress == \$i) | .ImageId' instances.json ; done
But it throws an error.
What am I doing wrong please?
Don't inject data into code; use the --arg and --argjson options provided by jq to pass values in safely:
for i in `cat ipAddressList`
do jq --arg i "$i" '
.Reservations[] | .Instances[]
| select(.PrivateIpAddress == $i) | .ImageId
' instances.json
done
On top of that, jq provides options to read in files as raw text, so you could shift the entire loop into jq logic, resulting in just one invocation of jq:
jq --rawfile ips ipAddressList '
.Reservations[].Instances[]
| select(IN(.PrivateIpAddress; ($ips / "\n")[])).ImageId
' instances.json
You're really close with your solution.
What you need is
for i in `cat ipAddressList` ; do jq '.Reservations[] | .Instances[] | select(.PrivateIpAddress == "'$i'") | .ImageId' instances.json ; done
And you should be fine.
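To compare the failed attempt with the two fixes, it can help to print the program text that jq would receive in each case. This is only a sketch of the shell quoting involved, with no jq call at all; the prog_* variable names are hypothetical.

```shell
#!/usr/bin/env bash
i='10.100.39.4'

# Single quotes: neither \ nor $ is special to the shell, so jq would
# receive the characters \$i literally, which is not valid jq syntax.
prog_escaped='select(.PrivateIpAddress == \$i)'

# Closing and reopening the single quotes splices the shell value into
# the program text. It works, but the data becomes part of the code.
prog_spliced='select(.PrivateIpAddress == "'$i'")'

# With --arg the program text stays constant; $i here is a jq variable
# that jq binds to the value given on the command line.
prog_arg='select(.PrivateIpAddress == $i)'

printf '%s\n' "$prog_escaped" "$prog_spliced" "$prog_arg"
```

A value containing a double quote or a backslash would change the meaning of the spliced program, which is the reason to prefer --arg when the input is not fully under your control.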

Parsing CSV records when a value is multiline

Source file looks like this:
"google.com", "vuln_example1
vuln_example2
vuln_example3"
"facebook.com", "vuln_example2"
"reddit.com", "stupidly_long_vuln_name1"
"stackoverflow.com", ""
I've been trying to get the output to be something like this but the line breaks seem to cause me no end of problems. I'm using a "while read line" job to do this because I do some processing on the columns (e.g Vulnerability count and url in this example). This is output into a jenkins job (yuk).
The basic summary of the problem is getting the linebreaks in the csv to be output into the third column while retaining the table structure. I've got a sort of weird example of the desired output below.
||hostname ||Vulnerability count|| Vulnerability list || URL ||
|google.com |3 |vuln_example1 |http://cve.com/vuln_example1|
| | |vuln_example2 |http://cve.com/vuln_example2|
| | |vuln_example3 |http://cve.com/vuln_example3|
|facebook.com |1 |vuln_example2 |http://cve.com/vuln_example2|
|reddit.com |1 |stupidly_long_vuln_name1 |http://cve.com/stupidly_long_vuln_name1|
|stackoverflow.com |0 | ||
Looking at this... I've got a feeling it might be easier by showing some code and example output.
Parsing your input with the command line below makes the problem easier (I'm assuming the inputs are correct):
perl -0777 -pe 's/([^"])\s*\n/\1 /g ; s/[",]//g' < sample.txt
This line invokes Perl to perform two regex substitutions:
s/([^"])\s*\n/\1 /g: This substitution replaces a line break with a space when the line does not end with a quote " (i.e. when a host entry, with all its vulnerabilities, is not yet complete).
s/[",]//g removes all quotes and commas remaining.
For each host entry like this one:
"google.com", "vuln_example1
vuln_example2
vuln_example3"
You'll get:
google.com vuln_example1 vuln_example2 vuln_example3
Then you can assume that each line holds a host followed by its set of vulnerabilities.
The example below stores each line in an array and loops through it, formatting and printing each row:
# Replace this by your custom function
# to get an URL for a given vulnerability
function get_vuln_url () {
# This just prints a dummy URL for a non-empty argument
[[ -z "$1" ]] || echo "http://host/$1.htm"
}
# Format your line (see printf help)
function print_row () {
printf "%-20s|%5s|%-30s|%s\n" "$@"
}
# The Perl command joins each record onto a single line
perl -0777 -pe 's/([^"])\s*\n/\1 /g ; s/[",]//g' < sample.txt |
while read -r line ; do
arr=(${line})
print_row "${arr[0]}" "$((${#arr[@]} - 1))" "${arr[1]}" "$(get_vuln_url "${arr[1]}")"
for v in "${arr[@]:2}" ; do
# Each additional vulnerability gets its own row and its own URL
print_row " " " " "$v" "$(get_vuln_url "$v")"
done
done
Output:
google.com          |    3|vuln_example1                 |http://host/vuln_example1.htm
                    |     |vuln_example2                 |http://host/vuln_example2.htm
                    |     |vuln_example3                 |http://host/vuln_example3.htm
facebook.com | 1|vuln_example2 |http://host/vuln_example2.htm
reddit.com | 1|stupidly_long_vuln_name1 |http://host/stupidly_long_vuln_name1.htm
stackoverflow.com | 0| |
Update.
If you don't have Perl, and your file contains no tabs, you can use this command as a workaround instead:
tr '\n' '\t' < sample.txt | sed -r -e 's/([^"])\s*\t/\1 /g' -e 's/[",]//g' -e 's/\t/\n/g'
tr '\n' '\t' replaces every newline with a tab.
The sed part acts like the Perl line, except that it works on tabs instead of newlines and finally turns the remaining tabs back into newlines.
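If neither Perl nor GNU sed is at hand, the same joining step can be sketched in plain bash. This is only an illustration, under the same assumption as above that a record is complete on the line that ends with a double quote; join_records is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Join physical lines until one ends with a double quote, then strip
# quotes and commas, emitting one "host vuln vuln ..." line per record.
# An unterminated record at end of input is silently dropped.
join_records() {
  local line record="" q='"'
  while IFS= read -r line || [[ -n $line ]]; do
    record+="${record:+ }$line"      # separate joined lines by a space
    if [[ $line == *"$q" ]]; then    # record is complete
      record=${record//"$q"/}        # remove all double quotes
      record=${record//,/}           # remove all commas
      printf '%s\n' "$record"
      record=""
    fi
  done
}

join_records <<'EOF'
"google.com", "vuln_example1
vuln_example2
vuln_example3"
"facebook.com", "vuln_example2"
EOF
```

The output can be piped into the same while read loop as the Perl version.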

Command execution in sed while preserving unmatched part of the line

It is simple - I have a data stream with IPv4 addresses encoded into hexadecimal representation like for example 0c22384e which stands for 12.34.56.78.
I worked out a sed command that substitutes the captured octets with decimal numbers separated by dots.
echo "0c22384e" | sed -E 's/([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})/printf "%d.%d.%d.%d" 0x\1 0x\2 0x\3 0x\4/eg'
This works with a single number, BUT as soon as I add some text that is not supposed to be matched, that text is also passed on for execution (via printf in this case).
How can I preserve the unmatched part of the line without being passed for the execution?
With only one address in a line you could use
echo "Something 0c22384e more" |
sed -r 's/(.*)([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})(.*)/"\1" 0x\2 0x\3 0x\4 0x\5 "\6"/' |
xargs -n6 printf '%s%d.%d.%d.%d%s\n'
EDIT:
I replaced the solution for several addresses on one line with a solution that also handles several lines (assuming there is no '\r' in the stream):
echo "Something 0c22384e more 0c22385e
Second line: 0c22386e and 0c223870
Third line: 0c22388e and 0c223890
4th line: 0c2238ae and 0c2238b0" |
sed 's/$/\r/' |
sed -r 's/[0-9a-f]{8}/\n&\n/g' |
sed -r 's/([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})/printf "%d.%d.%d.%d" 0x\1 0x\2 0x\3 0x\4/e' |
tr -d '\n' |
tr '\r' '\n'
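The same conversion can also be sketched without the e flag of GNU sed, using only bash's regex matching and its printf builtin, so that the unmatched text is never executed. hex_to_dotted is a hypothetical helper name; as with the sed versions, any run of eight lowercase hex digits is treated as an address.

```shell
#!/usr/bin/env bash
# Replace every run of 8 lowercase hex digits in a line with its
# dotted-decimal form, copying all other text through unchanged.
hex_to_dotted() {
  local line=$1 out="" hex
  while [[ $line =~ [0-9a-f]{8} ]]; do
    hex=${BASH_REMATCH[0]}
    out+=${line%%"$hex"*}            # text before the leftmost match
    out+=$(printf '%d.%d.%d.%d' \
      "0x${hex:0:2}" "0x${hex:2:2}" "0x${hex:4:2}" "0x${hex:6:2}")
    line=${line#*"$hex"}             # continue after the match
  done
  printf '%s\n' "$out$line"
}

while IFS= read -r l; do
  hex_to_dotted "$l"
done <<'EOF'
Something 0c22384e more 0c22385e
Second line: 0c22386e and 0c223870
EOF
```

Because the surrounding text is only ever appended to a string, there is no need for the '\r' marker trick.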

How to append lots of variables to one variable with a simple command

I want to stick all the variables into one variable
A=('blah')
AA=('blah2')
AAA=('blah3')
AAB=('blah4')
AAC=('blah5')
#^^lets pretend theres 100 more of these ^^
#Variable composition
#after AAA, is AAB then AAC then AAD etc etc, does that 100 times
I want them all placed into this MASTER variable
#MASTER=${A}${AA}${AAA} (<-- insert AAB, AAC and 100 more variables here)
I obviously don't want to type 100 variables in this expression because there's probably an easier way to do this. Plus I'm gonna be doing more of these therefore I need it automated.
I'm relatively new to sed, awk, is there a way to append those 100 variables into the master variable?
For this specific purpose I DO NOT want an array.
You can use a simple one-liner, quite straightforward, though more expensive:
master=$(set | grep -E '^(A|AA|A[A-D][A-D])=' | sort | cut -f2- -d= | tr -d '\n')
set lists all the variables in name=value format
grep keeps only the variables we need
sort puts them in the right order (probably optional since set gives a sorted output)
cut extracts the values, removing the variable names
tr removes the newlines
Let's test it.
A=1
AA=2
AAA=3
AAB=4
AAC=5
AAD=6
AAAA=99 # just to make sure we don't pick this one up
master=$(set | grep -E '^(A|AA|A[A-D][A-D])=' | sort | cut -f2- -d= | tr -d '\n')
echo "$master"
Output:
123456
With my best guess, how about:
#!/bin/bash
A=('blah')
AA=('blah2')
AAA=('blah3')
AAB=('blah4')
AAC=('blah5')
# to be continued ..
for varname in A AA A{A..D}{A..Z}; do
value=${!varname}
if [ -n "$value" ]; then
MASTER+=$value
fi
done
echo $MASTER
which yields:
blahblah2blah3blah4blah5...
Although I'm not sure whether this is what the OP wants.
echo {a..z}{a..z}{a..z} | tr ' ' '\n' | head -n 100 | tail -n 3
adt
adu
adv
This tells us that the names would run from AAA to ADV to reach 100, or to ADY to reach 103.
echo A{A..D}{A..Z} | sed 's/ /}${/g'
AAA}${AAB}${AAC}${AAD}${AAE}${AAF}${AAG}${AAH}${AAI}${AAJ}${AAK}${AAL}${AAM}${AAN}${AAO}${AAP}${AAQ}${AAR}${AAS}${AAT}${AAU}${AAV}${AAW}${AAX}${AAY}${AAZ}${ABA}${ABB}${ABC}${ABD}${ABE}${ABF}${ABG}${ABH}${ABI}${ABJ}${ABK}${ABL}${ABM}${ABN}${ABO}${ABP}${ABQ}${ABR}${ABS}${ABT}${ABU}${ABV}${ABW}${ABX}${ABY}${ABZ}${ACA}${ACB}${ACC}${ACD}${ACE}${ACF}${ACG}${ACH}${ACI}${ACJ}${ACK}${ACL}${ACM}${ACN}${ACO}${ACP}${ACQ}${ACR}${ACS}${ACT}${ACU}${ACV}${ACW}${ACX}${ACY}${ACZ}${ADA}${ADB}${ADC}${ADD}${ADE}${ADF}${ADG}${ADH}${ADI}${ADJ}${ADK}${ADL}${ADM}${ADN}${ADO}${ADP}${ADQ}${ADR}${ADS}${ADT}${ADU}${ADV}${ADW}${ADX}${ADY}${ADZ
The final cleanup is easily done by hand.
One-liner using a for loop:
for n in A AA A{A..D}{A..Z}; do str+="${!n}"; done; echo ${str}
Output:
blahblah2blah3blah4blah5
Say you have the input file inputfile.txt with arbitrary variable names and values:
name="Joe"
last="Doe"
A="blah"
AA="blah2"
then do:
master=$(eval echo $(grep -o "^[^=]\+" inputfile.txt | sed 's/^/\$/;:a;N;$!ba;s/\n/$/g'))
This will concatenate the values of all variables in inputfile.txt into master variable. So you will have:
>echo $master
JoeDoeblahblah2
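Another option, sketched here as an assumption rather than a tested recipe for the real variable set, is to let bash list its own variable names with compgen -v and pick the wanted ones with a regular expression. This avoids parsing the output of set, so values containing newlines or = signs are not a problem. concat_by_pattern is a hypothetical helper name, and the byte-wise sort is assumed to give the wanted order (A, AA, AAA, AAB, ...).

```shell
#!/usr/bin/env bash
# Concatenate the values of all shell variables whose names match an
# extended regular expression, in byte-wise sorted name order.
concat_by_pattern() {
  local pattern=$1 name result=""
  while IFS= read -r name; do
    if [[ $name =~ $pattern ]]; then
      result+=${!name}               # indirect expansion: value of $name
    fi
  done < <(compgen -v | LC_ALL=C sort)
  printf '%s\n' "$result"
}

A=('blah'); AA=('blah2'); AAA=('blah3'); AAB=('blah4'); AAC=('blah5')
AAAA='not wanted'                    # four letters, so not matched
MASTER=$(concat_by_pattern '^(A|AA|A[A-D][A-Z])$')
echo "$MASTER"
```

For one-element arrays such as A=('blah'), the indirect expansion yields the first element, the same as in the loops above.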

jq and bash: object construction with --arg is not working

Given the following input:
J='{"a":1,"b":10,"c":100}
{"a":2,"b":20,"c":200}
{"a":3,"b":30,"c":300}'
The command
SELECT='a,b'; echo $J | jq -c -s --arg P1 $SELECT '.[]|{a,b}'
produces
{"a":1,"b":10}
{"a":2,"b":20}
{"a":3,"b":30}
but this command produces unexpected results:
SELECT='a,b'; echo $J | jq -c -s --arg P1 $SELECT '.[]|{$P1}'
{"P1":"a,b"}
{"P1":"a,b"}
{"P1":"a,b"}
How does one get jq to treat an arg string literally?
Using tostring gives an error
SELECT='a,b'; echo $J | jq -c -s --arg P1 $SELECT '.[]|{$P1|tostring}'
jq: error: syntax error, unexpected '|', expecting '}' (Unix shell quoting
issues?) at <top-level>, line 1:
.[]|{$SELECT|tostring}
jq: 1 compile error
SELECT needs to be a variable and not hardcoded in the script.
Assuming you want to avoid the risks of "code injection" and that you want the shell variable SELECT to be a simple string such as "a,b", then consider this reduce-free solution along the lines you were attempting:
J='{"a":1,"b":10,"c":100}'
SELECT='a,b'
echo "$J" |
jq -c --arg P1 "$SELECT" '
. as $in | $P1 | split(",") | map( {(.): $in[.]} ) | add'
Output:
{"a":1,"b":10}
If you really want your data to be parsed as syntax...
This is not an appropriate use case for --arg. Instead, substitute into the code:
select='a,b'; jq -c -s '.[]|{'"$select"'}' <<<"$j"
Note that this has all the usual caveats of code injection: If the input is uncontrolled, the output (or other behavior of the script, particularly if jq gains more capable I/O features in the future) should be considered likewise.
If you want to split the literal string into a list of keys...
Here, we take your select_str (of the form a,b), and generate a map: {'a': 'a', 'b': 'b'}; then, we can break each data item into entries, select only the items in the map, and there's our output.
jq --arg select_str "$select" '
($select_str
| split(",")
| reduce .[] as $item ({}; .[$item]=$item)) as $select_map
| with_entries(select($select_map[.key]))' <<<"$j"
