How to merge lines with pattern including "[,],{,}" chars with sed/awk - bash

I have a text file following a certain formatting which has lines like :
{
"297723": [
[
1,
2
],
[
5,
10
],
[
1,
157
]
],
"369258": [
[
3,
4
],
[
6,
11
],
[
30,
200
]
]
}
How can I make it look like this?
{"297723": [[1, 2], [5, 10], [1,157]],
"369258": [[3, 4], [6, 11], [30,200]]}
Of course, there are several blocks; I only showed the first (which starts with "{") and the last (which closes with "}"). In all the rest, a number (like "297723" in my example) marks the start of a new block.

Your input is valid JSON, so you can apply the jq tool for this case:
jq -c '.' yourfile | sed 's/,"/,\n"/'
The output:
{"297723":[[1,2],[5,10],[1,157]],
"369258":[[3,4],[6,11],[30,200]]}
-c: print the input in compact-output form (the whole JSON entity on one line)
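A quick sketch of why sed's g flag matters once the file holds more than two blocks (the JSON literal below is simply the compacted form of the sample, hard-coded so the demo needs no jq):

```shell
# Compact one-line JSON, i.e. what `jq -c '.'` emits for the sample.
json='{"297723":[[1,2],[5,10],[1,157]],"369258":[[3,4],[6,11],[30,200]]}'

# The sequence ," only appears before object keys here (array values
# are plain numbers), so it is a safe split point; the g flag makes
# sed split before every key, not just the first.
printf '%s\n' "$json" | sed 's/,"/,\n"/g'
```

With only two keys the flag-less substitution gives the same result; g is what keeps the pipeline working for an arbitrary number of blocks.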

Related

print elements from multiple lists by index with jq

I have a json with 3 equal lists:
{
"value": {
"list1": [
"el11",
"el12",
"el13",
"el14"
],
"list2": [
"el21",
"el22",
"el23",
"el24"
],
"list3": [
"el31",
"el32",
"el33",
"el34"
]
}
}
I'm trying to extract from each list the elements with the same index:
el11 el21 el31
...
el13 el23 el33
Ideally those need to be exported as ENV values. But I just want to know if it's possible to do that with jq. And how.
Thank you
You can use the transpose function for this. For example,
$ jq -r '[.value[]] | transpose[] | join(" ")' tmp.json
el11 el21 el31
el12 el22 el32
el13 el23 el33
el14 el24 el34
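To get these into environment variables, one option is to read each transposed row into fields and export them. A minimal sketch, assuming jq is installed; the ROW&lt;i&gt;_EL&lt;j&gt; names are made up for illustration, and the JSON here is a trimmed two-row version of the sample:

```shell
# Trimmed sample input (two elements per list instead of four).
cat > tmp.json <<'EOF'
{"value":{"list1":["el11","el12"],"list2":["el21","el22"],"list3":["el31","el32"]}}
EOF

# Read each row of the transpose output into three fields and export
# them under hypothetical ROW<i>_EL<j> names.
i=0
while read -r e1 e2 e3; do
    i=$((i + 1))
    export "ROW${i}_EL1=$e1" "ROW${i}_EL2=$e2" "ROW${i}_EL3=$e3"
done < <(jq -r '[.value[]] | transpose[] | join(" ")' tmp.json)

echo "$ROW1_EL2"   # prints el21
```

The process substitution (rather than a pipe into the loop) keeps the exports in the current shell instead of a subshell.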

Split line into multiple lines and append a substring in the beginning [duplicate]

This question already has answers here:
how to iterate through delimited values in bash
(3 answers)
Closed 4 years ago.
I'm very new to bash, and I have a log like this:
10, "a@gmail.com"
2, "b@gmail.com"
3333, "c@hotmail.com", "d@gmail.com"
4, "e@hotmail.com", "f@hotmail.com", "g@gmail.com"
55, "h@gmail.com"
I would like it to be:
10, "a@gmail.com"
2, "b@gmail.com"
3333, "c@hotmail.com"
3333, "d@gmail.com"
4, "e@hotmail.com"
4, "f@hotmail.com"
4, "g@gmail.com"
55, "h@gmail.com"
How do I do it in bash?
The standard UNIX tool for manipulating text is awk:
$ awk 'BEGIN{FS=OFS=", "} {for (i=2;i<=NF;i++) print $1, $i}' file
10, "a@gmail.com"
2, "b@gmail.com"
3333, "c@hotmail.com"
3333, "d@gmail.com"
4, "e@hotmail.com"
4, "f@hotmail.com"
4, "g@gmail.com"
55, "h@gmail.com"
The script expects the path to the file as its first argument. If no argument is passed, it reports the error and exits with a non-zero status.
It iterates over the file word by word, with commas changed to spaces.
On each iteration it takes the next space-separated word: if the word is a number, it is stored and the loop moves on to the next word; if it is non-numeric, the previously stored number and the current word are printed, separated by a comma. Before the for loop, number is initialized to 0, just in case ;)
#!/bin/bash
if [ -z "${1}" ]; then
    echo "No file specified"
    exit 1
else
    file=$1
    echo "Parsing file \"$file\":"
fi
number="0"
for word in $(sed "s#,# #g" "$file"); do
    if [[ $word =~ ^[0-9]+$ ]]; then
        number=${word}
        continue
    else
        echo "$number, ${word}"
    fi
done
exit 0
Run:
test@LAPTOP-EQKIVD8A:~$ cat new.txt
10, "a@gmail.com"
2, "b@gmail.com"
3333, "c@hotmail.com", "d@gmail.com"
4, "e@hotmail.com", "f@hotmail.com", "g@gmail.com"
55, "h@gmail.com"
test@LAPTOP-EQKIVD8A:~$ ./script.sh new.txt
Parsing file "new.txt":
10, "a@gmail.com"
2, "b@gmail.com"
3333, "c@hotmail.com"
3333, "d@gmail.com"
4, "e@hotmail.com"
4, "f@hotmail.com"
4, "g@gmail.com"
55, "h@gmail.com"

Output the results of select operation in an array - jq

I have the following input:
[
{"id": "first", "val": 1},
{"id": "second", "val": 2},
{"id": "second", "val": 3}
]
Using the jq filter : .[] | select(.id == "second")
I get following as output :
{
"id": "second",
"val": 2
}
{
"id": "second",
"val": 3
}
I want to get the result in the form of an array. Is it possible to get the multiple result values of select operation in an array?
Yes; wrap the filter in an array :)
$ jq '[.[] | select(.id == "second")]' tmp.json
[
{
"id": "second",
"val": 2
},
{
"id": "second",
"val": 3
}
]
Or, use map/1, which is defined as [.[] | f]:
$ jq 'map(select(.id == "second"))' tmp.json
[same result]
To wrap the results in a bash array, use the -c option to output each result on a single line, and read the result with readarray.
$ readarray -t arr < <(jq -c '.[] | select(.id == "second")' tmp.json)
$ for r in "${arr[@]}"; do echo "Result: $r"; done
Result: {"id":"second","val":2}
Result: {"id":"second","val":3}

User's shell is (apparently) adding lines to a shell script during execution

My bash shell script has the following lines:
(Line numbers added)
1] printf " Username:";
2] read USERNAME;
3]
4] WGET=`which wget`;
5] $WGET -V >& /dev/null;
6] STATUS=$?;
This is run by hundreds of users and thoroughly tested. When a particular user runs the script, the output of "sh -x" shows some lines being executed which are not in the original script:
(line numbers added)
1] + printf ' Username:'
2] Username:
3] + read SSO_USERNAME
4] + tput cuu1
5] + 2> /dev/null
6] .sh.subscript=$'\E[A'
7]
8] + which wget
9] + WGET=/usr/bin/wget
10]+ /usr/bin/wget -V
11]+ update/wget.sh[31]: /dev/null: bad file unit number
12]+ STATUS=1
User is executing using putty with kerberos authentication and "ksu".
Where are lines 4 and 6 in the user output coming from and why?

Finding gaps in sequential numbers

I don't do this stuff for a living, so forgive me if it's a simple question (or more complicated than I think). I've been digging through the archives and found a lot of tips that are close, but being a novice I'm not sure how to tweak them for my needs, or they are way beyond my understanding.
I have some large data files that I can parse to generate a list of coordinates that are mostly sequential:
5
6
7
8
15
16
17
25
26
27
What I want is a list of the gaps
1-4
9-14
18-24
I don’t know perl, SQL or anything fancy but thought I might be able to do something that would subtract one number from the next. I could then at least grep the output where the difference was not 1 or -1 and work with that to get the gaps.
With awk:
awk '$1!=p+1{print p+1"-"$1-1}{p=$1}' file.txt
Explanations:
$1 is the first column of the current input line
p holds the value of the first column from the previous line
so ($1!=p+1) is a condition: if $1 differs from the previous value + 1, then
this part is executed: {print p+1 "-" $1-1}: print the previous value + 1, the - character, and the first column minus 1
{p=$1} is executed for every line: p is assigned the current first column
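To see it in action, here is the one-liner run on the sample numbers (file.txt is assumed to hold them):

```shell
# Sample numbers from the question.
printf '%s\n' 5 6 7 8 15 16 17 25 26 27 > file.txt

# p is unset on the first line; awk treats it as 0 in arithmetic,
# which is why the leading gap 1-4 is reported too.
awk '$1!=p+1{print p+1"-"$1-1}{p=$1}' file.txt
```

which prints:

1-4
9-14
18-24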
Interesting question.
sputnick's awk one-liner is nice. I cannot write a simpler one than his. I just add another way using diff:
seq $(tail -1 file)|diff - file|grep -Po '.*(?=d)'
the output with your example would be:
1,4
9,14
18,24
Note that there is a comma in it instead of -; you could replace the grep with sed to get -, since grep cannot change the input text... but the idea is the same.
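One way that swap could look with sed (a sketch; file.txt is assumed to hold the sample numbers):

```shell
printf '%s\n' 5 6 7 8 15 16 17 25 26 27 > file.txt

# diff's deletion hunks look like "9,14d4"; keep only those lines,
# strip the "d..." tail, and turn the comma into a hyphen.
seq "$(tail -1 file.txt)" | diff - file.txt | sed -n '/d/{s/d.*//;s/,/-/;p}'
```

This prints 1-4, 9-14 and 18-24, one per line. Note that a single-number gap would produce a hunk like "9d4" with no comma, so the s/,/-/ step would simply leave it as "9".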
hope it helps.
A Ruby Answer
Perhaps someone else can give you the Bash or Awk solution you asked for. However, I think any shell-based answer is likely to be extremely localized for your data set, and not very extendable. Solving the problem in Ruby is fairly simple, and provides you with flexible formatting and more options for manipulating the data set in other ways down the road. YMMV.
#!/usr/bin/env ruby
# You could read from a file if you prefer,
# but this is your provided corpus.
nums = [5, 6, 7, 8, 15, 16, 17, 25, 26, 27]
# Find gaps between zero and first digit.
nums.unshift 0
# Create array of arrays containing missing digits.
missing_nums = nums.each_cons(2).map do |array|
(array.first.succ...array.last).to_a unless
array.first.succ == array.last
end.compact
# => [[1, 2, 3, 4], [9, 10, 11, 12, 13, 14], [18, 19, 20, 21, 22, 23, 24]]
# Format the results any way you want.
puts missing_nums.map { |ary| "#{ary.first}-#{ary.last}" }
Given your current corpus, this yields the following on standard output:
1-4
9-14
18-24
Just remember the previous number and verify that the current one is the previous plus one:
#! /bin/bash
previous=0
while read n ; do
if (( n != previous + 1 )) ; then
echo $(( previous + 1 ))-$(( n - 1 ))
fi
previous=$n
done
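The loop reads from standard input, so the numbers can simply be piped in; inlined here for a quick check (bash, since it uses (( )) arithmetic):

```shell
# Inline run of the loop above with the sample numbers piped in.
printf '%s\n' 5 6 7 8 15 16 17 25 26 27 |
{
    previous=0
    while read -r n; do
        if (( n != previous + 1 )); then
            echo "$(( previous + 1 ))-$(( n - 1 ))"
        fi
        previous=$n
    done
}
```

This prints 1-4, 9-14 and 18-24, one per line.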
You might need to add some checking to prevent lines like 28-28 for single number gaps.
Perl solution similar to awk solution from StardustOne:
perl -ane 'if ($F[0] != $p+1) {printf "%d-%d\n",$p+1,$F[0]-1}; $p=$F[0]' file.txt
These command-line options are used:
-n loop around every line of the input file, do not automatically print every line
-a autosplit mode: split input lines into the @F array. Defaults to splitting on whitespace. Fields are indexed starting with 0.
-e execute the perl code
Given the input file, use the numinterval util and paste its output beside the file, then munge it with tr, xargs, sed and printf:
gaps() { paste <(echo; numinterval "$1" | tr 1 '-' | tr -d '[02-9]') "$1" |
tr -d '[:blank:]' | xargs echo |
sed 's/ -/-/g;s/-[^ ]*-/-/g' | xargs printf "%s\n" ; }
Output of gaps file:
5-8
15-17
25-27
How it works. The output of paste <(echo; numinterval file) file looks like:
5
1 6
1 7
1 8
7 15
1 16
1 17
8 25
1 26
1 27
From there we mainly replace things in column #1, and tweak the spacing. The 1s are replaced with -s, and the higher numbers are blanked. Remove some blanks with tr. Replace runs of hyphens like "5-6-7-8" with a single hyphen "5-8", and that's the output.
This one lists the numbers that break the sequence in a list.
Idea taken from @choroba but done with a for loop.
#! /bin/bash
previous=0
n=$( cat listaNums.txt )
for number in $n
do
    numListed=$(( number - 1 ))
    if [ "$numListed" -ne "$previous" ] && [ "$number" -ne 2147483647 ]; then
        echo "$numListed"
    fi
    previous=$number
done