Easy way to make a loop in shell - shell

I would like to print an array as
001 002 003 .. 010 021 022 .. 030 041 042 .. 050
I wrote the following script to do that. It runs, but it prints the numbers in this order:
001 021 041 002 022 042 ....
#!/bin/sh
for i in {1..10}; do
while [ $i -le 50 ]; do
if [[ $i -le 9 ]];then n=00$i;else n=0$i;fi
echo $n
i=$(( i + 20 ))
done
done
I am looking for an easy way to make it print
001 002 003 .. 010 021 022 .. 030 041 042 .. 050

I assume that you are using Linux (since seq is often not installed in something like FreeBSD).
You can use seq with -f option.
first seq prints 001 .. 010
second seq prints 021 .. 030
and the last seq prints 041 .. 050
for i in {0..2}
do seq -f '%03g' $((i*20+1)) $((i*20+10))
done

A for-loop in a shell iterates over a static list that is expanded once, before the first iteration, but you change $i inside your loop.
Instead, use a while-loop:
i=1
while [ $i -le 50 ]; do
printf "%03d " $i
if [ $( expr $i % 10 ) -eq 0 ]; then
i=$(( i + 11 ))
else
i=$(( i + 1 ))
fi
done
echo
Or, with bash or ksh:
i=1
while (( i <= 50 )); do
printf "%03d " $i
if (( (i % 10) == 0 )); then
(( i += 11 ))
else
(( ++i ))
fi
done
echo
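The two ideas can also be combined: a for-loop over the fixed list of block starts, with a counting while-loop inside it. This is only a sketch in plain POSIX sh; the function name print_blocks and its arguments are made up for illustration, and the block length of ten is hard-coded to match the question.

```shell
# print_blocks START...: print ten zero-padded numbers beginning at each START
# (hypothetical helper, not taken from the answers above)
print_blocks() {
    for start in "$@"; do
        i=$start
        # count through one block of ten numbers
        while [ "$i" -lt $((start + 10)) ]; do
            printf '%03d ' "$i"
            i=$((i + 1))
        done
    done
    echo
}

print_blocks 1 21 41
```

Called this way it prints 001 to 010, 021 to 030 and 041 to 050 on a single line, which is the order asked for.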

bash:
echo {001..010} {021..030} {041..050}
Output:
001 002 003 004 005 006 007 008 009 010 021 022 023 024 025 026 027 028 029 030 041 042 043 044 045 046 047 048 049 050

Using numrange:
numrange /001..010,021..030,041..050/
Output (space delimited):
001 002 003 004 005 006 007 008 009 010 021 022 023 024 025 026 027
028 029 030 041 042 043 044 045 046 047 048 049 050
For linefeed delimiters add the -N option (30 line output not shown):
numrange -N /001..010,021..030,041..050/


JQ explode function is returning incorrect chars

I am trying to decode base64 encoded binary content in JQ using explode function.
When I run explode and then through implode, I am expecting it to return the same string. But it is not. Try it here: https://jqplay.org/s/Rt8H1qv8VRP
Base64 encoded string: "AQEAAAABAQAyGWRkZBXNWwcAAAAAAQIDBAUGBwgJClIGnj9SBp4/"
JQ: '@base64d | explode | implode | @base64'
Output: "AQEAAAABAQAyGWRkZBXvv71bBwAAAAABAgMEBQYHCAkKUgbvv70/Ugbvv70/"
Debugging further,
@base64d | explode | .[14]
returns
65533
Running the following on Ubuntu, you can see that the [14] char is 315 (octal) == 205 (decimal):
$ echo "AQEAAAABAQAyGWRkZBXNWwcAAAAAAQIDBAUGBwgJClIGnj9SBp4/" | base64 -d | od -bc
0000000 001 001 000 000 000 001 001 000 062 031 144 144 144 025 315 133
001 001 \0 \0 \0 001 001 \0 2 031 d d d 025 315 [
0000020 007 000 000 000 000 001 002 003 004 005 006 007 010 011 012 122
\a \0 \0 \0 \0 001 002 003 004 005 006 \a \b \t \n R
0000040 006 236 077 122 006 236 077
006 236 ? R 006 236 ?
0000047
Why is JQ returning this weird 65533 (0xFFFD) character? What am I missing?
First of all, the issue has nothing to do with explode or implode. Using just @base64d | @base64 produces the same result.
jq expects the string encoded with base64 to be text encoded with UTF-8.
If the decoded string is not UTF-8, the results are undefined.
Your input is not UTF-8.
U+FFFD REPLACEMENT CHARACTER is a character used to mark input errors.
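To see why the decoder gives up at that position, the bytes themselves can be inspected. A small sketch, assuming a POSIX od and tr; byte_at is a hypothetical helper name, and the three bytes fed to it are offsets 13 to 15 copied from the od dump in the question (025 315 133 in octal).

```shell
# byte_at N: print byte number N (0-based) of stdin as a decimal value
# (hypothetical helper)
byte_at() {
    od -An -tu1 -j "$1" -N 1 | tr -d ' \n'
}

# Offsets 13..15 of the decoded input, copied from the od dump in the question
printf '\025\315\133' | byte_at 1; echo    # 205 (0xCD)
printf '\025\315\133' | byte_at 2; echo    # 91  (0x5B, the "[" character)
```

In UTF-8, 0xCD can only start a two-byte sequence, and the byte after it must lie in the range 0x80 to 0xBF. Here it is followed by 0x5B, so the sequence is invalid and U+FFFD is substituted.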

output to a variable file name in for loop in bash

I am doing some tasks inside a for loop and trying to redirect stdout to a different file name on every iteration. But I get only one file, named with just part of the name I assigned.
This is my script:
#!/bin/sh
me1_dir="/Users/njayavel/Downloads/Silencer_project/roadmap_analysis/data/h3k4me1_data"
me3_dir="/Users/njayavel/Downloads/Silencer_project/roadmap_analysis/data/h3k4me3_data"
dnase_dir="/Users/njayavel/Downloads/Silencer_project/roadmap_analysis/data/dnase_data"
index=(003 004)
#index=(003 004 005 006 007 008 017 021 022 028 029 032 033 034 046 050 051 055 056 057 059 080 081 082 083 084 085 086 088 089 090 091 092 093 094 097 098 100 109)
#index=(006 007 008 017 021 022 028 029 032 033 034 046 050 051 055 056 057 059 080 081 082 083 084 085 086 088 089 090 091 092 093 094 097 098 100 109)
for i in "${index[@]}"; do
dnase_file="$dnase_dir/E$i-DNase.hotspot.fdr0.01.broad.bed"
me1_fil="$me1_dir/E$i-H3K4me1.broadPeak"
me3_fil="$me3_dir/E$i-H3K4me3.broadPeak"
awk 'BEGIN { OFS="\t"}; {print $1,$2,$3}' $me1_fil > me1_file.bed
awk 'BEGIN { OFS="\t"}; {print $1,$2,$3}' $me3_fil > me3_file.bed
ctcf_file="CTCFsites_hg19_sorted_bedmerged.bed"
tss_file="TSS_gene_2kbupstrm_0.5kbdownstrm.bed"
cat me1_file.bed me3_file.bed $ctcf_file $tss_file | sort -k1,1 -k2,2n > file2.bed
awk 'BEGIN { OFS="\t"}; {print $1,$2,$3}' $dnase_file | sort -k1,1 -k2,2n > file1.bed
bedtools intersect -v -a file1.bed -b file2.bed > E$i_file.txt;
done
It produces only one output file, "E.txt", from the last line of the for loop. I am expecting E003_file.txt and E004_file.txt.
I am a newbie, please help me out.
Thank you
When you write
E$i_file.txt
the shell is looking for a variable named i_file, because _ is a valid character in a variable name, not a delimiter. You need to use braces to delimit the variable name:
bedtools intersect -v -a file1.bed -b file2.bed > "E${i}_file.txt"
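The difference is easy to see in isolation. A minimal sketch; the variable names wrong and right are only for illustration:

```shell
i=003
unset i_file                 # make sure the accidental name really is unset

wrong="E$i_file.txt"         # parsed as E + ${i_file} + .txt
right="E${i}_file.txt"       # parsed as E + ${i} + _file.txt

echo "$wrong"                # E.txt
echo "$right"                # E003_file.txt
```

Running a script with set -u is one way to catch this kind of slip: the shell then reports an error for the unset variable instead of silently substituting an empty string.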

unexpected result: grep from a changing line

I wrote a bash command to test grep from a changing line:
for i in $(seq 0 9); do echo -e -n "\r"$i; sleep 0.1; done | grep 5
The result shows:
9
Update
The real problem is as follows:
mplayer shows and refreshes a single-line playing progress when playing a media file. A sample result is:
A: 17.2 (17.2) of 213.0 (03:33.0) 0.5%
And I'm trying to grep this playing progress and ignore other lines. I used this command:
mplayer xxx.mp3 | grep ^A:
The result does not contain the line expected.
Update 2
mplayer xxx.mp3 | od -xda
shows:
0002140 4a5b 410d 203a 2020 2e31 2033 3028 2e31
[ J \r A : 1 . 3 ( 0 1 .
133 112 015 101 072 040 040 040 061 056 063 040 050 060 061 056
0002160 2932 6f20 2066 3132 2e33 2030 3028 3a33
2 ) o f 2 1 3 . 0 ( 0 3 :
062 051 040 157 146 040 062 061 063 056 060 040 050 060 063 072
0002200 3333 302e 2029 3020 342e 2025 5b1b 0d4a
3 3 . 0 ) 0 . 4 % 033 [ J \r
063 063 056 060 051 040 040 060 056 064 045 040 033 133 112 015
0002220 3a41 2020 3120 352e 2820 3130 342e 2029
A : 1 . 5 ( 0 1 . 4 )
101 072 040 040 040 061 056 065 040 050 060 061 056 064 051 040
0002240 666f 3220 3331 302e 2820 3330 333a 2e33
o f 2 1 3 . 0 ( 0 3 : 3 3 .
157 146 040 062 061 063 056 060 040 050 060 063 072 063 063 056
And
mplayer xxx.mp3 | tr '\r' '\n'
shows
A: 0.2 (00.1) of 213.0 (03:33.0) 0.3%
A: 0.3 (00.3) of 213.0 (03:33.0) 0.3%
A: 0.5 (00.5) of 213.0 (03:33.0) 0.4%
A: 0.6 (00.6) of 213.0 (03:33.0) 0.4%
A: 0.8 (00.8) of 213.0 (03:33.0) 0.4%
A: 1.0 (01.0) of 213.0 (03:33.0) 0.4%
While,
mplayer xxx.mp3 | tr '\r' '\n' | grep ^A
shows empty result.
Any tip will be appreciated.
It's your definition of "line" that's causing the problem here. The -n means that all the numbers are output on a single line, according to the definition used by grep (a series of characters terminated by the \n character):
\r0\r1\r2\r3\r4\r5\r6\r7\r8\r9
If you pipe the output through something like a hex dump, you can see what's happening:
$ for i in $(seq 0 9); do echo -e -n "\r"$i; sleep 0.1; done | grep 5 | od -xcb
0000000 300d 310d 320d 330d 340d 350d 360d 370d
\r 0 \r 1 \r 2 \r 3 \r 4 \r 5 \r 6 \r 7
015 060 015 061 015 062 015 063 015 064 015 065 015 066 015 067
0000020 380d 390d 000a
\r 8 \r 9 \n
015 070 015 071 012
0000025
That single line containing all the carriage returns (and not newlines) will, when output, appear to be a single line with just the 9 on it. Removing the -n will result instead in:
$ for i in $(seq 0 9); do echo -e "\r"$i; sleep 0.1; done | grep 5 | od -xcb
0000000 350d 000a
\r 5 \n
015 065 012
0000003
which would look like just the 5 was being output.
If you have a process that outputs "lines" separated by carriage returns rather than newlines, there's nothing to stop you changing them on the fly so as to be able to handle them as real lines:
$ echo -e "junk\rA: good 1\rjunk\rA: good 2\rjunk" | tr '\r' '\n' | grep '^A'
A: good 1
A: good 2
Applying that back to your original question, it would be (with the sleep removed since it's irrelevant):
$ for i in $(seq 0 9); do echo -e -n "\r"$i; done | tr '\r' '\n' | grep 5
5
$ for i in $(seq 0 9); do echo -e -n "\r"$i; done | tr '\r' '\n' | grep 5 | od -xcb
0000000 0a35
5 \n
065 012
0000002
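Since echo -e is not understood by every shell, the same conversion can be written with printf. A minimal sketch; cr_grep is a hypothetical helper name:

```shell
# cr_grep PATTERN: treat carriage returns on stdin as line breaks, then filter
# (hypothetical helper)
cr_grep() {
    tr '\r' '\n' | grep -- "$1"
}

# Prints the two "A:" records, one per line
printf 'junk\rA: good 1\rjunk\rA: good 2\rjunk' | cr_grep '^A'
```

With a long-running producer such as a media player, output buffering inside the pipeline can additionally delay or hide results. GNU grep offers --line-buffered and GNU coreutils offers stdbuf for that case. Whether that applies to the mplayer pipeline in the question is an assumption, not something the dumps above show.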

Generate combinations of elements with echo

I need to write a simple script that generates all possible permutations of a set of elements stored in a variable, in groups of n elements (with n as a parameter). The easiest solution that came to mind was to use several nested loops, depending on the selected group length. But I thought it would be more elegant to take advantage of the ability of the echo command to generate combinations, that is:
echo {1,2}{1,2}
11 12 21 22
So, using this method, I'm trying to find a general way to do it, taking as input parameters the list of elements (for example {1,2}) and the number of elements per group. It would be something like this:
set={1,2,3,4}
group=3
for ((i=0; i<$group; i++));
do
repetition=$set$repetition
done
So in this particular case, at the end of the loop the repetition variable has the value {1,2,3,4}{1,2,3,4}{1,2,3,4}. But I can't find a way to use this variable to produce the combinations with the echo command. I've tried several things, like:
echo $repetition
echo $(echo $repetition)
I'm stuck on this; I'd appreciate any tip or help.
You can use:
bash -c "echo $repetition"
111 112 113 114 121 122 123 124 131 132 133 134 141 142 143 144 211 212 213 214 221 222 223 224 231 232 233 234 241 242 243 244 311 312 313 314 321 322 323 324 331 332 333 334 341 342 343 344 411 412 413 414 421 422 423 424 431 432 433 434 441 442 443 444
Or else use eval instead of bash -c
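The reason a second parse is needed is that bash performs brace expansion before variable expansion, so by the time $repetition has been replaced by its value the braces are no longer looked at. A minimal sketch of the whole round trip, assuming bash is installed; repeat_braces is a hypothetical helper name, and a two-element set is used to keep the output short:

```shell
# repeat_braces SET N: print SET repeated N times, e.g. {1,2}{1,2}
# (hypothetical helper)
repeat_braces() {
    out=''
    n=$2
    while [ "$n" -gt 0 ]; do
        out="$1$out"
        n=$((n - 1))
    done
    printf '%s\n' "$out"
}

repetition=$(repeat_braces '{1,2}' 2)

# The child bash parses the string from scratch, so the braces are expanded
bash -c "echo $repetition"      # 11 12 21 22
```

In bash itself, eval "echo $repetition" has the same effect. Both forms run the contents of the variable as shell code, so they are only appropriate when the set comes from a trusted source.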
If you need k-combinations for all k, this combination script can help:
#!/bin/bash
POWER=$((2**$#))
BITS=`seq -f '0' -s '' 1 $#`
while [ $POWER -gt 1 ];do
POWER=$(($POWER-1))
BIN=`bc <<< "obase=2; $POWER"`
MASK=`echo $BITS | sed -e "s/0\{${#BIN}\}$/$BIN/" | grep -o .`
POS=1; AWK=`for M in $MASK;do
[ $M -eq 1 ] && echo -n "print \\$\${POS};"
POS=$(($POS+1))
done;echo`
awk -v ORS=" " "{$AWK}" <<< "$@" | sed 's/ $//'
done
Example:
./combination ⚪ ⛔ ⚫
⚪ ⛔ ⚫
⚪ ⛔
⚪ ⚫
⚪
⛔ ⚫
⛔
⚫
The empty set is there too, trust me.

bash sequence 00 01 ... 10

In bash, with
$ echo {1..10}
1 2 3 4 5 6 7 8 9 10
I can get a number sequence, but in some cases I need
01 02 03 ... 10
How can I get this?
And how can I get the following?
001 002 ... 010 011 .. 100
This will work in any shell on a machine that has coreutils installed (thanks commenters for correcting me):
seq -w 1 10
and
seq -w 1 100
Explanation:
the option -w will:
Equalize the widths of all numbers by padding with zeros as necessary.
seq [-w] [-f format] [-s string] [-t string] [first [incr]] last
prints a sequence of numbers, one per line (default), from
first (default 1), to near last as possible, in increments of incr (default
1). When first is larger than last the default incr is -1
Use the seq command with the -f option:
seq -f "%02g" 0 10
results:
00
01
02
03
04
05
06
07
08
09
10
seq -f "%03g" 0 10
results:
000
001
002
003
004
005
006
007
008
009
010
printf "%02d " {1..10} ; echo
Output:
01 02 03 04 05 06 07 08 09 10
Similarly:
printf "%03d " {1..100} ; echo
In more recent versions of bash, simply:
echo {01..10}
And:
echo {001..100}
for i in {01..99}; do
echo $i
done
will return :
01
02
03
04
05
06
07
08
09
10
...
Replacing 01 with 001 and 99 with 999 or 100 will do what you expect also.
$ printf "%02d " {0..10}; echo
00 01 02 03 04 05 06 07 08 09 10
$ printf "%03d " {0..100}; echo
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 068 069 070 071 072 073 074 075 076 077 078 079 080 081 082 083 084 085 086 087 088 089 090 091 092 093 094 095 096 097 098 099 100
Just vary the field width in the format string (2 and 3 in this case) and of course the brace expansion range. The echo is there just for cosmetic purposes, since the format string does not contain a newline itself.
printf is a shell builtin, but you likely also have a version from coreutils installed, which can be used in its place.
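For shells without brace expansion, the same printf idea can be driven by a counting loop, with the field width passed in as a parameter. A sketch in plain POSIX sh; padded_seq is a hypothetical helper name, and it assumes the first and last values are given without leading zeros:

```shell
# padded_seq FIRST LAST WIDTH: print FIRST..LAST zero-padded to WIDTH digits
# (hypothetical helper; pass FIRST and LAST without leading zeros)
padded_seq() {
    i=$1
    while [ "$i" -le "$2" ]; do
        # the field width is spliced into the format string
        printf "%0${3}d " "$i"
        i=$((i + 1))
    done
    echo
}

padded_seq 1 10 2      # 01 02 03 04 05 06 07 08 09 10
padded_seq 98 100 3    # 098 099 100
```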
awk only:
awk 'BEGIN { for (i=0; i<=10; i++) printf("%02d ", i) }'
The following will work in bash
echo {01..10}
EDIT: seeing the other answers, I just wanted to add this, in case we're talking about commands that will work in any shell:
yes | head -n 100 | awk '{printf( "%03d ", NR )}' ##for 001...100
or
yes | head -n 10 | awk '{printf( "%02d ", NR )}' ##for 01..10
echo 0{0..9}
You can get: 00 01 02 03 04 05 06 07 08 09
echo 0{0..9} 1{0..9}
You can get: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19
echo 00{0..9} 0{10..99}
You can get: 000 .. 099
There are so many ways to do this! My personal favorite is:
yes | grep y | sed 100q | awk '{printf( "%03d ", NR )}'; echo
Clearly, neither the sed nor the grep are necessary (the grep being far more trivial, since if you omit the sed you need to change the awk), but they contribute to the overall satisfaction of the solution! The final echo is not really necessary either, but it's always nice to have a trailing newline.
Another nice option is:
yes | nl -ba | tr ' ' 0 | sed 100q | cut -b 4-6
Or (less absurdly):
yes '' | sed ${top-100}q | nl -ba -w ${width-3} -n rz
As commented by favoretti, seq is your friend.
But there is a caveat:
seq -w uses the second argument to set the format it will use.
Thus, the command seq -w 1 9 will print the sequence 1 2 3 4 5 6 7 8 9
To print the sequence 01 .. 09 you need to do the following:
seq -w 1 09
Or, for clarity's sake, use the same format on both ends, for instance:
seq -w 000 010 for the series 000 001 002 ... 010
You can also use a step argument, which works in reverse as well:
seq -w 10 -1 01 for 10,09,08...01 or seq -w 01 2 10 for 01,03,05,07,09
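The caveat can be checked side by side by looking only at the last number printed. A short sketch, assuming GNU seq from coreutils; other seq implementations may pad differently with -w:

```shell
# With -w the padding follows the widest argument as typed (GNU seq)
seq -w 1 9 | tail -n 1           # 9
seq -w 1 09 | tail -n 1          # 09

# With -f the width is stated explicitly and does not depend on the arguments
seq -f '%02g' 1 9 | tail -n 1    # 09
```

If the bounds come from variables whose formatting is not under your control, the -f form is probably the less surprising choice.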
