unexpected result: grep from a changing line - shell

I wrote a bash command to test grep from a changing line:
for i in $(seq 0 9); do echo -e -n "\r"$i; sleep 0.1; done | grep 5
The result shows:
9
Update
The real problem is as follows:
mplayer shows and refreshes a single-line playing progress when playing a media file. A sample result is:
A: 17.2 (17.2) of 213.0 (03:33.0) 0.5%
I'm trying to grep this playing progress and ignore the other lines. I used this command:
mplayer xxx.mp3 | grep ^A:
The result does not contain the line expected.
Update 2
mplayer xxx.mp3 | od -xda
shows:
0002140 4a5b 410d 203a 2020 2e31 2033 3028 2e31
[ J \r A : 1 . 3 ( 0 1 .
133 112 015 101 072 040 040 040 061 056 063 040 050 060 061 056
0002160 2932 6f20 2066 3132 2e33 2030 3028 3a33
2 ) o f 2 1 3 . 0 ( 0 3 :
062 051 040 157 146 040 062 061 063 056 060 040 050 060 063 072
0002200 3333 302e 2029 3020 342e 2025 5b1b 0d4a
3 3 . 0 ) 0 . 4 % 033 [ J \r
063 063 056 060 051 040 040 060 056 064 045 040 033 133 112 015
0002220 3a41 2020 3120 352e 2820 3130 342e 2029
A : 1 . 5 ( 0 1 . 4 )
101 072 040 040 040 061 056 065 040 050 060 061 056 064 051 040
0002240 666f 3220 3331 302e 2820 3330 333a 2e33
o f 2 1 3 . 0 ( 0 3 : 3 3 .
157 146 040 062 061 063 056 060 040 050 060 063 072 063 063 056
And
mplayer xxx.mp3 | tr '\r' '\n'
shows
A: 0.2 (00.1) of 213.0 (03:33.0) 0.3%
A: 0.3 (00.3) of 213.0 (03:33.0) 0.3%
A: 0.5 (00.5) of 213.0 (03:33.0) 0.4%
A: 0.6 (00.6) of 213.0 (03:33.0) 0.4%
A: 0.8 (00.8) of 213.0 (03:33.0) 0.4%
A: 1.0 (01.0) of 213.0 (03:33.0) 0.4%
While,
mplayer xxx.mp3 | tr '\r' '\n' | grep ^A
shows empty result.
Any tip will be appreciated.

It's your definition of "line" that's causing the problem here. The -n means that all the numbers are output on a single line, according to the definition used by grep (a series of characters terminated by the \n character):
\r1\r2\r3\r4\r5\r6\r7\r8\r9
If you pipe the output through something like a hex dump, you can see what's happening:
$ for i in $(seq 0 9); do echo -e -n "\r"$i; sleep 0.1; done | grep 5 | od -xcb
0000000 300d 310d 320d 330d 340d 350d 360d 370d
\r 0 \r 1 \r 2 \r 3 \r 4 \r 5 \r 6 \r 7
015 060 015 061 015 062 015 063 015 064 015 065 015 066 015 067
0000020 380d 390d 000a
\r 8 \r 9 \n
015 070 015 071 012
0000025
That single line containing all the carriage returns (and not newlines) will, when output, appear to be a single line with just the 9 on it. Removing the -n will result instead in:
$ for i in $(seq 0 9); do echo -e "\r"$i; sleep 0.1; done | grep 5 | od -xcb
0000000 350d 000a
\r 5 \n
015 065 012
0000003
which would look like just the 5 was being output.
If you have a process that outputs "lines" separated by carriage returns rather than newlines, there's nothing to stop you changing them on the fly so as to be able to handle them as real lines:
$ echo -e "junk\rA: good 1\rjunk\rA: good 2\rjunk" | tr '\r' '\n' | grep '^A'
A: good 1
A: good 2
Applying that back to your original question, it would be (with the sleep removed since it's irrelevant):
$ for i in $(seq 0 9); do echo -e -n "\r"$i; done | tr '\r' '\n' | grep 5
5
$ for i in $(seq 0 9); do echo -e -n "\r"$i; done | tr '\r' '\n' | grep 5 | od -xcb
0000000 0a35
5 \n
065 012
0000002
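For the mplayer case in the question, one possible reason the tr | grep pipeline prints nothing is output buffering: tr typically block-buffers when its output goes to a pipe, so grep sees nothing until a full block has accumulated. That is an assumption, not something the question confirms. A minimal sketch, assuming GNU coreutils (stdbuf) and GNU grep (--line-buffered), with a hypothetical helper name progress_lines and a printf stand-in for the player's output:

```shell
# Hypothetical helper: turn carriage-return-separated progress updates into
# real lines and keep only those that start with "A:".
progress_lines() {
    # stdbuf -oL asks tr to flush after every line; --line-buffered does the
    # same for grep, so each update is passed on as soon as it is complete.
    stdbuf -oL tr '\r' '\n' | grep --line-buffered '^A:'
}

# Simulated player output (stand-in for mplayer): updates separated by \r.
printf 'junk\rA:   1.3 (01.2) of 213.0\rA:   1.5 (01.4) of 213.0\r' | progress_lines
```

With a real player the call would be something like mplayer xxx.mp3 | progress_lines, assuming the status line is written to standard output.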

Related

zsh: no such file or directory error but file exist

I'm trying to run a compiler but I get an error saying it cannot be found, even though the file appears to exist and the path is good. I even tried a different shell in case zsh was misconfigured, but got the same error. I'm at a loss as to what to do. Any suggestions?
6909077c228a% ls -l toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
-rwxr-xr-x 2 root root 2287465 Sep 11 13:19 toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
6909077c228a% ./toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
zsh: no such file or directory: ./toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
#switch to bash
6909077c228a:~$ ./toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
bash: ./toolchain/bin/armv7l-timesys-linux-gnueabi-gcc: No such file or directory
Edit:
Update showing the result of the suggested check; I don't see any odd characters inserted.
6909077c228a% ls -l toolchain/bin/armv7l-timesys-linux-gnueabi-gcc | od -xcb
0000000 722d 7877 2d72 7278 782d 3220 7220 6f6f
- r w x r - x r - x 2 r o o
055 162 167 170 162 055 170 162 055 170 040 062 040 162 157 157
0000020 2074 6f72 746f 3220 3832 3437 3536 5320
t r o o t 2 2 8 7 4 6 5 S
164 040 162 157 157 164 040 062 062 070 067 064 066 065 040 123
0000040 7065 3120 2031 3331 313a 2039 6f74 6c6f
e p 1 1 1 3 : 1 9 t o o l
145 160 040 061 061 040 061 063 072 061 071 040 164 157 157 154
0000060 6863 6961 2f6e 6962 2f6e 7261 766d 6c37
c h a i n / b i n / a r m v 7 l
143 150 141 151 156 057 142 151 156 057 141 162 155 166 067 154
0000100 742d 6d69 7365 7379 6c2d 6e69 7875 672d
- t i m e s y s - l i n u x - g
055 164 151 155 145 163 171 163 055 154 151 156 165 170 055 147
0000120 756e 6165 6962 672d 6363 000a
n u e a b i - g c c \n
156 165 145 141 142 151 055 147 143 143 012
Depending on how you typed in your initial ls -l line, there may be funny characters in the file name. If you use auto completion, it may have put those funny characters in for you so, if you subsequently attempt to type in the file name without auto completion, that could result in a file not found situation.
The first thing you should do is to check the filename completely, with something like:
ls -l toolchain/bin/armv7l-timesys-linux-gnueabi-gcc | od -xcb
and check the output to ensure there are no funny characters in the name.
If the file does exist in that form (no funny characters), one other possibility is that you're trying to run a 32-bit ELF program on a system that's not correctly set up to run them (i.e., a 64-bit system without the libraries and support infrastructure for 32-bit).
That results in an unhelpful error message since it really should be complaining about not being able to find the loader for your 32-bit executable, rather than the executable itself.
If this is the case, you will need to identify those missing items and install them.
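One way to check the 32-bit hypothesis is to look at the ELF header directly. The sketch below uses only od and tr; the function name elf_class is made up for illustration. It relies on the ELF layout in which the first four bytes are 0x7f 'E' 'L' 'F' and the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.

```shell
# Hypothetical helper: report whether a file is a 32-bit or 64-bit ELF object.
elf_class() {
    # Read the 4-byte magic number as hex and strip od's padding.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 1
    fi
    # Byte at offset 4 is EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    case $(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n') in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}

# Example: inspect the shell binary itself.
elf_class /bin/sh
```

Where they are installed, file and readelf -l give the same information in more detail. readelf -l also prints the requested program interpreter, which is the loader that has to exist for the binary to start.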

output to a variable file name in for loop in bash

I am doing some tasks inside a for loop and trying to redirect stdout to a different file name on every iteration. But I get only one file, and part of the expected name is missing.
This is my script:
#!/bin/sh
me1_dir="/Users/njayavel/Downloads/Silencer_project/roadmap_analysis/data/h3k4me1_data"
me3_dir="/Users/njayavel/Downloads/Silencer_project/roadmap_analysis/data/h3k4me3_data"
dnase_dir="/Users/njayavel/Downloads/Silencer_project/roadmap_analysis/data/dnase_data"
index=(003 004)
#index=(003 004 005 006 007 008 017 021 022 028 029 032 033 034 046 050 051 055 056 057 059 080 081 082 083 084 085 086 088 089 090 091 092 093 094 097 098 100 109)
#index=(006 007 008 017 021 022 028 029 032 033 034 046 050 051 055 056 057 059 080 081 082 083 084 085 086 088 089 090 091 092 093 094 097 098 100 109)
for i in "${index[@]}"; do
dnase_file="$dnase_dir/E$i-DNase.hotspot.fdr0.01.broad.bed"
me1_fil="$me1_dir/E$i-H3K4me1.broadPeak"
me3_fil="$me3_dir/E$i-H3K4me3.broadPeak"
awk 'BEGIN { OFS="\t"}; {print $1,$2,$3}' $me1_fil > me1_file.bed
awk 'BEGIN { OFS="\t"}; {print $1,$2,$3}' $me3_fil > me3_file.bed
ctcf_file="CTCFsites_hg19_sorted_bedmerged.bed"
tss_file="TSS_gene_2kbupstrm_0.5kbdownstrm.bed"
cat me1_file.bed me3_file.bed $ctcf_file $tss_file | sort -k1,1 -k2,2n > file2.bed
awk 'BEGIN { OFS="\t"}; {print $1,$2,$3}' $dnase_file | sort -k1,1 -k2,2n > file1.bed
bedtools intersect -v -a file1.bed -b file2.bed > E$i_file.txt;
done
It is giving only the output file "E.txt" from the last line in for loop. I am expecting E003_file.txt and E004_file.txt.
I am a newbie; please help me out.
Thank you
When you write
E$i_file.txt
the shell is looking for a variable named i_file, because _ is a valid character in a variable name, not a delimiter. You need to use braces to delimit the variable name:
bedtools intersect -v -a file1.bed -b file2.bed > "E${i}_file.txt"
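The difference is easy to see in isolation. A minimal sketch, using the index value 003 from the question and assuming no variable named i_file is set:

```shell
i=003
unset i_file                 # make sure the accidental variable really is unset

# Without braces the shell expands $i_file, which is empty.
echo "E$i_file.txt"          # prints: E.txt

# With braces the name ends at the closing brace, so $i is expanded.
echo "E${i}_file.txt"        # prints: E003_file.txt
```

Running a script with set -u turns the first form into an "unbound variable" error, which can help catch this kind of slip early.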

Easy way to make a loop in shell

I would like to print an array as
001 002 003 .. 010 021 022 .. 030 041 042 .. 050
I had written the following script to do that. This is working well, but it is printing like
001 021 041 002 022 042 ....
#!/bin/sh
for i in {1..10}; do
while [ $i -le 50 ]; do
if [[ $i -le 9 ]];then n=00$i;else n=0$i;fi
echo $n
i=$(( i + 20 ))
done
done
I am looking for an easy way to make it print like
001 002 003 .. 010 021 022 .. 030 041 042 .. 050
I assume that you are using Linux (since seq is often not installed in something like FreeBSD).
You can use seq with -f option.
first seq prints 001 .. 010
second seq prints 021 .. 030
and the last seq prints 041 .. 050
for i in {0..2}
do seq -f '%03g' $((i*20+1)) $((i*20+10))
done
A for-loop in a shell needs to operate on a static set, but you change $i in your loop.
Instead, use a while-loop:
i=1
while [ $i -le 50 ]; do
printf "%03d " $i
if [ $( expr $i % 10 ) -eq 0 ]; then
i=$(( i + 11 ))
else
i=$(( i + 1 ))
fi
done
echo
Or, with bash or ksh:
i=1
while (( i <= 50 )); do
printf "%03d " $i
if (( (i % 10) == 0 )); then
(( i += 11 ))
else
(( ++i ))
fi
done
echo
bash:
echo {001..010} {021..030} {041..050}
Output:
001 002 003 004 005 006 007 008 009 010 021 022 023 024 025 026 027 028 029 030 041 042 043 044 045 046 047 048 049 050
Using numrange:
numrange /001..010,021..030,041..050/
Output (space delimited):
001 002 003 004 005 006 007 008 009 010 021 022 023 024 025 026 027
028 029 030 041 042 043 044 045 046 047 048 049 050
For linefeed delimiters add the -N option (30 line output not shown):
numrange -N /001..010,021..030,041..050/

Bash: ls says file not found

I tried the following command
for i in `ls`; do ls $i; done
and got the following output:
ls: a.out: No such file or directory
ls: c: No such file or directory
ls: contest: No such file or directory
ls: cpp: No such file or directory
ls: java: No such file or directory
ls: : No such file or directory
It is confusing since the list of files was also obtained using ls. When I ran od on the output of echo, I saw the following:
0000000 033 133 060 155 033 133 060 061 073 063 062 155 141 056 157 165
033 [ 0 m 033 [ 0 1 ; 3 2 m a . o u
0000020 164 033 133 060 155 012
t 033 [ 0 m \n
0000026
0000000 033 133 060 061 073 063 064 155 143 033 133 060 155 012
033 [ 0 1 ; 3 4 m c 033 [ 0 m \n
0000016
0000000 033 133 060 061 073 063 064 155 143 157 156 164 145 163 164 033
033 [ 0 1 ; 3 4 m c o n t e s t 033
0000020 133 060 155 012
[ 0 m \n
0000024
0000000 033 133 060 061 073 063 064 155 143 160 160 033 133 060 155 012
033 [ 0 1 ; 3 4 m c p p 033 [ 0 m \n
0000020
0000000 033 133 060 155 146 151 154 145 056 164 170 164 033 133 060 155
033 [ 0 m f i l e . t x t 033 [ 0 m
0000020 012
\n
0000021
0000000 033 133 060 061 073 063 064 155 152 141 166 141 033 133 060 155
033 [ 0 1 ; 3 4 m j a v a 033 [ 0 m
0000020 012
\n
0000021
0000000 033 133 155 012
033 [ m \n
0000004
What do these "033 [ 0 m" characters stand for? How do I avoid them? Are they the cause of this problem?
Please help.
Thanks,
Karthick S.
You don't need `ls` or $(ls). You can use * instead. This way you avoid fancy colored output while keeping your code portable, readable, and compact.
This is #1 in Bash Pitfalls
NEVER use ls as input for another command...
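A minimal sketch of the glob-based loop, wrapped in a hypothetical function list_entries so it can be pointed at any directory:

```shell
# Hypothetical helper: list each entry of a directory by iterating over a glob
# instead of parsing the output of ls.
list_entries() {
    # Run in a subshell so the caller's working directory is left untouched.
    (
        cd "$1" || exit 1
        for f in *; do
            # If the directory is empty the pattern stays a literal "*"; skip it.
            [ -e "$f" ] || continue
            # Quoting keeps names with spaces intact; -- guards against names
            # that start with a dash; -d lists directories themselves.
            ls -d -- "$f"
        done
    )
}

list_entries .
```

Because the names come from the shell's own expansion rather than from ls output, no colour escape sequences can end up inside them. Hidden files (names starting with a dot) are not matched by * by default.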
The "033 [ 0 m" characters are escape codes for colouring terminal output. Try using this instead:
for file in $(unset LS_COLORS; \ls);
do
ls "$file";
done
Or
for file in $(ls -1 -Q --quoting-style=shell --color=never);
do
ls "$file";
done

unexpected result from gnu sort

when I try to sort the following text file 'input':
test1 3
test3 2
test 4
with the command
sort input
the output is exactly the input. Here is the output of
od -bc input
0000000 164 145 163 164 061 011 063 012 164 145 163 164 063 011 062 012
t e s t 1 \t 3 \n t e s t 3 \t 2 \n
0000020 164 145 163 164 011 064 012
t e s t \t 4 \n
0000027
It's just a tab separated file with two columns. When I do
sort -k 2
The output changes to
test3 2
test1 3
test 4
which is what I would expect. But if I do
sort -k 1
nothing changes with respect to the input, whereas I would expect 'test' to sort before 'test1'. Finally, if I do
cat input | cut -f 1 | sort
I get
test
test1
test3
as expected. Is there a logical explanation for this? What exactly is sort supposed to do by default? Something like
sort -k 1
My version of sort:
sort (GNU coreutils) 7.4
From the man pages:
*** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.
So it seems that export LC_ALL=C should help.
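A short sketch of both approaches on the sample data from the question. The temporary file is illustrative. Note that -k 1 means "key from field 1 to the end of the line", while -k1,1 limits the key to field 1 alone.

```shell
# Recreate the tab-separated sample from the question in a temporary file.
sample=$(mktemp)
printf 'test1\t3\ntest3\t2\ntest\t4\n' > "$sample"

# Byte-order comparison of whole lines: the tab (0x09) sorts before '1' (0x31),
# so the "test" line comes first.
LC_ALL=C sort "$sample"

# Alternative: compare only the first field instead of the whole line.
sort -k1,1 "$sample"
```

Setting LC_ALL=C on the single command, as here, affects only that invocation, whereas export LC_ALL=C changes the locale for everything run afterwards in the same shell.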
