BASH split variable in characters and print out space in for loop - bash

I am trying to create a fancy welcome message to be displayed when the terminal is opened. I have done it in Java with this code:
public void slowPrint() {
String message = "Welcome, " + System.getProperty("user.name");
try {
for(int i = 0; i < message.length(); i++) {
System.out.print(message.charAt(i));
Thread.sleep(50);
}
} catch(InterruptedException ex) {
ex.printStackTrace();
}
}
Now, I am fairly new to bash but I've managed to make this code:
for i in W e l c o m e , $USER; do
echo -ne $i
sleep 0.05
done
echo !
There are two problems with this code:
I have no idea how to print a plain space after the comma, the output is just Welcome,simon! How can I make it output a space instead?
It, of course, prints $USER as a whole word. I would want it to be character by character, how can I do this?

You could use a standard for loop to achieve the same effect:
MESSAGE="Welcome, $USER"
for (( i=0; i<${#MESSAGE}; i++ )); do
echo -ne "${MESSAGE:$i:1}"
sleep 0.05
done
echo !
The ${MESSAGE:$i:1} syntax takes a substring of 1 from position i in the string. Enclosing that part in quotes ensures that things like spaces and tabs are also printed.
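As a rough sketch, the same loop can be wrapped in a reusable function. The name slow_print and the optional delay argument are assumptions made for this illustration, not part of the answer above; printf is used instead of echo -ne so that characters such as a backslash are printed literally.

```shell
#!/usr/bin/env bash
# slow_print is a hypothetical helper: print a message one character at a time.
slow_print() {
    local message=$1 delay=${2:-0.05} i
    for (( i = 0; i < ${#message}; i++ )); do
        # ${message:i:1} is the one-character substring starting at offset i.
        printf '%s' "${message:i:1}"
        sleep "$delay"
    done
    printf '\n'
}

# ${USER:-guest} falls back to "guest" if USER happens to be unset.
slow_print "Welcome, ${USER:-guest}!" 0.01
```

Because the substring is quoted, the space after the comma is printed like any other character.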

You can specify the space easily enough; enclose it in quotes:
for i in W e l c o m e , ' ' ...
Splitting $USER into separate characters can be done many ways. The way I'd do it is old-fashioned but reliable:
for i in W e l c o m e , ' ' $(sed 's/./& /g' <<< "$USER")
Note that the <<< operation saves a process and a pipe; it redirects standard input of sed so it reads the given string as a line of input.
Or, if you think the user name might contain any spaces or other special characters:
for i in W e l c o m e , ' ' $(sed "s/./'&' /g" <<< "$USER")
(This isn't bullet proof; the value USER="Tam O'Shanter" will cause some grief, and the simple fix for that runs into trouble with USER='Will "O'\''Wisp" Light' instead. ...mutter, mutter, arcane incantations, ...
for i in W e l c o m e , ' ' $(sed "s/./'&' /g; s/'''/\\\\'/g" <<< "$USER")
except that echoes the name with single quotes around everything; grumble, grumble, ... I think I've just worked out why I wouldn't ever bother to do this, ... spaces get in the way too ... I'd use the simple first version and tell people not to use blanks or special characters in the value of $USER.)
There are ways to do it without invoking a separate process such as sed; see the answer by heuristicus. So, do it that way. But note that it is firmly tied to bash, whereas this answer is not: you can easily replace the <<< notation (a bashism that POSIX sh lacks) with echo "$USER" | ... instead.

Completely POSIXly portable, i.e. without those pesky bashisms, and no fork to sed:
#!/bin/sh
m="Welcome, $USER!"
while test ${#m} -gt 0; do
r=${m#?}
printf '%s' "${m%"$r"}"
m=$r
sleep 0.05
done
printf '\n'
Note that sleeping for fractions of a second is non-portable.
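To see how the two expansions cooperate, here is a minimal sketch that isolates the "peel off the first character" step. The function name first_char is an assumption for illustration only. ${1#?} removes one leading character, and stripping that remainder as a suffix leaves just the first character.

```shell
#!/bin/sh
# first_char is a hypothetical helper name used only for this illustration.
first_char() {
    rest=${1#?}                   # drop the first character, keep the remainder
    printf '%s' "${1%"$rest"}"    # strip that remainder as a literal suffix
}

first_char "Welcome"; printf '\n'    # prints W
first_char "*star";   printf '\n'    # prints *
```

Quoting "$rest" inside the expansion makes the shell treat the remainder as literal text rather than as a glob pattern, which matters if the message contains characters such as * or ?.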

Related

Bash what is the return value of grep and cut

when I write something like this:
x = `grep "#include $1 | cut -f2"`
or any use with grep, cut like:
x = `grep string file.c`
I don't understand if x is an array or one long string? because when I write
echo ${#x[*]}
it prints 1, but I can write:
for d in `grep....`
as it was an array, please explain.
It is giving you a single long string. This is not called the "return value" of grep or cut, but rather the "standard output" (the text they print which you capture with backticks or perhaps clearer with $(...)).
What's happening here is that you get one single string, perhaps even with newlines inside, and then you iterate over it with your for d in .... In Bash, an unquoted expansion is split into words on the characters in IFS (space, tab, and newline by default), so you get one d value for each word. Try this to see it in action, plus a way to avoid it:
x="foo bar baz"
for d in $x; do echo $d; done
for d in "$x"; do echo $d; done
If you quote in the loop, splitting on spaces will not occur.
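If an actual array is wanted, one option is to read the captured output line by line into one. The sketch below assumes bash 4 or later (for readarray) and uses made-up sample text in place of real grep output.

```shell
#!/usr/bin/env bash
# Stand-in for grep output: three lines of text (hypothetical sample data).
output=$(printf '%s\n' '#include <stdio.h>' '#include <stdlib.h>' '#include "my file.h"')

# A plain variable holds one string, so treating it as an array gives length 1.
echo "${#output[@]}"

# readarray (bash 4+) stores one line per array element instead.
readarray -t lines <<< "$output"
echo "${#lines[@]}"

# Quoting "${lines[@]}" keeps each line intact, spaces included.
for line in "${lines[@]}"; do
    printf '[%s]\n' "$line"
done
```

This splits on newlines only, so a line containing spaces stays one element.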
x is a string.
In your example for loops through words.

Why is this command within my code giving different result than the same command in terminal?

**Edit: Okay, so I've tried implementing everyone's advice so far.
-I've added quotes around each variable "$1" and "$codon" to avoid whitespace.
-I've added the -ioc flag to grep to avoid caps.
-I tried using tr -d' ', however that leads to a runtime error because it says -d' ' is an invalid option.
Unfortunately I am still seeing the same problem. Or a different problem, which is that it tells me that every codon appears exactly once. Which is a different kind of wrong.
Thanks for everything so far - I'm still open to new ideas. I've updated my code below.**
I have this bash script that is supposed to count all permutations of (A C G T) in a given file.
One line of the script is not giving me the desired result and I don't know why - especially because I can enter the exact same line of code in the command prompt and get the desired result.
The line, executed in the command prompt, is:
cat dnafile | grep -o GCT | wc -l
This line tells me how many times the regular expression "GCT" appears in the file dnafile. When I run this command the result I get is 10 (which is accurate).
In the code itself, I run a modified version of the same command:
cat $1 | grep -o $codon | wc -l
Where $1 is the file name, and $codon is the 3-letter combination. When I run this from within the program, the answer I get is ALWAYS 0 (which is decidedly not accurate).
I was hoping one of you fine gents could enlighten this lost soul as to why this is not working as expected.
Thank you very, very much!
My code:
#!/bin/bash
#countcodons <dnafile> counts occurrences of each codon in sequence contained within <dnafile>
if [[ $# != 1 ]]
then echo "Format is: countcodons <dnafile>"
exit
fi
nucleos=(a c g t)
allCods=()
#mix and match nucleotides to create all codons
for x in {0..3}
do
for y in {0..3}
do
for z in {0..3}
do
perm=${nucleos[$x]}${nucleos[$y]}${nucleos[$z]}
allCods=("${allCods[@]}" "$perm")
done
done
done
#for each codon, use grep to count # of occurrences in file
len=${#allCods[*]}
for (( n=0; n<len; n++ ))
do
codon=${allCods[$n]}
occs=`cat "$1" | grep -ioc "$codon" | wc -l`
echo "$codon appears: $occs"
# if (( $occs > 0 ))
# then
# echo "$codon : $occs"
# fi
done
exit
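You're generating your sequences in lowercase. Your code greps for gct, not GCT. You want to add the -i switch to grep. Note also that -c counts matching lines rather than individual matches, so to count every occurrence keep -o and pipe to wc -l instead of using -c. Try:
occs=`grep -io "$codon" "$1" | wc -l`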
You've got your logic backwards - you shouldn't have to read your input file once for every codon, you should only have to read it once and check each line for every codon.
You didn't supply any sample input or expected output so it's untested but something like this is the right approach:
awk '
BEGIN {
nucleosStr="a c g t"
split(nucleosStr,nucleos)
#mix and match nucleotides to create all codons
for (x in nucleos) {
for (y in nucleos) {
for (z in nucleos) {
perm = nucleos[x] nucleos[y] nucleos[z]
allCodsStr = allCodsStr (allCodsStr?" ":"") perm
}
}
}
split(allCodsStr,allCods)
}
{
#for each codon, count # of occurrences in file
for (n in allCods) {
codon = allCods[n]
if ( tolower($0) ~ codon ) {
occs[n]++
}
}
}
END {
for (n in allCods) {
printf "%s appears: %d\n", allCods[n], occs[n]
}
}
' "$1"
I expect you'll see a huge performance improvement with that approach if your file is moderately large.
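A variation on the single-pass idea, offered only as a sketch: instead of testing each line against every codon, slide a three-character window along each line and tally what is found. The function name count_all_codons is an assumption. Note that this counts overlapping windows, so its totals can differ from grep -o, which counts non-overlapping matches, and codons that never occur are simply not listed.

```shell
#!/usr/bin/env bash
# count_all_codons is a hypothetical helper: one pass over the file, all codons tallied.
count_all_codons() {
    awk '
    {
        line = tolower($0)
        # Slide a 3-character window along the line and tally each codon seen.
        for (i = 1; i <= length(line) - 2; i++) {
            codon = substr(line, i, 3)
            if (codon ~ /^[acgt][acgt][acgt]$/) count[codon]++
        }
    }
    END {
        for (codon in count) printf "%s appears: %d\n", codon, count[codon]
    }
    ' "$1" | sort    # awk does not guarantee iteration order, so sort the report
}
```

It would be called as count_all_codons dnafile, using the file name from the question.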
Try:
occs=`cat "$1" | grep -o "$codon" | wc -l | tr -d ' '`
The problem is that some wc implementations (BSD and macOS, for example) pad the count with leading spaces, so $occs has a bunch of spaces at the beginning.
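Putting the pieces from this thread together, a minimal sketch of a counting helper might look like the following. The name count_matches and the sample data are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# count_matches is a hypothetical helper; the sample data below is made up.
count_matches() {
    local pattern=$1 file=$2
    # -o prints every match on its own line and -i ignores case,
    # so wc -l counts matches rather than matching lines.
    # tr removes the padding that some wc implementations add.
    grep -io -- "$pattern" "$file" | wc -l | tr -d ' '
}

sample=$(mktemp)
printf '%s\n' 'GCTAGCTA' 'ttgctgct' > "$sample"
count_matches gct "$sample"    # prints 4
rm -f "$sample"
```

Quoting "$pattern" and "$file" keeps the command intact if either value contains spaces.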

Bash - extracting a string between two points

For example:
((
extract everything here, ignore the rest
))
I know how to ignore everything within, but I don't know how to do the opposite. Basically, it'll be a file and it needs to extract the data between the two points and then output it to another file. I've tried countless approaches, and all seem to tell me the indentation I'm stating doesn't exist in the file, when it does.
If somebody could point me in the right direction, I'd be grateful.
If your data are "line oriented", so the marker is alone (as in the example), you can try some of the following:
function getdata() {
cat - <<EOF
before
((
extract everything here, ignore the rest
someother text
))
after
EOF
}
echo "sed - with two seds"
getdata | sed -n '/((/,/))/p' | sed '1d;$d'
echo "Another sed solution"
getdata | sed -n '1,/((/d; /))/,$d;p'
echo "With GNU sed"
getdata | gsed -n '/((/{:a;n;/))/b;p;ba}'
echo "With perl"
getdata | perl -0777 -pe "s/.*\(\(\s*\\n(.*)?\)\).*/\$1/s"
P.S. Yes, it looks like a dance of crazy toothpicks.
Assuming you want to extract the string inside (( and )):
VAR="abc((def))ghi"
echo "$VAR"
VAR=${VAR##*((}
VAR=${VAR%%))*}
echo "$VAR"
## removes the longest match of the pattern from the beginning; # removes the shortest match from the beginning; %% removes the longest match from the end; % removes the shortest match from the end.
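A small illustration of the four operators side by side, using a made-up file name as sample input:

```shell
#!/bin/sh
# Hypothetical sample value with two dots, so shortest and longest matches differ.
path="archive.tar.gz"

echo "${path#*.}"     # shortest match of "*." removed from the front: tar.gz
echo "${path##*.}"    # longest match of "*." removed from the front:  gz
echo "${path%.*}"     # shortest match of ".*" removed from the end:   archive.tar
echo "${path%%.*}"    # longest match of ".*" removed from the end:    archive
```

These expansions work on a single string held in a variable; for multi-line file content, the line-oriented tools in the other answers are a better fit.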
The file :
$ cat /tmp/l
((
extract everything here, ignore the rest
someother text
))
The script
$ awk '$1=="((" {p=1;next} $1=="))" {p=0;next} p' /tmp/l
extract everything here, ignore the rest
someother text
sed -n '/^((/,/^))/ { /^((/b; /^))/b; p }'
Brief explanation:
/^((/,/^))/: range addressing (inclusive)
{ /^((/b; /^))/b; p }: sequence of 3 commands
1. skip line with ^((
2. skip line with ^))
3. print
The line skipping is required to make the range selection exclusive.
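For comparison, the same exclusive-range logic can be sketched in plain shell with a flag variable, assuming (as above) that each marker sits alone on its own line. The function name extract_between is an assumption for illustration.

```shell
#!/usr/bin/env bash
# extract_between is a hypothetical helper: print the lines between (( and )).
extract_between() {
    local line inside=0
    while IFS= read -r line; do
        case $line in
            '((') inside=1; continue ;;    # opening marker: start printing after it
            '))') inside=0; continue ;;    # closing marker: stop printing
        esac
        if [ "$inside" -eq 1 ]; then
            printf '%s\n' "$line"
        fi
    done
}

printf '%s\n' before '((' 'extract everything here' '))' after | extract_between
```

Since it reads standard input and writes standard output, sending the result to another file is a matter of redirection, for example extract_between < input.txt > output.txt (both file names are placeholders).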

How to line wrap output in bash?

I have a command which outputs in this format:
A
B
C
D
E
F
G
I
J
etc
I want the output to be in this format
A B C D E F G I J
I tried using ./script | tr "\n" " " but all it does is remove n from the output
How do I get all the output in one line. (Line wrapped)
Edit: I accidentally put in grep while asking the question. I removed
it. My original question still stands.
The grep is superfluous.
This should work:
./script | tr '\n' ' '
It did for me with a command al that lists its arguments one per line:
$ al A B C D E F G H I J
A
B
C
D
E
F
G
H
I
J
$ al A B C D E F G H I J | tr '\n' ' '
A B C D E F G H I J $
As Jonathan Leffler points out, you don't want the grep. The command you're using:
./script | grep tr "\n" " "
doesn't even invoke the tr command; it should search for the pattern "tr" in files named "\n" and " ". Since that's not the output you reported, I suspect you've mistyped the command you're using.
You can do this:
./script | tr '\n' ' '
but (a) it joins all its input into a single line, and (b) it doesn't append a newline to the end of the line. Typically that means your shell prompt will be printed at the end of the line of output.
If you want everything on one line, you can do this:
./script | tr '\n' ' ' ; echo ''
Or, if you want the output wrapped to a reasonable width:
./script | fmt
The fmt command has a number of options to control things like the maximum line length; read its documentation (man fmt or info fmt) for details.
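Another option, not mentioned above, is paste, which joins lines with a chosen delimiter and still ends the output with a newline. The sketch below uses printf as a stand-in for ./script, and the function name join_lines is an assumption.

```shell
#!/bin/sh
# join_lines is a hypothetical helper: join all input lines with single spaces.
join_lines() {
    # -s joins the lines of the input serially, -d ' ' sets the delimiter,
    # and - tells paste to read standard input.
    paste -sd ' ' -
}

printf '%s\n' A B C D | join_lines    # prints: A B C D
```

Unlike the tr version, this leaves no trailing space and the shell prompt starts on a fresh line.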
No need to use other programs, why not use Bash to do the job? (-- added in edit)
line=$(./script.sh)
set -- $line
echo "$*"
set -- assigns the words that follow to the positional parameters ($1, $2, ...). Because $line is unquoted, the shell splits it on the characters in IFS, and one of the default separators is a "\n". EDIT: This will overwrite any existing command-line arguments, but good coding practice would suggest that you reassign these to named variables early in the script.
When we use "$*" (note the quotes) it joins them all together again using the first character of IFS as the glue. By default that is a space.
tr is an unnecessary child process.
By the way, there is a command called script, so be careful of using that name.
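One way to get the same joining behaviour without overwriting the script's own arguments is to do it inside a function, where the positional parameters are local to the call. This is only a sketch; the name join_by is an assumption, and printf stands in for ./script.sh.

```shell
#!/usr/bin/env bash
# join_by is a hypothetical helper: join its arguments with a one-character separator.
join_by() {
    local sep=$1
    shift
    local IFS=$sep    # "$*" glues the arguments with the first character of IFS
    printf '%s\n' "$*"
}

# The unquoted substitution is split on whitespace (newlines included) by the caller.
join_by ' ' $(printf '%s\n' A B C)    # prints: A B C
join_by ,   $(printf '%s\n' A B C)    # prints: A,B,C
```

As with set -- $line, the unquoted expansion is also subject to filename globbing, so output containing * or ? may need set -f first.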
If I'm not mistaken, the newlines disappear when the variable is passed to echo unquoted: the shell splits the value into words, and echo prints them separated by single spaces:
tmp=$(./script.sh)
echo $tmp
results in
A B C D E F G H I J
whereas
tmp=$(./script.sh)
echo "$tmp"
results in
A
B
C
D
E
F
G
H
I
J
If needed, you can re-assign the output of the echo command to another variable:
tmp=$(./script.sh)
tmp2=$(echo $tmp)
The $tmp2 variable will then contain no newlines.

gnuplot for cycle and spaces in filename

I have small script in bash, which is generating graphs via gnuplot.
Everything works fine until names of input files contain space(s).
Here's what i've got:
INPUTFILES=("data1.txt" "data2 with spaces.txt" "data3.txt")
...
#MAXROWS is set earlier, not relevant.
for LINE in $( seq 0 $(( MAXROWS - 1 )) );do
gnuplot << EOF
reset
set terminal png
set output "out/graf_${LINE}.png"
filenames="${INPUTFILES[@]}"
set multiplot
plot for [file in filenames] file every ::0::${LINE} using 1:2 with line title "graf_${LINE}"
unset multiplot
EOF
done
This code works, but only without spaces in names of input files.
In the example gnuplot evaluate this:
1 iteration: file=data1.txt - CORRECT
2 iteration: file=data2 - INCORRECT
3 iteration: file=with - INCORRECT
4 iteration: file=spaces.txt - INCORRECT
The quick answer is that you can't do exactly what you want to do. Gnuplot splits the string in an iteration on spaces and there's no way around that (AFAIK). Depending on what you want, there may be a workaround. You can write a (recursive) function in gnuplot to replace a character string with another:
#S,C & R stand for STRING, CHARS and REPLACEMENT to help this be a little more legible.
replace(S,C,R)=(strstrt(S,C)) ? \
replace( S[:strstrt(S,C)-1].R.S[strstrt(S,C)+strlen(C):] ,C,R) : S
Bonus points to anyone who can figure out how to do this without recursion...
Then your (bash) loop looks something like:
INPUTFILES_BEFORE=("data1.txt" "data2 with spaces.txt" "data3.txt")
INPUTFILES=()
#C style loop to avoid changing IFS -- Sorry SO doesn't like the #...
#This loop pre-processes files and changes spaces to '#_#'
for (( i=0; i < ${#INPUTFILES_BEFORE[@]}; i++)); do
FILE=${INPUTFILES_BEFORE[${i}]}
INPUTFILES+=( "`echo ${FILE} | sed -e 's/ /#_#/g'`" ) #replace ' ' with '#_#'
done
This preprocesses the filenames, replacing each space with '#_#'. Finally, the "complete" script:
...
INPUTFILES_BEFORE=("data1.txt" "data2 with spaces.txt" "data3.txt")
INPUTFILES=()
for (( i=0; i < ${#INPUTFILES_BEFORE[@]}; i++)); do
FILE=${INPUTFILES_BEFORE[${i}]}
INPUTFILES+=( "`echo ${FILE} | sed -e 's/ /#_#/g'`" ) #replace ' ' with '#_#'
done
for LINE in $( seq 0 $(( MAXROWS - 1 )) );do
gnuplot <<EOF
filenames="${INPUTFILES[@]}"
replace(S,C,R)=(strstrt(S,C)) ? \
replace( S[:strstrt(S,C)-1].R.S[strstrt(S,C)+strlen(C):] , C ,R) : S
#replace '#_#' with ' ' in filenames.
plot for [file in filenames] replace(file,'#_#',' ') every ::0::${LINE} using 1:2 with line title "graf_${LINE}"
EOF
done
However, I think the take-away here is that you shouldn't use spaces in filenames ;)
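As a side note, the bash preprocessing step can probably be done without the loop and without sed, since bash applies pattern substitution to every element of an array at once. A minimal sketch, reusing the file names from the question; the variable names files and encoded are assumptions:

```shell
#!/usr/bin/env bash
# Hypothetical variable names; the file names are the ones from the question.
files=("data1.txt" "data2 with spaces.txt" "data3.txt")

# ${array[@]// /#_#} replaces every space with #_# in each element.
encoded=("${files[@]// /#_#}")

printf '%s\n' "${encoded[@]}"
```

The encoded names can then be joined into the gnuplot filenames string as before, with the gnuplot replace() function turning '#_#' back into spaces.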
Escape the spaces:
"data2\ with\ spaces.txt"
EDIT
It seems that even with escape sequences, as you have mentioned, the bash for will always parse the input on the spaces.
Can you convert your script to work in a while loop fashion:
http://ubuntuforums.org/showthread.php?t=83424
This also may be a solution, but it's new to me and I'm still playing with it to understand exactly what it's doing:
http://www.cyberciti.biz/tips/handling-filenames-with-spaces-in-bash.html
