Bash loop on files in folder without specific pattern

I have to loop over the files in a folder, but I don't want to loop over files whose names contain a specific pattern ("Reverse"). Thanks. Here is the code:
DIRECTORY=/Users/Qi.Wang/projects/CH12F3/data/CH12F3.LAM-HTGTS_mMYC.220512
outputDir=/Users/Qi.Wang/projects/CH12F3/
pw=$(pwd)
cat blank.txt > ./config.txt
c="$pw/config.txt"
o=0
for i in $DIRECTORY/*.fna; do
((o=o+1))
s=${i##*/}
b=${s%.fna}
b="${o}_$(echo $b | awk '{ gsub(/_PairEnd+/, " " ); print $1 }')"
outputDirs="$outputDir$b"
printf "%s\t" $b >> ./config.txt
printf "%s\t" $s >> ./config.txt
cat end.txt >> ./config.txt
printf "perl /Users/andy/projects/HTGTS/pipeline/align_tools/TLPpipeline.pl %s %s which=%s assembly=mm9 blatopt=mask=lower outdir=/%s -skipred -skipredadd -skipblu -skipbluadd \n" $c $DIRECTORY $o $outputDirs >> ./command.sh
done
I also have another minor problem. When I printf outdir=%s, the value that is printed comes from $outputDir, which starts with a "/", but after it is printed by printf the "/" seems to be missing.

Your awk command puts spaces in $b, so $outputDirs will contain spaces. Therefore, you need to quote it to make it a single argument to printf. You should also quote all the other variable arguments.
Also, since you're creating a perl command line, you'll want outdir=%s to be a single argument, so you should put single quotes around that as well.
printf "perl /Users/andy/projects/HTGTS/pipeline/align_tools/TLPpipeline.pl '%s' '%s' 'which=%s' assembly=mm9 blatopt=mask=lower 'outdir=/%s' -skipred -skipredadd -skipblu -skipbluadd \n" "$c" "$DIRECTORY" "$o" "$outputDirs" >> ./command.sh
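The effect of the missing quotes can be reproduced in isolation. This is only a minimal sketch with a made-up value (the variable `name` and its content are hypothetical, standing in for `$b` after the awk substitution): printf reuses its format for every extra argument, so an unquoted value containing a space is split into two words and the format is applied twice.

```shell
# Hypothetical value containing a space, like $b after the awk gsub.
name="1_sample extra"

# Unquoted: the shell splits the value into two words before printf runs,
# and printf applies the format once per word.
unquoted=$(printf '[%s]' $name)

# Quoted: printf receives the value as a single argument.
quoted=$(printf '[%s]' "$name")

printf '%s\n' "$unquoted" "$quoted"
```

The first line printed shows two bracketed words, the second a single bracketed value with the space intact.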
To skip files with Reverse in the name, enable extended globbing and use a non-matching pattern.
shopt -s extglob
for i in "$DIRECTORY"/!(*Reverse*).fna; do
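A self-contained sketch of the extended-glob loop, assuming a throwaway directory and invented file names (none of these names come from the question). Note that extglob has to be enabled before bash parses the line containing the pattern, which is why the shopt line sits at the top of the script. nullglob is added here so the loop body is skipped when nothing matches.

```shell
#!/usr/bin/env bash
shopt -s extglob nullglob   # must be active before the loop line is parsed

# Hypothetical demo directory and files, created only for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir"/sample1_Forward.fna \
      "$demo_dir"/sample1_Reverse.fna \
      "$demo_dir"/sample2.fna

kept=()
for f in "$demo_dir"/!(*Reverse*).fna; do
    kept+=("${f##*/}")      # record only the base name
done

printf '%s\n' "${kept[@]}"
rm -r "$demo_dir"
```

Only the two files without Reverse in their names end up in the `kept` array.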


awk, if else conditional when record contains a value

I'm having trouble getting an awk if/else conditional to properly trigger when the record contains a value. Running this in zsh on Mac OS Catalina.
This script (issue is on second to last line)...
echo "abcdefgh" > ./temp
echo "abc\"\(\"h" >> ./temp
echo "abcdefgh" >> ./temp
echo "abcde\(h" >> ./temp
val='"\("'
key="NEW_NEW"
file="./temp"
echo $val
echo $key
echo $file
echo ""
echo "###############"
echo ""
awk '
BEGIN { old=ARGV[1]; new=ARGV[2]; ARGV[1]=ARGV[2]=""; len=length(old) }
($0 ~ /old/){ s=index($0,old); print substr($0,1,s-1) new substr($0,s+len) }{ print $0 }
' $val $key $file
outputs:
"\("
NEW_NEW
./temp
###############
abcdefgh
abc"\("h
abcdefgh
abcde\(h
I want to fix the script so that it changes the "\(" to NEW_NEW but skips the parenthesis without the quotes...
"\("
NEW_NEW
./temp
###############
abcdefgh
abcNEW_NEWh
abcdefgh
abcde\(h
EDIT
This is an abbreviated version of the real script that I'm working on. The answer will need to include the variable expansions that the sample above has, in order for me to use the command in the larger script. The ARGV format in use is preserving special characters, so the main question I have is why the conditional isn’t triggered as expected.
($0 ~ /old/) means "do a regexp comparison between the current record ($0) and the literal regexp old" so it matches when $0 contains the 3 characters o, l, d in that order. You probably were trying to do a regexp comparison against the contents of the variable named old which would be $0 ~ old (see How do I use shell variables in an awk script?) but you don't actually want that, you want a string comparison which would be index($0,old) as shown in your previous question (https://stackoverflow.com/a/62096075/1745001) but which you have now for some reason moved out of the condition part of your condition { action } awk statement and put it as the first part of the action instead. So don't do that.
The other major problem with your script is you're removing the quotes from around your shell variables so they're being interpreted by the shell and undergoing globbing, file name expansion, etc. before awk even gets to see them (see https://mywiki.wooledge.org/Quotes). So don't do that either.
Fixing just the parts I mentioned:
$ cat tst.sh
echo "abcdefgh" > ./temp
echo "abc\"\(\"h" >> ./temp
echo "abcdefgh" >> ./temp
echo "abcde\(h" >> ./temp
val='"\("'
key="NEW_NEW"
file="./temp"
echo "$val"
echo "$key"
echo "$file"
echo ""
echo "###############"
echo ""
awk '
BEGIN { old=ARGV[1]; new=ARGV[2]; ARGV[1]=ARGV[2]=""; len=length(old) }
s=index($0,old) { $0 = substr($0,1,s-1) new substr($0,s+len) }
{ print }
' "$val" "$key" "$file"
$ ./tst.sh
"\("
NEW_NEW
./temp
###############
abcdefgh
abcNEW_NEWh
abcdefgh
abcde\(h
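The difference between a regexp comparison and a string comparison can be seen on a single record. This is a small sketch, assuming input on stdin instead of a file (the variable names `line`, `regex_hit` and `index_hit` are made up for the demo): used as a dynamic regexp, the value "\(" is read as quote, escaped parenthesis, quote, so it does not match a record that contains a real backslash, whereas index() compares the characters literally.

```shell
line='abc"\("h'   # sample record containing the literal text "\("
val='"\("'        # the text to look for

# Dynamic regexp match: \( is read as an escaped parenthesis, so the
# pattern means quote, paren, quote and does not match the record.
regex_hit=$(printf '%s\n' "$line" |
    awk 'BEGIN { old = ARGV[1]; ARGV[1] = "" } $0 ~ old { print "match" }' "$val")

# String match: index() compares characters literally.
index_hit=$(printf '%s\n' "$line" |
    awk 'BEGIN { old = ARGV[1]; ARGV[1] = "" } index($0, old) { print "match" }' "$val")

printf 'regexp: [%s]  index: [%s]\n' "$regex_hit" "$index_hit"
```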
This code uses the variables val, key and file, but assumes you can alter the content of val to compensate for the escape processing that happens when it is passed to awk with -v and then used as a regexp:
$ file="./temp"; key="NEW_NEW"; val='"\\\\\\("'; \
awk --posix -v val="$val" -v key="$key" '{gsub(val, key)}1' "$file"
abcdefgh
abcNEW_NEWh
abcdefgh
abcde\(h

Bash: Keeping indentation during interpolation

I have a variable containing a multiline string.
I am going to interpolate this variable into another multiline echoed string, this echoed string has indentation.
Here's an example:
ip_status=`ip -o addr | awk 'BEGIN { printf "%-12s %-12s %-12s\n", "INTERFACE", "PROTOCOL", "ADDRESS"
printf "%-12s %-12s %-12s\n", "---------", "--------", "-------" }
{ printf "%-12s %-12s %-12s\n", $2, $3, $4 }'`
echo -e "
->
$ip_status
->
"
When running that, the first line of $ip_status is left justified against the ->, however the subsequent lines are not justified against the ->.
It's easier to see if you run that in your bash. This is the output:
->
INTERFACE PROTOCOL ADDRESS
--------- -------- -------
lo inet 127.0.0.1/8
lo inet6 ::1/128
eth0 inet 10.0.2.15/24
eth0 inet6 fe80::a00:27ff:fed3:76c/64
->
I want all the lines in the $ip_status to be aligned with the ->, not just the first line.
You need to insert the indentation yourself. Bash comes with no feature for making text pretty, although there are some possibly useful utilities (column -t is frequently useful in this sort of application, for example).
Still, inserting indentation isn't too difficult. Here's one solution:
echo "
    ->
    ${ip_status//$'\n'/$'\n    '}
    ->
"
Note: I removed the non-standard -e flag because it really isn't necessary.
Another alternative would be to apply the replacement on the entire output, using a tool like sed:
echo "
    ->
    $ip_status
    ->
" | sed 's/^ */    /'
This second one has the possible advantage that it will tidy up the indentation, even if it were ragged as in the example. If you didn't want that effect, use 's/^/    /' instead.
Or a little shell function whose first argument is the desired indent and whose remaining arguments are indented and concatenated with a newline after each one:
indent() {
local s=$(printf '%*s' $1 "")
shift
printf "$s%s\n" "${@//$'\n'/$'\n'$s}"
}
indent 4 '->' "$ip_status" '->'
That might require some explanation:
printf accepts * as a length specifier, just like the C version. It means "use the corresponding argument as the numeric value". So local s=$(printf '%*s' $1 "") creates a string of spaces of length $1.
Also, printf repeats its format as often as necessary to consume all arguments. So the second printf applies an indent at the beginning and a newline at the end to each argument.
"${@/pattern/subst}" is a substitution applied to each argument in turn. Using two slashes at the beginning ("${@//pattern/subst}") makes it a repeated substitution.
$'\n' is a common syntax for interpreting C-style backslash escapes, implemented by bash and a variety of other shells. (But it's not available in a minimal posix standard shell.)
So "${@//$'\n'/$'\n'$s}" inserts $s -- that is, the desired indentation -- after every newline in each argument.
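The three printf and expansion features described above can be tried one at a time. A minimal sketch, assuming made-up sample arguments; the variable `nl` is introduced here only to hold the newline so the substitution does not depend on how $'...' is treated inside double quotes.

```shell
pad=$(printf '%*s' 4 "")          # %*s takes its width (4) from the argument list
boxed=$(printf '[%s]\n' a b c)    # the format is reused for each argument

nl=$'\n'
set -- "first${nl}second" "third" # two arguments, one with an embedded newline

# Insert the padding after every newline in every argument, then let printf
# add the padding at the start and a newline at the end of each argument.
indented=$(printf "$pad%s\n" "${@//$nl/$nl$pad}")

printf '%s\n' "$boxed" "$indented"
```

Every line of `indented`, including the one after the embedded newline, starts with the same four spaces.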
echo " ->"
while IFS= read -r line
do
echo " $line"
done <<< "$ip_status"
echo " ->"
You can read the variable line by line and echo it with the number of spaces you need before it. I have used the accepted answer of this question.
To make it a function:
myfunction() {
echo " ->"
while IFS= read -r line
do
echo " $line"
done <<< "$1"
echo " ->"
}
myfunction "$ip_status"
A simple form is to use readarray, process substitution and printf:
readarray -t ip_status < <(exec ip -o addr | awk 'BEGIN { printf "%-12s %-12s %-12s\n", "INTERFACE", "PROTOCOL", "ADDRESS"
printf "%-12s %-12s %-12s\n", "---------", "--------", "-------" }
{ printf "%-12s %-12s %-12s\n", $2, $3, $4 }')
printf ' %s\n' '->' "${ip_status[@]}" '->'
Reference: http://www.gnu.org/software/bash/manual/bashref.html
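The readarray approach can be tested without the ip command. In this sketch, fixed sample lines stand in for the real `ip -o addr | awk ...` output (the array name `rows` and the sample text are invented for the demo), and readarray needs bash 4 or later.

```shell
# Fixed sample lines stand in for the output of the ip/awk pipeline.
readarray -t rows < <(printf '%s\n' 'INTERFACE PROTOCOL' 'lo inet' 'eth0 inet')

# One array element per input line; printf repeats the format for each one,
# so every line receives the same four-space indent.
framed=$(printf '    %s\n' '->' "${rows[@]}" '->')

printf '%s\n' "$framed"
```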

how to prevent for loop from using space as deliminator, bash script

I am trying to write a bash script to do multiple checks and searches for a CMS my company uses. I am trying to implement a function that lets a user search for a certain macro call and returns all the files that contain the call, the line the macro is called on, and the actual code in the macro call. What I have seems to be getting broken by the fact that I am using a for loop to format the output. Here's the snippet of the script I am working on:
elif [ "$choice" = "2" ]
then
echo -e "\n What macro call are we looking for $name?"
read macrocall
for i in $(grep -inR "$macrocall" $sitepath/templates/macros/); do
file=$(echo $i | cut -d\: -f1 | awk -F\/ '{ print $NF }')
line=$(echo $i | cut -d\: -f2)
calltext=$(echo $i | cut -d\: -f3-)
echo -e "\nFile: $file"
echo -e "\nLine: $line"
echo -e "\nMacro Call from file: $calltext"
done
fi
The current script processes the first few fields until it reaches a space, and then everything goes wrong. Does anybody have any idea how I can make the for loop's delimiter be each result of the grep? Any suggestions would be helpful. Let me know if you need more info. Thanks!
The right way to do this would be more like:
printf "\n What macro call are we looking for %s?" "$name"
read macrocall
# ensure globbing is off and set IFS to a newline after saving original values
oSET="$-"; set -f; oIFS="$IFS"; IFS=$'\n'
awk -v macrocall="$macrocall" '
BEGIN { lc_macrocall = "\\<" tolower(macrocall) "\\>" }
tolower($0) ~ lc_macrocall {
file=FILENAME
sub(/.*\//,"",file)
printf "\n%s\n", file
printf "\n%d\n", FNR
printf "\nMacro Call from file: %s\n", $0
}
' $(find "$sitepath/templates/macros" -type f -print)
# restore original IFS and globbing values
IFS="$oIFS"; set +f -"$oSET"
This solves the problem of having spaces in your file names as originally requested, but also handles globbing characters in your file names, and the various typical echo issues.
You can set the internal field separator $IFS (which is normally set to space, tab and newline) to just a newline to get around this problem. Note that IFS="\n" would set it to a backslash and the letter n, so use ANSI-C quoting instead:
IFS=$'\n'
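Another common way to process grep -n style results one line at a time is a while read loop with IFS set to a colon for the read only. This is a sketch under assumptions: the `results` string below is invented sample data in file:line:text form, and the approach would misparse file names that themselves contain a colon.

```shell
# Hypothetical grep -n style output: file:line:text, one match per line.
results=$'templates/macros/my file.html:3:{{ call one two }}\ntemplates/macros/b.html:7:{{ call x:y }}'

report=""
# IFS=: applies to read only; the last variable receives the rest of the
# line, including any further colons and all spaces.
while IFS=: read -r path line text; do
    file=${path##*/}                      # strip the directory part
    report+="File: $file | Line: $line | Call: $text"$'\n'
done <<< "$results"

printf '%s' "$report"
```

Because the loop reads whole lines, spaces inside file names or macro text no longer split a result into several iterations.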

How to get output of grep in single line in shell script?

Here is a script which reads words from the file replaced.txt and displays the output for each word on its own line, but I want to display all the outputs on a single line.
#!/bin/sh
echo
echo "Enter the word to be translated"
read a
IFS=" " # Set the field separator
set $a # Breaks the string into $1, $2, ...
for a # a for loop by default loop through $1, $2, ...
do
{
b= grep "$a" replaced.txt | cut -f 2 -d" "
}
done
Content of "replaced.txt" file is given below:
hllo HELLO
m AM
rshbh RISHABH
jn JAIN
hw HOW
ws WAS
ur YOUR
dy DAY
This question doesn't seem to match what I asked; I just need help to put the output of the script on a single line.
Your entire script can be replaced by:
#!/bin/bash
echo
read -r -p "Enter the words to be translated: " a
echo $(printf "%s\n" $a | grep -Ff - replaced.txt | cut -f 2 -d ' ')
No need for a loop.
The echo with an unquoted argument removes embedded newlines and replaces each sequence of multiple spaces and/or tabs with one space.
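The joining effect can be checked with a fixed multi-line value instead of the grep pipeline. A minimal sketch (the variable names are made up); it also shows paste as an alternative that joins lines without relying on word splitting, which matters if the words could contain glob characters such as *.

```shell
multi=$'HELLO\nHOW\nDAY'          # sample multi-line output

# Unquoted on purpose: word splitting turns the newlines into argument
# boundaries and echo prints the arguments separated by single spaces.
joined_echo=$(echo $multi)

# paste -s joins all input lines, using the delimiter given with -d.
joined_paste=$(printf '%s\n' "$multi" | paste -sd ' ' -)

printf '%s\n' "$joined_echo" "$joined_paste"
```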
One hackish-but-simple way to remove trailing newlines from the output of a command is to wrap it in printf %s "$(...) ". That is, you can change this:
b= grep "$a" replaced.txt | cut -f 2 -d" "
to this:
printf %s "$(grep "$a" replaced.txt | cut -f 2 -d" ") "
and add an echo command after the loop completes.
The $(...) notation sets up a "command substitution": the command grep "$a" replaced.txt | cut -f 2 -d" " is run in a subshell, and its output, minus any trailing newlines, is substituted into the argument-list. So, for example, if the command outputs DAY, then the above is equivalent to this:
printf %s "DAY "
(The printf %s ... notation is equivalent to echo -n ... — it outputs a string without adding a trailing newline — except that its behavior is more portably consistent, and it won't misbehave if the string you want to print happens to start with -n or -e or whatnot.)
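The point about strings that look like options can be illustrated with a short sketch, assuming bash's builtin echo with default settings (the variable names are invented for the demo):

```shell
word='-n'

with_printf=$(printf %s "$word")   # prints the two characters -n
with_echo=$(echo "$word")          # the builtin echo treats -n as an option

printf 'printf gave [%s], echo gave [%s]\n' "$with_printf" "$with_echo"
```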
You can also use
awk 'BEGIN { OFS=": "; ORS=" "; } NF >= 2 { print $2; }'
in a pipe after the cut.

Printf example in bash does not create a newline

Working with printf in a bash script, adding no spaces after "\n" does not create a newline, whereas adding a space creates a newline, e. g.:
No space after "\n"
NewLine=`printf "\n"`
echo -e "Firstline${NewLine}Lastline"
Result:
FirstlineLastline
Space after "\n "
NewLine=`printf "\n "`
echo -e "Firstline${NewLine}Lastline"
Result:
Firstline
Lastline
Question: Why doesn't 1. create the following result:
Firstline
Lastline
I know that this specific issue could have been worked around using other techniques, but I want to focus on why 1. does not work.
Edited:
When using echo instead of printf, I get the expected result, but why does printf work differently?
NewLine=`echo "\n"`
echo -e "Firstline${NewLine}Lastline"
Result:
Firstline
Lastline
The backtick operator removes trailing new lines. See 3.4.5. Command substitution at http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_04.html
Note on edited question
Compare:
[alvaro@localhost ~]$ printf "\n"

[alvaro@localhost ~]$ echo "\n"
\n
[alvaro@localhost ~]$ echo -e "\n"


[alvaro@localhost ~]$
The echo command doesn't treat \n as a newline unless you tell it to do so:
NAME
echo - display a line of text
[...]
-e enable interpretation of backslash escapes
POSIX 7 specifies this behaviour here:
[...] with the standard output of the command, removing sequences of one or more <newline> characters at the end of the substitution
Maybe people will come here with the same problem I had:
echoing \n inside code wrapped in backticks. A little tip:
printf "astring\n"
# and
printf "%s\n" "astring"
# both have the same effect.
# So... I prefer the less typing one
The short answer is:
# Escape \n correctly !
# Using just: printf "$myvar\n" causes this effect inside the backticks:
printf "banana
"
# So... you must try \\n that will give you the desired
printf "banana\n"
# Or even \\\\n if this string is being send to another place
# before echoing,
buffer="${buffer}\\\\n printf \"$othervar\\\\n\""
One common problem is that if the code inside the backticks contains
echo 'Tomato is nice'
then its output is itself executed as a command, which produces the error
command Tomato not found.
The workaround is to emit another echo -e or printf:
printed=0
function mecho(){
#First time you need an "echo" in order bash relaxes.
if [[ $printed == 0 ]]; then
printf "echo -e $1\\\\n"
printed=1
else
echo -e "\r\n\r$1\\\\n"
fi
}
Now you can debug your code by running this at the prompt:
(prompt)$ `mySuperFunction "arg1" "etc"`
The output will be printed cleanly:
mydebug: a value
otherdebug: whatever appended using mecho
a third string
and you debug internally with
mecho "a string to be hacktyped"
$ printf -v NewLine "\n"
$ echo -e "Firstline${NewLine}Lastline"
Firstline
Lastline
$ echo "Firstline${NewLine}Lastline"
Firstline
Lastline
It looks like BASH is removing trailing newlines.
e.g.
NewLine=`printf " \n\n\n"`
echo -e "Firstline${NewLine}Lastline"
Firstline Lastline
NewLine=`printf " \n\n\n "`
echo -e "Firstline${NewLine}Lastline"
Firstline
Lastline
Your edited echo version is putting a literal backslash-n into the variable $NewLine which then gets interpreted by your echo -e. If you did this instead:
NewLine=$(echo -e "\n")
echo -e "Firstline${NewLine}Lastline"
your result would be the same as in case #1. To make that one work that way, you'd have to escape the backslash and put the whole thing in single quotes:
NewLine=$(printf '\\n')
echo -e "Firstline${NewLine}Lastline"
or double escape it:
NewLine=$(printf "\\\n")
Of course, you could just use printf directly or you can set your NewLine value like this:
printf "Firstline\nLastline\n"
or
NewLine=$'\n'
echo "Firstline${NewLine}Lastline" # no need for -e
For people coming here wondering how to use newlines in arguments to printf, use %b instead of %s:
$> printf "a%sa" "\n"
a\na
$> printf "a%ba" "\n"
a
a
From the manual:
%b expand backslash escape sequences in the corresponding argument
We do not need "echo" or "printf" for creating the NewLine variable:
NewLine="
"
printf "%q\n" "${NewLine}"
echo "Firstline${NewLine}Lastline"
Bash deletes all trailing newlines in command substitution.
To preserve trailing newlines, assign the printf output to the variable with printf -v VAR
instead of
NewLine=`printf "\n"`
echo -e "Firstline${NewLine}Lastline"
#FirstlineLastline
use
printf -v NewLine '\n'
echo -e "Firstline${NewLine}Lastline"
#Firstline
#Lastline
Explanation
According to the Bash manual:
3.5.4 Command Substitution
$(command)
or
`command`
Bash performs the expansion by executing command and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted, but they may be removed during word splitting.
So even if the output ends with trailing newlines, bash will delete them:
var=$(printf '%s\n%s\n\n\n' 'foo' 'bar')
echo "$var"
output:
foo
bar
According to help printf
printf [-v var] format [arguments]
If the -v option is supplied, the output is placed into the value of the shell variable VAR rather than being sent to the standard output.
In this case, for safe copying of formatted text to the variable, use the [-v var] option:
printf -v var '%s\n%s\n\n\n' 'foo' 'bar'
echo "$var"
output:
foo
bar
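When the text comes from a command other than printf, so that printf -v is not an option, a widely used workaround is to append a sentinel character inside the substitution and strip it afterwards. This is a sketch of that technique, not something taken from the answers above; the sentinel `x` and the variable names are arbitrary choices.

```shell
# Append a sentinel character so the output no longer ends in a newline,
# then strip the sentinel after the substitution.
var=$(printf '%s\n%s\n\n\n' 'foo' 'bar'; printf x)
var=${var%x}

# For comparison: without the sentinel the trailing newlines are lost.
plain=$(printf '%s\n%s\n\n\n' 'foo' 'bar')

printf 'with sentinel: %d characters, plain: %d characters\n' "${#var}" "${#plain}"
```

The length difference between the two variables is exactly the three trailing newlines that plain command substitution removes.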
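It works if you add "\r" after the newline, because the output then no longer ends in a newline (although the variable now also contains a carriage return):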
$ nl=`printf "\n\r"` && echo "1${nl}2"
1
2
