preserve double quotes in string while using echo in unix loop - bash

I have file names that literally contain double quotes in the path (to deal with special-character issues). I want to loop through them and echo the file paths while preserving the quotes, but this seems to remove them:
for value in temp/sample."sample.id1".genotypes.txt temp/sample."sample.id2".genotypes.txt; do echo $value; done
I tried this but no luck:
for value in temp/sample."sample.id1".genotypes.txt temp/sample."sample.id2".genotypes.txt; do echo '${value}'; done
How do I do this?

You need to quote the strings to preserve the double quotes:
for value in 'temp/sample."sample.id1".genotypes.txt' 'temp/sample."sample.id2".genotypes.txt'; do
echo "$value"
done
Otherwise, writing some."thing" is identical to some.thing because the shell interprets the quotes.
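A minimal sketch of the difference, reusing the path from the question. The variable names `unquoted` and `quoted` are only illustrative:

```bash
#!/bin/bash
# Unquoted word: the shell removes the double quotes before the value is stored.
unquoted=temp/sample."sample.id1".genotypes.txt
# Single-quoted word: the double quotes are kept as literal characters.
quoted='temp/sample."sample.id1".genotypes.txt'

printf '%s\n' "$unquoted"   # temp/sample.sample.id1.genotypes.txt
printf '%s\n' "$quoted"     # temp/sample."sample.id1".genotypes.txt
```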

You can also escape the quotes:
for value in temp/sample.\"sample.id1\".genotypes.txt temp/sample.\"sample.id2\".genotypes.txt; do echo "$value"; done

For things like this, I like to use a slightly different approach that looks like a better design to me:
# make an array with the data
mapfile -t ary <<"EOF"
temp/sample."sample.id1".genotypes.txt
temp/sample."sample.id2".genotypes.txt
EOF
# use the data from the array
for f in "${ary[@]}"; do
printf '%s\n' "$f"
done
It will make your life a bit easier if your data grows, and you can then very easily transfer it to another file. Of course, if it's only for a one-time use (e.g., you made an error when naming your files and you only want to rename them), then learn how to have Bash properly parse the quotes as shown in the other answers (escaping them or using single quotes).
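If the list does later move into its own file, the same loop could read it roughly like this. This is a sketch that assumes Bash 4 or newer (for `mapfile`). The file is a temporary one created here only to keep the example self-contained, and `list_file` and `paths` are illustrative names:

```bash
#!/bin/bash
# Hypothetical list file; created here so the example runs on its own.
list_file=$(mktemp)
cat > "$list_file" <<'EOF'
temp/sample."sample.id1".genotypes.txt
temp/sample."sample.id2".genotypes.txt
EOF

# Read one path per line; -t strips the trailing newline from each element.
mapfile -t paths < "$list_file"
rm -f "$list_file"

# The double quotes in each path survive because nothing re-parses them.
for p in "${paths[@]}"; do
    printf '%s\n' "$p"
done
```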

Related

Multiple elements instead of one in bash script for loop

I have been following the answers given in these questions
Shellscript Looping Through All Files in a Folder
How to iterate over files in a directory with Bash?
to write a bash script which goes over files inside a folder and processes them. So, here is the code I have:
#!/bin/bash
YEAR="2002/"
INFOLDER="/local/data/datasets/Convergence/"
for f in "$INFOLDER$YEAR*.mdb";
do
echo $f
absname=$INFOLDER$YEAR$(basename $f)
# ... the rest of the script ...
done
I am receiving this error: basename: extra operand.
I added echo $f and I realized that f contains all the filenames separated by space. But I expected to get one at a time. What could be the problem here?
You're running into problems with quoting. In the shell, double-quotes prevent word splitting and wildcard expansion; generally, you don't want these things to happen to variables' values, so you should double-quote variable references. But when you have something that should be word-split or wildcard-expanded, it cannot be double-quoted. In your for statement, you have the entire file pattern in double-quotes:
for f in "$INFOLDER$YEAR*.mdb";
...which prevents word-splitting and wildcard expansion on the variables' values (good) but also prevents it on the * which you need expanded (that's the point of the loop). So you need to quote selectively, with the variables inside quotes and the wildcard outside them:
for f in "$INFOLDER$YEAR"*.mdb;
And then inside the loop, you should double-quote the references to $f in case any filenames contain whitespace or wildcards (which are completely legal in filenames):
echo "$f"
absname="$INFOLDER$YEAR$(basename "$f")"
(Note: the double-quotes around the assignment to absname aren't actually needed -- the right side of an assignment is one of the few places in the shell where it's safe to skip them -- but IMO it's easier and safer to just double-quote all variable references and $( ) expressions than to try to keep track of where it's safe and where it's not.)
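The effect of moving the wildcard outside the quotes can be checked in a scratch directory. This is only a sketch: the directory and the two `.mdb` file names are made up for the demonstration.

```bash
#!/bin/bash
# Scratch directory with two hypothetical .mdb files, one containing a space.
dir=$(mktemp -d)
touch "$dir/a.mdb" "$dir/b c.mdb"

quoted_count=0
for f in "$dir/*.mdb"; do      # whole pattern quoted: no expansion, one literal word
    quoted_count=$((quoted_count + 1))
done

glob_count=0
for f in "$dir/"*.mdb; do      # wildcard outside the quotes: one word per file
    glob_count=$((glob_count + 1))
    basename "$f"              # quoted, so "b c.mdb" stays a single argument
done

echo "quoted: $quoted_count, glob: $glob_count"   # quoted: 1, glob: 2
rm -rf "$dir"
```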
Just quote your shell variables if they are supposed to contain strings with spaces in between.
basename "$f"
Not doing so will lead to the string being split into separate words (see WordSplitting in bash), thereby messing up the basename command, which expects one string argument rather than several.
It would also be wise to keep the * outside the quotes, as shell globbing doesn't work inside them (single or double).
#!/bin/bash
# good practice to lower-case variable names to distinguish them from
# shell environment variables
year="2002/"
in_folder="/local/data/datasets/Convergence/"
for file in "${in_folder}${year}"*.mdb; do
# break the loop gracefully if no files are found
[ -e "$file" ] || continue
echo "$file"
# Worth noting here: $file already holds the name of the file
# with its absolute path, just like the value built below.
# You don't need to construct it manually
absname=${in_folder}${year}$(basename "$file")
done
Just remove the double quotes from this line
for f in "$INFOLDER$YEAR*.mdb";
so it looks like this
#!/bin/bash
YEAR="2002/"
INFOLDER="/local/data/datasets/Convergence/"
for f in $INFOLDER$YEAR*.mdb;
do
echo $f
absname=$INFOLDER$YEAR$(basename $f)
# ... the rest of the script ...
done

bash grep variable as pattern

I don't usually work in bash but grep could be a really fast solution in this case. I have read a lot of questions on grep and variable assignment in bash yet I do not see the error. I have tried several flavours of double quotes around $pattern, used `...` or $(...) but nothing worked.
So here's what I try to do:
I have two files. The first contains several names. Each of them I want to use as a pattern for grep in order to search them in another file. Therefore I loop through the lines of the first file and assign the name to the variable pattern.
This step works as the variable is printed out properly.
But somehow grep does not recognize/interpret the variable. When I substitute "$pattern" with an actual name everything is fine as well. Therefore I don't think the variable assignment has a problem but the interpretation of "$pattern" as the string it should represent.
Any help is greatly appreciated!
#!/bin/bash
while IFS='' read -r line || [[ -n $line ]]; do
a=( $line )
pattern="${a[2]}"
echo "Text read from file: $pattern"
var=$(grep "$pattern" 9606.protein.aliases.v10.txt)
echo "Matched Line in Alias is: $var"
done < "$1"
> bash match_Uniprot_StringDB.sh ~/Chromatin_Computation/.../KDM.protein.tb
output:
Text read from file: "UBE2B"
Matched Line in Alias is:
Text read from file: "UTY"
Matched Line in Alias is:
EDIT
The solution drvtiny suggested works. It is necessary to get rid of the double quotes to match the string. Adding the following lines makes the script work.
pattern="${pattern#\"}"
pattern="${pattern%\"}"
Please look at the "-f FILE" option in man grep.
I believe this option does exactly what you need, without any bash loops or other such "hacks" :)
And yes, according to the output of your code, you read the pattern with the double quotes included literally. In other words, you read this string from the file ~/Chromatin_Computation/.../KDM.protein.tb:
"UBE2B"
But not
UBE2B
as you probably expect.
Maybe you need to remove double quotes on the boundaries of your $pattern?
Try to do this after reading pattern:
pattern=${pattern#\"}
pattern=${pattern%\"}
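Both suggestions can be combined in a rough sketch like the one below. It is simplified relative to the question: the two temporary files are hypothetical stand-ins, and the names file holds one quoted name per line instead of a third column. `-F` is added so that each name is matched as a fixed string and not as a regular expression.

```bash
#!/bin/bash
# Hypothetical stand-ins for the two input files.
names=$(mktemp)
aliases=$(mktemp)
printf '%s\n' '"UBE2B"' '"UTY"' > "$names"
printf '%s\n' 'id1 UBE2B alias' 'id2 OTHER alias' 'id3 UTY alias' > "$aliases"

# Strip every double quote from the names, then hand the cleaned list to
# grep as a pattern file via process substitution.
matches=$(grep -F -f <(tr -d '"' < "$names") "$aliases")
printf '%s\n' "$matches"

rm -f "$names" "$aliases"
```

This prints the `id1` and `id3` lines and needs no loop over the names.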

Word splitting using multiple IFS

Being new to shell scripting, I am not clear about the Quoting and splitting concepts in bash. In the below snippet:
array1=("france-country":"Italy-country":"singapore-country")
echo ${#array1[@]}
IFS=":-"
for i in ${array1[@]}
do
echo "$i"
done
unset IFS
With IFS being :-, I thought the result would be:
france-country
Italy-country
singapore-country
as I had quoted them ("france-country"), so I thought they should not get split on "-". But the results were:
france
country
Italy
country
singapore
country
It would be great if someone could point out the mistake in my understanding.
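IFS is a set of separator characters, not a single multi-character delimiter, so IFS=":-" splits on every : and on every -. For your problem you can simply change the field separator to a colon, i.e. IFS=:, because in your example the country names are separated by : and not by :-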
array1=("france-country":"Italy-country":"singapore-country")
echo ${#array1[@]}
IFS=":"
for i in ${array1[@]}
do
echo "$i"
done
unset IFS
FYI, array elements in bash are separated by whitespace, so the whole string "france-country":"Italy-country":"singapore-country" is a single element of the array; therefore echo ${#array1[@]} will always print 1. So I do not see any use for an array in this example. A simple variable would have sufficed.
This script shows how to split a colon-separated string into a Bash array.
#!/usr/bin/env bash
words="france-country:Italy-country:singapore-country:Australia-country"
IFS=':' array1=($words)
numwords="${#array1[@]}"
for((i=0; i<numwords; i++));do
echo "$i: ${array1[i]}"
done
output
0: france-country
1: Italy-country
2: singapore-country
3: Australia-country
Note that in
array1=($words)
we don't put quotes around $words, as that would prevent word splitting.
We don't actually need the quotes in
words="france-country:Italy-country:singapore-country:Australia-country"
but quotes (either single or double) would be needed if there were any spaces in that string. Similarly, the quotes are superfluous in numwords="${#array1[@]}", and many Bash programmers would omit them because we know the result of ${#array1[@]} will never have spaces in it.
It's a Good Idea to always use quotes in Bash unless you're sure you don't want them, eg in the case of array1=($words) above, or when performing tests with the [[ ... ]] syntax.
I suggest you bookmark BashGuide and browse through their FAQ. But the only way to really learn the arcane ways of quoting and splitting in Bash is to write lots of scripts. :)
You may also find ShellCheck helpful.
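As a possible alternative to the scripts above, the split can be scoped to a single `read` command, so IFS never has to be changed globally or unset afterwards. This is a sketch rather than what either answer proposed, and the array name `countries` is illustrative:

```bash
#!/bin/bash
words="france-country:Italy-country:singapore-country"

# IFS applies only to this one read command; -a stores the fields in an array.
IFS=':' read -r -a countries <<< "$words"

printf '%s\n' "${#countries[@]}"   # 3
printf '%s\n' "${countries[@]}"    # one country per line, hyphens intact
```

Because the hyphen is not in IFS here, each `name-country` pair stays in one piece.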

Bash - Convert comma separated list

I'm trying to write a script that works on a list of files. If I want to work on all files in the same directory then
FILES=*.ext
for i in $FILES; do
something on "$i"
done
works OK, the problem comes when I want to work on just a selection of files and not everything. How do I convert a comma separated list of files, which may or may not contain spaces into the same format, so that I can store it in $FILES and use the same code?
Many thanks, in advance
David Shaw
The correct thing to do is to use an array rather than a delimited list of filenames (and to avoid uppercase variable names). This avoids the problem of filenames containing your separator (e.g. ,) and is the idiomatic approach:
files=( *foo*.ext1 *bar*.ext2 file1 "file 2 with a space" )
for file in "${files[@]}"; do
[ -e "${file}" ] || continue
do_something_with "${file}"
done
Unless you have no control over how $files is populated, this is what you want. If your script gets fed a comma-separated list and you absolutely cannot avoid it, then you can set IFS accordingly, as in @BroSlow's answer.
Since globbing does the right thing when expanding, filenames with spaces are not a problem (not even in your example).
You might also want to check extended globbing (extglob) to be able to match more specifically.
If I am interpreting your question correctly, you can just set the internal field separator (IFS) in bash to a comma and then have word-splitting take care of the rest, e.g.
#!/bin/bash
FILES="file1,file2 with space,file3,file4 with space"
IFS=','
for i in $FILES; do
echo "File = [$i]"
done
Which would output
File = [file1]
File = [file2 with space]
File = [file3]
File = [file4 with space]
Note, as Adrian Frühwirth pointed out in comments, this will fail if the filenames can contain commas.
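The caveat about commas can be made concrete with a small sketch. The file names are invented for the illustration, and the split is done with `read -a` so that IFS is changed for that one command only:

```bash
#!/bin/bash
# Intended: two files, "a,b.txt" and "c.txt". The list cannot express that.
list="a,b.txt,c.txt"
IFS=',' read -r -a from_list <<< "$list"

# An array keeps each name intact, whatever characters it contains.
from_array=( "a,b.txt" "c.txt" )

echo "${#from_list[@]}"    # 3 (the first name was cut in two)
echo "${#from_array[@]}"   # 2
```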

strings in bash

Hi, I need to go over the characters in a string in bash, including spaces. How can I do it?
Bash does support substrings directly (If that's what the OP wants):
$ A='Hello World!'
$ echo "${A:3:5}"
lo Wo
$ echo "${A:5:3}"
 Wo
$ echo "${A:7:3}"
orl
The expansion used is generalized as:
${PARAMETER:OFFSET:LENGTH}
PARAMETER is your variable name. OFFSET and LENGTH are numeric expressions as used by `let'. See the bash info page on shell parameter expansion for more information, since there are a few important details on this.
Therefore, if you want to e.g. print all the characters in the contents of a variable each on its own line you could do something like this:
$ for ((i=0; i<${#A}; i++)); do echo "${A:i:1}"; done
The advantage of this method is that you don't have to store the string elsewhere, mangle its contents or use external utilities with process substitution.
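A variant of the same loop, sketched here with an illustrative array named `chars`, collects the characters instead of printing them. Quoting the expansion is what keeps the space as an element of its own:

```bash
#!/bin/bash
A='Hello World!'
chars=()

for ((i = 0; i < ${#A}; i++)); do
    chars+=( "${A:i:1}" )      # quoted, so the space is stored and not dropped
done

# Brackets make the space character visible in the output.
printf '[%s]\n' "${chars[@]}"
```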
Not sure what you really mean, but in almost all cases, problems with strings including spaces can be solved by quoting them.
So, if you've got a nice day, try "a nice day" or 'a nice day'.
You could use some external tool for it. The bash shell is really meant to be used to glue other programs together in usually simple combinations.
Depending on what you need, you might use cut, awk, sed or even perl.
Try this
#!/bin/bash
str="so long and thanks for all the fish"
while [ -n "$str" ]
do
printf "%c\n" "$str"
str=${str#?}
done
