Read variables from file and loop through them in bash - bash

I have a text file containing variables like this:
lane1_pair1="file1"
lane1_pair2="file2"
lane2_pair1="file3"
lane2_pair2="file4"
...
I'd like to loop through the variables and concatenate all of them in a single file. I am applying the loop as:
. variables
for (( n=1; n<=no_lanes; n++ )) {
cat $"lane"${n}_pair1 >> "$sampleID"_cat1.fq
cat $"lane"${n}_pair2 >> "$sampleID"_cat2.fq
}
"$sampleID"_cat1.fq > fq_align_1
"$sampleID"_cat2.fq > fq_align_2
The problem is that the cat command does not work: instead of replacing "laneX_pairY" with its value, the shell treats it as a literal string. Does anyone have an idea how to fix this?

As I see it, the problem is you're looping through variables that are named like an array but aren't one. Then you're in the position of trying to evaluate the contents of the variable name created from your other indexing variable, which is a recipe for messy code that breaks easily. I recommend making actual arrays of file names from your variables file:
p1_files=($(grep 'pair1' variables | cut -d '=' -f 2))
p2_files=($(grep 'pair2' variables | cut -d '=' -f 2))
for f in "${p1_files[@]}"; do
cat "${f//\"/}" >> "$sampleID"_cat1.fq
done
for f in "${p2_files[@]}"; do
cat "${f//\"/}" >> "$sampleID"_cat2.fq
done
I'm not sure how you want your fq_align_n variables to be, but you can read the file contents to variables using:
fq_align_1=$(cat "$sampleID"_cat1.fq)
fq_align_2=$(cat "$sampleID"_cat2.fq)
Or just skip the file creation altogether and build the variables incrementally in the loops.
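If you would rather keep the sourced variables as they are, bash's indirect expansion (`${!name}`) is another option. The following is only a minimal sketch: the scratch directory, the tiny input files, and the values of `no_lanes` and `sampleID` are assumptions made up for the demonstration, not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical demo setup: a scratch directory with four small input files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'A\n' > file1; printf 'B\n' > file2
printf 'C\n' > file3; printf 'D\n' > file4

# Stand-in for the "variables" file from the question.
cat > variables <<'EOF'
lane1_pair1="file1"
lane1_pair2="file2"
lane2_pair1="file3"
lane2_pair2="file4"
EOF
. ./variables

no_lanes=2      # assumed to be set elsewhere in the real script
sampleID=demo   # assumed sample name

for (( n = 1; n <= no_lanes; n++ )); do
    p1="lane${n}_pair1"   # holds the NAME of the variable, not its value
    p2="lane${n}_pair2"
    # ${!p1} expands to the value of the variable whose name is stored in p1
    cat "${!p1}" >> "${sampleID}_cat1.fq"
    cat "${!p2}" >> "${sampleID}_cat2.fq"
done
```

The array approach above is usually easier to maintain; indirect expansion is mainly useful when the variables file cannot be changed.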

Related

how to assign each of multiple lines in a file as different variable?

This is probably a very simple question. I looked at other answers but couldn't come up with a solution. I have a 365-line date file, as below:
01-01-2000
02-01-2000
I need to read this file line by line and assign each day to a separate variable, like this:
d001=01-01-2000
d002=02-01-2000
I tried while read commands but couldn't get them to work. Assigning the variables one by one takes a lot of time. How can I do it quickly?
Trying to create separately named variables for what is really array data is a waste of time and not well supported. Better to use an associative array:
#!/bin/bash
declare -A array
while read -r line; do
printf -v key 'd%03d' $((++c))
array[$key]=$line
done < file
Output
for i in "${!array[@]}"; do echo "key=$i value=${array[$i]}"; done
key=d001 value=01-01-2000
key=d002 value=02-01-2000
Assumptions:
an array is acceptable
array index should start with 1
Sample input:
$ cat sample.dat
01-01-2000
02-01-2000
03-01-2000
04-01-2000
05-01-2000
One bash/mapfile option:
unset d # make sure variable is not currently in use
mapfile -t -O1 d < sample.dat # load each line from file into separate array location
This generates:
$ typeset -p d
declare -a d=([1]="01-01-2000" [2]="02-01-2000" [3]="03-01-2000" [4]="04-01-2000" [5]="05-01-2000")
$ for i in "${!d[@]}"; do echo "d[$i] = ${d[i]}"; done
d[1] = 01-01-2000
d[2] = 02-01-2000
d[3] = 03-01-2000
d[4] = 04-01-2000
d[5] = 05-01-2000
In OP's code, references to $d001 now become ${d[1]}.
A quick one-liner would be:
eval $(awk 'BEGIN{cnt=0}{printf "d%3.3d=\"%s\"\n",cnt,$0; cnt++}' your_file)
eval makes the shell variables known inside your script or shell. Use echo $d000 to show the first one of the newly defined variables. There should be no shell special characters (like * and $) inside your_file. Remove eval $() to see the result of the awk command. The \" quoted %s is to allow spaces in the variable values. If you don't have any spaces in your_file you can remove the \" before and after %s.
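If the separately named variables really are required, `printf -v` can create them without `eval`, which avoids executing whatever happens to be in the file. This is a minimal sketch under assumptions: the temporary file and its three sample dates are invented for the demonstration, and numbering starts at d001 as in the question (the awk one-liner above starts at d000).

```shell
#!/usr/bin/env bash
# Hypothetical sample input; in practice this would be the 365-line date file.
datefile=$(mktemp)
printf '%s\n' 01-01-2000 02-01-2000 03-01-2000 > "$datefile"

n=0
while IFS= read -r line; do
    n=$((n + 1))
    printf -v name 'd%03d' "$n"      # build the variable name: d001, d002, ...
    printf -v "$name" '%s' "$line"   # assign the line to that variable, no eval
done < "$datefile"

echo "$d001 $d002 $d003"
```

Because the line is passed as an argument to `%s`, characters such as `*` or `$` in the file are stored literally rather than interpreted.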

Reading filenames from a structured file to a bash script

I have a file with a structured list of filenames (file1.sh, file2.sh, ...) and would like to loop over the file names inside a bash script.
cat /home/flora/logs/9681-T13:17:07.091363777.org
%rec: dynamic
Ptrn: Gnu
File: /home/flora/comint.rc
+ /home/flora/engine.rc
+ /home/flora/playa.rc
+ /home/flora/edva.rc
+ /home/flora/dyna.rc
+ /home/flora/lin.rc
Have started with
while read -r fl; do
echo "$fl" | grep -oE '[/].+'
done < "$logfl"
But I want to be more specific by matching the File: , then continue reading the rest using + as a continuation character.
bash doesn't impose a limit on the number of variables (other than memory). That said, I would start by processing the list of lines one by one:
#!/bin/bash
while read _ f
do
process "$f"
done
where process is whatever function you need to implement.
If you want the names in a variable, use an array like this:
#!/bin/bash
while read _ f
do
files+=("$f")
done
In either case pass the input file to script with:
your_script < /home/flora/logs/27043-T13:09:44.893003954.log
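To be more specific about the `File:` field and its `+` continuation lines, as the question asks, the loop can track whether it is currently inside that field. The sketch below is one possible way to do it; the temporary file standing in for the log and the names `files` and `in_file_field` are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the log file from the question.
logfl=$(mktemp)
cat > "$logfl" <<'EOF'
%rec: dynamic
Ptrn: Gnu
File: /home/flora/comint.rc
+ /home/flora/engine.rc
+ /home/flora/playa.rc
EOF

files=()
in_file_field=0
while read -r tag rest; do
    case $tag in
        File:)  # start of the File field: keep its value
            in_file_field=1
            files+=("$rest")
            ;;
        +)      # continuation line: only counts while inside the File field
            if (( in_file_field )); then files+=("$rest"); fi
            ;;
        *)      # any other field ends the File field
            in_file_field=0
            ;;
    esac
done < "$logfl"

printf '%s\n' "${files[@]}"
```

Splitting each line into `tag` and `rest` keeps paths containing spaces intact, since `rest` receives everything after the first word.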

Bash string substitution with %

I have a list of files named with this format:
S2_7-CHX-2-5_Chr5.bed
S2_7-CHX-2-13_Chr27.bed
S2_7-CHX-2-0_Chr1.bed
I need to loop through each file to perform a task. Previously, I had named them without the step 2 indicator ("S2"), and this format had worked perfectly:
for FASTQ in *_clean.bam; do
SAMPLE=${FASTQ%_clean.bam}
echo $SAMPLE
echo $(samtools view -c ${SAMPLE}_clean.bam)
done
But now that I have the S2 preceding what I would like to set as the variable, this returns a list of empty "SAMPLE" variables. How can I rewrite the following code to specify only S2_*.bed?
for FASTQ in S2_*.bed; do
SAMPLE=${S2_FASTQ%.bed}
echo $SAMPLE
done
Edit: I'm trying to isolate the unique name from each file, for example "7-CHX-2-13_Chr27" so that I can refer to it later. I can't use the "S2" as part of this because I want to rename the file with "S3" for the next step, and so on.
Example of what I'm trying to use it for:
for FASTQ in S2_*.bed; do
SAMPLE=${S2_FASTQ%.bed}
echo $SAMPLE
#rename each mapping position with UCSC chromosome name using sed
while IFS=, read -r f1 f2; do
#rename each file
echo " sed "s/${f1}.1/chr${f2}/g" S2_${SAMPLE}_Chr${f2}.bed > S3_${SAMPLE}_Chr${f2}.bed" >> $SCRIPT
done < $INPUT
done
The name of the variable is still FASTQ; the S2_ is not part of the variable name, but of its value.
sample=${FASTQ%.bed}
#        ~~~~~|~~~~
#          |  |  |
#   Variable  |  What to remove
#     name    |
#          Remove
#          from the right
If you want to remove the S2_ from the $sample, use left hand side removal:
sample=${sample#S2_}
The removals can't be combined in a single expansion; you have to proceed in two steps.
Note that I use lower case variable names. Upper case should be reserved for environment and internal shell variables.
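Put together, the two removals might look like the sketch below. The file name is taken from the question; the variable `next` and the S3 output name are assumptions based on the renaming described in the edit.

```shell
#!/usr/bin/env bash
fastq="S2_7-CHX-2-13_Chr27.bed"

sample=${fastq%.bed}     # step 1: remove the suffix .bed from the right
sample=${sample#S2_}     # step 2: remove the prefix S2_ from the left

next="S3_${sample}.bed"  # hypothetical name for the next processing step

echo "$sample"   # prints 7-CHX-2-13_Chr27
echo "$next"     # prints S3_7-CHX-2-13_Chr27.bed
```

Inside a `for fastq in S2_*.bed` loop the same two lines yield the unique part of each file name.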

How to process text in shell script and assign to variables?

The sample.text file:
var1=https://www.process.com
var2=https://www.hp.com
var3=http://www.google.com
:
:
varz=https://www.sample.com
I am sending this sample.text file as input to a script.
The script should split the lines and assign the variable names and the values to different parameters,
like
$varn= $var1,....$varn
$value=https://www.sample.com ( all the variables value)
I am trying the script below, but it is not working.
#!/bin/bash
for $1 in ( cat sample.txt );
do
echo $1 #var1=https://www.process.com
sed 's/=/\n/g' $1 | awk 'NR%2==0'
done
The main aim is to assign all URLs to one variable and all variable names to another, and then process the file.
If sample.text already contains your variable assignments for you, e.g.
var1=https://www.process.com
var2=https://www.hp.com
var3=http://www.google.com
and you want access to var1, var2, ... varn, then you are making things difficult on yourself by trying to read and parse sample.text instead of simply sourcing it with '.' or source.
For example, given sample.text containing:
$ cat sample.text
var1=https://www.process.com
var2=https://www.hp.com
var3=http://www.google.com
varz=https://www.sample.com
You need only source the file to access the variable, e.g.
#!/bin/bash
. sample.text || {
printf "error sourcing sample.text\n"
exit 1
}
printf "%s\n" $var{1..3} $varz
Example Use/Output
$ bash source_sample.sh
https://www.process.com
https://www.hp.com
http://www.google.com
https://www.sample.com
Look things over and let me know if you have further questions.
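If the goal is instead to collect all names in one variable and all URLs in another, as the question states, the file can be read with `=` as the field separator. This is a minimal sketch, not the answerer's method; the temporary input file and the array names `names` and `values` are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for sample.text.
input=$(mktemp)
cat > "$input" <<'EOF'
var1=https://www.process.com
var2=https://www.hp.com
var3=http://www.google.com
EOF

names=()
values=()
# Split each line at the first '='; the remainder of the line lands in value.
while IFS='=' read -r name value; do
    [[ -n $name ]] || continue   # skip empty lines
    names+=("$name")
    values+=("$value")
done < "$input"

printf 'names:  %s\n' "${names[*]}"
printf 'values: %s\n' "${values[*]}"
```

Unlike sourcing, this approach never executes the file, so it also works when the lines are not valid shell assignments.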

Is there a way to loop variables from another file into my bash script?

Sorry to be a pain, but I'm not sure how I can loop values from an outside file, into my bash script as variables. I have three variable names in my bash script:
$TAGBEGIN
$TAGEND
$MYCODE
In a separate varSrc.txt file, I have several variables:
# a - Some marker
tagBegin_a='/<!-- Begin A -->/'
tagEnd_a='/<!-- End A -->/'
code_a=' [ some code to replace in between tags ] '
# b - Some marker
tagBegin_b='/<!-- Begin B -->/'
tagEnd_b='/<!-- End B -->/'
code_b=' [ some code to replace in between tags ] '
# c - Some marker
...
I need my bash script to be able to loop through each "# marker"* section and perform a function:
source varSrc.txt
$TAGBEGIN
$TAGEND
$MYCODE
...
sed '
'"$TAGEND"' R '"$MYCODE"'
'"$TAGBEGIN"','"$TAGEND"' d
' -i $TARGETDIR
Note: sed code logic (not quoting mess) courtesy of Glenn J.
I need some kind of looping logic like:
for (var i = 0; i <= markers in varSrc.txt ; i++) {
// set bash vars equal to varSrc values
$TAGBEGIN= $tagBegin_i
$TAGEND= $tagEnd_i
$MYCODE= $code_i
// run the 'sed' replace command
sed '
'"$TAGEND"' R '"$MYCODE"'
'"$TAGBEGIN"','"$TAGEND"' d
' -i $TARGETDIR
}
Is this something that can be feasibly done in a bash script and is this a good approach? Any suggestions, pointers or guidance is very, very appreciated!
*(which I don't think is a real marker I can use)
[Answering the question as amended]
There's no need to use, iterate over, or think about markers at all. Leave them out.
source varSrc.txt
for beginVar in "${!tagBegin_@}"; do # Iterate over defined begin variable names
endVar=tagEnd_${beginVar#tagBegin_} # Generate the name of the end variable
codeVar=code_${beginVar#tagBegin_} # Generate the name of the code variable
begin=${!beginVar} # Look up the contents of the begin variable
end=${!endVar} # Look up the contents of the end variable
code=${!codeVar} # Look up the contents of the code variable
sed -e "$end R $code" -e "$begin,$end d" -i "$file"
done
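The two expansions used here, `"${!prefix@}"` (names of all variables starting with a prefix) and `${!name}` (value of the variable whose name is stored in `name`), can be tried in isolation. The sketch below defines the variables inline instead of sourcing varSrc.txt and only prints what it finds; the `code_*` values and the `ends` array are placeholders invented for the demonstration.

```shell
#!/usr/bin/env bash
# Inline stand-ins for the contents of varSrc.txt (values are placeholders).
tagBegin_a='/<!-- Begin A -->/'
tagEnd_a='/<!-- End A -->/'
code_a='snippet_a.html'
tagBegin_b='/<!-- Begin B -->/'
tagEnd_b='/<!-- End B -->/'
code_b='snippet_b.html'

declare -A ends   # hypothetical: remember the end pattern per marker suffix

for beginVar in "${!tagBegin_@}"; do     # tagBegin_a, tagBegin_b
    suffix=${beginVar#tagBegin_}         # a, b
    endVar=tagEnd_$suffix
    codeVar=code_$suffix
    ends[$suffix]=${!endVar}             # indirect lookup of the end pattern
    printf '%s: begin=%s end=%s code=%s\n' \
        "$suffix" "${!beginVar}" "${!endVar}" "${!codeVar}"
done
```

Replacing the `printf` with the `sed` call from the answer gives the full loop.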
[Answers original, pre-amended question]
source only works if your input file is valid bash syntax; it isn't. Thus, you'll need to parse it yourself, something like the following:
begin= end= code=
while IFS= read -r; do
case $REPLY in
#*)
# we saw a marker; process all vars seen so far
[[ $begin && $end && $code ]] || continue # do nothing if we have no vars seen
sed -e "$end R $code" -e "$begin,$end d" -i "$file"
;;
'$TAGBEGIN='*) begin=${REPLY#'$TAGBEGIN='} ;;
'$TAGEND='*) end=${REPLY#'$TAGEND='} ;;
'$MYCODE='*) code=${REPLY#'$MYCODE='} ;;
esac
done <varSrc.txt
What you can do is export your variables in your second file and then execute the script within your current environment (with a dot before the script). To get the variable names/markers, you can parse the file and search for a $ or #.
