After looking at this post, it looks like I can just use cat to merge files.
However, I am a bit confused on how to do this with my array of filename prefixes.
For example:
prefixes=( pre1 pre2 pre3 pre4 pre5 )
If I have an array of prefixes like that, how can I build a command like this, or something equivalent:
cat pre1.file pre2.file pre3.file pre4.file pre5.file > merged.file
You can use a loop to iterate over the file names in the array:
prefixes=( pre1 pre2 pre3 pre4 pre5 )
for p in "${prefixes[@]}"; do cat "$p.file"; done > merged.file
If all your prefixes follow a pattern then you can do it using globbing:
cat pre*.file > merged.file
or
cat pre?.file > merged.file
Also, you can use brace expansion for a list of prefixes:
cat {pre1,pre2,pre3,pre4,pre5}.file > merged.file
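or, to build the brace list from the array (eval is needed because brace expansion happens before variable expansion, and IFS=, makes ${prefixes[*]} join the elements with commas):
(IFS=,; eval "cat {${prefixes[*]}}.file > merged.file")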
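If you would rather avoid eval, one more option is to append the suffix to every array element with parameter expansion. This is only a sketch: the scratch directory and the file contents below are made up so the example can run on its own.

```shell
#!/bin/bash
# Hypothetical scratch directory and file contents, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

prefixes=( pre1 pre2 pre3 pre4 pre5 )
for p in "${prefixes[@]}"; do
    printf '%s\n' "contents of $p" > "$p.file"
done

# ${array[@]/%/suffix} appends the suffix to every element,
# so cat receives pre1.file pre2.file ... as separate arguments.
cat "${prefixes[@]/%/.file}" > merged.file
```

Because the expansion is quoted, prefixes containing spaces stay intact as single arguments.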
This is probably a very simple question. I looked at other answers but couldn't come up with a solution. I have a date file with 365 lines, like this:
01-01-2000
02-01-2000
I need to read this file line by line and assign each day to a separate variable, like this:
d001=01-01-2000
d002=02-01-2000
I tried while read commands but couldn't get them to work. Assigning the variables one by one takes a lot of time. How can I do it quickly?
Creating a separately named variable for each line is a waste of time and not really supported. Better to use an associative array:
#!/bin/bash
declare -A array
while read -r line; do
printf -v key 'd%03d' $((++c))
array[$key]=$line
done < file
Output
for i in "${!array[@]}"; do echo "key=$i value=${array[$i]}"; done
key=d001 value=01-01-2000
key=d002 value=02-01-2000
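One thing to keep in mind is that bash does not guarantee any particular order when you iterate over the keys of an associative array. A small sketch of looking up a single day and walking the days in file order by rebuilding the keys from the counter follows. The temporary file and its three lines are stand-ins for the real date file.

```shell
#!/bin/bash
# Hypothetical sample file standing in for the 365-line date file.
file=$(mktemp)
printf '%s\n' 01-01-2000 02-01-2000 03-01-2000 > "$file"

declare -A array
c=0
while IFS= read -r line; do
    printf -v key 'd%03d' "$((++c))"
    array[$key]=$line
done < "$file"

# Look up one day directly by its key.
echo "${array[d002]}"

# Rebuild the keys from the counter to walk the days in file order.
for (( i = 1; i <= c; i++ )); do
    printf -v key 'd%03d' "$i"
    echo "$key=${array[$key]}"
done
```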
Assumptions:
an array is acceptable
array index should start with 1
Sample input:
$ cat sample.dat
01-01-2000
02-01-2000
03-01-2000
04-01-2000
05-01-2000
One bash/mapfile option:
unset d # make sure variable is not currently in use
mapfile -t -O1 d < sample.dat # load each line from file into separate array location
This generates:
$ typeset -p d
declare -a d=([1]="01-01-2000" [2]="02-01-2000" [3]="03-01-2000" [4]="04-01-2000" [5]="05-01-2000")
$ for i in "${!d[@]}"; do echo "d[$i] = ${d[i]}"; done
d[1] = 01-01-2000
d[2] = 02-01-2000
d[3] = 03-01-2000
d[4] = 04-01-2000
d[5] = 05-01-2000
In OP's code, references to $d001 now become ${d[1]}.
A quick one-liner would be:
eval $(awk 'BEGIN{cnt=0}{printf "d%3.3d=\"%s\"\n",cnt,$0; cnt++}' your_file)
eval makes the shell variables known inside your script or shell. Note that the counter starts at 0, so the first variable is d000 (use echo $d000 to show it); initialize cnt=1 in the BEGIN block if you want d001 to be the first. There should be no shell special characters (like * and $) inside your_file. Remove eval $() to see the output of the awk command. The \" quoting around %s allows spaces in the variable values. If your_file contains no spaces, you can remove the \" before and after %s.
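If the special-character restriction is a concern, the same named variables can probably be created without eval by letting printf -v assign to a variable whose name is computed at run time. This is a sketch under assumptions: the temporary file stands in for your_file, and the numbering here starts at d001 rather than d000.

```shell
#!/bin/bash
# Hypothetical stand-in for your_file.
file=$(mktemp)
printf '%s\n' 01-01-2000 02-01-2000 > "$file"

n=0
while IFS= read -r line; do
    printf -v name 'd%03d' "$((++n))"   # builds d001, d002, ...
    printf -v "$name" '%s' "$line"      # assigns to the variable named by $name
done < "$file"

echo "$d001 $d002"
```

The line content is only ever passed as data to printf, so characters like * and $ in the file are not interpreted by the shell.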
Say I have a string in bash -
NAMES="file1 file2 file3"
How do I map it to the following string which I will then use as part of a command?
MAPPED="-i file1.txt -i file2.txt -i file3.txt"
For an example of exactly what I mean, here's the equivalent python code -
names = "file1 file2 file3"
mapped = ' '.join("-i " + x + ".txt" for x in names.split())
You should use arrays instead of strings:
names=(file1 file2 file3)
# Add suffix
names=("${names[@]/%/.txt}")
# Build new array with "-i" elements
for name in "${names[@]}"; do
    mapped+=(-i "$name")
done
# Show result
declare -p mapped
resulting in this output:
declare -a mapped=([0]="-i" [1]="file1.txt" [2]="-i" [3]="file2.txt" [4]="-i" [5]="file3.txt")
This can now be used in commands like this:
cmd "${mapped[@]}"
See BashFAQ/050 regarding the rationale behind putting commands into strings vs. arrays.
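If the names really do arrive as a single space-separated string, as in the question, a possible way to get from there to the array is to split the string with read -a first. In this sketch show_args is a made-up function standing in for the real command. It only prints each argument it receives in angle brackets.

```shell
#!/bin/bash
NAMES="file1 file2 file3"

# Split the string on whitespace into an array.
read -r -a names <<< "$NAMES"

# Build the argument list: one "-i" plus one file name per entry.
mapped=()
for name in "${names[@]}"; do
    mapped+=( -i "$name.txt" )
done

# show_args is a hypothetical stand-in for the real command.
show_args() { printf '<%s> ' "$@"; echo; }
show_args "${mapped[@]}"
```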
The use case is, in my case, CSS file concatenation, before it gets minimized. To concat two CSS files:
cat 1.css 2.css > out.css
To add some text at one single position, I can do
cat 1.css - <<SOMESTUFF 2.css > out.css
This will end in the middle.
SOMESTUFF
To add STDOUT from one other program:
sed 's/foo/bar/g' 3.css | cat 1.css - 2.css > out.css
So far so good. But I regularly run into situations where I need to mix several strings, files and even program output together, like copyright headers, files preprocessed by sed(1) and so on. I'd like to concatenate them in as few steps and with as few temporary files as possible, while keeping the freedom to choose the order.
In short, I'm looking for a way to do this in as few steps as possible in Bash:
command [string|file|output]+ > concatenated
# note the plus ;-) --------^
(Basically, having a cat to handle multiple STDINs would be sufficient, I guess, like
<(echo "FOO") <(sed ...) <(echo "BAR") cat 1.css -echo1- -sed- 2.css -echo2-
But I fail to see how I can access those.)
This works:
cat 1.css <(echo "FOO") <(sed ...) 2.css <(echo "BAR")
You can do the following (note that echo joins the outputs with single spaces rather than newlines):
echo "$(command 1)" "$(command 2)" ... "$(command n)" > outputFile
You can add all the commands in a subshell, which is redirected to a file:
(
cat 1.css
echo "FOO"
sed ...
echo BAR
cat 2.css
) > output
You can also append to a file with >>. For example:
cat 1.css > output
echo "FOO" >> output
sed ... >> output
echo "BAR" >> output
cat 2.css >> output
(This opens and closes the file once per command.)
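A variation on the subshell approach is a brace group, which runs the commands in the current shell and opens the output file only once. The sketch below is self-contained under assumptions: the scratch directory, the CSS contents and the header text are invented for the example.

```shell
#!/bin/bash
# Hypothetical scratch directory and sample CSS files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a { color: foo; }\n'  > 1.css
printf 'b { color: blue; }\n' > 2.css
printf 'c { color: foo; }\n'  > 3.css

# Everything inside the braces writes to the same open file, in order.
{
    echo "/* copyright header */"
    cat 1.css
    sed 's/foo/bar/g' 3.css
    echo "/* FOO */"
    cat 2.css
} > out.css
```

Strings, files and program output can be listed in any order inside the group, and no temporary files are needed.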
I have a text file containing variables like this:
lane1_pair1="file1"
lane1_pair2="file2"
lane2_pair1="file3"
lane2_pair2="file4"
...
I'd like to loop through the variables and concatenate all of them in a single file. I am applying the loop as:
. variables
for (( n=1; n<=no_lanes; n++ )) {
cat $"lane"${n}_pair1 >> "$sampleID"_cat1.fq
cat $"lane"${n}_pair2 >> "$sampleID"_cat2.fq
}
"$sampleID"_cat1.fq > fq_align_1
"$sampleID"_cat2.fq > fq_align_2
The problem here is that the cat command does not work: instead of replacing "laneX_pairY" with its value, the shell treats it as a literal string. Does anyone have an idea how to fix this?
As I see it, the problem is you're looping through variables that are named like an array but aren't one. Then you're in the position of trying to evaluate the contents of the variable name created from your other indexing variable, which is a recipe for messy code that breaks easily. I recommend making actual arrays of file names from your variables file:
p1_files=($(grep 'pair1' variables | cut -d '=' -f 2))
p2_files=($(grep 'pair2' variables | cut -d '=' -f 2))
for f in "${p1_files[@]}"; do
    cat "${f//\"/}" >> "$sampleID"_cat1.fq
done
for f in "${p2_files[@]}"; do
    cat "${f//\"/}" >> "$sampleID"_cat2.fq
done
I'm not sure how you want your fq_align_n variables to be, but you can read the file contents to variables using:
fq_align_1=$(cat "$sampleID"_cat1.fq)
fq_align_2=$(cat "$sampleID"_cat2.fq)
Or just skip the file creation altogether and build the variables incrementally in the loops.
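If the variables file has to stay as it is, bash's indirect expansion ${!name} is another option: build the variable name as a string, then expand the variable it names. This is only a sketch. The scratch directory, the file contents, and the values of no_lanes and sampleID are assumptions, and the lane variables are set inline here instead of being sourced from the variables file.

```shell
#!/bin/bash
# Hypothetical scratch directory and sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'r1\n' > file1; printf 'r2\n' > file2
printf 'r3\n' > file3; printf 'r4\n' > file4

# Stand-ins for the values normally sourced from the variables file.
lane1_pair1="file1"; lane1_pair2="file2"
lane2_pair1="file3"; lane2_pair2="file4"
no_lanes=2
sampleID=sample

for (( n = 1; n <= no_lanes; n++ )); do
    v1="lane${n}_pair1"; v2="lane${n}_pair2"
    # ${!v1} expands to the value of the variable whose name is stored in v1.
    cat "${!v1}" >> "${sampleID}_cat1.fq"
    cat "${!v2}" >> "${sampleID}_cat2.fq"
done
```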