I have a tool to compare alignments. The command is
./mscore -cftit <inputFile> <inputFile2>
and the result is four lines in the console.
What am I trying to do?
I have hundreds of files:
file1.afa
resFile1.fasta
file2.afa
resFile2.fasta
file3.afa
resFile3.fasta
...
I need to pass the "X.afa" file as the first argument and the matching "resFileX.fasta" file as the second. The result should be saved as an Xfinal.txt file or something like that.
So I should end up with hundreds of *final.txt files (afterwards I will cat them in Python and count).
How can I do that in bash?
I have tried something like this:
#!/bin/bash
ls *.afa | parallel "mscore -cftit {} {}.fasta >{}final.txt"
but of course, it didn't work. Bash is not familiar to me; I just need to get some biological results fast and want to automate my job. I will learn it properly, but right now I need it quickly.
Can someone help me?
So I think you want a sequence of commands like this:
mscore -cftit file1.afa resFile1.fasta > File1final.txt
mscore -cftit file2.afa resFile2.fasta > File2final.txt
mscore -cftit file3.afa resFile3.fasta > File3final.txt
...
Try this:
ls *.afa | while read A; do B=`echo $A | sed 's/^file/resFile/;s/.afa$//'`; ./mscore -cftit $A $B.fasta > ${B}final.txt; done
A better version (thanks @tripleee):
for afafile in *.afa; do number="${afafile#file}"; number="${number%.afa}"; ./mscore -cftit "$afafile" "resFile${number}.fasta" > "file${number}final.txt"; done
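As a quick sanity check, here is a dry run of that loop over a couple of invented sample files; echo stands in for ./mscore, which isn't available outside your setup:

```shell
# Invented sample pairs, just for the demo.
touch file1.afa file2.afa resFile1.fasta resFile2.fasta
for afafile in *.afa; do
    number="${afafile#file}"    # strip the leading "file"
    number="${number%.afa}"     # strip the trailing ".afa"
    # Print the command instead of running it (the '>' is quoted so it is shown, not used).
    echo ./mscore -cftit "$afafile" "resFile${number}.fasta" '>' "file${number}final.txt"
done
```

Once the printed commands look right, drop the echo (and the quotes around the '>') to actually run the tool.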
Related
I am trying to provide a file to my shell script as input; the script should test whether the file contains a specific word and decide which command to execute. I have not yet figured out where the mistake might lie. Here is the shell script that I wrote:
#!/bin/(shell)
input_file="$1"
output_file="$2"
grep "val1" | awk -f ./path/to/script.awk $input_file > $output_file
grep "val2" | sh ./path/to/script.sh $input_file > $output_file
When I input the file that triggers the awk command, everything gets executed as expected, but for the second command I don't even get an output file. Any help is much appreciated.
Cheers,
You haven't specified this in your question, but I'm guessing you have a file with the keyword, e.g. file cmdfile that contains x-g301. And then you run your script like:
./script "input_file" "output_file" < cmdfile
If so, the first grep command will consume the whole cmdfile on stdin while searching for the first pattern, and nothing will be left for the second grep. That's why the second grep, and then your second script, produces no output.
There are many ways to fix this, but choosing the right one depends on what exactly you are trying to do and what that cmdfile looks like. Assuming it is a larger file containing other things besides the command pattern, you could pass it as a third argument to your script, like this:
./script "input_file" "output_file" "cmdfile"
And have your script handle it like this:
#!/bin/bash
input_file="$1"
output_file="$2"
cmdfile="$3"
if grep -q "X-G303" "$cmdfile"; then
awk -f ./mno/script.awk "$input_file" > t1.json
fi
if grep -q "x-g301" "$cmdfile"; then
sh ./mno/tm.sh "$input_file" > t2.json
fi
Here I'm also assuming that your awk and sh scripts don't really need the output from grep, since you're giving them the name of the input file.
Note that the proper way to use grep for an existence check is via its exit code (with the output muted by -q). Instead of the if we could have used short-circuiting (grep ... && awk ...), but this way is probably more readable.
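A minimal sketch of that exit-code dispatch, using a throwaway cmdfile; "val1"/"val2" stand in for the real keywords, and echo stands in for the awk/sh scripts:

```shell
# Throwaway cmdfile containing only the second keyword.
printf 'header line\nsome text with val2 inside\n' > cmdfile
if grep -q "val1" cmdfile; then
    echo "would run the awk branch"
fi
if grep -q "val2" cmdfile; then
    echo "would run the sh branch"
fi
```

Because each grep reads the named file rather than stdin, both checks see the whole file and only the matching branch fires.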
I have a file with lots of rubbish inside, all of it on one line.
But there are often things like:
"var":"value"
Before and after it there are different characters ...
What I want to do is extract only the above-mentioned pattern and put each match on its own line. Any ideas how I could do that in shell scripting?
Best regards,
Alex
I believe
grep -o '"[^"]*":"[^"]*"' yourFile.txt > yourOutput.txt
would do the trick:
> echo 'xxx "a":"b" yyy"x":"y"' | grep -o '"[^"]*":"[^"]*"'
"a":"b"
"x":"y"
In bash I would like to extract part of many filenames and save that output to another file.
The files are formatted as coffee_{SOME NUMBERS I WANT}.freqdist.
#!/bin/sh
for f in $(find . -name 'coffee*.freqdist')
That code finds all the coffee_{SOME NUMBERS I WANT}.freqdist files. Now, how do I make an array containing just {SOME NUMBERS I WANT} and write that to a file?
I know that to write to file one would end the line with the following.
> log.txt
I'm missing the middle part though of how to filter the list of filenames.
You can do it natively in bash as follows:
filename=coffee_1234.freqdist
tmp=${filename#*_}
num=${tmp%.*}
echo "$num"
This is a pure bash solution. No external commands (like sed) are involved, so this is faster.
Append these numbers to a file using:
echo "$num" >> file
(You will need to delete/clear the file before you start your loop.)
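Putting the pieces together, the whole loop might look like this (the two sample files are invented for the demo):

```shell
touch coffee_1234.freqdist coffee_42.freqdist  # invented sample inputs
> log.txt                                      # clear/create the output file first
for filename in coffee_*.freqdist; do
    tmp=${filename#*_}    # strip everything up to and including the underscore
    num=${tmp%.*}         # strip the trailing extension
    echo "$num" >> log.txt
done
```

After the loop, log.txt holds one number per file.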
If the intention is just to write the numbers to a file, you do not need the find command:
ls coffee*.freqdist
coffee112.freqdist coffee12.freqdist coffee234.freqdist
The below should do it which can then be re-directed to a file:
$ ls coffee*.freqdist | sed 's/coffee\(.*\)\.freqdist/\1/'
112
12
234
Guru.
The previous answers have indicated some necessary techniques. This answer organizes the pipeline in a simple way that might apply to other jobs as well. (If your sed doesn't support ‘;’ as a separator, replace ‘;’ with ‘|sed’.)
$ ls */c*; ls c*
fee/coffee_2343.freqdist
coffee_18z8.x.freqdist coffee_512.freqdist coffee_707.freqdist
$ find . -name 'coffee*.freqdist' | sed 's/.*coffee_//; s/[.].*//' > outfile
$ cat outfile
512
18z8
2343
707
In order to simplify my work I usually do this:
for FILE in ./*.txt;
do ID=`echo ${FILE} | sed 's/^.*\///'`;
bin/Tool ${FILE} > ${ID}_output.txt;
done
Hence the process loops over all *.txt files.
Now I have two file groups; my Tool takes two inputs (-a & -b). Is there any command to run Tool for every FILE_A over every FILE_B and name the output file as a combination of both of them?
I imagine it should look like something like this:
for FILE_A in ./filesA/*.txt;
do for FILE_B in ./filesB/*.txt;
do bin/Tool -a ${FILE_A} -b ${FILE_B} > output.txt;
done;
done
So the process would run every *.txt in filesA against every *.txt in filesB.
And there is also the naming issue, which I don't even know where to fit in...
I hope it is clear what I am asking. I have never had to do such a task before, and a command line would be really helpful.
Looking forward!
NEWNAME="${FILE_A##*/}_${FILE_B##*/}_output.txt"
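Dropped into the nested loop, that expansion strips the leading directories and joins both basenames. A dry-run sketch, with invented sample files and echo standing in for bin/Tool:

```shell
mkdir -p filesA filesB
touch filesA/a1.txt filesB/b1.txt  # invented sample inputs
for FILE_A in ./filesA/*.txt; do
    for FILE_B in ./filesB/*.txt; do
        # ${VAR##*/} removes everything up to the last slash, i.e. the directory part.
        NEWNAME="${FILE_A##*/}_${FILE_B##*/}_output.txt"
        echo "bin/Tool -a $FILE_A -b $FILE_B > $NEWNAME"
    done
done
```

Replace the echo with the real redirection (bin/Tool ... > "$NEWNAME") once the printed names look right.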
How do I get make to generate a list of prerequisites in numerical order? I want to do something like this:
cat file1 file2 file10 file11 file21 | script
When I try this in a bash shell and in the Makefile:
cat file* | script
the files are used in the order file10 file11 file1 file21 file2
which is not what I want. In bash I can force the issue like this:
cat file{[0-9],[0-9][0-9]}
and similar tricks. But I do not know how to get make to recognize these wildcard options and I get many errors when I try to put variations of this in my Makefile:
target: file{[0-9],[0-9][0-9]}
cat $^ | script
Obviously I do not want to list each file individually, I have shown only a few as an example, there are hundreds, and yes, there are gaps. So how do I get a nice make recipe that will use my files nicely in numerical order? (I could rename all the files from [digit] to 0[digit], but that seems like wimping out and avoiding the issue!)
This might work for you:
# mkdir test
# cd test
# touch test{a,b}{9..11}.txt
# echo *
testa10.txt testa11.txt testa9.txt testb10.txt testb11.txt testb9.txt
# echo $(ls -v *)
testa9.txt testa10.txt testa11.txt testb9.txt testb10.txt testb11.txt
# echo testa*
testa10.txt testa11.txt testa9.txt
# echo $(ls -v testa*)
testa9.txt testa10.txt testa11.txt
One option is to use
SHELL=/bin/bash
inside your makefile. Other options seem a little too complicated to me ($(sort ...) doesn't handle numeric comparisons).
Not the most elegant solution, but using:
cat file? file?? file??? | script # add more file?'s as necessary
will achieve the effect you're looking for.
(And as a bonus it makes it look like your command is getting increasingly desperate for input.)
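A quick demo of that ordering, in a scratch directory with invented file names:

```shell
mkdir -p pattern-demo && cd pattern-demo
touch file1 file2 file10 file11 file21
# Each pattern expands in sorted order, and file? is expanded before file??,
# so one-digit names come before two-digit names.
echo file? file??
# -> file1 file2 file10 file11 file21
```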