loop through lines in each file in a folder - nested loop - shell

I don't know why this is not working:
for g in *.txt; do for f in $(cat $g); do grep $f annotations.csv; done > ../$f_annot; done
I want to loop through each file in a folder and, for each file, loop through each line and apply the grep command. When I do
for f in $(cat file1.txt); do grep $f annotations.csv; done > ../$f_annot
It works; it is the nested loop that doesn't output anything. It seems like it is running, but it takes forever and does nothing.

When you have an empty txt file, grep $f annotations.csv will be translated into a grep command reading from stdin.
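To see that pitfall in isolation (a quick sketch): when $f expands to nothing, the unquoted expansion disappears, the filename becomes the pattern, and grep waits on stdin.
f=""
grep $f annotations.csv   # runs as: grep annotations.csv -- blocks reading stdin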
You might want to use something like
for g in *.txt; do
grep -f "$g" annotations.csv > "../${g}_annot"
done
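If the lines in the .txt files are literal IDs rather than regular expressions (an assumption about your data), adding -F keeps grep from interpreting them as patterns; -w additionally restricts matches to whole words:
for g in *.txt; do
# -F: treat each line of $g as a fixed string, not a regex
grep -Ff "$g" annotations.csv > "../${g}_annot"
done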

SOLVED:
for file in *list.txt; do
while read -r line; do
grep "$line" annotations.csv
done < "$file" > "${file}_annot.txt"
done
:)

I get
Cannot write to a directory.
ksh: ../: 0403-005 Cannot create the specified file.
But this is because $f_annot does not evaluate to what we expect: the shell reads it as a single variable named f_annot, which is unset, so the redirection target is just ../. It works better with ${f}_annot:
for g in *.txt; do for f in $(cat $g); do grep $f annotations.csv; done > ../${f}_annot ; done
But there is still an issue in your script, because it erases the results of some loops: when the redirection is opened, $f still holds the last value from the previous file's loop, so outputs land in the wrong files and overwrite each other. Maybe this suits your need better:
for g in *.txt; do for f in $(cat $g); do grep $f annotations.csv; done ; done
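If you want one output file per input file, a sketch of what I assume you intended: redirect once per outer iteration and name the target after $g, which is stable, rather than $f:
for g in *.txt; do
# $g names the output, so nothing is overwritten between files
for f in $(cat "$g"); do grep "$f" annotations.csv; done > "../${g}_annot"
done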

Related

How to iterate two variables in bash script?

I have these kinds of files:
file6543_015.bam
subreadset_15.xml
file6543_024.bam
subreadset_24.xml
file6543_027.bam
subreadset_27.xml
I would like to run something like this:
for i in *bam && l in *xml
do
my_script $i $l > output_file
done
Because in my command the first bam file goes with the first xml file; each bam/xml combination produces its own specific output file.
Like this, using bash arrays:
bam=( *.bam )
xml=( *.xml )
for ((i=0; i<${#bam[@]}; i++)); do
my_script "${bam[i]}" "${xml[i]}"
done
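One caveat: pairing by index assumes both globs expand to the same number of files in the same sort order. A small sanity check (a sketch) catches silent misalignment:
# bail out if the two globs matched different numbers of files
if (( ${#bam[@]} != ${#xml[@]} )); then
echo "bam/xml counts differ" >&2
exit 1
fi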
Assuming you have a way to uniquely name your output_file for each specific output,
here is one way:
#!/bin/bash
ls file*.bam | while read -r i
do
# derive the matching xml name: file6543_015.bam -> subreadset_15.xml
xml=`echo "$i" | sed -e 's/file.*_0/subreadset_/' -e 's/\.bam$/.xml/'`
my_script "$i" "$xml" >> output_file
done
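Under the same naming assumption (file<ID>_0<NUM>.bam pairs with subreadset_<NUM>.xml), plain parameter expansion avoids ls and sed entirely; a sketch:
for i in file*.bam; do
num=${i##*_0}      # file6543_015.bam -> 15.bam
num=${num%.bam}    # 15.bam -> 15
my_script "$i" "subreadset_${num}.xml" >> output_file
done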

Extract a line from a text file using grep?

I have a text file called log.txt, and it logs the file name and the path it was gotten from, so something like this:
2.txt
/home/test/etc/2.txt
basically the file name and its previous location. I want to use grep to grab the file's directory, save it as a variable, and move the file back to its original location.
for var in "$@"
do
if grep "$var" log.txt
then
: # code if found
else
: # code if not found
fi
done
This just prints 2.txt and its directory to the console, since the directory line also contains 2.txt.
Thanks.
Maybe flip the logic to make it more efficient?
f=''
while read -r prev
do case "$prev" in
*/*) [[ -e "$f" ]] && mv "$f" "$prev";; # path line: move the remembered file back
*) f="$prev";; # bare file name: remember it
esac
done < log.txt
That walks through all the entries in the log and, if the files exist locally, moves them back. It should be functionally the same without running a grep per file.
If the name is always just the basename of the path, why save it in the log at all? If it is, then
while read -r prev
do f="${prev##*/}" # strip the path info
[[ -e "$f" ]] && mv "$f" "$prev"
done < <( grep / log.txt )
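The ${prev##*/} expansion strips the longest prefix matching */, i.e. everything up to and including the last slash:
prev=/home/test/etc/2.txt
echo "${prev##*/}"   # prints 2.txt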
Having the file names on the same line would significantly simplify your script. But maybe try something like
# Convert from command-line arguments to lines
printf '%s\n' "$@" |
# Pair up with entries in file
awk 'NR==FNR { f[$0]; next }
FNR%2 { if ($0 in f) p=$0; else p=""; next }
p { print "mv \"" p "\" \"" $0 "\"" }' - log.txt |
sh
Test it by replacing sh with cat and see what you get. If it looks correct, switch back.
Briefly, something similar could perhaps be pulled off with printf '%s\n' "$@" | grep -A 1 -Fxf - log.txt, but you end up having to parse the output to pair up the lines anyway.
Another solution:
for f in `grep -v "/" log.txt`; do
grep "/$f" log.txt | xargs -I{} cp $f {}
done
grep -q (for "quiet") suppresses the output; since you only need the exit status in your if test, the matching lines are not printed.
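Applied to the if test from the question, it would look like this (a sketch):
for var in "$@"
do
if grep -q "$var" log.txt
then
echo "$var is in the log"   # grep itself printed nothing
fi
done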

cat multiple files based on ID in filename

I would like to combine files that share the same ID before the first underscore into one file using cat. How do I do this for multiple files like the ones below?
I thought of something like this:
for f in *.R1.fastq.gz; do cat "$f" > "${f%}.fastq.gz"; done
in
9989_L004_R1.fastq.gz
9989_L005_R1.fastq.gz
9989_L009_R1.fastq.gz
9873_L008_R1.fastq.gz
9873_L005_R1.fastq.gz
9873_L001_R1.fastq.gz
out
9989.fastq.gz
9873.fastq.gz
for f in *_R1.fastq.gz; do cat "$f" >> "${f%%_*}.fastq.gz"; done
>> for appending,
${f%%_*} removes the longest suffix in $f matching _*.
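For example:
f=9989_L004_R1.fastq.gz
echo "${f%%_*}"   # prints 9989: everything from the first _ onward is stripped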
Here is another way:
for f in *_R1.fastq.gz; do
[[ -f "${f%%_*}.fastq.gz" ]] || cat ${f%%_*}_*_R1.fastq.gz > "${f%%_*}.fastq.gz"
done
or if you want to have it a bit more readable:
for f in *_R1.fastq.gz; do
key="${f%%_*}"
[[ -f "${key}.fastq.gz" ]] || cat ${key}_*_R1.fastq.gz > "${key}.fastq.gz"
done

How to continue in loop on first GREP result

I have the following .sh file, which searches for tons of search items in tons of files. But I want to move on at the first result, as soon as one of the search items is present in $file. Currently a query is matched against all files; the first hit is enough.
How can I do this?
while read file
do
echo $file
grep -o -f searchItems.txt "$file" >> results.txt
done < filelist.txt
Thanks.
You can use break after a successful grep return:
while read -r key; do
while read -r actualFile; do
echo "searching for $key in $actualFile"
grep -o "$key" "$actualFile" >> messageKeysInUse.txt && break
done < filelist.txt
done < allMessageKeysFromDB.txt
It will break out of the inner while loop as soon as a grep succeeds.
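If you do not need the matched text itself, a variant with grep -q (a sketch; it records the key rather than the match) also stops scanning each file at the first match:
while read -r key; do
while read -r actualFile; do
if grep -q "$key" "$actualFile"; then
echo "$key" >> messageKeysInUse.txt
break
fi
done < filelist.txt
done < allMessageKeysFromDB.txt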

Nested for loop comparing files

I am trying to write a bash script that looks at two files with the same name, each in a different directory.
I know this can be done with diff -r; however, I would like to take everything that is in the second file that is not in the first file and output it into a new file (also with the same file name).
I have written a (nested) loop with a grep command but it's not good and gives back a syntax error:
#!/bin/bash
FILES=/Path/dir1/*
FILES2=/Path/dir2/*
for f in $FILES
do
for i in $FILES2
do
if $f = $i
grep -vf $i $f > /Path/dir3/$i
done
done
Any help much appreciated.
try this
#!/bin/bash
cd /Path/dir1/
for f in *; do
comm -13 <(sort $f) <(sort /Path/dir2/$f) > /Path/dir3/$f
done
The if syntax in shell is
if test_command; then commands; fi
The commands are executed if test_command's exit code is 0:
if [ "$f" = "$i" ]; then grep ... ; fi
but in your case it will be more efficient to derive the matching file name instead of nesting loops:
for i in $FILES; do
f=/Path/dir2/`basename $i`
grep -vf "$f" "$i" > /Path/dir3/`basename $i`
done
finally, maybe this will be more efficient than grep -v:
comm -13 <(sort $f) <(sort $i)
comm -13 prints everything that is in the second file and not in the first; comm without options generates 3 columns of output: the first is what is only in the first file, the second what is only in the second, and the third what is common.
-13 (or -1 -3) suppresses the first and third columns.
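A quick illustration (remember that comm needs sorted input):
printf 'a\nb\nc\n' > one
printf 'b\nc\nd\n' > two
comm -13 one two    # prints: d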
#!/bin/bash
DIR1=/Path/dir1
DIR2=/Path/dir2
DIR3=/Path/dir3
for f in $DIR1/*
do
for i in $DIR2/*
do
if [ "$(basename $f)" = "$(basename $i)" ]
then
grep -vf "$i" "$f" > "$DIR3/$(basename $i)"
fi
done
done
This assumes no special characters (e.g. whitespace) in the filenames; add double quotes if that is not acceptable:
a=/path/dir1
b=/path/dir2
for i in $a/*; do
test -e $b/${i##*/} &&
diff $i $b/${i##*/} | sed -n '/^> /s///p'
done
