Loop through all files in a directory in bash

I know this question has been asked before. However, none of the examples seems to work for me. Hence:
I've got files all of the type
XXXX1.txt
XXXX2.txt
XXXX3.txt
...
XXXX9.txt in the same directory.
Now I want to grep a certain value from each file using bash:
for file in *$i.txt
do
for ((i = 1; i < 10; i++))
do
grep "energy" *.txt >> energies.txt
done
done
The code and grep run without any errors, but I always get the result from the first iteration, i.e. the first energy printed 9 times. It never looks at the other files. I know these kinds of loops are tricky in bash. However, please don't suggest moving all the files into separate directories and looping through them.

You don't need such complicated (and wrong) nested loops.
Try this single command:
grep -h "energy" *{1..9}.txt >> energies.txt

If you just need the content of the file(s):
grep -h "energy" *[1-9].txt >> energies.txt
If you need the file name along with the content:
grep -H "energy" *[1-9].txt >> energies.txt

Related

Changing file path outputs within a loop in a shell script

I want to make a loop to run over multiple input files and produce one output file per input file.
I can use this command to make one output BAM file from one input SAM file:
samtools view -S -b -h $input_file > $output_file
where:
input_file="/scratch/RNAseq/hisat2_alignment/456.sam"
output_file="/scratch/RNAseq/BAM_files/raw_BAM/456.bam"
When turning this command into a loop, I am unsure what to do with the $output_file equivalent, because I don't know how to make the file path and file extension changes required for the $many_output_file variable:
many_input_files="/scratch/RNAseq/hisat2_alignment/*.sam"
for i in $many_input_files
do
samtools view -S -b -h $i > $many_output_file
done
Can anyone help, please? I am new to Bash; I usually use R. I have tried using sed and tr, but they produce errors when I try to build the many_output_file file list from many_input_files.
This is how I made the loop work, thanks to the help in the comments:
for i in $input_files
do
tmp=${i/hisat2_alignment/BAM_files/raw_BAM}
samtools view -S -b -h $i > ${tmp/.sam/.bam}
done
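A slightly more defensive sketch of the same idea (same directory layout as above; the quotes guard against spaces in paths, and ${tmp%.sam}.bam anchors the extension swap to the end of the name):
for i in /scratch/RNAseq/hisat2_alignment/*.sam
do
    # swap the directory component, then replace the trailing .sam with .bam
    tmp=${i/hisat2_alignment/BAM_files/raw_BAM}
    samtools view -S -b -h "$i" > "${tmp%.sam}.bam"
done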

Output into a new column of a .CSV in Shell

I am still new to shell. In JavaScript it is super easy to write output into a new column; all you need is a ,. But I am still struggling to do the same in shell. I've been through most of the answers on Stack Overflow and still couldn't get it to work. Most of the answers are about cutting from an existing file and pasting into a new one, etc. I'm pretty sure I am making a simple syntax error somewhere.
At the moment I have this:
echo "Mq1:" >> ~/Desktop/howmanySKUs.csv
cd /Volumes/Hams\ Hall\ Workspace/Mannequin_1_WIP && ls |grep \_01.tif$ | wc -l | sed "s/,//" >> ~/Desktop/howmanySKUs.csv
It counts the number of files in the specified directory.
I get this:
But now I am trying to output Mq1: in one column and the count of found files in the second column.
Desired Output:
Any help would be much appreciated.
You can write both parts on the same line directly:
cd /Volumes/Hams\ Hall\ Workspace/Mannequin_1_WIP && echo "Mq1:,"`ls |grep \_01.tif$ | wc -l` > ~/Desktop/howmanySKUs.csv
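The same thing with $(...) instead of backticks, which is a bit easier to read and quote (paths unchanged from the question):
cd /Volumes/Hams\ Hall\ Workspace/Mannequin_1_WIP && echo "Mq1:,"$(ls | grep '_01.tif$' | wc -l) > ~/Desktop/howmanySKUs.csv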

How do I write a bash script to copy files into a new folder based on name?

I have a folder filled with ~300 files, named in the form username#mail.com.pdf. I need about 40 of them, and I have a list of usernames (saved in a file called names.txt), one username per line. I would like to copy the files I need into a new folder that contains only those files.
Where the file names.txt has as its first line the username only (e.g., eternalmothra), the PDF file I want to copy over is named eternalmothra#mail.com.pdf.
while read p; do
ls | grep $p > file_names.txt
done <names.txt
This seems like it should read from the list and, for each line, turn username into username#mail.com.pdf. Unfortunately, only the last one ends up saved in file_names.txt.
The second part of this is to copy all the files over:
while read p; do
mv $p foldername
done <file_names.txt
(I haven't tried that second part yet because the first part isn't working).
I'm doing all this with Cygwin, by the way.
1) What is wrong with the first script that it won't copy everything over?
2) If I get that to work, will the second script correctly copy them over? (Actually, I think it's preferable if they just get copied, not moved over).
Edit:
I would like to add that I figured out how to read lines from a txt file from here: Looping through content of a file in bash
Solution from a comment: Your problem is just that echo a > b overwrites the file, while echo a >> b appends to it, so replace
ls | grep $p > file_names.txt
with
ls | grep $p >> file_names.txt
There might be more efficient solutions if the task runs every day, but for a one-shot of 300 files your script is fine.
Assuming you don't have file names with newlines in them (in which case your original approach would not have a chance of working anyway), try this.
printf '%s\n' * | grep -f names.txt | xargs cp -t foldername
The printf is necessary to work around the various issues with ls; passing the list of all the file names to grep in one go produces a list of all the matches, one per line; and passing that to xargs cp performs the copying. (To move instead of copy, use mv instead of cp, obviously; both support the -t option so as to make it convenient to run them under xargs.) The function of xargs is to convert standard input into arguments to the program you run as the argument to xargs.
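A quick illustration of the middle step (the second username here is invented): each line of names.txt acts as a pattern for grep -f, so only the matching PDF names survive and get handed to cp:
$ cat names.txt
eternalmothra
otheruser
$ printf '%s\n' * | grep -f names.txt
eternalmothra#mail.com.pdf
otheruser#mail.com.pdf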

Why is sort -k not working all the time?

I now have a script that puts a list of files into two separate arrays:
First, I get a file list from a ZIP file and fill FIRST_Array() with it. Second, I get a file list from a control file within the ZIP file and fill SECOND_Array() with it.
while read length date time filename
do
FIRST_Array+=( "$filename" )
echo "$filename" >> FIRST.report.out
done < <(/usr/bin/unzip -qql AAA.ZIP |sort -g -k12 -t~)
Third, I compare both arrays like so:
diff -q <(printf "%s\n" "${FIRST_Array[@]}") <(printf "%s\n" "${SECOND_Array[@]}") | wc -l
I can tell that Diff fails because I output each array to files: FIRST.report.out and SECOND.report.out are simply not sorted properly.
1) FIRST.report.out (what's inside the ZIP file)
JGS-Memphis~AT1~Pre-Test~X-BanhT~JGMDTV387~6~P~1100~HR24-500~033072053326~20120808~240914.XML
JGS-Memphis~PRE~DTV_PREP~X-GuinE~JGMDTV069~6~P~1100~H24-700~033081107519~20120808~240914.XML
JGS-Memphis~PRE~DTV_PREP~X-MooreBe~JGM98745~40~P~1100~H21-200~029264526103~20120808~240914.XML
JGS-Memphis~FUN~Pre-Test~X-RossA~jgmdtv168~2~P~1100~H21-200~029415655926~20120808~240914.XML
2) SECOND.report.out (what's inside the ZIP's control file)
JGS-Memphis~AT1~Pre-Test~X-BanhT~JGMDTV387~6~P~1100~HR24-500~033072053326~20120808~240914.XML
JGS-Memphis~FUN~Pre-Test~X-RossA~jgmdtv168~2~P~1100~H21-200~029415655926~20120808~240914.XML
JGS-Memphis~PRE~DTV_PREP~X-GuinE~JGMDTV069~6~P~1100~H24-700~033081107519~20120808~240914.XML
JGS-Memphis~PRE~DTV_PREP~X-MooreBe~JGM98745~40~P~1100~H21-200~029264526103~20120808~240914.XML
Using sort -k12 -t~ made sense since ~ is the delimiter and the date field is in the 12th position. But it is not working consistently. Adding -g made no difference.
The sorting gets worse when my script processes bigger ZIP files. Why is sort -k not working all the time? How can I sort both arrays?
You don't really have a k12 (a 12th field) in your data: your separator is ~ in your spec, but your data contains ~ and sometimes - as well.
You can check with:
head -n 1 your.data.file | sed -e "s/~/\n/g"
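One general note on sort -k, independent of this particular data: -k12 makes the key run from field 12 through the end of the line; to restrict the key to field 12 only, the start and stop fields must both be given, for example:
sort -t'~' -k12,12 your.data.file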
Business requirements have changed. Sorting is no longer required in this case. This thread can be closed. Thank you.

Bash Script - Copy latest version of a file in a directory recursively

Below, I am trying to find the latest version of a file that could be in multiple directories.
Example Directory:
~inventory/emails/2012/06/InventoryFeed-Activev2.csv 2012/06/05
~inventory/emails/2012/06/InventoryFeed-Activev1.csv 2012/06/03
~inventory/emails/2012/06/InventoryFeed-Activev.csv 2012/06/01
Here's the bash script:
#!/bin/bash
FILE = $(find ~/inventory/emails/ -name INVENTORYFEED-Active\*.csv | sort -n | tail -1)
#echo $FILE #For Testing
cp $FILE ~/inventory/Feed-active.csv;
The error I am getting is:
./inventory.sh: line 5: FILE: command not found
The script should copy the newest file as attempted above.
Two questions:
First, is this the best method to achieve what I want?
Secondly, what's wrong above?
It looks good, but you have spaces around the = sign. This won't work. Try:
#!/bin/bash
FILE=$(find ~/inventory/emails/ -name INVENTORYFEED-Active\*.csv | sort -n | tail -1)
#echo $FILE #For Testing
cp $FILE ~/inventory/Feed-active.csv;
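For what it's worth, the error message itself comes from how bash parses the original line: with the spaces, FILE is treated as a command name and = plus the find output become its arguments. You can reproduce the effect with any name, for example:
$ FOO = bar
bash: FOO: command not found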
... What's wrong above?
Variable assignment. You are not supposed to put extra spaces around the = sign. The following should work:
FILE=$(find ~/inventory/emails/ -name INVENTORYFEED-Active\*.csv | sort -n | tail -1)
... is this the best method to achieve what I want?
Probably not. But the best way depends on many factors. Perhaps whoever writes those files can put them in the right location in the first place. You can also check the file modification time, but that could fail, too... So as long as it works for you, I'd say go for it :)
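If you did want to select by modification time rather than by name, here is a minimal sketch using GNU find (-printf is a GNU extension, so this assumes a GNU userland; the name pattern follows the example listing above, since -name is case-sensitive):
# print "<mtime> <path>" per match, keep the newest, then strip the timestamp
FILE=$(find ~/inventory/emails/ -name 'InventoryFeed-Active*.csv' -printf '%T@ %p\n' | sort -n | tail -1 | cut -d' ' -f2-)
cp "$FILE" ~/inventory/Feed-active.csv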
