Separate sorting of all files in a directory

I am trying to sort each file in input_directory individually, writing the sorted result to output_directory.
I know how to do this for a single file as follows:
sort filename > newfilename
But what I'd like the program to do is the following, for every file in input_directory:
sort file_in_directory > output_file_in_directory
How can I do that?
I am using cygwin sort.

Use a for loop over a glob rather than parsing the output of ls (which breaks on file names containing spaces):
for file in /PATH/TO/INPUT/DIR/*; do
    sort "$file" > /PATH/TO/OUTPUT/DIR/"${file##*/}"
done
Here ${file##*/} strips the directory part, leaving just the file name.
Note that if any entry in the input directory is itself a directory, sort will fail on it, since sort expects regular files as input.
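If subdirectories may be present, a minimal variant (same hypothetical paths as above) skips anything that is not a regular file:
for file in /PATH/TO/INPUT/DIR/*; do
    # skip subdirectories and other non-regular files
    [ -f "$file" ] || continue
    sort "$file" > /PATH/TO/OUTPUT/DIR/"${file##*/}"
done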

Related

for loop concatenating files that share part of their common basename (paired end sequencing reads)

I'm trying to concatenate a bunch of paired files, one merged file per pair (for those who work with sequencing data, you'll be familiar with the paired-end read format).
For example, I have
SLG1.R1.fastq.gz
SLG1.R2.fastq.gz
SLG2.R1.fastq.gz
SLG2.R2.fastq.gz
SLG3.R1.fastq.gz
SLG3.R2.fastq.gz
etc.
I need to concatenate the two SLG1 files, the two SLG2 files, and the two SLG3 files.
So far I have this:
cd /workidr/slg/diet_manip/filtered_concatenated_reads/nonhost
for i in *1.fastq.gz
do
base=(basename $i "1.fastq.gz")
cat ${base}1.fastq.gz ${base}2.fastq.gz > /workdir/slg/diet_manip/filtered_concatenated_reads/cat/${base}.fastq.gz
done
The original files are all in the /filtered_concatenated_reads/nonhost directory, and I want the concatenated versions to be in /filtered_concatenated_reads/cat
The above code gives this error:
-bash: /workdir/slg/diet_manip/filtered_concatenated_reads/cat/basename.fastq.gz: No such file or directory
Any ideas?
Thank you!!
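The error comes from base=(basename $i "1.fastq.gz"), which defines an array whose first element is the literal string basename; the command substitution syntax $( ) is missing. A corrected sketch (assuming the workidr in the cd path is a typo for workdir, as in the error message):
cd /workdir/slg/diet_manip/filtered_concatenated_reads/nonhost
for i in *.R1.fastq.gz; do
    # strip the .R1.fastq.gz suffix to get the sample name, e.g. SLG1
    base=$(basename "$i" .R1.fastq.gz)
    cat "${base}.R1.fastq.gz" "${base}.R2.fastq.gz" \
        > /workdir/slg/diet_manip/filtered_concatenated_reads/cat/"${base}.fastq.gz"
done
Concatenating the .gz files directly with cat is fine here: the gzip format allows concatenated compressed streams.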

Rename files by matching key-value pairs from a text file (bash)

I have files in directories like:
./PBMCs/SRR1_1.fastq
./PBMCs/SRR1_2.fastq
./Monos/SRR2.fastq
./Monos/SRR3.fastq
I want to change the SRR# to a more informative name based on a file of key-value pairs:
SRR1 pbmc-1
SRR2 mono-1
SRR3 mono-2
And rename the files as:
./PBMCs/pbmc-1_1.fastq
./PBMCs/pbmc-1_2.fastq
./Monos/mono-1.fastq
./Monos/mono-2.fastq
All that I can think to do is loop through the list of original files and then loop through the lines of the name-change.txt file and replace the strings. However, I'm not sure how to implement this or if it's a good way to approach this.
Assuming all *.fastq are one subdirectory deep, this should work fine:
while read -r old new; do
    for fastq in ./*/"$old"*.fastq; do
        new_name=$new${fastq##*/"$old"}
        echo mv "$fastq" "${fastq%/*}/$new_name"
    done
done <name-change.txt
Remove echo if the output looks good.
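With the sample files and name-change.txt above, the dry run should print something like:
mv ./PBMCs/SRR1_1.fastq ./PBMCs/pbmc-1_1.fastq
mv ./PBMCs/SRR1_2.fastq ./PBMCs/pbmc-1_2.fastq
mv ./Monos/SRR2.fastq ./Monos/mono-1.fastq
mv ./Monos/SRR3.fastq ./Monos/mono-2.fastq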

How do I combine txt files from a list of file locations

I have a problem: I used "Everything" to list every txt file under a specific directory so that I can merge them, but in EmEditor I can't find a way to merge files from a list of locations.
Here is what the Everything export looks like:
E:\Main directory\subdirectory 1\file.txt
E:\Main directory\subdirectory 2\file.txt
E:\Main directory\subdirectory 3\file.txt
E:\Main directory\subdirectory 4\file.txt
The list runs to over 40k locations. Is there a program that can read all the locations in the text file and combine those files?
Also, the subdirectories contain other txt files that I don't want, so I can't just merge every txt file under the main directory. Another complication is that there are variations of "file.txt", such as "Files.txt".
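Since the exact names vary, driving the merge from the exported list itself is the safest route. A minimal bash sketch (assuming a Cygwin or Git Bash environment, where the tools accept Windows-style paths, and hypothetical file names everything-list.txt for the export and combined.txt for the result):
# Everything exports typically use CRLF line endings; strip the CR,
# then append each listed file to the combined output.
tr -d '\r' < everything-list.txt | while IFS= read -r path; do
    cat "$path" >> combined.txt
done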

Combine CSV files with condition

I need to combine all the csv files in some directory (.csv), provided that for each one there is another file in the directory with the same name but a different extension (.csv.done).
If a csv file has no corresponding .done file, then I don't want it in the combined output.
What is the best way to do it using Bash ?
This approach solves the problem as stated. If it "didn't work" for you, the cause is likely something simple, e.g. a key detail missing from the question, or the code not being adapted to your specific situation. If you need further help troubleshooting, add more detail to your question.
The approach:
for f in *.csv.done; do
    cat "${f%.*}" >> combined_file.csv
done
How it works:
In your example, you have 3 files named 1.csv 2.csv 3.csv and two 'done' files named 1.csv.done 2.csv.done.
This script begins by making a list of all files that end in .csv.done (two files: 1.csv.done 2.csv.done).
It then uses a parameter expansion, specifically ${parameter%word}, to 'shorten' the name of the two files in the list to .csv (instead of .csv.done).
Then it 'prints' the content of the two 'shortened' filenames (1.csv and 2.csv) into a 'combined' file.
It doesn't 'print' the content of 1.csv.done or 2.csv.done, or 3.csv, because these files weren't in the original 'list'.
If you run this script multiple times, it will keep appending the contents of 1.csv and 2.csv to the 'combined' file, so only run it once, or delete the 'combined' file before running it again.
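A quick illustration of that ${parameter%word} expansion, which deletes the shortest trailing match of the pattern:
f=1.csv.done
echo "${f%.*}"    # pattern .* matches .done; prints: 1.csv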

How do I search for file names that match a given pattern and write the output to a text file in bash

Example: /apps/mft/local/tmp/folder1/folder2
Let's say I have many files scattered across folder1 and folder2. My requirement is to find the files that match the pattern $DAY-log.xml and write the resulting file names to a text file. Which command should I use in the bash script to perform this task?
find . -name "$DAY-log.xml" > listOfFiles.txt
It will recursively find the file names that match the pattern, starting from the current directory, and write all results to the listOfFiles.txt file.
If you want, you can even give a starting path like this:
find /path/where/to/start -name "$DAY-log.xml" > listOfFiles.txt
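For example, if $DAY holds a date string (an assumption; use whatever format your log names actually embed):
DAY=$(date +%Y-%m-%d)    # hypothetical date format, e.g. 2024-01-31
find /apps/mft/local/tmp -name "$DAY-log.xml" > listOfFiles.txt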
