Massive rename of files but keep the same sorting - bash

I have a lot of files in a folder with the same extension (e.g. .vtk) and I am using a bash script to mass-rename them with sequential numbers.
Here is the script I use:
n=0;
for file in *.vtk; do
mv ${file} 100_${n}.vtk;
n=$((n+1));
done
After the script's execution, all the files are renamed like:
100_1.vtk
100_2.vtk
.
.
.
My problem is that I want to keep the sorting of the files exactly the same as it was before. For example, if I had two sequential files named something.vtk and something_else.vtk, I want them, after the renaming process, to correspond to 100_1.vtk and 100_2.vtk respectively.

You can change your for loop from this:
for file in *.vtk; do
to this:
for file in $(ls -1 *.vtk | sort); do
If your file names don't contain spaces, this should work.
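Putting it together, a minimal sketch of the whole rename loop with the sorted listing (assuming mv is the intended command and, as in your example, that the counter starts at 1):
n=1
for file in $(ls -1 *.vtk | sort); do
mv "$file" "100_${n}.vtk"
n=$((n+1))
done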

You can use sort -kX.Y! X refers to the column and Y to the character.
So, something like the following should be fine:
$ ls | sort -k1.5
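For example (hypothetical file names, assuming every name shares a fixed four-character prefix, so the part you want to sort on starts at character 5):
$ ls -1
beta_02.vtk
zeta_01.vtk
$ ls | sort -k1.5
zeta_01.vtk
beta_02.vtk
Here sort compares the names from the fifth character onward (_01... vs _02...), ignoring the differing prefixes.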

Related

How to sort files by modified timestamp in unix for a shellscript to pick them one at a time

I am writing a shell script that picks one file at a time and processes them.
I want the script to pick files in the ascending order of their modified time.
I used the code below to pick .csv files with a particular filename pattern.
for file in /filepath/file*.csv
do
#mystuff
done
But I expect the script to pick .csv files according to the ascending order of their modified time. Please suggest.
Thanks in advance
If you are sure the file names don't contain any "strange" characters, e.g. newline, you could use the sorting capability of ls and read the output with a while read... loop. This will also work for file names that contain spaces.
ls -tr1 /filepath/file*.csv | while read -r file
do
mystuff "$file"
done
Note this solution should be preferred over something like
for file in $(ls -tr /filepath/file*.csv) ...
because this will fail if you have a file name that contains a space due to the word-splitting involved here.
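A hypothetical illustration of that word splitting (the file names are made up):
touch "file 1.csv" "file 2.csv"
for f in $(ls -tr file*.csv); do
echo "[$f]"   # prints [file], [1.csv], [file], [2.csv] -- four words, not two files
done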
You can capture the results of ls -t in an array (-t sorts by modification time).
csvs=($(ls -t /filepath/file*.csv))
Then apply your for loop.
for file in "${csvs[@]}"
do
#mystuff
done
with your loop "for"
for file in "`ls -tr /filepath/file*.csv`"
do
mystuff $file
done

Bash shell script: recursively cat TXT files in folders

I have a directory of files with a structure like below:
./DIR01/2019-01-01/Log.txt
./DIR01/2019-01-01/Log.txt.1
./DIR01/2019-01-02/Log.txt
./DIR01/2019-01-03/Log.txt
./DIR01/2019-01-03/Log.txt.1
...
./DIR02/2019-01-01/Log.txt
./DIR02/2019-01-01/Log.txt.1
...
./DIR03/2019-01-01/Log.txt
...and so on.
Each DIRxx directory has a number of subdirectories named by date, which themselves contain a number of log files that need to be concatenated. The number of text files to concatenate varies, but could theoretically be as many as 5. I would like the following command to be performed for each set of files within the dated directories (note that the files must be concatenated in reverse order):
cd ./DIR01/2019-01-01/
cat Log.txt.4 Log.txt.3 Log.txt.2 Log.txt.1 Log.txt > ../../Log.txt_2019-01-01_DIR01.txt
(I understand the above command will give an error that certain files do not exist, but the cat will do what I need of it anyway.)
Aside from cd-ing into each directory and running the above cat command, how can I do this in a Bash shell script?
If you just want to concatenate all files whose name starts with Log.txt in all subdirectories, you could do something like this:
for dir in DIR*/*; do
date=${dir##*/};
dirname=${dir%%/*};
cat $dir/Log.txt* > Log.txt_"${date}"_"${dirname}".txt;
done
If you need the files in reverse numerical order, from 5 to 1 and then Log.txt, you can do this:
for dir in DIR*/*; do
date=${dir##*/};
dirname=${dir%%/*};
cat $dir/Log.txt.{5..1} $dir/Log.txt > Log.txt_"${date}"_"${dirname}".txt;
done
That will, as you mention in your question, complain about files that don't exist, but that's just a warning. If you don't want to see that, you can redirect error output (although that might cause you to miss legitimate error messages as well):
for dir in DIR*/*; do
date=${dir##*/};
dirname=${dir%%/*};
cat $dir/Log.txt.{5..1} $dir/Log.txt > Log.txt_"${date}"_"${dirname}".txt;
done 2>/dev/null
Not as comprehensive as the other answers, but quick and easy. Use find and sort the output however you like (-zrn is --zero-terminated --reverse --numeric-sort), then iterate over it with read.
find . -type f -print0 |
sort -zrn |
while read -rd ''; do
cat "$REPLY";
done >> log.txt

Sort files in directory then execute command on each one of them

I have a directory containing files numbered like this
1>chr1:2111-1111_mask.txt
1>chr1:2111-1111_mask2.txt
1>chr1:2111-1111_mask3.txt
2>chr2:345-678_mask.txt
2>chr2:345-678_mask2.txt
2>chr2:345-678_mask3.txt
100>chr19:444-555_mask.txt
100>chr19:444-555_mask2.txt
100>chr19:444-555_mask3.txt
Each file contains a name like >chr1:2111-1111 in the first line and a series of characters in the second line.
I need to sort the files in this directory numerically, using the number before the > as a guide, then execute a command for each of the files whose name contains _mask3.
I have this code
ls ./"$INPUT"_temp/*_mask3.txt | sort -n | for f in ./"$INPUT"_temp/*_mask3.txt
do
read FILE
# Do something with each file and list the results in an output file, including the name of the string
done
It works, but when I check the list of the strings inside the output file they are like this
>chr19:444-555
>chr1:2111-1111
>chr2:345-678
why?
So... I'm not sure what "works" here, as your question states.
It seems like you have two problems:
1. Your files are not in sorted order
2. The file names have the leading digits removed
Addressing 1, your command ls ./"$INPUT"_temp/*_mask3.txt | sort -n | for f in ./"$INPUT"_temp/*_mask3.txt here doesn't make a whole lot of sense. You are getting a list of files from ls, and then piping that to sort. That probably gives you the output you are looking for, but then you pipe that to for, which doesn't make any sense.
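A hypothetical illustration of why piping into for achieves nothing: the loop takes its word list from the code itself, not from standard input, so whatever arrives through the pipe is never used as the list:
printf 'b\na\n' | sort | for x in one two; do echo "$x"; done
# prints "one" and "two"; the sorted lines never become loop items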
In fact you can rewrite your entire script to
for f in ./"$INPUT"_temp/*_mask3.txt
do
read FILE
# Do something with each file and list the results in an output file, including the name of the string
done
And you'll have the exact same output. To get this sorted you could do something like:
for f in `ls ./"$INPUT"_temp/*_mask3.txt | sort -n`
do
read FILE
# Do something with each file and list the results in an output file, including the name of the string
done
As for the unexpected truncation, that > character in your file names is significant to your bash shell, since it normally directs the stdout of the preceding command to a specified file. You'll need to ensure that when you use the variable $f from your loop you put quotes around it, to keep bash from misinterpreting the file name as a command > file construct.

How do I write a bash script to copy files into a new folder based on name?

I have a folder filled with ~300 files. They are named in this form username@mail.com.pdf. I need about 40 of them, and I have a list of the usernames I need (saved in a file called names.txt), each username on its own line. I would like to copy the files I need into a new folder that contains only those files.
For example, where the first line of names.txt is the username only (e.g. eternalmothra), the PDF file I want to copy over is named eternalmothra@mail.com.pdf.
while read p; do
ls | grep $p > file_names.txt
done <names.txt
This seems like it should read from the list and, for each line, turn the username into username@mail.com.pdf. Unfortunately, it seems like only the last one is saved to file_names.txt.
The second part of this is to copy all the files over:
while read p; do
mv $p foldername
done <file_names.txt
(I haven't tried that second part yet because the first part isn't working).
I'm doing all this with Cygwin, by the way.
1) What is wrong with the first script that it won't copy everything over?
2) If I get that to work, will the second script correctly copy them over? (Actually, I think it's preferable if they just get copied, not moved over).
Edit:
I would like to add that I figured out how to read lines from a txt file from here: Looping through content of a file in bash
Solution from comment: Your problem is just that echo a > b overwrites the file, while echo a >> b appends to it, so replace
ls | grep $p > file_names.txt
with
ls | grep $p >> file_names.txt
There might be more efficient solutions if the task runs every day, but for a one-shot of 300 files your script is good.
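Putting the comment's fix together with your second loop, a minimal sketch (assuming the destination folder is called foldername, as in your draft, and using cp since you said you prefer copying over moving):
> file_names.txt                      # start with an empty list
while read -r p; do
ls | grep "$p" >> file_names.txt      # append instead of overwriting
done < names.txt
while read -r f; do
cp "$f" foldername/
done < file_names.txt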
Assuming you don't have file names with newlines in them (in which case your original approach would not have a chance of working anyway), try this.
printf '%s\n' * | grep -f names.txt | xargs cp -t foldername
The printf is necessary to work around the various issues with ls; passing the list of all the file names to grep in one go produces a list of all the matches, one per line; and passing that to xargs cp performs the copying. (To move instead of copy, use mv instead of cp, obviously; both support the -t option so as to make it convenient to run them under xargs.) The function of xargs is to convert standard input into arguments to the program you run as the argument to xargs.
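One small usage note: cp -t (and mv -t) require the target directory to exist, so, assuming it does not yet, create it first:
mkdir -p foldername
printf '%s\n' * | grep -f names.txt | xargs cp -t foldername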

Read file names from directory in Bash

I need to write a script that reads all the file names from a directory and then, depending on the file name (for example, whether it contains R1 or R2), concatenates all the files that contain, for example, R1 in the name.
Can anyone give me some tip how to do this?
The only thing I was able to do is:
#!/bin/bash
FILES="path to the files"
for f in $FILES
do
cat $f
done
and this only shows me that the variable FILES holds a directory, not the files it contains.
To make the smallest change that fixes the problem:
dir="path to the files"
for f in "$dir"/*; do
cat "$f"
done
To accomplish what you describe as your desired end goal:
shopt -s nullglob
dir="path to the files"
substrings=( R1 R2 )
for substring in "${substrings[@]}"; do
cat /dev/null "$dir"/*"$substring"* >"${substring}.out"
done
Note that cat can take multiple files in one invocation -- in fact, if you aren't doing that, you usually don't need to use cat at all.
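A hypothetical illustration of both points (file names made up):
cat a_R1.txt b_R1.txt c_R1.txt > R1.out   # one cat invocation, several files
grep pattern file.txt                     # no cat needed: grep reads the file itself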
Simple hack:
ls -al *R1* | awk '{print $9}' >outputfilenameR1
ls -al *R2* | awk '{print $9}' >outputfilenameR2
Your expectation that
for f in $FILES
will loop over all the file names in the directory named by the variable FILES does not hold: as you observed, the value of FILES itself is the only item processed by the for loop.
In order to turn a value that names a directory into a list of files, you have to provide a file name pattern that the shell can expand against the file system when it evaluates $FILES.
This can be done by appending /* to the directory path stored in the variable FILES; the $ character directs the shell to evaluate the value stored in FILES and to replace $FILES with the list of matching names. The bare * after the / matches every entry in the directory, so the list will contain not only files but also sub-directories, if there are any.
In other words if you change the assignment to:
FILES="path to the files/*"
the script will then behave as you expected.
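For completeness, the original script with only that one change (note this relies on the actual path containing no spaces, since the unquoted $FILES is subject to word splitting):
#!/bin/bash
FILES="path to the files/*"
for f in $FILES
do
cat "$f"
done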
