Rename files depending on age - bash

I have a script
for d in $(find /home/users/*/personal/*/docs/MY -type d); do
find $d -maxdepth 1 -type f -amin -10
done
It lists all files in the MY directories that were touched in the last 10 minutes on a Linux server. I am looking for a way to rename all the files matching these criteria by adding the prefix old_. That is, if the script finds files aaaa and bbbb, it should rename them to old_aaaa and old_bbbb.
Could someone help me?

Let find find all the files for you:
find /home/users/*/personal/*/docs/MY -type f -amin -10 -print0 |
while IFS= read -r -d '' path; do
dir=$(dirname "$path")
name=$(basename "$path")
if [[ $name != old_* ]]; then
echo mv "$path" "$dir/old_$name"
fi
done
Remove the "echo" when you're satisfied it's working for you.

If you don't want old_old_old_old_... names, tell find to skip files that already carry the prefix. Note that grep tests file contents, not file names, so an -exec grep test can't do this; use find's -name test instead (the sh wrapper strips the ./ prefix that -execdir puts in front of each name):
find "$d" -maxdepth 1 -type f -amin -10 ! -name 'old_*' -execdir sh -c 'mv "$1" "old_${1#./}"' _ {} \;
PS: I would have commented on the previous post but I am not allowed to.

Do like follows:
for d in $(find /home/users/*/personal/*/docs/MY -type d); do
find "$d" -maxdepth 1 -type f -amin -10 -exec rename 's{([^/]+)$}{old_$1}' {} +
done
Also keep in mind that -amin -10 refers to files accessed in the last 10 minutes; that includes files created in the last 10 minutes but may include other files as well.
Update:
An alternative, which is probably faster than the one above because it uses the mv command to rename and so doesn't involve a regex, is the following:
for d in $(find /home/users/*/personal/*/docs/MY -type d); do
find "$d" -maxdepth 1 -type f -amin -10 | while IFS= read -r path; do
mv "$path" "$(dirname "$path")/old_$(basename "$path")"
done
done
Update 2:
I've updated my two commands above to fix the errors that OP noted in comments.

Use the option -execdir, which runs the command from each file's own directory (the sh wrapper strips the ./ prefix that find puts in front of the name):
find "$d" -maxdepth 1 -type f -amin -10 -execdir sh -c 'mv "$1" "old_${1#./}"' _ '{}' \;
Comments:
Use quotes to make sure spaces don't matter. For the loop above, you should use
find "/home/users/"*"/personal/"*"/docs/MY" -type d
This isn't strictly necessary but it's good to turn this into a habit so you don't forget it when it matters.
Also note that the * must be outside the quotes.
The quotes around {} are a precaution so the shell passes the braces through to find untouched (most shells do anyway). For the same reason, there is a \ before the ; so the shell doesn't treat it as a command separator.
[EDIT] You may want to add ! -name "old_*" to the find, which means "don't rename files whose names already start with old_".
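Putting the quoting, the old_ filter, and the rename together, a self-contained sketch might look like the following (it uses a throwaway temp directory rather than the real /home/users tree, so it is safe to run as-is):

```shell
#!/usr/bin/env bash
# Sandbox demo: add an old_ prefix to files accessed in the last 10
# minutes, skipping files that already have the prefix.
tmp=$(mktemp -d)
touch "$tmp/aaaa" "$tmp/old_cccc"
touch -d '30 minutes ago' "$tmp/stale"   # atime/mtime pushed back

# -execdir runs mv from the file's own directory; ${1#./} strips the
# leading ./ that find prepends to each name.
find "$tmp" -maxdepth 1 -type f -amin -10 ! -name 'old_*' \
    -execdir sh -c 'mv "$1" "old_${1#./}"' _ '{}' \;

ls "$tmp"
```

Only aaaa gets renamed: old_cccc is excluded by the -name test, and stale by the -amin test.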

Related

Keep latest pair of files and move older files to another (Unix)

For example, I have the following files in a directory:
FILE_1_2021-01-01.csum
FILE_1_2021-01-01.csv
FILE_1_2021-01-02.csum
FILE_1_2021-01-02.csv
FILE_1_2021-01-03.csum
FILE_1_2021-01-03.csv
I want to keep FILE_1_2021-01-03.csum and FILE_1_2021-01-03.csv in the current directory, but zip the rest of the older files and move them to another directory.
So far I have tried the following, but I am stuck on how to correctly identify the pairs:
file_count=0
PATH=/path/to/dir
ARCH=/path/to/dir
for file in ${PATH}/*
do
if [[ ! -d $file ]]
then
file_count=$(($file_count+1))
fi
done
echo "file count $file_count"
if [ $file_count -gt 2 ]
then
echo "moving old files to $ARCH"
// How to do it
fi
Since the timestamps are in a format that naturally sorts out with earliest first, newest last, an easy approach is to just use filename expansion to store the .csv and .csum filenames in a pair of arrays, and then do something with all but the last element of both:
declare -a csv=( FILE_*.csv ) csum=( FILE_*.csum )
mv "${csv[@]:0:${#csv[@]}-1}" "${csum[@]:0:${#csum[@]}-1}" new_directory/
(Or tar them up first, or whatever.)
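The slice syntax is the key trick here: `${csv[@]:0:${#csv[@]}-1}` means "all but the last element". A throwaway demonstration with made-up names (no real files involved):

```shell
#!/usr/bin/env bash
# Demonstrate the "all but the last element" array slice on dummy data.
csv=( FILE_1.csv FILE_2.csv FILE_3.csv )

# ${csv[@]:0:${#csv[@]}-1} expands to elements 0 .. length-2.
keep_all_but_last=( "${csv[@]:0:${#csv[@]}-1}" )
echo "${keep_all_but_last[@]}"   # FILE_1.csv FILE_2.csv
```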
First off ...
it's bad practice to use all-uppercase variable names, as these can clash with OS-level variables (also all uppercase); case in point ...
PATH is an OS-level variable for keeping track of where to locate binaries, but in this case ...
OP has just wiped out the OS-level variable with the assignment PATH=/path/to/dir
As for the question, some assumptions:
each *.csv file has a matching *.csum file
the 2 files to 'keep' can be determined from the first 2 lines of output resulting from a reverse sort of the filenames
not sure what OP means by 'zip and move' (eg, zip? gzip? tar all old files into a single .tar and then (g)zip?) so for the sake of this answer I'm going to just gzip each file and move to a new directory (OP can adjust the code to fit the actual requirement)
Setup:
srcdir='/tmp/myfiles'
arcdir='/tmp/archive'
rm -rf "${srcdir}" "${arcdir}"
mkdir -p "${srcdir}" "${arcdir}"
cd "${srcdir}"
touch FILE_1_2021-01-0{1..3}.{csum,csv} abc XYZ
ls -1
FILE_1_2021-01-01.csum
FILE_1_2021-01-01.csv
FILE_1_2021-01-02.csum
FILE_1_2021-01-02.csv
FILE_1_2021-01-03.csum
FILE_1_2021-01-03.csv
XYZ
abc
Get list of *.csum/*.csv files and sort in reverse order:
$ find "${srcdir}" -maxdepth 1 -type f \( -name '*.csum' -o -name '*.csv' \) | sort -r
/tmp/myfiles/FILE_1_2021-01-03.csv
/tmp/myfiles/FILE_1_2021-01-03.csum
/tmp/myfiles/FILE_1_2021-01-02.csv
/tmp/myfiles/FILE_1_2021-01-02.csum
/tmp/myfiles/FILE_1_2021-01-01.csv
/tmp/myfiles/FILE_1_2021-01-01.csum
Eliminate first 2 files (ie, generate list of files to zip/move):
$ find "${srcdir}" -maxdepth 1 -type f \( -name '*.csum' -o -name '*.csv' \) | sort -r | tail +3
/tmp/myfiles/FILE_1_2021-01-02.csv
/tmp/myfiles/FILE_1_2021-01-02.csum
/tmp/myfiles/FILE_1_2021-01-01.csv
/tmp/myfiles/FILE_1_2021-01-01.csum
Process our list of files:
while read -r fname
do
gzip "${fname}"
mv "${fname}".gz "${arcdir}"
done < <(find "${srcdir}" -maxdepth 1 -type f \( -name '*.csum' -o -name '*.csv' \) | sort -r | tail +3)
NOTE: the find|sort|tail results could be piped to xargs (or parallel) to perform the gzip/mv operations but without more details on what OP means by 'zip and move' I've opted for a simpler, albeit less performant, while loop
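For illustration, the xargs variant mentioned in the note could look like this (a sketch using its own temp directories rather than the /tmp/myfiles setup above, and tail -n +3 as the unambiguous spelling):

```shell
#!/usr/bin/env bash
# Self-contained sketch of the gzip-and-move pipeline driven by xargs.
srcdir=$(mktemp -d)
arcdir=$(mktemp -d)
touch "$srcdir"/FILE_1_2021-01-0{1..3}.{csum,csv}

# Keep the newest pair (first 2 lines of the reverse sort); gzip and
# move the rest. In the sh wrapper, $1 is each filename and $2 the
# archive directory.
find "$srcdir" -maxdepth 1 -type f \( -name '*.csum' -o -name '*.csv' \) |
    sort -r | tail -n +3 |
    xargs -I{} sh -c 'gzip "$1" && mv "$1.gz" "$2"' _ {} "$arcdir"
```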
Results:
$ ls -1 "${srcdir}"
FILE_1_2021-01-03.csum
FILE_1_2021-01-03.csv
XYZ
abc
$ ls -1 "${arcdir}"
FILE_1_2021-01-01.csum.gz
FILE_1_2021-01-01.csv.gz
FILE_1_2021-01-02.csum.gz
FILE_1_2021-01-02.csv.gz
Your algorithm of counting files can be simplified using find. You seem to look for non-directories. The option -not -type d does exactly that. By default find searches into the subfolders, so you need to pass -maxdepth 1 to limit the search to a depth of 1.
find "$PATH" -maxdepth 1 -not -type d
If you want to get the number of files, you may pipe the command to wc:
file_count=$(find "$PATH" -maxdepth 1 -not -type d | wc -l)
Now there are two ways of detecting which file is the more recent: by looking at the filename, or by looking at the date when the files were last created/modified/etc. Since your naming convention looks pretty solid, I would recommend the first option. Sorting by creation/modification date is more complex and there are numerous cases where this information is not reliable, such as copying files, zipping/unzipping them, touching files, etc.
You can sort with sort and then grab the last element with tail -1:
find "$PATH" -maxdepth 1 -not -type d | sort | tail -1
You can do the same thing by sorting in reverse order with sort -r and then grabbing the first element with head -1. From a functional point of view, the two are strictly equivalent; the reverse-sorted form will also be more relevant later on.
find "$PATH" -maxdepth 1 -not -type d | sort -r | head -1
Once you have the filename of the most recent file, you can extract the base name in order to create a pattern out of it.
most_recent_file=$(find "$PATH" -maxdepth 1 -not -type d | sort -r | head -1)
most_recent_file=${most_recent_file%.*}
most_recent_file=${most_recent_file##*/}
Let’s explain this:
first, we grab the filename into a variable called most_recent_file
then we remove the extension using ${most_recent_file%.*} ; the % symbol will cut at the end, and .* will cut everything after the last dot, including the dot itself
finally, we remove the folder using ${most_recent_file##*/} ; the ## symbol will cut at the beginning with a greedy catch, and */ will cut everything before the last slash, including the slash itself
The difference between # and ## is how greedy the pattern is. If your file is /path/to/file.csv then ${most_recent_file#*/} (single #) will cut the first slash only, i.e. it will output path/to/file.csv, while ${most_recent_file##*/} (double #) will cut all paths, i.e. it will output file.csv.
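These expansions are easy to check in isolation; a quick demonstration on a throwaway path:

```shell
#!/usr/bin/env bash
path='/path/to/file.csv'

no_ext=${path%.*}     # % cuts the shortest match of .* from the end
base=${path##*/}      # ## cuts the longest match of */ from the front
first=${path#*/}      # single # cuts the shortest match: only the first /
stem=${no_ext##*/}    # combine both: directory and extension stripped

echo "$no_ext"   # /path/to/file
echo "$base"     # file.csv
echo "$first"    # path/to/file.csv
echo "$stem"     # file
```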
Once you have this string, you can make a pattern to include/exclude similar files using find.
find "$PATH" -maxdepth 1 -not -type d -name "$most_recent_file.*"
find "$PATH" -maxdepth 1 -not -type d -not -name "$most_recent_file.*"
The first line will list all files which match your pattern, and the second line will list all files which do not match the pattern.
Since you want to move your 'old' files to a folder, you may execute a mv command for the last list.
find "$PATH" -maxdepth 1 -not -type d -not -name "$most_recent_file.*" -exec mv {} "$ARCH" \;
If your version of find supports it, you may use + in order to batch the move operations.
find "$PATH" -maxdepth 1 -not -type d -not -name "$most_recent_file.*" -exec mv -t "$ARCH" {} +
Otherwise you can pipe to xargs.
find "$PATH" -maxdepth 1 -not -type d -not -name "$most_recent_file.*" | xargs mv -t "$ARCH"
If put altogether:
file_count=0
PATH=/path/to/dir
ARCH=/path/to/dir
file_count=$(find "$PATH" -maxdepth 1 -not -type d | wc -l)
echo "file count $file_count"
if [ $file_count -gt 2 ]
then
echo "moving old files to $ARCH"
most_recent_file=$(find "$PATH" -maxdepth 1 -not -type d | sort -r | head -1)
most_recent_file=${most_recent_file%.*}
most_recent_file=${most_recent_file##*/}
find "$PATH" -maxdepth 1 -not -type d -not -name "$most_recent_file.*" | xargs mv -t "$ARCH"
fi
As a last note, if your file names contain newlines, this will not work. If you want to handle that case, you need a few modifications. Counting files would be done like this:
file_count=$(find "$PATH" -maxdepth 1 -not -type d -printf '.' | wc -c)
Getting the most recent file:
most_recent_file=$(find "$PATH" -maxdepth 1 -not -type d -print0 | sort -rz | grep -zm1 '.' | tr -d '\0')
Moving files with xargs:
find "$PATH" -maxdepth 1 -not -type d -not -name "$most_recent_file.*" -print0 | xargs -0 mv -t "$ARCH"
(There’s no problem if moving files using -exec)
I won’t go into details, but just know that the issue is known and these are the kind of solutions you can apply if need be.

How do I use find command with pipe in bash?

The directory structure looks like
home
--dir1_foo
----subdirectory.....
--dir2_foo
--dir3_foo
--dir4_bar
--dir5_bar
I'm trying to use the find command to first get the directories whose names contain a specific string (in this case 'foo'), then use find again to retrieve directories beneath them matching some conditions.
So, I first tried
#!/bin/bash
for dir in `find ./ -type d -name "*foo*" `;
do
for subdir in `find $dir -mindepth 2 -type d `;
do
[Do some jobs]
done
done
, and this script works fine.
Then I thought that using only one loop with pipe like below would also work, but this does not work
#!/bin/bash
for dir in `find ./ -type d -name "*foo*" | find -mindepth 2 -type d `;
do
[Do some jobs]
done
and actually this script works the same as
for dir in `find -mindepth 2 -type d`;
do
[Do some jobs]
done
, which means the first find command is ignored.
What is the problem?
What your script does is not good practice and has a lot of potential pitfalls. See BashFAQ: Why you don't read lines with "for" to understand why.
You can use xargs with -0 to read null-delimited names and run the second find command, without needing the for loop:
find ./ -type d -name "*foo*" -print0 | xargs -0 -I{.} find {.} -mindepth 2 -type d
The string following -I in xargs acts as a placeholder for the input received from the previous pipeline and passes it to the next command. The -print0 option is GNU-specific and is a safe way to handle file/directory names containing spaces or other shell metacharacters.
With the above command in place, if you want to act on the output of the second command, use process substitution with a while loop:
while IFS= read -r -d '' f; do
echo "$f"
# Your other actions can be done on "$f" here
done < <(find ./ -type d -name "*foo*" -print0 | xargs -0 -I{.} find {.} -mindepth 2 -type d -print0)
As for why your piped version won't work: find does not read its starting points from standard input, it takes them as arguments, so the second find ignores the first one's output entirely and just searches the current directory. You need something like xargs to turn that output into arguments.
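A small sandbox makes the behavior concrete (directory names here are made up to mirror the question's layout):

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1_foo/a/deep" "$tmp/dir4_bar/b/deep"

# The second find receives each *foo* directory as an argument via
# xargs, then searches at least 2 levels below it.
result=$(find "$tmp" -type d -name '*foo*' -print0 |
         xargs -0 -I{.} find {.} -mindepth 2 -type d)
echo "$result"   # only the deep directory under dir1_foo
```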

Bulk rename directories with prefix Unix

I am trying to bulk rename directories with a prefix in Unix. Prefix like abc-
So if current directory is 123, I want to make it abc-123, etc
I've tried
for d in $(find . -name '*' -type d) ; do
mv $d $(echo $d | sed 's/$d/abc-$d/g')
done
but that doesn't work. Do very little shell scripting so any help would be appreciated.
rename command is not available
Thank you!
If I understand your question, you could do it with one line and find -execdir, like so (the sh wrapper strips the ./ prefix that find puts in front of each name, and -mindepth 1 keeps find from trying to rename . itself):
find . -mindepth 1 -depth -type d -execdir sh -c 'mv "$1" "abc-${1#./}"' _ {} \;
Try:
for d in $(find . -mindepth 1 -depth -type d); do
b=$(basename "$d")
p=$(dirname "$d")
mv -v "$d" "$p/abc-$b"
done
Note that the -depth argument is really important: it ensures that the directories are processed from bottom to top, so that you rename child directories before their parents. If you don't do that then you'll end up trying to rename paths that no longer exist.
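A quick sandbox run illustrates why the order matters (a while read loop stands in for the unquoted for, assuming names without newlines):

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
mkdir -p "$tmp/outer/inner"

# -depth emits outer/inner before outer, so the child is renamed
# while its parent's path is still valid.
while IFS= read -r d; do
    mv "$d" "$(dirname "$d")/abc-$(basename "$d")"
done < <(find "$tmp" -mindepth 1 -depth -type d)

find "$tmp" -mindepth 1
```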
Also I recommend replacing line 4 with
echo "mv -v $d $p/abc-$b"
and running that version of the loop first so you can see what it will do before trying it for real.

Loop thru directory and return directory name

I'm trying to loop through a directory (non-recursively), and I only want to list each directory's name, not its path.
find /dir/* -type d -prune -exec basename {} \;
This returns a list of directories in the dir and it works.
folder 1
this is folder2
And I want to loop thru these so I did:
for i in $(find /dir/* -type d -prune -exec basename {} \;)
do
echo ${i}
done
But the for loop iterates over each word rather than each row, which results in this:
folder
1
this
is
folder2
I know there are a lot of threads on this, but I haven't found one that works for me, especially with spaces in the names.
Does anyone know how to solve this?
If you want to loop through directory names then you can use;
( cd /dir && for f in */; do echo "$f"; done )
In case you want to loop thru the find results only then better way to do that is:
while read -r f; do
echo "$f"
done < <(find /dir/* -type d -prune -exec basename '{}' \;)
This is preferred since the loop body runs in the current shell rather than a subshell, so any variables you set inside the loop persist after it (find -exec still spawns a process per file, but that doesn't affect your shell variables).
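The practical payoff of avoiding the subshell is that variables assigned in the loop survive; a sketch with made-up directory names containing spaces:

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
mkdir "$tmp/folder 1" "$tmp/this is folder2"

count=0
while IFS= read -r name; do
    count=$((count + 1))
done < <(find "$tmp"/* -type d -prune -exec basename '{}' \;)

echo "$count"   # 2 -- the increment persisted because no subshell was used
```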
Use a while loop instead:
find /dir/* -type d -prune -exec basename {} \; | while IFS= read -r line
do
echo "$line"
done
find /dir -mindepth 1 -maxdepth 1 -type d -printf '%f\n'

How can I list all unique file names without their extensions in bash?

I have a task where I need to move a bunch of files from one directory to another. I need to move all files with the same base name (i.e. blah.pdf, blah.txt, blah.html, etc...) at the same time, and I can move a set of these every four minutes. I had a short bash script to move a single file at a time at these intervals, but the new name requirement is throwing me off.
My old script is:
find ./ -maxdepth 1 -type f | while read line; do mv "$line" ~/target_dir/; echo "$line"; sleep 240; done
For the new script, I basically just need to replace find ./ -maxdepth 1 -type f
with a list of unique file names without their extensions. I can then just replace do mv "$line" ~/target_dir/; with do mv "$line*" ~/target_dir/;.
So, with all of that said, what's a good way to get a unique list of file names without their extensions in a bash script? I was thinking about using a regex to grab file names and then throwing them in a hash to get uniqueness, but I'm hoping there's an easier/better/quicker way. Ideas?
A one-liner tolerant of weird file names could be:
find . -maxdepth 1 -type f -and -iname 'blah*' -print0 | xargs -0 -I {} mv {} ~/target/dir
If the files can start with multiple prefixes, you can use logic operators in find. For example, to move blah.* and foo.*, use:
find . -maxdepth 1 -type f -and \( -iname 'blah.*' -or -iname 'foo.*' \) -print0 | xargs -0 -I {} mv {} ~/target/dir
EDIT
Updated after comment.
Here's how I'd do it:
find ./ -type f -printf '%f\n' | sed 's/\..*//' | sort | uniq | ( while read filename ; do find . -type f -iname "$filename"'*' -exec mv {} /dest/dir \; ; sleep 240; done )
Perhaps it needs some explaination:
find ./ -type f -printf '%f\n': find all files and print just their name, followed by a newline. If you don't want to look in subdirectories, this can be substituted by a simple ls;
sed 's/\..*//': strip the file extension by removing everything after the first dot; both foo.tar and foo.tar.gz are transformed into foo;
sort | uniq: sort the filenames just found and remove duplicates;
(: open a subshell:
while read filename: read a line and put it into the $filename variable;
find . -type f -iname "$filename"'*' -exec mv {} /dest/dir \;: find in the current directory (find .) all the files (-type f) whose name starts with the value in filename (-iname "$filename"'*', this works also for files containing whitespaces in their name) and execute the mv command on each one (-exec mv {} /dest/dir \;)
sleep 240: sleep
): end of subshell.
Add -maxdepth 1 as argument to find as you see fit for your requirements.
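Condensed into a sandbox, the idea looks like this (the sleep 240 is omitted so the demo finishes immediately, and sort -u replaces sort | uniq):

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
dest=$(mktemp -d)
cd "$tmp"
touch blah.pdf blah.txt foo.tar foo.tar.gz

# List unique stems (everything before the first dot), then move each
# stem's whole group of files at once.
find . -maxdepth 1 -type f -printf '%f\n' | sed 's/\..*//' | sort -u |
while IFS= read -r stem; do
    mv "$stem".* "$dest"/
done

ls "$dest"
```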
Nevermind, I'm dumb: there's a uniq command. Duh. The new working script is: find ./ -maxdepth 1 -type f | sed -e 's/\.[a-zA-Z]*$//' | sort | uniq | while read line; do mv "$line"* ~/target_dir/; echo "$line"; sleep 240; done
EDIT: Forgot close tag on code and a backslash.