Newest file in directories - bash

Hi, I have two directories.
Directory 1:
song.mp3
work.txt
Directory 2:
song.mp3
work.txt
The file names are the same, but song.mp3 in directory 1 is newer than song.mp3 in directory 2, and work.txt in directory 2 is newer than work.txt in directory 1.
Now, how can I print this into two files? For example:
in file1, the files that are newer than their counterparts in directory 2 (so it must be song.mp3),
and in file2, the files that are newer than their counterparts in directory 1 (so it must be work.txt).
I tried
find $directory1 -type f -newer $directory2
but it always prints the newest files in both directories. Could someone help me?

-newer $directory2 is just using the timestamp on the directory $directory2 as the reference point for all the comparisons. It doesn't look at any of the files inside $directory2.
I don't think there's anything like a "compare each file to its counterpart in another directory" operation built in to find, so you'll probably have to do some of the work yourself. Here's a short script demonstrating one way it can be done:
(cd "$directory1" && find . -type f -print) | while IFS= read -r fn; do
    if [ "$directory1/$fn" -nt "$directory2/$fn" ]; then
        printf '%s\n' "$directory1/$fn"
    else
        printf '%s\n' "$directory2/$fn"
    fi
done
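To get the two output files from the question, one option (just a sketch based on the loop above; file1 and file2 are the example names from the question, and >> appends, so empty them first if you re-run) is to send each side of the comparison to its own file:
(cd "$directory1" && find . -type f -print) | while IFS= read -r fn; do
    if [ "$directory1/$fn" -nt "$directory2/$fn" ]; then
        printf '%s\n' "$fn" >> file1    # newer in directory1
    elif [ "$directory2/$fn" -nt "$directory1/$fn" ]; then
        printf '%s\n' "$fn" >> file2    # newer in directory2
    fi
done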

# set up the test
mkdir directory1 directory2
touch directory1/song.mp3
touch -t 200101010000 directory2/song.mp3
touch -t 200101010000 directory1/work.txt
touch directory2/work.txt
# find the newest of each filename:
# sort the files in both directories by mtime
# then only output the filename (regardless of directory) the first time seen
stat -c '%Y %n' directory[12]/* |
sort -rn |
cut -d " " -f 2- |
awk -F / '!seen[$2]++'
The output:
directory2/work.txt
directory1/song.mp3

If you are on a Linux that supports the following:
Fileage=`date +%s -r filename`
You could run a "find" and print age in seconds followed by filename for each file and then sort that file. This has the benefit that it will work across any number of directories - not just two. Glenn's more widely available "stat -c" could be used in place of my "date" command - and he's done the "sort" and "awk" for you!

Related

Identify the files year wise and delete from a dir in unix

I need to list the files which were created in a specific year and then delete them. The year should be the input.
I tried with date and it works for me, but I am not able to convert that date to a year for comparison in a loop to get the list of files.
The code below gives me the 05/07 files, but I want to list the files which were created in 2022, 2021, etc.:
for file in /tmp/abc*txt ; do
    [ "$(date -I -r "$file")" == "2022-05-07" ] && ls -lstr "$file"
done
If you end up doing ls -l anyway, you might just parse the date information from the output. (However, generally don't use ls in scripts.)
ls -ltr | awk '$8 ~ /^202[01]$/'
date -r is not portable, though if you have it, you could do
for file in /tmp/abc*txt ; do
case $(date -I -r "$file") in
2020-* | 2021-* ) ls -l "$file";;
esac
done
(The -t and -r flags to ls have no meaning when you are listing a single file anyway.)
If you don't, the tool of choice would be stat, but it too has portability issues; the precise options to get the information you want will vary between platforms. On Linux, try
for file in /tmp/abc*txt ; do
case $(LC_ALL=C stat -c %y "$file") in
2020-* | 2021-* ) ls -l "$file";;
esac
done
On BSD (including MacOS) try stat -f %Sm -t %Y "$file" to get just the year.
If you need proper portability, perhaps look for a scripting language with wide support, such as Perl or Python. The stat() system call is the fundamental resource for getting metainformation about a file. The find command also has some features for finding files by age, though its default behavior is to traverse subdirectories, too (you can inhibit that with -maxdepth 1; but then the options to select files by age are again not entirely POSIX portable).
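For instance, on GNU find (not POSIX, so treat this only as a sketch) you can select files last modified in a given year directly with -newermt:
year=2021
find /tmp -maxdepth 1 -type f -name 'abc*txt' \
    -newermt "${year}-01-01" ! -newermt "$((year+1))-01-01" -print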
To list out files which were last modified in a specific year and then to delete those files, you could use a combination of the find -newer and touch commands:
# given a year as input
year=2022
stampdir=$(mktemp -d)
touch -t ${year}01010000 "$stampdir"/beginning
touch -t $((year+1))01010000 "$stampdir"/end
find /tmp -name 'abc*txt' -type f -newer "$stampdir/beginning" ! -newer "$stampdir/end" -print -delete
rm -r "$stampdir"
First, create a temporary working directory to store the timestamp files; we don't want the find command to accidentally find them. Be careful here; mktemp will probably create a directory in /tmp; this use-case is safe only because we're naming the timestamp files such that they don't match the "abc*txt" pattern from the question.
Next, create bordering timestamp files with the touch command: one stamped at the very start of the given year, named "beginning", and another stamped at the start of the following year, named "end".
Then run the find command; here's the breakdown:
start in /tmp (from the question)
files named with the 'abc*txt' pattern (from the question)
only files (not directories, etc -- from the question)
newer than the beginning timestamp file
not newer (i.e. older) than the end timestamp file
if found, print the filename and then delete it
Finally, clean up the temporary working directory that we created.
Try this:
For checking which files are picked up:
echo -e "Give Year :"
read yr
ls -ltr /tmp | grep "^-" | grep -v ":" | grep "$yr" | awk -F " " '{ print $9; }'
** You can replace { print $9; } with { system("rm /tmp/" $9); } in the above command to delete the picked files ($9 is only the bare file name, hence the /tmp/ prefix)

Automator/AppleScript: Move files with the same prefix to a new folder. The folder name must be the file prefix

I'm a photographer and I have multiple jpg files of clothing in one folder. The file name structure is:
TYPE_FABRIC_COLOR (Example: BU23W02CA_CNU_RED, BU23W02CA_CNU_BLUE, BU23W23MG_LINO_WHITE)
I have to move files with the same TYPE_FABRIC prefix (e.g. BU23W02CA_CNU) into one folder named after that prefix.
For example:
MAIN FOLDER>
BU23W02CA_CNU_RED.jpg, BU23W02CA_CNU_BLUE.jpg, BU23W23MG_LINO_WHITE.jpg
Should become:
MAIN FOLDER>
BU23W02CA_CNU > BU23W02CA_CNU_RED.jpg, BU23W02CA_CNU_BLUE.jpg
BU23W23MG_LINO > BU23W23MG_LINO_WHITE.jpg
Here are some scripts.
V1
#!/bin/bash
find . -maxdepth 1 -type f -name "*.jpg" -print0 | while IFS= read -r -d '' file
do
    # Extract the directory name
    dirname=$(echo "$file" | cut -d'_' -f1-2 | sed 's#\./\(.*\)#\1#')
    #DEBUG echo "$file --> $dirname"
    # Create it if not already existing
    if [[ ! -d "$dirname" ]]
    then
        mkdir "$dirname"
    fi
    # Move the file into it
    mv "$file" "$dirname"
done
it assumes all files that the find lists are of the format you described in your question, i.e. TYPE_FABRIC_COLOR.ext.
dirname is the extraction of the first two words delimited by _ in the file name.
since find lists the files with a ./ prefix, it is removed from the dirname as well (that is what the sed command does).
the find specifies the name of the files to consider as *.jpg. You can change this to something else, if you want to restrict which files are considered in the move.
this version loops through each file, creates a directory named after its first two sections (if it does not exist already), and moves the file into it.
if you want to see what the script is doing to each file, you can add option -v to the mv command. I used it to debug.
However, since it loops through each file one by one, this might take time with a large number of files, hence this next version.
V2
#!/bin/bash
while IFS= read -r dirname
do
    echo ">$dirname"
    # Create it if not already existing
    if [[ ! -d "$dirname" ]]
    then
        mkdir "$dirname"
    fi
    # Move the file into it
    find . -maxdepth 1 -type f -name "${dirname}_*" -exec mv {} "$dirname" \;
done < <(find . -maxdepth 1 -type f -name "*.jpg" -print | sed 's#^\./\(.*\)_\(.*\)_.*\..*$#\1_\2#' | sort | uniq)
this version loops on the directory names instead of on each file.
the last line does the "magic". It finds all files, and extracts the first two words (with sed) right away. Then these words are sorted and "uniqued".
the while loop then creates each directory one by one.
the find inside the while loop moves all files that match the directory being processed into it. Why did I not simply do mv ${dirname}_* ${dirname}? Because the expansion of the * wildcard could result in an argument list that is too long for the mv command. Doing it with find ensures that it will work even with a LARGE number of files.
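If your tools support it, the per-file mv calls inside that loop could also be batched; a sketch assuming GNU mv (for -t) and find's -exec ... + form:
find . -maxdepth 1 -type f -name "${dirname}_*" -exec mv -t "$dirname" {} +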
Suggesting a one-liner awk script:
echo "$(ls -1 *.jpg)"| awk '{system("mkdir -p "$1 OFS $2);system("mv "$0" "$1 OFS $2)}' FS=_ OFS=_
Explanation:
echo "$(ls -1 *.jpg)": List all jpg files in current directory one file per line
FS=_ : Set awk field separator to _ $1=type $2=fabric $3=color.jpg
OFS=_ : Set awk output field separator to _
awk script explanation
{ # for each file name from list
system ("mkdir -p "$1 OFS $2); # execute "mkdir -p type_fabric"
system ("mv " $0 " " $1 OFS $2); # execute "mv current-file to type_fabric"
}
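If the file names may contain spaces, a null-delimited variant of the same mkdir/mv idea avoids parsing ls output entirely; this is only a sketch:
find . -maxdepth 1 -type f -name "*_*_*.jpg" -print0 |
while IFS= read -r -d '' file; do
    base=${file##*/}          # strip the leading ./
    dir=${base%_*}            # keep TYPE_FABRIC, drop the trailing _COLOR.jpg
    mkdir -p "$dir" && mv "$file" "$dir"/
done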

Shell Script: How to copy files with specific string from big corpus

I have a small bug and don't know how to solve it. I want to copy files from a big folder with many files, where the files contain a specific string. For this I use grep, ack or (in this example) ag. When I'm inside the folder it matches without problem, but when I want to do it with a loop over the files in the following script, it doesn't loop over the matches. Here is my script:
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while read -d $'\0' file; do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done
SEARCH_QUERY holds the String I want to find inside the files, INPUT_DIR is the folder where the files are located, OUTPUT_DIR is the folder where the found files should be copied to. Is there something wrong with the while do?
EDIT:
Thanks for the suggestions! I took this one now, because it also looks for files in subfolders and saves a list with all the files.
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" > "output_list.txt"
while read file
do
    echo "${file##*/}"
    cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < "output_list.txt"
Better to implement it like below, with a find command:
find "${INPUT_DIR}" -name "*.*" | xargs grep -l "${SEARCH_QUERY}" > /tmp/file_list.txt
while read file
do
    echo "$file"
    cp "${file}" "${OUTPUT_DIR}/"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
or another option:
grep -l "${SEARCH_QUERY}" "${INPUT_DIR}/*.*" > /tmp/file_list.txt
while read file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
if you do not mind doing it in just one line, then
grep -lr 'ONE\|TWO\|THREE' | xargs -I xxx -P 0 cp xxx dist/
guide:
-l just print file name and nothing else
-r search recursively the CWD and all sub-directories
match any of these words: 'ONE' or 'TWO' or 'THREE'
| pipe the output of grep to xargs
-I xxx the name of each file is saved in xxx; it is just a placeholder
-P 0 run all the commands (= cp) in parallel (= as fast as possible)
cp each file xxx to the dist directory
If I understand the behavior of ag correctly, then you have to
adjust the read delimiter to '\n' or
use ag -0 -l to force delimiting by '\0' (see the sketch below)
to solve the problem in your loop.
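For example, the second option could look like this (just a sketch, assuming your ag supports the -0 flag):
ag -0 -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while IFS= read -r -d '' file; do
    cp "$file" "${OUTPUT_DIR}/${file##*/}"   # strip the directory part before copying
done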
Alternatively, you can use the following script, that is based on find instead of ag.
while read file; do
    echo "$file"
    cp "$file" "$OUTPUT_DIR/${file##*/}"
done < <(find "$INPUT_DIR" -name "*$SEARCH_QUERY*" -print)

Delete all files in a directory matching a time pattern

I take a backup of one important folder every day using cron. The backup file is stored with the current date in its name.
Now my requirement is that I need to keep only the current day's and the last two days' backups.
i.e. I want to keep only:
test_2016-11-04.tgz
test_2016-11-03.tgz
test_2016-11-02.tgz
The remaining files should be deleted automatically. Please let me know how to do this in a shell script.
Below is my backup folder structure.
test_2016-10-30.tgz test_2016-11-01.tgz test_2016-11-03.tgz
test_2016-10-31.tgz test_2016-11-02.tgz test_2016-11-04.tgz
With ls -lrt | head -n -3 | awk '{print $9}'
you can print all but the 3 newest files in your directory.
Passing this output into rm, you obtain the desired result.
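Put together, that might look like this (a sketch; head -n -3, xargs -r and xargs -d are GNU extensions, and it assumes the file names contain no newlines):
ls -rt test_*.tgz | head -n -3 | xargs -r -d '\n' rm -f --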
You could append this to the end of your backup script:
find ./backupFolder -name "test_*.tgz" -mtime +3 -type f -delete
Or use this:
ls -1 test_*.tgz | sort -r | awk 'NR > 3 { print }' | xargs -d '\n' rm -f --
Generate an array on files you want to keep:
names=()
for d in {0..2}; do
names+=( "test_"$(date -d"$d days ago" "+%Y-%m-%d")".tgz" )
done
so that it looks like this:
$ printf "%s\n" "${names[#]}"
test_2016-11-04.tgz
test_2016-11-03.tgz
test_2016-11-02.tgz
Then, loop through the files and keep those that are not in the array:
for file in test_*.tgz; do
[[ ! ${names[*]} =~ "$file" ]] && echo "remove $file" || echo "keep $file"
done
If run on your directory, this would result in output like:
remove test_2016-10-30.tgz
remove test_2016-10-31.tgz
remove test_2016-11-01.tgz
keep test_2016-11-02.tgz
keep test_2016-11-03.tgz
keep test_2016-11-04.tgz
So now it is just a matter of replacing those echo commands with something more meaningful, like rm.
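That last loop would then become something like this (a sketch, mirroring the check above):
for file in test_*.tgz; do
    [[ ! ${names[*]} =~ "$file" ]] && rm -- "$file"
done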

HandBrakeCLI bash script convert all videos in a folder

Firstly, I searched around for my problem, but nothing solved it.
I want to convert all video files in a directory and save the output to another directory. I got a bash script from somewhere I don't remember.
#!/bin/bash
SRC="/home/abc/public_html/filex/store/vids/toriko/VIDEOS HERE"
DEST="/home/abc/public_html/filex/store/vids/toriko/51-100"
DEST_EXT=mp4
HANDBRAKE_CLI=HandBrakeCLI
PRESET="iPhone & iPod Touch"
for FILE in "`ls $SRC`"
do
    filename=$(basename $FILE)
    extension=${filename##*.}
    filename=${filename%.*}
    $HANDBRAKE_CLI -i "$SRC"/$FILE -o "$DEST"/"$filename".$DEST_EXT "$PRESET"
done
The problem is that the output file has no filename, only ".mp4".
Also, only 1 file is generated; that means, out of 50 videos in the folder, only 1 file is generated, named ".mp4", and after that HandBrakeCLI exits.
Can anyone fix my code?
I have no experience in bash coding, so the right script would be appreciated :)
Your line
for FILE in "`ls $SRC`"
effectively creates only one iteration where FILE contains the list of the files (and it is not able to handle the space in $SRC). Better replace it with
for FILE in "$SRC"/*
Example:
$ ls test
1.txt 2.txt
$ SRC=test; for f in "`ls $SRC`" ; do echo $f; done
1.txt 2.txt
$ SRC=test; for f in "$SRC"/* ; do echo $f; done
test/1.txt
test/2.txt
Side note: you can have a space in there with no problem
$ ls "the test"
1.txt 2.txt
$ SRC="the test"; for f in "$SRC"/* ; do echo $f; done
the test/1.txt
the test/2.txt
I tried this script, and others like it, but I wanted to convert recursive directory trees, have the files placed in the same directory with the .mp4 extension, and delete the .avi files. After much trial and error I gave up on this code and searched for new code. I'd like to credit
http://www.surlyjake.com/blog/2010/08/10/script-to-run-handbrake-recursively-through-a-folder-tree/
for the original code!
Here is my modified script (barely modified, BTW); this script is short, sweet and easy to understand.
#!/bin/bash
# This Script Goes in Root Folder of TV show -- Example Folder Structure
# /Stargate/Season\ 1/Epiosde.avi
# /Stargate/Season\ 2/Epiosde.avi
# /Stargate/handbrake_folder.script
# Outputs all Files back inside same dir's and does all folders inside Startgate DIR
# /Stargate/Season\ 1/Epiosde.mp4
# /Stargate/Season\ 2/Epiosde.mp4
# PRESET = -o flags for CLI can be got from GUI under Activity Log or from https://trac.handbrake.fr/wiki/CLIGuide OR you can use actual Presets!
# PRESET="iPhone & iPod Touch"
PRESET="--modulus 2 -e x264 -q 20 --vfr -a 1 -E ac3 -6 5point1 -R Auto -B 384 -D 0 --gain 0 --audio-fallback ac3 --encoder-preset=veryfast --encoder-level="5.2" --encoder-profile=high --verbose=1"
if [ -z "$1" ] ; then
TRANSCODEDIR="."
else
TRANSCODEDIR="$1"
fi
find "$TRANSCODEDIR"/* -type f -name "*.avi" -exec bash -c 'HandBrakeCLI -i "$1" -o "${1%\.*}".mp4 --preset="$PRESET"' __ {} \; && find . -name '*.avi' -exec rm -r {} \;
BE WARNED: THIS WILL CONVERT AND THEN DELETE ALL .AVI FILES BELOW THE SCRIPT IN THE FILE TREE!
Feel free to remove the
[-name "*.avi"] & [&& find . -name '*.avi' -exec rm -r {} \;]
to disable only converting .avi and removal of .avi, or modify it to suit another extension.
I have found the solution:
#!/bin/bash
SRC="/home/abc/public_html/filex/store/vids/toriko/VIDEOS HERE"
DEST="/home/abc/public_html/filex/store/vids/toriko/51-100"
DEST_EXT=mp4
HANDBRAKE_CLI=HandBrakeCLI
for FILE in "$SRC"/*
do
filename=$(basename "$FILE")
extension=${filename##*.}
filename=${filename%.*}
$HANDBRAKE_CLI -i "$FILE" -o "$DEST"/"$filename".$DEST_EXT
done
I just tried using this script with the modification suggested above. I found I need to put double quotes around the two uses of $FILE in order to handle file names with spaces.
So...
filename=$(basename "$FILE")
and
$HANDBRAKE_CLI -i "$SRC"/"$FILE" -o "$DEST"/"$filename".$DEST_EXT "$PRESET"
I'd rather prefer this solution:
#!/bin/bash
SRC="$1"
DEST="$2"
EXT='mp4'
PRESET='iPhone & iPod Touch'
#for FILE in "`ls $SRC`"; do
find "$SRC" -type f -print0 | while IFS= read -r -d '' FILE; do
    filename=$(basename "$FILE")
    extension=${filename##*.}
    filename=${filename%.*}
    HandBrakeCLI -i "$FILE" -o "$DEST"/"$filename"."$EXT" --preset "$PRESET"
done
