Synchronize timestamps on directories - bash

Let's say I have two directories with the same structure, and I want to set the timestamps of the files contained in the second to those of the first, if and only if the content of the files is the same.
I give an answer below, but if you have less clumsy and more efficient ways of achieving the goal, that would be perfect.

An even easier way is simply:
rsync -uav /path/to/dir1/ /path/to/dir2
(removing the v suppresses the --verbose output)
Note the trailing '/' following dir1: it tells rsync to take the contents of dir1 instead of dir1 itself.
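A quick illustration of the difference (paths are placeholders):
rsync -ua /path/to/dir1/ /path/to/dir2   # contents of dir1 land directly in dir2
rsync -ua /path/to/dir1 /path/to/dir2    # creates /path/to/dir2/dir1/... instead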

One possible solution is this script:
#!/bin/bash
OLDDIR=$(readlink -f "$1")
NEWDIR=$(readlink -f "$2")
cd "$NEWDIR" || exit 1
for file in $(find . -type f); do
    file2=$OLDDIR/$file
    # same content in both trees? then copy the old timestamp over
    if test -e "$file2" && diff -q "$file" "$file2" >/dev/null; then
        touch -r "$file2" "$file"
    fi
done
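Note that $(find .) word-splits on whitespace. A variant that is safe for arbitrary filenames (a sketch along the same lines, untested) uses NUL delimiters and cmp -s for a quiet byte-wise comparison:
OLDDIR=$(readlink -f "$1")
NEWDIR=$(readlink -f "$2")
cd "$NEWDIR" || exit 1
find . -type f -print0 | while IFS= read -r -d '' file; do
    # cmp -s is silent and fails if "$OLDDIR/$file" is missing or differs
    if cmp -s "$OLDDIR/$file" "$file"; then
        touch -r "$OLDDIR/$file" "$file"
    fi
done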

Related

Tar compress files when some can be missing

I am writing a bash script that pulls files from another server to the current directory. The issue is that I get a lot of files and I only need ~3 of them; however all 3 might not be there.
For example, the server call returns:
file1.txt file2.txt file3.xls file4.json ... (etc)
Then I compress the needed files with tar:
tar zcf needed_files.tgz file4.json file23.doc *.txt
But when file4.json is not there, I would expect tar to compress file23.doc and all the .txt files anyway; instead the script fails with:
tar: file4.json: Cannot stat: No such file or directory
I have tried other combinations of tar commands, like czvf, but no luck.
I want tar to compress the existing files regardless of the "no such file or directory" errors.
Anyway, you could also use nullglob in combination with the extglob @() to get only the existing files:
shopt -s extglob nullglob
files=( "fileA"@() "fileB"@() *.txt )
(( ${#files[@]} )) && tar zcf needed_files.tgz -- "${files[@]}"
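Why this works, briefly: with nullglob set, a glob that matches nothing expands to zero words, and appending @() (extglob for "match exactly one of the listed patterns", here none) turns a literal filename into a glob. A minimal demonstration:
shopt -s extglob nullglob
touch fileA            # fileB deliberately not created
files=( "fileA"@() "fileB"@() )
echo "${#files[@]}"    # prints 1: only the existing fileA survived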
Try an extended glob.
shopt -s extglob # set extended globbing on
if echo file[1234].+(txt|xls|json) | grep -vq '\['
then tar cvzf needed_files.tgz file[1234].+(txt|xls|json)
else echo No matching files for extglob 'file[1234].+(txt|xls|json)'
fi
If matching files exist, it will list them.
If not, it will literally echo back the pattern.
Grepping for the pattern's [ metacharacter therefore tells you whether any files in the set exist. If they do, the same glob passes exactly the set of matching files to tar. If they don't, the condition test lets you skip the tar call.
Of course, it breaks if you make files with [ in the names, etc...
Or, you could do it in a loop....
for f in file[1234].+(txt|xls|json)
do  if [[ -e "$f" ]]
    then [[ -e needed_files.tar ]] && c=r || c=c
         tar ${c}vf needed_files.tar "$f"
    fi
done
Not perfect, but might suit your tastes better.
Neither is a great solution, but one of them ought to get you rolling.
tar zcf needed_files.tgz $(ls -d file4.json file23.doc *.txt 2>/dev/null)
Note that this prints only the existing files:
ls -d file4.json file23.doc *.txt 2>/dev/null
Also, you can use the --ignore-failed-read option, but it will also ignore other read errors.
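With GNU tar that would look something like this (the missing file is reported as a warning, but the exit status should stay zero):
tar zcf needed_files.tgz --ignore-failed-read file4.json file23.doc *.txt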

Shell Script: How to copy files with specific string from big corpus

I have a small bug and don't know how to solve it. I want to copy files that contain a specific string from a big folder with many files. For this I use grep, ack or (in this example) ag. When I'm inside the folder it matches without problem, but when I try to loop over the files in the following script, it doesn't loop over the matches. Here is my script:
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while read -d $'\0' file; do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done
SEARCH_QUERY holds the String I want to find inside the files, INPUT_DIR is the folder where the files are located, OUTPUT_DIR is the folder where the found files should be copied to. Is there something wrong with the while do?
EDIT:
Thanks for the suggestions! I took this one, because it also finds files in subfolders and saves a list of all the files.
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" > "output_list.txt"
while read file
do
    echo "${file##*/}"
    cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < "output_list.txt"
Better to implement it with a find command, like below:
find "${INPUT_DIR}" -name "*.*" | xargs grep -l "${SEARCH_QUERY}" > /tmp/file_list.txt
while read file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
or another option (note the glob must stay outside the quotes, or it will not expand):
grep -l "${SEARCH_QUERY}" "${INPUT_DIR}"/*.* > /tmp/file_list.txt
while read file
do
    echo "$file"
    cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
if you do not mind doing it in just one line, then
grep -lr 'ONE\|TWO\|THREE' | xargs -I xxx -P 0 cp xxx dist/
guide:
-l just print the file name and nothing else
-r search recursively through the CWD and all sub-directories
'ONE\|TWO\|THREE' match any of these words: 'ONE', 'TWO' or 'THREE'
| pipe the output of grep to xargs
-I xxx use xxx as a placeholder for each file name
-P 0 run as many of the commands (= cp) in parallel as possible
cp copy each file xxx to the dist directory
If I understand the behavior of ag correctly, then you have to
adjust the read delimiter to '\n', or
use ag -0 -l to force NUL-delimited output,
to solve the problem in your loop.
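For example, a NUL-delimited version of your loop (a sketch, assuming your ag build supports -0/--null):
ag -0 -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while IFS= read -r -d '' file; do
    echo "$file"
    cp "$file" "${OUTPUT_DIR}/${file##*/}"
done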
Alternatively, you can use the following script, that is based on find instead of ag.
while read -r file; do
    echo "$file"
    cp "$file" "$OUTPUT_DIR/$file"
done < <(find "$INPUT_DIR" -name "*$SEARCH_QUERY*" -print)

Collapse nested directories in bash

Often after unzipping a file I end up with a directory containing nothing but another directory (e.g., mkdir foo; cd foo; tar xzf ~/bar.tgz may produce nothing but a bar directory in foo). I wanted to write a script to collapse that down to a single directory, but if there are dot files in the nested directory it complicates things a bit.
Here's a naive implementation:
mv -i $1/* $1/.* .
rmdir $1
The only problem here is that it'll also try to move . and .. and ask overwrite ./.? (y/n [n]). I can get around this by checking each file in turn:
IFS=$'\n'
for file in $1/* $1/.*; do
    if [ "$file" != "$1/." ] && [ "$file" != "$1/.." ]; then
        mv -i $file .
    fi
done
rmdir $1
But this seems like an inelegant workaround. I tried a cleaner method using find:
for file in $(find $1); do
    mv -i $file .
done
rmdir $1
But find $1 will also give $1 as a result, which gives an error of mv: bar and ./bar are identical.
While the second method seems to work, is there a better way to achieve this?
Turn on the dotglob shell option, which allows your pattern to match files beginning with a dot (.):
shopt -s dotglob
mv -i "$1"/* .
rmdir "$1"
First, consider that many tar implementations provide a --strip-components option that allows you to strip off that first path component. Not sure whether the archive has a single top-level directory?
tar -tf yourball.tar | awk -F/ '!s[$1]++{print $1}'
will show you all the first-level contents. If there is only that one directory, then
tar --strip-components=1 -xf yourball.tar
will extract the contents of that directory into the current directory.
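Putting the two together, a small guard (an untested sketch; yourball.tar is a placeholder) could be:
# count distinct first-level entries; strip only when there is exactly one
top_count=$(tar -tf yourball.tar | awk -F/ '!s[$1]++{print $1}' | wc -l)
if [ "$top_count" -eq 1 ]; then
    tar --strip-components=1 -xf yourball.tar
fi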
So that's how you can avoid the problem altogether. But it also solves your immediate problem: having already extracted the files, you have
foo/bar/stuff
foo/bar/.otherstuff
you can do
tar -cf - foo | tar --strip-components=2 -C final_destination -xf -
The --strip-components feature is not part of the POSIX specification for tar, but it is available in both the common GNU and macOS/BSD implementations.

Recursively rename image collection with subfolders

I'm trying to rename files in a huge folder of images, that contains lots of subfolders and within them images.
Something like this:
ImageCollection/
    January/
        Movies/
            123123.jpg
            asd.jpg
        Landscapes/
            qweqas.jpg
    February/
        Movies/
            ABC.jpg
            QWY.jpg
        Landscapes/
            t.jpg
And I want to run the script and rename them in ascending order but keeping them in their corresponding folder, like this:
ImageCollection/
    January/
        Movies/
            0.jpg
            1.jpg
        Landscapes/
            2.jpg
    February/
        Movies/
            3.jpg
            4.jpg
        Landscapes/
            5.jpg
Until now I have the following:
#!/usr/bin/env bash
x=0
for i in path/to/dir/*/*.jpg; do
    new=$(printf path/to/dir/%d ${x})
    mv ${i} ${new}
    let x=x+1
done
But my problem is that I'm not able to keep the files in their corresponding subfolders; instead everything is moved to the path/to/dir root folder.
A pure Bash solution (except for the mv, of course):
#!/bin/bash
shopt -s nullglob
### Optional: if you also want the .JPG (uppercase) files
# shopt -s nocaseglob
i=1
for file in ImageCollection/*/*.jpg; do
    dirname=${file%/*}
    newfile=$dirname/$i.jpg
    echo mv "$file" "$newfile" && ((++i))
done
This will not perform the renaming, only show what's going to happen. Remove the echo if you're happy with the result you see.
You can also pass the -n option to mv, so as not to overwrite existing files. (I would definitely use it in this case!) If -n is not available, you may use:
[[ ! -e $newfile ]] && mv "$file" "$newfile" && ((++i))
This is 100% safe regarding filenames (or dirnames) containing spaces or other funny symbols.
#!/bin/bash
x=0
for f in `find path_to_main_dir_or_top_folder | grep "\.jpg$"`; do
    mv $f $(dirname $f)/$x.jpg && ((x++))
done
echo shenzi
$f will hold the full path of each *.jpg file.
The dirname command gives you the full path (excluding the filename) of a given file $f.
$x.jpg does the trick; the value of $x increments on each iteration of the loop.
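A NUL-safe sketch of the same idea (untested), in case the image names contain spaces or other funny symbols:
x=0
while IFS= read -r -d '' f; do
    mv "$f" "$(dirname "$f")/$x.jpg" && ((x++))
done < <(find path_to_main_dir_or_top_folder -name '*.jpg' -print0)
# process substitution (rather than a pipe) keeps the $x counter in the current shell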

BASH parameters with wildcard

I'm trying to write a bash script that will find & copy similar files to a destination directory.
For example, I'm passing a parameter 12300 to a script and I want to copy all files that start with 12300... to a new directory.
like this:
sh script.sh 12300
and here's the script:
if [ -f /home/user/bashTest/$#*.jpg ]
then
    cp /home/user/bashTest/$#*.jpg /home/user/bashTest/final/
fi
This just doesn't work. I have tried all kinds of solutions but nothing has worked.
The question is: How can I use wildcard with parameter?
When you're checking for multiple files with -f or -e it can get nasty. I recommend kenfallon's blog. This is something like what he would recommend:
#! /bin/bash
ls -l /home/user/bashTest/$1*.jpg > /dev/null
if [ "$?" = "0" ]
then
    cp /home/user/bashTest/$1*.jpg /home/user/bashTest/final/
fi
Not sure how the $# would play in here, or if it's required.
Enclose the thing that expands to the parameters in {}, i.e. /home/user/bashTest/${#}*.jpg. However, you should use $1 instead of $# in your case, as you only seem to handle the first argument given to the script. $1 expands to the first argument, $2 to the second, etc.
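A quick illustration of the positional parameters (hypothetical invocation):
# sh script.sh 12300 foo
# inside the script:
#   $1  expands to 12300   (first argument)
#   $2  expands to foo     (second argument)
#   $#  expands to 2       (the argument count, which is why $#*.jpg never matched)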
You also need a loop to iterate over all files that this glob expands to, e.g.
for file in /tmp/${1}*.jpg
do
    if [ -f "$file" ]
    then
        echo "$file"
    fi
done
Here is a solution:
#!/bin/bash
cp /home/user/bashTest/${1}*.jpg /home/user/bashTest/final/
Discussion
In this case, a simple cp command will do
I have tested it with files that have embedded spaces
Write this in script.sh:
cp /home/user/bashTest/$1*.jpg /home/user/bashTest/final/
That's all.
UPD. @macduff's solution is useful too.
This will find all of them in your $HOME directory and subdirectories (you may wish to tweak find to follow/not follow symlinks, and/or adjust the $HOME base directory where it starts the search):
#!/bin/sh
DEST=/your/dest/folder
for FILE in `find "$HOME" -iname "$1*"`; do
    [ -f "$FILE" ] && mv "$FILE" "$DEST/"
    # or ln -s ... if you want to keep the file in its original location
done
if you want to do multiple patterns using $@:
for PATTERN in "$@"; do
    for FILE in `find "$HOME" -iname "$PATTERN*"`; do
        [ -f "$FILE" ] && mv "$FILE" "$DEST/"
    done
done
