file name comparison within specific folders and sub-folders using bash script - bash

I would like to do some file name comparison in a bash script to determine whether a file should go through a compress routine or not.
Here is what I want to do: look through the UPLOAD folder and all of its sub-folders (a couple of hundred folders in total). If filenameA.jpg and filenameA.orig both exist in the same folder, the file was compressed before and there is no need to compress it again; otherwise, compress the filenameA.jpg file.
This way only newly added files are compressed, not files that were already compressed before.
Can someone tell me how to write the if / loop statements in a bash script? I plan to run it as a cron job.
Thank you for your help.

Use find to recursively search for all files named *.jpg.
For each file returned, check for a corresponding ".orig" file and, based on the result, compress or not.
Something like this should get you started:
find UPLOAD -type f -name '*.jpg' | while IFS= read -r JPG
do
    ORIG="${JPG%.jpg}.orig"
    if [ -s "${ORIG}" ]
    then
        echo "File ${JPG} already compressed to ${ORIG}"
    else
        echo "File ${JPG} needs compressing ..."
        gzip -c "${JPG}" > "${ORIG}"
    fi
done
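Since the plan is to run this from cron, a crontab entry along these lines would work; the script path, schedule, and log file here are assumptions, not anything from the question:

```
# m h dom mon dow  command — run the compression sweep nightly at 02:30
30 2 * * * /usr/local/bin/compress_uploads.sh >> /var/log/compress_uploads.log 2>&1
```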

Related

how to zip files that are not zipped - shell script

I have a directory in which I have been storing a lot of files, so I'm working on a script to watch the disk space: if usage gets above 80%, it will compress the files.
All the files end with file.#
My question is how to zip all the files that end with a number without re-zipping the already zipped files.
I have done most of the script but I'm stuck at this point.
Please help.
You can zip the files that are output by this command: find . -not -name "*.zip".
find is a command that is used, well, to "find" files based on various criteria.
You can read more about it using man find or (online version) here
Simply run the zip command with the -x argument to exclude already zipped files from being added to the compressed archive. The command will look like:
zip -r compressed.zip . -x "*.zip"
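The "watch the disk space" part of the question can be sketched separately. This helper (the function name and the 80% default are assumptions) reports whether the filesystem holding a directory is past the threshold:

```shell
#!/bin/sh
# Sketch: succeed when the filesystem holding $1 is more than
# $2 percent full (default 80, the figure from the question).
over_threshold() {
    dir=$1
    threshold=${2:-80}
    # df -P prints the capacity column as e.g. "42%"; strip the % sign
    usage=$(df -P "$dir" | awk 'NR==2 { gsub(/%/, ""); print $5 }')
    [ "$usage" -gt "$threshold" ]
}
```

The check can then guard the compression step, e.g. `if over_threshold /srv/files; then zip -r compressed.zip /srv/files -x "*.zip"; fi` (the path is a placeholder).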

Why does 7z create different files?

I'm using the 7z command in a bash script to create a 7z archive for backup purposes. My script also checks whether this newly created 7z archive already exists in my backup folder, and if it does, I run md5sum to see if the content differs. So if the archive file doesn't exist yet, or its md5sum differs from the previous one, I copy it to my backup folder. I tried a simple example to test the script, but the problem is that I sometimes get a different md5sum for the same folder I am compressing. Why is that? Is there any other reliable way of checking whether file content differs? The commands are simple:
SourceFolder="/home/user/Documents/"
for file in "$SourceFolder"*
do
    localfile=${file##*/}
    7z a -t7z "$SourceFolder${localfile}.7z" "$file"
    md5value=$(md5sum "$SourceFolder${localfile}.7z" | cut -d ' ' -f 1)
    ...copying files goes from here on...
done
The reliable way to check if two different losslessly compressed files have identical contents is to expand their contents and compare those (e.g. using md5sum). Comparing the compressed files is going to end badly sooner or later, regardless of which compression scheme you use.
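A minimal sketch of that idea for 7z, assuming two hypothetical archive names a.7z and b.7z and using 7z's -so switch to stream the extracted data to stdout (for multi-file archives the stream order matters, so treat this as a starting point rather than a definitive check):

```shell
#!/bin/sh
# Sketch: hash what is *inside* the archive rather than the archive bytes.
content_sum() {
    7z x -so "$1" 2>/dev/null | md5sum | cut -d ' ' -f 1
}
if [ "$(content_sum a.7z)" = "$(content_sum b.7z)" ]; then
    echo "archives hold identical content"
fi
```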
I've partially solved this. It looks like it matters whether you specify the full path to the folder you are compressing or not; the resulting file is not the same. This affects both 7z and tar. I mean like this:
value1=$(tar -c /tmp/at-spi2/|md5sum|cut -d ' ' -f 1)
value2=$(tar -c at-spi2/|md5sum|cut -d ' ' -f 1)
So obviously I'm doing this wrong. Is there a switch for 7z and tar which would remove the absolute path?
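For tar the answer to the follow-up is the -C switch, which changes directory before archiving so only relative names are stored; 7z has no such switch, but a subshell cd achieves the same. A self-contained sketch (the demo directory is created here purely for illustration):

```shell
#!/bin/sh
# Sketch: store "at-spi2/file.txt" in the archive instead of the
# absolute "/tmp/at-spi2/file.txt", so both checksums agree.
base=$(mktemp -d)
mkdir -p "$base/at-spi2"
echo demo > "$base/at-spi2/file.txt"

value1=$(tar -C "$base" -c at-spi2 | md5sum | cut -d ' ' -f 1)
value2=$( (cd "$base" && tar -c at-spi2) | md5sum | cut -d ' ' -f 1)
echo "$value1 $value2"
```

For 7z the subshell form is the one to use, e.g. `(cd "$base" && 7z a backup.7z at-spi2)`.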

Copying multiple files with same name in the same folder terminal script

I have a lot of files named the same, with a directory structure (simplified) like this:
../foo1/bar1/dir/file_1.ps
../foo1/bar2/dir/file_1.ps
../foo2/bar1/dir/file_1.ps
.... and many more
As it is extremely inefficient to view all of those ps files by going into each respective directory, I'd like to copy all of them into another directory, but include the names of the first two directories (which are the ones relevant to my purpose) in the file name.
I have previously tried the following, but I cannot tell which file came from where, as they are all named consecutively:
#!/bin/bash -xv
cp -v --backup=numbered {} */*/dir/file* ../plots/;
Where ../plots is the folder I copy them to. However, they are now of the form file.ps.~x~ (x is a number), so I get rid of the ".ps.~*~" and leave only the ps extension with:
rename 's/\.ps.~*~//g' *;
rename 's/\~/.ps/g' *;
Then, as the ps files have hundreds of points sometimes and take a long time to open, I just transform them into jpg.
for file in * ; do convert -density 150 -quality 70 "$file" "${file/.ps/}".jpg; done;
This is not really a working bash script, as I have to change the directory manually.
I guess the best way to do it is to copy the files from the beginning with the names of the first two directories incorporated in the copied filename.
How can I do this last thing?
If you just have two levels of directories, you can use
for file in */*/*.ps
do
ln "$file" "${file//\//_}"
done
This goes over each ps file, and hard links them to the current directory with the /s replaced by _. Use cp instead of ln if you intend to edit the files but don't want to update the originals.
For arbitrary directory levels, you can use the bash specific
shopt -s globstar
for file in **/*.ps
do
ln "$file" "${file//\//_}"
done
But are you sure you need to copy them all to one directory? You might be able to open them all with yourreader */*/*.ps (where yourreader is your PostScript viewer), which, depending on the reader, may let you browse through them one by one while still seeing the full path.
You should first run a find command and print the names, like
find . -name "file_1.ps" -print
Then iterate over each of them and replace every / with - (or any other character), like
${filename//\//-}
The general syntax is ${string/substring/replacement} to replace the first match, or ${string//substring/replacement} to replace all matches, which is what you want here. Then you can copy it to the required directory. The complete script can be written as follows. Haven't tested it (not on linux at the moment), so you might need to tweak the code if you get any syntax error ;)
find . -name "file_1.ps" -print | while IFS= read -r filename
do
    newFileName=${filename#./}          # drop the leading ./
    newFileName=${newFileName//\//-}    # replace every / with -
    cp "$filename" YourNewDirectory/"$newFileName"
done
You will need to place the script in the same root directory, or change the find command to look at a particular directory, if you are placing the above script somewhere else.
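For file names that may contain spaces or even newlines, a bash-specific variant using find -print0 is more robust. The function name and the destination argument are assumptions for the sketch:

```shell
#!/bin/bash
# Sketch: copy every matching file from $1 into $2, flattening the
# relative path into the file name (slashes become dashes).
flatten_copy() {
    local root=$1 dest=$2
    mkdir -p "$dest"
    find "$root" -name 'file_1.ps' -print0 |
    while IFS= read -r -d '' filename; do
        local new=${filename#"$root"/}   # drop the search-root prefix
        new=${new//\//-}                 # replace every / with -
        cp "$filename" "$dest/$new"
    done
}
```

Usage would be e.g. `flatten_copy . YourNewDirectory`.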
References
string manipulation in bash
find man page

Bash script to find specific files in a hierarchy of files

I have a folder which contains many, many folders, and each of these contains lots and lots of files. I have no idea which folder each file might be located in. I will periodically receive a list of files I need to copy to a predefined destination.
The script will run on a Unix machine.
So, my little script should:
read received list
find all files in the list
copy each file to a predefined destination via SCP
Steps 1 and 3 I think I'll manage on my own, but how do I do step 2?
I was thinking about using find to locate each file and, when found, write the location into a string array. When all files are found, I loop through the array, running the scp command for each file location.
I think this should work, but I've never written a bash script before, so could anyone help me a little to get started? I just need a basic find command which finds a file by name and returns its location if the file is found.
find "$dir" -name "$name" -exec scp {} "$destination" \;
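Putting the three steps together, a fuller sketch could look like the following; the function name, list file, search root, and destination are all placeholders:

```shell
#!/bin/sh
# Sketch: read the list line by line, locate each file under the
# search root, and hand every match to scp.
copy_listed() {
    dir=$1
    list=$2
    destination=$3
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        find "$dir" -name "$name" -exec scp {} "$destination" \;
    done < "$list"
}
```

It would be invoked like `copy_listed /data filelist.txt user@backuphost:/incoming/`.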

Batch script to move files into a zip

Is anybody able to point me in the right direction for writing a batch script for a UNIX shell to move files into a zip one at a time and then delete the originals?
I can't use the standard zip function because I don't have enough space to fit the zip being created.
Any suggestions, please.
Try this:
zip -r -m source.zip *
Not a great solution, but simple: I ended up finding a Python script that recursively zips a folder, and just added a line to delete each file after it is added to the zip.
You can achieve this using find as
find . -type f -print0 | xargs -0 -n1 zip -m archive
This will move every file into the zip, preserving the directory structure. You are then left with empty directories that you can easily remove. Moreover, using find gives you a lot of freedom in selecting which files to compress.
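The leftover empty directories can be removed with find as well; a small helper (the function name and default are assumptions):

```shell
#!/bin/sh
# Sketch: delete the directories that `zip -m` left empty under $1.
# find's -delete implies depth-first order, so nested empties go first.
cleanup_empty_dirs() {
    find "${1:-.}" -type d -empty -delete
}
```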
I use :
zip --move destination.zip src_file1 src_file2
Here is the detail of the "--move" option from the man page:
--move
Move the specified files into the zip archive; actually, this
deletes the target directories/files after making the specified zip
archive. If a directory becomes empty after removal of the files, the
directory is also removed. No deletions are done until zip has
created the archive without error. This is useful for conserving disk
space, but is potentially dangerous so it is recommended to use it in
combination with -T to test the archive before removing all input
files.
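A concrete sketch of that advice, combining -m with the -T safety check (the demo file and archive name are placeholders created only for illustration; -j stores the bare file name):

```shell
#!/bin/sh
# Sketch: zip verifies the archive (-T) before -m deletes the input.
demo=$(mktemp -d)
echo data > "$demo/file.1"
zip -jTm "$demo/backup.zip" "$demo/file.1"
```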
