Bash to determine file size - image

Still learning bash but I had some questions in regards to my script.
My goal with the script is to access a folder with jpg images and if an image is 34.9kb it will return file not present. 34.9kb is the size of the image that shows "image not present".
#!/bin/bash
#Location
DIR="/mnt/windows/images"
file=file.jpg
badfile=12345
actualsize=$(du -b "$file" | cut -f 1)
if [ $actualsize -ge $badfile ]; then
echo $file does not exist >> results.txt
else
echo $file exists >> results.txt
fi
I need it to print each result as a line in a text file named results. I did some research, and people suggested either du -b or stat -c '%s', but I could not see the pros and cons of one over the other. Should the write to the file come after the if/else, or stay inside the if since I'm printing for each file? I need to print the name and the result on the same line. What would be the best way to echo the file?

stat -c '%s' will give you the file size and nothing else, while du -b will include the file name in the output, so you'll have to use for instance cut or awk to get just the file size. For your requirements I'd go with stat.
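To make the difference concrete, here is a small sketch comparing the usual ways to get a size in bytes. It assumes GNU coreutils (stat -c and du -b do not exist in this form on BSD/macOS), and the helper names size_stat, size_du and size_wc are made up for illustration.

```bash
#!/bin/bash
# Three ways to get a file's size in bytes (GNU coreutils assumed).

size_stat() { stat -c '%s' -- "$1"; }        # prints only the size
size_du()   { du -b -- "$1" | cut -f 1; }    # du prints "size<TAB>name", so keep field 1
size_wc()   { wc -c < "$1" | tr -d ' '; }    # POSIX fallback; tr strips padding some wc versions add

# Demo on a throwaway file that is exactly 5 bytes long.
tmp=$(mktemp)
printf 'hello' > "$tmp"
echo "stat: $(size_stat "$tmp")  du: $(size_du "$tmp")  wc: $(size_wc "$tmp")"
rm -f "$tmp"
```

All three should agree for a regular file; stat simply needs the least post-processing.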

Based on your question and your comments on your following question I'm assuming what you want to do is:
Iterate through all the *.jpg files in a specific directory
Run different commands depending on the size of the image
Specifically, you want to print "[filename] does not exist" if the file is of size 40318 bytes.
If my assumptions are close, then this should get you started:
# Location
DIR="/home/lsc"
# Size to match
BADSIZE=40318
find "$DIR" -maxdepth 1 -name "*.jpg" | while IFS= read -r filename; do
    FILESIZE=$(stat -c "%s" "$filename") # get file size
    if [ "$FILESIZE" -eq "$BADSIZE" ]; then
        echo "$filename has a size that matches BADSIZE"
    else
        echo "$filename is fine"
    fi
done
Note that I've used "find ... | while read filename" instead of "for filename in $(ls *.jpg)" because splitting the output of a command on whitespace breaks paths that contain spaces. A plain glob such as for filename in "$DIR"/*.jpg is also safe, provided "$filename" is quoted wherever it is used.
Also note that $filename will contain the full path to the file (e.g. /mnt/windows/images/pic.jpg). If you only want to print the file name without the path, you can use either:
echo "${filename##*/}"
or:
basename "$filename"
The first uses Bash string manipulation, which is more efficient but less readable; the second calls the external basename command.
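Since the question asks for the name and the result on one line of results.txt, here is one way the pieces could fit together. This is only a sketch: check_images is a hypothetical helper name, GNU stat is assumed, and a quoted glob is used for the loop because it also copes with spaces in names.

```bash
#!/bin/bash
# check_images DIR BADSIZE OUTFILE  (hypothetical helper name)
# Appends one line per *.jpg in DIR to OUTFILE: the bare file name plus a verdict.
check_images() {
    local dir="$1" badsize="$2" out="$3" path size
    for path in "$dir"/*.jpg; do
        [ -e "$path" ] || continue           # the glob matched nothing
        size=$(stat -c '%s' -- "$path")      # GNU stat: size in bytes
        if [ "$size" -eq "$badsize" ]; then
            echo "${path##*/} does not exist" >> "$out"
        else
            echo "${path##*/} exists" >> "$out"
        fi
    done
}

# Example call with the values used above:
# check_images /mnt/windows/images 40318 results.txt
```

The redirect sits inside each branch of the if, so every file produces exactly one line; redirecting the whole loop once (done >> "$out") would work just as well.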

Related

Add time stamp to file which has at least one row in UNIX

I have list of files at a location ${POWERCENTER_FILE_DIR} .
The files consist of row header and values.
MART_Apple.csv
MART_SAMSUNG.csv
MART_SONY.csv
MART_BlackBerry.csv
Requirements:
Select only those files which have at least 1 row.
Add a time stamp to the names of the files which have at least 1 row.
For example:
If all the files except MART_BlackBerry.csv have at least one row, then my output file names should be
MART_Apple_20170811112807.csv
MART_SAMSUNG_20170811112807.csv
MART_SONY_20170811112807.csv
Code tried so far
#!/bin/ksh
infilename=${POWERCENTER_FILE_DIR}MART*.csv
echo File name is ${infilename}
if [ wc -l "$infilename"="0" ];
then
RV=-1
echo "input file name cannot be blank or *"
exit $RV
fi
current_timestamp=`date +%Y%m%d%H%M%S`
filename=`echo $infilename | cut -d"." -f1 `
sftpfilename=`echo $filename`_${current_timestamp}.csv
cp -p ${POWERCENTER_FILE_DIR}$infilename ${POWERCENTER_FILE_DIR}$sftpfilename
RV=$?
if [[ $RV -ne 0 ]];then
echo Adding timestamp to ${POWERCENTER_FILE_DIR}$infilename failed ... Quitting
echo Return Code $RV
exit $RV
fi
Encountering errors like:
line 3: [: -l: binary operator expected
cp: target `MART_Apple_20170811121023.csv' is not a directory
failed ... Quitting
Return Code 1
To be frank, I am not able to understand the errors, nor am I sure I am doing it right. I am a beginner in Unix scripting. Can any experts guide me to the correct way?
Here's an example using just find, sh, mv, basename, and date:
find "${POWERCENTER_FILE_DIR}" -maxdepth 1 -name 'MART*.csv' ! -empty -execdir sh -c 'mv "$1" "$(basename -s .csv "$1")_$(date +%Y%m%d%H%M%S).csv"' sh {} \;
I recommend reading Unix Power Tools for more ideas.
When it comes to shell scripting there is rarely a single/one/correct way to accomplish the desired task.
Often times you may need to trade off between readability vs maintainability vs performance vs adhering-to-some-local-coding-standard vs shell-environment-availability (and I'm sure there are a few more trade offs). So, fwiw ...
From your requirement that you're only interested in files with at least 1 row, I read this to also mean that you're only interested in files with size > 0.
One simple ksh script:
#!/bin/ksh
# define list of files
filelist=${POWERCENTER_FILE_DIR}/MART*.csv
# grab current datetime stamp
dtstamp=`date +%Y%m%d%H%M%S`
# for each file in our list ...
for file in ${filelist}
do
# each file should have ${POWERCENTER_FILE_DIR} as a prefix;
# uncomment 'echo' line for debugging purposes to display
# the contents of the ${file} variable:
#echo "file=${file}"
# '-s <file>' => file exists and size is greater than 0
# '! -s <file>' => file doesn't exist or size is equal to 0, eg, file is empty in our case
#
# if the file is empty, skip/continue to next file in loop
if [ ! -s "${file}" ]
then
continue
fi
# otherwise strip off the '.csv'
filebase=${file%.csv}
# copy our current file to a new file containing the datetime stamp;
# keep in mind that each ${file} already contains the contents of the
# ${POWERCENTER_FILE_DIR} variable as a prefix; uncomment 'echo' line
# for debug purposes to see what the cp command looks like:
#echo "cp command=cp ${file} ${filebase}_${dtstamp}.csv"
cp "${file}" "${filebase}_${dtstamp}.csv"
done
A few good resources for learning ksh:
O'Reilly: Learning the Korn Shell
O'Reilly: Learning the Korn Shell, 2nd Edition (includes the newer ksh93)
at your UNIX/Linux command line: man ksh
A simplified script would be something like
#!/bin/bash
# Note I'm using bash above, can't guarantee (but I hope) it would work in ksh too.
for file in "${POWERCENTER_FILE_DIR}"/MART*.csv # Check Ref [1]
do
    if [ "$(wc -l < "$file")" -ne 0 ] # checking at least one row? Check Ref [2]
    then
        mv "$file" "${file%.csv}_$(date +'%Y%m%d%H%M%S').csv" # Check Ref [3]
    fi
done
References
File Globbing [1]
Command Substitution [2]
Parameter Substitution [3]
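One thing the answers above gloss over: the question says every file has a header row, so a file holding only the header is non-empty but still has no data. If "at least 1 row" means at least one data row, counting lines is the safer test. The sketch below assumes that reading, assumes every line ends in a newline, and uses a made-up helper name (stamp_csvs); it is written for bash, although apart from local the same constructs should work in ksh93.

```bash
#!/bin/bash
# stamp_csvs DIR  (hypothetical helper name)
# Copies each MART*.csv in DIR that has at least one data row (more than
# just the header line) to NAME_<timestamp>.csv.
stamp_csvs() {
    local dir="$1" stamp file lines
    stamp=$(date +%Y%m%d%H%M%S)              # one timestamp for the whole run
    for file in "$dir"/MART*.csv; do
        [ -f "$file" ] || continue           # the glob matched nothing
        lines=$(wc -l < "$file")             # reading stdin keeps the file name out of the output
        if [ "$lines" -gt 1 ]; then          # line 1 is assumed to be the header
            cp -p -- "$file" "${file%.csv}_${stamp}.csv"
        fi
    done
}

# Example: stamp_csvs "$POWERCENTER_FILE_DIR"
```

Taking the timestamp once before the loop gives every file from the same run the same suffix, which matches the example names in the question.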

Select modified files using AWK

I am working on a task for which AWK is the designated tool.
The task is to list files that are:
modified today (same day the script is run)
of size 1 MB or less (size <= 1048576 bytes)
User's input is to instruct where to start search.
Search for files recursively.
Script:
#!/bin/bash
#User's input target for search of files.
target="$1"
#Absolute path of target.
ap="$(realpath $target)"
echo "Start search in: $ap/*"
#Today's date (yyyy-mm-dd).
today="$(date '+%x')"
#File(s) modified today.
filemod="$(find $target -newermt $today)"
#Loop through files modified today.
for fm in $filemod
do
#Print name and size of file if no larger than 1 MiB.
ls -l $fm | awk '{if($5<=1048576) print $5"\t"$9}'
done
My problem is that the for-loop does not mind the size of files!
Every variable gets its intended value. AWK does what it should outside a for-loop. I've experimented with quotation marks to no avail.
Can anyone tell what's wrong?
I appreciate any feedback, thanks.
Update:
I've solved it by searching explicitly for files:
filemod="$(find $target -type f -newermt $today)"
How come that matters?
Don't parse 'ls' output. Use stat instead, like this:
for fm in $filemod; do
    size=$(stat --printf='%s\n' "$fm")
    if (( size <= 1048576 )); then
        printf "%s\t%s\n" "$size" "$fm"
    fi
done
The loop above still breaks on file names that contain white space or wildcard characters, because $filemod is split into words and glob-expanded. To handle such files gracefully, do this:
while IFS= read -r -d '' file; do
    size=$(stat --printf='%s\n' "$file")
    if (( size <= 1048576 )); then
        printf "%s\t%s\n" "$size" "$file"
    fi
done < <(find "$target" -type f -newermt "$today" -print0)
See also:
How to loop through file names returned by find?
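On the "-type f" question: without it, find also returns directories, and ls -l on a directory lists the entries inside it rather than the directory itself, so the loop ends up looking at files that find never selected. Since AWK is the designated tool, the whole task can also be done without ls or a shell loop. The sketch below assumes GNU find (-newermt, -printf) and uses a made-up function name; it will still misbehave on paths that contain tabs or newlines.

```bash
#!/bin/bash
# list_small_today DIR  (hypothetical helper name)
# Prints "size<TAB>path" for regular files under DIR that were modified
# since midnight today and are no larger than 1 MiB.
list_small_today() {
    local target="$1" today
    today=$(date +%F)                        # yyyy-mm-dd, independent of the locale
    find "$target" -type f -newermt "$today" -printf '%s\t%p\n' |
        awk -F '\t' '$1 <= 1048576'          # keep lines whose size field is <= 1 MiB
}

# Example: list_small_today "$1"
```

Using date +%F instead of +%x avoids locale-dependent date formats that find may not parse the way you expect.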

Comparing files in the same directory with same name different extension [duplicate]

This question already has answers here:
Looping over pairs of values in bash [duplicate]
(6 answers)
Closed 6 years ago.
I have a bash script that looks through a directory and creates a .pdf from a .ppt, but I want to be able to check whether there is already a .pdf for the .ppt, because if there is I don't want to create one, and if the .pdf is timestamped older than the .ppt I want to update it. I know that for the timestamp I can use (date -r bar +%s), but I can't figure out how to compare files with the same name when they are in the same folder.
This is what I have:
#!/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f *pdf
else
#reads the files that are PPT in the directory and copies them and changes the extension to .pdf
ls *.ppt|while read FILE
do
NEWFILE=$(echo $FILE|cut -d"." -f1)
echo $FILE": " $FILE " "$NEWFILE: " " $NEWFILE.pdf
cp $FILE $NEWFILE.pdf
done
fi
EDITS:
#!/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f *pdf lectures.tar.gz
else
#reads the files that are in the directory and copies them and changes the extension to .pdf
for f in *.ppt
do
[ "$f" -nt "${f%ppt}pdf" ] &&
nf="${f%.*}"
echo $f": " $f " "$nf: " " $nf.pdf
cp $f $nf.pdf
done
To loop through all ppt files in the current directory and test to see if they are newer than the corresponding pdf and then do_something if they are:
for f in *.ppt
do
[ "$f" -nt "${f%ppt}pdf" ] && do_something
done
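-nt is the bash test for one file being newer than another. In bash it is also true when the first file exists and the second does not, so a .ppt that has no .pdf yet is picked up as well.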
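A quick way to convince yourself of how the comparison behaves is to try it on files with known timestamps. In the sketch below, needs_update is a hypothetical helper and touch -d is GNU-specific. Bash's -nt already treats a missing second file as older; the explicit -e check only makes the helper independent of which shell runs it.

```bash
#!/bin/bash
# needs_update SRC DST  (hypothetical helper name)
# Succeeds when DST is missing or older than SRC, i.e. when DST should be (re)built.
needs_update() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Demo in a scratch directory.
dir=$(mktemp -d)
touch -d '2020-01-01' "$dir/old.pdf"
touch -d '2021-01-01' "$dir/new.ppt"
needs_update "$dir/new.ppt" "$dir/old.pdf"     && echo "pdf is older than ppt: rebuild"
needs_update "$dir/new.ppt" "$dir/missing.pdf" && echo "pdf is missing: build"
needs_update "$dir/old.pdf" "$dir/new.ppt"     || echo "old.pdf is not newer than new.ppt"
rm -rf "$dir"
```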
Notes:
Do not parse ls. The output from ls often contains a "displayable" form of the filename, not the actual filename.
The construct for f in *.ppt works reliably with all file names, even ones with tabs or newlines in their names.
Avoid using all caps for shell variables. The system uses all caps for its variables and you do not want to accidentally overwrite one. Thus, use lower case or mixed case.
The shell has built-in capabilities for suffix removal. So, for example, newfile=$(echo $file |cut -d"." -f1) can be replaced with the much more efficient and more reliable form newfile="${file%%.*}". This is particularly important in the odd case that the file's name ends with a newline: command substitution removes all trailing newlines but the bash variable expansions don't.
Further, note that cut -d"." -f1 removes everything after the first period. If a file name has more than one period, this is likely not what you want. The form, ${file%.*}, with just one %, removes everything after the last period in the name. This is more likely what you want when you are trying to remove standard extensions like ppt.
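The difference between the one-% and two-% forms is easiest to see on a name with several periods. The file name below is made up for the demonstration.

```bash
#!/bin/bash
file='lecture.v2.final.ppt'

echo "${file%.*}"             # shortest ".*" removed from the end  -> lecture.v2.final
echo "${file%%.*}"            # longest ".*" removed from the end   -> lecture
echo "${file%ppt}pdf"         # swap the extension                  -> lecture.v2.final.pdf
echo "$file" | cut -d. -f1    # keeps only the first field          -> lecture
```

Only the first and third forms keep the "v2.final" part, which is usually what you want when replacing an extension.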
Putting it all together
#!/usr/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f ./*pdf lectures.tar.gz
else
#reads the files that are in the directory and copies them and changes the extension to .pdf
for f in ./*.ppt
do
if [ "$f" -nt "${f%ppt}pdf" ]; then
nf="${f%.*}"
echo "$f: $f $nf: $nf.pdf"
cp "$f" "$nf.pdf"
fi
done
fi

Finding files in list using bash array loop

I'm trying to write a script that reads a file with filenames, and outputs whether or not those files were found in a directory.
Logically I'm thinking it goes like this:
$filelist = prompt for file with filenames
$directory = prompt for directory path where find command is performed
new Array[] = read $filelist line by line
for i, i > numberoflines, i++
if find Array[i] in $directory is false
echo "$i not found"
export to result.txt
I've been having a hard time getting Bash to do this, any ideas?
First, I would just assume that all the file names are supplied on standard input. E.g., if the file names.txt contains the file names and script.sh is the script, you can invoke it like
cat names.txt | ./script.sh
to obtain the desired behaviour (i.e., using the file-names from names.txt).
Second, inside script.sh you can loop as follows over all lines of the standard input
while IFS= read -r line
do
... # do your checks on $line here
done
Edit: I adapted my answer to use standard input instead of command line arguments, due to the problem indicated by @rici.
while IFS= read -r dirname
do
    echo "$dirname" >> result.txt
    while IFS= read -r filename
    do
        find "$dirname" -type f -name "$filename" >> result.txt
    done <filenames.txt
done <dirnames.txt
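The loops above only print the files that were found, while the question also asks for a "not found" line. Here is a sketch of one way to report both; report_files is a hypothetical name, -print -quit assumes GNU find, and note that -name treats each line as a glob pattern, so names containing *, ? or [ would need escaping.

```bash
#!/bin/bash
# report_files LISTFILE DIR  (hypothetical helper name)
# Reads one file name per line from LISTFILE and prints, for each name,
# whether a regular file with that name exists anywhere under DIR.
report_files() {
    local list="$1" dir="$2" name
    while IFS= read -r name; do
        [ -n "$name" ] || continue                       # skip blank lines
        # -print -quit stops find at the first match
        if [ -n "$(find "$dir" -type f -name "$name" -print -quit)" ]; then
            printf '%s found\n' "$name"
        else
            printf '%s not found\n' "$name"
        fi
    done < "$list"
}

# Example: report_files names.txt /some/dir > result.txt
```

Running find once per name is fine for short lists; for thousands of names it would be cheaper to list the directory once and compare the two lists.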

bash scripting and conditional statements

I am trying to run a simple bash script, but I am struggling with how to incorporate a condition. Any pointers? I would like to add a condition such that when gdalinfo cannot open an image, that particular file is copied to another location. The loop
for file in `cat path.txt`; do gdalinfo $file;done
works fine in opening the images and also shows which ones cannot be opened.
the wrong code is
for file in `cat path.txt`; do gdalinfo $file && echo $file; else cp $file /data/temp
Again, and again and again - for the zillionth time...
Don't use constructions like
for file in `cat path.txt`
or
for file in `find .....`
for file in `any command that produces filenames`
Because the code will BREAK immediately when a filename or path contains a space. Never use it with any command that produces filenames. Bad practice. Very bad. It is incorrect, mistaken, erroneous, inaccurate, inexact, imprecise, faulty, WRONG.
The correct form is:
for file in some/* #if want/can use filenames directly from the filesystem
or
find . -print0 | while IFS= read -r -d '' file
or (if you are sure that no filename contains a newline) you can use
cat path.txt | while read -r file
but here the cat is useless (really, a command that only copies a file to STDOUT is useless). You should use instead
while read -r file
do
#whatever
done < path.txt
It is faster (it doesn't fork a new process, as every pipe does).
The while loops above will put the correct filename into the variable file even when the filename contains a space. The for will not. Period. Uff. Omg.
And use "$variable_with_filename" instead of a bare $variable_with_filename for the same reason. If the filename contains white space, any command will misunderstand it as two filenames. This is probably not what you want either.
So, enclose any shell variable that contains a filename in double quotes (not only filenames, but anything that can contain a space). "$variable" is correct.
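If you want to see the breakage for yourself, the following throwaway script counts how many "filenames" each kind of loop sees for the same two-line list. The file names in it are made up.

```bash
#!/bin/bash
# A list file with two names, one of which contains a space.
list=$(mktemp)
printf '%s\n' 'my photo.tif' 'plain.tif' > "$list"

# Unsafe: the command substitution is split on whitespace.
unsafe_count=0
for f in $(cat "$list"); do
    unsafe_count=$((unsafe_count + 1))
done

# Safe: read takes one whole line at a time.
safe_count=0
while IFS= read -r f; do
    safe_count=$((safe_count + 1))
done < "$list"

echo "for/cat saw $unsafe_count names, while/read saw $safe_count"
rm -f "$list"
```

The for loop reports three names ("my", "photo.tif" and "plain.tif"), the while loop the correct two.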
If I understand right, you want to copy files to /data/temp when gdalinfo returns an error.
while read -r file
do
gdalinfo "$file" || cp "$file" /data/temp
done < path.txt
Nice, short and safe (at least if your path.txt really contains one filename per line).
And maybe you want to use your script more than once; in that case don't put the filename inside, but save the script in the form
while read -r file
do
gdalinfo "$file" || cp "$file" /data/temp
done
and use it like:
mygdalinfo < path.txt
more universal...
And maybe you only want to show the filenames for which gdalinfo returns an error:
while read -r file
do
gdalinfo "$file" || printf '%s\n' "$file"
done
and if you change printf '%s\n' "$file" to printf '%s\0' "$file" you can use the script in a pipe safely, so:
while read -r file
do
gdalinfo "$file" || printf '%s\0' "$file"
done
and use it for example as (the -J option is specific to BSD xargs):
mygdalinfo < path.txt | xargs -0 -J% mv % /tmp/somewhere
Howgh.
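On systems with GNU xargs, which has no -J, the same pipe can be built with -I instead. The sketch below wraps the whole thing in a function so that the checking command is a parameter; move_failures is a made-up name, and unlike the loops above it discards the checker's own output so that only the NUL-separated names reach xargs.

```bash
#!/bin/bash
# move_failures CHECKER DEST  (hypothetical helper name)
# Reads file names (one per line) on stdin, runs CHECKER on each, and moves
# every file for which CHECKER fails into the directory DEST.
move_failures() {
    local checker="$1" dest="$2" file
    while IFS= read -r file; do
        # stdin is redirected so the checker cannot swallow the name list
        "$checker" "$file" < /dev/null > /dev/null 2>&1 || printf '%s\0' "$file"
    done | xargs -0 -I{} mv -- {} "$dest"
}

# Example: move_failures gdalinfo /tmp/somewhere < path.txt
```

With -I, xargs runs one mv per file name, which is slower than batching but keeps the argument order simple.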
You can say:
for file in `cat path.txt`; do gdalinfo $file || cp $file /data/temp; done
This would copy the file to /data/temp if gdalinfo cannot open the image.
If you want to print the filename in addition to copying it in case of failure, say:
for file in `cat path.txt`; do gdalinfo $file || (echo $file && cp $file /data/temp); done

Resources