Unix to verify file has no content and empty lines - bash

How can I verify that a file has absolutely no content? [ -s $file ] tells me whether the file has a size greater than zero, but how do I know that a file is absolutely empty, with no data at all, including empty lines?
$ cat sample.text
$ ls -lrt sample.text
-rw-r--r-- 1 testuser userstest 1 Jul 31 16:38 sample.text
When I "vi" the file, the bottom shows this: "sample.text" 1L, 1C

Your file might contain only a newline character.
Try this check:
[[ $(tr -d "\r\n" < file | wc -c) -eq 0 ]] && echo "File has no content"
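For example, a file holding only a newline is 1 byte (so -s sees content), but the tr check reports it as empty:
$ printf '\n' > sample.text
$ wc -c < sample.text
1
$ tr -d "\r\n" < sample.text | wc -c
0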

A file of size 0 by definition has nothing in it, so you are good to go. However, you probably want to use:
if [ \! -s "$f" ]; then echo "0 Sized and completely empty"; fi
Have fun!

Blank lines add data to the file and will therefore increase the file size, which means that just checking whether the file is 0 bytes is sufficient.
For a single file, use the -s test with the built-ins test, [, or [[ ([[ makes dealing with ! less horrible, but is bash-specific):
fn="file"
if [[ -f "$fn" && ! -s "$fn" ]]; then # -f is needed since -s will return false on missing files as well
echo "File '$fn' is empty"
fi
A (more) POSIX-shell-compatible way (the escaping of exclamation marks can be shell-dependent):
fn="file"
if test -f "$fn" && test \! -s "$fn"; then
echo "File '$fn' is empty"
fi
For multiple files, find is a better method.
For a single file you can do: (It will print the filename if empty)
find "$PWD" -maxdepth 1 -type f -name 'file' -size 0 -print
For multiple files, matching the glob glob*: (it will print the filenames if empty)
find "$PWD" -maxdepth 1 -type f -name 'glob*' -size 0 -print
To allow subdirectories:
find "$PWD" -type f -name 'glob*' -size 0 -print
Some find implementations do not require a directory as the first parameter (some, like the Solaris one, do). On most implementations the -print parameter can be omitted; when it is not specified, find defaults to printing matching files.
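For example, with one truly empty file and one file holding a single newline:
$ touch empty.log
$ printf '\n' > blank.log
$ find . -maxdepth 1 -type f -size 0 -print
./empty.log
blank.log is 1 byte, so -size 0 correctly excludes it.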

Related

Bash Script to Prepend a Single Random Character to All Files In a Folder

I have an audio sample library with thousands of files. I would like to shuffle/randomize the order of these files. Can someone provide me with a bash script/one-liner that would prepend a single random character to all files in a folder (including files in sub-folders)? I do not want to prepend a random character to any of the folder names, though.
Example:
Kickdrum73.wav
Kickdrum SUB.wav
Kick808.mp3
Renamed to:
f_Kickdrum73.wav
!_Kickdrum SUB.wav
4_Kick808.mp3
If possible, I would like to be able to run this script more than once, but on subsequent runs, it just changes the randomly prepended character instead of prepending a new one.
Some of my attempts:
find ~/Desktop/test -type f -print0 | xargs -0 -n1 bash -c 'mv "$0" "a${0}"'
find ~/Desktop/test/ -type f -exec mv -v {} $(cat a {}) \;
find ~/Desktop/test/ -type f -exec echo -e "Z\n$(cat !)" > !Hat 15.wav
for file in *; do
mv -v "$file" $RANDOM_"$file"
done
Note: I am running on macOS.
Latest attempt using code from mr. fixit:
find . -type f -maxdepth 999 -not -name ".*" |
cut -c 3- - |
while read F; do
randomCharacter="${F:2:1}"
if [ $randomCharacter == '_' ]; then
new="${F:1}"
else
new="_$F"
fi
fileName="`basename $new`"
newFilename="`jot -r -c $fileName 1 A Z`"
filePath="`dirname $new`"
newFilePath="$filePath$newFilename"
mv -v "$F" "$newFilePath"
done
Here's my first answer, enhanced to do sub-directories.
Put the following in file randomize
if [[ $# != 1 || ! -d "$1" ]]; then
echo "usage: $0 <path>"
else
find "$1" -type f -not -name ".*" |
while read F; do
FDIR=`dirname "$F"`
FNAME=`basename "$F"`
char2="${FNAME:1:1}"
if [ "$char2" == '_' ]; then
new="${FNAME:1}"
else
new="_$FNAME"
fi
new=`jot -r -w "%c$new" 1 A Z`
echo mv "$F" "${FDIR}/${new}"
done
fi
Set the permissions with chmod a+x randomize.
Then call it with randomize your/path.
It'll echo the commands required to rename everything, so you can examine them to ensure they'll work for you. If they look right, you can remove the echo from the 3rd to last line and rerun the script.
cd ~/Desktop/test, then
find . -type f -maxdepth 1 -not -name ".*" |
cut -c 3- - |
while read F; do
char2="${F:2:1}"
if [ $char2 == '_' ]; then
new="${F:1}"
else
new="_$F"
fi
new=`jot -r -w "%c$new" 1 A Z`
mv "$F" "$new"
done
find . -type f -maxdepth 1 -not -name ".*" will get all the files in the current directory, but not the hidden files (names starting with '.')
cut -c 3- - will strip the first 2 chars from the name. find outputs paths, and the ./ gets in the way of processing prefixes.
while read VAR; do <stuff>; done is a way to deal with one line at a time
char2="${VAR:2:1} sets a variable char2 to the 2nd character of the variable VAR.
if - then - else sets new to the filename, either preceded by _ or with the previous random character stripped off.
jot -r -w "%c$new" 1 A Z tacks 1 random character from A-Z onto the beginning of new
mv old new renames the file
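You can sanity-check the jot invocation on its own (macOS; the letter varies per run):
$ jot -r -w "%c_Kickdrum73.wav" 1 A Z
Q_Kickdrum73.wav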
You can also do it all in bash, and there are several ways to approach it. The first is simply to create an array of letters containing whatever letters you want to use as a prefix, then generate a random number to choose an element of the array, e.g.
#!/bin/bash
letters=({0..9} {A..Z} {a..z}) ## array with [0-9] [A-Z] [a-z]
for i in *; do
num=$((RANDOM % ${#letters[@]})) ## generate index (62 letters, so 0-61)
## remove echo to actually move file
echo "mv \"$i\" \"${letters[num]}_$i\"" ## move file
done
Example Use/Output
Currently the script only outputs the changes it would make; you must remove the echo "..." surrounding the mv command and fix the escaped quotes to actually have it apply the changes:
$ bash ../randprefix.sh
mv "Kick808.mp3" "4_Kick808.mp3"
mv "Kickdrum SUB.wav" "h_Kickdrum SUB.wav"
mv "Kickdrum73.wav" "l_Kickdrum73.wav"
You can also do it by generating a random number for an ASCII code from 48 (character '0') through 126 (character '~'), excluding the backtick, and then converting the number to its ASCII character and prefixing the filename with it, e.g.
#!/bin/bash
for i in *; do
num=$(( (RANDOM % 79) + 48 )) ## generate number for '0' - '~' (79 codes)
letter=$(printf "\\$(printf '%03o' "$num")") ## letter from number
while [ "$letter" = '`' ]; do ## exclude '`'
num=$(( (RANDOM % 79) + 48 )) ## generate number
letter=$(printf "\\$(printf '%03o' "$num")")
done
## remove echo to actually move file
echo "mv \"$i\" \"${letter}_$i\"" ## move file
done
(similar output, all punctuation other than backtick is possible)
In each case you will want to place the script in your PATH or call it from within the directory whose files you want to rename. (Splitting each path with dirname and basename and joining the pieces back together, so that the script can take the directory to search as an argument, is left to you; see the sketch below.)
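If you do want sub-folder support in the all-bash variants, here is a minimal sketch of that dirname/basename split, reusing the letters array from above (the echo keeps it a dry run; the directory argument defaults to the current directory):
#!/bin/bash
letters=({0..9} {A..Z} {a..z})
find "${1:-.}" -type f -not -name ".*" | while IFS= read -r f; do
dir=$(dirname "$f") ## directory part, left untouched
base=$(basename "$f") ## filename part, receives the prefix
num=$((RANDOM % ${#letters[@]}))
echo mv "$f" "$dir/${letters[num]}_$base" ## remove echo to apply
done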

Find single line files and move them to a subfolder

I am using the following bash line to find text files in a subfolder with a given a pattern inside it and move them to a subfolder:
find originalFolder/ -maxdepth 1 -type f -exec grep -q 'mySpecificPattern' {} \; -exec mv -i {} destinationFolder/ \;
Now instead of grepping a pattern, I would like to move the files to a subfolder if they consist only of a single line (of text): how can I do that?
You can do it this way:
while IFS= read -r -d '' file; do
[[ $(wc -l < "$file") -eq 1 ]] && echo mv -i "$file" destinationFolder/
done < <(find originalFolder/ -maxdepth 1 -type f -print0)
Note use of echo in front of mv so that you can verify output before actually executing mv. Once you're satisfied with output, remove echo before mv.
Using wc as shown above is the most straightforward way, although it reads the entire file to determine the length. It's also possible to do length checks with awk, and the exit function lets you fit that into a find command.
find . -type f -exec awk 'END { exit (NR==1 ? 0 : 1) } NR==2 { exit 1 }' {} \; -print
The command returns status 0 if there has been only 1 input record at end-of-file, and it also exits immediately with status 1 when line 2 is encountered; this should easily outrun wc if large files are a performance concern.
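For example, given a one-line file and a two-line file, only the former is printed; to actually move the matches, replace -print with -exec mv -i {} destinationFolder/ \; as in the original command:
$ printf 'one\n' > single.txt
$ printf 'one\ntwo\n' > double.txt
$ find . -maxdepth 1 -type f -exec awk 'END { exit (NR==1 ? 0 : 1) } NR==2 { exit 1 }' {} \; -print
./single.txt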

find emitting unexpected ".", making wc -l list more contents than expected

I'm trying to use find's -newer test as follows:
touch $HOME/mark.start -d "$d1"
touch $HOME/mark.end -d "$d2"
SF=$HOME/mark.start
EF=$HOME/mark.end
find . -newer $SF ! -newer $EF
But this gives me an output like this:
.
./File5
and counts it as 2 files, however that directory only has 1 file i.e., File5. Why is this happening and how to solve it?
UPDATE:
I'm actually trying to run the following script:
#!/bin/bash
check_dir () {
d1=$2
d2=$((d1+1))
f1=`mktemp`
f2=`mktemp`
touch -d $d1 $f1
touch -d $d2 $f2
n=$(find $1 \( -name "*$d1*" \) -o \( -newer $f1 ! -newer $f2 \) | wc -l)
if [ $n != $3 ]; then echo $1 "=" $n ; fi
rm -f $f1 $f2
}
It checks whether the directory has files that either have a particular date in the format YYYYMMDD in their name, or were last modified within the last 1 day.
check_dir ./dir1 20151215 4
check_dir ./dir2 20151215 3
where in dir1 there should be 4 such files and if it is not true then it will print the actual number of files that is there.
So when the directory only has files with dates in their names, the check works fine, but when it checks with -newer, it always counts 1 extra file (which is not even there in the directory). Why is this happening?
The question asks why there's an extra . in the results from find, even when no file or directory by that name seems to exist. The answer is simple: . (the directory itself) always exists, even though it's hidden by default. Use ls -a to show hidden entries, and you'll see that it's present.
Your existing find command doesn't exempt the target directory itself -- . -- from being a legitimate result, which is why you're getting more results than you expect.
Add the following filter:
-mindepth 1 # only include content **under** the file or directory specified
...or, if you only want to count files, use...
-type f # only include regular files
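Applied to the command from the question, using both filters, for example:
find . -mindepth 1 -type f -newer "$SF" ! -newer "$EF"
so that . itself no longer appears in (or gets counted among) the results.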
Assuming GNU find, by the way, this all can be made far more efficient:
check_dir() {
local d1 d2 # otherwise these variables leak into global scope
d1=$2
d2=$(gdate -d "+ 1 day $d1" '+%Y%m%d') # assuming GNU date is installed as gdate
n=$(find "$1" -mindepth 1 \
-name "*${d1}*" -o \
'(' -newermt "$d1" '!' -newermt "$d2" ')' \
-printf '\n' | wc -l)
if (( n != $3 )); then
echo "$1 = $n"
fi
}

find executables in my PATH with a particular string

Is there a way to quickly know whether an executable in my $PATH contains a particular string? For instance, I want to quickly list the executables that contain SRA.
The reason I'm asking is that I have several scripts with the characters SRA in them. The problem is that I always forget the starting character of the file (if I do remember, I use tab completion to find it).
You can store all the paths in an array and then use find with its various options:
IFS=":" read -ra paths <<< "$PATH"
find "${paths[#]}" -type f -executable -name '*SRA*'
IFS=":" read -ra paths <<< "$PATH" reads all the paths into an array, setting the field separator temporarily to :, as seen in Setting IFS for a single statement.
-type f looks for files.
-executable looks for files that are executable. You may use -perm +111 instead on OSX.
Since the -executable option is not available in FreeBSD or OSX, ghoti nicely recommends using the -perm option:
find . -perm +u=x,g=x,o=x
(The + prefix means "any of these bits set"; GNU find spells this -perm /u=x,g=x,o=x.)
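Back to the array approach, a quick way to see what read -ra produced (PATH value shortened here for illustration):
$ IFS=":" read -ra paths <<< "/usr/bin:/usr/local/bin"
$ printf '%s\n' "${paths[@]}"
/usr/bin
/usr/local/bin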
For example:
find ${PATH//:/ } -maxdepth 1 -executable -name '*SRA*'
And if you happen to have spaces (or other hazardous characters) in $PATH (the <<< trick is borrowed from @fedorqui's answer):
tr ":\n" "\\000" <<< "$PATH" | \
xargs -0r -I{} -n1 find {} -maxdepth 1 -executable -name '*SRA*'
It also handles the empty $PATH correctly.
A bit clumsy:
find $(echo $PATH | tr : ' ') -name \*SRA\*
I wrote a bash script that wraps this up for OSX, based on the great array answer above.
I think it will work for other operating systems as well. Note that it also ignores errors, sorts the results, and only shows unique values!
executables_in_path_matching_substring.sh
#!/usr/bin/env bash
function show_help()
{
ME=$(basename "$0")
IT=$(cat <<EOF
returns a list of files in the path that match a substring
usage: $ME SUBSTRING
e.g.
# Find all files in the path that have "git" in their name
$ME git
EOF
)
echo "$IT"
echo
exit
}
if [ -z "$1" ]
then
show_help
fi
if [ "$1" == "help" ] || [ "$1" == '?' ] || [ "$1" == "--help" ] || [ "$1" == "h" ]; then
show_help
fi
SUBSTRING="$1"
IFS=":" read -ra paths <<< "$PATH"
find "${paths[#]}" -type f -perm +111 -name "*$SUBSTRING*" 2>/dev/null | sort | uniq

Bash Script: unary operator expected error?

#!/usr/bin/env bash
FILETYPES=( "*.html" "*.css" "*.js" "*.xml" "*.json" )
DIRECTORIES=`pwd`
MIN_SIZE=1024
for currentdir in $DIRECTORIES
do
for i in "${FILETYPES[#]}"
do
find $currentdir -iname "$i" -exec bash -c 'PLAINFILE={};GZIPPEDFILE={}.gz; \
if [ -e $GZIPPEDFILE ]; \
then if [ `stat --printf=%Y $PLAINFILE` -gt `stat --printf=%Y $GZIPPEDFILE` ]; \
then gzip -k -4 -f -c $PLAINFILE > $GZIPPEDFILE; \
fi; \
elif [ `stat --printf=%s $PLAINFILE` -gt $MIN_SIZE ]; \
then gzip -k -4 -c $PLAINFILE > $GZIPPEDFILE; \
fi' \;
done
done
This script compresses all web static files using gzip. When I try to run it, I get this error bash: line 5: [: 93107: unary operator expected. What is going wrong in this script?
You need to export the MIN_SIZE variable. The bash that find spawns doesn't have a value for it, so the script runs (as I mentioned in my comment on @ooga's answer) [ $result_from_stat -gt ], which is an error, and (when the result is 93107) gets you [ 93107 -gt ], which (if you run that in your shell) gets you this output:
$ [ 93107 -gt ]
-bash: [: 93107: unary operator expected
This could be simpler:
#!/usr/bin/env bash
FILETYPES=(html css js xml json)
DIRECTORIES=("$PWD")
MIN_SIZE=1024
IFS='|' eval 'FILTER="^.*[.](${FILETYPES[*]})\$"'
for DIR in "${DIRECTORIES[@]}"; do
while IFS= read -ru 4 FILE; do
GZ_FILE=$FILE.gz
if [[ -e $GZ_FILE ]]; then
[[ $GZ_FILE -ot "$FILE" ]] && gzip -k -4 -c "$FILE" > "$GZ_FILE"
elif [[ $(exec stat -c '%s' "$FILE") -ge MIN_SIZE ]]; then
gzip -k -4 -c "$FILE" > "$GZ_FILE"
fi
done 4< <(exec find "$DIR" -mindepth 1 -type f -regextype egrep -iregex "$FILTER")
done
There's no need to call pwd; you can just use $PWD. And probably what you needed was an array variable as well.
Instead of spawning bash for every file found with a static command string, just read find's output from a pipe, or better yet from a named pipe through process substitution.
Instead of comparing stats, you can just use -ot or -nt.
You don't need -f if you're writing the output through redirection (>) as that form of redirection overwrites the target by default.
You can call find just once for all the file types by building a single pattern, which is more efficient. You can check how I made the filter and used -iregex. Doing \( -iname one_ext_pat -o -iname another_ext_pat \) would also work, but it's more cumbersome.
exec is optional to prevent unnecessary use of another process.
Always prefer [[ ]] over [ ].
4< opens input with file descriptor 4 and -u 4 makes read read from that file descriptor, not stdin (0).
What you probably want is -ge MIN_SIZE (greater than or equal), not -gt.
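For reference, you can verify what the eval-built filter expands to before handing it to find:
$ FILETYPES=(html css js xml json)
$ IFS='|' eval 'FILTER="^.*[.](${FILETYPES[*]})\$"'
$ echo "$FILTER"
^.*[.](html|css|js|xml|json)$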
Come to think of it, readarray is a cleaner option if your bash is version 4.0 or newer:
for DIR in "${DIRECTORIES[@]}"; do
readarray -t FILES < <(exec find "$DIR" -mindepth 1 -type f -regextype egrep -iregex "$FILTER")
for FILE in "${FILES[@]}"; do
...
done
done
