Check if multiple files are empty in single if statement - bash

awkOut1="awkOut1.csv"
awkOut2="awkOut2.csv"
if [[ "$(-s $awkOut1)" || "$(-s $awkOut2)" ]]
The above 'if' check in shell script gives me below error:
-bash: -s: command not found
Suggestions anyone?

If you just have 2 files I would do
if [[ -e "$awkOut1" && ! -s "$awkOut1" ]] &&
[[ -e "$awkOut2" && ! -s "$awkOut2" ]]
then
echo both files exist and are empty
fi
Since [[ is a command, you can chain the exit statuses together with && to ensure they are all true. Also, within [[ (but not [), you can use && to chain tests together.
Note that -s is true if the file exists and is not empty, so ! -s is also true for a missing file. I'm explicitly adding the -e tests so that the check only passes for files that exist and are empty.
If you have more than 2:
files=( awkOut1.csv awkOut2.csv ... )
sum=$( stat -c '%s' "${files[@]}" | awk '{sum += $1} END {print sum}' )
if (( sum == 0 )); then
echo all the files are empty
fi
This one does not test for existence of the files.
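If existence matters for more than two files as well, a plain loop covers it without relying on GNU stat. This is only a minimal sketch: the function name all_exist_and_empty is made up for illustration, and it treats anything that is not a regular file as a failure.

```shell
#!/usr/bin/env bash
# all_exist_and_empty FILE...
# Hypothetical helper: returns 0 only if every argument is an existing,
# zero-length regular file. Returns 1 otherwise, or if no arguments are given.
all_exist_and_empty() {
    local f
    (( $# > 0 )) || return 1
    for f in "$@"; do
        # -f: exists and is a regular file; ! -s: size is zero
        [[ -f $f && ! -s $f ]] || return 1
    done
    return 0
}
```

It could then be called as: if all_exist_and_empty "${files[@]}"; then echo "all the files exist and are empty"; fi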

You can use basic Bourne shell syntax and the test command (a single left bracket) to find out if either file is non-empty:
if [ -s "$awkOut1" -o -s "$awkOut2" ]; then
echo "One of the files is non-empty."
fi
When using single brackets, the -o means "or", so this expression is checking to see if awkOut1 or awkOut2 is non-empty.
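POSIX marks the -o and -a operators of test as obsolescent, so the same check can also be written as two separate test commands joined by the shell's || operator. A small sketch, with either_nonempty being a name chosen here purely for illustration:

```shell
#!/bin/sh
# either_nonempty FILE1 FILE2
# Hypothetical helper: succeeds if at least one of the two files
# exists and has a size greater than zero.
either_nonempty() {
    [ -s "$1" ] || [ -s "$2" ]
}
```

For example: if either_nonempty "$awkOut1" "$awkOut2"; then echo "One of the files is non-empty."; fi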
If you have a whole directory full of files and you want to find out if any of them is empty, you could do something like this (again with basic Bourne syntax and standard utilities):
find . -type f -empty | grep -q . && echo "some are empty" || echo "no file is empty"
In this line, find will print any regular files in the current directory (and recursively in any subdirectories) that are empty; the -type f keeps empty directories, which -empty would otherwise also match, out of the result. grep will turn that into an exit status; and then you can take action based on success or failure to find empties. In an if statement, it would look like this:
if find . -type f -empty | grep -q .; then
echo "some are empty"
else
echo "no file is empty"
fi

Here is one for GNU awk with the filefuncs extension. It checks every file given as an argument and exits as soon as it finds one that is empty (or one that it cannot stat):
$ touch foo
$ awk '
@load "filefuncs" # enable
END {
for(i=1;i<ARGC;i++) { # all given files
if(stat(ARGV[i], fdata)<0) { # use stat
printf("could not stat %s: %s\n", # nonexists n exits
ARGV[i], ERRNO) > "/dev/stderr"
exit 1
}
if(fdata["size"]==0) { # file size check
printf("%s is empty\n",
ARGV[i]) > "/dev/stderr"
exit 2
}
}
exit
}' foo
Output:
foo is empty

Related

Unable to execute awk command in a function, but working directly in the shell

I want to create a utility function for bash to remove duplicate lines. I am using function
function remove_empty_lines() {
if ! command -v awk &> /dev/null
then
echo '[x] ERR: "awk" command not found'
return
fi
if [[ -z "$1" ]]
then
echo "usage: remove_empty_lines <file-name> [--replace]"
echo
echo "Arguments:"
echo -e "\t--replace\t (Optional) If not passed, the result will be redirected to stdout"
return
fi
if [[ ! -f "$1" ]]
then
echo "[x] ERR: \"$1\" file not found"
return
fi
echo $0
local CMD="awk '!seen[$0]++' $1"
if [[ "$2" = '--reload' ]]
then
CMD+=" > $1"
fi
echo $CMD
}
If I am running the main awk command directly, it is working. But when i execute the same $CMD in the function, I am getting this error
$ remove_empty_lines app.js
/bin/bash
awk '!seen[/bin/bash]++' app.js
The original code is broken in several ways:
When used with --reload, it would truncate the output file's contents before awk could ever read those contents (see How can I use a file in a command and redirect output to the same file without truncating it?)
It didn't ever actually run the command, and for the reasons described in BashFAQ #50, storing a shell command in a string is inherently buggy (one can work around some of those issues with eval; BashFAQ #48 describes why doing so introduces security bugs).
It wrote error messages (and other "diagnostic content") to stdout instead of stderr; this means that if your function's output was redirected to a file, you could never see its errors -- they'd end up jumbled into the output.
Error cases were handled with a return even in cases where $? would be zero; this means that return itself would return a zero/successful/truthy status, not revealing to the caller that any error had taken place.
Presumably the reason you were storing your output in CMD was to be able to perform a redirection conditionally, but that can be done other ways: Below, we always create a file descriptor out_fd, but point it to either stdout (when called without --reload), or to a temporary file (if called with --reload); if-and-only-if awk succeeds, we then move the temporary file over the output file, thus replacing it as an atomic operation.
remove_empty_lines() {
local out_fd rc=0 tempfile=
command -v awk &>/dev/null || { echo '[x] ERR: "awk" command not found' >&2; return 1; }
if [[ -z "$1" ]]; then
printf '%b\n' >&2 \
'usage: remove_empty_lines <file-name> [--replace]' \
'' \
'Arguments:' \
'\t--replace\t(Optional) If not passed, the result will be redirected to stdout'
return 1
fi
[[ -f "$1" ]] || { echo "[x] ERR: \"$1\" file not found" >&2; return 1; }
if [ "$2" = --reload ]; then
tempfile=$(mktemp -t "$1.XXXXXX") || return
exec {out_fd}>"$tempfile" || { rc=$?; rm -f "$tempfile"; return "$rc"; }
else
exec {out_fd}>&1
fi
awk '!seen[$0]++' <"$1" >&$out_fd || { rc=$?; rm -f "$tempfile"; return "$rc"; }
exec {out_fd}>&- # close our file descriptor
if [[ $tempfile ]]; then
mv -- "$tempfile" "$1" || return
fi
}
First off the output from your function call is not an error but rather the output of two echo commands (echo $0 and echo $CMD).
And as Charles Duffy has pointed out ... at no point is the function actually running the $CMD.
As for the inclusion of /bin/bash in your function's echo output ... the main problem is the reference to $0; by definition $0 is the name of the running process, which in the case of a function is the shell under which the function is being called. Consider the following when run from a bash command prompt:
$ echo $0
-bash
As you can see from your output, this generates /bin/bash in your environment.
On a related note, the reference to $0 within double quotes causes the $0 to be evaluated, so this:
local CMD="awk '!seen[$0]++' $1"
becomes
local CMD="awk '!seen[/bin/bash]++' app.js"
I'm thinking what you want is something like:
echo $1 # the name of the file to be processed
local CMD="awk '!seen[\$0]++' $1" # escape the '$' in '$0'
becomes
local CMD="awk '!seen[$0]++' app.js"
That should fix the issues shown in your function's output; as for the other issues ... you're getting a good bit of feedback in the various comments ...
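If the goal of CMD is only to build the command before running it, one common alternative to a string is a bash array, which keeps each argument intact and needs no escaping of $0. The following is a minimal sketch rather than a drop-in replacement for your function; dedupe_lines is a hypothetical name, and it only covers the print-to-stdout case:

```shell
#!/usr/bin/env bash
# dedupe_lines FILE
# Hypothetical helper: prints FILE with duplicate lines removed,
# keeping the first occurrence of each line.
dedupe_lines() {
    # Single quotes keep $0 literal, so awk (not the shell) expands it.
    # Each array element is passed to awk as exactly one argument.
    local -a cmd=(awk '!seen[$0]++' "$1")
    "${cmd[@]}"
}
```

Because the awk program sits in single quotes inside the array, the shell never substitutes its own $0 (such as /bin/bash) into it.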

How to find the combination of files in unix directory?

I have a set of files at a directory. I need to exit out of my script if i don't find the pairs of files at a given time.
Let's say i have these 3 files at directory $SRC_DIR
file 1: apple_iphone_file.zip
file 2: apple_ipad_file.zip
file 3: apple_mac_file.zip
If these 3 set of files are present i am doing some post processing.
There can be multiple pairs like 2,3, OR N set of these 3 files (file1,file2,file3).
I should exit the script if the same set are not present for all 3 files.
I am planing to count file 1 and if it gives me 2 , i will check if the other two files (file 2 and file 3) also gives me same count , else i will exit.
Do you think , we can do in any other way too?
Any input is highly appreciated.
Code Tried
#!/usr/bin/ksh
file1_count=$(ls ${SRC_DIR}/apple_iphone_file.zip | wc -l)
file2_count=$(ls ${SRC_DIR}/apple_ipad_file.zip | wc -l)
file3_count=$(ls ${SRC_DIR}/apple_mac_file.zip | wc -l)
if [ "$file1_count" == "$file2_count" -a "$file2_count" == "$file3_count" ]; then
echo "Files count match"
else
echo "Files count don't match"
exit 1
fi
This is giving me the results. However, if the files aren't present (none of them) it still shows me "Count Match".
Two scripts, check and pre-process. In check, if the variables corresponding with all three files equal 1, run pre-process. The default action (which runs next if that fails) is to exit.
Pre-process finds all files in the directory, puts their names in an input stack, and then uses each name as input for the main function. The code between the two ed lines is an example; replace it with your own. After that, it exits.
check.sh:-
#!/bin/sh
find apple_iphone_file.zip && iphone=1
find apple_ipad_file.zip && ipad=1
find apple_mac_file.zip && mac=1
[ "${iphone}" -eq 1 ] && [ "${ipad}" -eq 1 ] && \
[ "${mac}" -eq 1 ] && ./pre-process.sh
exit 0
pre-process.sh:-
#!/bin/sh
next() {
[ -s stack ] && main
end
}
main() {
line=$(ed -s stack < edprint+.txt)
echo "${line}" | tr '[a-z]' '[A-Z]'
ed -s stack < edpop+.txt
next
}
end() {
rm -v ./stack
rm -v ./edprint+.txt
rm -v ./edpop+.txt
exit 0
}
find . -type f > stack
cat >> edprint+.txt << EOF
1
q
EOF
cat >> edpop+.txt << EOF
1d
wq
EOF
There are so many possibilities.
One option:
my_counter=""
[ -f "file_name_1" ] && my_counter="x$my_counter"
[ -f "file_name_2" ] && my_counter="x$my_counter"
[ -f "file_name_3" ] && my_counter="x$my_counter"
if [ "${#my_counter}" -lt 2 ]; then
echo "Error"
else
echo "doing stuffs"
fi
You can easily add checks for more files, change the lower and upper limits, and chain further tests in the same way. Note: in shell scripts I tend to copy and paste such lines, because the commands are simple (so there is little to gain from refactoring), and I often find I need to add extra tests for specific cases.
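If every file of the set must be present before processing, the same idea can be written as a loop that fails on the first gap it finds. This is just a sketch under that assumption; require_files and its rf_ variables are names invented here, and the function avoids local so that it also runs under ksh and plain sh:

```shell
#!/bin/sh
# require_files FILE...
# Hypothetical helper: succeeds only if every named file exists as a
# regular file; each missing file is reported on stderr.
require_files() {
    rf_missing=0
    for rf_file in "$@"; do
        if [ ! -f "$rf_file" ]; then
            echo "missing: $rf_file" >&2
            rf_missing=1
        fi
    done
    return "$rf_missing"
}

# Example use (SRC_DIR as in the question):
# require_files "$SRC_DIR/apple_iphone_file.zip" \
#               "$SRC_DIR/apple_ipad_file.zip" \
#               "$SRC_DIR/apple_mac_file.zip" || exit 1
```

Unlike comparing counts, this cannot report a match when none of the files are present, because each file is tested on its own.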

Check if there is only one of two types of files in a directory

I have a script that checks if there is only one file in a directory. However, I can't figure out how to check if there is only one executable (no file extension) or script (.sh) in that directory. Here's what I currently have:
loc=(/Applications/*)
APPROOTDIR="${loc[RANDOM % ${#loc[@]}]}/"
APPDIR="${APPROOTDIR}Contents/MacOS/"
echo "APPROOTDIR is ${APPROOTDIR}"
echo "APPDIR is ${APPDIR}"
FIAD=$(ls ${APPDIR})
if [ `ls -1 ${APPDIR}* 2>/dev/null | wc -l ` == 1 ]; then
echo "One executable or script: ${FIAD}"
else
echo "Not one executable or script: ${FIAD}"
fi
Does anyone know how I can do this?
Don't parse ls, populate another array with the directory entries and work on it instead.
shopt -s nullglob
# set up loc, APPDIR, etc. here
ent=("$APPDIR"*)
if [[ ${#ent[@]} -eq 1 && ( $ent = *.sh || -x $ent ) ]]; then
echo 'One executable or script: '
else
echo 'Not one executable or script: '
fi
printf '%q\n' "${ent[@]#"$APPDIR"}"
Note that variable names in all uppercase are conventionally reserved for the shell and for environment variables; lower- or mixed-case names are recommended for your own variables.

Simplest way to "correct" an accidental use of mv instead of an hg mv?

I have a tracked foo. Now, since I'm absent-minded, I've run:
mv foo bar
now, when I do hg st, I get:
! foo
? bar
I want to fix this retroactively - as though I'd done an hg mv foo bar.
Now, I could write a bash script which does that for me - but is there something better/simpler/smarter I could do?
Use the --after option: hg mv --after foo bar
$ hg mv --help
hg rename [OPTION]... SOURCE... DEST
aliases: move, mv
rename files; equivalent of copy + remove
Mark dest as copies of sources; mark sources for deletion. If dest is a
directory, copies are put in that directory. If dest is a file, there can
only be one source.
By default, this command copies the contents of files as they exist in the
working directory. If invoked with -A/--after, the operation is recorded,
but no copying is performed.
This command takes effect at the next commit. To undo a rename before
that, see 'hg revert'.
Returns 0 on success, 1 if errors are encountered.
options ([+] can be repeated):
 -A --after               record a rename that has already occurred
 -f --force               forcibly copy over an existing managed file
 -I --include PATTERN [+] include names matching the given patterns
 -X --exclude PATTERN [+] exclude names matching the given patterns
 -n --dry-run             do not perform actions, just print output
    --mq                  operate on patch repository
(some details hidden, use --verbose to show complete help)
Here's what I'm doing right now;
#!/bin/bash
function die {
echo "$1" >&2
exit -1
}
(( $# == 2 )) || die "Usage: $0 <moved filename> <original filename>"
[[ -e "$1" ]] || die "Not an existing file: $1"
[[ ! -e "$2" ]] || die "Not a missing file: $2"
hg_st_lines_1=$(hg st "$1" 2>/dev/null | wc -l)
hg_st_lines_2=$(hg st "$2" 2>/dev/null | wc -l)
(( ${hg_st_lines_1} == 1 )) || die "Expected exactly one line in hg status for $1, but got ${hg_st_lines_1}"
(( ${hg_st_lines_2} == 1 )) || die "Expected exactly one line in hg status for $2, but got ${hg_st_lines_2}"
[[ "$(hg st "$1" 2>/dev/null)" == \?* ]] || die "Mercurial does not consider $1 to be an unknown (untracked) file"
[[ "$(hg st "$2" 2>/dev/null)" =~ !.* ]] || die "Mercurial does not consider $2 to be a missing file"
mv -- "$1" "$2"
hg mv -- "$2" "$1"

Bash Script - Will not completely execute

I am writing a script that will take in 3 inputs and then search all files within a predefined path. However, my grep command seems to be breaking the script with error code 123. I have been staring at it for a while and cannot see the error, so I was hoping someone could point it out. Here is the code:
#! /bin/bash -e
#Check if path exists
if [ -z $ARCHIVE ]; then
echo "ARCHIVE NOT SET, PLEASE SET TO PROCEED."
echo "EXITING...."
exit 1
elif [ $# -ne 3 ]; then
echo "Illegal number of arguments"
echo "Please enter the date in yyyy mm dd"
echo "EXITING..."
exit 1
fi
filename=output.txt
#Simple signal handler
signal_handler()
{
echo ""
echo "Process killed or interrupted"
echo "Cleaning up files..."
rm -f out
echo "Finished"
exit 1
}
trap 'signal_handler' KILL
trap 'signal_handler' TERM
trap 'signal_handler' INT
echo "line 32"
echo $1 $2 $3
#Search for the TimeStamp field and replace the / and : characters
find $ARCHIVE | xargs grep -l "TimeStamp: $2/$3/$1"
echo "line 35"
fileSize=`wc -c out.txt | cut -f 1 -d ' '`
echo $fileSize
if [ $fileSize -ge 1 ]; then
echo "no"
xargs -n1 basename < $filename
else
echo "NO FILES EXIST"
fi
I added the echo's to know where it was breaking. My program prints out line 32 and the args but never line 35. When I check the exit code I get 123.
Thanks!
Notes:
ARCHIVE is set to a test directory, i.e. /home/'uname'/testDir
$1 $2 $3 == yyyy mm dd (ie a date)
In testDir there are N directories. Inside these directories there are data files that contain data as well as a time tag. The time tag is of the following format: TimeStamp: 02/02/2004 at 20:38:01
The script's goal is to find all files that have the date tag you are searching for.
Here's a simpler test case that demonstrates your problem:
#!/bin/bash -e
echo "This prints"
true | xargs false
echo "This does not"
The snippet exits with code 123.
The problem is that xargs exits with code 123 if any invocation of the command exits with a status from 1 to 125. When xargs exits with non-zero status, -e causes the script to exit.
The quickest fix is to use || true to effectively ignore xargs' status:
#!/bin/bash -e
echo "This prints"
true | xargs false || true
echo "This now prints too"
The better fix is to not rely on -e, since this option is misleading and unpredictable.
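One way to avoid depending on -e here is to decide based on whether anything was found, rather than on the pipeline's exit status. The sketch below is only an illustration under that assumption: find_matching_files is a made-up name, and it uses find -print0 with xargs -0 (available in GNU and BSD tools) so that unusual filenames survive.

```shell
#!/usr/bin/env bash
# find_matching_files DIR PATTERN
# Hypothetical helper: prints the files under DIR that contain PATTERN.
# Returns 0 if at least one file matched, 1 otherwise.
find_matching_files() {
    local dir=$1 pattern=$2 matches
    # The pipeline's exit status is deliberately ignored: xargs reports 123
    # whenever any grep invocation exits non-zero, even if other batches matched.
    matches=$(find "$dir" -type f -print0 | xargs -0 grep -l -- "$pattern") || true
    [[ -n $matches ]] || return 1
    printf '%s\n' "$matches"
}
```

In the script from the question this could be called as: find_matching_files "$ARCHIVE" "TimeStamp: $2/$3/$1" || echo "NO FILES EXIST"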
xargs returns exit code 123 when grep returns a nonzero code even just once. Since you're using -e (#!/bin/bash -e), bash exits the script as soon as one of its commands returns a nonzero exit code. Not using -e would allow your code to continue. Disabling it just for that part can be a solution too:
set +e ## Disable
find "$ARCHIVE" | xargs grep -l "TimeStamp: $2/$3/$1" ## If none of the files passed to a `grep` invocation match the pattern, `grep` returns a nonzero code.
set -e ## Enable again.
Consider putting quotes around your variables as well, like "$ARCHIVE", to prevent word splitting.
-d '\n' may also be required if any of your filenames contain spaces.
find "$ARCHIVE" | xargs -d '\n' grep -l "TimeStamp: $2/$3/$1"
