Grabbing the newest file from a subset of the contents of a folder in bash

I have a folder with such contents
nass@starmaze:~/audio_setup/scripts$ ls -l ../jmess/
total 32
-rw-rw-r-- 1 nass users 1573 Νοέ 16 2014 jmess_fxio56-78feedsHDA-play12.jmess
-rw-rw-r-- 1 nass nass 1573 Δεκ 13 2014 jmess_pb-2.jmess
-rw-rw-r-- 1 nass nass 1573 Δεκ 20 2014 jmess_pb-3.jmess
-rw-rw-r-- 1 nass nass 1939 Ιούν 12 13:05 jmess_starmazeOnMaster.jmess
-rw-rw-r-- 1 nass nass 2163 Δεκ 15 2014 jmess_starmazeOnMaster.jmess.bak1-art
-rw-rw-r-- 1 nass nass 2161 Δεκ 15 2014 jmess_starmazeOnMaster.jmess.bak2-bcr
-rw-rw-r-- 1 nass nass 2389 Δεκ 22 2014 jmess_starmazeOnMaster.jmess.bak3-hoo
-rw-rw-r-- 1 nass nass 2163 Δεκ 15 2014 jmess_starmazeOnMaster.jmess.bak4-dsp
I want to be able to pick up the newest file, but only from the subset of files that do not contain the word "Master" in them. And I want to put that in a bash script.
So this
ls -t1 "${JCMESS_FOLDER}" | head -n1
provides the newest file in the folder, while this
ls -t1 "${JCMESS_FOLDER}"/!(*Master*) | head -n1
provides the newest file among the subset that I am interested in.
However, when I place the latter in a bash script as
$NEWEST_JCMESS_FILE=$( ls -t1 "${JCMESS_FOLDER}"/!(*Master*) | head -n1 )
it does not work:
./06.load_jcmess: command substitution: line 8: syntax error near unexpected token `('
./06.load_jcmess: command substitution: line 8: ` ls -t1 "${JCMESS_FOLDER}"/!(*Master*) | head -n1 )'
I am not sure what is wrong in this case, and I have not been able to find an answer.
Thank you in advance for your help.

This is BashFAQ #3:
newest() {
    local candidate result=$1; shift # start with first argument as candidate
    [[ -e $result ]] || return # handle case where nothing matched
    for candidate; do # for loop default behavior is to loop over "$@"
        [[ $candidate -nt $result ]] && result=$candidate
    done
    printf '%s\n' "$result"
}
shopt -s extglob # enable extglobs, i.e. !(...); they are off by default in scripts
newest_file=$(newest "$JCMESS_FOLDER"/!(*Master*))
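If enabling extglob is not wanted, the same filter can be sketched with a plain glob and a pattern test inside the loop. This is only an illustration: the function name newest_without and its two arguments are assumptions, not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper (not from the original answer): print the newest entry
# in directory $1 whose name does not match the glob pattern $2.
newest_without() {
    local dir=$1 exclude=$2 candidate result=
    for candidate in "$dir"/*; do
        [[ -e $candidate ]] || continue                  # empty directory: the glob stayed literal
        [[ ${candidate##*/} == $exclude ]] && continue   # unquoted right side is matched as a pattern
        if [[ -z $result || $candidate -nt $result ]]; then
            result=$candidate
        fi
    done
    [[ -n $result ]] && printf '%s\n' "$result"
}
```

In the script it could be called as NEWEST_JCMESS_FILE=$(newest_without "$JCMESS_FOLDER" '*Master*'). Note that a shell assignment is written without a leading $ on the variable name.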

Related

Does anybody have a script that counts the number of consecutive files which contain a specific word?

Any resources or advice would help, since I am pretty rubbish at scripting
So, I need to go to this path: /home/client/data/storage/customer/data/2020/09/15
And check to see if there are 5 or more consecutive files that contain the word "REJECTED":
ls -ltr
-rw-rw-r-- 1 root root 5059 Sep 15 00:05 customer_rlt_20200915000514737_20200915000547948_8206b49d-b585-4360-8da0-e90b8081a399.zip
-rw-rw-r-- 1 root root 5023 Sep 15 00:06 customer_rlt_20200915000547619_20200915000635576_900b44dc-1cf4-4b1b-a04f-0fd963591e5f.zip
-rw-rw-r-- 1 root root 39856 Sep 15 00:09 customer_rlt_20200915000824108_20200915000908982_b87b01b3-a5dc-4a80-b19d-14f31ff667bc.zip
-rw-rw-r-- 1 root root 39719 Sep 15 00:09 customer_rlt_20200915000901688_20200915000938206_38261b59-8ebc-4f9f-9e2d-3e32eca3fd4d.zip
-rw-rw-r-- 1 root root 12829 Sep 15 00:13 customer_rlt_20200915001229811_20200915001334327_1667be2f-f1a7-41ae-b9ca-e7103d9abbf8.zip
-rw-rw-r-- 1 root root 12706 Sep 15 00:13 customer_rlt_20200915001333922_20200915001357405_609195c9-f23a-4984-936f-1a0903a35c07.zip
Example of rejected file:
customer_rlt_20200513202515792_20200513202705506_5b8deae0-0405-413c-9a81-d1cc2171fa51REJECTED.zip
What I have so far:
#!/bin/bash
YYYY=$(date +%Y)
MM=$(date +%m)
DD=$(date +%d)
#Set constants
CODE_OK=0
CODE_WARN=1
CODE_CRITICAL=2
CODE_UNKNOWN=3
#Set Default Values
FILE="/home/client/data/storage/customer/data/${YYYY}/${MM}/${DD}"
# the path is a directory, so test it with -d rather than -f
if [ ! -d "$FILE" ]
then
    echo "NO TRANSACTIONS FOUND"
    exit $CODE_CRITICAL
fi
You can do something quick in AWK:
$ cat consec.awk
/REJECTED/ {
    if (match_line == NR - 1) {
        consecutives++
    } else {
        consecutives = 1
    }
    if (consecutives == 5) {
        print "5 REJECTED"
        exit
    }
    match_line = NR
}
$ touch 1 2REJECTED 3REJECTED 5REJECTED 6REJECTED 7REJECTED 8
$ ls -1 | awk -f consec.awk
5 REJECTED
$ rm 3REJECTED; touch 3
$ ls -1 | awk -f consec.awk
$
This works by matching lines containing REJECTED, counting consecutive matches (checked with match_line == NR - 1, which means "the last matching line was the previous line"), and printing "5 REJECTED" once the count reaches 5.
I've used ls -1 (note digit 1, not letter l) to sort by filename in this example. You could use ls -1rt (digit 1 again) to sort by file modification time, as in your original post.
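To use the same idea from the monitoring script, the check can be wrapped so that it reports through its exit status. The following is a sketch under assumptions: the function name has_consecutive and its two parameters (the word and the threshold) are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read file names on stdin and succeed (exit 0) if at
# least $2 consecutive names contain the string $1.
has_consecutive() {
    awk -v word="$1" -v limit="$2" '
        index($0, word) {            # this name contains the word
            if (++run >= limit) {
                found = 1
                exit                 # jumps to END
            }
            next
        }
        { run = 0 }                  # any other name breaks the run
        END { exit !found }
    '
}
```

A call such as ls -1rt "$FILE" | has_consecutive REJECTED 5 could then decide whether to exit with $CODE_CRITICAL. The usual caveats about parsing ls output apply to names containing newlines.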

How to iterate through multiple directories with multiple ifs in bash?

Unfortunately I'm quite new to bash, and I want to write a script that starts in a main directory and checks all subdirectories one by one for the presence of certain files; if those files are present, it should perform an operation on them. For now, I have written a simplified version to test whether I can do the first part (checking for the files in each directory). This code runs without any errors that I can see, but it does not echo anything to say that it has found the files, which I know are there.
#!/bin/bash
runlist=(1 2 3 4 5 6 7 8 9)
for f in *; do
if [[ -d {$f} ]]; then
#if f is a directory then cd into it
cd "{$f}"
for b in $runlist; do
if [[ -e "{$b}.png" ]]; then
echo "Found {$b}"
#if the file exists then say so
fi
done
cd -
fi
done
Welcome to stackoverflow.
The following will do the trick (a combination of find, array, and if then else):
# list of files we are looking for
runlist=(1 2 4 8 16 32 64 128)
# find each of the above anywhere below the current directory
# find descends into every level by default; based on your example you may
# only want the immediate subdirectories, in which case add -maxdepth 2
# right after the starting directory in the find command
for b in "${runlist[@]}"; do
    echo
    PATH_TO_FOUND_FILE=$(find . -name "$b.png")
    if [ -z "$PATH_TO_FOUND_FILE" ]
    then
        echo "nothing found" >> /dev/null
    else
        # You wanted a positive confirmation, so
        echo "found $b.png"
        # Now do something with the found file. Let's say ls -l: change that to whatever
        # (left unquoted so that several matches become separate arguments)
        ls -l $PATH_TO_FOUND_FILE
    fi
done
Here is an example run:
mamuns-mac:stack foo$ ls -lR
total 8
drwxr-xr-x 4 foo 1951595366 128 Apr 11 18:03 dir1
drwxr-xr-x 3 foo 1951595366 96 Apr 11 18:03 dir2
-rwxr--r-- 1 foo 1951595366 652 Apr 11 18:15 find_file_and_do_something.sh
./dir1:
total 0
-rw-r--r-- 1 foo 1951595366 0 Apr 11 17:58 1.png
-rw-r--r-- 1 foo 1951595366 0 Apr 11 17:58 8.png
./dir2:
total 0
-rw-r--r-- 1 foo 1951595366 0 Apr 11 18:03 64.png
mamuns-mac:stack foo$ ./find_file_and_do_something.sh
found 1.png
-rw-r--r-- 1 foo 1951595366 0 Apr 11 17:58 ./dir1/1.png
found 8.png
-rw-r--r-- 1 foo 1951595366 0 Apr 11 17:58 ./dir1/8.png
found 64.png
-rw-r--r-- 1 foo 1951595366 0 Apr 11 18:03 ./dir2/64.png
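For comparison, here is a sketch of the loop from the question with its two expansion problems fixed: {$f} expands to the value wrapped in literal braces (the intended form is ${f}), and $runlist expands only to the first array element (all elements need "${runlist[@]}"). The function name check_subdirs is an assumption for illustration, and the sketch tests paths directly instead of using cd.

```shell
#!/bin/bash
# Hypothetical wrapper around the loop from the question.
check_subdirs() {
    local runlist=(1 2 3 4 5 6 7 8 9)
    local d b
    for d in */; do                     # the trailing slash matches directories only
        [[ -d $d ]] || continue         # no subdirectories: the glob stayed literal
        for b in "${runlist[@]}"; do    # [@] expands to every element
            if [[ -e ${d}${b}.png ]]; then
                echo "Found ${d}${b}.png"
            fi
        done
    done
}
```

Run from the main directory, it prints one "Found" line per matching file, for example Found dir1/1.png.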

For loop with if statements isn't working as expected in bash

It only prints the "else" statement for everything but I know for a fact the files exist that it's looking for. I've tried adapting some of the other answers but I thought this should definitely work.
Does anyone know what's wrong with my syntax?
# Contents of script
for ID_SAMPLE in $(cut -f1 metadata.tsv | tail -n +2);
do if [ -f ./output/${ID_SAMPLE} ]; then
echo Skipping ${ID_SAMPLE};
else
echo Processing ${ID_SAMPLE};
fi
done
Additional information
# Output directory
(base) -bash-4.1$ ls -lhS output/
total 170K
drwxr-xr-x 8 jespinoz tigr 185 Jan 3 16:16 ERR1701760
drwxr-xr-x 8 jespinoz tigr 185 Jan 17 18:03 ERR315863
drwxr-xr-x 8 jespinoz tigr 185 Jan 16 23:23 ERR599042
drwxr-xr-x 8 jespinoz tigr 185 Jan 17 00:10 ERR599072
drwxr-xr-x 8 jespinoz tigr 185 Jan 16 13:00 ERR599078
# Example of inputs
(base) -bash-4.1$ cut -f1 metadata.tsv | tail -n +2 | head -n 10
ERR1701760
ERR599078
ERR599079
ERR599070
ERR599071
ERR599072
ERR599073
ERR599074
ERR599075
ERR599076
# Output of script
(base) -bash-4.1$ bash test.sh | head -n 10
Processing ERR1701760
Processing ERR599078
Processing ERR599079
Processing ERR599070
Processing ERR599071
Processing ERR599072
Processing ERR599073
Processing ERR599074
Processing ERR599075
Processing ERR599076
# Checking a directory
(base) -bash-4.1$ ls -l ./output/ERR1701760
total 294
drwxr-xr-x 2 jespinoz tigr 386 Jan 15 21:00 checkpoints
drwxr-xr-x 2 jespinoz tigr 0 Jan 10 01:36 tmp
-f is for checking whether the name is a file, but all your names are directories. Use -d to check that.
if [ -d "./output/$ID_SAMPLE" ]
then
If you want to check whether the name exists with any type, use -e.
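Put together, the loop could look like the sketch below. The function name report_samples and its two parameters are assumptions for illustration; it also reads the IDs with while read instead of word-splitting a command substitution, which is a stylistic choice rather than part of the fix.

```shell
#!/bin/bash
# Hypothetical wrapper: $1 is the metadata TSV (header on its first line),
# $2 is the output directory that holds one subdirectory per sample.
report_samples() {
    local metadata=$1 outdir=$2 id_sample
    while IFS= read -r id_sample; do
        if [ -d "$outdir/$id_sample" ]; then     # -d: the name exists and is a directory
            echo "Skipping $id_sample"
        else
            echo "Processing $id_sample"
        fi
    done < <(cut -f1 "$metadata" | tail -n +2)   # first column, header skipped
}
```

With the layout from the question it would be called as report_samples metadata.tsv ./output.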

I don't want to print repeated lines based on columns 6 and 7

I don't want to print repeated lines based on columns 6 and 7. sort -u does not seem to help
cat /tmp/testing :-
-rwxrwxr-x. 1 root root 52662693 Feb 27 13:11 /home/something/bin/proxy_exec
-rwxrwxr-x. 1 root root 27441394 Feb 27 13:12 /home/something/bin/keychain_exec
-rwxrwxr-x. 1 root root 45570820 Feb 27 13:11 /home/something/bin/wallnut_exec
-rwxrwxr-x. 1 root root 10942993 Feb 27 13:12 /home/something/bin/log_exec
-rwxrwxr-x. 1 root root 137922408 Apr 16 03:43 /home/something/bin/android_exec
When I try cat /tmp/testing | sort -u -k 6,6 -k 7,7 I get :-
-rwxrwxr-x. 1 root root 137922408 Apr 16 03:43 /home/something/bin/android_exec
-rwxrwxr-x. 1 root root 52662693 Feb 27 13:11 /home/something/bin/proxy_exec
Desired output is below, as that is the only file different from others based on month and date column
-rwxrwxr-x. 1 root root 137922408 Apr 16 03:43 /home/something/bin/android_exec
To avoid printing repeated lines based on columns 6 and 7 using awk, you could:
$ awk '
++seen[$6,$7]==1 {        # count seen instances
    keep[$6,$7]=$0        # keep first seen ones
}
END {                     # in the end
    for(i in seen)
        if(seen[i]==1)    # the ones seen only once
            print keep[i] # get printed
}' file                   # from file or pipe your ls to the awk
Output for given input:
-rwxrwxr-x. 1 root root 137922408 Apr 16 03:43 /home/something/bin/android_exec
Notice: All standard warnings against parsing ls output still apply.
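Tried on GNU sed (this hardcodes the repeated date, Feb 27):
sed -E '/^\s*(\S+\s+){5}Feb\s+27/d' testing
Tried on GNU awk (this drops every line whose date matches that of the first line):
awk 'NR==1{a=$6$7;next} a!=$6$7{print}' testing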
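sort -u keeps one line per key instead of dropping every key that repeats, which is why it leaves one Feb 27 line in the output. As an alternative sketch that neither hardcodes a date nor depends on the unspecified order of for (i in seen), the file can be read twice: once to count each month/day pair, once to print the pairs seen exactly once. The function name unique_by_date is an assumption, and the approach needs a regular file rather than a pipe.

```shell
#!/bin/bash
# Hypothetical helper: print the lines of file $1 whose (column 6, column 7)
# pair occurs exactly once, keeping the input order.
unique_by_date() {
    awk 'NR == FNR { count[$6, $7]++; next }   # first pass: count each pair
         count[$6, $7] == 1                    # second pass: print the singletons
        ' "$1" "$1"
}
```

For the sample data it would be called as unique_by_date /tmp/testing.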

How to get a filename list with ncftp?

So I tried
ncftpls -l
which gives me a list
-rw-r--r-- 1 100 ftpgroup 3817084 Jan 29 15:50 1548773401.tar.gz
-rw-r--r-- 1 100 ftpgroup 3817089 Jan 29 15:51 1548773461.tar.gz
-rw-r--r-- 1 100 ftpgroup 3817083 Jan 29 15:52 1548773521.tar.gz
-rw-r--r-- 1 100 ftpgroup 3817085 Jan 29 15:53 1548773582.tar.gz
-rw-r--r-- 1 100 ftpgroup 3817090 Jan 29 15:54 1548773642.tar.gz
But all I want is to check the timestamp (which is the name of the tar.gz)
How to only get the timestamp list ?
As requested: all I wanted to do was delete old backups, so awk was a good idea (at least it was effective), even if the parameters in the answer weren't the right ones. My method of deleting old backups is probably not the best, but it works:
# the three-argument form of match() is a gawk extension
ncftpls *authParams* | (awk '{match($9,/^[0-9]+/, a)}{ print a[0] }') | while read fileCreationDate; do
    VALIDITY_LIMIT="$(($(date +%s)-600))"
    a=$VALIDITY_LIMIT
    b=$fileCreationDate
    if [ "$b" -lt "$a" ]; then
        deleteFtpFile "$b"
    fi
done;
You can use awk to display only the file names, which carry the timestamps (field 9 in the listing above; field 5 is the size):
ncftpls -l | awk '{ print $9 }'
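In the listing from the question, the file name is the ninth whitespace-separated field and the size is the fifth. A sketch that prints the bare timestamp by stripping the .tar.gz suffix with POSIX awk (no gawk extensions) might look like this; the function name timestamps_from_listing is an assumption, and it presumes file names without spaces.

```shell
#!/bin/bash
# Hypothetical filter: read an "ls -l" style listing on stdin and print the
# file name (field 9) with a trailing .tar.gz removed.
timestamps_from_listing() {
    awk '{ name = $9; sub(/\.tar\.gz$/, "", name); print name }'
}
```

It would sit at the end of the pipeline, as in ncftpls -l *authParams* | timestamps_from_listing.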
