Extracting maximum numbers from file names - bash

I'm new to bash and wondered if you guys could help - I have a list of files named things like
amp1_X_1
amp1_X_2
...
amp43_X_3
and have to extract the maximum values of the numbers which appear either side of X.
I've been looking at various links to help solve my problem, but when I try and tweak the examples to suit my purpose I get various errors, like "command not found" or %% operators remaining unevaluated.
How do I Select Highest Number From Series of <string>_# File Names in Bash Script
extract numbers from file names
For example, I tried something like
max=-1
for file in amp*_X_1
do
pattern=_X_*
num=${file}%%${pattern}
num=${num}##amp
echo "num is $num"
[[ $num -gt $max ]] && max=$num
done
echo "max is: $max"
(credit largely to ghostdog74 in the first link) which would in any case only work for one set of numbers, but it returns with the %% unevaluated. Have I missed something stupid/is there a better way to do this?
Thanks!

There are some errors in your script:
1) for file in amp*_X_1
This glob only matches files in the current directory whose names end in _X_1. If you want the files from a specific folder, you can use commands like ls or find,
i.e.
for file in $(ls -1 <your file>)
for file in $(find . -name "<your file>")
etc.
2) num=${file}%%${pattern}
The %% operator has to go inside the braces, so use num=${file%%$pattern}
Below you can find your script edited:
max=-1
for file in $(find . -name "amp*")
do
echo "file[$file]"
pattern=_X_*
num=${file%%$pattern}
num=$(echo $num|sed "s/^\.\///g" | sed "s/amp//g")
echo "num is $num"
[[ $num -gt $max ]] && max=$num
done
echo "max is: $max"
output:
sh-4.3$ bash -f main.sh
file[./amp1_X_1]
num is 1
file[./amp3_X_2]
num is 3
file[./amp3_X_1]
num is 3
file[./amp43_X_1]
num is 43
max is: 43
This script now looks for the maximum among the numbers on the left side of X.
Your question doesn't make clear what result you expect:
1) the maximum over all numbers on both sides of X?
2) the maximum over the numbers on the left side of X?
3) the maximum over the numbers on the right side of X?
If it's case 1, you have to extract the second number into another variable and compare it as well.
Something like:
max=-1
for file in $(find . -name "amp*")
do
echo "file[$file]"
pattern=_X_*
num=${file%%$pattern}
num=$(echo $num|sed "s/^\.\///g" | sed "s/amp//g")
echo "num is $num"
pattern2=*_X_
num2=${file##$pattern2}
echo "num2 is $num2"
[[ $num -gt $max ]] && max=$num
[[ $num2 -gt $max ]] && max=$num2
done
echo "max is: $max"
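If what you actually want is the maximum on each side of X separately, the same parameter expansions can be wrapped in a small function. This is only a sketch: the name max_around_x is made up for illustration, and it assumes the names look exactly like amp<number>_X_<number>, as in the question.

```bash
# Hypothetical helper: prints "<left max> <right max>" for names shaped like
# amp<number>_X_<number>; anything else is skipped.
max_around_x() {
  local name left right max_left=-1 max_right=-1
  for name in "$@"; do
    name=${name##*/}        # drop a leading directory such as ./
    left=${name%%_X_*}      # amp43_X_3 -> amp43
    left=${left#amp}        # amp43     -> 43
    right=${name##*_X_}     # amp43_X_3 -> 3
    # skip names where either part is empty or not purely digits
    case $left in ''|*[!0-9]*) continue ;; esac
    case $right in ''|*[!0-9]*) continue ;; esac
    if [ "$left" -gt "$max_left" ]; then max_left=$left; fi
    if [ "$right" -gt "$max_right" ]; then max_right=$right; fi
  done
  echo "$max_left $max_right"
}
```

Called as max_around_x amp*_X_*, it prints the left maximum followed by the right maximum. For the three names listed in the question (amp1_X_1, amp1_X_2, amp43_X_3) that would be 43 3.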

Related

Unix Scripting - Finding Minimum and Maximum (Bash Shell)

My code below is part of an assignment, but I'm racking my head against the desk not understanding why it won't assign a "MIN" value. I tried assigning the MIN and MAX to ${LIST[0]} just to have the first index in place, but it returns the whole array, which doesn't make sense to me. I'm executing this on a CentOS VM (which I can't see making a difference). I know the beginning of the first and second "if" statements need better logic, but I'm more concerned on the MIN and MAX outputs.
#!/bin/bash
LIST=()
read -p "Enter a set of numbers. " LIST
MIN=
MAX=
if [ ${#LIST[*]} == 0 ]; then echo "More numbers are needed."; fi
if [ ${#LIST[@]} -gt 0 ]; then
for i in ${LIST[@]}; do
if [[ $i -gt $MAX ]]; then
MAX=$i
fi
if [[ $i -lt $MIN ]]; then
MIN=$i
fi
done
echo Max is: $MAX.
echo Min is: $MIN.
fi
The code is almost functional.
Since LIST is meant to be an array, not a plain scalar variable, change read -p "Enter a set of numbers. " LIST to:
read -p "Enter a set of numbers. " -a LIST
Move the $MIN and $MAX init code down 5 lines, (just before the for loop):
MIN=
MAX=
...and change it to:
MIN=${LIST[0]}
MAX=$MIN
And it'll work. Test:
echo 3 5 6 | ./minmax.sh
Output:
Max is: 6.
Min is: 3.
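For comparison, here is the same idea as a stand-alone function, seeded with the first value as described above. It is a sketch rather than a drop-in replacement for the assignment: the name minmax is invented here, and it assumes every argument is an integer.

```bash
# Hypothetical helper: prints the largest and smallest of its integer arguments.
minmax() {
  if [ "$#" -eq 0 ]; then
    echo "More numbers are needed." >&2
    return 1
  fi
  local n min="$1" max="$1"   # seed both with the first value
  for n in "$@"; do
    if [ "$n" -gt "$max" ]; then max=$n; fi
    if [ "$n" -lt "$min" ]; then min=$n; fi
  done
  echo "Max is: $max."
  echo "Min is: $min."
}
```

After read -p "Enter a set of numbers. " -a LIST, it could be called as minmax "${LIST[@]}".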

How to increment every number in a bash variable

I have a bash script to control Linux perf. As you may know, perf takes a core list, which can be specified in one of three ways:
-C1 #core 1 only
-C1-4 # core 1 through 4
-C1,3 # core 1 and 3
Currently, I have an environment variable CORENO which controls -C$CORENO.
However, I need to shift CORENO by a fixed offset (e.g. 2).
I could do ((CORENO+=2)), but that only works for case 1.
Is there a Linux/bash trick that lets me apply a fixed offset to every number in a bash variable?
Since you're on Linux, here's some GNU sed:
addtwo() {
sed -re 's/[^0-9,-]//g; s/[0-9]+/$((\0+2))/g; s/^/echo /e;' <<< "$1"
}
addtwo "1"
addtwo "1-4"
addtwo "3,4,5"
It will output:
3
3-6
5,6,7
It works by replacing all numbers with $((number+2)) and evaluating the result as a shell command. A whitelisting of allowed characters is applied first to avoid any security issues.
Take a look at seq
for core in `seq 2 10`; do
echo CORENO=$core
done
I’ve upvoted the sed-based answer from @that other guy because I like it more than mine, which is a “pure bash” solution, consisting of a recursive function.
function increment () {
local current="$1" n=$(($2))
if [[ "$current" =~ ^[0-9]+$ ]]; then
echo $((current+n))
elif [[ $current == *,* ]]; then
echo $(increment ${current%%,*} $n),$(increment ${current#*,} $n)
elif [[ $current == *-*-* ]]; then
echo ERROR
elif [[ $current == *-* ]]; then
echo $(increment ${current%-*} $n)-$(increment ${current#*-} $n)
else
echo ERROR
fi
}
CORENO=3-5
CORENO=$(increment $CORENO 2)
echo $CORENO
increment 3-5,6-8 3
My function will print ERROR when given an illegal argument. The one from @that other guy is much more liberal...
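A third option, somewhere between the two, is to walk the string from left to right without recursion or evaluating generated shell code. The sketch below is just one way to do that: offset_cores is a made-up name, and it assumes plain decimal numbers without leading zeros, separated by "," or "-". It rejects any other character but does not check that the separators form a sensible list.

```bash
# Hypothetical helper: adds an offset to every number in a perf-style core list.
# Assumes plain decimal numbers without leading zeros, separated by "," or "-".
offset_cores() {
  local rest="$1" offset="$2" out="" digits
  while [ -n "$rest" ]; do
    case $rest in
      [0-9]*)
        digits=${rest%%[!0-9]*}          # leading run of digits
        out=$out$(( digits + offset ))
        rest=${rest#"$digits"}
        ;;
      [,-]*)
        out=$out${rest%"${rest#?}"}      # copy the separator unchanged
        rest=${rest#?}
        ;;
      *)
        echo "ERROR" >&2                 # anything else is rejected
        return 1
        ;;
    esac
  done
  echo "$out"
}
```

For example, CORENO=$(offset_cores "$CORENO" 2) turns 1-4 into 3-6 and 3,4,5 into 5,6,7.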

How to list files with words exceeding n characters in all subdirectories

I have to write a shell script that creates a file containing the name of each text file in a folder (given as a parameter) and its subfolders that contains words longer than n characters (n is read from the keyboard).
I have written the following code so far:
#!/bin/bash
Verifies if the first given parameter is a folder:
if [ ! -d $1 ]
then echo $1 is not a directory\!
exit 1
fi
Reading n
echo -n "Give the number n: "
read n
echo "You entered: $n"
Destination where to write the name of the files:
destinatie="destinatie"
The part that I think is causing the problems:
nr=0;
#while read line;
#do
for fisier in `find $1 -type f`
do
counter=0
for word in $(<$fisier);
do
file=`basename "$fisier"`
length=`expr length $word`
echo "$length"
if [ $length -gt $n ];
then counter=$(($counter+1))
fi
done
if [ $counter -gt $nr ];
then echo "$file" >> $destinatie
fi
done
break
done
exit
The script works, but it does a few more steps that I don't need. It seems to read some files more than once. Can anyone help me, please?
Does this help?
egrep -lr "\w{$n,}" $1/* >$destinatie
Some explanation:
\w means: a character that words consist of
{$n,} means: number of consecutive characters is at least $n
Option -l lists files and does not print the grepped text and -r performs a recursive scan on your directory in $1
Edit:
a bit more complete version around the egrep command:
#!/bin/bash
die() { echo "$@" 1>&2 ; exit 1; }
[ -z "$1" ] && die "which directory to scan?"
dir="$1"
[ -d "$dir" ] || die "$dir isn't a directory"
echo -n "Give the number n: "
read n
echo "You entered: $n"
[ $n -le 0 ] && die "the number should be > 0"
destinatie="destinatie"
egrep -lr "\w{$n,}" "$dir"/* | while read f; do basename "$f"; done >$destinatie
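The same search can also be packaged as a function. Note two deliberate differences in this sketch, both of which are my assumptions rather than part of the answer above: the question asks for words longer than n, so the minimum run length used here is n+1, and \w is spelled out as [[:alnum:]_], which GNU grep documents as equivalent. The name list_long_word_files is invented for illustration.

```bash
# Hypothetical helper: prints the base name of every file under "$1" that
# contains a word longer than "$2" characters.
list_long_word_files() {
  local dir="$1" n="$2" f
  # "longer than n" means at least n+1 word characters in a row
  grep -rlE "[[:alnum:]_]{$(( n + 1 )),}" "$dir" |
    while IFS= read -r f; do
      basename "$f"
    done
}
```

Redirecting the output with > "$destinatie" (rather than >>) overwrites the file on every run, so results from earlier runs do not accumulate.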
This code has syntax errors, probably leftovers from your commented-out while loop: It would be best to remove the last 3 lines: done causes the error, break and exit are unnecessary as there is nothing to break out from and the program always terminates at its end.
The program appears to output files multiple times because you only ever append to $destinatie, so the results of earlier runs pile up. You could simply delete that file when you start:
rm -f "$destinatie"
You echo the numbers to stdout (echo "$length") and the file names to $destinatie (echo "$file" >> $destinatie). I do not know if that is intentional.
I found the problem. It was the directory I was searching in. Because I had worked on the files in that directory and modified them, it seems some files were left behind that were not shown in the file explorer but that the script would still find. I created another directory, passed it as the parameter, and it works. Thank you for your answers.

Using shell script, how do I put numbers in formats like 000?

I have a large number of files which have file names in the format XXX_name_YYY.out, with XXX and YYY being numbers. I want to use a loop to move all files starting with XXX_name to a folder with the name 'XXX_name'. I am very new to shell scripting and only code a bit in C.
I would do something like this but the format of the numbers does not match the numbers in the file names.
c=1
while[c -le 100]
do
d=1
mkdir "$c"_name
while[d - le 100]
do
mv "$c"_name_"$d".out "$c"_name/"$c"_name_"$d".out
(( d++ ))
done
(( c++ ))
done
for FILE in [0-9][0-9][0-9]_name_[0-9][0-9][0-9].out; do
DIR=${FILE%_*.out}
[[ -d $DIR ]] || mkdir "$DIR" && echo mv "$FILE" "$DIR/"
done
Remove the echo once you're sure it does what you want.
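As for the 000 format itself: if you do want to generate the names from counters, as in the loop from the question, printf can zero-pad them. A minimal sketch, where out_name is a hypothetical helper and the counters are assumed to be plain decimal numbers without leading zeros:

```bash
# Hypothetical helper: builds a name such as 001_name_012.out from two counters.
# Assumes plain decimal arguments without leading zeros (printf would read 010 as octal).
out_name() {
  printf '%03d_name_%03d.out\n' "$1" "$2"
}
```

Inside the loop, "$(out_name "$c" "$d")" would then take the place of "$c"_name_"$d".out.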

I want to compare a variable with files in a directory and output the equals

I am making a bash script where I want to find files that are equal to a variable. The equals will then be used.
I want to use "mogrify" to shrink a couple of image files that have the same names as the ones I gather from a list (similar to "dpkg -l"). I am not actually using "dpkg -l", but something similar. My problem is that it prints all the files, not just the equal ones. I am pretty sure this could be done with awk instead of a for-loop, but I do not know how.
prog="`dpkg -l | awk '{print $1}'`"
for file in $dirone* $dirtwo*
do
if [ "basename ${file}" = "${prog}" ]; then
echo ${file} are equal
else
echo ${file} are not equal
fi
done
Could you please help me get this working?
First, I think there's a small typo. if [ "basename ${file}" =... should have backticks inside the double quotes, just like the prog=... line at the top does.
Second, if $prog is a multi-line string (like dpkg -l) you can't really compare a filename to the entire list. Instead you have to compare one item at a time to the filename.
Here's an example using dpkg and /usr/bin
#!/bin/bash
progs="`dpkg -l | awk '{print $2}'`"
for file in /usr/bin/*
do
base=`basename ${file}`
for prog in ${progs}
do
if [ "${base}" = "${prog}" ]; then
echo "${file}" matches "${prog}"
fi
done
done
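If the list is long, the inner loop can be replaced by a single whole-line lookup with grep. This is a sketch under my own assumptions: match_names is a made-up name, and the list is expected as one newline-separated string in the first argument, followed by the paths to check.

```bash
# Hypothetical helper: prints each path whose base name appears as a full line
# in the newline-separated list passed as the first argument.
match_names() {
  local list="$1" path base
  shift
  for path in "$@"; do
    base=${path##*/}                 # same result as basename for plain paths
    [ -n "$base" ] || continue
    # -F: fixed string, -x: must match the whole line, -q: no output
    if printf '%s\n' "$list" | grep -Fxq -- "$base"; then
      echo "$path"
    fi
  done
}
```

With the variables from the example above, it would be called as match_names "$progs" /usr/bin/*.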
The condition "$file = $prog" is a single string. You should try "$file" = "$prog" instead.
The following transcript shows the fix:
pax> ls -1 qq*
qq
qq.c
qq.cpp
pax> export xx=qq.cpp
pax> for file in qq* ; do
if [[ "${file} = ${xx}" ]] ; then
echo .....${file} equal
else
echo .....${file} not equal
fi
done
.....qq equal
.....qq.c equal
.....qq.cpp equal
pax> for file in qq* ; do
if [[ "${file}" = "${xx}" ]] ; then
echo .....${file} equal
else
echo .....${file} not equal
fi
done
.....qq not equal
.....qq.c not equal
.....qq.cpp equal
You can see in the last bit of output that only qq.cpp is shown as equal since it's the only one that matches ${xx}.
The reason you're getting true is because that's what non-empty strings will give you:
pax> if [[ "" ]] ; then
echo .....equal
fi
pax> if [[ "x" ]] ; then
echo .....equal
fi
.....equal
That's because that form is the string length checking variation. From the bash manpage under CONDITIONAL EXPRESSIONS:
string
-n string
True if the length of string is non-zero.
Update:
The new code in your question won't quite work as expected. You need:
if [[ "$(basename ${file})" = "${prog}" ]]; then
to actually execute basename and use its output as the first part of the equality check.
you can use case/esac
case "$file" in
"$prog" ) echo "same";;
esac

Resources