Q: find longest string using for loop (Bash) - bash

I'm learning bash, and I have an assignment where I need to iterate through a list of strings in bash using a for loop, and return the longest string.
This is what I've written:
max=-1
word=""
list=`cat random-text.txt | tr -s [:space:] " " | sed -r 's/([.* ])/\1\n/g' | grep -E "^a.*" | sed -r 's/(.*)[[:space:]]/\1/' | tr -s [:space:] " "`
for i in $list; do
int=`$i | wc -c`
if [ $int > $max ]; then
max=$int
word=$i
fi
done
echo The longest word in $infile that starts with $char is $i
That's probably a bit messy, but I'm having trouble with the for loop: I need the echo at the end to print the longest string I have found while iterating through the list.
** that's a part of a longer script I've written, I
Thanks in advance, much appreciated!

For some reason, when I run this script I get an error which says: "Command 'an' not found".
That's because $i | tries to run the content of the variable i as a command instead of feeding it to wc; in Bash the correct form is wc -c <<<$i (which also counts the trailing newline). Better still, just use int=${#i}.
Then in [ $int > $max ] the > is interpreted as an output redirection; the correct arithmetic comparison operator is -gt.
Finally, you don't echo the longest word found, but rather the last one processed; change $i to $word there.
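Putting the three fixes together, a minimal sketch might look like the following. The function name longest_word and the sample words are made up for illustration; the original script builds its list from random-text.txt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the longest of the words passed as arguments.
longest_word() {
    local max=-1 word="" i len
    for i in "$@"; do
        len=${#i}                          # string length, no wc needed
        if [ "$len" -gt "$max" ]; then     # -gt compares integers; > would redirect
            max=$len
            word=$i
        fi
    done
    printf '%s\n' "$word"                  # print the stored word, not the loop variable
}

longest_word an apple a avocado at         # prints avocado
```

When two words share the maximum length, this sketch keeps the first one because the comparison is strict.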

Related

Need help for string manipulation in a bash script

I'm not used to the syntax of bash scripts. I'm trying to read a file. For each line I want to keep only the part of the string before the delimiter '/' and write it to a new file if the word has a particular length. I've downloaded a dictionary, but the format does not meet my expectations. Since there are 84000 words, I don't really want to manually remove what comes after the '/' for each word. I thought it would be easy, and I followed a couple of ideas from similar questions on this site, but I seem to be missing something because it still doesn't work: I can't get the length right. The file Test_Input contains one word per line. Here's the code:
#!/usr/bin/bash
filename="Test_Input.txt"
while read -r line
do
sub= echo $line | cut -d '/' -f1
length= echo ${#sub}
if $length >= 4 && $length <= 10;
then echo $sub >> Test_Output.txt
fi
done < "$filename"
Several items:
I assume that you have been using backticks in the assignments, and not literally sub= echo $line | cut -d '/' -f1, as this would certainly have failed. Alternatively, you can use $(), as in sub=$(echo $line | cut -d '/' -f1).
The conditions in an if clause need to be enclosed in single or double square brackets, like this: if [[ $length -ge 4 ]] && [[ $length -le 10 ]];
Which brings me to the next point: inside [[ ]], < and > compare strings rather than numbers, and <= and >= are not accepted at all. Use -ge for "greater or equal" and -le for "less or equal".
If your line does not contain any / characters, in your version sub will contain the whole line. This might not be what you want, so I'd advise also adding the -s flag to cut.
You don't need somevar=$(echo $someothervar). Just use somevar=$someothervar.
Here's a version that works:
#!/usr/bin/env bash
filename="Test_Input.txt"
while read -r line
do
sub=$(echo "$line" | cut -s -d '/' -f 1)
length=${#sub}
if [[ $length -ge 4 ]] && [[ $length -le 10 ]];
then echo "$sub" >> Test_Output.txt
fi
done < "$filename"
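The external cut call can also be replaced with Bash parameter expansion. The following is only a sketch under the same assumptions as the version above (one word per line, lines without a / are skipped); the function name filter_words is invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read lines on stdin and print the part before the
# first '/' when that part is 4 to 10 characters long.
filter_words() {
    local line sub
    while IFS= read -r line; do
        [[ $line == */* ]] || continue     # like cut -s: skip lines without a delimiter
        sub=${line%%/*}                    # remove the first '/' and everything after it
        if (( ${#sub} >= 4 && ${#sub} <= 10 )); then
            printf '%s\n' "$sub"
        fi
    done
}

# Usage: filter_words < Test_Input.txt > Test_Output.txt
```

Because no subprocess is started per line, this tends to be noticeably faster on a file with tens of thousands of words.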
Of course, you could also just use sed:
sed -n -r '/^[^/]{4,10}\// s;/.*$;;p' Test_Input.txt > Test_Output.txt
Explanation:
-n Don't print anything unless explicitly marked for printing.
-r Use extended regular expressions.
/<searchterm>/ <operation> Search for lines that match a certain criterion, and perform this operation on them:
Searchterm is: ^[^/]{4,10}\/ From the beginning of the line, there should be between 4 and 10 non-slash characters, followed by the slash
Operation is: s;/.*$;;p replace everything between the first slash and the end of the line with nothing, then print.
awk is the best tool for this:
awk -F/ 'length($1) >= 4 && length($1) <= 10 {print $1}' Test_Input.txt > newfile

Find number of files with prefixes in bash

I've been trying to count all files with a specific prefix and then if the number of files with the prefix does not match the number 5 I want to print the prefix.
To achieve this, I wrote the following bash script:
#!/bin/bash
for filename in $(ls); do
name=$(echo $filename | cut -f 1 -d '.')
num=$(ls $name* | wc -l)
if [$num != 5]; then
echo $name
fi
done
But I get this error (repeatedly):
./check_uneven_number.sh: line 5: [1: command not found
Thank you!
The if statement takes a command, runs it, and checks its exit status. Left bracket ([) by itself is a command, but you wrote [$num. The shell expands $num to 1, creating the word [1, which is not a command.
if [ $num != 5 ]; then
Your code loops over file names, not prefixes; so if there are three file names with a particular prefix, you will get three warnings, instead of one.
Try this instead:
# Avoid pesky ls
printf '%s\n' * |
# Trim to just prefixes
cut -d . -f 1 |
# Reduce to unique
sort -u |
while IFS='' read -r prefix; do
# Pay attention to quoting; %.0s prints nothing for each argument, so this emits one dot per matching file
num=$(printf '%.0s.' "$prefix"* | wc -c)
# Pay attention to spaces
if [ "$num" -ne 5 ]; then
printf '%s\n' "$prefix"
fi
done
Personally, I'd prefer case over the clunky if here, but it takes some getting used to.
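An alternative sketch counts each prefix once with a Bash associative array (Bash 4 or newer). The function name report_uneven and its two arguments are assumptions for illustration, not part of the answer above. Note that it counts files whose text before the first dot equals the prefix exactly, whereas the "$prefix"* glob above also counts longer names that merely start with the prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every prefix (text before the first '.') in
# directory $1 whose file count differs from $2 (default 5).
report_uneven() {
    local dir=$1 expected=${2:-5} path name prefix
    local -A count=()                      # associative array, needs Bash 4+
    for path in "$dir"/*; do
        [ -e "$path" ] || continue         # an unmatched glob stays literal; skip it
        name=${path##*/}                   # drop the directory part
        prefix=${name%%.*}                 # keep the text before the first dot
        count[$prefix]=$(( ${count[$prefix]:-0} + 1 ))
    done
    for prefix in "${!count[@]}"; do
        if [ "${count[$prefix]}" -ne "$expected" ]; then
            printf '%s\n' "$prefix"
        fi
    done | sort
}

# Usage: report_uneven . 5
```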

How to check if word is in alphabetical order

I'd like to find a bash-only way (no sed, awk, perl, ...) to find out whether a word is in alphabetical order, in other words whether every letter comes after the one before it.
example:
bdjkz is true,
ahjmno is true,
sdgla is false.
I'm already struggling just comparing ascii values for characters, so if anyone could point me in the right direction for that it would help a lot!
Thanks
Pure bash solution (no external tool used), using Parameter Expansion to address characters inside strings:
function compare () {
word=$1
for (( pos=0; pos<${#word}-1; pos++ )) ; do
[[ ${word:pos:1} < ${word:pos+1:1} ]] || return 1
done
return 0
}
Tested with
for word in bdjkz ahjmno sdgla ; do
if compare $word ; then
echo $word ordered
else
echo $word not ordered
fi
done
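Two caveats about the function above: inside [[ ]], < orders strings according to the current locale, and the strict comparison rejects words with repeated letters such as "abbey". A variant sketch, assuming that non-decreasing order should count as alphabetical (the name is_sorted is made up here):

```shell
#!/usr/bin/env bash
# Hypothetical variant: succeed when the letters of $1 never decrease.
is_sorted() {
    local word=$1 pos
    for (( pos = 0; pos < ${#word} - 1; pos++ )); do
        # fail as soon as a letter sorts after its successor
        [[ ${word:pos:1} > ${word:pos+1:1} ]] && return 1
    done
    return 0
}

for w in bdjkz abbey sdgla; do
    if is_sorted "$w"; then echo "$w ordered"; else echo "$w not ordered"; fi
done
```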
If you can utilize other command line tools (but not awk, sed, perl), you can try:
[[ "YOURSTRING" = "$(echo "YOURSTRING" | grep -o '.' | sort -n |tr -d '\n')" ]] && \
echo "Alphabetic order"
1. [[ ... ]] tests the expression
2. "YOURSTRING" = string comparison
3. "$( ... )" captures the output of the inner pipeline in a string
4. echo "YOURSTRING" | grep -o '.' prints every character of "YOURSTRING" on its own line (-o '.': print only the matches for any single character - NOTE: you might need a recent version of grep for this option)
5. ... sort -n | sorts the output from 4.
6. ... tr -d '\n' rejoins the characters from 5. (by deleting the trailing newline characters)
You can use:
p='bdjkz'
q=$(fold -w1 <<< "$p"|sort|tr -d "\n")
[[ "$p" == "$q" ]] && echo "in alphabetical order" || echo "not in alphabetical order"
s=($(echo "existingString" | grep -o .)) # put each character of input string in an array.
k=($(printf '%s\n' "${s[@]}" | sort)) # sorts the input string
if [[ "${s[*]}" == "${k[*]}" ]]; then # comparing the input string array with sorted array
echo "alphabetical"
else
echo "not alphabetical"
fi

how do i verify presence of special characters in a bash password generator

Supposed to be a simple bash script, but turned into a monster. This is the 5th try. You don't even want to see the 30 line monstrosity that was attempt #4.. :)
Here's what I want to do: Script generates a random password, with $1=password length, and $2=amount of special characters present in the output.
Or at least, verify before sending to standard out, that at least 1 special character exists. I would prefer the former, but settle for the latter.
Here's my very simple 5th version of this script. It has no verification, or $2:
#!/bin/bash
cat /dev/urandom | tr -dc [=!=][=@=][=#=][=$=][=%=][=^=][:alnum:] | head -c $1
This works just fine and produces a sufficiently secure password. Usage:
$ passgen 12
2ZuQacN9M#6!
But it, of course, doesn't always print special characters, and it's become an obsession for me now to be able to allow selection of how many special characters are present in the output. It's not as easy as I thought.
Make sense?
By the way, I don't mind a complete rework of the code, I'd be very interested to see some creative solutions!
(By the way: I've tried to pipe it into egrep/grep in various ways, to no avail, but I have a feeling that is a possible solution...)
Thanks
Kevin
How about this:
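HASRANDOM=0
# Since Bash 3.2 a quoted right-hand side of =~ is matched literally, so keep the regex in a variable and leave it unquoted.
# ([...].*){N,} succeeds when at least N of the generated special characters occur anywhere in the string.
re='([!@#$%^].*){'"$2"',}'
while [ $HASRANDOM -eq 0 ]; do
PASS=`cat /dev/urandom | tr -dc '[=!=][=@=][=#=][=$=][=%=][=^=][:alnum:]' | head -c $1`
if [[ "$PASS" =~ $re ]]; then
HASRANDOM=1
fi
done
echo "$PASS"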
This supports specifying the minimum number of special characters in the output via $2. You could add characters to the regex, though I couldn't seem to get square brackets to work even when escaping them.
You probably would want to add some kind of check to make sure it doesn't loop infinitely (though it never went that far for me but I didn't ask for too many special characters either)
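One way to add such a safeguard is to cap the number of attempts. This is a sketch only: the function name gen_pass, the default of 1000 tries, and the return codes are assumptions, and the special characters are counted by stripping alphanumerics rather than with a regex.

```shell
#!/usr/bin/env bash
# Hypothetical helper: gen_pass LENGTH MIN_SPECIAL [MAX_TRIES]
gen_pass() {
    local len=$1 min=$2 tries=${3:-1000} pass stripped
    if (( min > len )); then
        echo "cannot fit $min special characters into $len" >&2
        return 2
    fi
    while (( tries-- > 0 )); do
        # keep only the six special characters plus letters and digits
        pass=$(LC_ALL=C tr -dc '!@#$%^[:alnum:]' < /dev/urandom | head -c "$len")
        stripped=${pass//[^[:alnum:]]/}    # what is left after removing the specials
        if (( ${#pass} - ${#stripped} >= min )); then
            printf '%s\n' "$pass"
            return 0
        fi
    done
    echo "no suitable password found" >&2
    return 1
}

# Usage: gen_pass 12 2
```

The early check rejects requests that can never succeed, and the counter bounds the loop for requests that are merely unlikely.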
Checking for special characters is easy:
echo "$pass" | grep -q '[^a-zA-Z0-9]'
Like this:
while [ 1 ]; do
pass=`cat /dev/urandom | tr -dc [=!=][=@=][=#=][=$=][=%=][=^=][:alnum:] | head -c $1`
if echo "$pass" | grep -q '[^a-zA-Z0-9]'; then
break;
fi
done
And finally:
normal=$(($1 - $2))
(
for ((i=1; i <= $normal; i++)); do
cat /dev/urandom | tr -dc [:alnum:] | head -c 1
echo
done
for ((i=1; i <= $2; i++)); do
cat /dev/urandom | tr -dc [=!=][=@=][=#=][=$=][=%=][=^=] | head -c 1
echo
done
) | shuf | sed -e :a -e '$!N;s/\n//;ta'
Keep it simple... A solution in awk that returns the number of "special characters" in the input:
BEGIN {
FS=""
split("!@#$%^",special,"")
}
{
split($0,array,"")
}
END {
for (i in array) {
for (s in special) {
if (special[s] == array[i])
tot=tot+1
}
}
print tot
}
Example output for a2ZuQacN9M#6! is
2
Similar approach in bash:
#!/bin/bash
MyString='a2ZuQacN9M#6!'
special='!@#$%^'
i=0
while (( i++ < ${#MyString} ))
do
char=$(expr substr "$MyString" $i 1)
n=0
while (( n++ < ${#special} ))
do
s=$(expr substr "$special" $n 1)
if [[ $s == $char ]]
then
echo $s
fi
done
done
You may also use a character class in parameter expansion to delete all special chars in a string and then apply some simple Bash string length math to check if there was a minimum (or exact) number of special chars in the password.
# example: delete all punctuation characters in string
str='a!#%3"'
echo "${str//[[:punct:]]/}"
# ... taking Cfreak's approach we could write ...
(
set -- 12 3
strlen1=$1
strlen2=0
nchars=$2
special_chars='[=!=][=@=][=#=][=$=][=%=][=^=]'
HASRANDOM=0
while [ $HASRANDOM -eq 0 ]; do
PASS=`cat /dev/urandom | LC_ALL=C tr -dc "${special_chars}[:alnum:]" | head -c $1`
PASS2="${PASS//[${special_chars}]/}"
strlen2=${#PASS2}
#if [[ $((strlen1 - strlen2)) -eq $nchars ]]; then # set exact number of special chars
if [[ $((strlen1 - strlen2)) -ge $nchars ]]; then # set minimum number of special chars
echo "$PASS"
HASRANDOM=1
fi
done
)
You can count the number of special chars using something like:
number of characters - number of non special characters
Try this:
$ # define a string
$ string='abc!d$'
$ # extract the non-special characters into letters
$ letters=$(echo "$string" | tr -dc '[:alnum:]')
$ # subtract the number of non-special characters from the total
$ echo $(( ${#string} - ${#letters} ))
2
The last part, $(( ... )), evaluates an arithmetic expression.
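The same subtraction can be done without tr by using parameter expansion. A small sketch, where the function name count_special is hypothetical and "special" is assumed to mean anything that is not a letter or digit:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many characters of $1 are not letters or digits.
count_special() {
    local string=$1
    local letters=${string//[^[:alnum:]]/}   # keep only alphanumeric characters
    echo $(( ${#string} - ${#letters} ))
}

count_special 'abc!d$'   # prints 2
```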

bash script to extract ALL matches of a regex pattern

I found this but it assumes the words are space separated.
result="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"
for word in $result
do
if echo $word | grep -qi '(ADDNAME\d\d.*HELLO)'
then
match="$match $word"
fi
done
POST EDITED
Re-naming for clarity:
data="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"
for word in $data
do
if echo $word | grep -qi '(ADDNAME\d\d.*HELLO)'
then
match="$match $word"
fi
done
echo $match
Original left so comments asking about result continue to make sense.
Use grep -o
-o, --only-matching show only the part of a line matching PATTERN
Edit: answer to edited question:
for string in $(echo "$result" | grep -Po "ADDNAME[0-9]{2}.*?HELLO"); do
match="${match:+$match }$string"
done
Original answer:
If you're using Bash version 3.2 or higher, you can use its regex matching.
string="string to search 99 with 88 some 42 numbers"
pattern="[0-9]{2}"
for word in $string; do
[[ $word =~ $pattern ]]
if [[ ${BASH_REMATCH[0]} ]]; then
match="${match:+$match }${BASH_REMATCH[0]}"
fi
done
The result will be "99 88 42".
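The same BASH_REMATCH mechanism can also pull every match out of a string that has no separators, by repeatedly matching and then cutting off the text up to the match. This is a sketch, not part of the original answer: the function name extract_all is made up, and because POSIX extended regexes have no non-greedy quantifier it assumes the text between ADDNAME and HELLO consists of lowercase letters only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every match of regex $2 found in string $1, one per line.
extract_all() {
    local rest=$1 re=$2 m
    while [[ $rest =~ $re ]]; do
        m=${BASH_REMATCH[0]}
        [[ -n $m ]] || break               # an empty match would never shrink $rest
        printf '%s\n' "$m"
        rest=${rest#*"$m"}                 # drop everything up to and including this match
    done
}

data="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"
extract_all "$data" 'ADDNAME[0-9]{2}[[:lower:]]*HELLO'
```

For the sample data this prints ADDNAME25abcdefgHELLO twice, once per occurrence.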
Not very elegant - and there are problems because of greedy matching - but this more or less works:
data="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"
for word in $data \
"ADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg" \
"ADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLO"
do
echo $word
done |
sed -e '/ADDNAME[0-9][0-9][a-z]*HELLO/{
s/\(ADDNAME[0-9][0-9][a-z]*HELLO\)/ \1 /g
}' |
while read line
do
set -- $line
for arg in "$@"
do echo $arg
done
done |
grep "ADDNAME[0-9][0-9][a-z]*HELLO"
The first loop echoes three lines of data - you'd probably replace that with cat or I/O redirection. The sed script uses a modified regex to put spaces around the patterns. The last loop breaks up the 'space separated words' into one 'word' per line. The final grep selects the lines you want.
The regex is modified with [a-z]* in place of the original .* because the pattern matching is greedy. If the data between ADDNAME and HELLO is unconstrained, then you need to think about using non-greedy regexes, which are available in Perl and probably Python and other modern scripting languages:
#!/bin/perl -w
while (<>)
{
while (/(ADDNAME\d\d.*?HELLO)/g)
{
print "$1\n";
}
}
This is a good demonstration of using the right tool for the job.
