I'm wondering if there's a way to replace the if statement with something that checks whether $2 has the 7th bit set to 1?
cat $file | awk '{if ($2 == 87) print $1; else {}}' > out.txt
For instance, 93 should print something whereas 128 should not.
bash has bitwise operators
Test 7th bit:
$ echo $(((93 & 0x40) != 0))
1
$ echo $(((128 & 0x40) != 0))
0
See also the bash documentation
Though if you're parsing the values out of a file, you're probably better off continuing to use awk, as in @RakholiyaJenish's answer.
You can use a bitwise operation to check whether the 7th bit is 1 in gawk:
and($2,0x40)
Note: Standard awk does not have bitwise operations. For that, you can use bash's or perl's bitwise operators (perl being the better fit for string processing).
Using gawk:
gawk '(and($2,0x40)){print $1}' filename
Using perl:
perl -ane 'print "$F[0]\n" if ($F[1]&0x40)' filename
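If gawk is not available, the same check can be approximated with plain arithmetic, since testing the bit with value 64 amounts to checking whether int(n / 64) is odd. A minimal sketch, assuming non-negative integer values in the second field; print_if_bit7 is a hypothetical helper name, not something from the answers above:

```shell
# Hypothetical helper: print field 1 of every line whose field 2 has the
# bit with value 64 (the "7th bit" of the question) set.
# Uses only arithmetic available in standard awk, no and() required.
print_if_bit7() {
    awk 'int($2 / 64) % 2 == 1 { print $1 }' "$@"
}

# 93 and 64 have the bit set, 128 does not.
printf 'a 93\nb 128\nc 64\n' | print_if_bit7
```

With no file arguments the helper reads standard input; otherwise it passes the file names through to awk.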
You can use a bash wrapper function to check whether the bit is set. It sets IFS locally, and you can define it either in a script or in .bashrc
# Returns the bit positions set in the number for the given bit mask
# Use the return value of the function directly in a script
bitWiseAnd() {
local IFS='&'
printf "%s\n" "$(( $* ))"
}
and use it in a function as below in your script
# The arguments to this function should be used as number and bit-mask, i.e.
# bitWiseAnd <number> <bit-mask>
if [ $(bitWiseAnd "93" "0x40") -ne 0 ]
then
# other script actions go here
echo "Bit set"
fi
The idea is to use the Internal Field Separator (IFS), a special variable in bash used for word splitting after expansion and for splitting lines into words. The function changes its value locally so that the bitwise AND operator & takes the place of the usual separator character.
Remember that IFS is changed locally and does NOT affect the default IFS behaviour outside the function scope. An excerpt from the man bash page:
The shell treats each character of IFS as a delimiter, and splits the results of the other expansions into words on these characters. If IFS is unset, or its value is exactly <space><tab><newline>, the default, then sequences of <space>, <tab>, and <newline> at the beginning and end of the results of the previous expansions are ignored, and any sequence of IFS characters not at the beginning or end serves to delimit words.
Inside "$(( $* ))", $* expands to the list of arguments joined by the first character of IFS, here &, so the arithmetic expansion evaluates number & bit-mask; printf then outputs the calculated value. The function can be extended to cover other arithmetic operations as well.
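To see the joining step in isolation, the following sketch prints what "$*" looks like before any arithmetic is applied. join_args is a hypothetical name; the function body runs in a subshell so that the IFS change cannot leak, which is an alternative to the local declaration used above:

```shell
# Hypothetical helper: join all remaining arguments with the character
# given as the first argument. The ( ... ) body is a subshell, so the
# changed IFS never reaches the caller.
join_args() (
    IFS=$1
    shift
    printf '%s\n' "$*"
)

join_args '&' 93 0x40     # the expression that bitWiseAnd would evaluate
join_args '+' 1 2 3       # the same trick with another operator
echo "$(( 93 & 0x40 ))"   # evaluating the first expression gives 64
```

The first two calls print 93&0x40 and 1+2+3, which is exactly the text that ends up inside $(( ... )) in the wrapper function.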
This function returns an exit status of zero if the number given as its first argument has the bit given as its second argument set:
hasbitset () {
local num=$1
local bit=$2
if (( num & 2**(bit-1) )); then
return 0
else
return 1
fi
}
or short and less readable:
hasbitset () { (( $1 & 2**($2-1) )) && return 0 || return 1; }
For the examples from the question:
$ hasbitset 93 7 && echo "Yes" || echo "No"
Yes
$ hasbitset 128 7 && echo "Yes" || echo "No"
No
Notice that it's often customary to count bits in offsets instead of positions, i.e., starting from bit 0 – unlike the usage in this question.
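For comparison, a zero-based variant could shift the number right by the offset and test the lowest bit. This is only a sketch; has_bit_offset is a hypothetical name, and both arguments are assumed to be non-negative integers:

```shell
# Hypothetical zero-based counterpart of hasbitset:
# succeeds if the bit at offset $2 (counting from 0) is set in $1.
has_bit_offset() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

# Position 7 in the question's numbering is offset 6 here.
has_bit_offset 93 6  && echo "Yes" || echo "No"
has_bit_offset 128 6 && echo "Yes" || echo "No"
```

The two calls print Yes and No, matching the results of hasbitset 93 7 and hasbitset 128 7.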
I have created a hex to ASCII converter for strings in bash. The application I'm working on changes characters (anything but [0-9], [A-Z], [a-z]) in a string to their corresponding %hexadecimal codes. E.g., / changes to %2F in a string.
I want to retain the ASCII characters as they are. Below is my code:
NAME=%2fhome%40%21%23
C_NAME=""
for (( i=0; i<${#NAME}; i++ )); do
CHK=$(echo "${NAME:$i:1}" | grep -v "\%" &> /dev/null;echo $?)
if [[ ${CHK} -eq 0 ]]; then
C_NAME=`echo "$C_NAME${NAME:$i:1}"`
else
HEX=`echo "${NAME:$i:3}" | sed "s/%//"`
C_NAME=`echo -n "$C_NAME";printf "\x$HEX"`
continue 2
fi
done
echo "$C_NAME"
OUTPUT:
/2fhome@40!21#23
EXPECTED:
/home@!#
So basically the conversion is happening, but not in place. It's retaining the hex values as well, which tells me the continue 2 statement is probably not working as I expect in my code. Any workarounds, please?
You only have one loop, so I assume you expected continue 2 to skip the current and the next iteration of the current loop. However, the documentation (help continue) clearly states:
continue [n]
[...]
If N is specified, resumes the Nth enclosing loop.
There is no built-in to skip the current and also the next iteration of the current loop, but in your case you can use (( i += 2 )) instead of continue 2.
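A small illustration of what continue 2 actually does when there are two nested loops; nested_demo is a hypothetical function used only for this demonstration:

```shell
# continue 2 resumes the NEXT iteration of the second enclosing loop
# (here: the outer one), abandoning the rest of the inner loop.
nested_demo() {
    for outer in 1 2; do
        for inner in a b c; do
            [ "$inner" = b ] && continue 2
            echo "$outer$inner"
        done
    done
}

nested_demo
```

This prints 1a and 2a only: as soon as the inner loop reaches b, control jumps to the next value of outer, so c is never reached.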
Using the structure of your script with some simplifications and corrections:
#!/bin/bash
name=%2fhome%40%21%23
c_name=""
for (( i=0; i<${#name}; i++ )); do
c=${name:i:1}
if [[ $c != % ]]; then
c_name=$c_name$c
else
hex=${name:i+1:2}
printf -v c_name "%s\x$hex" "$c_name"
(( i += 2 )) # stolen from Dudi Boy's answer
fi
done
echo "$c_name"
Always use lower case or mixed case variables to avoid the chance of name collisions with shell or environment variables
Always use $() instead of backticks
Most of the echo commands you use aren't necessary
You can avoid using sed and grep
Variables should never be included in the format string of printf but it can't be avoided easily here (you could use echo -e "\x$hex" instead though)
You can do math inside parameter expansions
% doesn't need to be escaped in your grep command
You could eliminate the $hex variable if you used its value directly:
printf -v c_name "%s\x${name:i+1:2}" "$c_name"
I really enjoyed your exercise and decided to solve it with awk (my current study).
Hope you like it as well.
cat script.awk
BEGIN {RS = "%[[:xdigit:]]+"} { # redefine the record separator as a regex (gawk-specific)
decNum = strtonum("0x"substr(RT, 2)); # remove the % prefix from the matched separator, convert the hex number to decimal
outputStr = outputStr""$0""sprintf("%c", decNum); # reconstruct output string
}
END {print outputStr}
The output
echo %2fhome%40%21%23 |awk -f script.awk
/home@!#
I have a string in a common pattern that I want to manipulate. I want to be able to turn string 5B299 into 5B300 (increment the last number by one).
I want to avoid blindly splicing the string by index, as the first number and letter can change in size. Essentially I want to be able to get the entire value of everything after the first character, increment it by one, and re-append it.
The only things I've found online so far show me how to cut by a delimiter, but I don't have a constant delimiter.
You could use the regex features supported by the bash shell with its =~ operator, which performs POSIX Extended Regular Expression (ERE) matching. All you need to do is define a regex and work on the captured groups to get the resulting string
str=5B299
re='^(.*[A-Z])([0-9]+)$'
Now use the =~ operator to do the regex match. The =~ operator populates the array BASH_REMATCH with the captured groups if the regex match was successful. Index 0 holds the entire match; the first part (5B in the example) is stored at index 1 and the numeric part at index 2. We increment the value at index 2 with the $((..)) operator.
if [[ $str =~ $re ]]; then
result="${BASH_REMATCH[1]}$(( BASH_REMATCH[2] + 1 ))"
printf '%s\n' "$result"
fi
The POSIX version of the regex, free of the locale dependency, would use character classes instead of range expressions:
posix_re='^(.*[[:alpha:]])([[:digit:]]+)$'
You can do what you are attempting fairly easily with the bash parameter-expansion for string indexes along with the POSIX arithmetic operator. For instance you could do:
#!/bin/bash
[ -z "$1" ] && { ## validate at least 1 argument provided
printf "error: please provide a number.\n" >&2
exit 1
}
[[ $1 =~ [^0-9][^0-9]* ]] && { ## validate all digits in argument
printf "error: input contains non-digit characters.\n" >&2
exit 1
}
suffix=${1:1} ## take all characters past the 1st as suffix
suffix=$((suffix + 1)) ## increment suffix by 1
result=${1:0:1}$suffix ## append suffix to the original 1st character
echo "$result" ## output
exit 0
Which will leave the 1st character alone while incrementing the remaining characters by 1 and then joining again with the original 1st digit, while validating that the input consisted only of digits, e.g.
Example Use/Output
$ bash prefixsuffix.sh
error: please provide a number.
$ bash prefixsuffix.sh 38a900
error: input contains non-digit characters.
$ bash prefixsuffix.sh 38900
38901
$ bash prefixsuffix.sh 39999
310000
Look things over and let me know if that is what you intended.
You can use sed in conjunction with awk:
increment() {
echo $1 | sed -r 's/([0-9]+[a-zA-Z]+)([0-9]+)/\1 \2/' | awk '{printf "%s%d", $1, ++$2}'
}
echo $(increment "5B299")
echo $(increment "127ABC385")
echo $(increment "7cf999")
Output:
5B300
127ABC386
7cf1000
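Another option, not taken from the answers above, is to split the string with plain parameter expansion, which needs neither a regex nor external tools. A minimal sketch, assuming the input ends in at least one digit and that the digits have no leading zero (a leading zero would be read as octal by the arithmetic expansion); increment_suffix is a hypothetical name:

```shell
# Hypothetical helper: increment the trailing run of digits of $1.
# The ( ... ) body is a subshell, so digits and prefix stay private.
increment_suffix() (
    digits=${1##*[!0-9]}      # strip the longest prefix ending in a non-digit
    prefix=${1%"$digits"}     # what remains in front of the trailing digits
    printf '%s%s\n' "$prefix" "$(( digits + 1 ))"
)

increment_suffix 5B299
increment_suffix 127ABC385
increment_suffix 7cf999
```

For these three inputs the sketch prints 5B300, 127ABC386 and 7cf1000, the same results as the sed/awk version.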
A POSIX compliant shell shall provide mechanisms like this to iterate over collections of strings:
for x in $(seq 1 5); do
echo $x
done
But, how do I iterate over each character of a word?
It's a little circuitous, but I think this'll work in any posix-compliant shell. I've tried it in dash, but I don't have busybox handy to test with.
var='ab * cd'
tmp="$var" # The loop will consume the variable, so make a temp copy first
while [ -n "$tmp" ]; do
rest="${tmp#?}" # All but the first character of the string
first="${tmp%"$rest"}" # Remove $rest, and you're left with the first character
echo "$first"
tmp="$rest"
done
Output:
a
b
*
c
d
Note that the double-quotes around the right-hand side of assignments are not needed; I just prefer to use double-quotes around all expansions rather than trying to keep track of where it's safe to leave them off. On the other hand, the double-quotes in [ -n "$tmp" ] are absolutely necessary, and the inner double-quotes in first="${tmp%"$rest"}" are needed if the string contains "*".
Use getopts to process input one character at a time. The : instructs getopts to ignore illegal options and set OPTARG. The leading - in the input makes getopts treat the string as options.
If getopts encounters a colon, it will not set OPTARG, so the script uses parameter expansion to return : when OPTARG is not set/null.
#!/bin/sh
IFS='
'
split_string () {
OPTIND=1;
while getopts ":" opt "-$1"
do echo "'${OPTARG:-:}'"
done
}
while read -r line;do
split_string "$line"
done
As with the accepted answer, this processes strings byte-wise instead of character-wise, corrupting multibyte codepoints. The trick is to detect multibyte codepoints, concatenate their bytes and then print them:
#!/bin/sh
IFS='
'
split_string () {
OPTIND=1;
while getopts ":" opt "$1";do
case "${OPTARG:=:}" in
([[:print:]])
[ -n "$multi" ] && echo "$multi" && multi=
echo "$OPTARG" && continue
esac
multi="$multi$OPTARG"
case "$multi" in
([[:print:]]) echo "$multi" && multi=
esac
done
[ -n "$multi" ] && echo "$multi"
}
while read -r line;do
split_string "-$line"
done
Here the extra case "$multi" is used to detect when the multi buffer contains a printable character. This works on shells like Bash and Zsh but Dash and busybox ash do not pattern match multibyte codepoints, ignoring locale.
This degrades somewhat nicely: Dash/ash treat sequences of multibyte codepoints as one character, but handle multibyte characters surrounded by single byte characters fine.
Depending on your requirements it may be preferable not to split consecutive multibyte codepoints anyway, as the next codepoint may be a combining character which modifies the character before it.
This won't handle the case where a single byte character is followed by a combining character.
This works in dash and busybox:
echo 'ab * cd' | grep -o .
Output:
a
b
*
c
d
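grep's -o option is an extension rather than part of POSIX, so where it is missing, fold (which POSIX does specify) may serve as a substitute. A sketch under the assumption of single-byte characters, since fold implementations differ in how they treat multibyte input; split_chars is a hypothetical name:

```shell
# Hypothetical helper: print each character of $1 on its own line by
# folding the text to a width of one column.
split_chars() {
    printf '%s\n' "$1" | fold -w 1
}

split_chars 'a*c'
```

For the argument a*c this prints a, * and c on separate lines.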
I was developing a script which demanded stacks... So, we can use it to iterate through strings
#!/bin/sh
# posix script
pop () {
# $1 top
# $2 stack
eval $1='$(expr "'\$$2'" : "\(.\).*")'
eval $2='$(expr "'\$$2'" : ".\(.*\)" )'
}
string="ABCDEFG"
while [ "$string" != "" ]
do
pop c string
echo "--" $c
done
I have two strings which I want to compare for equal chars, the strings must contain the exact chars but mychars can have extra chars.
mychars="abcdefg"
testone="abcdefgh" # false h is not in mychars
testtwo="abcddabc" # true all char in testtwo are in mychars
function test() {
if each char in $1 is in $2 # PSEUDO CODE
then
return 1
else
return 0
fi
}
if test $testone $mychars; then
echo "All in the string" ;
else ; echo "Not all in the string" ; fi
# should echo "Not all in the string" because the h is not in the string mychars
if test $testtwo $mychars; then
echo "All in the string" ;
else ; echo "Not all in the string" ; fi
# should echo 'All in the string'
What is the best way to do this? My guess is to loop over all the chars in the first parameter.
You can use tr to replace any char from mychars with a symbol, then you can test if the resulting string is any different from the symbol, e.g.:
tr -s "[$mychars]" "." <<< "ggaaabbbcdefg"
Outputs:
.
But:
tr -s "[$mychars]" "." <<< "xxxggaaabbbcdefgxxx"
Prints:
xxx.xxx
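So, your function could look like the following (note that it takes the set of allowed characters as its first argument and the string to check as its second):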
function test() {
local dictionary="$1"
local res=$(tr -s "[$dictionary]" "." <<< "$2")
if [ "$res" == "." ]; then
return 1
else
return 0
fi
}
Update: As suggested by @mklement0, the whole function could be shortened (and the logic fixed) by the following:
function test() {
local dictionary="$1"
[[ '.' == $(tr -s "[$dictionary]" "." <<< "$2") ]]
}
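A different technique, not the tr approach shown above, is to let a case pattern look for any character outside the allowed set. This is only a sketch: all_chars_in is a hypothetical name, it takes the string to check first (as in the question), and the allowed set is assumed to contain only plain characters (no ], -, !, ^ or backslash, which are special inside a bracket expression):

```shell
# Hypothetical helper: succeed if every character of $1 occurs in $2.
# [!...] matches any single character that is NOT in the set.
all_chars_in() {
    case $1 in
        *[!$2]*) return 1 ;;   # found a character outside the allowed set
    esac
    return 0
}

mychars="abcdefg"
all_chars_in "abcdefgh" "$mychars" && echo "All in the string" || echo "Not all in the string"
all_chars_in "abcddabc" "$mychars" && echo "All in the string" || echo "Not all in the string"
```

The first call reports "Not all in the string" because of the h; the second reports "All in the string". No external process is started, which may matter when the check runs in a loop.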
The accepted answer's solution is short, clever, and efficient.
Here's a less efficient alternative, which may be of interest if you want to know which characters are unique to the 1st string, returned as a sorted, distinct list:
charTest() {
local charsUniqueToStr1
# Determine which chars. in $1 aren't in $2.
# This returns a sorted, distinct list of chars., each on its own line.
charsUniqueToStr1=$(comm -23 \
<(sed 's/\(.\)/\1\'$'\n''/g' <<<"$1" | sort -u) \
<(sed 's/\(.\)/\1\'$'\n''/g' <<<"$2" | sort -u))
# The test succeeds if there are no chars. in $1 that aren't also in $2.
[[ -z $charsUniqueToStr1 ]]
}
mychars="abcdefg" # define reference string
charTest "abcdefgh" "$mychars"
echo $? # print exit code: 1 - 'h' is not in reference string
charTest "abcddabc" "$mychars"
echo $? # print exit code: 0 - all chars. are in reference string
Note that I've renamed test() to charTest() to avoid a name collision with the test builtin/utility.
sed 's/\(.\)/\1\'$'\n''/g' splits the input into individual characters by placing each on a separate line.
Note that the command creates an extra empty line at the end, but that doesn't matter in this case; to eliminate it, append ; ${s/\n$//;} to the sed script.
The command is written in a POSIX-compliant manner, which complicates it, due to having to splice in an \-escaped actual newline (via an ANSI C-quoted string, $'\n'); if you have GNU sed, you can simplify to sed -r 's/(.)/\1\n/g'
sort -u then sorts the resulting list of characters and weeds out duplicates (-u).
comm -23 compares the distinct set of sorted characters in both strings and prints those unique to the 1st string (comm uses a 3-column layout, with the 1st column containing lines unique to the 1st file, the 2nd column containing lines unique to the 2nd file, and the 3rd column printing lines the two input files have in common; -23 suppresses the 2nd and 3rd columns, effectively only printing the lines that are unique to the 1st input).
[[ -z $charsUniqueToStr1 ]] then tests if $charsUniqueToStr1 is empty (-z);
in other words: success (exit code 0) is indicated, if the 1st string contains no chars. that aren't also contained in the 2nd string; otherwise, failure (exit code 1); by virtue of the conditional ([[ .. ]]) being the last statement in the function, its exit code also becomes the function's exit code.
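The effect of comm -23 is easiest to see on two tiny, already sorted inputs. The sketch below uses temporary files instead of process substitution so that it also runs in shells without <( ... ); comm_demo is a hypothetical name and mktemp is assumed to be available:

```shell
# Hypothetical demonstration: which lines occur only in the first file?
comm_demo() {
    dir=$(mktemp -d) || return 1
    printf '%s\n' a b h > "$dir/first"     # sorted, one char per line
    printf '%s\n' a b c > "$dir/second"
    comm -23 "$dir/first" "$dir/second"    # columns 2 and 3 suppressed
    rm -rf "$dir"
}

comm_demo
```

Only h is printed: a and b are common to both inputs, and c is unique to the second one, so both are suppressed.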
OK, so I've been at this for a couple of days. I'm new to this whole bash/UNIX thing, I just got into it, but I am trying to write a script where the user inputs an integer and the script prints a triangle, using that integer as the base and decreasing until it reaches zero. An example would be:
reverse_triangle.bash 4
****
***
**
*
So this is what I have so far, but when I run it nothing happens, and I have no idea what is wrong:
#!/bin/bash
input=$1
count=1
for (( i=$input; i>=$count;i-- ))
do
for (( j=1; j>=i; j++ ))
do
echo -n "*"
done
echo
done
exit 0
When I try to run it nothing happens; it just goes to the next line. Help would be greatly appreciated :)
As I said in a comment, your test is wrong: you need
for (( j=1; j<=i; j++ ))
instead of
for (( j=1; j>=i; j++ ))
Otherwise, this loop is only executed when i=1, and it becomes an infinite loop.
Now if you want another way to solve that, in a much better way:
#!/bin/bash
[[ $1 = +([[:digit:]]) ]] || { printf >&2 'Argument must be a number\n'; exit 1; }
number=$((10#$1))
for ((;number>=1;--number)); do
printf -v spn '%*s' "$number"
printf '%s\n' "${spn// /*}"
done
Why is it better? First off, we check that the argument is really a number. Without this, your code is subject to arbitrary code injection. Also, we make sure that the number is understood in radix 10 with 10#$1. Otherwise, an argument like 09 would raise an error, because a leading 0 marks an octal number.
We don't really need an extra variable for the loop, the provided argument is good enough. Now the trick: to print n times a pattern, a cool method is to store n spaces in a variable with printf: %*s will expand to n spaces, where n is the corresponding argument found by printf.
For example:
printf '%s%*s%s\n' hello 42 world
would print:
hello                                     world
(with 42 spaces).
Editor's note: %*s will NOT generally expand to n spaces, as evidenced by above output, which contains 37 spaces.
Instead, the argument that * is mapped to, 42, is the field width for the s field, which maps to the following argument, world, causing the string world to be left-space-padded to a length of 42; since world has a character count of 5, 37 spaces are used for padding.
To make the example work as intended, use printf '%s%*s%s\n' hello 42 '' world - note the empty string argument following 42, which ensures that the entire field is made up of padding, i.e., spaces (you'd get the same effect if no arguments followed 42).
With printf's -v option, we can store any string formatted by printf into a variable; here we're storing $number spaces in spn. Finally, we replace all spaces by the character *, using the expansion ${spn// /*}.
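The padding idea can be packaged as a small helper. The sketch below swaps the bash-only pieces (printf -v and the ${spn// /*} expansion) for a pipe through tr, so it is a variation on the technique rather than the answer's exact code; repeat_char is a hypothetical name, and the count is assumed to be a non-negative integer:

```shell
# Hypothetical helper: print character $2 exactly $1 times.
# The empty string argument makes the whole field consist of padding,
# i.e. $1 spaces, which tr then turns into the requested character.
repeat_char() {
    printf '%*s' "$1" '' | tr ' ' "$2"
}

repeat_char 4 '*'; echo
repeat_char 2 '*'; echo
```

The two calls print **** and **. The bash version in the answer avoids the extra tr process, which is why it is the better fit inside a loop.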
Yet another possibility:
#!/bin/bash
[[ $1 = +([[:digit:]]) ]] || { printf >&2 'Argument must be a number\n'; exit 1; }
printf -v s '%*s' "$((10#$1))"
s=${s// /*}
while [[ $s ]]; do
printf '%s\n' "$s"
s=${s%?}
done
This time we construct the variable s that contains a bunch of * (number given by user), using the previous technique. Then we have a while loop that loops while s is non empty. At each iteration we print the content of s and we remove a character with the expansion ${s%?} that removes the last character of s.
Building on gniourf_gniourf's helpful answer:
The following is simpler and performs significantly better:
#!/bin/bash
count=$1 # (... number-validation code omitted for brevity)
# Create the 1st line, composed of $count '*' chars, and store in var. $line.
printf -v line '%.s*' $(seq $count)
# Count from $count down to 1.
for (( ; count >= 1; count-- )); do
# Print a *substring* of the 1st line based on the current value of $count.
printf "%.${count}s\n" "$line"
done
printf -v line '%.s*' $(seq $count) is a trick that prints * $count times: %.s* produces one * for each argument supplied, irrespective of the arguments' values (%.s effectively ignores its argument). $(seq $count) expands to $count arguments, resulting in a string composed of $count * chars. overall, which, thanks to -v line, is stored in variable $line.
printf "%.${count}s\n" "$line" prints a substring from the beginning of $line that is $count chars. long.
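Both printf features can be tried separately. In the sketch below, first_n and stars are hypothetical helper names; the only assumption is that the count passed to first_n is a non-negative integer:

```shell
# A precision on %s truncates the string: print the first $1 chars of $2.
first_n() {
    printf "%.${1}s\n" "$2"
}

# %.s consumes one argument and prints nothing, so the format is reused
# once per argument and contributes a single '*' each time.
stars() {
    printf '%.s*' "$@"
    echo
}

first_n 3 abcdef
stars 1 2 3 4
```

The first call prints abc and the second prints ****, one star per argument regardless of the argument values.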