How to turn a string into a modified hex representation? - bash

I want to turn a string like
AaAa
into
a string like this
%<41>%<61>%<41>%<61>
Simple enough with the programming languages I am familiar with, but with bash I can't get the piping right to do what I am trying to do:
split string into char array
turn each char into hex
wrap each hex value into %<FF>
concat string
this is my current way which gets me half way there:
echo -n "AaAa" | od -A n -t x1

If you are already using od,
printf "%%<%s>" $(od -A n -t x1<<<"AaAa")
For an all-bash without od,
while read -r -N 1 c; do printf "%%<%02X>" "$( printf "%d" \'$c )"; done <<< AaAa
The downside of this approach is that it spawns a subshell for every character, and assumes ASCII/UTF8.
edit
@Shawn pointed out that you don't need the subshell:
while read -r -N 1 c; do printf "%%<%02X>" \'$c; done <<< AaAa
I noticed that these are leaving the string terminator in your output, though, and realized I could eliminate that and the read by assigning the data to a variable and using the built-in parsing tools.
$: x=AaAa && for((i=0;i<${#x};i++)); do printf "%%<%02X>" \'${x:i:1}; done; echo
%<41>%<61>%<41>%<61>
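If you need this in more than one place, the loop above can be wrapped in a small function. This is only a sketch, assuming ASCII input; the name hex_wrap is made up for illustration and does not come from any of the answers.

```bash
#!/usr/bin/env bash
# hex_wrap (hypothetical name): print each character of $1 as %<HH>.
# Assumes single-byte (ASCII) characters.
hex_wrap() {
    local s=$1 out='' chunk i
    for ((i = 0; i < ${#s}; i++)); do
        # A leading single quote makes printf use the character's numeric code.
        printf -v chunk '%%<%02X>' "'${s:i:1}"
        out+=$chunk
    done
    printf '%s\n' "$out"
}

hex_wrap AaAa   # prints %<41>%<61>%<41>%<61>
```

Building the result with printf -v avoids a command substitution per character, and passing the string as an argument means no trailing newline gets encoded.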

A simple Perl substitution would do the trick:
echo -n AaAa | perl -pe's/(.)/ sprintf "%%<%02X>", ord($1) /seg'
Shorter:
echo -n AaAa | perl -ne'printf "%%<%02X>", $_ for unpack "C*"'
In both cases, the output is the expected
%<41>%<61>%<41>%<61>
(No trailing line feed added. If you want one, append ; END { print "\n" }.)

You can pipe to sed to wrap each byte in %<> and then remove the whitespace.
echo -n "AaAa" | od -A n -t x1 | sed -E -e 's/[a-z0-9]+/%<&>/g' -e 's/ //g'

You could use perl:
echo -n AaAa | perl -ne 'for $c (split//) { printf("%%<%02X>", ord($c)); }'
Output
%<41>%<61>%<41>%<61>

Maybe awk
echo -n "AaAa" |
od -A n -t x1 |
awk 'BEGIN { ORS = "" } { for (i = 1; i <= NF; i+=1) print "%<"$i">"}'

Related

Unix file pattern issue: append changing value of variable pattern to copies of matching line

I have a file with contents:
abc|r=1,f=2,c=2
abc|r=1,f=2,c=2;r=3,f=4,c=8
I want a result like below:
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
The third column value is r value. A new line would be inserted for each occurrence.
I have tried with:
for i in `cat $xxxx.txt`
do
#echo $i
live=$(echo $i | awk -F " " '{print $1}')
home=$(echo $i | awk -F " " '{print $2}')
echo $live
done
but it is not working properly. I am a beginner with sed/awk and not sure how to use them. Can someone please help with this?
awk to the rescue!
$ awk -F'[,;|]' '{c=0;
for(i=2;i<=NF;i++)
if(match($i,/^r=/)) a[c++]=substr($i,RSTART+2);
delim=substr($0,length($0))=="|"?"":"|";
for(i=0;i<c;i++) print $0 delim a[i]}' file
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
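For comparison, the same idea (print the line once per r= value found in it) can be sketched in bash with a regex loop. This assumes r values are plain digits and that no other key ends in "r" (a key such as br=7 would also match); the function name append_r_values is hypothetical.

```bash
#!/usr/bin/env bash
# append_r_values (hypothetical name): for every r=<digits> in a line,
# print the whole line followed by |<digits>.
append_r_values() {
    local line rest re='r=([0-9]+)'
    while IFS= read -r line; do
        rest=$line
        while [[ $rest =~ $re ]]; do
            printf '%s|%s\n' "$line" "${BASH_REMATCH[1]}"
            # Drop everything up to and including this match, then look again.
            rest=${rest#*"${BASH_REMATCH[0]}"}
        done
    done
}

printf '%s\n' 'abc|r=1,f=2,c=2' 'abc|r=1,f=2,c=2;r=3,f=4,c=8' | append_r_values
```

For large files the awk version will be considerably faster; the bash loop is mainly useful for seeing the steps one at a time.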
Use an inner routine (made up of GNU grep, sed, and tr) to compile a second more elaborate sed command, the output of which needs further cleanup with more sed. Call the input file "foo".
sed -n $(grep -no 'r=[0-9]*' foo | \
sed 's/^[0-9]*/&s#.*#\&/;s/:r=/|/;s/.*/&#p;/' | \
tr -d '\n') foo | \
sed 's/|[0-9|]*|/|/'
Output:
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
Looking at the inner sed code:
grep -no 'r=[0-9]*' foo | \
sed 's/^[0-9]*/&s#.*#\&/;s/:r=/|/;s/.*/&#p;/' | \
tr -d '\n'
Its purpose is to parse foo on-the-fly (when foo changes, so will the output), and in this instance come up with:
1s#.*#&|1#p;2s#.*#&|1#p;2s#.*#&|3#p;
Which is almost perfect, but it leaves in old data on the last line:
sed -n '1s#.*#&|1#p;2s#.*#&|1#p;2s#.*#&|3#p;' foo
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1|3
...and that leftover |1 is what the final sed 's/|[0-9|]*|/|/' removes.
Here is a pure bash solution. I wouldn't recommend actually using this, but it might help you understand better how to work with files in bash.
# Iterate over each line, splitting into three fields
# using | as the delimiter. (f3 is only there to make
# sure a trailing | is not included in the value of f2)
while IFS="|" read -r f1 f2 f3; do
# Create an array of variable groups from $f2, using ;
# as the delimiter
IFS=";" read -a groups <<< "$f2"
for group in "${groups[@]}"; do
# Get each variable from the group separately
# by splitting on ,
IFS=, read -a vars <<< "$group"
for var in "${vars[@]}"; do
# Split each assignment on =, create
# the variable for real, and quit once we
# have found r
IFS== read name value <<< "$var"
declare "$name=$value"
[[ $name == r ]] && break
done
# Output the desired line for the current value of r
printf '%s|%s|%s\n' "$f1" "$f2" "$r"
done
done < $xxxx.txt
Changes for ksh:
read -A instead of read -a.
typeset instead of declare.
If <<< is a problem, you can use a here document instead. For example:
IFS=";" read -A groups <<EOF
$f2
EOF

setting variables to inlines with variables in bash?

I'm trying to set the value of a variable to one line of a file, over and over.
for i in {1..5}
do
THIS = "grep -m $i'[a-z]' newdict2" | tail -1
echo $THIS
done
What's the trick to this black magic?
It's actually easier to run it with sed than tail and grep's -m option:
for i in {1..5}
do
THIS=$(grep -e '[a-z]' newdict2 | sed -ne "${i}p")
echo "$THIS"
done
If you iterate from 1 to x, another way to solve it is to read the lines in a loop:
while IFS= read -r THIS; do
echo "$THIS"
done < <(grep -e '[a-z]' newdict2)
And through awk:
while IFS= read -r THIS; do
echo "$THIS"
done < <(awk '/[a-z]/ && ++i <= 5' newdict2)
Another awk version with different initial value:
while IFS= read -r THIS; do
echo "$THIS"
done < <(awk 'BEGIN { i = 2 } /[a-z]/ && i++ <= 5' newdict2)
It's better to find the occurrences once, and loop over them.
grep -m "$i" '[a-z]' newdict |
nl |
while read i THIS; do
echo "$THIS"
done
If you don't need $i for anything inside the loop, remove nl and just read THIS.
Note also the use of double quotes around variable interpolations.
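Another way to "find the occurrences once" is to collect them into an array and loop over that. The sketch below assumes bash 4 or later (for mapfile) and a grep that supports -m; the function name first_matches and the sample file contents are invented for the example.

```bash
#!/usr/bin/env bash
# first_matches FILE N (hypothetical helper): print the first N lines of FILE
# that contain a lowercase letter, each prefixed with its 1-based position.
first_matches() {
    local file=$1 n=$2 i
    local -a lines
    # Run grep once and keep every matching line as one array element.
    mapfile -t lines < <(grep -m "$n" -e '[a-z]' -- "$file")
    for i in "${!lines[@]}"; do
        printf '%d %s\n' "$((i + 1))" "${lines[i]}"
    done
}

# Demo with a throwaway file standing in for newdict2.
tmp=$(mktemp)
printf '%s\n' apple 123 banana cherry > "$tmp"
first_matches "$tmp" 2
rm -f "$tmp"
```

Because the loop is not on the receiving end of a pipe, variables set inside it remain visible afterwards.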

Reverse input order with sed

I have a file, let's call it 'a.txt', and this file contains the following line of text
do to what
I'm wondering what the SED command is to reverse the order of this text to make it look like
what to do
Do I have to do some sort of append? Like append 'do' to 'to' so it would look like
to ++ do (used ++ just to make it clear)
I know tac can do something related
$ cat file
do to what
$ tac -s' ' file
what to do $
Where the -s defines the separator, which is by default a newline.
I would use awk to do this:
awk '{ for (i=NF; i>=1; i--) printf (i!=1) ? $i OFS : $i "\n" }' file.txt
Results:
what to do
EDIT:
If you require a one-liner to modify your file "in-place", try:
{ rm file.txt && awk '{ for (i=NF; i>=1; i--) printf (i!=1) ? $i OFS : $i "\n" }' > file.txt; } < file.txt
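The same field-by-field walk can be done in bash by reading each line into an array and iterating from the last index down. This is a sketch rather than a drop-in replacement: it splits on spaces only and collapses runs of spaces, and the name reverse_words is made up for the example.

```bash
#!/usr/bin/env bash
# reverse_words (hypothetical name): reverse the word order of each input line.
reverse_words() {
    local -a words
    local out i
    while IFS=' ' read -r -a words; do
        out=''
        for ((i = ${#words[@]} - 1; i >= 0; i--)); do
            # Add a separating space only when out already holds a word.
            out+=${out:+ }${words[i]}
        done
        printf '%s\n' "$out"
    done
}

reverse_words <<< 'do to what'   # prints: what to do
```

It reads standard input, so reverse_words < a.txt handles a whole file line by line.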
sed answer
As this question was tagged sed, my 1st answer was:
First, using an arbitrary _ to mark spaces already processed, when a.txt contains do to what:
sed -e '
:a;
s/\([^_]*\) \([^ ]*\)/\2_\1/;
ta;
y/_/ /;
' a.txt
what to do
Then, when the input contains do to to to what:
sed -e '
:a;
s/^\(\|.* \)\([^+ ]\+\) \2\([+]*\)\(\| .*\)$/\1\2\3+\4/g;
ta;
:b;
s/\([^_]*\) \([^ ]*\)/\2_\1/;
tb;
y/_/ /;
' <<<'do to to to what'
what to++ do
There is one + for each suppressed duplicate word:
sed -e ':a;s/^\(\|.* \)\([^+ ]\+\) \2\([+]*\)\(\| .*\)$/\1\2\3+\4/g;ta;
:b;s/\([^_]*\) \([^ ]*\)/\2_\1/;tb;
y/_/ /;' <<<'do do to what what what what'
what+++ to do+
bash answer
But as there are a lot of people searching for simple bash solutions, here is a simple way:
xargs < <(uniq <(tac <(tr \ \\n <<<'do do to what what what what')))
what to do
this could be written:
tr \ \\n <<<'do do to what what what what' | tac | uniq | xargs
what to do
or even with some bash scripting:
revcnt () {
local wrd cnt plus out="";
while read cnt wrd; do
printf -v plus %$((cnt-1))s;
out+=$wrd${plus// /+}\ ;
done < <(uniq -c <(tac <(tr \ \\n )));
echo $out
}
Will do:
revcnt <<<'do do to what what what what'
what+++ to do+
Or as pure bash
revcnt() {
local out i;
for ((i=$#; i>0; i--))
do
[[ $out =~ ${!i}[+]*$ ]] && out+=+ || out+=\ ${!i};
done;
echo $out
}
where the string has to be passed as arguments:
revcnt do do to what what what what
what+++ to do+
Or, if processing standard input (or a file) is required:
revcnt() {
local out i arr;
while read -a arr; do
out=""
for ((i=${#arr[@]}; i--; 1))
do
[[ $out =~ ${arr[i]}[+]*$ ]] && out+=+ || out+=\ ${arr[i]};
done;
echo $out;
done
}
So you can process multiple lines:
revcnt <<eof
do to what
do to to to what
do do to what what what what
eof
what to do
what to++ do
what+++ to do+
This might work for you (GNU sed):
sed -r 'G;:a;s/^\n//;t;s/^(\S+|\s+)(.*)\n/\2\n\1/;ta' file
Explanation:
G add a newline to the end of the pattern space (PS)
:a loop name space
s/^\n//;t when the newline is at the front of the PS, remove it and print line
s/^(\S+|\s+)(.*)\n/\2\n\1/;ta insert either a non-space or a space string directly after the newline and loop to :a
The -r switch makes the regexp easier-on-the-eye (grouping (...), alternation ...|... and the metacharacter for one-or-more + are relieved of the need of a backslash prefix).
Alternative:
sed -E 'G;:a;s/^(\S+)(\s*)(.*\n)/\3\2\1/;ta;s/.//' file
N.B. To reverse the line, adapt the above solution to:
sed -E 'G;:a;s/^(.)(.*\n)/\2\1/;ta;s/.//' file
Maybe you would like perl for this:
perl -lane '@rev=reverse(@F);print "@rev"' your_file
As Bernhard said, tac can be used here:
#!/usr/bin/env bash
set -eu
echo '1 2 3
2 3 4
3 4 5' | while IFS= read -r; do
echo -n "$REPLY " | tac -s' '
echo
done
$ ./1.sh
3 2 1
4 3 2
5 4 3
I believe my example is more helpful.

reverse the order of characters in a string

In string "12345", out string "54321". Preferably without third party tools and regex.
I know you said "without third-party tools", but sometimes a tool is just too obviously the right one, plus it's installed on most Linux systems by default:
[madhatta@risby tmp]$ echo 12345 | rev
54321
See rev's man page for more.
Simple:
var="12345"
copy=${var}
len=${#copy}
for((i=$len-1;i>=0;i--)); do rev="$rev${copy:$i:1}"; done
echo "var: $var, rev: $rev"
Output:
$ bash rev
var: 12345, rev: 54321
Presume that a variable 'var' has the value '123'
var="123"
Reverse the string and store in a new variable 'rav':
rav=$(echo $var | rev)
You'll see the 'rav' has the value of '321' using echo.
echo $rav
rev | tail -r (BSD) or rev | tac (GNU) also reverse lines:
$ rev <<< $'12\n34' | tail -r
43
21
$ rev <<< $'12\n34' | gtac
43
21
If LC_CTYPE is C, rev reverses the bytes of multibyte characters:
$ LC_CTYPE=C rev <<< あの
��め�
$ export LC_ALL=C; LC_ALL=en_US.UTF-8 rev <<< あの
のあ
A bash solution improving on @osdyng's answer (my edit was not accepted):
var="12345" rev=""
for(( i=0 ; i<${#var} ; i++ )); do rev="${var:i:1}$rev"; done
echo "var: $var, rev: $rev"
Or an even simpler (bash) loop:
var=$1 len="${#var}" i=0 rev=""
while (( i<len )); do rev="${var:i++:1}$rev"; done
echo "var: $var, rev: $rev"
A POSIX solution:
var="12345" rev="" i=1
while [ "$i" -le "${#var}" ]
do rev="$(echo "$var" | awk -v i="$i" '{print(substr($0,i,1))}')$rev"
: $(( i+=1 ))
done
echo "var: $var, rev: $rev"
Note: This works on multi byte strings. Cut solutions will work only in ASCII (1 byte) strings.
Some simple methods of reversing a string
echo '!!!esreveR si sihT' | grep -o . | tac | tr -d '\n' ; echo
echo '!!!esreveR si sihT' | fold -w 1 | tac | tr -d '\n' ; echo
Convert to hex values then reverse
echo '!!!esreveR si sihT' | xxd -p | grep -o .. | tac | xxd -r -p ; echo
echo '!!!esreveR si sihT' | xxd -p | fold -w 2 | tac | xxd -r -p ; echo
This reverses the string "in place":
a=12345
len=${#a}
for ((i=1;i<len;i++)); do a=$a${a: -i*2:1}; done; a=${a:len-1}
echo $a
or the third line could be:
for ((i=0;i<len;i++)); do a=${a:i*2:1}$a; done; a=${a:0:len}
or
for ((i=1;i<len;i++)); do a=${a:0:len-i-1}${a: -i:i+1}${a:len-i-1:1}; done
For those without rev (recommended), there is the following simple awk solution that splits fields on the null string (every character is a separate field) and prints in reverse:
awk -F '' '{ for(i=NF; i; i--) printf("%c", $i); print "" }'
The above awk code is POSIX compliant. As a compliant awk implementation is guaranteed to be on every POSIX compliant OS, the solution should thus not be thought of as "3rd-party." This code will likely be more concise and understandable than a pure POSIX sh (or bash) solution.
(; I do not know if you consider the null string to -F a regex... ;)
If var=12345:
bash: for((i=0;i<${#var};i++)); do rev="$rev${var:~i:1}"; done
sh: c=$var; while [ "$c" ]; do rev=$rev${c#"${c%?}"}; c=${c%?}; done
echo "var: $var, rev: $rev"
Run it:
$ rev
var: 12345, rev: 54321
This can of course be shortened, but it should be simple to understand: the final print adds the newline.
echo 12345 | awk '{for (i = length($0); i > 0; i--) {printf("%s", substr($0, i, 1));} print "";}'
Nobody appears to have posted a sed solution, so here's one that works in non-GNU sed (so I wouldn't consider it "3rd party"). It does capture single characters using the regex ., but that's the only regex.
In two stages:
$ echo 123456 | sed $'s/./&\\\n/g' | sed -ne $'x;H;${x;s/\\n//g;p;}'
654321
This uses bash format substitution to include newlines in the scripts (since the question is tagged bash). It works by first separating the input string into one line per character, and then by inserting each character into the beginning of the hold buffer.
x swaps the hold space and the pattern space, and
H appends the (current) pattern space to the hold space.
So for every character, we place that character into the hold space, then append the old hold space to it, thus reversing the input. The final command removes the newlines in order to reconstruct the original string.
This should work for any single string, but it will concatenate multi-line input into a single output string.
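If multi-line input should stay multi-line, one option is to reverse each line separately. The sketch below uses the same prepend-as-you-go idea as the hold-space trick, but in plain bash; reverse_lines is a hypothetical name, and reversal is by character as bash sees it in the current locale.

```bash
#!/usr/bin/env bash
# reverse_lines (hypothetical name): reverse the characters of each input
# line, keeping the lines themselves in their original order.
reverse_lines() {
    local line rev i
    while IFS= read -r line; do
        rev=''
        for ((i = 0; i < ${#line}; i++)); do
            # Prepend each character, so the last one read ends up first.
            rev=${line:i:1}$rev
        done
        printf '%s\n' "$rev"
    done
}

printf '%s\n' 123456 abc | reverse_lines
# prints:
# 654321
# cba
```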
Here is another simpler awk solution:
awk 'BEGIN{FS=""} {for (i=NF; i>0; i--) s=s $i; print s}' <<< '123456'
654321
Try Perl:
echo 12345 | perl -nle 'print scalar reverse $_'
Source: Perl one-liners
read word
reve=$(echo "$word" | awk '{for(i=length($0); i>0; i--) printf "%s", substr($0,i,1)}')
echo "$reve"

Randomizing arg order for a bash for statement

I have a bash script that processes all of the files in a directory using a loop like
for i in *.txt
do
ops.....
done
There are thousands of files and they are always processed in alphanumerical order because of '*.txt' expansion.
Is there a simple way to randomize the order and still ensure that I process each file exactly once?
Assuming the filenames do not have spaces, just substitute the output of List::Util::shuffle.
for i in `perl -MList::Util=shuffle -e'$,=$";print shuffle<*.txt>'`; do
....
done
If filenames do have spaces but don't have embedded newlines or backslashes, read a line at a time.
perl -MList::Util=shuffle -le'$,=$\;print shuffle<*.txt>' | while read i; do
....
done
To be completely safe in Bash, use NUL-terminated strings.
perl -MList::Util=shuffle -0 -le'$,=$\;print shuffle<*.txt>' |
while read -r -d '' i; do
....
done
Not very efficient, but it is possible to do this in pure Bash if desired. sort -R does something like this, internally.
declare -a a # create an integer-indexed (sparse) array
for i in *.txt; do
j=$RANDOM # find an unused slot
while [[ -n ${a[$j]} ]]; do
j=$RANDOM
done
a[$j]=$i # fill that slot
done
for i in "${a[@]}"; do # iterate in index order (which is random)
....
done
Or use a traditional Fisher-Yates shuffle.
a=(*.txt)
for ((i=${#a[*]}; i>1; i--)); do
j=$((RANDOM % i))
tmp=${a[j]}
a[j]=${a[i-1]}
a[i-1]=$tmp
done
for i in "${a[@]}"; do
....
done
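The Fisher-Yates loop can also be packaged as a function, which makes it easy to check that every item comes out exactly once. This is a sketch under a few assumptions: the function name shuffle_into and the global array name shuffled are invented here, and because $RANDOM only ranges over 0..32767 the result is slightly biased and not suitable for lists longer than that.

```bash
#!/usr/bin/env bash
# shuffle_into (hypothetical name): copy the arguments into the global array
# "shuffled" and permute it in place with a Fisher-Yates shuffle.
shuffle_into() {
    shuffled=("$@")
    local i j tmp
    for ((i = ${#shuffled[@]} - 1; i > 0; i--)); do
        j=$((RANDOM % (i + 1)))      # pick a slot in 0..i
        tmp=${shuffled[i]}
        shuffled[i]=${shuffled[j]}
        shuffled[j]=$tmp
    done
}

# Demo with literal names; in the question's case this would be: shuffle_into *.txt
shuffle_into one.txt two.txt "three four.txt" five.txt
for f in "${shuffled[@]}"; do
    printf 'processing %s\n' "$f"
done
```

Since the array is only permuted, never filtered or extended, each name is visited exactly once, and names containing spaces survive because nothing is re-split.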
You could pipe your filenames through the sort command:
ls | sort --random-sort | xargs ....
Here's an answer that relies on very basic functions within awk so it should be portable between unices.
ls -1 | awk '{print rand()*100, $0}' | sort -n | awk '{print $2}'
EDIT:
ephemient makes a good point that the above is not space-safe. Here's a version that is:
ls -1 | awk '{print rand()*100, $0}' | sort -n | sed 's/[0-9\.]* //'
If you have GNU coreutils, you can use shuf:
while IFS= read -r -d '' f
do
# some stuff with $f
done < <(shuf -ze *)
This will work with files with spaces or newlines in their names.
Off-topic Edit:
To illustrate SiegeX's point in the comment:
$ a=42; echo "Don't Panic" | while read line; do echo $line; echo $a; a=0; echo $a; done; echo $a
Don't Panic
42
0
42
$ a=42; while read line; do echo $line; echo $a; a=0; echo $a; done < <(echo "Don't Panic"); echo $a
Don't Panic
42
0
0
The pipe causes the while to be executed in a subshell and so changes to variables in the child don't flow back to the parent.
Here's a solution with standard unix commands:
for i in $(ls); do echo $RANDOM-$i; done | sort | cut -d- -f 2-
Here's a Python solution, if it's available on your system:
import glob
import random
files = glob.glob("*.txt")
random.shuffle(files)  # shuffles the list in place and returns None
for file in files:
    print(file)
