I currently have a bash script which gives me numbers such as 26084729776630091742. This number is converted 26,084729776630091742 with the last 18 digits being decimal places, but how can I convert this in bash?
(The number I want to convert is an ethereum unit called WEI and the 18 last numbers are the decimal places. So it doesnt work adding a dot after the first 2 digits.)
EDIT2:
echo "128084729776630091742" | awk '{gsub(/,/,"");print substr($0,1,length($0)-18) "," substr($0,length($0)-17)}'
EDIT: As OP changed the sample so adding this code now.
echo "128,6436xxxx" | awk '{gsub(/,/,"");print substr($0,1,2) "," substr($0,3)}'
If you are looking for only this specific example then following may help you.
echo "26084729776630091742" | awk '{print substr($0,1,2) "," substr($0,3)}'
Solution 2nd: Using sed.
echo "26084729776630091742" | sed 's/\(..\)\(.*\)/\1,\2/'
what you are looking for is something like: "${str:0:2},${str:2}"
Example in your case:
str="26084729776630091742"
echo "${str:0:2},${str:2}"
prints 26,084729776630091742
EDIT as question was precised
str="26084729776630091742133"
echo "${str:0:${#str}-18},${str:${#str}-18}"
prints 26084,729776630091742133 with 18 decimal places (works fine if ${#str} > 18)
Related
So I need to subset 10 characters from all strings in a particular column of a file, randomly and without repetition (i.e. I want to avoid drawing a character from any given index more than once).
For the sake of simplicity, let's say I have the following string:
ABCDEFGHIJKLMN
For which I should obtain, for example, this result:
DAKLFCHGBI
Notice that no letter occurs twice, which means that no position is extracted more than once.
For this other string:
CCCCCCCCCCCCGG
Analogously, I should never find more than two "G" characters in the output (otherwise it would mean that a "G" character has been sampled more than once), e.g.:
CCGCCCCCCC
Or, in other words, I want to shuffle all characters from each string, and keep the first 10. This can be easily achieved in bash using:
echo "ABCDEFGHIJKLMN" | fold -w1 | shuf -n10 | tr -d '\n'
However, since I need to perform this many times on dozens of files with over a hundred thousand lines each, this is way too slow. So looking around, I've arrived at the following awk code, which seems to work fine whenever the strings are passed to it one by one, e.g.:
awk '{srand(); len=length($1); for(i=1;i<=10;) {k=int(rand()*len)+1; if(!(k in N)) {N[k]; printf "%s", substr($1,k,1); i++}} print ""}' <(echo "ABCDEFGHIJKLMN")
But when I input the following file with a string on each row, awk hangs and the output gets truncated on the second line:
echo "ABCDEFGHIJKLMN" > file.txt
echo "CCCCCCCCCCCCGG" >> file.txt
awk '{srand(); len=length($1); for(i=1;i<=10;) {k=int(rand()*len)+1; if(!(k in N)) {N[k]; printf "%s", substr($1,k,1); i++}} print ""}' file.txt
This other version of the code which samples characters from the string with repetition works fine, so it looks like the issue lies in the part which populates the N array, but I'm not proficient in awk so I'm a bit stuck:
awk '{srand(); len=length($1); for(i=1;i<=10;i++) {k=int(rand()*len)+1; printf "%s", substr($1,k,1)} print ""}'
Anyone can help?
In case this matters: my actual file is more complex than the examples provided here, with several other columns, and unlike the ones in this example, its strings may have different lengths.
Thanks in advance for your time :)
EDIT:
As mentioned in the comments, I managed to make it work by removing the N array (so that it resets before processing each row):
awk 'BEGIN{srand()} {len=length($1); for(i=1;i<=10;) {k=int(rand()*len)+1; if(!(k in N)) {N[k]; printf "%s", substr($1,k,1); i++}} split("", N); print ""}' file.txt
Do note however that if the string in $1 is shorter than 10, this will get stuck in an infinite loop, so make sure that all strings are always longer than the subset target size. The alternative solution provided by Andre Wildberg in the comments doesn't carry this issue.
I would harness GNU AWK for this task following way, let file.txt content be
ABCDEFGHIJKLMN
CCCCCCCCCCCCGG
then
awk 'function comp_func(i1, v1, i2, v2){return rand()-0.5}BEGIN{FPAT=".";PROCINFO["sorted_in"]="comp_func"}{s="";patsplit($0,arr);for(i in arr){s = s arr[i]};print substr(s,1,10)}' file.txt
might give output
NGLHCKEIMJ
CCCCCCCCGG
Explanation: I use custom Array Traversal Control function which does randomly decides which element should be considered greater. -0.5 is used as rand() gives values from 0 to 1. For each line array arr is populated by characters of line, then traversed in random order to create s string which are characters shuffled, then substr used to get first 10 characters. You might elect to add counter which will terminate for loop if you have very long lines in comparison to number of characters to select.
(tested in GNU Awk 5.0.1)
Iteratively construct a substring of the remaining letters.
Tested with
awk version 20121220
GNU Awk 4.2.1, API: 2.0
GNU Awk 5.2.1, API 3.2
mawk 1.3.4 20200120
% awk -v size=10 'BEGIN{srand()} {n=length($0); a=$0; x=0;
for(i=1; i<=n; i++){x++; na=length(a); rnd = int(rand() * na + 1)
printf("%s", substr(a, rnd, 1))
a=substr(a, 1, rnd - 1)""substr(a, rnd + 1, na)
if(x >= size){break}}
print ""}' file.txt
CJFMKHNDLA
CGCCCCCCCC
In consecutive iterative runs remember to check if srand works the way you expect in your version of awk. If in doubt use $RANDOM or, better, /dev/urandom.
if u don't need to be strictly within awk, then jot makes it super easy :
say you want 20 random characters between
"A" (ascii 65) and "N" (ascii 78), inc. repeats of same chars
jot -s '' -c -r 20 65 78
ANNKECLDMLMNCLGDIGNL
I am trying to do some smart contract testing for Elrond blockchain with multiple wallets and different token amounts.
The Elrond blockchain requires some hexadecimal encodings for the Smart Contract interaction. The problem is that my hex encoding matches some conversions like in this webpage http://207.244.241.38/elrond-converters/ but some others are not:
For example here is my for loop:
for line in $file; do
MAIAR_AMOUNT=$(echo $line | awk -F "," '{print $2}')
MAIAR_AMOUNT_HEX=$(printf "%04x" $MAIAR_AMOUNT)
echo -e "$MAIAR_AMOUNT > $MAIAR_AMOUNT_HEX"
done
And here is the output
620000000000 > 905ae13800
1009000000000 > eaed162a00
2925000000000 > 2a907960200
31000000000 > 737be7600
111000000000 > 19d81d9600
The first one is the decimal value I want to convert, the second is the hexadecimal.
Now if I compare the results with http://207.244.241.38/elrond-converters/
A value like 2925000000000 is 02a907960200 not 2a907960200 like I have in my output. (notice the 0 at the beginning)
But a value like 620000000000 is matching with the website 905ae13800
Of course adding a 0 in front of %04x is not gonna help me.
Now if I go to this guy repository (link below) I can see there is a calculus made, but I don't know JavaScript/TypeScript so I don't know how to interpret it in Bash.
https://github.com/bogdan-rosianu/elrond-converters/blob/main/src/index.ts#L30
It looks the converted hex string should have even number of digits.
Then would you please try:
MAIAR_AMOUNT_HEX=$(printf "%x" "$MAIAR_AMOUNT")
(( ${#MAIAR_AMOUNT_HEX} % 2 )) && MAIAR_AMOUNT_HEX="0$MAIAR_AMOUNT_HEX"
The condition (( ${#MAIAR_AMOUNT_HEX} % 2 )) is evaluated to be true if
$MAIAR_AMOUNT_HEX has odd length. Then 0 is prepended to adjust
the length.
The 4 in %04x is how many digits to pad the output to. Use %012x to pad the output to 12 hex digits.
❯ printf '%012x' 2925000000000
02a907960200
suppose I have file containing numbers like:
1 4 7
2 5 8
and I want to add 1 to all these numbers, making the output like:
2 5 8
3 6 9
is there a simple one-line command (e.g. awk) to realize this?
try following once.
awk '{for(i=1;i<=NF;i++){$i=$i+1}} 1' Input_file
EDIT: As per OP's request without loop, here is a solution(written as per shown sample only).
With hardcoding of number of fields.
awk -v RS='[ \n]' '{ORS=NR%3==0?"\n":" ";print $0+1}' Input_file
OR
Without hardcoding number of fields.
awk -v RS='[ \n]' -v col=$(awk 'FNR==1{print NF}' Input_file) '{ORS=NR%col==0?"\n":" ";print $0+1}' Input_file
Explanation: So in EDIT section 1st solution I have hardcoded the number of fields by mentioning 3 there, in OR solution of EDIT, I am creating a variable named col which will read the very first line of Input_file to get the number of fields. Then it will not read all the Input_file, Now coming onto the code I have set Record separator as space or new line to it will add them without using a loop and it will add space each time after incrementing 1 in their values. It will print new line only when number of lines are completely divided by value of col(which is why we have taken number of fields in -v col section).
In native bash (no awk or other external tool needed):
#!/usr/bin/env bash
while read -r -a nums; do # read a line into an array, splitting on spaces
out=( ) # initialize an empty output array for that line
for num in "${nums[#]}"; do # iterate over the input array...
out+=( "$(( num + 1 ))" ) # ...and add n+1 to the output array.
done
printf '%s\n' "${out[*]}" # then print that output array with a newline following
done <in.txt >out.txt # with input from in.txt and output to out.txt
You can do this using gnu awk:
awk -v RS="[[:space:]]+" '{$0++; ORS=RT} 1' file
2 5 8
3 6 9
If you don't mind Perl:
perl -pe 's/(\d+)/$1+1/eg' file
Substitute any number composed of multiple digits (\d+) with that number ($1) plus 1. /e means to execute the replacement calculation, and /g means globally throughout the file.
As mentioned in the comments, the above only works for positive integers - per the OP's original sample file. If you wanted it to work with negative numbers, decimals and still retain text and spacing, you could go for something like this:
perl -pe 's/([-]?[.0-9]+)/$1+1/eg' file
Input file
Some column headers # words
1 4 7 # a comment
2 5 cat dog # spacing and stray words
+5 0 # plus sign
-7 4 # minus sign
+1000.6 # positive decimal
-21.789 # negative decimal
Output
Some column headers # words
2 5 8 # a comment
3 6 cat dog # spacing and stray words
+6 1 # plus sign
-6 5 # minus sign
+1001.6 # positive decimal
-20.789 # negative decimal
I'm trying to write a script to pull the integers out of 4 files that store temperature readings from 4 industrial freezers, this is a hobby script it generates the general readouts I wanted, however when I try to generate a SUM of the temperature readings I get the following printout into the file and my goal is to print the end SUM only not the individual numbers printed out in a vertical format
Any help would be greatly appreciated;here's my code
grep -o "[0.00-9.99]" "/location/$value-1.txt" | awk '{ SUM += $1; print $1} END { print SUM }' >> "/location/$value-1.txt"
here is what I am getting in return
Morningtemp:17.28
Noontemp:17.01
Lowtemp:17.00 Hightemp:18.72
1
7
.
2
8
1
7
.
0
1
1
7
.
0
0
1
8
.
7
2
53
It does generate the SUM I don't need the already listed numbers, just the SUM total
Why not stick with AWK completely? Code:
$ cat > summer.awk
{
while(match($0,/[0-9]+\.[0-9]+/)) # while matches on record
{
sum+=substr($0, RSTART, RLENGTH) # extract matches and sum them
$0=substr($0, RSTART + RLENGTH) # reset to start after previous match
count++ # count matches
}
}
END {
print sum"/"count"="sum/count # print stuff
Data:
$ cat > data.txt
Morningtemp:17.28
Noontemp:17.01
Lowtemp:17.00 Hightemp:18.72
Run:
$ awk -f summer.awk file
70.01/4=17.5025
It might work in the winter too.
The regex in grep -o "[0.00-9.99]" "/location/$value-1.txt" is equivalent to [0-9.], but you're probably looking for numbers in the range 0.00 to 9.99. For that, you need a different regex:
grep -o "[0-9]\.[0-9][0-9]" "/location/$value-1.txt"
That looks for a digit, a dot, and two more digits. It was almost tempting to use [.] in place of \.; it would also work. A plain . would not; that would select entries such as 0X87.
Note that the pattern shown ([0-9]\.[0-9][0-9]) will match 192.16.24.231 twice (2.16 and 4.23). If that's not what you want, you have to be a lot more precise. OTOH, it may not matter in the slightest for the actual data you have. If you'd want it to match 192.16 and 24.231 (or .24 and .231), you have to refine your regex.
Your command structure:
grep … filename | awk '…' >> filename
is living dangerously. In the example, it is 'OK' (but there's a huge grimace on my face as I type 'OK') because the awk script doesn't write anything to the file until grep has read it all. But change the >> to > and you have an empty input, or have awk write material before the grep is complete and suddenly it gets very tricky to determine what happens (it depends, in part, on what awk writes to the end of the file).
In a text file, test.txt, I have the next information:
sl-gs5 desconnected Wed Oct 10 08:00:01 EDT 2012 1001
I want to extract the hour of the event by the next command line:
hour=$(grep -n sl-gs5 test.txt | tail -1 | cut -d' ' -f6 | awk -F ":" '{print $1}')
and I got "08". When I try to add 1,
14 echo $((hour+1))
I receive the next error message:
./test2.sh: line 14: 08: value too great for base (error token is "08")
If variables in Bash are untyped, why?
See ARITHMETIC EVALUATION in man bash:
Constants with a leading 0 are interpreted as octal numbers.
You can remove the leading zero by parameter expansion:
hour=${hour#0}
or force base-10 interpretation:
$((10#$hour + 1))
what I'd call a hack, but given that you're only processing hour values, you can do
hour=08
echo $(( ${hour#0} +1 ))
9
hour=10
echo $(( ${hour#0} +1))
11
with little risk.
IHTH.
You could also use bc
hour=8
result=$(echo "$hour + 1" | bc)
echo $result
9
Here's an easy way, albeit not the prettiest way to get an int value for a string.
hour=`expr $hour + 0`
Example
bash-3.2$ hour="08"
bash-3.2$ hour=`expr $hour + 0`
bash-3.2$ echo $hour
8
In Short: In order to deal with "Leading Zero" numbers (any 0 digit that comes before the first non-zero) in bash
- Use bc An arbitrary precision calculator language
Example:
a="000001"
b=$(echo $a | bc)
echo $b
Output: 1
From Bash manual:
"bc is a language that supports arbitrary precision numbers with interactive execution
of statements. There are some similarities in the syntax to the C programming lan-
guage. A standard math library is available by command line option. If requested, the
math library is defined before processing any files. bc starts by processing code from
all the files listed on the command line in the order listed. After all files have
been processed, bc reads from the standard input. All code is executed as it is read.
(If a file contains a command to halt the processor, bc will never read from the standard input.)"
Since hours are always positive, and always 2 digits, you can set a 1 in front of it and subtract 100:
echo $((1$hour+1-100))
which is equivalent to
echo $((1$hour-99))
Be sure to comment such gymnastics. :)
The leading 0 is leading to bash trying to interpret your number as an octal number, but octal numbers are 0-7, and 8 is thus an invalid token.
If I were you, I would add some logic to remove a leading 0, add one, and re-add the leading 0 if the result is < 10.
How about sed?
hour=`echo $hour|sed -e "s/^0*//g"`