In Jonesforth, a dictionary entry is laid out as follows:
<--- DICTIONARY ENTRY (HEADER) ----------------------->
+------------------------+--------+---------- - - - - +----------- - - - -
| LINK POINTER           | LENGTH/| NAME              | DEFINITION
|                        | FLAGS  |                   |
+--- (4 bytes) ----------+- byte -+- n bytes  - - - - +----------- - - - -
We can take a peek at one of these entries using GDB. (See this question for details on using GDB with Jonesforth.)
Let's display the first 16 bytes of the dictionary entry for SWAP as characters:
>>> x/16cb &name_SWAP
0x105cc: -68 '\274' 5 '\005' 1 '\001' 0 '\000' 4 '\004' 83 'S' 87 'W' 65 'A'
0x105d4: 80 'P' 0 '\000' 0 '\000' 0 '\000' 43 '+' 0 '\000' 1 '\001' 0 '\000'
You can kind of see what's going on here.
The first four bytes are the pointer to the previous word in the dictionary:
-68 '\274' 5 '\005' 1 '\001' 0 '\000'
Then comes the length of the name:
4 '\004'
Then we see the characters of the word name, "SWAP":
83 'S' 87 'W' 65 'A' 80 'P'
And finally some padding to align on a 32-bit boundary:
0 '\000' 0 '\000' 0 '\000'
It would be nice to have a way to display the word entry more readably.
If we do the following:
>>> x/1xw &name_SWAP
0x105cc: 0x000105bc
we note that name_SWAP is at 0x105cc.
Let's use GDB's printf to display the word entry:
>>> printf "link: %#010x name length: %i name: %s\n", *(0x105cc), (char)*(0x105cc+4), (0x105cc+5)
link: 0x000105bc name length: 4 name: SWAP
OK, that's not bad! We see the link, the name length, and name, all nicely displayed and labeled.
The downside here is that I have to use the explicit address in the call to printf:
printf "link: %#010x name length: %i name: %s\n", *(0x105cc), (char)*(0x105cc+4), (0x105cc+5)
Ideally, I'd just be able to say something like:
show_forth_word name_SWAP
and it'd display the above.
What's the best way to go about this? Is this doable with a GDB user-defined command? Or is it something more appropriate for the GDB Python interface?
My question is, what's the best way to go about this?
It depends on whether GDB knows about the type of name_SWAP. If it does, the Python pretty-printer is the way to go.
If it doesn't, something as simple as a user-defined command is likely easier. Assuming 32-bit mode:
define print_key
  set var $v = (char*)$arg0
  printf "link: %#010x name length: %i name: %s\n", *((char**)$v), *($v+4), ($v+5)
end
I just used nm, the good old Unix command.
I'm attempting to parse email body to excel file.
After some manipulations, my current output is an array, where each line is data related to a product.
[
"Periods: 01.01.2023 - 01.02.2023 | Code: 111 | Code2: 1111 | product-name",
"Periods: 01.01.2023 - 01.02.2023 | Code: 222 | Code2: 2222 | product-name2"
]
I need to replace the 3rd occurrence of " | " with " | Product: ", so I can get a Product field before the product name.
I've tried Apply to each -> current item with various ways of finding the 3rd occurrence and replacing it, but without success.
Any suggestions?
You should be able to loop through each item and perform a simple replace expression like this:
replace(item(), split(item(), ' | ')[3], concat('Product: ', split(item(), ' | ')[3]))
That should get you across the line. Of course, I'm basing my answer off the limited information you provided.
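If the lines can be preprocessed outside the flow, the same "third separator" replacement can be sketched in shell. This is not Power Automate syntax, and the helper name add_product_label is made up for illustration. It relies on the numeric flag of sed's s command, which restricts the substitution to the Nth match on each line:

```shell
# Hypothetical helper: insert "Product: " after the 3rd " | " on each line.
# The trailing 3 tells sed to replace only the third match, not the first.
add_product_label() {
    sed 's/ | / | Product: /3'
}

printf '%s\n' \
    'Periods: 01.01.2023 - 01.02.2023 | Code: 111 | Code2: 1111 | product-name' \
    'Periods: 01.01.2023 - 01.02.2023 | Code: 222 | Code2: 2222 | product-name2' |
    add_product_label
```

Unlike a replace keyed on the product name itself, this only touches the separator, so it should still behave if the product name happens to appear earlier in the line.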
I have a tab separated file, consisting of 7 columns.
ABC 1437 1 0 71 15.7 174.4
DEF 0 0 0 1 45.9 45.9
GHIJ 2 3 0 9 1.1 1.6
What I need is to replace each tab character with a variable number of space characters in order to maintain the column alignment. Note that I do not want every tab to be replaced by 8 spaces. Instead, I want 5 spaces after row #1 column #1 (8 - length(ABC) = 5), 4 spaces after row #1 column #2 (8 - length(1437) = 4), etc.
Is there a Linux tool to do this for me, or should I write it myself?
The POSIX utility pr called as pr -e -t does exactly what you want and AFAIK is present in every Unix installation.
$ cat file
ABC 1437 1 0 71 15.7 174.4
DEF 0 0 0 1 45.9 45.9
GHIJ 2 3 0 9 1.1 1.6
$ pr -e -t file
ABC     1437    1       0       71      15.7    174.4
DEF     0       0       0       1       45.9    45.9
GHIJ    2       3       0       9       1.1     1.6
and with the tabs visible as ^Is:
$ cat -ET file
ABC^I1437^I1^I0^I71^I15.7^I174.4$
DEF^I0^I0^I0^I1^I45.9^I45.9$
GHIJ^I2^I3^I0^I9^I1.1^I1.6$
$ pr -e -t file | cat -ET
ABC     1437    1       0       71      15.7    174.4$
DEF     0       0       0       1       45.9    45.9$
GHIJ    2       3       0       9       1.1     1.6$
There is a pair of commands dedicated to this task.
$ expand file
will do exactly what you want. Its counterpart, unexpand -a, does the reverse. Both have a few other useful options.
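A short sketch of expand on a cut-down version of the sample data, assuming the default tab stops every 8 columns. The sample variable is only there to keep the example self-contained:

```shell
# Build a small tab-separated sample (first three columns of the question's rows).
sample=$(printf 'ABC\t1437\t1\nDEF\t0\t0\nGHIJ\t2\t3\n')

# By default, expand pads each tab out to the next multiple of 8 columns.
printf '%s\n' "$sample" | expand

# -t changes the tab-stop width, here to 6 columns.
printf '%s\n' "$sample" | expand -t 6
```

With the default stops, "ABC" is followed by 5 spaces and "1437" by 4, which is the padding the question asks for.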
Use column, as suggested in the comment by anubhava, specifically using -t and -s options:
column -t -s $'\t' in_file
From the column manual:
-s, --separator separators
Specify the possible input item delimiters (default is
whitespace).
-t, --table
Determine the number of columns the input contains and
create a table. Columns are delimited with whitespace, by
default, or with the characters supplied using the
--output-separator option. Table output is useful for
pretty-printing.
How to count number of integers in a file using egrep?
I tried to solve it as a pattern-matching problem. My difficulty is how to represent a continuous run of characters in the range [0-9] that has a space before the beginning and a space or dot after the end. I think the latter can be handled with \< and \> respectively. Also, it should not include a dot in between, otherwise it would not be an integer. I am unable to convert this logic into a regular expression using the available tools and techniques.
My name is 2322.
33 is my sister.
I am blessed with a son named 55.
Why are you so 69. Is everything 33.
66.88 is not an integer
55whereareyou?
The right answer should be 5 i.e. for 2322, 33, 55, 69 and 33.
                    grep -Eo '(^| )([0-9]+[\.\?\=\:]?( |$))+' | wc -w
                          ^^    ^     ^       ^        ^   ^     ^
                          ||    |     |       |        |   |     |
E = extended regex--------+|    |     |       |        |   |     |
o = extract what found-----+    |     |       |        |   |     |
starts with new line or space---+     |       |        |   |     |
digits--------------------------------+       |        |   |     |
optional dot, question mark, etc.-------------+        |   |     |
ends with end line or space----------------------------+   |     |
repeat 1 time or more (to detect integers like "123 456")--+     |
count words------------------------------------------------------+
Note: 123. 123? 123: are also counted as integers
Test:
#!/bin/bash
exec 3<<EOF
My name is 2322.
33 is my sister.
I am blessed with a son named 55.
Why are you so 69. Is everything 33.
66.88 is not an integer
55whereareyou?
two integers 123 456.
how many tables in room 400? 50.
50? oh I thought it was 40.
23: It's late, 23:00 already
EOF
grep -Eo '(^| )([0-9]+[\.\?\=\:]?( |$))+' <&3 | \
tee >(sleep 0.5; echo -n "integer counted: "; wc -w; )
Outputs:
2322.
33
55.
69.
33.
123 456.
400? 50.
50?
40.
23:
integer counted: 12
Based on the observation that you want 66.88 excluded, I'm guessing
grep -Ec '(^| )[0-9]+\.?( |$)' file
which finds a run of digits preceded by a space or the start of the line, optionally followed by a dot, and then followed by either a space or the end of the line. (The anchor on the left matters: without it, the 88 in 66.88 would match.)
The -c option reports the number of lines that contain a match (so not strictly the number of matches, if some lines contain several), and the -E option enables extended regular expression syntax, i.e. what was traditionally called egrep (though that command name is now obsolescent).
If you need to count matches, the -o option prints each match on a separate line, which you can then pass to wc -l (or in lucky cases combine with grep -c, but check first; this doesn't work e.g. with GNU grep currently).
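One way to count matches rather than lines is to put every whitespace-separated token on its own line first, so that grep -c counts tokens. This is a different technique from grep -o, sketched here under the assumption that an "integer" is a run of digits with at most one trailing dot; the helper name count_integers is made up for illustration:

```shell
# Hypothetical helper: count tokens that are a run of digits with at most
# one trailing dot (so "2322." counts, but "66.88" and "55whereareyou?" do not).
count_integers() {
    # tr puts each whitespace-separated token on its own line;
    # grep -c then counts the lines (tokens) that match.
    tr -s '[:space:]' '\n' | grep -Ec '^[0-9]+\.?$'
}

count_integers <<'EOF'
My name is 2322.
33 is my sister.
I am blessed with a son named 55.
Why are you so 69. Is everything 33.
66.88 is not an integer
55whereareyou?
EOF
# prints 5
```

Because each token is tested separately, adjacent integers such as "123 456" are both counted, which a pattern that consumes the separating space can miss.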
On my Ubuntu machine this works fine:
grep -P '((^)|(\s+))[-+]?\d+\.?((\s+)|($))' test
I'd like to generate a lot of integers between 0 and 1 using bash.
I tried shuf, but the generation is very slow. Is there another way to generate the numbers?
This will output an infinite stream of bytes, written in binary and separated by spaces:
cat /dev/urandom | xxd -b | cut -d" " -f 2-7 | tr "\n" " "
As an example:
10100010 10001101 10101110 11111000 10011001 01111011 11001010 00011010 11101001 01111101 10100111 00111011 10100110 01010110 11101110 01000011 00101011 10111000 01010110 10011101 01000011 00000010 10100001 11000110 11101100 11001011 10011100 10010001 01000111 01000010 01001011 11001101 11000111 11110111 00101011 00111011 10110000 01110101 01001111 01101000 01100000 11011101 11111111 11110001 10001011 11100001 11100110 10101100 11011001 11010100 10011010 00010001 00111001 01011010 00100101 00100100 00000101 10101010 00001011 10101101 11000001 10001111 10010111 01000111 11011000 01111011 10010110 00111100 11010000 11110000 11111011 00000110 00011011 11110110 00011011 11000111 11101100 11111001 10000110 11011101 01000000 00010000 00111111 11111011 01001101 10001001 00000010 10010000 00000001 10010101 11001011 00001101 00101110 01010101 11110101 10111011 01011100 00110111 10001001 00100100 01111001 01101101 10011011 00100001 01101101 01001111 01101000 00100001 10100011 00011000 01000001 00100100 10001101 10110110 11111000 01110111 10110111 11001000 00101000 01101000 01001100 10000001 11011000 11101110 11001010 10001101 00010011^C
If you don't want spaces between bytes (thanks #Chris):
cat /dev/urandom | xxd -b | head | cut -d" " -f 2-7 | tr -d "\n "
1000110001000101011111000010011011011111111001000000011000000100111101000001110110011011000000001101111111011000000100101001001110110001111000010100100100010110110000100111111110111011111100101000011000010010111010010001001001111000010101000110010010011011110000000011100110000000100111010001110000000011001011010101111001
tr -dc '01' < /dev/urandom is a quick and dirty way to do this.
If you're on OSX, tr can work a little weird, so you can use perl instead: perl -pe 'tr/01//dc' < /dev/urandom
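Since tr -dc '01' < /dev/urandom never stops on its own, one way to get a fixed number of bits is to cut the stream with head -c. A minimal sketch, assuming a head that supports -c (GNU, BSD and macOS do); the function name random_bits is made up for illustration:

```shell
# Hypothetical helper: print exactly $1 random characters, each 0 or 1.
random_bits() {
    # LC_ALL=C makes tr treat the input as raw bytes, which avoids
    # "illegal byte sequence" errors on some systems (e.g. macOS).
    # head -c stops reading after $1 characters, which ends the pipeline.
    LC_ALL=C tr -dc '01' < /dev/urandom | head -c "$1"
    echo
}

random_bits 64
```

Only bytes that happen to be the characters 0 or 1 survive the filter, so most of the input is thrown away; that is usually acceptable for short outputs.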
Just for fun --
A native-bash function to print a specified number of random bits, extracted from the smallest possible number of evaluations of $RANDOM:
randbits() {
local x x_bits num_bits
num_bits=$1
while (( num_bits > 0 )); do
x=$RANDOM
x_bits="$(( x % 2 ))$(( x / 2 % 2 ))$(( x / 4 % 2 ))$(( x / 8 % 2 ))$(( x / 16 % 2 ))$(( x / 32 % 2 ))$(( x / 64 % 2 ))$(( x / 128 % 2 ))$(( x / 256 % 2 ))$(( x / 512 % 2 ))$(( x / 1024 % 2 ))$(( x / 2048 % 2 ))$(( x / 4096 % 2))$(( x / 8192 % 2 ))$(( x / 16384 % 2 ))"
if (( ${#x_bits} < $num_bits )); then
printf '%s' "$x_bits"
(( num_bits -= ${#x_bits} ))
else
printf '%s' "${x_bits:0:num_bits}"
break
fi
done
printf '\n'
}
Usage:
$ randbits 64
1011010001010011010110010110101010101010101011101100011101010010
Because this uses $RANDOM, its behavior can be made reproducible by assigning a seed value to $RANDOM before invoking it. This can be handy if you want to be able to reproduce bugs in software that uses "random" inputs.
Since the question asks for integers between 1 and 0, there is this extremely random and very fast method. A good one-liner for sure:
echo "0.$(printf $(date +'%N') | md5sum | tr -d '[:alpha:][:punct:]')"
This command will give you an output similar to this when thrown inside a for loop with 10 iterations:
0.97238535471032972041395
0.8642459339189067551494
0.18109959700829495487820
0.39135471514800072505703651
0.624084503017958530984255
0.41997456791539740171
0.689027289676627803
0.22698852059605560195614
0.037745437519184791498537
0.428629619193662260133
And if you need to print random strings of 1's and 0's, as others have assumed, you can make a slight change to the command like this:
printf $(date +'%N') | sha512sum | tr -d '[2-9][:alpha:][:punct:]'
Which will yield an output of random 0's and 1's similar to this when thrown into a for loop with 10 iterations:
011101001110
001110011011
0010100010111111
0000001101101001111011111111
1110101100
00010110100
1100101101110010
101100110101100
1100010100
0000111101100010001001
To my knowledge, and from what I have found online, this is the closest to true randomness we can get in bash. I have even made a game of dice (where the dice has 10 sides 0-9) to test the randomness, using this method for generating a single number from 0 to 9. Out of 100 dice throws, each side lands almost a perfect 10 times. Out of 1000 throws, each side hits around 890-1100 times. The variation of what side lands doesn't change much after 1000 throws. So you can be very sure that this method is highly ideal, at least for bash tools generating pseudo-random numbers, for the job.
And if you need just an absolute mind-blowingly ridiculous amount of randomness, the simple md5sum checksum command can be compounded upon itself many, many times and still be very fast. As an example:
printf $(date +'%N') | md5sum | md5sum | md5sum | tr -d '[:punct:][:space:]'
This will have a not-so-random number, obtained from printing the date command's nanosecond option, piped into md5sum. Then that md5 hash is piped into md5sum and then "that" hash is sent into md5sum for a last time. The output is a completely randomized hash that you can use tools like awk, sed, grep, and tr to control what you want printed.
Hope this helps.
while read line1
do
    while read line2
    do
        while read line3
        do
            echo "$line1, $line2, $line3" | awk -F , ' $1==$5 && $6==$11 && $10==$12 {print $1,",",$2,",",$3,",",$4,",",$6,",",$7,",",$8,",",$9,",",$10,",",$13,",",$14,",",$15}' >>out.txt
        done < grades.csv
    done < subjects.csv
done < students.csv
In this code I am merging three files line by line (a cross product), and if a merged line meets the condition "$1==$5 && $6==$11 && $10==$12", I print it to the output file.
Now my problem is that I want to keep a running total of the "$13" field values for each iteration that meets the condition.
How can I do this? Please help.
Here are the sample files.
grades.csv contains the lines:
1,ARCH,1,90,very good,80
1,ARCH,2,70,good,85
1,PLNG,1,89,very good,85
subjects.csv contains the lines:
1,ARCH,Computer Architecture,A,K.Gose
1,PLNG,Programming Languages,A,P.Yang
1,OS,Operating System,B,K.Gopalan
2,ARCH,Computer Architecture,A,K.Gose
students.csv contains lines:
1,pankaj,vestal,986-654-32
2,satadisha,binghamton,879-876-54
5,pankaj,vestal,986-654-32
6,pankaj,vestal,986-654-31
This is the expected output:
ARCH 1 pankaj vestal 986-654-32 Computer Architecture A K.Gose 1 1 90 very good 80
ARCH 1 pankaj vestal 986-654-32 Computer Architecture A K.Gose 1 2 70 good 85
ARCH 2 satadisha binghamton 879-876-54 Computer Architecture A K.Gose 1 1 90 very good 80
ARCH 2 satadisha binghamton 879-876-54 Computer Architecture A K.Gose 1 2 70 good 85
PLNG 1 pankaj vestal 986-654-32 Programming Languages A P.Yang 1 1 89 very good 85
Also I need the sum of (90+70+90+70+89) in another shell variable which can be written to a file.
Assuming you have joined the columns to form a TSV (tab-separated values) file or stream, and that columns $k1, $k2, and $k3 (in that file or stream) form the key, and that you want to sum column $s in the join, here is the awk command you can use to form a TSV listing of the keys and sum:
awk -F\\t -v k1="$k1" -v k2="$k2" -v k3="$k3" -v s="$s" '
BEGIN{t=OFS="\t"}
{ key=$k1 t $k2 t $k3; sum[key]+=$s }
END {for (key in sum) {print key, sum[key] } }'
(Using awk to process CSV files that might contain commas is asking for trouble, so I've illustrated how to use awk with tabs.)
You can use the join to create your expanded data and operate with awk on it.
$ join -t, -1 5 -2 2 <(join -t, -j 1 file3 file2 | sort -t, -k5,5) file1 | column -s, -t
ARCH 1 pankaj vestal 986-654-32 Computer Architecture A K.Gose 1 1 90 very good 80
ARCH 1 pankaj vestal 986-654-32 Computer Architecture A K.Gose 1 2 70 good 85
ARCH 2 satadisha binghamton 879-876-54 Computer Architecture A K.Gose 1 1 90 very good 80
ARCH 2 satadisha binghamton 879-876-54 Computer Architecture A K.Gose 1 2 70 good 85
PLNG 1 pankaj vestal 986-654-32 Programming Languages A P.Yang 1 1 89 very good 85
Alternatively, you can do the join in awk as well, eliminating the while loops.
To add up the values in $11:
$ join -t, -1 5 -2 2 <(join -t, -j 1 file3 file2 | sort -t, -k5,5) file1 | awk -F, '{sum+=$11} END{print sum}'
To assign the result to a shell variable:
$ sum=$(join ... )
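Putting the pieces together, here is a self-contained sketch using the sample data from the question. It assumes the files are already sorted on their join fields, as the samples are. It uses a pipe and "-" (standard input) instead of process substitution so that it also runs in a plain POSIX shell; the scratch directory and the sum.txt file name are only for illustration:

```shell
# Scratch directory so the sample files do not clobber anything.
workdir=$(mktemp -d)

cat > "$workdir/students.csv" <<'EOF'
1,pankaj,vestal,986-654-32
2,satadisha,binghamton,879-876-54
5,pankaj,vestal,986-654-32
6,pankaj,vestal,986-654-31
EOF

cat > "$workdir/subjects.csv" <<'EOF'
1,ARCH,Computer Architecture,A,K.Gose
1,PLNG,Programming Languages,A,P.Yang
1,OS,Operating System,B,K.Gopalan
2,ARCH,Computer Architecture,A,K.Gose
EOF

cat > "$workdir/grades.csv" <<'EOF'
1,ARCH,1,90,very good,80
1,ARCH,2,70,good,85
1,PLNG,1,89,very good,85
EOF

# Join students to subjects on field 1, re-sort on the subject code (field 5),
# join that to grades on the subject code, then total the marks in field 11.
# LC_ALL=C keeps the sort order and join's ordering check consistent.
sum=$(
    LC_ALL=C join -t, -1 1 -2 1 "$workdir/students.csv" "$workdir/subjects.csv" |
        LC_ALL=C sort -t, -k5,5 |
        LC_ALL=C join -t, -1 5 -2 2 - "$workdir/grades.csv" |
        awk -F, '{ sum += $11 } END { print sum + 0 }'
)

echo "$sum" > "$workdir/sum.txt"   # the total is now in $sum and in a file
echo "sum=$sum"                    # prints sum=409
rm -rf "$workdir"
```

The total 409 is 90+70+90+70+89, the five marks in the expected output.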