In the following code, INS=\x1b\x5b\x32\x7e, the hex value corresponds to the Insert key.
Is there a list of the hex values that correspond to each key press?
Bash uses ANSI-C quoting, so first note that \x is not part of the key code; it tells the shell that the following one or two characters are the hexadecimal digits of a byte value.
Let's look at \x1b\x5b\x32\x7e as an example:
\x1b is the ASCII escape code (ESC)
\x5b refers to the character [. Used right after the escape code (so ESC[), it forms a Control Sequence Introducer (CSI). A CSI tells the terminal to interpret the next parameters as part of a sequence.
This is used in the special cases of (not comprehensive): INS, DEL, Home (⇱), End, PgUp, PgDn and F6 to F12.
Why do we need a CSI for those special keys? Because those keys use a code of 4 to 6 hexadecimal digits, so it can't be expressed with a single \x00-style byte; it must be a sequence.
\x32 and \x7e together are the control sequence, which translates to 2~ and refers to the INS command. In the ANSI standard for ASCII terminals, the keys listed above are identified by a combination of a decimal code and the ~ (\x7e) character. In a CSI sequence, the action is defined by the last character, in this case ~, which corresponds to the "special keys" action.
Let's take another example to prove this: 17~ means F6, and is translated to \x31\x37\x7e using the ASCII table:
HEX Chr
--- ---
31 1
37 7
7E ~
So that looks good: adding the CSI gives \x1b\x5b\x31\x37\x7e, which is the valid code in bash for F6. See this gist, which lists most of the special key codes.
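The translation above can be sketched as a small shell helper. The name csi_to_hex is made up for this illustration, and the sketch only assumes that od, tr and sed are available; it prepends ESC to a sequence such as [17~ and prints the bytes in \x notation:

```shell
# Hypothetical helper: print ESC followed by "$1" as \x-escaped bytes.
csi_to_hex() {
    # \033 is ESC in octal; od dumps every byte as two hex digits
    hex=$(printf '\033%s' "$1" | od -An -v -tx1 | tr -d ' \n')
    # put \x in front of each pair of hex digits
    printf '%s\n' "$hex" | sed 's/../\\x&/g'
}

csi_to_hex '[17~'   # F6
csi_to_hex '[2~'    # INS
```

The printed strings can be compared with the hex values used in the question and in the example above.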
But how do we know that F6 should be translated to 17~? By looking at the ANSI code table, you will find the following:
176 7E ~ DECKEYS Sent by special function keys
[1~=FIND, [2~=INSERT, [3~=REMOVE, [4~=SELECT, [5~=PREV, [6~=NEXT
[11~=F1… [17~=F6…[34~=F20 ([23~=ESC,[24~=BS,[25~=LF,[28~=HELP,[29~=DO)
This isn't perfect, but from this you can establish the following table (this is theoretical and probably needs some validation):
+-----+------+-------------+-----------+-----------+-------------------+
| Key | Code | Hex         | Key       | Code      | Hex               |
+-----+------+-------------+-----------+-----------+-------------------+
| F6  | [17~ | 5B 31 37 7E | End       | [4~ or OF | 5B 34 7E or 4F 46 |
| F7  | [18~ | 5B 31 38 7E | Home (⇱)  | [1~ or OH | 5B 31 7E or 4F 48 |
| F8  | [19~ | 5B 31 39 7E | Page down | [6~       | 5B 36 7E          |
| F9  | [20~ | 5B 32 30 7E | Page up   | [5~       | 5B 35 7E          |
| F10 | [21~ | 5B 32 31 7E | Up        | [A        | 5B 41             |
| F11 | [23~ | 5B 32 33 7E | Down      | [B        | 5B 42             |
| F12 | [24~ | 5B 32 34 7E | Right     | [C        | 5B 43             |
| INS | [2~  | 5B 32 7E    | Left      | [D        | 5B 44             |
| DEL | [3~  | 5B 33 7E    |           |           |                   |
+-----+------+-------------+-----------+-----------+-------------------+
Which should be enough for most of your uses.
Please note that this answer is an attempt to explain the ANSI quoting for keys in bash; it might not be accurate, because documentation on this is scarce.
I use the dieharder tool with ASCII-format input files and the results are OK, but now it's time to use binary files instead. After I converted my random data to a BIN file as described below, no bias is seen in any of the tests. The documentation speaks of a raw binary input format when running on my Ubuntu machine, but what should this look like? My file content is as follows:
(UINT32 as bitstream in file)
0001110000111111000011101110111001111001010000000101110111111011111010011111001011111100100001 ...
The program is called as:
dieharder -g 201 -f <myFile.bin> -a
Some samples of my input values:
473894638
00011100001111110000111011101110
2034261499
01111001010000000101110111111011
3925015684
11101001111100101111110010000100
...
All p-values remain at 0.00000 when using that binary-format file.
I am curious how you wrote the .bin file. I guess you wrote the binary data as ASCII characters, but that is not the proper file_input_raw format that dieharder needs. You should write the file as bytes (binary), not ASCII. I hope this post helps; otherwise, please comment :)
I tested several files generated with MT19937 (Mersenne Twister) and worked out the proper input file format.
When you write a binary file for a dieharder test, keep the following two things in mind.
Remove the header (6 lines, from ####... to numbit: 32).
Convert the integers from ASCII to little-endian bytes.
Dieharder Test example
The data below comes from MT19937 (32-bit, not 64-bit) implemented in Go, with seed=0, generating 10,000,000 integers.
(Decimal) ASCII example
#==================================================================
# generator MT19937 seed = 0
#==================================================================
type: d
count: 10000000
numbit: 32
2357136044
2546248239
3071714933
3626093760
...
Binary file example
This is what I see when I open the file in the VS Code Hex Editor:
AC 0A 7F 8C 2F AA C4 97 75 A6 16 B7 C0 CC 21 D8
43 B3 4E 9A FB 52 A2 DB C3 76 7D 8B 67 7D E5 D8
09 A4 74 6C D3 DE A1 9F 15 51 59 A5 F2 D6 66 62
24 B7 05 70 57 3A 2B 4C 46 3C 4B E4 D8 BD 84 0E
58 9A B2 F6 8C CD CC 45 3A 39 29 62 C1 42 48 7A
E6 7D AE CA 27 4A EA CF 57 A8 65 87 AE C8 DF 7A
58 5E 6B 91 51 8B 8D 64 A5 E6 F3 EC 19 42 09 D6
...
The first value is 2357136044 = 0x8C7F0AAC, and you can see that the first 4 bytes are 'AC' '0A' '7F' '8C'. That shows two things: there is no header, and the byte order is little-endian.
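For readers who already have the decimal ASCII file, the same conversion can be sketched in plain shell. This is only an illustration: the function name ascii_to_le is made up, it assumes every data line holds exactly one unsigned 32-bit decimal number, and it treats every other line as header. It is far too slow for 10,000,000 values but shows the byte layout:

```shell
# Hypothetical helper: convert a dieharder ASCII file to raw little-endian bytes.
ascii_to_le() {
    # keep only lines that consist of a single decimal number (this drops the header)
    grep -E '^[[:space:]]*[0-9]+[[:space:]]*$' "$1" | while read -r n; do
        # build four octal escapes, least significant byte first,
        # then let the outer printf turn them into raw bytes
        printf "$(printf '\\%03o\\%03o\\%03o\\%03o' \
            "$((n & 255))" "$(((n >> 8) & 255))" \
            "$(((n >> 16) & 255))" "$(((n >> 24) & 255))")"
    done
}

# Usage (file names are placeholders):
#   ascii_to_le MT19937.dat > MT19937_LittleEndian.bin
```

Piping the result through od -An -tx1 should show the same byte order as the hex editor view above.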
Code in Golang
I know the code below may not help you directly. As far as I know, there is no official pure MT19937 generator in Go, so I ported the pseudocode from the wiki to Go (1.17.1) myself.
// Needs the imports bufio, encoding/binary, log and os.
littleEndianFile, err := os.Create("./MT19937_LittleEndian.bin")
if err != nil {
    log.Fatal(err)
}
defer littleEndianFile.Close()
littleEndianFileBuffer := bufio.NewWriter(littleEndianFile)
littleEndianByte := make([]byte, 4)
// NewMT19937 is my own port of the generator, seeded with 0.
test := NewMT19937(0)
newInt32 := test.NextUint32()
// Encode the 32-bit value as 4 bytes, least significant byte first.
binary.LittleEndian.PutUint32(littleEndianByte, newInt32)
for _, eachByte := range littleEndianByte {
    littleEndianFileBuffer.WriteByte(eachByte)
}
// Flush the buffer so the bytes actually reach the file.
littleEndianFileBuffer.Flush()
Result - (Decimal) ASCII example
> dieharder -a -g 202 -f ./generated/MT19937_10000000.dat
#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#
rng_name | filename |rands/second|
file_input|./generated/MT19937_10000000.dat| 7.79e+06 |
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
diehard_birthdays| 0| 100| 100|0.63638992| PASSED
diehard_operm5| 0| 1000000| 100|0.00012670| WEAK
diehard_rank_32x32| 0| 40000| 100|0.93085433| PASSED
diehard_rank_6x8| 0| 100000| 100|0.07088597| PASSED
diehard_bitstream| 0| 2097152| 100|0.10456387| PASSED
Result - Little-Endian example
> dieharder -a -g 201 -f ./generated/MT19937_10000000_LittleEndian.bin
#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#
rng_name | filename |rands/second|
file_input_raw|./generated/MT19937_10000000_LittleEndian.bin| 5.60e+07 |
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
diehard_birthdays| 0| 100| 100|0.63638992| PASSED
diehard_operm5| 0| 1000000| 100|0.00012670| WEAK
diehard_rank_32x32| 0| 40000| 100|0.93085433| PASSED
diehard_rank_6x8| 0| 100000| 100|0.07088597| PASSED
diehard_bitstream| 0| 2097152| 100|0.10456387| PASSED
You can see that the two tests above (decimal ASCII and little-endian) give the same results (p-values).
Result - Big-Endian example
> dieharder -a -g 201 -f ./generated/MT19937_10000000_BigEndian.bin
#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#
rng_name | filename |rands/second|
file_input_raw|./generated/MT19937_10000000_BigEndian.bin| 5.65e+07 |
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
diehard_birthdays| 0| 100| 100|0.46325487| PASSED
diehard_operm5| 0| 1000000| 100|0.00000093| FAILED
diehard_rank_32x32| 0| 40000| 100|0.93085433| PASSED
diehard_rank_6x8| 0| 100000| 100|0.27138035| PASSED
diehard_bitstream| 0| 2097152| 100|0.75581067| PASSED
diehard_opso| 0| 2097152| 100|0.25961325| PASSED
diehard_oqso| 0| 2097152| 100|0.00025268| WEAK
However, you can see that some p-values differ between the runs above and the big-endian file. That shows that a proper dieharder input file should be written as little-endian binary.
Conclusion and Comments
I am afraid that you wrote the binary data as ASCII characters. If you can read the data in a normal text editor such as Windows Notepad, it was written as ASCII characters and is not a proper input file. You have to write it as little-endian binary instead. This post and its test results show that little-endian is right and that file_input_raw does not need a header.
I am not sure whether little-endian versus big-endian makes a difference when analyzing test results. In NIST SP800-22, the statistical randomness tests are of the kind "count the number of 0s or 1s" or "check whether there is a pattern such as '0101', '001100', etc." I think there is no difference at the "truth level", that is, in whether the generator is random or not.
Still, I recommend writing the binary data as little-endian, because we don't know whether the test's author had a profound reason for it. We just follow the proper directions for use. :)
I have a hex file, and I need to extract a range of it to a text file.
From range:
To Range:
I need this output: AC:E4:B5:9A:53:1C
I tried many things, but none of them really met the requirements. Output: Binary file filehex matches
grep "["'\x9f\x87\x6f\x11'"-"'\x9f\x87\x70\x11'"]" filehex > test.txt
I hope someone can help me.
Use -a to force the text interpretation of the input.
Use -o to only output the matching part.
The expression you used doesn't make much sense. It matches any character in the set \x9f, \x87, \x6f, and then the range \x11-\x9f, etc.
You are rather interested in something that starts with \x9f\x87\x6f\x11 and ends with \x9f\x87\x70\x11, with anything in between.
You can use cut to remove the leading and trailing 4 bytes.
grep -oa $'\x9f\x87\x6f\x11.*\x9f\x87\x70\x11' hexfile | cut -b5-21
If you know the length of the string will always be 17 bytes, you can use .\{17\} instead of .*.
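As a sketch of that fixed-length variant, here is a version written with printf octal escapes so it does not depend on $'...' quoting. The function name extract_between is made up for this example, and it assumes a grep that supports -o and -a (GNU and BSD grep do); LC_ALL=C is set so that each byte is treated individually:

```shell
# Markers as raw bytes: \237\207\157\021 is 9f 87 6f 11, \237\207\160\021 is 9f 87 70 11
start=$(printf '\237\207\157\021')
end=$(printf '\237\207\160\021')

# Hypothetical helper: print the 17 bytes found between the two markers in file "$1".
extract_between() {
    # match marker + exactly 17 bytes + marker, then cut away the 4-byte markers
    LC_ALL=C grep -oa "${start}.\{17\}${end}" "$1" | cut -b5-21
}

# Usage: extract_between hexfile > test.txt
```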
OK, I've randomly built a binary $file with your string at a location that makes the hd command split it across two lines.
Note: regarding k314159's comment, I use hd to produce hexdump output similar to CentOS's hexdump tool.
One shot using sed:
hd $file |sed -e 'N;/ 9f \+\(|.*\n[0-9a-f]\+ \+\|\)87 \+\(|.*\n[0-9a-f]\+ \+\|\)6f \+\(|.*\n[0-9a-f]\+ \+\|\)11 /p;D;'
000161c0 96 7a b2 21 28 f1 b3 32 63 43 93 ff 50 a6 9f 87 |.z.!(..2cC..P...|
000161d0 6f 11 0d 7a a5 a9 81 9e 32 9d fb 71 27 6d 60 f2 |o..z....2..q'm`.|
0002c3a0
Explanation:
N merges the next line into the current buffer.
\(|.*\n[0-9a-f]\+ \+\|\) matches either a | followed by anything and a newline (\n), then immediately a hexadecimal number and spaces, or nothing.
p prints the current buffer (two lines).
D deletes up to the newline in the current buffer, keeping the last line for the next sed loop.
The last hexadecimal value, 0002c3a0, corresponds to the size of my binary $file:
printf "%x\n" $(stat -c %s $file)
Using bash + grep:
printf -v var "\x9f\x87\x6f\x11"
IFS=: read -r offset _ < <(grep -abo "$var" $file)
hd $file | sed -ne "$((offset/16-1)),+4p"
000161a0 b7 8f 4a 4d ed 89 6c 0b 25 f9 e7 c9 8c 99 6e 23 |..JM..l.%.....n#|
000161b0 3c ba 80 ec 2e 32 dd f3 a4 a2 09 bd 74 bf 66 11 |<....2......t.f.|
000161c0 96 7a b2 21 28 f1 b3 32 63 43 93 ff 50 a6 9f 87 |.z.!(..2cC..P...|
000161d0 6f 11 0d 7a a5 a9 81 9e 32 9d fb 71 27 6d 60 f2 |o..z....2..q'm`.|
000161e0 15 86 c2 bd 11 d0 08 90 c4 84 b9 80 04 4e 17 f1 |.............N..|
Where you could read your string:
000161c0 9f 87 | ..|
000161d0 6f 11 |o. |
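Once the offset is known, the bytes that follow the marker can also be read directly instead of going through the hex dump. The sketch below is only an illustration: the function names are made up, it assumes grep supports -a, -b and -o, and it assumes tail -c +N and head -c are available:

```shell
# Hypothetical helper: byte offset (0-based) of the first 9f 87 6f 11 marker in "$1".
marker_offset() {
    LC_ALL=C grep -abo "$(printf '\237\207\157\021')" "$1" | head -n 1 | cut -d: -f1
}

# Hypothetical helper: print "$2" bytes that follow the marker in file "$1".
payload_after_marker() {
    off=$(marker_offset "$1")
    [ -n "$off" ] || return 1
    # tail -c +N is 1-based, and the marker itself is 4 bytes long
    tail -c +"$((off + 5))" "$1" | head -c "$2"
}

# Usage: payload_after_marker "$file" 17
```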
For testing, I've built my test file by:
dd if=/vmlinuz bs=90574 count=1 of=/tmp/testfile
printf '\x9f\x87\x6f\x11' >>/tmp/testfile
dd if=/vmlinuz bs=90574 count=1 >>/tmp/testfile
file=/tmp/testfile
Use grep to search the original binary file, not the hex dump. Extending choroba's answer, I think you may have problems with grep trying to interpret your search pattern as UTF-8 or some other encoding. You should temporarily set the environment variable LC_ALL=C so that grep treats each byte individually. Also, you can use the -P option to enable lookbehind and lookahead in your pattern. So your command becomes:
LC_ALL=C grep -oaP $'(?<=\x9f\x87\x6f\x11).*(?=\x9f\x87\x70\x11)' binary-file > test.txt
Proof that it works:
$ echo $'BEFORE\x9f\x87\x6f\x11AC:E4:B5:9A:53:1C\x9f\x87\x70\x11AFTER' | LC_ALL=C grep -oaP $'(?<=\x9f\x87\x6f\x11).*(?=\x9f\x87\x70\x11)'
AC:E4:B5:9A:53:1C
$
Grep doesn't seem to match certain strings from man output. It seems to be random in that I can't work out any rhyme or reason as to whether a string will match or not.
man sed | head -7:
SED(1) BSD General Commands Manual SED(1)
NAME
sed -- stream editor
SYNOPSIS
$ man sed | head -7 | grep sed # no match
$ man sed | head -7 | grep stream # match on "stream"
sed -- stream editor
$ man sed | head -7 | grep '\-\-' # match on "--"
sed -- stream editor
$ man sed | head -7 | grep NAME # no match
$ man sed | head -7 | grep SYNOPSIS # no match
This also happens when redirecting the output to a file and grepping that
$ man sed | head -7 > /tmp/sed.man
$ cat /tmp/sed.man | grep sed # no match
$ cat /tmp/sed.man | grep stream # match on "stream"
sed -- stream editor
$ grep sed /tmp/sed.man # no match
$ grep stream /tmp/sed.man # match on "stream"
sed -- stream editor
grep: grep (BSD grep) 2.5.1-FreeBSD
man: version 1.6c
macOS: 10.14.6 Beta
bash: GNU bash, version 5.0.7(1)-release (x86_64-apple-darwin18.5.0)
$ man sed | head -7 | hexdump -C
00000000 0a 53 45 44 28 31 29 20 20 20 20 20 20 20 20 20 |.SED(1) |
00000010 20 20 20 20 20 20 20 20 20 20 20 42 53 44 20 47 | BSD G|
00000020 65 6e 65 72 61 6c 20 43 6f 6d 6d 61 6e 64 73 20 |eneral Commands |
00000030 4d 61 6e 75 61 6c 20 20 20 20 20 20 20 20 20 20 |Manual |
00000040 20 20 20 20 20 20 20 20 20 53 45 44 28 31 29 0a | SED(1).|
00000050 0a 4e 08 4e 41 08 41 4d 08 4d 45 08 45 0a 20 20 |.N.NA.AM.ME.E. |
00000060 20 20 20 73 08 73 65 08 65 64 08 64 20 2d 2d 20 | s.se.ed.d -- |
00000070 73 74 72 65 61 6d 20 65 64 69 74 6f 72 0a 0a 53 |stream editor..S|
00000080 08 53 59 08 59 4e 08 4e 4f 08 4f 50 08 50 53 08 |.SY.YN.NO.OP.PS.|
00000090 53 49 08 49 53 08 53 0a |SI.IS.S.|
00000098
Googling is hard for this problem as any combination of "man" or "grep" doesn't mention my problem that strings (with no special characters) are not matching.
Man pages use the roff format (https://man.openbsd.org/roff). Do the following:
man sed > sed.man
vi sed.man
and you will see:
SED(1) BSD General Commands Manual SED(1)
N^HNA^HAM^HME^HE
s^Hse^Hed^Hd -- stream editor
To convert a man page to text without the ^H characters, have a look at http://www.schweikhardt.net/man_page_howto.html#q10
Create a Perl script called strip-headers with the following content:
#!/usr/bin/perl -wn
# make it slurp the whole file at once:
undef $/;
# delete first header:
s/^\n*.*\n+//;
# delete last footer:
s/\n+.*\n+$/\n/g;
# delete page breaks:
s/\n\n+[^ \t].*\n\n+(\S+).*\1\n\n+/\n/g;
# collapse two or more blank lines into a single one:
s/\n{3,}/\n\n/g;
# see what is left...
print;
Make the Perl script executable with chmod 750 strip-headers and run it with:
man sed | ./strip-headers | col -bx > sed.man
or
man sed | ./strip-headers | col -bx | head -7 | grep sed
macOS man doesn't support the --ascii flag, so I used col -bx to strip the annoying formatting from man for piping into other commands.
man sed | col -bx | grep SYNOPSIS
col -b: Do not output any backspaces, printing only the last character written to each column position.
col -x: Output multiple spaces instead of tabs.
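To see why the plain grep failed, the overstrike sequences from the hexdump can be reproduced by hand. The sketch below is an illustration only: strip_overstrike is a made-up name, and it is a rough stand-in for col -b that simply deletes every "character followed by backspace" pair:

```shell
# Hypothetical helper: remove "X<backspace>" pairs, roughly what col -b achieves.
strip_overstrike() {
    bs=$(printf '\b')      # a literal backspace byte (0x08)
    sed "s/.${bs}//g"
}

# "s\bse\bed\bd" is how the bold word "sed" arrives from man
printf 's\bse\bed\bd -- stream editor\n' | grep sed || echo 'no match before stripping'
printf 's\bse\bed\bd -- stream editor\n' | strip_overstrike | grep sed
```

The first grep finds nothing because the bytes between the letters break up the word; after stripping, the line matches.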
Notes:
I've read that man is meant to detect whether you're piping to another command or into a file, etc., but that was not my experience, at least with man 1.6c, the default on macOS.
Solution using col: https://unix.stackexchange.com/a/15866
Thanks #Cyrus - I didn't know about hexdump
Thanks #Oliver Gaida - I didn't know cat and vi would display it differently
So I'm trying to get a list of all the directories I'm currently running a program in, so I can keep track of the numerous jobs I have running at the moment.
When I run the commands individually, they all seem to work, but when I chain them together, something goes wrong... (ll is just the regular ls -l alias)
for pid in `top -n 1 -u will | grep -iP "(programs|to|match)" | awk '{print $1}'`;
do
ll /proc/$pid/fd | head -n 2 | tail -n 1;
done
Why is it that when I have the ll /proc/31353/fd inside the for loop, it cannot access the file, but when I use it normally it works fine?
And piped through hexdump -C:
$ top -n 1 -u will |
grep -iP "(scatci|congen|denprop|swmol3|sword|swedmos|swtrmo)" |
awk '{print $1}' | hexdump -C
00000000 1b 28 42 1b 5b 6d 1b 28 42 1b 5b 6d 32 31 33 35 |.(B.[m.(B.[m2135|
00000010 33 0a 1b 28 42 1b 5b 6d 1b 28 42 1b 5b 6d 32 39 |3..(B.[m.(B.[m29|
00000020 33 33 31 0a 1b 28 42 1b 5b 6d 1b 28 42 1b 5b 6d |331..(B.[m.(B.[m|
00000030 33 30 39 39 36 0a 1b 28 42 1b 5b 6d 1b 28 42 1b |30996..(B.[m.(B.|
00000040 5b 6d 32 36 37 31 38 0a |[m26718.|
00000048
chepner had the right hunch. The output of top is designed for humans, not for parsing. The hexdump shows that top is producing some terminal escape sequences. These escape sequences are part of the first field of the line, so the resulting file name is something like /proc/\e(B\e[m\e(B\e[m21353/fd instead of /proc/21353/fd, where \e is an escape character.
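A cheap way to catch this kind of problem is to check that a value really is a bare number before building a /proc path from it. The helper below is a made-up illustration (is_pid is not a standard command), shown with the exact bytes from the hexdump above:

```shell
# Hypothetical helper: succeed only if "$1" is non-empty and consists of digits only.
is_pid() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# The field as produced by top: escape sequences followed by the digits
field=$(printf '\033(B\033[m\033(B\033[m21353')
is_pid "$field" || echo "not a plain PID: escape bytes are hidden in the field"
is_pid 21353 && echo "21353 is usable"
```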
Use ps, pgrep or pidof instead. Under Linux, you can use the -C option to ps to match an exact program name (repeat the option to allow multiple names). Use the -o option to control the display format.
for pid in $(ps -o pid= -C scatci -C congen -C denprop -C swmol3 -C sword -C swedmos -C swtrmo); do
ls -l /proc/$pid/fd | head -n 2 | tail -n 1
done
If you want to sort by decreasing CPU usage:
for pid in $(ps -o %cpu=,pid= \
-C scatci -C congen -C denprop -C swmol3 -C sword -C swedmos -C swtrmo |
sort -k 1gr |
awk '{print $2}'); do
ls -l /proc/$pid/fd | head -n 2 | tail -n 1
done
Additionally, use dollar-parenthesis instead of backticks for command substitution. Quotes inside backticks behave somewhat bizarrely, and it's easy to make a mistake there. Quoting inside dollar-parenthesis is intuitive.
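The difference is easiest to see when one substitution is nested in another. In this small sketch (the function names are made up), both versions print the name of the parent directory of a path, but the backtick form needs the inner backticks escaped:

```shell
# Backticks: the inner pair must be escaped with backslashes
parent_name_backticks() {
    basename `dirname \`printf '%s' "$1"\``
}

# Dollar-parenthesis: nests and quotes without any escaping
parent_name_dollar() {
    basename "$(dirname "$(printf '%s' "$1")")"
}

parent_name_backticks /a/b/c
parent_name_dollar /a/b/c
```

Both calls print b here; the second form also stays correct when the path contains spaces, because each level can be quoted on its own.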
Try using "cut" instead of "awk", something like this:
for pid in `top -n 1 -u will | grep -iP "(scatci|congen|denprop|swmol3|sword|swedmos|swtrmo)" | sed 's/ / /g' | cut -d ' ' -f2`; do echo /proc/$pid/fd | head -n 2 | tail -n 1; done
Can someone tell me why sed won't remove my NULs?
This is on OS X.
$ printf '123\x00456' | sed 's/\x00/Z/g' | hexdump
0000000 31 32 33 00 34 35 36 0a
This doesn't work either:
$ printf '123'$(echo "\000")'456' | sed 's/'$(echo "\000")'/Z/g' | hexdump
0000000 31 32 33 00 34 35 36 0a
For deleting a single character or translating a single character to a single other character (not including multibyte characters), tr can do it, and unlike sed it supports all characters, including NULs, in all versions of unix since the beginning.
For translating:
tr '\0' Z
And for deleting:
tr -d '\0'
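Applied to the input from the question, the two commands can be wrapped as follows (the function names are only for illustration):

```shell
# Replace every NUL byte with the letter Z
nul_to_z() { tr '\0' Z; }

# Delete every NUL byte
strip_nul() { tr -d '\0'; }

printf '123\000456' | nul_to_z | od -c
printf '123\000456' | strip_nul | od -c
```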