Swap or replace bytes in a binary file from command line - bash

There is already a beautiful trick in this thread
for writing bytes to a binary file at a desired address with dd. Is there any way to swap bytes (e.g. swap 0x00 and 0xFF), or to replace bytes, with common tools (such as dd)?

Would you please try the following:
xxd -p input_file | fold -w2 | perl -pe 's/00/ff/ || s/ff/00/' | xxd -r -p > output_file
xxd -p input_file dumps the binary file as a continuous (plain) hex dump.
fold -w2 wraps the input so that each line holds two characters (= one byte).
perl -pe 's/00/ff/ || s/ff/00/' swaps 00 and ff on each line.
The || works like an if .. else .. condition: the second substitution runs
only when the first did not match. Without it, an input 00 would be
converted to ff and then immediately converted back to 00.
xxd -r -p is the reverse of xxd -p: it converts the hexadecimal
strings back into binary.
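When the goal is a plain one-to-one byte swap, tr can apply the same mapping directly to the binary stream. This is a minimal sketch, not part of the answer above. It assumes a tr that passes NUL bytes through (GNU and BSD tr do), and the temporary files and sample bytes are made up for illustration.

```shell
# Hypothetical sample file holding the bytes 00 ff 01 00.
in_file=$(mktemp)
out_file=$(mktemp)
printf '\000\377\001\000' > "$in_file"

# Map 0x00 -> 0xff and 0xff -> 0x00 in a single pass over the stream.
# LC_ALL=C makes tr treat the input as raw bytes rather than characters.
LC_ALL=C tr '\000\377' '\377\000' < "$in_file" > "$out_file"

# Dump the result as hex to check it.
swapped_hex=$(od -An -tx1 "$out_file" | tr -d ' \n')
echo "$swapped_hex"   # prints ff0001ff
```

Because tr maps every byte exactly once, a swapped byte cannot be swapped back, which is the problem the || guards against in the perl version.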

Related

In Unix shell, how to convert from hex string to stdout bytes in machine-endian order

I'd like to run a command similar to:
# echo 00: 0123456789abcdef | xxd -r | od -tx1
0000000 01 23 45 67 89 ab cd ef
0000010
That is, I'd like to input a hex string and have it converted to bytes on stdout. However, I'd like it to respect byte order of the machine I'm on, which is little endian. Here's the proof:
# lscpu | grep Byte.Order
Byte Order: Little Endian
So, I'd like it to work as above if my machine was big-endian. But since it isn't, I'd like to see:
# <something different here> | od -tx1
0000000 ef cd ab 89 67 45 23 01
0000010
Now, xxd has a "-e" option for little-endianness. But 1) I want machine endianness, because I'd like something that works on big- or little-endian machines, and 2) "-e" isn't supported with "-r" anyway.
Thanks!
What about this —
$ echo 00: 0123456789abcdef | xxd -r | xxd -g 8 -e | xxd -r | od -tx1
0000000 ef cd ab 89 67 45 23 01
0000010
According to man xxd:
-e
Switch to little-endian hexdump. This option treats byte groups as words in little-endian byte order. The default grouping of 4 bytes may be changed using -g. This option only applies to hexdump, leaving the ASCII (or EBCDIC) representation unchanged. The command line switches -r, -p, -i do not work with this mode.
-g bytes | -groupsize bytes
Separate the output of every bytes bytes (two hex characters or eight bit-digits each) by a whitespace. Specify -g 0 to suppress grouping. Bytes defaults to 2 in normal mode, 4 in little-endian mode and 1 in bits mode. Grouping does not apply to postscript or include style.
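Note that xxd -e always produces little-endian words, whatever the host is. One possible way to follow the machine's byte order instead is to probe it with od and reverse the hex string only on little-endian hosts. This is a sketch under assumptions: the variable names are invented here, and the last line relies on xxd -r -p being installed.

```shell
# od reads the two bytes 01 00 as one native 16-bit integer:
# 1 on a little-endian machine, 256 on a big-endian one.
probe=$(printf '\001\000' | od -An -td2 | tr -d ' \n')
if [ "$probe" = 1 ]; then byte_order=little; else byte_order=big; fi

hex=0123456789abcdef
ordered_hex=$hex
if [ "$byte_order" = little ]; then
  # Reverse the string two hex digits (one byte) at a time.
  ordered_hex=
  rest=$hex
  while [ -n "$rest" ]; do
    pair=${rest%"${rest#??}"}      # first two characters of $rest
    ordered_hex=$pair$ordered_hex  # prepend, so the order is reversed
    rest=${rest#??}
  done
fi

# Convert the (possibly reversed) hex string to bytes and show them.
printf '%s' "$ordered_hex" | xxd -r -p | od -An -tx1
```

On a little-endian machine ordered_hex becomes efcdab8967452301; on a big-endian one it stays 0123456789abcdef.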

Modify a byte in a binary file using standard Linux command-line tools

I need to modify a byte in a binary file at a certain offset.
Example:
Input file: A.bin
Output file: B.bin
I need to read a byte at offset 0x40c from A.bin, clear the 2 least significant bits of this byte to 0, and then write a file B.bin that is equal to A.bin except for the calculated byte at offset 0x40c.
I can use Bash and standard commands like printf and dd.
I can easily write a byte into a binary file, but I don't know how to read it.
# Read one byte at offset 40C
b_hex=$(xxd -seek $((16#40C)) -l 1 -ps A.bin -)
# Clear the two least significant bits, as the question asks
b_dec=$(($((16#$b_hex)) & $((2#11111100))))
cp A.bin B.bin
# Write one byte back at offset 40C
printf "00040c: %02x" $b_dec | xxd -r - B.bin
It was tested in Bash and Z shell (zsh) on OS X and Linux.
The last line explained:
00040c: is the offset xxd should write to
%02x formats $b_dec (a decimal value) as two hexadecimal digits
xxd -r - B.bin: reverse hexadecimal dump (xxd -r) — take the byte number and the hexadecimal value from standard input (-) and write to B.bin
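Since the question mentions printf and dd, the same read-modify-write cycle can also be sketched without xxd, using od to read the byte and dd to write it back. This is an illustrative sketch only: the temporary files stand in for A.bin and B.bin, and the sample data (0x410 bytes of 0xff) is invented.

```shell
# Hypothetical stand-ins for A.bin and B.bin: 0x410 bytes, all 0xff.
src=$(mktemp)
dst=$(mktemp)
head -c $((0x410)) /dev/zero | LC_ALL=C tr '\000' '\377' > "$src"

offset=$((0x40c))

# Read one byte at the offset as an unsigned decimal number.
byte=$(od -An -tu1 -j "$offset" -N1 "$src" | tr -d ' \n')

# Clear the two least significant bits.
new=$(( byte & 0xfc ))

cp "$src" "$dst"

# Emit the new byte via an octal escape and overwrite in place;
# conv=notrunc keeps dd from truncating the rest of the file.
printf "\\$(printf '%03o' "$new")" |
  dd of="$dst" bs=1 seek="$offset" conv=notrunc 2>/dev/null

patched_hex=$(od -An -tx1 -j "$offset" -N1 "$dst" | tr -d ' \n')
echo "$patched_hex"   # prints fc
```

The octal escape is used because it is the form of byte escape that POSIX printf guarantees; \x escapes work in Bash but not in every shell.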

BASh - Convert list of hex values to binary file (application)

I have a file containing hex representations of code from a small program, and am trying to actually convert it into the program itself.
For example, here is a sample of such text, stored in a file, input.txt:
8d
00
a1
21
53
57
43
48
0e
00
bb
I am using the following Bash snippet to convert the file to a binary file:
rm outfile; while read h; do echo -n ${h}; echo -ne \\x${h} >> outfile; done < input.txt
After opening the output file in VIM:
¡!SWCH»
And then converting it to hex representation via xxd:
0000000: 8d00 a121 5357 4348 0e00 bb0a ...!SWCH....
This is all good, except for one thing: there is a trailing byte, 0a, at the end of my binary output file. This happens for every program file I work with. How is the trailing 0a being appended to every output binary file? It's not present in my input file.
Thank you.
Simply use xxd directly from Bash, like
xxd outfile > outfile.hex
and you will see that there isn't any 0a.
The 0a is appended somewhere when vim sends the buffer to the xxd command. If you want to convert inside vim, try
vim -b outfile
which opens outfile in binary mode.
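To confirm independently of vim that the generated file has no trailing 0a, the file can be built and dumped entirely from the shell. The sketch below is an assumption-laden variant of the loop in the question: it uses printf with octal escapes instead of echo -ne, a temporary directory invented for the example, and od for the dump.

```shell
# Hypothetical working directory holding input.txt and outfile.
workdir=$(mktemp -d)
printf '%s\n' 8d 00 a1 21 53 57 43 48 0e 00 bb > "$workdir/input.txt"

# Start from an empty output file, then append one byte per input line.
: > "$workdir/outfile"
while read -r h; do
  # Turn the hex pair into an octal escape and let printf emit the byte.
  printf "\\$(printf '%03o' "0x$h")" >> "$workdir/outfile"
done < "$workdir/input.txt"

# Dump every byte of the result; no 0a appears at the end.
dump=$(od -An -tx1 "$workdir/outfile" | tr -d ' \n')
echo "$dump"   # prints 8d00a121535743480e00bb
```

The dump shows 11 bytes for 11 input lines, so the extra 0a seen earlier comes from the editor round trip, not from the conversion loop.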

Large number base conversion

Is there any shell command to convert large numbers from one base to another
for ex:
Converting 1024-bit binary number into hexadecimal number
You could have a look at bc. E.g. convert 12 to binary (output base = 2):
echo 'obase=2;12' | bc
1100
And also dc.
And also printf. E.g.
printf "%x" 32
20
Or you can use Perl with bigint or bignum. E.g.
perl -e '$line="1101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111";$hex=unpack("H*",pack("B*",$line));print $hex'
Output:
deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
Also, you can use xxd -b to print bytes as binary digits:
echo -n $'\x02\x02' | xxd -b
0000000: 00000010 00000010
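For the specific case of binary to hexadecimal, the size of the number does not matter, because each group of four bits maps to exactly one hex digit. The following pure-shell sketch illustrates that idea; bin_to_hex is a hypothetical helper name introduced here, not a standard command.

```shell
# bin_to_hex: hypothetical helper that converts a string of 0s and 1s
# of any length to lowercase hexadecimal, four bits at a time.
bin_to_hex() {
  bits=$1
  hex=
  # Left-pad with zeros until the length is a multiple of four.
  while [ $(( ${#bits} % 4 )) -ne 0 ]; do
    bits=0$bits
  done
  while [ -n "$bits" ]; do
    nibble=${bits%"${bits#????}"}   # first four characters
    bits=${bits#????}
    case $nibble in
      0000) d=0 ;; 0001) d=1 ;; 0010) d=2 ;; 0011) d=3 ;;
      0100) d=4 ;; 0101) d=5 ;; 0110) d=6 ;; 0111) d=7 ;;
      1000) d=8 ;; 1001) d=9 ;; 1010) d=a ;; 1011) d=b ;;
      1100) d=c ;; 1101) d=d ;; 1110) d=e ;; 1111) d=f ;;
      *) echo "bin_to_hex: input must contain only 0 and 1" >&2
         return 1 ;;
    esac
    hex=$hex$d
  done
  printf '%s\n' "$hex"
}

bin_to_hex 11011110101011011011111011101111   # prints deadbeef
```

No arithmetic on the whole number is involved, so a 1024-bit input is handled the same way as a 32-bit one. Conversions between bases that are not powers of two still need an arbitrary-precision tool such as bc.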

Using grep to search for hex strings in a file

Does anyone know how to get grep, or similar tool, to retrieve offsets of hex strings in a file?
I have a bunch of hexdumps (from GDB) that I need to check for strings and then run again and check if the value has changed.
I have tried hexdump and dd, but the problem is because it's a stream, I lose my offset for the files.
Someone must have had this problem and a workaround. What can I do?
To clarify:
I have a series of dumped memory regions from GDB (typically several hundred MB)
I am trying to narrow down a number by searching for all the places the number is stored, then doing it again and checking if the new value is stored at the same memory location.
I cannot get grep to do anything because I am looking for hex values, so every time I have tried (like a bazillion times, roughly) it has not given me the correct output.
The hex dumps are just complete binary files; the patterns are float values at the largest, so about 8 bytes?
The patterns are not line-wrapping, as far as I am aware. I am aware of what the value changes to, and I can do the same process and compare the lists to see which match.
Perl COULD be an option, but at this point, I would assume my lack of knowledge of bash and its tools is the main culprit.
Desired output format
It's a little hard to explain the output I am getting since I really am not getting any output.
I am anticipating (and expecting) something along the lines of:
<offset>:<searched value>
Which is pretty much the standard output I would normally get with grep -URbFo <searchterm> . > <output>
What I tried:
A. The problem is that when I try to search for hex values, it is just not searching for the hex values. If I search for 00 I should get something like a million hits, because that's always the blank space, but instead it searches for 00 as text, which in hex is 3030.
Any ideas?
B. I CAN force it through hexdump or something of the like, but because it's a stream it will not give me the offsets and the filename in which it found a match.
C. Using the grep -b option doesn't seem to work either. I tried all the flags that seemed useful in my situation, and nothing worked.
D. Using xxd -u /usr/bin/xxd as an example, I get output that would be useful, but I cannot use it for searching:
0004760: 73CC 6446 161E 266A 3140 5E79 4D37 FDC6 s.dF..&j1#^yM7..
0004770: BF04 0E34 A44E 5BE7 229F 9EEF 5F4F DFFA ...4.N[."..._O..
0004780: FADE 0C01 0000 000C 0000 0000 0000 0000 ................
Nice output, just what I want to see, but it just doesn't work for me in this situation..
E. Here are some of the things I've tried since posting this:
xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 #.........S.....
root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 #.........S.....
This seems to work for me:
LANG=C grep --only-matching --byte-offset --binary --text --perl-regexp "<\x-hex pattern>" <file>
short form:
LANG=C grep -obUaP "<\x-hex pattern>" <file>
Example:
LANG=C grep -obUaP "\x01\x02" /bin/grep
Output (cygwin binary):
153: <\x01\x02>
33210: <\x01\x02>
53453: <\x01\x02>
So you can grep this again to extract offsets. But don't forget to use binary mode again.
Note: LANG=C is needed to avoid utf8 encoding issues.
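To keep only the offsets from that output, the text before the first colon can be cut out. This is a small sketch assuming GNU grep built with -P support; the sample file and its contents are invented so the offsets are known in advance.

```shell
# Hypothetical sample file: the pattern 01 02 occurs at byte offsets 2 and 6.
sample=$(mktemp)
printf 'ab\001\002cd\001\002' > "$sample"

# -o prints one line per match, -b prefixes each with its byte offset,
# -a treats the binary file as text, -P enables the \x escapes.
# cut then keeps only the offset field in front of the first colon.
offsets=$(LC_ALL=C grep -obUaP '\x01\x02' "$sample" | cut -d: -f1)
printf '%s\n' "$offsets"
```

For this sample the two lines printed are 2 and 6. Running the same command on a later dump and comparing the two offset lists (for example with comm) shows which locations still hold the value.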
There's also a pretty handy tool called binwalk, written in Python, which provides binary pattern matching (and quite a lot more besides). Here's how you would search for a binary string; the output gives the offset in decimal and hex (from the docs):
$ binwalk -R "\x00\x01\x02\x03\x04" firmware.bin
DECIMAL HEX DESCRIPTION
--------------------------------------------------------------------------
377654 0x5C336 Raw string signature
We tried several things before arriving at an acceptable solution:
xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 #.........S.....
root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 #.........S.....
Then we found we could get usable results with
xxd -u /usr/bin/xxd > /tmp/xxd.hex ; grep -H 'DF' /tmp/xxd.hex
Note that using a simple search target like 'DF' will incorrectly match characters that span across byte boundaries, i.e.
xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 #.........S.....
--------------------^^
So we use an ORed regexp to search for ' DF' OR 'DF ' (the searchTarget preceded or followed by a space char).
The final result seems to be
xxd -u DumpFile > DumpFile.hex
grep -E ' DF|DF ' DumpFile.hex
0001020: 0089 0424 8D95 D8F5 FFFF 89F0 E8DF F6FF ...$............
-----------------------------------------^^
0001220: 0C24 E871 0B00 0083 F8FF 89C3 0F84 DF03 .$.q............
--------------------------------------------^^
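Another way to rule out matches that straddle a byte boundary is to put one byte per line before searching, so that a line number maps directly to a byte offset. The sketch below uses od and awk rather than xxd; the sample bytes are invented so that "df" also appears across a boundary (0d ff) and must not be reported.

```shell
# Hypothetical sample: bytes 0d ff df 00 df.
# "0d ff" contains the text "df" only across a byte boundary.
sample=$(mktemp)
printf '\015\377\337\000\337' > "$sample"

# od prints the bytes as hex pairs (-v: do not collapse repeated lines),
# tr puts one pair per line, and awk prints the zero-based index of
# every line that is exactly "df".
df_offsets=$(od -An -v -tx1 "$sample" |
  tr -s ' ' '\n' |
  awk 'NF && $1 == "df" { print n + 0 } NF { n++ }')
printf '%s\n' "$df_offsets"
```

For this sample the offsets printed are 2 and 4; the false hit between offsets 0 and 1 is gone. Multi-byte patterns need more work with this layout, since the pattern then spans several lines.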
grep has a -P switch that allows Perl regexp syntax.
Perl regexes can match individual bytes using the \x.. syntax,
so you can look for a given hex string in a file with: grep -aP "\xdf"
But the output won't be very useful; it is indeed better to run a regexp on the hexdump output.
grep -P can still be useful, however, to just find the files matching a given binary pattern,
or to do a binary query for a pattern that actually occurs in text
(see for example How to regexp CJK ideographs (in utf-8)).
I just used this:
grep -c $'\x0c' filename
to search for and count a page control (form feed) character in the file. Note that grep -c counts matching lines, not individual occurrences.
So to include an offset in the output:
grep -b -o $'\x0c' filename | less
I am just piping the result to less because the character I am grepping for does not print well, and less displays the results cleanly.
Output example:
21:^L
23:^L
2005:^L
If you want to search for printable strings, you can use:
strings -ao filename | grep string
strings will output all printable strings from a binary with offsets, and grep will search within.
If you want to search for any binary string, here is your friend:
https://github.com/tmbinc/bgrep
