Why does hexdump reverse the input when given "\xFF\x00"?

Why does hexdump print 00ff here? I expected it to print ff00, matching the bytes it received on stdin, but:
$ printf "\xFF\x00" | hexdump
0000000 00ff
0000002
hexdump decided to reverse it? Why?

This is because hexdump dumps 16-bit words (two-byte hex) by default, and x86 processors store words in little-endian format (you're probably using such a processor).
From Wikipedia, Endianness:
A big-endian system stores the most significant byte of a word at the
smallest memory address and the least significant byte at the largest.
A little-endian system, in contrast, stores the least-significant byte
at the smallest address.
Notice that, when you use hexdump without specifying a format, the output is similar to that of -x.
From hexdump, man page:
If no format strings are specified, the default
display is very similar to the -x output format (the
-x option causes more space to be used between format
units than in the default output).
...
-x, --two-bytes-hex
Two-byte hexadecimal display. Display the input offset in
hexadecimal, followed by eight space-separated, four-column,
zero-filled, two-byte quantities of input data, in
hexadecimal, per line.
If you want to dump single bytes in order, use the -C option or specify a custom format with -e.
$ printf "\xFF\x00" | hexdump -C
00000000 ff 00 |?.|
00000002
$ printf "\xFF\x00" | hexdump -e '"%07.7_ax " 8/1 "%02x " "\n"'
0000000 ff 00
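The byte-order effect described above can also be reproduced with od, used here as a stand-in for hexdump because it is specified by POSIX (that choice is an assumption about what is installed, not part of the original answer). Octal escapes replace \xFF\x00 because \x is not supported by every printf implementation. A minimal sketch:

```shell
# Byte view: one-byte units are printed in input order on any host.
bytes=$(printf '\377\000' | od -An -tx1 | tr -d ' \n')
echo "bytes: $bytes"

# Word view: two-byte units are read in the host's byte order,
# so a little-endian machine shows the second input byte first.
word=$(printf '\377\000' | od -An -tx2 | tr -d ' \n')
echo "word:  $word"
```

On a little-endian host the two lines should differ, and on a big-endian host they should match.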

From the hexdump(1) man page:
If no format strings are specified, the default display is very similar to the -x output format
-x, --two-bytes-hex
Two-byte hexadecimal display. Display the input offset in hexadecimal, followed by eight space-separated, four-column, zero-filled, two-byte quantities of input data, in hexadecimal, per line.
On a little-endian host, the first input byte (here 0xFF) is the least significant byte of each two-byte word, so it is listed last.

Related

Swap or replace bytes in a binary file from command line

There is already a beautiful trick in this thread for writing bytes to a binary file at a desired address with dd. Is there any way to swap bytes (e.g. swap 0x00 and 0xFF), or replace bytes, with common tools (such as dd)?
Would you please try the following:
xxd -p input_file | fold -w2 | perl -pe 's/00/ff/ || s/ff/00/' | xxd -r -p > output_file
xxd -p file dumps the binary file as a continuous (plain) hex dump.
fold -w2 wraps the input so that each line holds two characters (= one byte).
perl -pe 's/00/ff/ || s/ff/00/' swaps 00 and ff on each line.
The || works like an if .. else .. condition: the second substitution runs
only when the first one did not match. Without it, an input of 00 would be
converted to ff and then immediately converted back to 00.
xxd -r -p is the reverse of xxd -p: it converts the hexadecimal strings
back into binary data.
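As an alternative to the xxd pipeline above, the same swap can be sketched with tr, which translates single bytes directly. This is a swapped-in technique, not the answer's method. The function name swap_00_ff is hypothetical, and the sketch assumes a tr that passes NUL bytes through (GNU tr does).

```shell
# Swap every 0x00 byte with 0xFF and vice versa, reading stdin and writing stdout.
# LC_ALL=C makes tr operate on raw bytes rather than multibyte characters.
swap_00_ff() {
    LC_ALL=C tr '\000\377' '\377\000'
}

# Demo: the four bytes 00 01 ff 00 become ff 01 00 ff.
result=$(printf '\000\001\377\000' | swap_00_ff | od -An -tx1 | tr -d ' \n')
echo "$result"
```

For files, this would be used as swap_00_ff < input_file > output_file.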

How to convert hex to ASCII while preserving non-printable characters

I've been experiencing some weird issues today while debugging, and I've managed to trace this to something I overlooked at first.
Take a look at the outputs of these two commands:
root#test:~# printf '%X' 10 | xxd -r -p | xxd -p
root#test:~# printf '%X' 43 | xxd -r -p | xxd -p
2b
root#test:~#
The first xxd command converts hex to ASCII. The second converts ASCII back to hex. (43 decimal = 2b hex).
Unfortunately, it seems that converting hex to ASCII does not preserve non-printable characters. For example, the raw hex "A" (10 decimal = A hex), somehow gets eaten up by xxd -r -p. Thus, when I perform the inverse operation, I get an empty result.
What I am trying to do is feed some data into minimodem. I need to generate Call Waiting Caller ID (FSK), effectively via bit banging. My bash script has the right bits, but if I do a hexdump, the non-printable characters are missing. Unfortunately, it seems that minimodem only accepts ASCII characters, and I need to feed it raw hex, but that seems to get eaten up in the conversion. Is it possible to preserve these characters somehow? I don't see any option for it, so I'm wondering if there's a better way.
xxd expects two hex characters per byte, so a single A is invalid input. Do:
printf '%02X' 10 | xxd -r -p | xxd -p
"How to convert hex to ASCII while preserving non-printable characters"
Use xxd. If your input has a single character, pad it with a leading 0.
"ASCII does not preserve non-printable characters"
It does preserve every byte; xxd is the common tool for working with binary data in the shell.
"Is it possible to preserve these characters somehow?"
Yes: feed xxd two hex characters per byte.
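When the hex string comes from somewhere other than printf '%02X', the padding step can be sketched as a small helper. The name pad_hex is hypothetical, and the sketch assumes the input contains only hex digits.

```shell
# Prepend a 0 when the hex string has an odd number of digits,
# so that every byte is represented by exactly two characters.
pad_hex() {
    if [ $(( ${#1} % 2 )) -eq 1 ]; then
        printf '0%s' "$1"
    else
        printf '%s' "$1"
    fi
}

pad_hex A; echo     # odd length: gets a leading 0
pad_hex 2B; echo    # even length: unchanged
```

With this in place, pad_hex "$(printf '%X' 10)" | xxd -r -p | xxd -p should print 0a instead of an empty line.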

Modify a byte in a binary file using standard Linux command-line tools

I need to modify a byte in a binary file at a certain offset.
Example:
Input file: A.bin
Output file: B.bin
I need to read the byte at offset 0x40c from A.bin, clear its two least significant bits to 0, and then write a file B.bin that is identical to A.bin except for the calculated byte at offset 0x40c.
I can use Bash and standard commands like printf and dd.
I can easily write a byte into a binary file, but I don't know how to read it.
Modify a byte in a binary file using standard Linux command line tools.
# Read one byte at offset 40C
b_hex=$(xxd -seek $((16#40C)) -l 1 -ps A.bin -)
# Clear the two least significant bits
b_dec=$(($((16#$b_hex)) & $((2#11111100))))
cp A.bin B.bin
# Write one byte back at offset 40C
printf "00040c: %02x" $b_dec | xxd -r - B.bin
It was tested in Bash and Z shell (zsh) on OS X and Linux.
The last line explained:
00040c: is the offset xxd should write to
%02x converts $b_dec from decimal to two-digit hexadecimal
xxd -r - B.bin: reverse hexadecimal dump (xxd -r) — take the byte number and the hexadecimal value from standard input (-) and write to B.bin
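Since the question mentions printf and dd, the same read-modify-write can also be sketched without xxd. This is an alternative under stated assumptions, not the answer's method. The function name clear_low_bits is hypothetical, the offset must be decimal, and it must lie inside the file.

```shell
# Copy $1 to $2, clearing the two least significant bits of the byte at offset $3.
clear_low_bits() {
    src=$1 dst=$2 offset=$3
    # Read one byte at the offset and print it as an unsigned decimal number.
    byte=$(dd if="$src" bs=1 skip="$offset" count=1 2>/dev/null | od -An -tu1 | tr -d ' \n')
    new=$(( byte & 252 ))    # 252 = 0xFC = binary 11111100
    cp "$src" "$dst"
    # Emit the new byte through an octal escape and overwrite it in place;
    # conv=notrunc keeps the rest of the file intact.
    printf "\\$(printf '%03o' "$new")" |
        dd of="$dst" bs=1 seek="$offset" count=1 conv=notrunc 2>/dev/null
}
```

For the files in the question, the call would be clear_low_bits A.bin B.bin $((0x40c)).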

File size in UTF-8 encoding?

I have created a file with UTF-8 encoding, but I don't understand the rules for the size it takes up on disk. Here is my complete research:
First I created the file with a single Hindi letter 'क', and the file size on Windows 7 was 8 bytes.
Then with two letters, 'कक', the file size was 11 bytes.
Then with three letters, 'ककक', the file size was 14 bytes.
Can someone please explain to me why it shows these sizes?
The first three bytes are used for the BOM (Byte Order Mark) EF BB BF.
Then, the bytes E0 A4 95 encode the letter क.
Then the bytes 0D 0A encode a line break (carriage return followed by line feed).
Total: 8 bytes. For each letter क you add, you need three more bytes.
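The arithmetic above (3 bytes of BOM, 3 bytes per letter, 2 bytes of line break) can be checked with a short sketch that emits exactly those bytes and counts them. The function name utf8_file_size is hypothetical, and the bytes are written as octal escapes so the script does not depend on its own encoding.

```shell
# Print the size in bytes of a file holding a BOM, $1 copies of the letter क, and CR LF.
utf8_file_size() {
    n=$1
    {
        printf '\357\273\277'          # BOM: EF BB BF
        i=0
        while [ "$i" -lt "$n" ]; do
            printf '\340\244\225'      # क: E0 A4 95
            i=$((i + 1))
        done
        printf '\r\n'                  # line break: 0D 0A
    } | wc -c | tr -d ' '
}

for n in 1 2 3; do
    echo "$n letter(s): $(utf8_file_size "$n") bytes"
done
```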
On Linux-based systems, you can use hexdump to get a hexadecimal dump (as used by Tim in his answer) and see how many bytes a character occupies.
echo -n a | hexdump -C
echo -n क | hexdump -C
Here's the output of the above two commands:
00000000 61 |a|
00000001
00000000 e0 a4 95 |...|
00000003

Large number base conversion

Is there a shell command to convert large numbers from one base to another?
For example:
Converting a 1024-bit binary number into a hexadecimal number.
You could have a look at bc, here. E.g. to convert 12 to binary (output base = 2):
echo 'obase=2;12' | bc
1100
And also dc, here.
And also printf. E.g.
printf "%x" 32
20
Or you can use Perl with bigint or bignum. See here. E.g.
perl -e '$line="1101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111110111101010110110111110111011111101111010101101101111101110111111011110101011011011111011101111";$hex=unpack("H*",pack("B*",$line));print $hex'
Output:
deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
Also, you can use xxd -b to display bytes as binary digits:
echo -n $'\x02\x02' | xxd -b
0000000: 00000010 00000010
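For the specific case in the question (binary to hexadecimal), each hex digit corresponds to exactly four bits, so the conversion needs no big-number arithmetic at all and works for any length. The following is a minimal pure-shell sketch of that idea. The function name bin_to_hex is hypothetical, and the input is assumed to contain only the characters 0 and 1.

```shell
# Convert a binary string of any length to lowercase hexadecimal, four bits at a time.
bin_to_hex() {
    bits=$1
    # Left-pad with zeros until the length is a multiple of four.
    while [ $(( ${#bits} % 4 )) -ne 0 ]; do
        bits="0$bits"
    done
    out=
    while [ -n "$bits" ]; do
        rest=${bits#????}          # everything after the first four bits
        nibble=${bits%"$rest"}     # the first four bits
        bits=$rest
        case $nibble in
            0000) d=0 ;; 0001) d=1 ;; 0010) d=2 ;; 0011) d=3 ;;
            0100) d=4 ;; 0101) d=5 ;; 0110) d=6 ;; 0111) d=7 ;;
            1000) d=8 ;; 1001) d=9 ;; 1010) d=a ;; 1011) d=b ;;
            1100) d=c ;; 1101) d=d ;; 1110) d=e ;; 1111) d=f ;;
            *) return 1 ;;         # reject anything that is not binary
        esac
        out="$out$d"
    done
    printf '%s\n' "$out"
}

bin_to_hex 11011110101011011011111011101111
```

Leading zero digits are kept, so a 1024-bit input always yields 256 hex digits.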

Resources