encode / decode binary data in a qr-code using qrencode and zbarimg in bash - bash

I have some binary data that I want to encode in a qr-code and then be able to decode, all of that in bash. After a search, it looks like I should use qrencode for encoding, and zbarimg for decoding. After a bit of troubleshooting, I still do not manage to decode what I had encoded
Any idea why? Currently the closest I am to a solution is:
$ dd if=/dev/urandom bs=10 count=1 status=none > data.bin
$ xxd data.bin
00000000: b255 f625 1cf7 a051 3d07 .U.%...Q=.
$ cat data.bin | qrencode -l H -8 -o data.png
$ zbarimg --raw --quiet data.png | xxd
00000000: c2b2 55c3 b625 1cc3 b7c2 a051 3d07 0a ..U..%.....Q=..
It looks like I am not very far, but something is still off.
Edit 1: a possible fix is to use base64 wrapping, as explained in the answer by #leagris .
Edit 2: using base64 encoding doubles the size of the message. The reason why I use binary in the first place is to be size-efficient so I would like to avoid that. De-accepting the answer by #leagris as I would like to have it 'full binary', sorry.
Edit 3: as of 2020-03-03 it looks like this is a well-known issue of zbarimg and that a pull request to fix this is on its way:
https://github.com/mchehab/zbar/pull/64
Edit 4: if you know of another command-line tool on linux that is able to decrypt qr-codes with binary content, please feel free to let me know.

My pull request has been applied. ZBar version 0.23.1 and newer will be able to decode binary QR codes:
zbarimg --raw --oneshot -Sbinary qr.png
zbarcam --raw --oneshot -Sbinary
QR codes have several encoding modes. The simplest, most commonly used and widely supported is the alphanumeric encoding which is suitable for simple text. The byte encoding allows storing arbitrary 8 bit data in the QR code. The ECI mode is like 8 bit mode but with additional metadata that tells the decoder which character set to use in order to decode the binary data back to text. Here's a list of known ECI values and the character encodings they represent. For example, when a decoder encounters an ECI 26 mode QR code it knows to decode the binary data as UTF-8.
The qrencode tool is doing its job correctly: it is creating a byte mode QR code with the data you gave it as its contents. The problem is most decoders were explicitly designed to handle textual data first and foremost. The retrieval of the raw binary data is a detail at best.
Current versions of the zbar library will treat byte mode QR codes as if they were unknown ECI mode QR codes. If a character set isn't specified, it will attempt to guess the encoding and convert the data to it. This will most likely mangle the binary data. As you noted, I brought this up in issue #55 and after some time managed to submit a pull request to improve this. Should it be merged, the library will have binary decoder option that will instruct decoders to return the raw binary data without converting it. Another source of data mangling is the tendency of the command line tools to append line feeds to the output. I submitted a pull request to allow users to prevent this and it has already been merged.
The zxing-cpp library will also try to guess the encoding of binary data in QR codes. The comments suggest that the QR code specification requires that decoders pick an encoding without specifying a default or allowing them to return the raw binary data. In order to make that possible, the binary data is copied to a byte array which can be accessed through the DecoderResult. When I have some free time, I intend to write zximg and zxcam tools with binary decoding support for this library.
It's always possible to encode binary data as base 64 and encode the result as an alphanumeric QR code. However, base 64 encoding will increase the size of the data and the alphanumeric mode doesn't allow use of the QR code's maximum capacity. In a comment, you mentioned what you intend to use binary QR codes for:
I want to have a package to effectively dump some gpg stuff in a format that makes recovery easy.
That is the exact use case I'm attempting to enable with my pull request: an easier-to-restore paperkey. 4096 bit RSA secret keys can be directly QR encoded in 8 bit mode but not in alphanumeric mode as base 64-encoded data.

See also: Storing binary data in QR codes
Look like zbarimg is only supporting printable characters and adding a newline
printf '%s' 'Hello World!' >data.bin
xxd data.bin
qrencode -l H -8 -o data.png -r data.bin
zbarimg --raw --quiet data.png | xxd
I think a better more portable option would be to base64 encode your binary data before qr encoding.
Like this:
dd if=/dev/urandom bs=10 count=1 status=none > data.bin
xxd data.bin
base64 <data.bin | qrencode -l H -8 -o data.png
zbarimg --raw --quiet data.png | base64 -d | xxd

Related

Image Conversion - RAW to png/raw for game (Pac The Man X)

So I have raw image and I am just curious If I can edit such image to save as RGB-32 Packed transparent interlaced raw and what program I could use, there is specification:
Format of RAW image
I have tried using photoshop but then game crashes. Is it even possible? I should get file without thumbnail. I also tried using gimp, free converters and Raw viewer but no luck. Any suggestions?
Edit:
Used photoshop (interleaved with transparency format), game starts but images are just bunch of pixels.
file that i try to prepare (221bits)
We are still not getting a handle on what output format you are really trying to achieve. Let's try generating a file from scratch, to see if we can get there.
So, let's just use simple commands that are available on a Mac and generate some test images from first principles. Start with exactly the same ghost.raw image you shared in your question. We will take the first 12 bytes as the header, and then generate a file full of red pixels and see if that works:
# Grab first 12 bytes from "ghost.raw" and start a new file "red.raw"
head -c 12 ghost.raw > red.raw
# Now generate 512x108 pixels, where red=ff, green=00, blue=01, alpha=fe and append to "red.raw"
perl -E 'say "ff0001fe" x (512*108)' | xxd -r -p >> red.raw
So you can try using red.raw in place of ghost.raw and tell me what happens.
Now try generating a blue file just the same:
# Grab first 12 bytes from "ghost.raw" and start a new file "blue.raw"
head -c 12 ghost.raw > blue.raw
# Now generate 512x108 pixels, where red=00, green=01, blue=ff, alpha=fe and append to "blue.raw"
perl -E 'say "0001fffe" x (512*108)' | xxd -r -p >> blue.raw
And then try blue.raw.
Original Answer
AFAIK, your image is actually 512 pixels wide by 108 pixels tall in RGBA8888 format with a 12-byte header at the start - making 12 + 4*(512 * 108) bytes.
You can convert it to PNG or JPEG with ImageMagick like this:
magick -size 512x108+12 -depth 8 RGBA:ghost.raw result.png
I still don't understand from your question or comments what format you actually want - so if you clarify that, I am hopeful we can get you answered.
Try using online converters. They help most of the time.\
A Website like these can possibly help:
https://www.freeconvert.com/raw-to-png
https://cloudconvert.com/raw-to-png
https://www.zamzar.com/convert/raw-to-png/
Some are specific websites which ask you for detail and some are straight forward conversions.

How to zgrep the last line of a gz file without tail

Here is my problem, I have a set of big gz log files, the very first info in the line is a datetime text, e.g.: 2014-03-20 05:32:00.
I need to check what set of log files holds a specific data.
For the init I simply do a:
'-query-data-'
zgrep -m 1 '^20140320-04' 20140320-0{3,4}*gz
BUT HOW to do the same with the last line without process the whole file as would be done with zcat (too heavy):
zcat foo.gz | tail -1
Additional info, those logs are created with the data time of it's initial record, so if I want to query logs at 14:00:00 I have to search, also, in files created BEFORE 14:00:00, as a file would be created at 13:50:00 and closed at 14:10:00.
The easiest solution would be to alter your log rotation to create smaller files.
The second easiest solution would be to use a compression tool that supports random access.
Projects like dictzip, BGZF, and csio each add sync flush points at various intervals within gzip-compressed data that allow you to seek to in a program aware of that extra information. While it exists in the standard, the vanilla gzip does not add such markers either by default or by option.
Files compressed by these random-access-friendly utilities are slightly larger (by perhaps 2-20%) due to the markers themselves, but fully support decompression with gzip or another utility that is unaware of these markers.
You can learn more at this question about random access in various compression formats.
There's also a "Blasted Bioinformatics" blog by Peter Cock with several posts on this topic, including:
BGZF - Blocked, Bigger & Better GZIP! – gzip with random access (like dictzip)
Random access to BZIP2? – An investigation (result: can't be done, though I do it below)
Random access to blocked XZ format (BXZF) – xz with improved random access support
Experiments with xz
xz (an LZMA compression format) actually has random access support on a per-block level, but you will only get a single block with the defaults.
File creation
xz can concatenate multiple archives together, in which case each archive would have its own block. The GNU split can do this easily:
split -b 50M --filter 'xz -c' big.log > big.log.sp.xz
This tells split to break big.log into 50MB chunks (before compression) and run each one through xz -c, which outputs the compressed chunk to standard output. We then collect that standard output into a single file named big.log.sp.xz.
To do this without GNU, you'd need a loop:
split -b 50M big.log big.log-part
for p in big.log-part*; do xz -c $p; done > big.log.sp.xz
rm big.log-part*
Parsing
You can get the list of block offsets with xz --verbose --list FILE.xz. If you want the last block, you need its compressed size (column 5) plus 36 bytes for overhead (found by comparing the size to hd big.log.sp0.xz |grep 7zXZ). Fetch that block using tail -c and pipe that through xz. Since the above question wants the last line of the file, I then pipe that through tail -n1:
SIZE=$(xz --verbose --list big.log.sp.xz |awk 'END { print $5 + 36 }')
tail -c $SIZE big.log.sp.xz |unxz -c |tail -n1
Side note
Version 5.1.1 introduced support for the --block-size flag:
xz --block-size=50M big.log
However, I have not been able to extract a specific block since it doesn't include full headers between blocks. I suspect this is nontrivial to do from the command line.
Experiments with gzip
gzip also supports concatenation. I (briefly) tried mimicking this process for gzip without any luck. gzip --verbose --list doesn't give enough information and it appears the headers are too variable to find.
This would require adding sync flush points, and since their size varies on the size of the last buffer in the previous compression, that's too hard to do on the command line (use dictzip or another of the previously discussed tools).
I did apt-get install dictzip and played with dictzip, but just a little. It doesn't work without arguments, creating a (massive!) .dz archive that neither dictunzip nor gunzip could understand.
Experiments with bzip2
bzip2 has headers we can find. This is still a bit messy, but it works.
Creation
This is just like the xz procedure above:
split -b 50M --filter 'bzip2 -c' big.log > big.log.sp.bz2
I should note that this is considerably slower than xz (48 min for bzip2 vs 17 min for xz vs 1 min for xz -0) as well as considerably larger (97M for bzip2 vs 25M for xz -0 vs 15M for xz), at least for my test log file.
Parsing
This is a little harder because we don't have the nice index. We have to guess at where to go, and we have to err on the side of scanning too much, but with a massive file, we'd still save I/O.
My guess for this test was 50000000 (out of the original 52428800, a pessimistic guess that isn't pessimistic enough for e.g. an H.264 movie.)
GUESS=50000000
LAST=$(tail -c$GUESS big.log.sp.bz2 \
|grep -abo 'BZh91AY&SY' |awk -F: 'END { print '$GUESS'-$1 }')
tail -c $LAST big.log.sp.bz2 |bunzip2 -c |tail -n1
This takes just the last 50 million bytes, finds the binary offset of the last BZIP2 header, subtracts that from the guess size, and pulls that many bytes off of the end of the file. Just that part is decompressed and thrown into tail.
Because this has to query the compressed file twice and has an extra scan (the grep call seeking the header, which examines the whole guessed space), this is a suboptimal solution. See also the below section on how slow bzip2 really is.
Perspective
Given how fast xz is, it's easily the best bet; using its fastest option (xz -0) is quite fast to compress or decompress and creates a smaller file than gzip or bzip2 on the log file I was testing with. Other tests (as well as various sources online) suggest that xz -0 is preferable to bzip2 in all scenarios.
————— No Random Access —————— ——————— Random Access ———————
FORMAT SIZE RATIO WRITE READ SIZE RATIO WRITE SEEK
————————— ————————————————————————————— —————————————————————————————
(original) 7211M 1.0000 - 0:06 7211M 1.0000 - 0:00
bzip2 96M 0.0133 48:31 3:15 97M 0.0134 47:39 0:00
gzip 79M 0.0109 0:59 0:22
dictzip 605M 0.0839 1:36 (fail)
xz -0 25M 0.0034 1:14 0:12 25M 0.0035 1:08 0:00
xz 14M 0.0019 16:32 0:11 14M 0.0020 16:44 0:00
Timing tests were not comprehensive, I did not average anything and disk caching was in use. Still, they look correct; there is a very small amount of overhead from split plus launching 145 compression instances rather than just one (this may even be a net gain if it allows an otherwise non-multithreaded utility to consume multiple threads).
Well, you can access randomly a gzipped file if you previously create an index for each file ...
I've developed a command line tool which creates indexes for gzip files which allow for very quick random access inside them:
https://github.com/circulosmeos/gztool
The tool has two options that may be of interest for you:
-S option supervise a still-growing file and creates an index for it as it is growing - this can be useful for gzipped rsyslog files as reduces to zero in the practice the time of index creation.
-t tails a gzip file: this way you can do: $ gztool -t foo.gz | tail -1
Please, note that if the index doesn't exists, this will consume the same time as a complete decompression: but as the index is reusable, next searches will be greatly reduced in time!
This tool is based on zran.c demonstration code from original zlib, so there's no out-of-the-rules magic!

Hex Dump Specific Parts of File - Bash

I'm trying to write a bash script to audit hard drives that have been wiped to ensure the wiping system is working properly. I would like to find a way to hex dump specific parts of a drive without having to hex dump the entire drive and extract the parts I'd like (as this seems to run for too long to make the script worth writing). Ideally, I'd be able to grab parts from the beginning, middle, and end of the drive.
I would like to take the output of the hex dump and check it for the existence of only one character (indicating the drive has been successfully wiped). This part, I can handle, but I thought it may affect any advice I may get.
I've used head piped into xxd to get the beginning of the file which has worked, but I'm still stuck on the other parts. I've tried using tail to just get the end of the drive, but that doesn't seem to work quickly either. Is it possible to do this efficiently? Possibly using dd or something else and pipe it into a hex editor? I've looked through options for xxd as well as hexdump to no avail. If someone could point me in the right direction, it would be greatly appreciated!
xxd has options to skip a ways into the file (-s) and dump a limited length (-l). If you use its plain hex (-p) option, you may be able to use grep to find any anomalies:
$ xxd -s 8192 -l 256 -p /dev/disk3s2 | grep [^0]
000000010000000000000000000000000000000000000000000000000000
000000000000000000000000300000000000000800000000000000000000
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
od has similar skip (-j) and limit length (-N). Similarly, dd has skip= and count= (although these are counted in blocks, not bytes; you can change the block size with bs=).
EDIT: Since xxd -p is giving weird results (not stopping at what should be the end of the device), I'd recommend running some tests to figure out what's going on. First, back up anything important on the computer, because if something is weird at the device access level, it's possible that some of these tests might overwrite something unexpected, possibly even on another disk.
Next, try dumping to the end of the device with different tools, and see if they all behave the same way:
xxd -s 65451982336 /dev/sdb | more # This *should* dump 512 bytes (32 lines) then stop, but apparently keeps going
od -xv -j 65451982336 /dev/sdb | more # This also *should* dump 512 bytes then stop
dd if=/dev/sdb skip=127835903 | xxd | more # This again should do the same thing (note that the skip value is in 512-byte blocks)
Do the other tools read past what fdisk reports as the end of the disk? If all three read more data, I'm going with the "fdisk is wrong/misleading" answer. You can test further by writing some nonzero data past the "end" and seeing what the results are:
dd if=/dev/random of=/dev/sdb seek=127835903 count=2
...then repeat the various dump commands. If they show two blocks (=64 lines) of random data followed by zeroes, I'm pretty sure the device is bigger than you think it is.
I am not near my shell, but something along these lines should get you started:
dd if=/dev/hda1 | hexdump -C | grep [^00]
will print all non-zero bytes.
dd if=/dev/hda1 | od -x -j100
will give you a hexadecimal dump with offsets, starting 100 bytes in.

How to get file size in bytes from shell script?

I am trying to create a script to write an XML file for Apple's ITMSP Transporter files for uploading metadata to the App Store. Requirements for screenshots are filename, MD5 checksum and filesize in bytes.
MD5 checksum is easy and be can be retrieved with md5 -q image.png
I am however having a hard time trying to get the byte size of the image file. If I use du -k image.png command, it returns the size rounded up in kilo bytes. So for example if the actual size is 5722 bytes, du will return 8 (as in 8K or 8192 bytes) which is not correct. And the default for du is in 512 byte chunks but still rounds the value up (so it will return 16 instead of 8).
I am running Lion OSX 10.7.4.
One easy approach is:
stat -f%z image.png
stat normally spits out a bunch of data, but the %z format just selects the size in bytes.
On OSX do stat -f "%z bytes".

AES/CBC/PKCS5Padding implementation in Ruby (for rails)

I need to decrypt text encrypted using AES/CBC/PKCS5Padding scheme. The encrypted text I got was generated using some Java software.
All values below are changed by me to something fictional.
What I get is a Key aHjgYFutF672eGIUGGVlgSETyM9VJj0K (256-bit = 32-chars * 8-bit)
and IV: rxYoks3c8hRRsL2P (16-bit)
and (I supposed) Base64 encoded encrypted result ETlAHS5ZcshKxQUaHVB8==
What I need is to decrypt in Ruby this ETlAHS5ZcshKxQUaHVB8== to get in the and a simple string, like 'blablablabla'
I tried to decrypt what I got using both Ruby and just common linux console openssl command.
NOTE: Key and IV below are not the ones used in real code:
# require 'openssl'
# require 'base64'
# decryption
aes = OpenSSL::Cipher::AES256.new(:CBC)
aes.decrypt
aes.padding = 1 # actually it's on by default
aes.key = "aHjgYFutF672eGIUGGVlgSETyM9VJj0K"
aes.iv="rxYoks3c8hRRsL2P"
aes.update(Base64::decode64("ETlAHS5ZcshKxQUaHVB8=="))+aes.final
=> OpenSSL::Cipher::CipherError: bad decrypt
Same as above but in console, key and iv converted to hex with:
$ echo -n $key256 | hexdump -e '16/1 "%02x"'
$ echo -n $iv | hexdump -e '16/1 "%02x"'
$ echo "ETlAHS5ZcshKxQUaHVB8==" | openssl enc -d -aes-256-cbc -a -K 61486a675946757446363732654749554747566c67534554794d39564a6a304b -iv 7278596f6b73336338685252734c3250
bad decrypt
140378046432928:error:06065064:digital envelope routines:EVP_DecryptFinal_ex:bad decrypt:evp_enc.c:539:
BTW. to get back original key and iv in the console you an use:
$ echo 61486a6... | xxd -r -p
#or , but then need to add \x before every character pair
$ eval `printf "\x61\x48......"
Please give me some clues as I hoped in the beginning that I will be able to use https://github.com/chicks/aes gem. The gem seems fine, it's just a nice wrapper for OpenSSL::Cipher::Cipher.
Is it possible that ruby/openssl use different PKCS, let's say PKCS#7, Java uses PKCS#5 and I need to preprocess my data ? Or there is a vesion mismatch between ruby/openssl and that Java's PKCS #7 and #5? #5 is meant for 8byte data blocks and #7 is for 16byte? Just a wild guess ...
The Ruby code in my first post is correct, the problem was this AES/CBC/PKCS5Padding used by Java part.
Java program should not use this scheme for AES-CBC-256. PKCS5 pads to a 64 bit (8 byte) block size, but AES-256-CBC uses 16 byte blocks. Therefore, PKCS7 must be used.

Resources