SoftQuad DESC or font file binary - image

I read this question but it didn't help me. I am solving a challenge where I have two files: the first one is a .png which gives me the upper half of an image; the second file is identified as "SoftQuad DESC or font file binary". I am sure this file should somehow be converted into a .png file to complete the image. I googled and got a hint about magic bytes, but I am unable to match the bytes.
These are the first two rows of the output of the xxd command:
00000000: aaaa a6bb 67bb bf18 dd94 15e6 252c 0a2f ....g.......%,./
00000010: fe14 d943 e8b5 6ad5 2264 1632 646e debc ...C..j."d.2dn..
These are the last two rows of the output of the xxd command:
00001c10: 7a05 7f4c 3600 0000 0049 454e 44ae 4260 z..L6....IEND.B`
00001c20: 82

Related

Disassemble ELF file - debugging the area where a specific string of the binary is loaded

I would like to disassemble / debug an ELF file. Is it somehow possible to track down the function where a specific string in the ELF file is used?
What I mean is: I have a string which I know the program uses to search for in a file. Is it somehow possible, e.g. with gdb, to debug exactly the position in the executable where it is used?
Or is the position of the string in the ELF file somehow visible in the objdump -d output?
In order to do that you need a disassembler; objdump just dumps the information, and it might not give you enough, since some analysis is needed before you can tell where the string is being used. What you need are the XREFs (cross-references) for the string you have in mind.
If you open your binary in a disassembler, it will probably be able to show you the strings present in the binary, with the ability to jump to the place where a string is being used (there might be multiple places).
I'll showcase this using radare2.
Open the binary (I'll use ls here)
r2 -A /bin/ls
and then
iz
to display all the strings. There's a lot of them so here's an extract
000 0x00004af1 0x100004af1 7 8 (4.__TEXT.__cstring) ascii COLUMNS
001 0x00004af9 0x100004af9 39 40 (4.__TEXT.__cstring) ascii 1#ABCFGHLOPRSTUWabcdefghiklmnopqrstuvwx
002 0x00004b21 0x100004b21 6 7 (4.__TEXT.__cstring) ascii bin/ls
003 0x00004b28 0x100004b28 8 9 (4.__TEXT.__cstring) ascii Unix2003
004 0x00004b31 0x100004b31 8 9 (4.__TEXT.__cstring) ascii CLICOLOR
005 0x00004b3a 0x100004b3a 14 15 (4.__TEXT.__cstring) ascii CLICOLOR_FORCE
006 0x00004b49 0x100004b49 4 5 (4.__TEXT.__cstring) ascii TERM
007 0x00004b60 0x100004b60 8 9 (4.__TEXT.__cstring) ascii LSCOLORS
008 0x00004b69 0x100004b69 8 9 (4.__TEXT.__cstring) ascii fts_open
009 0x00004b72 0x100004b72 28 29 (4.__TEXT.__cstring) ascii %s: directory causes a cycle
Let's see where this last one is being used. If we move to the location where it's defined (0x100004b72), we can see this:
;-- str.s:_directory_causes_a_cycle:
; DATA XREF from 0x100001cbe (sub.fts_open_INODE64_b44 + 378)
And here we see where it's being referenced, the DATA XREF. We can move there (s 0x100001cbe) and see how it's being used:
⁝ 0x100001cbe 488d3dad2e00. lea rdi, str.s:_directory_causes_a_cycle ; 0x100004b72 ; "%s: directory causes a cycle"
⁝ 0x100001cc5 4c89ee mov rsi, r13
⁝ 0x100001cc8 e817290000 call sym.imp.warnx ;[1]
Having the location, you can put a breakpoint there (r2 is also a debugger) or use the address in gdb.
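For example, a minimal gdb session along those lines (./target is a placeholder; use the address from the XREF you found, and note that for PIE binaries the runtime address may differ from the one in the listing):
gdb ./target
(gdb) break *0x100001cbe
(gdb) run
(gdb) bt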

How to extract contents from 'Payload' file in an Apple macOS update package?

I am extracting the macOS Sierra update package macOSUpd10.12.1.pkg to the /tmp/macosupd directory using the following command.
pkgutil --expand /Volumes/macOS\ Sierra\ Update/macOSUpd10.12.1.pkg /tmp/macosupd
I can see the following packages bundled inside the metapackage.
-rwxr-xr-x 1 Distribution
drwxr-xr-x 6 FirmwareUpdate.pkg
drwxr-xr-x 5 FullBundleUpdate.pkg
drwxr-xr-x 36 Resources
drwxr-xr-x 6 macOSUpd10.12.1.pkg
I am not able to extract the contents of the Payload file. For example, when I try to extract the Payload of macOSUpd10.12.1.pkg, I get the following error messages:
with tar:
tar -xvf macOSUpd10.12.1.pkg/Payload
tar: Unrecognized archive format
tar: Error exit delayed from previous errors.
with xar:
xar -xvf macOSUpd10.12.1.pkg/Payload
Error opening xar archive: macOSUpd10.12.1.pkg/Payload
When I run the file command on the Payload file, I get:
file macOSUpd10.12.1.pkg/Payload
macOSUpd10.12.1.pkg/Payload: data
Looking at the initial bytes of the Payload file with the xxd command, the file type looks like pbzx:
xxd macOSUpd10.12.1.pkg/Payload
00000000: 7062 7a78 0000 0000 0100 0000 0000 0000 pbzx............
00000010: 0100 0000 0000 0000 0031 0330 fd37 7a58 .........1.0.7zX
00000020: 5a00 0000 ff12 d941 04c0 f385 c401 8080 Z......A........
00000030: 8008 2101 1600 0000 506a 84e2 e3b8 13ef ..!.....Pj......
00000040: fe5d 0018 0ddf 07a4 347c 7c50 9853 8031 .]......4||P.S.1
00000050: 2d14 f703 6903 cf69 f214 76b0 93c0 a4c9 -...i..i..v.....
00000060: 774d 6fb0 8b3a 2257 4a55 04ad 289b cc4d wMo..:"WJU..(..M
00000070: b835 5db7 7e72 f7a8 dc15 7a9c 7755 800c .5].~r....z.wU..
00000080: 6060 d45f e078 f84f e537 4319 2d89 f72e ``._.x.O.7C.-...
00000090: 60c4 cdb0 6b54 9326 9321 3339 4a4f 1e75 `...kT.&.!39JO.u
000000a0: 8eb7 991d 8968 5e6f 45d4 24c9 e364 712b .....h^oE.$..dq+
000000b0: ef4e 9abc af70 e97a e5a3 1810 7f05 54df .N...p.z......T.
000000c0: 08d2 3060 9f8a a1e0 edb8 2b10 df23 789d ..0`......+..#x.
000000d0: 3e52 ee3e d6f0 468a bfee 3366 d39e 28db >R.>..F...3f..(.
Please suggest a way to extract the contents of the Payload file. I have tried all the methods suggested in link1 and link2, including the method suggested in this external-link, still with no success. It seems Apple updates prior to Mavericks were pure xar and tar packages; with Mavericks and above, Apple changed the format from tar to pbzx (LZMA).
Please suggest an easy command-line way to extract the contents of the Payload file, as I have to use it in my automation setup.
There is an undocumented option to pkgutil, --expand-full, which uses the same syntax as pkgutil --expand.
pkgutil --expand-full [pkg] [dir]
This will completely "unarchive" a pkg or mpkg file.
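For example, using the paths from the question (with a fresh destination directory):
pkgutil --expand-full /Volumes/macOS\ Sierra\ Update/macOSUpd10.12.1.pkg /tmp/macosupd_full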
Since OS X 10.10, the Payload in .pkg files is encoded as pbzx (which is in turn LZMA-compressed). It can no longer be extracted using gunzip. Check out the pbzx tool (a fork of the original; the original would not let you extract the Payload directly, only by passing the whole .pkg file).
Instead of
tar -xvf macOSUpd10.12.1.pkg/Payload
you can write
pbzx -n macOSUpd10.12.1.pkg/Payload | cpio -i
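A sketch of a complete extraction into a directory, assuming the pbzx tool is built and on your PATH, using the paths from the question:
mkdir /tmp/payload_contents
cd /tmp/payload_contents
pbzx -n /tmp/macosupd/macOSUpd10.12.1.pkg/Payload | cpio -idm
Here cpio -i extracts, -d creates missing directories, and -m preserves modification times.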

Hex Dump Specific Parts of File - Bash

I'm trying to write a bash script to audit hard drives that have been wiped to ensure the wiping system is working properly. I would like to find a way to hex dump specific parts of a drive without having to hex dump the entire drive and extract the parts I'd like (as this seems to run for too long to make the script worth writing). Ideally, I'd be able to grab parts from the beginning, middle, and end of the drive.
I would like to take the output of the hex dump and check it for the existence of only one character (indicating the drive has been successfully wiped). This part, I can handle, but I thought it may affect any advice I may get.
I've used head piped into xxd to get the beginning of the file which has worked, but I'm still stuck on the other parts. I've tried using tail to just get the end of the drive, but that doesn't seem to work quickly either. Is it possible to do this efficiently? Possibly using dd or something else and pipe it into a hex editor? I've looked through options for xxd as well as hexdump to no avail. If someone could point me in the right direction, it would be greatly appreciated!
xxd has options to skip a ways into the file (-s) and dump a limited length (-l). If you use its plain hex (-p) option, you may be able to use grep to find any anomalies:
$ xxd -s 8192 -l 256 -p /dev/disk3s2 | grep '[^0]'
000000010000000000000000000000000000000000000000000000000000
000000000000000000000000300000000000000800000000000000000000
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
od has similar skip (-j) and limit length (-N). Similarly, dd has skip= and count= (although these are counted in blocks, not bytes; you can change the block size with bs=).
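Putting that together, something like this would sample the start, middle, and end of the drive and flag any non-zero data (GNU dd syntax; /dev/sdX and the 1 MiB sample size are placeholders, and blockdev --getsize64 is assumed to be available):
DEV=/dev/sdX
SIZE=$(blockdev --getsize64 "$DEV")
for OFF in 0 $((SIZE / 2)) $((SIZE - 1048576)); do
    dd if="$DEV" bs=1M skip=$((OFF / 1048576)) count=1 2>/dev/null \
        | xxd -p | grep -q '[^0]' && echo "non-zero data near offset $OFF"
done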
EDIT: Since xxd -p is giving weird results (not stopping at what should be the end of the device), I'd recommend running some tests to figure out what's going on. First, back up anything important on the computer, because if something is weird at the device access level, it's possible that some of these tests might overwrite something unexpected, possibly even on another disk.
Next, try dumping to the end of the device with different tools, and see if they all behave the same way:
xxd -s 65451982336 /dev/sdb | more # This *should* dump 512 bytes (32 lines) then stop, but apparently keeps going
od -xv -j 65451982336 /dev/sdb | more # This also *should* dump 512 bytes then stop
dd if=/dev/sdb skip=127835903 | xxd | more # This again should do the same thing (note that the skip value is in 512-byte blocks)
Do the other tools read past what fdisk reports as the end of the disk? If all three read more data, I'm going with the "fdisk is wrong/misleading" answer. You can test further by writing some nonzero data past the "end" and seeing what the results are:
dd if=/dev/random of=/dev/sdb seek=127835903 count=2
...then repeat the various dump commands. If they show two blocks (=64 lines) of random data followed by zeroes, I'm pretty sure the device is bigger than you think it is.
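To cross-check what fdisk reports against the size the kernel actually exposes (Linux; blockdev is assumed to be available):
blockdev --getsize64 /dev/sdb
cat /sys/block/sdb/size
fdisk -l /dev/sdb
The first value is in bytes, the second in 512-byte sectors.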
I am not near my shell, but something along these lines should get you started:
dd if=/dev/hda1 | hexdump -C | grep '[^0]'
will show lines containing non-zero data (note that hexdump -C also prints offsets and an ASCII column, so for a strict all-zeros check the plain-hex output of xxd -p above is easier to grep).
dd if=/dev/hda1 | od -x -j100
will give you a hexadecimal dump with offsets, starting 100 bytes in.

Images corrupt after git push

I added some .png images to my git repo and then pushed them to my remote repository. I noticed that the images are in the remote repository, but they are broken.
In order to verify this, I compared the binary data for the images on my local and my remote repos. I also ran the file command on images in both repos.
Image on local repository
First 10 Lines of Binary data:
8950 4e47 0d0a 1a0a 0000 000d 4948 4452
0000 0019 0000 0014 0806 0000 0078 7796
bd00 0000 1974 4558 7453 6f66 7477 6172
6500 4164 6f62 6520 496d 6167 6552 6561
6479 71c9 653c 0000 0373 6954 5874 584d
4c3a 636f 6d2e 6164 6f62 652e 786d 7000
0000 0000 3c3f 7870 6163 6b65 7420 6265
6769 6e3d 22ef bbbf 2220 6964 3d22 5735
4d30 4d70 4365 6869 487a 7265 537a 4e54
637a 6b63 3964 223f 3e20 3c78 3a78 6d70
Result of file command:
bundles/admin/public/images/messages-icon.png: PNG image data, 24 x 16, 8-bit/color RGBA, non-interlaced
Image pushed to remote repository
First 10 Lines of Binary data:
8950 4e47 0a1a 0a00 0000 0d49 4844 5200
0000 1900 0000 1408 0600 0000 7877 96bd
0000 0019 7445 5874 536f 6674 7761 7265
0041 646f 6265 2049 6d61 6765 5265 6164
7971 c965 3c00 0003 7369 5458 7458 4d4c
3a63 6f6d 2e61 646f 6265 2e78 6d70 0000
0000 003c 3f78 7061 636b 6574 2062 6567
696e 3d22 efbb bf22 2069 643d 2257 354d
304d 7043 6568 6948 7a72 6553 7a4e 5463
7a6b 6339 6422 3f3e 203c 783a 786d 706d
Result of file command:
bundles/admin/public/images/notifications-icon.png: data
Why would this be happening? I have searched all over Google and this site for answers but have found none.
I had this issue too. Here's a generic .gitattributes that worked for me and looks generally safe for PHP web repos: https://github.com/Danimoth/gitattributes/blob/master/Web.gitattributes
Running git check-attr --all -- path/to/png before and after adding the new .gitattributes showed that the image file was originally treated as a text file and is now treated as binary. (Note that I had to commit and push the .gitattributes file for it to take effect on the remote.)
In case the link goes dead:
# These settings are for any web project
# Handle line endings automatically for files detected as text
# and leave all files detected as binary untouched.
# * text=auto
# NOTE: originally I had the above line un-commented. It caused me a lot of grief
# related to line endings, because I was dealing with WordPress plugins and the
# website changing line endings if a user modified a plugin through the web
# interface. Commenting this line out seems to have alleviated the git chaos
# where simply switching to a branch caused it to believe 500 files were modified.
#
# The above will handle all files NOT found below
#
#
## These files are text and should be normalized (Convert crlf => lf)
#
# source code
*.php text
*.css text
*.sass text
*.scss text
*.less text
*.styl text
*.js text
*.coffee text
*.json text
*.htm text
*.html text
*.xml text
*.svg text
*.txt text
*.ini text
*.inc text
*.pl text
*.rb text
*.py text
*.scm text
*.sql text
*.sh text
*.bat text
# templates
*.ejs text
*.hbt text
*.jade text
*.haml text
*.hbs text
*.dot text
*.tmpl text
*.phtml text
# server config
.htaccess text
# git config
.gitattributes text
.gitignore text
.gitconfig text
# code analysis config
.jshintrc text
.jscsrc text
.jshintignore text
.csslintrc text
# misc config
*.yaml text
*.yml text
.editorconfig text
# build config
*.npmignore text
*.bowerrc text
# Heroku
Procfile text
.slugignore text
# Documentation
*.md text
LICENSE text
AUTHORS text
#
## These files are binary and should be left untouched
#
# (binary is a macro for -text -diff)
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.mov binary
*.mp4 binary
*.mp3 binary
*.flv binary
*.fla binary
*.swf binary
*.gz binary
*.zip binary
*.7z binary
*.ttf binary
*.eot binary
*.woff binary
*.pyc binary
*.pdf binary
*.png binary
in .gitattributes worked after the images were re-uploaded.
You can add all the images by executing git add *.png --force
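Putting the two answers together, a sketch of the fix (the path in the last command is illustrative):
echo '*.png binary' >> .gitattributes
git add .gitattributes
git add *.png --force
git commit -m "treat PNGs as binary and re-add images"
git push
git check-attr --all -- bundles/admin/public/images/messages-icon.png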

Change Data Capture in delimited files

There are two tab-delimited files (file1, file2) with the same number and structure of records but with different values in some columns.
Daily we get another file (newfile) with the same number and structure of records but with some changes in the column values.
The task is to compare this file (newfile) with the two files (file1, file2) and update the records in them with the changed records, keeping unchanged records intact.
Before applying changes:
file1
11 aaaa
22 bbbb
33 cccc
file2
11 bbbb
22 aaaa
33 cccc
newfile
11 aaaa
22 eeee
33 ffff
After applying changes:
file1
11 aaaa
22 eeee
33 ffff
file2
11 aaaa
22 eeee
33 ffff
What could be the easiest and most efficient solution? Unix shell scripting? The files are huge, containing millions of records; can a shell script be an efficient solution in this case?
Daily we get another file (newfile) with same number and structure of records but with
some changes in column values.
This sounds to me like a perfect case for git. With git you can commit the current file as it is.
Then as you get new "versions" of the file, you can simply replace the old version with the new one and commit again. The best part is that each time you make a commit, git will record the changes from version to version, giving you access to the entire history of the file.
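A sketch of that workflow (the directory and file names are illustrative):
git init data-history
cp newfile data-history/records.tsv
cd data-history
git add records.tsv
git commit -m "initial version"
# on each daily delivery, overwrite the tracked copy and commit again
cp /path/to/newfile records.tsv
git commit -am "daily update"
git log -p records.tsv
The last command shows exactly which records changed in each delivery.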
