Reverse engineer binary encoded data

Is there a general approach? It doesn't appear to be encrypted, and I know the file must contain numeric tabular data of some kind:
$ od -tc filename.hobo | head
0000000 H O B O 210 \r 004 \0 \0 001 d 210 035 004 \0 \0
0000020 001 c 210 " 035 001 \0 \0 001 035 q - \0 $ 070
0000040 8 E 001 d 377 377 235 220 \0 \0 001 \0 \0 \v \f
0000060 030 002 210 5 032 001 003 003 001 \0 \a \0 \0 \0 \0 \a
0000100 \0 \0 \0 \0 \0 \0 \0 005 004 \0 \0 \0 \0 \0 \0 210
0000120 c 001 \0 210 033 002 \a ҈ ** 034 002 \0 001 210 001 002
0000140 017 033 210 002 002 001 035 210 003 002 001 \n 210 004 032 O
0000160 n s e t C o m p u t e r C o
0000200 r p o r a t i o n 210 005 024 H O B O
0000220 U 2 3 - 0 0 1 T e m p / R H

Related

JQ explode function is returning incorrect chars

I am trying to decode base64-encoded binary content in jq using the explode function.
When I run the string through explode and then implode, I expect to get the same string back, but I do not. Try it here: https://jqplay.org/s/Rt8H1qv8VRP
Base64 encoded string: "AQEAAAABAQAyGWRkZBXNWwcAAAAAAQIDBAUGBwgJClIGnj9SBp4/"
JQ: '@base64d | explode | implode | @base64'
Output: "AQEAAAABAQAyGWRkZBXvv71bBwAAAAABAgMEBQYHCAkKUgbvv70/Ugbvv70/"
Debugging further,
@base64d | explode | .[14]
returns
65533
Running the following on Ubuntu, you can see that the byte at index [14] is 315 (octal) == 205 (decimal)
$ echo "AQEAAAABAQAyGWRkZBXNWwcAAAAAAQIDBAUGBwgJClIGnj9SBp4/" | base64 -d | od -bc
0000000 001 001 000 000 000 001 001 000 062 031 144 144 144 025 315 133
001 001 \0 \0 \0 001 001 \0 2 031 d d d 025 315 [
0000020 007 000 000 000 000 001 002 003 004 005 006 007 010 011 012 122
\a \0 \0 \0 \0 001 002 003 004 005 006 \a \b \t \n R
0000040 006 236 077 122 006 236 077
006 236 ? R 006 236 ?
0000047
Why is JQ returning this weird 65533 (0xFFFD) character? What am I missing?
First of all, the issue has nothing to do with explode or implode. Using just @base64d | @base64 produces the same result.
jq expects the string encoded with base64 to be text encoded with UTF-8.
If the decoded string is not UTF-8, the results are undefined.
Your input is not UTF-8.
U+FFFD REPLACEMENT CHARACTER is a character used to mark input errors.
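To inspect the raw bytes without passing them through jq strings at all, the decoding can be done in the shell. This is a minimal sketch, assuming `base64 -d`, `od` and `awk` are available (as on a typical GNU/Linux system); `b64_bytes` is a hypothetical helper name, not part of jq:

```shell
#!/bin/sh
# Hypothetical helper: decode a base64 string and print every byte
# as an unsigned decimal number, one per line.
b64_bytes() {
    printf '%s' "$1" | base64 -d | od -An -v -tu1 |
        awk '{ for (i = 1; i <= NF; i++) print $i }'
}

b64='AQEAAAABAQAyGWRkZBXNWwcAAAAAAQIDBAUGBwgJClIGnj9SBp4/'

# Index 14 (zero-based) is the 15th line of output.
b64_bytes "$b64" | sed -n '15p'    # prints 205
```

The byte at index 14 is 205 (0xCD). In UTF-8 that value is a lead byte that must be followed by a continuation byte, but the next byte is 0x5B (`[`), so the sequence is invalid and jq substitutes U+FFFD.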

zsh: no such file or directory error but file exist

I'm trying to run a compiler, but I get an error saying it cannot be found, even though it appears to exist and the path is correct. I even tried a different shell in case zsh was misconfigured, but got the same error. I'm at a loss as to what to do. Any suggestions?
6909077c228a% ls -l toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
-rwxr-xr-x 2 root root 2287465 Sep 11 13:19 toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
6909077c228a% ./toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
zsh: no such file or directory: ./toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
#switch to bash
6909077c228a:~$ ./toolchain/bin/armv7l-timesys-linux-gnueabi-gcc
bash: ./toolchain/bin/armv7l-timesys-linux-gnueabi-gcc: No such file or directory
Edit:
Update: I tried the suggestion and don't see any odd characters inserted.
6909077c228a% ls -l toolchain/bin/armv7l-timesys-linux-gnueabi-gcc | od -xcb
0000000 722d 7877 2d72 7278 782d 3220 7220 6f6f
- r w x r - x r - x 2 r o o
055 162 167 170 162 055 170 162 055 170 040 062 040 162 157 157
0000020 2074 6f72 746f 3220 3832 3437 3536 5320
t r o o t 2 2 8 7 4 6 5 S
164 040 162 157 157 164 040 062 062 070 067 064 066 065 040 123
0000040 7065 3120 2031 3331 313a 2039 6f74 6c6f
e p 1 1 1 3 : 1 9 t o o l
145 160 040 061 061 040 061 063 072 061 071 040 164 157 157 154
0000060 6863 6961 2f6e 6962 2f6e 7261 766d 6c37
c h a i n / b i n / a r m v 7 l
143 150 141 151 156 057 142 151 156 057 141 162 155 166 067 154
0000100 742d 6d69 7365 7379 6c2d 6e69 7875 672d
- t i m e s y s - l i n u x - g
055 164 151 155 145 163 171 163 055 154 151 156 165 170 055 147
0000120 756e 6165 6962 672d 6363 000a
n u e a b i - g c c \n
156 165 145 141 142 151 055 147 143 143 012
Depending on how you typed in your initial ls -l line, there may be funny characters in the file name. If you use auto completion, it may have put those funny characters in for you so, if you subsequently attempt to type in the file name without auto completion, that could result in a file not found situation.
The first thing you should do is to check the filename completely, with something like:
ls -l toolchain/bin/armv7l-timesys-linux-gnueabi-gcc | od -xcb
and check the output to ensure there's no funny characters in the name.
If the file does exist in that form (no funny characters), one other possibility is that you're trying to run a 32-bit ELF program on a system that's not correctly set up to run them (i.e., a 64-bit system without the libraries and support infrastructure for 32-bit).
That results in an unhelpful error message since it really should be complaining about not being able to find the loader for your 32-bit executable, rather than the executable itself.
If this is the case, you will need to identify those missing items and install them.
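One way to identify the missing loader is to ask the executable which program interpreter it requests and then check whether that path exists. This is a rough sketch, assuming `readelf` from binutils is installed; `elf_interp` and `check_loader` are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helper: print the program interpreter (dynamic loader)
# that an ELF executable asks for; prints nothing if there is none
# or if readelf is unavailable.
elf_interp() {
    readelf -l "$1" 2>/dev/null |
        sed -n 's/.*program interpreter: \([^]]*\)].*/\1/p'
}

# Report whether the requested loader is actually present on this system.
check_loader() {
    interp=$(elf_interp "$1")
    if [ -z "$interp" ]; then
        echo "no interpreter recorded (static binary, not ELF, or readelf missing)"
    elif [ -e "$interp" ]; then
        echo "loader present: $interp"
    else
        echo "loader missing: $interp"
    fi
}
```

For the file in the question this would be run as `check_loader toolchain/bin/armv7l-timesys-linux-gnueabi-gcc`. A "loader missing" result would be consistent with the explanation above, since the kernel reports "No such file or directory" when the interpreter cannot be found.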

unexpected result: grep from a changing line

I wrote a bash command to test grep from a changing line:
for i in $(seq 0 9); do echo -e -n "\r"$i; sleep 0.1; done | grep 5
The result shows:
9
Update
The real problem is as follows:
mplayer shows and refreshes a single-line playing progress when playing a media file. A sample result is:
A: 17.2 (17.2) of 213.0 (03:33.0) 0.5%
I'm trying to grep this playing progress and ignore other lines. I used this command:
mplayer xxx.mp3 | grep ^A:
The result does not contain the line expected.
Update 2
mplayer xxx.mp3 | od -xda
shows:
0002140 4a5b 410d 203a 2020 2e31 2033 3028 2e31
[ J \r A : 1 . 3 ( 0 1 .
133 112 015 101 072 040 040 040 061 056 063 040 050 060 061 056
0002160 2932 6f20 2066 3132 2e33 2030 3028 3a33
2 ) o f 2 1 3 . 0 ( 0 3 :
062 051 040 157 146 040 062 061 063 056 060 040 050 060 063 072
0002200 3333 302e 2029 3020 342e 2025 5b1b 0d4a
3 3 . 0 ) 0 . 4 % 033 [ J \r
063 063 056 060 051 040 040 060 056 064 045 040 033 133 112 015
0002220 3a41 2020 3120 352e 2820 3130 342e 2029
A : 1 . 5 ( 0 1 . 4 )
101 072 040 040 040 061 056 065 040 050 060 061 056 064 051 040
0002240 666f 3220 3331 302e 2820 3330 333a 2e33
o f 2 1 3 . 0 ( 0 3 : 3 3 .
157 146 040 062 061 063 056 060 040 050 060 063 072 063 063 056
And
mplayer xxx.mp3 | tr '\r' '\n'
shows
A: 0.2 (00.1) of 213.0 (03:33.0) 0.3%
A: 0.3 (00.3) of 213.0 (03:33.0) 0.3%
A: 0.5 (00.5) of 213.0 (03:33.0) 0.4%
A: 0.6 (00.6) of 213.0 (03:33.0) 0.4%
A: 0.8 (00.8) of 213.0 (03:33.0) 0.4%
A: 1.0 (01.0) of 213.0 (03:33.0) 0.4%
While,
mplayer xxx.mp3 | tr '\r' '\n' | grep ^A
shows empty result.
Any tip will be appreciated.
It's your definition of "line" that's causing the problem here. The -n means that all the numbers are output on a single line, according to the definition used by grep (a series of characters terminated by the \n character):
\r0\r1\r2\r3\r4\r5\r6\r7\r8\r9
If you pipe the output through something like a hex dump, you can see what's happening:
$ for i in $(seq 0 9); do echo -e -n "\r"$i; sleep 0.1; done | grep 5 | od -xcb
0000000 300d 310d 320d 330d 340d 350d 360d 370d
\r 0 \r 1 \r 2 \r 3 \r 4 \r 5 \r 6 \r 7
015 060 015 061 015 062 015 063 015 064 015 065 015 066 015 067
0000020 380d 390d 000a
\r 8 \r 9 \n
015 070 015 071 012
0000025
That single line containing all the carriage returns (and not newlines) will, when output, appear to be a single line with just the 9 on it. Removing the -n will result instead in:
$ for i in $(seq 0 9); do echo -e "\r"$i; sleep 0.1; done | grep 5 | od -xcb
0000000 350d 000a
\r 5 \n
015 065 012
0000003
which would look like just the 5 was being output.
If you have a process that outputs "lines" separated by carriage returns rather than newlines, there's nothing to stop you changing them on the fly so as to be able to handle them as real lines:
$ echo -e "junk\rA: good 1\rjunk\rA: good 2\rjunk" | tr '\r' '\n' | grep '^A'
A: good 1
A: good 2
Applying that back to your original question, it would be (with the sleep removed since it's irrelevant):
$ for i in $(seq 0 9); do echo -e -n "\r"$i; done | tr '\r' '\n' | grep 5
5
$ for i in $(seq 0 9); do echo -e -n "\r"$i; done | tr '\r' '\n' | grep 5 | od -xcb
0000000 0a35
5 \n
065 012
0000002
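The mplayer dump in the question also shows an ESC [ J sequence (the ANSI "erase in display" control) in front of each carriage return. The sketch below splits on carriage returns and strips that sequence as well, under the assumption that this is the only escape sequence in the stream; `cr_lines` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: turn carriage-return-separated progress updates
# into real lines and drop the ESC [ J (erase in display) sequence.
cr_lines() {
    esc=$(printf '\033')
    tr '\r' '\n' | sed "s/${esc}\[J//g"
}

# Simulated progress output in the same shape as the mplayer dump.
printf 'junk\033[J\rA: 1.3\033[J\rA: 1.5\033[J\r' | cr_lines | grep '^A:'
```

With a long-running producer, output may also be delayed by pipe buffering in the intermediate commands, so an empty result on screen does not necessarily mean that nothing matched.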

Bash: ls says file not found

I tried the following command
for i in `ls`; do ls $i; done
and got the following output:
ls: a.out: No such file or directory
ls: c: No such file or directory
ls: contest: No such file or directory
ls: cpp: No such file or directory
ls: java: No such file or directory
ls: : No such file or directory
This is confusing, since the list of files was itself obtained using ls. When I ran od on the output of echo, I saw the following:
0000000 033 133 060 155 033 133 060 061 073 063 062 155 141 056 157 165
033 [ 0 m 033 [ 0 1 ; 3 2 m a . o u
0000020 164 033 133 060 155 012
t 033 [ 0 m \n
0000026
0000000 033 133 060 061 073 063 064 155 143 033 133 060 155 012
033 [ 0 1 ; 3 4 m c 033 [ 0 m \n
0000016
0000000 033 133 060 061 073 063 064 155 143 157 156 164 145 163 164 033
033 [ 0 1 ; 3 4 m c o n t e s t 033
0000020 133 060 155 012
[ 0 m \n
0000024
0000000 033 133 060 061 073 063 064 155 143 160 160 033 133 060 155 012
033 [ 0 1 ; 3 4 m c p p 033 [ 0 m \n
0000020
0000000 033 133 060 155 146 151 154 145 056 164 170 164 033 133 060 155
033 [ 0 m f i l e . t x t 033 [ 0 m
0000020 012
\n
0000021
0000000 033 133 060 061 073 063 064 155 152 141 166 141 033 133 060 155
033 [ 0 1 ; 3 4 m j a v a 033 [ 0 m
0000020 012
\n
0000021
0000000 033 133 155 012
033 [ m \n
0000004
What do these "033 [ 0 m" characters stand for? How do I avoid them? Are they the cause of this problem?
Please help.
Thanks,
Karthick S.
You don't need `ls` or $(ls). You can use * instead. This way you avoid fancy colored output while keeping your code portable, readable and compact.
This is #1 in Bash Pitfalls
NEVER use ls as input for another command...
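As a minimal sketch of the glob approach, the loop below runs ls on each directory entry without ever parsing ls output. `list_each` is a hypothetical helper name, and `ls -d` is used here so that directories are listed as entries rather than expanded:

```shell
#!/bin/sh
# Hypothetical helper: run ls on every entry of a directory using a glob
# instead of parsing the output of ls.
list_each() {
    for f in "$1"/*; do
        [ -e "$f" ] || continue    # skip the unexpanded pattern of an empty directory
        ls -d -- "$f"
    done
}
```

Calling `list_each .` covers the current directory. Because the names come from the shell's own expansion and are quoted, color escape codes never enter the loop and names containing spaces stay intact.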
The "033 [ 0 m" characters are escape codes for colouring terminal output. Try using this instead:
for file in $(unset LS_COLORS; \ls);
do
ls "$file";
done
Or
for file in $(ls -1 -Q --quoting-style=shell --color=never);
do
ls "$file";
done

How do you delete all lines that contain double quotes in sh?

I tried sed -ne '/\"/!p' theinput > theproduct but that got me nowhere. It didn't do anything. What can I try?
You don't need to escape the quote. Write:
sed '/"/d' theinput > theproduct
or
sed -i '/"/d' theinput
to alter the file directly.
In case you have other quotes, as @Jonathan Leffler suggests, you have to find out which ones. Then, using \x, you can achieve what you want. \x is used to specify hexadecimal values.
sed -i '/\x22/d' theinput
The line above would delete all rows in theinput containing the ordinary (ASCII 34) quote. You'll have to try the code points Jonathan suggested.
try this:
grep -v '"' theinput > theproduct
The command you showed us should have worked.
$ cat theinput
foo"bar
foo.bar
$ sed -ne '/\"/!p' theinput > theproduct
$ cat theproduct
foo.bar
$
unless you're using csh or tcsh as your interactive shell. In that case, you'd need to escape the ! character, even within quotation marks:
% cat theinput
foo"bar
foo.bar
% sed -ne '/\"/!p' theinput > theproduct
sed -ne '/"/pwd' theinput > theproduct
sed: -e expression #1, char 5: extra characters after command
% rm theproduct
% sed -ne '/\"/\!p' theinput > theproduct
% cat theproduct
foo.bar
%
But that's inconsistent with your statement that "It didn't do anything", so it's not clear what's really going on (and the question is tagged bourne-shell anyway).
But there are much simpler ways to accomplish the same task, particularly the grep command suggested by @Mike Sokolov.
Are you sure you have 'ASCII' input? Could you have Unicode (UTF-8) with characters that are not ASCII 34, or Unicode U+0022, but something else?
Alternative Unicode 'double quotes' could be:
U+2033 DOUBLE PRIME; U+201C LEFT DOUBLE QUOTATION MARK;
U+201D RIGHT DOUBLE QUOTATION MARK;
U+201F DOUBLE HIGH-REVERSED-9 QUOTATION MARK;
U+02DD DOUBLE ACUTE ACCENT;
(and there could easily be others I've left out).
You can look to debug this with the od command:
$ cat theinput
No double quote here
Double quote " here
Unicode pseudo-double-quotes include “”‟″˝.
$ od -c theinput
0000000 N o d o u b l e q u o t e
0000020 h e r e \n D o u b l e q u o t
0000040 e " h e r e \n U n i c o d e
0000060 p s e u d o - d o u b l e - q
0000100 u o t e s i n c l u d e “ **
0000120 ** ” ** ** ‟ ** ** ″ ** ** ˝ ** . \n
0000136
$ od -x theinput
0000000 6f4e 6420 756f 6c62 2065 7571 746f 2065
0000020 6568 6572 440a 756f 6c62 2065 7571 746f
0000040 2065 2022 6568 6572 550a 696e 6f63 6564
0000060 7020 6573 6475 2d6f 6f64 6275 656c 712d
0000100 6f75 6574 2073 6e69 6c63 6475 2065 80e2
0000120 e29c 9d80 80e2 e29f b380 9dcb 0a2e
0000136
$ odx theinput
0x0000: 4E 6F 20 64 6F 75 62 6C 65 20 71 75 6F 74 65 20 No double quote
0x0010: 68 65 72 65 0A 44 6F 75 62 6C 65 20 71 75 6F 74 here.Double quot
0x0020: 65 20 22 20 68 65 72 65 0A 55 6E 69 63 6F 64 65 e " here.Unicode
0x0030: 20 70 73 65 75 64 6F 2D 64 6F 75 62 6C 65 2D 71 pseudo-double-q
0x0040: 75 6F 74 65 73 20 69 6E 63 6C 75 64 65 20 E2 80 uotes include ..
0x0050: 9C E2 80 9D E2 80 9F E2 80 B3 CB 9D 2E 0A ..............
0x005E:
$ sed '/"/d' theinput > theproduct
$ cat theproduct
No double quote here
Unicode pseudo-double-quotes include “”‟″˝.
$
(odx is my own command for dumping data in hex.)
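If the input does turn out to contain some of these look-alike characters, the deletion can be extended to cover them. This is a sketch only, assuming UTF-8 input and a script file saved as UTF-8; `drop_quote_lines` is a hypothetical helper name, and the character list is just the one given above:

```shell
#!/bin/sh
# Hypothetical helper: delete lines containing an ASCII double quote or
# one of the Unicode look-alikes listed above (input assumed to be UTF-8).
drop_quote_lines() {
    sed -e '/"/d' -e '/“/d' -e '/”/d' -e '/‟/d' -e '/″/d' -e '/˝/d' "$@"
}
```

Run as `drop_quote_lines theinput > theproduct`, this keeps only the first line of the sample file above. Each character gets its own expression rather than sharing one bracket expression, so the match stays correct even in a locale where sed treats the input as single bytes.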
