Little endian packet treated as big endian by dpkt - endianness

I am using dpkt to parse some ieee80211 packets.
I see that the ieee80211 object created has wrong values.
Digging deeper, I found that the ieee80211 parser treats the data as big-endian, while in practice the packets I am providing are little-endian.
Is there a way to detect the endianness of the packet at runtime, so I could maybe convert it to big-endian before providing it to dpkt.ieee80211?

There shouldn't be anything to detect or guess. IEEE 802.11 is a standard protocol, and its specification states the correct endianness for each and every part of a frame. If the endianness is reversed, then the frame is malformed. You can grab the latest copy of the standard here.
Looking over the 3500+ page pdf (thank god for ctrl+f), it seems that most values are big-endian, just like in TCP/IP. But apparently, little-endian is used here and there. For instance, in some TKIP fields. Frankly, that's a bit surprising.
You haven't mentioned the frame/field you're trying to create/decode, so it's hard to say anything more specific than to look it up.

The only way you're going to be able to detect endianness when you don't know one way or the other would be to inject a known payload and have it parsed the same way.
You can then check the endianness by checking the identity of the payload you injected.

It turns out that for IEEE80211 under CAPWAP the frame control bytes are simply swapped.
It is probably an initial-mistake-gone-de-facto-standard case.
See answer in Wireshark Q&A
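As a workaround, a minimal sketch (assuming only the 2-byte frame control field is swapped, as described above, and that raw_frame holds the 802.11 frame as bytes) could swap those bytes back before handing the frame to dpkt:

import dpkt

def parse_swapped_80211(raw_frame: bytes) -> dpkt.ieee80211.IEEE80211:
    # Only the 2-byte frame control field is swapped in the CAPWAP case
    # described above; the rest of the frame is passed through untouched.
    fixed = bytes([raw_frame[1], raw_frame[0]]) + raw_frame[2:]
    return dpkt.ieee80211.IEEE80211(fixed)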

Related

Is Gzip compressed binary data or uncompressed text safe to transmit over https, or should it be base 64 encoded as the final step before sending it?

My question is in the title; this provides context to help you understand my confusion. Everything is sent over https.
My understanding of base 64 encoding is that it is a way of representing binary data as text, such that the text is safe to transmit across networks or the internet because it avoids anything that might be interpreted as a control code by the various possible protocols that might be involved at some point.
Given this understanding, I am confused why everything sent over the internet is not base 64 encoded. When is it safe not to base 64 encode something before sending it? I understand that not everything understands or expects to receive things in base 64, but my question is why doesn't everything expect and work with this if it is the only way to send data without the possibility it could be interpreted as control codes?
I am designing an Android app and server API such that the app can use the API to send data to the server. There are some potentially large SQLite database files the client will be sending to the server (I know this sounds strange, yes it needs to send the entire database files). They are being gzipped prior to uploading. I know there is also a header that can be used to indicate this: Content-Encoding: gzip. Would it be safe to compress the data and send it with this header without base 64 encoding it? If not, why does such a header exist if it is not safe to use?
I mean, if you base 64 encode it first and then compress it, you undo the point of base 64 encoding and it is not at that point base 64 encoded. If you compress it first and then base 64 encode it, that header would no longer be valid as it is not in the compressed format at that point.
We actually don't want to use the header because we want to save the files in a compressed state, and using the header will cause the server to decompress it prior to our API code running. I'm only asking this to further clarify why I am confused about whether it is safe to send gzip compressed data without base 64 encoding it.
My best guess is that it depends on if what you are sending is binary data or not. If you are sending binary data, it should be base 64 encoded as the final step before uploading it. But if you are sending text data, you may not need to do this. However, it seems to me that this might still depend on the character encoding used. Perhaps some character encodings can result in sending data that could be interpreted as a control code? If this is true, which character encodings are safe to send without base 64 encoding them as the final step prior to sending it? If I am correct about this, it implies you should only use that gzip header if you are sending compressed text that has not been base 64 encoded. Does compressing it create the possibility of something that could be interpreted as a control code?
I realize this was rather long, so I will repeat my primary questions (the title) here: Is either Gzip compressed binary data or uncompressed text safe to transmit, or should it be base 64 encoded as the final step before sending it? Okay I lied there is one more question involved in this. Would sending gzip compressed text always be safe to send without base 64 encoding it at the end, no matter which character encoding it had prior to compression?
My understanding of base 64 encoding is that it is a way of representing binary data as text,
Specifically, as text consisting of characters drawn from a 64-character set, plus a couple of additional characters serving special purposes.
such that the text is safe to transmit across networks or the internet because it avoids anything that might be interpreted as a control code by the various possible protocols that might be involved at some point.
That's a bit of an overstatement. For two endpoints to communicate with each other, they need to agree on one protocol. If another protocol becomes involved along the way, then it is the responsibility of the endpoints for that transmission to handle any needed encoding considerations for it.
What bytes and byte combinations can successfully be conveyed is a matter of the protocol in use, and there are plenty that handle binary data just fine.
At one time there was also an issue that some networks were not 8-bit clean, so that bytes with numeric values greater than 127 could not be conveyed across those networks, but that is not a practical concern today.
Given this understanding, I am confused why everything sent over the internet is not base 64 encoded.
Given that the understanding you expressed is seriously flawed, it is not surprising that you are confused.
When is it safe not to base 64 encode something before sending it?
It is not only safe but essential to avoid base 64 encoding when the recipient of the transmission expects something different. The two or more parties to a given transmission must agree about the protocol to be used. That establishes the acceptable parameters of the communication. Although Base 64 is an available option for part or all of a message, it is by no means the only one, nor is it necessarily the best one for binary data, much less for data that are textual to begin with.
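One concrete reason base 64 is often not the best choice for binary data (a quick check in Python): it inflates the payload by roughly a third, which is pure overhead when the protocol can carry the bytes directly.

import base64, os

raw = os.urandom(3000)           # 3000 bytes of arbitrary binary data
encoded = base64.b64encode(raw)
print(len(raw), len(encoded))    # 3000 4000 -- base 64 adds about 33% overhead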
I understand that not everything understands or expects to receive things in base 64, but my question is why doesn't everything expect and work with this if it is the only way to send data without the possibility it could be interpreted as control codes?
Because it is not by any means the only way to avoid data being misinterpreted.
They are being gzipped prior to uploading. I know there is also a header that can be used to indicate this: Content-Encoding: gzip. Would it be safe to compress the data and send it with this header without base 64 encoding it?
It would be expected to transfer such data without base-64 encoding it. HTTP(S) handles binary data just fine. The Content-Encoding header tells the recipient how to interpret the message body, and if it specifies a binary content type (such as gzip) then binary data conforming to that content type are what the recipient will expect.
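For example, the upload could look roughly like this in Python (the URL and filename are placeholders, and the requests library is just one possible client); the gzipped bytes go over the wire as-is, with no base-64 step:

import gzip
import requests  # any HTTP client that can send raw bytes will do

with open("backup.sqlite", "rb") as f:          # placeholder filename
    compressed = gzip.compress(f.read())

resp = requests.post(
    "https://example.com/api/upload",           # placeholder URL
    data=compressed,                            # raw binary body, not base 64
    headers={
        "Content-Type": "application/octet-stream",
        "Content-Encoding": "gzip",
    },
)

If, as you say, you want the server to store the bytes still compressed, you can simply omit the Content-Encoding header and use a content type such as application/gzip instead; the transfer is equally safe either way.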
My best guess is that it depends on if what you are sending is binary data or not.
No. These days, for all practical intents and purposes, it depends only on what application-layer protocol you are using for the transmission. If it specifies that some or all of the message is to be base-64 encoded (according to a particular base-64 scheme, as there are more than one) then that's what the sender must do and how the receiver will interpret the message. If the protocol does not specify that, then the sender must not perform base-64 encoding. Some protocols afford the sender the option to make this choice, but those also provide a way for the sender to indicate inside the transmission what choice has been made.
Is either Gzip compressed binary data or uncompressed text safe to transmit, or should it be base 64 encoded as the final step before sending it?
Neither is inherently unsafe to transmit on today's networks. Whether data are base-64 encoded for transmission is a question of agreement between sender and receiver.
Okay I lied there is one more question involved in this. Would sending gzip compressed text always be safe to send without base 64 encoding it at the end, no matter which character encoding it had prior to compression?
The character encoding of the uncompressed text is not a factor in whether a gzipped version can be safely and successfully conveyed. But it probably matters for the receiver or anyone to whom they forward that data to understand the uncompressed text correctly. If you intend to accommodate multiple character encodings then you will want to provide a way to indicate which applies to each text.
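To illustrate that last point, a small sketch showing that gzip only ever sees bytes; the character encoding matters only when the receiver decodes the text again:

import gzip

text = "Grüße"  # sample text with non-ASCII characters

for enc in ("utf-8", "latin-1"):
    payload = gzip.compress(text.encode(enc))            # gzip works on bytes either way
    assert gzip.decompress(payload).decode(enc) == text  # receiver must know the encoding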

Design of the Protobuf binary format: performance and varint

I need to design a binary format to save data from a scientific application. This data has to be encoded in a binary format that can't be easily read by any other application (it is a requirement by some of our clients). As a consequence, we decided to build our own binary format, its encoder and its decoder.
We got some inspiration from many binary formats, including protobuf. One thing that puzzles me is the way protobuf encodes the length of embedded messages. According to https://developers.google.com/protocol-buffers/docs/encoding, the size of an embedded message is encoded at its very beginning as a varint.
But before we encode an embedded message, we don't yet know its size (think for instance of an embedded message that contains many integers encoded as varints). As a consequence, we need to encode the message entirely before we write it to the disk, so we know its size.
Imagine that this message is huge. As a consequence, it is very difficult to encode it in an efficient way. We could encode this size as a full int and seek back to this part of the file once the embedded message is written, but we lose the nice property of varints: you don't need to specify whether you have a 32-bit or a 64-bit integer. So going back to Google's implementation using a varint:
Is there an implementation trick I am missing, or is this scheme likely to be inefficient for large messages?
Yes, the correct way to do this is to write the message first, at the back of the buffer, and then prepend the size. With proper buffer management, you can write the message in reverse.
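As a sketch of the scheme (protobuf-style base-128 varints; this is not protobuf's actual implementation, and the helper names are made up):

def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 data bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_embedded(payload: bytes) -> bytes:
    """Encode the embedded message first, then prepend its length as a varint."""
    return encode_varint(len(payload)) + payload

# Example: an embedded message made of three varint-encoded integers.
body = b"".join(encode_varint(v) for v in (1, 300, 70000))
framed = encode_embedded(body)   # b'\x06\x01\xac\x02\xf0\xa2\x04'

A production encoder can avoid the intermediate copy the way described above: write the body into the tail end of a buffer, then write the varint length immediately in front of it.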
That said, why write your own message format at all? It would be better to use protobuf directly and encrypt the resulting file. That would be easy for you to use, and still hard for other applications to read.

Decoding a file compressed with an obsolete language

I'm trying to decompress a data file that was originally compressed with an extension for AMOS Pro, the old Amiga BASIC language, that shipped with the AMOS Pro compiler. I've still got the programming language and have access to the compressor and decompressor, but I'm trying to decompress the files using C. I ultimately want to be able to view these files on modern hardware without having to resort to using an Amiga emulator first.
However, there's no documentation as to how the compressor worked, so I'm trying to reverse-engineer it solely from watching its behaviour. Here's what I've got so far.
This is a raw file (ASCII):
AABCDEFGHIJKLMNOPQRSTUVWXYZAABCDEFGHIJKLMNOPQRSTUVWXYZAABCDEFGHIJKLMNOPQRSTUVWXYZ
Here's the compressed version (hex):
D802C6B5
05048584
4544C5C4
2524A5A4
6564E5E4
15149594
5554D5D4
3534B591
00000007
AD763363
00000051
Testing with various files has given me a few insights:
The last 4 bytes are the size of the original file.
The file seems to function as a bit stream, so byte boundaries aren't important (I say this because I've seen ASCII codes appear in a few files and they aren't aligned to byte boundaries).
All of the bits in the file are stored in reverse.
The first 4 bytes seem to represent a sequence length. In the above example, the first byte 0xD8 is 11011000 in binary; mirror it (the bits are in reverse) and you get 00011011, which is 0x1B in hex or 27 in decimal. That matches the sequence length (the repeated run "AABCDEFGHIJKLMNOPQRSTUVWXYZ" is 27 characters).
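For what it's worth, a minimal Python sketch that just checks these observations against the dump above (the field sizes are guesses based on the observations, not a documented format):

compressed = bytes.fromhex(
    "D802C6B5" "05048584" "4544C5C4" "2524A5A4" "6564E5E4"
    "15149594" "5554D5D4" "3534B591" "00000007" "AD763363" "00000051"
)

def reverse_bits(byte: int) -> int:
    """Mirror a single byte, e.g. 0xD8 (11011000) -> 0x1B (00011011)."""
    return int(f"{byte:08b}"[::-1], 2)

mirrored = bytes(reverse_bits(b) for b in compressed)

print(hex(mirrored[0]))                        # 0x1b == 27, the run length noted above
print(int.from_bytes(compressed[-4:], "big"))  # 81, the size of the original file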
However, I'm not making any more progress. Does this look like a standard compression algorithm? What do I try next?
As you've posted here, the compression function is called "squash", which is part of AMOS Pro.
As such, my advice would be to try one of the following lines of attack:
Reverse engineer the algorithm by analyzing its output: This is definitely not a viable option. You will only waste time.
Read, annotate, understand the source code of the unsquash function in AMOS Pro
Contact the author of AMOS Pro
Read the source code
The source code for AMOS Pro is apparently in the public domain now and can be found here:
http://www.pianetaamiga.it/downloads/AMOSPro_Sources.zip
It consists of 68000 assembly code and quite a few compiled object files.
The unsquash function can be found in the file +header.s on line 1061 and onwards. It is not documented, except for its entry register values, which is good at least. It doesn't appear to be a very large function so this might be worth a shot.
You will need to have, or pick up, a rudimentary understanding of 68000 machine code. The function does not appear to call out to system libraries or anything and only seems to operate directly on memory, which would suggest this is actually doable (i.e. understanding the code). Still, I've never written or read 68000 code in my life, so what do I know.
Contact the author of AMOS Pro
The author of AMOS Pro is François Lionet, as is evident from the User Guide. He founded Clickteam in the mid-90s to make game- and multimedia-making software. He still seems to be at that company, and according to forum posts from others looking into AMOS Pro he seems to be willing to answer email. Sadly I don't know his email address, but the Clickteam website above should give you a starting point.

Decoding a picture from a gps tracker

I'm developing a server for a GPS tracker that can send pictures taken by a camera connected to it, inside a vehicle.
The problem is that I follow every step in the manual and I still can't decode the bytes sent by the tracker into a picture:
I receive the picture in packages, each one separated by headers and "tails". When I receive the bytes I convert them into hexadecimal as the manual specifies, then I have to remove the headers and "tails", and apparently after joining the remaining data and saving it as a .jpeg, the image should appear, but it doesn't.
The company name is "Toplovo", from China. Has anyone else solved something similar?
Are the line feeds part of your actual data? Because if so I doubt that's supposed to happen.
Otherwise, make sure you're writing the file in binary mode. In some languages this matters. You didn't really specify, but make sure you're not in text mode. Also make sure you're not using any datatypes unsuited for hexadecimal values (again, we don't even know what language you're using, so it's kind of hard to give specific suggestions).
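As a rough Python sketch of that (the HEADER and TAIL values are placeholders; use whatever delimiters the Toplovo manual actually specifies):

HEADER = "7878"   # placeholder, not the real delimiter
TAIL = "0d0a"     # placeholder, not the real delimiter

def extract_payload(hex_package: str) -> bytes:
    """Strip the header/tail from one package (given as a hex string) and return raw bytes."""
    body = hex_package.lower()
    if body.startswith(HEADER):
        body = body[len(HEADER):]
    if body.endswith(TAIL):
        body = body[:-len(TAIL)]
    return bytes.fromhex(body)

def save_picture(packages, path="picture.jpg"):
    """packages: the hex strings received from the tracker, in order."""
    with open(path, "wb") as out:   # "wb": binary mode, no text/newline translation
        for package in packages:
            out.write(extract_payload(package))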

How do I send Hex data?

I am trying to communicate with a modbus slave via either modbusTCP or modbus serial. The manufacturer (Partlow) has an ASCII communications manual (http://www.partlow.com/uploadedFiles/Downloads/1160%20ASCII%20Comms%20Manual.pdf) which looks like it differs from the standard communication methods (http://en.wikipedia.org/wiki/Modbus). A lot of existing code out there is set up to work with normal modbus addressing of coils and such, where it seems (at least to me) to be different with these guys.
So, via ruby or perl, how can I send hex data? I may be doing everything fine, but, if I write "0DFA" to a serial port... is that ok? or do I need to convert it into a lower layer first, or denote it somehow?
I've been working on this a lot and may have myself mixed up (making things out to be more complicated than they are), but I am trying to establish comms with this meter, and I can see the TX activity light blink but no RX, which means my data format is wrong...
Been working off this mostly (and a few perl snippets here and there, trying to find something that works):
http://www.messen-und-deuten.de/modbus.html
I am communicating through a terminal server, which accepts modbusTCP (which this script uses), but I'm having trouble applying what's in the comm manual to the code above to get the packet formatted correctly.
Are you talking about raw data? There are several ways, including
print HANDLE "\x{0D}\x{FA}";
printf HANDLE "%c%c", 0x0D, 0xFA;
print HANDLE "\015\372"; # octal notation
print HANDLE pack("C*", 0x0D, 0xFA);
syswrite HANDLE, "\x{0D}\x{FA}", 2;
I would recommend you look at the RModBus library to help handle some of the intricacies of packet formation over TCP/IP from inside the Ruby language.
It is always possible that the device you are communicating with requires, or conversely avoids, the Modicon notation. That was a bit of a hiccup when I first tried reading registers from a PLC. The other "gotcha" that I've found with Modbus is that some of the addressing systems are offset by one due to quirkiness in their implementation.
