Can the AES algorithm work the same way on plain text and on byte sequences?

It's clear how the algorithm handles plain text: the characters' byte values fill the state matrix.
But what about AES encryption of binary files?
How does the algorithm manage files larger than 16 bytes, given that the state is standardized to be 4x4 bytes?

The AES primitive is the basis of constructions that allow encryption/decryption of arbitrary binary streams.
AES-128 takes a 128-bit key and a 128-bit data block and "encrypts" or "decrypts" this block. 128 bits is 16 bytes. Those 16 bytes can be text (e.g. ASCII, one character per byte) or binary data.
A naive implementation would just break a file longer than 16 bytes into groups of 16 bytes and encrypt each of these with the same key. You might also need to "pad" the file to make its length a multiple of 16 bytes. The problem is that this exposes information about the file, because every time you encrypt the same block with the same key you get the same ciphertext.
There are different ways to build on the AES function to encrypt/decrypt more than 16 bytes securely. For example you can use CBC or use counter mode.
Counter mode is a little easier to explain, so let's look at that. If AES_e(k, b) encrypts block b with key k, we do not want to use the same key to encrypt the same block more than once. So the construction we'll use is something like this:
Calculate AES_e(k, 0), AES_e(k, 1), ..., AES_e(k, n)
Now we can take arbitrary input, break it into 16-byte blocks, and XOR it with this sequence. Since the attacker does not know the key, they cannot regenerate this sequence and decode our (longer) message. The XOR is applied bit by bit between the blocks generated above and the cleartext. The receiving side can generate the same sequence, XOR it with the ciphertext, and retrieve the cleartext.
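To make the construction concrete, here is a minimal sketch of the counter-mode idea. It is not real AES: `block_fn` is a hypothetical stand-in (SHA-256 truncated to 16 bytes) so the example runs with only the standard library, and the names `block_fn` and `ctr_xor` are made up for illustration. A real implementation would call AES from a vetted crypto library and would start from a fresh nonce per message rather than from 0.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def block_fn(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES_e(k, b). NOT AES: just a keyed 16-byte function."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def ctr_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with block_fn(key, 0), block_fn(key, 1), ...

    The same call encrypts and decrypts, because XOR is its own inverse.
    """
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        counter = (i // BLOCK).to_bytes(BLOCK, "big")  # block number as 16 bytes
        keystream = block_fn(key, counter)
        chunk = data[i:i + BLOCK]                      # last chunk may be short
        out.extend(c ^ k for c, k in zip(chunk, keystream))
    return bytes(out)


key = b"k" * 16
message = bytes(range(40))            # arbitrary binary data, not a multiple of 16
ciphertext = ctr_xor(key, message)
recovered = ctr_xor(key, ciphertext)  # the receiver regenerates the same sequence
```

Note that no padding is needed here: the last, shorter chunk simply uses fewer keystream bytes.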
In a real application you also want to combine this with some sort of authentication mechanism, so you would use something like AES-GCM or AES-CCM.

Imagine you have a 17-byte plaintext.
The state matrix is filled with the first 16 bytes and that block is encrypted.
The next block holds the 1 byte that is left, and the state matrix is padded with extra data to fill the 16 bytes AES needs.
This works just as well with binary files because AES always operates on bytes. It does not matter whether a chunk is ASCII or anything else; everything in a computer is ultimately bytes and bits. As long as the data is a stream of bytes, it will work fine.
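The 17-byte case can be sketched as follows. The answer does not say which padding scheme is used, so this assumes PKCS#7 (append n bytes, each with value n); `pad_pkcs7` and `split_blocks` are hypothetical helper names.

```python
BLOCK = 16  # AES block size in bytes


def pad_pkcs7(data: bytes, block: int = BLOCK) -> bytes:
    """Append n bytes of value n so the length becomes a multiple of `block`."""
    n = block - len(data) % block   # always 1..block, even if already aligned
    return data + bytes([n]) * n


def split_blocks(data: bytes, block: int = BLOCK) -> list[bytes]:
    """Cut data into consecutive block-sized pieces."""
    return [data[i:i + block] for i in range(0, len(data), block)]


plaintext = bytes(range(17))                 # 17 bytes of arbitrary binary data
blocks = split_blocks(pad_pkcs7(plaintext))  # two 16-byte blocks for the cipher
```

The second block holds the one leftover byte followed by 15 padding bytes, each with value 15.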


What's the reason behind ZigZag encoding in Protocol Buffers and Avro?

ZigZag requires a lot of overhead to write/read numbers. Actually I was stunned to see that it doesn't just write int/long values as they are, but does a lot of additional scrambling. There's even a loop involved:
I can't find in the Protocol Buffers docs or the Avro docs, or work out for myself, what the advantage of scrambling numbers like that is. Why is it better to have positive and negative numbers alternate after encoding?
Why aren't they just written in little-endian, big-endian, or network order, which would only require reading them into memory and possibly reversing the byte order? What do we buy by paying with performance?
It is a variable-length 7-bit encoding. Every byte of the encoded value except the last has its high bit set to 1; the last byte has it at 0. That is how the decoder can tell how many bytes were used to encode the value. Byte order is always little-endian, regardless of the machine architecture.
It is an encoding trick that permits writing as few bytes as needed to encode the value. ZigZag first maps signed values to unsigned ones (0, -1, 1, -2, ... become 0, 1, 2, 3, ...), so that values of small magnitude, whether negative or positive, only have low bits set. So an 8-byte long with a value between -64 and 63 takes only one byte. That is the common case; the full range provided by long is very rarely used in practice.
Packing the data tightly without the overhead of a gzip-style compression method was the design goal. The same kind of encoding is also used in the .NET Framework. The processor overhead needed to en/decode the value is inconsequential: it is already much lower than that of a compression scheme, and it is a very small fraction of the I/O cost.
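A small sketch of both steps, assuming 64-bit signed inputs. The function names are made up for illustration and are not taken from the Protocol Buffers or Avro libraries.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def zigzag_encode(n: int) -> int:
    """Map a signed 64-bit int to unsigned: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return ((n << 1) ^ (n >> 63)) & MASK64


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def varint_encode(z: int) -> bytes:
    """7 bits per byte, low group first; high bit set on all but the last byte."""
    out = bytearray()
    while z >= 0x80:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def varint_decode(buf: bytes) -> int:
    """Read one varint from the start of buf."""
    z = shift = 0
    for b in buf:
        z |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:   # high bit clear: this was the last byte
            break
    return z


small = varint_encode(zigzag_encode(-1))   # one byte
naive = varint_encode(-1 & MASK64)         # two's complement as-is: ten bytes
```

Without the ZigZag step, every negative number has its top bits set and costs the maximum ten bytes as a varint, which is what the alternating mapping avoids.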

Cache calculating block offset and index

I've read several topics on this subject but could not find the answer. So my question is:
1) How is the block offset calculated?
I want to understand the concept, not just the formula. As far as I know, it is the number of positions at which a block can store an address. For example, if a block has 8 bytes of storage and has to store 2-byte addresses, is its block offset 2 bits? (So there are 4 positions to store the address; the diagram below might make it easier to see what I am saying.)
The number of block offset bits is simply log2(cache_line_size).
The reason is that all systems that I know of are byte addressable. So you need enough bits to index any byte in the block. Although most systems have a word size that is larger than a single byte, they still support offsets of single-byte granularity, even if that is not the common case.
So for the example you mentioned of an 8-byte block size with 2-byte word, you would still need 3 bits in order to allow accessing any byte. If you had a system that was not byte addressable then you could use just 2 bits for the block offset. But in practice all systems that I know of are byte addressable.
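As a rough illustration, here is how an address could be split for a byte-addressable cache. The helper names and the `num_sets` parameter are assumptions for the sketch (the question only gives the block size), and both sizes are assumed to be powers of two.

```python
def offset_bits(size: int) -> int:
    """log2(size) for a power-of-two size, e.g. bytes per cache line."""
    assert size > 0 and size & (size - 1) == 0, "size must be a power of two"
    return size.bit_length() - 1


def split_address(addr: int, line_size: int, num_sets: int) -> tuple[int, int, int]:
    """Return (tag, index, offset) for a byte address."""
    ob = offset_bits(line_size)            # low bits select the byte in the line
    ib = offset_bits(num_sets)             # next bits select the set
    offset = addr & (line_size - 1)
    index = (addr >> ob) & (num_sets - 1)
    tag = addr >> (ob + ib)                # whatever remains is the tag
    return tag, index, offset


bits_for_8_byte_block = offset_bits(8)                 # 3, regardless of word size
tag, index, offset = split_address(0b1011_01_101, line_size=8, num_sets=4)
```

For the 8-byte block in the question this gives 3 offset bits, matching the byte-addressable answer rather than the 2 bits a word-addressable scheme would use.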

Counter Size in AES Driver in CTR mode in linux kernel

I have seen multiple open source drivers for AES (CTR) mode for different crypto hardware engines, but I am not really sure about the counter size, nonce, etc.
Can anyone provide some info on the following?
How does the AES driver identify the counter size during the CTR mode of operation?
It looks like AES in CTR mode supports counter sizes of multiple lengths, as below:
1: First is a counter which is made up of a nonce and counter. The nonce is random, and the remaining bytes are counter bytes (which are incremented).
For example, a 16 byte block cipher might use the high 8 bytes as a nonce, and the low 8 bytes as a counter.
2: Second is a counter block, where all bytes are counter bytes and can be incremented as carries are generated.
For example, in a 16 byte block cipher, all 16 bytes are counter bytes
Does the Linux kernel crypto subsystem increment the counter value for every block of input, or does that need to be taken care of by the kernel driver for the respective crypto H/W?
Counters and nonces are extracted from the IV, i.e., IV = nonce + counter. Note that if "l" is the length of the IV, then the first "l/2" is the length of the nonce and the next "l/2" is the length of the counter. Please let me know whether my understanding of the IV, counter and nonce is correct.
Any information regarding the above is really appreciated.
How does the AES driver identify the counter size during the CTR mode of operation?
It most likely doesn't. As long as it sees the IV as one big 128-bit counter, there isn't a problem. If the counter were 64 bits and initialized to all zeros, you would only have a problem after 2^64 = 18,446,744,073,709,551,616 (16-byte) blocks of data; that's not likely to happen.
Does the Linux kernel crypto subsystem increment the counter value for every block of input, or does that need to be taken care of by the kernel driver for the respective crypto H/W?
It needs to be taken care of by the kernel driver. I only see an IV as input in the API. This is commonly the case for crypto APIs. You cannot get any performance if you have to update the counter for each 16 bytes you want to encrypt.
Counters and nonces are extracted from the IV, i.e., IV = nonce + counter. Note that if "l" is the length of the IV, then the first "l/2" is the length of the nonce and the next "l/2" is the length of the counter. Please let me know whether my understanding of the IV, counter and nonce is correct.
Yes, you understand correctly. You would only have a problem if the protocol uses a separate nonce and counter and both are generated randomly. In that case you may have a problem with the carry from the counter to the nonce field.
Note that it may be a good idea to limit the data size to, say, ~68 GB (2^32 blocks of 16 bytes, the most a 4-byte counter can cover) and use the top 12 bytes as a random nonce to avoid being bitten by the birthday problem.
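The layout and the carry issue can be illustrated with a short sketch. This is plain Python, not kernel code; the 12-byte nonce / 4-byte counter split follows the suggestion above, and `make_iv` and `increment_128` are hypothetical names.

```python
import os

NONCE_LEN = 12    # random nonce in the top bytes of the IV
COUNTER_LEN = 4   # block counter in the bottom bytes of the IV


def make_iv(nonce: bytes, counter: int = 0) -> bytes:
    """Build a 16-byte IV as nonce followed by a big-endian counter."""
    assert len(nonce) == NONCE_LEN
    return nonce + counter.to_bytes(COUNTER_LEN, "big")


def increment_128(iv: bytes) -> bytes:
    """Treat the whole IV as one big-endian 128-bit counter and add 1."""
    return ((int.from_bytes(iv, "big") + 1) % (1 << 128)).to_bytes(16, "big")


# Most data one nonce can cover before the 4-byte counter field wraps.
max_bytes = (1 << (8 * COUNTER_LEN)) * 16   # 2^32 blocks of 16 bytes

iv = make_iv(os.urandom(NONCE_LEN), counter=0xFFFFFFFF)
rolled = increment_128(iv)   # counter field wraps to 0 and the carry alters the nonce
```

Here `max_bytes` is 68,719,476,736, which is where the ~68 GB figure comes from. The last two lines show the carry from the counter into the nonce field when an engine increments all 128 bits.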

Improve this compression algorithm?

I am compressing 8-bit bytes, and the algorithm works only if the number of unique byte values found in the data is 128 or less.
I take all the unique bytes. At the start I store a table containing each unique byte once. If there are 120 of them, I store 120 bytes.
Then, instead of storing each item in space of 8 bits, I store each item in 7 bits, one after another. Those 7 bits contain the item's position on the table.
Question: how can I avoid storing those 120 bytes at the start, by storing the possible tables in my code?
What you are trying to do is a special case of Huffman coding in which you consider only which bytes occur, not their frequency, and so give each byte a fixed-length code. You can do better: use the frequencies to assign variable-length codes with Huffman coding and get more compression.
But if you intend to keep the same algorithm, then consider this:
Don't store the 120 bytes; store 256 bits (32 bytes), where a 1 bit indicates that the corresponding byte value is present. That gives you all the information you need: you use the bits to recover the values found in the file and reconstruct the mapping table.
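A minimal sketch of the bitmap variant, to show that the 32-byte bitmap is enough to rebuild the table. The 4-byte length field and the function names are assumptions added so the example round-trips. Packing through one big integer keeps the code short but is slow for large inputs.

```python
def compress(data: bytes) -> bytes:
    """32-byte presence bitmap + 4-byte length + one 7-bit index per input byte."""
    symbols = sorted(set(data))
    if len(symbols) > 128:
        raise ValueError("more than 128 distinct byte values")
    # Bit v of the bitmap is set when byte value v occurs in the data.
    bitmap = sum(1 << v for v in symbols).to_bytes(32, "little")
    index = {v: i for i, v in enumerate(symbols)}
    bits = 0
    for n, b in enumerate(data):
        bits |= index[b] << (7 * n)          # 7 bits per item, back to back
    packed = bits.to_bytes((7 * len(data) + 7) // 8, "little")
    return bitmap + len(data).to_bytes(4, "little") + packed


def decompress(blob: bytes) -> bytes:
    """Rebuild the sorted table from the bitmap, then unpack the 7-bit indices."""
    bitmap = int.from_bytes(blob[:32], "little")
    symbols = [v for v in range(256) if bitmap >> v & 1]
    count = int.from_bytes(blob[32:36], "little")
    bits = int.from_bytes(blob[36:], "little")
    return bytes(symbols[(bits >> (7 * n)) & 0x7F] for n in range(count))


data = b"hello world" * 10
blob = compress(data)
restored = decompress(blob)
```

Because the table is rebuilt in sorted order from the bitmap, both sides agree on the mapping without storing the bytes themselves.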
I don't know the exact algorithm, but probably the idea of the compression algorithm is that you cannot. It has to store those values, so it can write a shortcut for all other bytes in the data.
There is one way in which you could avoid writing those 120 bytes: when you know the contents of those bytes beforehand. For example, when you know that whatever you are going to send, will only contain those bytes. Then you can simply make the table known on both sides, and simply store everything but those 120 bytes.

PyCrypto compatibility with CommonCrypto in CFB mode?

I'm trying to get some Python code to decrypt data that was encrypted using the OS X CommonCrypto APIs. There is little to no documentation on the exact options that CommonCrypto uses, so I need some help figuring out what options to set in PyCrypto.
Specifically, my CommonCrypto decryption setup call is:
CCCryptorCreateWithMode(kCCDecrypt, kCCModeCFB, kCCAlgorithmAES128, ccDefaultPadding, NULL, key, keyLength, NULL, 0, 0, 0, &mAESKey);
My primary questions are:
Since there is both a kCCModeCFB and kCCModeCFB8, what is CommonCrypto's definition of CFB mode - what segment size, etc?
What block size is the CommonCrypto AES128 using? 16 or 128?
What is the default padding, and does it even matter in CFB mode?
Currently, the first 4 bytes of data is decrypting successfully with PyCrypto *as long as I set the segment_size to 16*.
Without knowing CommonCrypto or PyCrypto, some partial answers:
AES (in all three variants) has a block size of 128 bits, which is 16 bytes.
CFB (cipher feedback mode) would actually also work without padding (i.e. with a partial last block), since for each block the ciphertext is created as the XOR of plaintext with some keystream block, which only depends on previous blocks. (You still can use any padding you want.)
If you can experiment with some known data, first have a look at the ciphertext size. If it is not a multiple of a full block (and the same as the plaintext + IV), then it is quite likely no padding. Otherwise, decrypt it with noPadding mode, have a look at the result, and compare with the different known padding modes. From a glance at the source code, it might be PKCS#5-padding.
CFB8 is a variant of CFB which uses only the top 8 bits (= one byte) of each block cipher call output (which takes the previous 128 bits (= 16 bytes) of ciphertext (or IV) as input). This needs 16 times as many block cipher calls, but allows partial sending of a stream without having to worry about block boundaries.
There is another definition of CFB which includes a segment size: here the segment size is the number of bits (or bytes) to be used from each cipher output. In this definition, the "plain" CFB would have a segment size of 128 bits (= 16 bytes), and CFB8 would have a segment size of 8 bits (one byte).
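The segment-size definition can be sketched generically. This is not CommonCrypto or PyCrypto code and not real AES: `block_fn` is a hypothetical stand-in (truncated SHA-256) so the example runs with the standard library alone, and `cfb` is a made-up name. It only shows how the segment size changes the number of block cipher calls.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def block_fn(key: bytes, block: bytes) -> bytes:
    """Stand-in for the AES block encryption. NOT AES."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def cfb(key: bytes, iv: bytes, data: bytes, seg_bytes: int, decrypt: bool = False):
    """CFB with seg_bytes per segment; returns (output, number of block_fn calls)."""
    register, out, calls = iv, bytearray(), 0
    for i in range(0, len(data), seg_bytes):
        keystream = block_fn(key, register)[:seg_bytes]  # top bytes of the output
        calls += 1
        chunk = data[i:i + seg_bytes]
        result = bytes(c ^ k for c, k in zip(chunk, keystream))
        out.extend(result)
        ciphertext = chunk if decrypt else result
        register = (register + ciphertext)[-BLOCK:]      # shift in the ciphertext
    return bytes(out), calls


key, iv, data = b"k" * 16, b"\x00" * 16, bytes(range(20))
ct8, calls8 = cfb(key, iv, data, seg_bytes=1)        # CFB8: one call per byte
ct128, calls128 = cfb(key, iv, data, seg_bytes=16)   # full-block CFB
pt8, _ = cfb(key, iv, ct8, seg_bytes=1, decrypt=True)
```

In this sketch the 20-byte input needs 20 calls with 1-byte segments and 2 calls with 16-byte segments, and the output is 20 bytes in both cases, so no padding is involved. The two ciphertexts share their first byte but diverge after that, so a mismatched segment size can still decrypt the start of the data correctly.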
