How many numbers can we store with 1 bit? - byte

I want to know how many characters or numbers can I store in 1 bit only. It will be more helpful if you tell it in octal, hexadecimal.

I want to know how many characters or numbers can I store in 1 bit only.
It is not practical to use a single bit to store numbers or characters. However, you could say:
One integer provided that the integer is in the range 0 to 1.
One ASCII character provided that the character is either NUL (0x00) or SOH (0x01).
The bottom line is that a single bit has two states: 0 and 1. Any value domain with more that two values in the domain cannot be represented using a single bit.
It will be more helpful if you tell it in octal, hexadecimal.
That is not relevant to the problem. Octal and hexadecimal are different textual representations for numeric data. They make no difference to the meaning of the numbers, or (in most cases1) the way that you represent the numbers in a computer.
1 - The exception is when you are representing numbers as text; e.g. when you represent the number 42 in a text document as the character '4' followed by the character '2'.

A bit is a "binary digit", or a value from a set of size two. If you have one or more bits, you raise 2 to the power of the number of bits. So, 2ยน gives 2. The field in Mathematics is called combinatorics.

Related

Bitmasking--when to use hex vs binary

I'm working on a problem out of Cracking The Coding Interview which requires that I swap odd and even bits in an integer with as few instructions as possible (e.g bit 0 and 1 are swapped, bits 2 and 3 are swapped, etc.)
The author's solution revolves around using a mask to grab, in one number, the odd bits, and in another num the even bits, and then shifting them off by 1.
I get her solution, but I don't understand how she grabbed the even/odd bits. She creates two bit masks --both in hex -- for a 32 bit integer. The two are: 0xaaaaaaaa and 0x55555555. I understand she's essentially creating the equivalent of 1010101010... for a 32 bit integer in hexadecimal and then ANDing it with the original num to grab the even/odd bits respectively.
What I don't understand is why she used hex? Why not just code in 10101010101010101010101010101010? Did she use hex to reduce verbosity? And when should you use one over the other?
It's to reduce verbosity. Binary 10101010101010101010101010101010, hexadecimal 0xaaaaaaaa, and decimal 2863311530 all represent exactly the same value; they just use different bases to do so. The only reason to use one or another is for perceived readability.
Most people would clearly not want to use decimal here; it looks like an arbitrary value.
The binary is clear: alternating 1s and 0s, but with so many, it's not obvious that this is a 32-bit value, or that there isn't an adjacent pair of 1s or 0s hiding in the middle somewhere.
The hexadecimal version takes advantage of chunking. Assuming you recognize that 0x0a == 0b1010, you can mentally picture the 8 groups of 1010 in the assumed value.
Another possibility would be octal 25252525252, since... well, maybe not. You can see that something is alternating, but unless you use octal a lot, it's not clear what that alternating pattern in binary is.

Encode string to an specified length using any algo

Is there a way to compress/encode string to specified length(8/10 character).
I have a combination of secret key and a numeric value of 16 digit, and I want to create a unique id with combination of these both. which length should be between 8-12, and it should not change if combination is same.
Please suggest a way.
If it's 16 decimal digits and your string can contain any characters, then sure. If you want ten characters out, then you'd need 40 different characters. 4010 > 1016. Or for nine characters out, you need 60 different characters. 609 > 1016. E.g. some subset of the upper case letters, lower case letters, and digits (62 to choose 40 or 60 from). Then it is simply a matter of base conversion either way. Convert from base 10 to base 40 or 60, and then back.
Many languages already have Base-64 coding routines, which will get you to nine characters.
Eight is a problem, since you would need 100 characters (1008 == 1016), and there are only 95 printable ASCII characters.
You could use a secure hash function, like sha512, and truncate the resulting hex string to the desired length.
If you want slightly more entropy, you can base64 encode it before truncating.

How to encode a number as a string such that the lexicographic order of the generated string is in the same order as the numeric order

For eg. if we have two strings 2 and 10, 10 will come first if we order lexicographically.
The very trivial sol will be to repeat a character n number of time.
eg. 2 can be encoded as aa
10 as aaaaaaaaaa
This way the lex order is same as the numeric one.
But, is there a more elegant way to do this?
When converting the numbers to strings make sure that all the strings have the same length, by appending 0s in the front if necessary. So 2 and 10 would be encoded as "02" and "10".
While kjampani's solution is probably the best and easiest in normal applications, another way which is more space-efficient is to prepend every string with its own length. Of course, you need to encode the length in a way which is also consistently sorted.
If you know all the strings are fairly short, you can just encode their length as a fixed-length base-X sequence, where X is the number of character codes you're willing to use (popular values are 64, 96, 255 and 256.) Note that you have to use the character codes in lexicographical order, so normal base64 won't work.
One variable-length order-preserving encoding is the one used by UTF-8. (Not UTF-8 directly, which has a couple of corner cases which will get in the way, but the same encoding technique. The order-preserving property of UTF-8 is occasionally really useful.) The full range of such compressed codes can encode values up to 42 bits long, with an average of five payload bits per byte. That's sufficient for pretty long strings; four terabyte long strings are pretty rare in the wild; but if you need longer, it's possible, too, by extending the size prefix over more than one byte.
Break the string into successive sub strings of letters and numbers and then sort by comparing each substring as an integer if it's an numeric string
"aaa2" ---> aaa + 2
"aaa1000" ---> aaa + 1000
aaa == aaa
Since they're equal, we continue:
1000 > 2
Hence, aaa1000 > aaa2.

How to get ASCII code of an character in assembly language?

I need to enter a string and to show that string like array of ASCII codes.
How can i implement it in assembly language.
In assembly language, characters are already encoded in ASCII (or unicode or whatever). You work with characters as numbers.
What you need to be able to is to format numbers in their denary representation, for output. This is not specific to character codes.
There will almost certainly be library routines to do this, but it's not hard to do yourself. Basically, you write a loop which repeatedly extracts the lowest digit from the number (by taking the residue of the number modulo 10 - look for a MOD instruction), converts that into the character code for a digit (by adding 48) and adds it to a buffer, then divides the number by 10 to move on to the next digit. You repeat that until the number is zero.

how to represent a n-byte array in less than 2*n characters

given that a n-byte array can be represented as a 2*n character string using hex, is there a way to represent the n-byte array in less than 2*n characters?
for example, typically, an integer(int32) can be considered as a 4-byte array of data
The advantage of hex is that splitting an 8-bit byte into two equal halves is about the simplest thing you can do to map a byte to printable ASCII characters. More efficient methods consider multiple bytes as a block:
Base-64 uses 64 ASCII characters to represent 6 bits at a time. Every 3 bytes (i.e. 24 bits) are split into 4 6-bit base-64 digits, where the "digits" are:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
(and if the input is not a multiple of 3 bytes long, a 65th character, "=", is used for padding at the end). Note that there are some variant forms of base-64 use different characters for the last two "digits".
Ascii85 is another representation, which is somewhat less well-known, but commonly used: it's often the way that binary data is encoded within PostScript and PDF files. This considers every 4 bytes (big-endian) as an unsigned integer, which is represented as a 5-digit number in base 85, with each base-85 digit encoded as ASCII code 33+n (i.e. "!" for 0, up to "u" for 84) - plus a special case where the single character "z" may be used (instead of "!!!!!") to represent 4 zero bytes.
(Why 85? Because 845 < 232 < 855.)
yes, using binary (in which case it takes n bytes, not surprisingly), or using any base higher than 16, a common one is base 64.
It might depend on the exact numbers you want to represent. For instance, the number 9223372036854775808, which requres 8 bytes to represent in binary, takes only 4 bytes in ascii, if you use the product of primes representation (which is "2^63").
How about base-64?
It all depends on what characters you're willing to use in your encoding (i.e. representation).
Base64 fits 6 bits in each character, which means that 3 bytes will fit in 4 characters.
Using 65536 of about 90000 defined Unicode characters you may represent binary string in N/2 characters.
Yes. Use more characters than just 0-9 and a-f. A single character (assuming 8-bit) can have 256 values, so you can represent an n-byte number in n characters.
If it needs to be printable, you can just choose some set of characters to represent various values. A good option is base-64 in that case.

Resources