Using logical bitwise operators to convert ASCII to binary

How could AND, OR, NOT or XOR be used to convert 7-bit ASCII code for a numeric character into a 7-bit pure binary representation of the number?

code AND 1111₂
You have to take the four least significant bits of your ASCII code.
Please note that a single ASCII digit character can only represent the digits 0 to 9, so 4 output bits are enough to represent the resulting number.
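A minimal Go sketch of the masking approach (the helper name is just for illustration):
package main

import "fmt"

// asciiDigitToBinary keeps only the four least significant bits of the
// ASCII code, e.g. '7' (011 0111) AND 000 1111 gives 000 0111 = 7.
func asciiDigitToBinary(c byte) byte {
	return c & 0x0F // 0x0F is 1111 in binary
}

func main() {
	fmt.Printf("%07b\n", asciiDigitToBinary('7')) // prints 0000111
}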

A very late answer.
I would XOR the ASCII code with the bitmask 011 0000 (48). The idea is that subtracting 48 from the ASCII code gives the actual value of the digit, and because the codes for '0' through '9' all have those two bits set and nothing above bit 3, the XOR clears them and mimics the subtraction.
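A corresponding sketch of the XOR variant (again, the function name is made up for the example):
package main

import "fmt"

// asciiDigitXOR clears bits 4 and 5 of an ASCII digit ('0'..'9'),
// which for those codes is the same as subtracting 48.
func asciiDigitXOR(c byte) byte {
	return c ^ 0x30 // 0x30 is 011 0000, i.e. 48
}

func main() {
	fmt.Printf("%07b\n", asciiDigitXOR('7')) // prints 0000111
}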

Related

What is the output of the '%b' verb with a floating-point number?

According to the Go docs, %b used with a floating-point number means:
decimalless scientific notation with exponent a power of two,
in the manner of strconv.FormatFloat with the 'b' format,
e.g. -123456p-78
As the code shows below, the program output is
8444249301319680p-51
I'm a little confused about %b for floating-point numbers. Can anybody tell me how this result is calculated? Also, what does p- mean?
f := 3.75
fmt.Printf("%b\n", f)
fmt.Println(strconv.FormatFloat(f, 'b', -1, 64))
The decimalless scientific notation with exponent a power of two means the following:
8444249301319680*(2^-51) = 3.75 or 8444249301319680/(2^51) = 3.75
p-51 means 2^-51 which can also be calculated as 1/(2^51)
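A quick way to check this in Go, using math.Ldexp (which computes frac × 2^exp):
package main

import (
	"fmt"
	"math"
)

func main() {
	// 8444249301319680 × 2^-51 rebuilds the original value.
	fmt.Println(math.Ldexp(8444249301319680, -51)) // 3.75
}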
Nice article on Floating-Point Arithmetic.
https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
The five rules of scientific notation are given below:
The base is always 10
The exponent must be a non-zero integer, which means it can be either positive or negative
The absolute value of the coefficient is greater than or equal to 1 but it should be less than 10
The coefficient carries the sign (+) or (-)
The mantissa carries the rest of the significant digits
%b scientific notation with exponent a power of two (that is the p)
%e scientific notation
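For comparison, here is what the two verbs print for 3.75 (the comments show the output I'd expect from the standard library):
package main

import "fmt"

func main() {
	f := 3.75
	fmt.Printf("%b\n", f) // 8444249301319680p-51 : exponent is a power of two (the p)
	fmt.Printf("%e\n", f) // 3.750000e+00          : exponent is a power of ten (the e)
}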
It is worth pointing out that the %b output is particularly easy for the runtime system to generate as well, due to the internal storage format for floating point numbers.
If we ignore "denormalized" floating point numbers (we can add them back later), a floating point number is stored, internally, as 1.bbbbbb...bbb × 2^exp for some set of bits ("b" here), e.g., the value four is stored as 1.000...000 × 2^2. The value six is stored as 1.100...000 × 2^2, the value seven is stored as 1.110...000 × 2^2, and eight is stored as 1.000...000 × 2^3. The value seven-and-a-half is 1.111 × 2^2, seven and three quarters is 1.1111 × 2^2, and so on. Each bit here, in the 1.bbbb, represents the next power of two lower than the exponent.
To print out 1.111 × 2^2 with the %b format, we simply note that we need four 1 bits in a row, i.e., the value 15 decimal or 0xf or 1111 binary, which causes the exponent to need to be decreased by 3, so that instead of multiplying by 2^2 or 4, we want to multiply by 2^-1 or ½. So we can take the actual exponent (2), subtract 3 (because we moved the "point" three times to print 1111 binary or 15), and hence print out the string 15p-1.
That's not what Go's %b prints though: it prints 8444249301319680p-50. This is the same value (so either one would be correct output)—but why?
Well, 8444249301319680 is, in hexadecimal, 1E000000000000. Expanded into full binary, this is 1 1110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000. That's 53 binary digits. Why 53 binary digits, when four would suffice?
The answer to that is found in the link in Nick's answer: IEEE 754 floating point format uses a 53-digit "mantissa" or "significand" (the latter is the better term and the one I usually try to use, but you'll see the former pop up very often). That is, the 1.bbb...bbb has 52 bs, plus that forced-in leading 1. So there are always exactly 53 binary digits (for IEEE "double precision").
If we just treat this 53-binary-digit significand as an integer, we can always print it out as a decimal number with no decimal point. That means we just adjust the power-of-two exponent.
In IEEE754 format, the exponent itself is already stored in "excess form", with 1023 added (for double precision again). That means that 1.111000...000 × 2^2 is actually stored with an exponent value of 2+1023 = 1025. What this means is that to get the actual power of two, the machine code formatting the number is already going to have to subtract 1023. We can just have it subtract 52 more at the same time.
Last, because the implied 1 is always there, the internal IEEE754 number doesn't actually store the 1 bit. So to read out the value and convert it, the code internally does:
decimalPart := machineDependentReinterpretation1(&doubleprec_value)
expPart := machineDependentReinterpretation2(&doubleprec_value)
where the machine-dependent-reinterpretation simply extracts the correct bits, puts in the implied 1 bit as needed in the decimal part, subtracts the offset (1023+52) for the exponent part, and then does:
fmt.Sprintf("%dp%d", decimalPart, expPart)
When printing a floating-point number in decimal, the base conversion (from base 2 to base 10) is problematic, requiring a lot of code to get the rounding right. Printing it in binary like this is much easier.
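Here is a concrete sketch of that read-out in Go, using math.Float64bits as the machine-dependent reinterpretation (normalized, positive values only; zero, denormals, infinities and NaN are deliberately ignored, and the function name is made up):
package main

import (
	"fmt"
	"math"
)

// formatB mimics %b for a normalized, positive float64: print the 53-bit
// significand (with the implied leading 1 restored) followed by pNNN, where
// NNN is the stored exponent minus 1023 (the bias) minus 52 (the fraction bits).
func formatB(f float64) string {
	bits := math.Float64bits(f)
	frac := bits & (1<<52 - 1)           // low 52 fraction bits
	storedExp := int(bits >> 52 & 0x7FF) // 11-bit biased exponent field
	significand := frac | 1<<52          // put back the implied leading 1
	return fmt.Sprintf("%dp%d", significand, storedExp-1023-52)
}

func main() {
	fmt.Println(formatB(7.5)) // 8444249301319680p-50
	fmt.Printf("%b\n", 7.5)   // the standard verb agrees
}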
Exercises for the reader, to help with understanding this:
Compute 1.10₂ × 2^2. Note: 1.1₂ is 1½ decimal.
Compute 11.0₂ × 2^1. (11.0₂ is 3.)
Based on the above, what happens as you "slide the binary point" left and right?
(more difficult) Why can we assume a leading 1? If necessary, read on.
Why can we assume a leading 1?
Let's first note that when we use scientific notation in decimal, we can't assume a leading 1. A number might be 1.7 × 10^3, or 5.1 × 10^5, or whatever. But when we use scientific notation "correctly", the first digit is never zero. That is, we do not write 0.3 × 10^0 but rather 3.0 × 10^-1. In this kind of notation, the number of digits tells us about the precision, and the first digit never has to be zero and generally isn't supposed to be zero. If the first digit were zero, we just move the decimal point and adjust the exponent (see exercises 1 and 2 above).
The same rules apply with floating-point numbers. Instead of storing 0.01, for instance, we just slide the binary point over two positions and get 1.00, and decrease the exponent by 2. If we wanted to store 11.1, we slide the binary point one position the other way and increase the exponent. Whenever we do this, the first digit always winds up being a one.
There is one big exception here, which is: when we do this, we can't store zero! So we don't do this for the number 0.0. In IEEE754, we store 0.0 as all-zero-bits (except for the sign, which we can set to store -0.0). This has an all-zero exponent, which the computer hardware handles as a special case.
Denormalized numbers: when we can't assume a leading 1
This system has one notable flaw (which isn't entirely fixed by denorms, but nonetheless, IEEE has denorms). That is: the smallest number we can store "abruptly underflows" to zero. Kahan has a 15 page "brief tutorial" on gradual underflow, which I am not going to attempt to summarize, but when we hit the minimum allowed exponent (2^-1023) and want to "get smaller", IEEE lets us stop using these "normalized" numbers with the leading 1 bit.
This doesn't affect the way that Go itself formats floating point numbers, because Go just takes the entire significand "as is". All we have to do is stop inserting the implied 1 (the 2^52 bit) when the input value is a denormalized number, and everything else Just Works. We can hide this magic inside the machine-dependent float64 reinterpretation code, or do it explicitly in Go, whichever is more convenient.
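As a small check of that last point, the smallest positive denormal (math.SmallestNonzeroFloat64) has a zero exponent field, and %b simply prints its raw significand:
package main

import (
	"fmt"
	"math"
)

func main() {
	d := math.SmallestNonzeroFloat64                   // 2^-1074
	bits := math.Float64bits(d)
	fmt.Printf("exponent field: %d\n", bits>>52&0x7FF) // 0, so no implied leading 1
	fmt.Printf("%b\n", d)                              // 1p-1074: significand taken as is
}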

Using integers to encode short strings

Suppose I were limited to using only 32-bit unsigned integers to express strings. Obviously, I could use individual u8 numbers and allocate enough separate values to describe a short string, but say compute and time aren’t important, this being for my curiosity, not necessarily for a real world use.
I observe that a 32-bit number is the same size as 4 strict u8 chars. In decimal, there’s space to encode 4 of any character-encoding that could be indexed by a 2-digit decimal as their decimal equivalent, while 5 ECMA-1 characters could fit in the same bitsize.
Suppose I want the range of printable characters, using a mapped ASCII table, where I subtract 32 to get the printable characters into 2 decimal digits (32 to 126 become 0 to 94). Suppose a mapping function similar to |c,i|c-31*(10^((i+1)*2)), where c is the ASCII value and i is the position: 45769502. In ASCII values as a u8 array [66, 97, 116, 33], or the string “Bat!”
Clearly this is not computationally efficient. I’m not necessarily shooting for that? Just pure curiosity here.
Supposing compute is arbitrary, so even being totally absurd, how might I encode a longer string in a 32-bit unsigned integer?
First you need to decide on which characters you want to encode. Suppose you have chosen k characters which you have mapped to the numbers 0 to k-1. Then every integer n is mapped to a unique non-empty string by expressing n in base k and mapping each k-ary digit to the corresponding character. You could reserve the maximum integer for the empty string.
So you just need a mapping table for the k characters and a function to convert an integer from one base to another, that's simple and efficient, and the encoding is also optimally dense (since every integer maps to a unique string).
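A minimal Go sketch of this idea, assuming the 95 printable ASCII characters (codes 32..126) as the alphabet, so k = 95; note that in this simple layout a trailing character that maps to 0 behaves like a leading zero, so it is not quite the perfectly dense bijection described above:
package main

import "fmt"

const k = 95 // number of printable ASCII characters, matching the "subtract 32" mapping

// encode reads a short printable-ASCII string as a base-95 number,
// least significant "digit" first. Strings longer than about five
// characters overflow the uint32.
func encode(s string) uint32 {
	var n uint32
	for i := len(s) - 1; i >= 0; i-- {
		n = n*k + uint32(s[i]-32)
	}
	return n
}

// decode is the inverse: peel off base-95 digits and map them back to characters.
func decode(n uint32) string {
	var out []byte
	for {
		out = append(out, byte(n%k)+32)
		n /= k
		if n == 0 {
			break
		}
	}
	return string(out)
}

func main() {
	n := encode("Bat!")
	fmt.Println(n)         // the packed 32-bit value
	fmt.Println(decode(n)) // Bat!
}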

How can one byte be more significant than another?

The difference between little endian and big endian was explained to me like this: "In little endian the least significant byte goes into the low address, and in big endian the most significant byte goes into the low address". What exactly makes one byte more significant than the other?
In the number 222, you could regard the first 2 as most significant because it represents the value 200; the second 2 is less significant because it represents the value 20; and the third 2 is the least significant digit because it represents the value 2.
So, although the digits are the same, the magnitude of the number they represent is used to determine the significance of a digit.
It is the same as when a value is rounded to a number of significant figures ("S.F." or "SF"): 123.321 to 3SF is 123, to 2SF it is 120, to 4SF it is 123.3. That terminology has been used since before electronics were invented.
In any positional numeric system, each digit has a different weight in creating the overall value of the number.
Consider the number 51354 (in base 10): the first 5 is more significant than the second 5, as it stands for 5 multiplied by 10000, while the second 5 is just 5 multiplied by 10.
In computers numbers are generally fixed-width: for example, a 16 bit unsigned integer can be thought of as a sequence of exactly 16 binary digits, with the leftmost one being unambiguously the most significant (it is worth exactly 32768, more than any other bit in the number), and the rightmost the least significant (it is worth just one).
As long as integers are in the CPU registers we don't really need to care about their representation - the CPU will happily perform operations on them as required. But when they are saved to memory (which generally is some random-access bytes store), they need to be represented as bytes in some way.
If we consider "normal" computers, representing a number (bigger than one byte) as a sequence of bytes means essentially representing it in base 256, each byte being a digit in base 256, and each base-256 digit being more or less significant.
Let's see an example: take the value 54321 as a 16 bit integer. If you write it in base 256, it'll be two base-256 digits: the digit 0xD4 (which is worth 0xD4 multiplied by 256) and the digit 0x31 (which is worth 0x31 multiplied by 1). It's clear that the first one is more significant than the second one, as indeed the leftmost "digit" is worth 256 times more than the one at its right.
Now, little endian machines will write in memory the least significant digit first, big endian ones will do the opposite.
Incidentally, there's a nice relationship between binary, hexadecimal and base-256: 4 bits map straight to a hexadecimal digit, and 2 hexadecimal digits map straight to a byte. For this reason you can also see that 54321, which in binary is
1101010000110001 = 0xD431
can be split straight into two groups of 8 bits
11010100 00110001
which are the 0xD4 and 0x31 above. So you can see as well that the most significant byte is the one that contains the most significant bits.
Here I'm using the corresponding hexadecimal values to represent each base-256 digit, as there's no good way to represent them symbolically. I could use their ASCII character values, but 0xD4 is outside ASCII, and 0x31 is '1', which would only add confusion.
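A short Go illustration of the two byte orders, using encoding/binary with the 54321 example:
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	v := uint16(54321) // 0xD431: most significant byte 0xD4, least significant 0x31
	le := make([]byte, 2)
	be := make([]byte, 2)
	binary.LittleEndian.PutUint16(le, v)
	binary.BigEndian.PutUint16(be, v)
	fmt.Printf("little endian: % X\n", le) // 31 D4 (least significant byte at the low address)
	fmt.Printf("big endian:    % X\n", be) // D4 31 (most significant byte at the low address)
}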

A computer represents information in groups of 64 bits. How many different integers can be represented in BCD code?

This from my Interview-MCQ module:
A computer represents information in groups of 64 bits. How many
different integers can be represented in BCD code?
The given answer is 10^16, however no explanation is provided, I was just wondering if somebody could help me understand the answer.
BCD is binary coded decimal. In BCD, every 4 bits is used to represent a single digit from 0 to 9. So if you have 64 bits, that gives you 64/4 = 16 decimal digits, which means you can have 10^16 different integers.
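A minimal sketch of the packing, to make the 4-bits-per-digit idea concrete (the helper name is made up):
package main

import "fmt"

// toBCD packs the decimal digits of n into a uint64, 4 bits per digit.
// 64 bits leave room for 16 digits, hence 10^16 representable integers.
func toBCD(n uint64) uint64 {
	var bcd uint64
	for shift := 0; n > 0; shift += 4 {
		bcd |= (n % 10) << shift
		n /= 10
	}
	return bcd
}

func main() {
	fmt.Printf("%X\n", toBCD(64)) // prints 64: each hex digit holds one decimal digit
}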

What is the 2's complement of -17?

What will be the binary value of -17 and how to find the 2's complement of -17?
Assuming an 8-bit word, start with the binary form of 17: 00010001
Then invert the bits: 11101110
Then just add 1: 11101111.
If you've got a 16-, 32- or 64-bit word then you'll have a load more leading 1s.
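The same steps in a short Go snippet (8-bit word):
package main

import "fmt"

func main() {
	// Two's complement of 17 in an 8-bit word: invert the bits, then add 1.
	x := uint8(17)            // 00010001
	neg := ^x + 1             // 11101111
	fmt.Printf("%08b\n", neg) // 11101111
	y := int8(-17)
	fmt.Printf("%08b\n", uint8(y)) // the same bit pattern, via the signed type
}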
Even if you do not assume a fixed word size, you just have to keep the leftmost bit as the sign bit.
Start with the number itself: 10001.
Inverting the bits gives the one's complement: 01110.
Now add 1 to this number: 01111.
But to keep the leftmost bit as the sign, prepend a 1, e.g. 101111,
in terms of the minimum number of bits required (6 here).
