Bit overflow when subtracting and adding. My video is confusing - overflow

I have two slides in a video that I'm watching that I'm confused about:
Why can no overflow happen when adding a positive and a negative? Give me an example?
and
Why is the first addition -8? Isn't it positive 8?

The result of adding a positive number (say, P) and a negative number (call it N) together cannot be more positive than P or more negative than N. Since N & P can be represented, their sum must be able to as well. (And you can't prove something can't happen with an example.)
In the second slide, they are demonstrating that when overflow occurs, the value being represented does not correspond to what the value of the addition should be (because that value cannot be represented).

Related

finding the count of cells in a given 2d array satisfying the given constraints

Given a 2-D array starting at (0,0) and proceeding to infinity in positive x and y axes. Given a number k>0 , find the number of cells reachable from (0,0) such that at every moment -> sum of digits of x+ sum of digits of y <=k . Moves can be up, down ,left or right. given x,y>=0 . Dfs gives answers but not sufficient for large values of k. anyone can help me with a better algorithm for this?
I think they asked you to calculate the number of cells (x,y) reachable with k>=x+y. If x=1 for example, then y can take any number between 0 and k-1 and the sum would be <=k. The total number of possibilities can be calculated by
sum(sum(1,y=0..k-x),x=0..k) = 1/2*k²+3/2*k+1
That should be able to do the trick for large k.
I am somewhat confused by the "digits" in your question. The digits make up the index like 3 times 9 makes 999. The sum of digits for the cell (999,888) would be 51. If you would allow the sum of digits to be 10^9 then you could potentially have 10^8 digits for an index, resulting something around 10^(10^8) entries, well beyond normal sizes for a table. I am therefore assuming my first interpretation. If that's not correct, then could you explain it a bit more?
EDIT:
okay, so my answer is not going to solve it. I'm afraid I don't see a nice formula or answer. I would approach it as a coloring/marking problem and mark all valid cells, then use some other technique to make sure all the parts are connected/to count them.
I have tried to come up with something but it's too messy. Basically I would try and mark large parts at once based on the index and k. If k=20, you can mark the cell range (0,0..299) at once (as any lower index will have a lower index sum) and continue to check the rest of the range. I start with 299 by fixing the 2 last digits to their maximum value and look for the max value for the first digit. Then continue that process for the remaining hundreds (300-999) and only fix the last digit to end up with 300..389 and 390..398. However, you can already see that it's a mess... (nevertheless i wanted to give it to you, you might get some better idea)
Another thing you can see immediately is that you problem is symmetric in index so any valid cell (x,y) tells you there's another valid cell (y,x). In a marking scheme / dfs/ bfs this can be exploited.

Algorithm to find an arbitrarily large number

Here's something I've been thinking about: suppose you have a number, x, that can be infinitely large, and you have to find out what it is. All you know is if another number, y, is larger or smaller than x. What would be the fastest/best way to find x?
An evil adversary chooses a really large number somehow ... say:
int x = 9^9^9^9^9^9^9^9^9^9^9^9^9^9^9
and provides isX, isBiggerThanX, and isSmallerThanx functions. Example code might look something like this:
int c = 2
int y = 2
while(true)
if isX(y) return true
if(isBiggerThanX(y)) fn()
else y = y^c
where fn() is a function that, once a number y has been found (that's bigger than x) does something to determine x (like divide the number in half and compare that, then repeat). The thing is, since x is arbitrarily large, it seems like a bad idea to me to use a constant to increase y.
This is just something that I've been wondering about for a while now, I'd like to hear what other people think
Use a binary search as in the usual "try to guess my number" game. But since there is no finite upper end point, we do a first phase to find a suitable one:
Initially set the upper end point arbitrarily (e.g. 1000000, though 1 or 1^100 would also work -- given the infinite space to work in, all finite values are equally disproportionate).
Compare the mystery number X with the upper end point.
If it's not big enough, double it, and try again.
Once the upper end point is bigger than the mystery number, proceed with a normal binary search.
The first phase is itself similar to a binary search. The difference is that instead of halving the search space with each step, it's doubling it! The cost for each phase is O(log X). A small improvement would be to set the lower end point at each doubling step: we know X is at least as high as the previous upper end point, so we can reuse it as the lower end point. The size of the search space still doubles at each step, but in the end it will be half as large as would have been. The cost of the binary search will be reduced by only 1 step, so its overall complexity remains the same.
Some notes
A couple of notes in response to other comments:
It's an interesting question, and computer science is not just about what can be done on physical machines. As long as the question can be defined properly, it's worth asking and thinking about.
The range of numbers is infinite, but any possible mystery number is finite. So the above method will eventually find it. Eventually is defined such as that, for any possible finite input, the algorithm will terminate within a finite number of steps. However since the input is unbounded, the number of steps is also unbounded (it's just that, in every particular case, it will "eventually" terminate.)
If I understand your question correctly (advise if I do not), you're asking about how to solve "pick a number from 1 to 10", except that instead of 10, the upper bound is infinity.
If your number space is truly infinite, the following are true:
The value will never be held in an int (or any other data type) on any physical hardware
You will NEVER find your number
If the space is immensely large but bound, I think the best you can do is a binary search. Start at the middle of the number range. If the desired number turns out to be higher or lower, divide that half of the number space, and repeat until the desired number is found.
In your suggested implementation you raise y ^ c. However, no matter how large c is chosen to be, it will not even move the needle in infinite space.
Infinity isn't a number. Thus you can't find it, even with a computer.
That's funny. I've wondered the same thing for years, though I've never heard anyone else ask the question.
As simple as your scenario is, it still seems to provide insufficient information to allow the choice of an optimal strategy. All one can choose is a suitable heuristic. My heuristic had been to double y, but I think that I like yours better. Yours doubles log(y).
The beauty of your heuristic is that, so long as the integer fits in the computer's memory, it finds a suitable y in logarithmic time.
Counter-question. Once you find y, how do you proceed?
I agree with using binary search, though I believe that a ONE-SIDED binary search would be more suitable, since here the complexity would NOT be O( log n ) [ Where n is the range of allowable numbers ], but O( log k ) - where k is the number selected by your adversary.
This would work as follows : ( Pseudocode )
k = 1;
while( isSmallerThanX( k ) )
{
k = k*2;
}
// At this point, once the loop is exited, k is bigger than x
// Now do normal binary search for the range [ k/2, k ] to find your number :)
So even if the allowable range is infinity, as long as your number is finite, you should be able to find it :)
Your method of tetration is guaranteed to take longer than the age of the universe to find an answer, if the opponent merely uses a paradigm which is better (for example, pentation). This is how you should do it:
You can only do this with symbolic representations of numbers, because it is trivial to name a number your computer cannot store in floating-point representation, even if it used arbitrary-precision arithmetic and all its memory.
Required reading: http://www.scottaaronson.com/writings/bignumbers.html - that pretty much sums it up
How do you represent a number then? You represent it by a program which will, if run to completion, print out that number. Even then, your computer is incapable of computing BusyBeaver(10^100) (if you dictated a program 1 terabyte in size, this well over the maximum number of finite clock cycles it could run without looping forever). You can see that we could easily have the computer print out 1 0 0... each clock cycle, making the maximum number it could say (if we waited nearly an eternity) would be 10^BusyBeaver(10^100). If you allowed it to say more complicated expressions like eval(someprogram), power-towers, Ackermann's function, whatever-- then I believe that would be no better than increasing the original 10^100 by some constant proportional to the complexity of what you described (plus some logarithmic interpreter factor, see Kolmogorov complexity).
So let's frame this another way:
Your opponent picks a finite computable number, and gives you a function tells you if the number is smaller/larger/equal by computing it. He also gives you a representation for the output (in a sane world this would be "you can only print numbers like 99999", but he can make it more complicated; it actually doesn't matter). Proceed to measure the size of this function in bits.
Now, answer with your own function, which is twice the size of his function (in bits), and prints out the largest number it can while keeping the code to less than 2N bits in length. (You use the same representation he chose: In a world where you can only print out numbers like "99999", that's what you do. If you can define functions, it gets slightly more complicated.)
I do not understand the purpose here, but I this is what I thought of:
Reading your comments, I suppose you aren't looking for infinitely large number, but a "super large number" instead. And whatever be the number, it will have a large no. of digits. How you got them, isn't the concern. Keeping this in mind:
No complex computation is required. Just type random keys on your numeric keyboard to have a super large number, and then have a program randomly add/remove/modify digits of that number. You get a list of very large numbers - select any one out of them.
e.g: 3672036025039629036790672927305060260103610831569252706723680972067397267209
and keep modifying/adding digits to get more numbers
PS: If you state the purpose in your question clearly, we might be able to give better answers.

How to compute palindrome from a stream of characters in sub-linear space/time?

I don't even know if a solution exists or not. Here is the problem in detail. You are a program that is accepting an infinitely long stream of characters (for simplicity you can assume characters are either 1 or 0). At any point, I can stop the stream (let's say after N characters were passed through) and ask you if the string received so far is a palindrome or not. How can you do this using less sub-linear space and/or time.
Yes. The answer is about two-thirds of the way down http://rjlipton.wordpress.com/2011/01/12/stringology-the-real-string-theory/
EDIT: Some people have asked me to summarize the result, in case the link dies. The link gives some details about a proof of the following theorem: There is a multi-tape Turing machine that can recognize initial non-trivial palindromes in real-time. (A summary, also provided by the article linked: Suppose the machine has read x1, x2, ..., xk of the input. Then it has only constant time to decide if x1, x2, ..., xk is a palindrome.)
A multitape Turing machine is just one with several side-by-side tapes that it can read and write to; in a very specific sense it is exactly equivalent to a standard Turing machine.
A real-time computation is one in which a Turing machine must read a character from input at least once every M steps (for some bounded constant M). It is readily seen that any real-time algorithm should be linear-time, then.
There is a paper on the proof which is around 10 pages which is available behind an institutional paywall here which I will not repost elsewhere. You can contact the author for a more detailed explanation if you'd like; I just had read this recently and realized it was more or less what you were looking for.
You could use a rolling hash, or more rolling hashes for accuracy. Incrementally compute the hash of the characters read so far, in the order they were read, and in reverse order of reading.
If your hash function is x*3^(k-1)+x*3^(k-2)+...+x*3^0 for example, where x is a character you read, this is how you'd do it:
hLeftRight = 0
hRightLeft = 0
k = 0
repeat until there are numbers in the stream
x = stream.Get()
hLeftRight = 3*hLeftRight + x.Value
hRightLeft = hRightLeft + 3^k*x.Value
if (x.QueryPalindrome = true)
yield hLeftRight == hRightLeft
k = k + 1
Obviously you'd have to calculate the hashes modulo something, probably a prime or a power of two. And of course, this could lead to false positives.
Round 2
As I see it, with each new character, there are three cases:
Character breaks potential symmetry, for example, aab -> aabc
Character extends the middle, for example aab -> aabb
Character continues symmetry, for example aab->aaba
Assume you have a pointer that tracks down the string and points to the last character that continued a potential palindrome.
(I am going to use parenthesis to indicate a pointed at character)
Lets say you are starting with aa(b) and get an:
'a' (case 3), you move the pointer to
the left and check if it's an 'a' (it
is). You now have a(a)b.
'c' (case 1), you are not expecting a 'c', in this case you start back at the beginning and you now have aab(c).
The really tricky case is 2, because somehow you have to know that the character you just got isn't affecting symmetry, it is just extending the middle. For this, you have to hold an additional pointer that tracks where the plateau's (middle's) edge lies. For example, you have (b)baabb and you just got another 'b', in this case you have to know to reset the pointer to the base of the middle plateau here: bbaa(b)bb. Since we are going for constant time, you have to hold a pointer here to begin with (you can't afford the time to search for the plateau's edge). Now if you get another 'b', you know that you are still on the edge of that plateau and you keep the pointer where it is, so bbaa(b)bb -> bbaa(b)bbb. Now, if you get an 'a', you know that the 'b's are not part of the extended middle and you reset both pointers (The tracking pointer and the edge pointer) so you now have bbaabbbb((a)).
With these three cases, I think all bases are covered. If you ever want to check if the current string is a palindrome, check if the first pointer (not the plateau's edge pointer) is at index 0.
This might help you:
http://arxiv.org/pdf/1308.3466v1.pdf
If you store the last $k$ many input symbols you can easily find palindromes up to a length of $k$.
If you use the algorithms of the paper you can find the midpoints of palindromes and an length estimate of its length.

Is there "good" PRNG generating values without hidden state?

I need some good pseudo random number generator that can be computed like a pure function from its previous output without any state hiding. Under "good" I mean:
I must be able to parametrize generator in such way that running it for 2^n iterations with any parameters (or with some large subset of them) should cover all or almost all values between 0 and 2^n - 1, where n is the number of bits in output value.
Combined generator output of n + p bits must cover all or almost all values between 0 and 2^(n + p) - 1 if I run it for 2^n iterations for every possible combination of its parameters, where p is the number of bits in parameters.
For example, LCG can be computed like a pure function and it can meet first condition, but it can not meet second one. Say, we have 32-bit LCG, m = 2^32 and it is constant, our p = 64 (two 32-bit parameters a and c), n + p = 96, so we must peek data by three ints from output to meet second condition. Unfortunately, condition can not be meet because of strictly alternating sequence of odd and even ints in output. To overcome this, hidden state must be introduced, but that makes function not pure and breaks first condition (long hidden period).
EDIT: Strictly speaking, I want family of functions parametrized by p bits and with full state of n bits, each generating all possible binary strings of p + n bits in unique "randomish" way, not just continuously incrementing (p + n)-bit int. Parametrization required to select that unique way.
Am I wanting too much?
You can use any block cipher, with a fixed key. To generate the next number, decrypt the current one, increment it, and re-encrypt it. Because block ciphers are 1:1, they'll necessarily iterate through every number in the output domain before repeating.
Try LFSR
All you need is list of primitive polynomials.
Period of generating finite field this way, generates field of size 2^n-1. But you can generalise this procedure to generate anything whit period of k^n-1.
I have not seen this implemented, but all you have to implement is shifting numbers by small number s>n where gcd(s,2^n-1) == 1. gcd stands for greatest common divisor

Anticipate factorial overflow

I'm wondering how could I anticipate whether the next iteration will generate an integer overflow while calculating the factorial F or not?
Let's say that at each iteration I have an int I and the maximum value is MAX_INT.
It sounds like a homework, I know. It's not. It's just me asking myself "stupid" questions.
Addendum
I though about, given a number of BITS (the width an integer can take, in bits), I could round up the number I to the next power of two, and detect if a shift to left would exceed BITS. But how would that look like, algorithmically?
Alternative hint:
a * b ≤ MAX_INT
is equivalent to
a ≤ MAX_INT / b
if b > 0.
Factorials are a series of multiplications, and the number of bits needed to hold the result of a multiplication is the sum of the bits of the two multiplicands. So, keep a running total of how many bits are used in your result, and the current number of bits needed to hold the value you are multiplying in. When that's greater than the number of bits left, you're about to overflow.
If you've so far got m = (n-1)! and you're about to multiply by n, you can guard against overflow by checking that
m <= MAX_INT / n
You can probably use Stirling's Approximation formula which says that
ln (n!) = n*ln(n) - n + ln(2*pi*n)/2 + O(1/n)
and will be quite accurate.
You don't actually need to go about trying to multiply etc. Of course, this does not directly answer what you asked, but given that you are just curious, hope this helps.

Resources