Meaning of floor in equations - algorithm

By reading the title it may sound like a silly question, but I have a data structures exam tomorrow, and some of the formulas I need to know for algorithm analysis read like n – floor(log(n + 1)). What is the meaning of floor?
Thanks

floor(x) is the largest integer not greater than x. You can easily find this information on the web, here for example.
e.g.
floor(1.12) = 1
floor(0.53) = 0
floor(-3.4) = -4
One thing that can confuse people is the floor of a negative value. Some might initially think that floor(-3.4) is -3 when in reality it is -4 by the definition of floor(x).
As a note, floor(x) is often written as ⌊x⌋.

It means rounding down to the nearest integer value.

For positive numbers: remove the decimal portion, e.g. floor(3.4) = 3
For negative numbers that are not already integers: remove the decimal portion and subtract one, e.g. floor(-3.4) = -3 - 1 = -4
Hope this helps!
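As a quick check of the rules above, here is a small Python sketch (Python is my choice, not something from the question) that compares math.floor with plain truncation. The helper name exam_formula and the base-2 logarithm are assumptions for illustration; the question does not say which base its log uses.

```python
import math

def exam_formula(n):
    # Hypothetical helper: n - floor(log2(n + 1)).
    # Base 2 is an assumption; the original formula only says "log".
    return n - math.floor(math.log2(n + 1))

print(math.floor(1.12))   # floor of a positive number
print(math.floor(-3.4))   # floor of a negative number rounds toward minus infinity
print(int(-3.4))          # int() truncates toward zero, which is NOT floor
print(exam_formula(7))
```

The difference between the second and third lines is exactly the negative-number trap described above.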

Related

Four-square sum representation for integers up to N

Lagrange's four-square theorem proves that any natural number can be written as the sum of four square numbers. What I need is to find any one way to write a natural number x as sum of four square numbers for all 0 <= x <= N for any given upper limit N.
What I have done so far is find a two-square sum representation for every number <= N that has one, and save them in an array called two_square_div. Then I used a greedy approach like the following:
last_two_square_sum = 0
for num in 0..N
    if num can be written as sum of two squares
        last_two_square_sum = num
    other_last_two_square_sum = num - last_two_square_sum
    four_square_div[num] = (two_square_div[last_two_square_sum], two_square_div[other_last_two_square_sum])
But this approach does not work for numbers like 23, for which last_two_square_sum = 20 other_last_two_square_sum = 3. But 3 can not be written as sum of two squares so this method fails.
So could anybody provide a correct O(N) solution or any helpful hint? Thank you.
Your algorithm should make more than one attempt (if it already does, then the exit condition must be improved).
23 can be written as 3 + 20, yes; but 3 cannot be written as a sum of two squares, so it can't lead to a solution.
So you go on: next you try 4 + 19, and this time it's 19 that is rejected. Next you try 5, so 23 - 5 is 18, and 5 is 1² + 2² while 18 is 3² + 3².
(Of course this is not O(N) at all).
It is not clear to me how you arrive at 20 and not accept previous solutions; try posting the whole of the code.
Also, try asking on Math StackExchange.
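The retry idea from this answer could be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the function names two_square_table and four_square_table are made up, and, as the answer already warns, this is not an O(N) solution, because every number may need several attempts.

```python
from math import isqrt

def two_square_table(n):
    # table[s] is a pair (a, b) with a*a + b*b == s, or None if no such pair exists.
    table = [None] * (n + 1)
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            s = a * a + b * b
            if table[s] is None:
                table[s] = (a, b)
    return table

def four_square_table(n):
    two = two_square_table(n)
    # All numbers <= n that are a sum of two squares, in increasing order.
    two_square_sums = [s for s in range(n + 1) if two[s] is not None]
    result = [None] * (n + 1)
    for num in range(n + 1):
        # Keep trying splits num = s + (num - s) until both halves work.
        for s in two_square_sums:
            if s > num:
                break
            if two[num - s] is not None:
                result[num] = two[s] + two[num - s]
                break
    return result

print(four_square_table(23)[23])
```

For 23 the loop rejects the splits 0 + 23, 1 + 22, 2 + 21 and 4 + 19 before accepting 5 + 18, which matches the walk-through above.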

Intuition behind the Z algorithm

The Z algorithm is a string matching algorithm with O(n) complexity.
One use case is finding the longest prefix of string A that occurs in string B. For example, the longest prefix of "overdose" that occurs in "stackoverflow" is "over". You could discover this by calling the Z algorithm with a combined string "overdose#stackoverflow" (where # is some character not present in either string). The Z algorithm would then try to match the combined string with itself and create an array z[] where z[i] gives you the length of the longest match with the start of the string, beginning at index i. In our example:
index   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
string  o  v  e  r  d  o  s  e  #  s  t  a  c  k  o  v  e  r  f  l  o  w
z    (22)  0  0  0  0  1  0  0  0  0  0  0  0  0  4  0  0  0  0  0  1  0
There are plenty of code implementations and mathematically oriented explanations of the algorithm, here are some good ones:
http://www.geeksforgeeks.org/z-algorithm-linear-time-pattern-searching-algorithm/
http://codeforces.com/blog/entry/3107
I can see how it works, but I don't understand why. It seems almost like black magic. I have a very strong intuition that this task is supposed to take O(n^2), yet here is an algorithm that does it in O(n)
I don't find it completely intuitive either, so I think that I qualify for answering. Otherwise I'd just say that you don't understand because you're an idiot, and surely that's not the answer you're hoping for :-)
Case in point (citation from an explanation):
Correctness is inherent in the algorithm and is pretty intuitively clear.
So, let's try to be even more intuitive...
First, I'd guess that the common intuition for O(n^2) is this: for a string of length N, if you're dropped at a random place i in the string with no other information, you have to match x (< N) characters to compute Z[i]. If you're dropped N times, you have to do up to N(N-1) tests, so that's O(n^2).
The Z algorithm, however, makes good use of the information you've gained from past computations.
Let's see.
First, as long as you don't have a match (Z[i]=0), you progress along the string with one comparison per character, so that's O(N).
Second, when you find a range where there's a match (at index i), the trick is to use clever deductions from the previous Z[0...i-1] to compute each Z value inside that range in constant time, without any further comparisons inside that range. New comparisons are only made to the right of the range.
That's how I understand it anyway, hope this helps.
I was looking for a deeper understanding of this algorithm, which is how I found this question.
I didn't understand the codeforces post at first. Later I found it good enough for understanding, but I also noticed that it is not entirely accurate and that it omits some steps in the thinking process, which makes it a bit confusing.
Let me try to correct the inaccuracy in that post, and clarify some of the steps I think may help people connect the dots to a line. In this process, I hope we can learn some intuition from the original author. In the explanation, I'll mix some quoted blocks from codeforces and my own notes so we can keep the original post close to our discussion.
The Z algorithm starts as:
As we iterate over the letters in the string (index i from 1 to n - 1), we maintain an interval [L, R] which is the interval with maximum R such that 1 ≤ L ≤ i ≤ R and S[L...R] is a prefix-substring (if no such interval exists, just let L = R = -1). For i = 1, we can simply compute L and R by comparing S[0...] to S[1...]. Moreover, we also get Z[1] during this.
This is simple and straightforward.
Now suppose we have the correct interval [L, R] for i - 1 and all of the Z values up to i - 1. We will compute Z[i] and the new [L, R] by the following steps:
If i > R, then there does not exist a prefix-substring of S that starts before i and ends at or after i. If such a substring existed, [L, R] would have been the interval for that substring rather than its current value. Thus we "reset" and compute a new [L, R] by comparing S[0...] to S[i...] and get Z[i] at the same time (Z[i] = R - L + 1).
The bold part in the bullet point might be confusing, but if you read it twice, it's really just repeating the definition of R.
Otherwise, i ≤ R, so the current [L, R] extends at least to i. Let k = i - L. We know that Z[i] ≥ min(Z[k], R - i + 1) because S[i...] matches S[k...] for at least R - i + 1 characters (they are in the [L, R] interval which we know to be a prefix-substring). Now we have a few more cases to consider.
The bold part is not completely accurate, because R - i + 1 can be greater than Z[k], in which case Z[i] would be Z[k].
Let's focus on the key now: Z[i] ≥ min(Z[k], R - i + 1). Why is this true? Because of the following:
Based on the definition of interval [L, R] and i ≤ R, we already confirmed that S[0...R - L] == S[L...R], hence S[0...k] == S[L...i], and S[k...R - L] == S[i...R];
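Say Z[k] = x. Based on the definition of Z, we know S[0...x - 1] == S[k...k + x - 1], and that the next characters differ: S[x] != S[k + x];
Combining the equations above, when x < R - i + 1 we know S[0...x - 1] == S[L...L + x - 1] == S[k...k + x - 1] == S[i...i + x - 1]. Moreover, i + x ≤ R, so S[i + x] == S[k + x] != S[x], and the match at i cannot extend any further. The point is, S[k...k + x] == S[i...i + x], so Z[i] = Z[k] when Z[k] < R - i + 1.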
These are the missing dots I mentioned in the beginning, and they explain both the second and the third bullet points, and partially the last bullet point. This wasn't straightforward when I read the codeforces post. To me this is the most important part of this algorithm.
For the last bullet point, if Z[k] ≥ R - i + 1, we would refresh [L, R], using i as the new L, and extending R to a bigger R'.
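Over the whole process, every successful character comparison moves R to the right, R never moves back, and each index i causes at most one failed comparison. So the total number of comparisons is at most about 2n, and the time complexity is O(n).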
As Ilya answered, the intuition in this algorithm is to carefully reuse every piece of information we gathered so far. I just explained it in another way. Hope it helps.
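To connect the cases above to code, here is a minimal Python sketch of the Z algorithm. It is one common way to write it, not the codeforces author's code. As an assumption of this sketch, it keeps a half-open window [left, right) instead of the inclusive [L, R] used above, so R - i + 1 becomes right - i. The helper longest_prefix_occurrence is a made-up name for the "overdose#stackoverflow" use case from the question.

```python
def z_array(s):
    n = len(s)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n  # by convention, the whole string matches itself
    left, right = 0, 0  # s[left:right] is the rightmost window known to match a prefix
    for i in range(1, n):
        if i < right:
            # Reuse earlier work: k = i - left plays the role of k = i - L above.
            z[i] = min(z[i - left], right - i)
        # Explicit comparisons only ever succeed at or beyond position `right`.
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z

def longest_prefix_occurrence(a, b, sep="#"):
    # Hypothetical helper; sep is assumed not to occur in a or b.
    z = z_array(a + sep + b)
    best = max(z[len(a) + 1:], default=0)
    return a[:best]

print(z_array("overdose#stackoverflow"))
print(longest_prefix_occurrence("overdose", "stackoverflow"))
```

The min(...) line covers both sub-cases of i ≤ R: when Z[k] is the smaller value the while loop fails on its first test, and otherwise the loop resumes comparing at the right edge of the window.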

Algorithm for 0 and any other x

I need to write an algorithm that takes a non-negative integer x. If integer x is 0, the algorithm returns 0. If it's any other number, the algorithm returns 1.
Here's the catch. I need to condense the algorithm into one equation. i.e. no conditionals. Basically, I need a single equation that equates to 0 if x is zero and 1 if x > 0.
EDIT: As per my comment below, I realize that I wasn't clear enough. I am entering the formula into a system that I don't have control over, hence the strange restrictions.
However, I learned a couple tricks that could be useful in the future!
In C and C++, you can use this trick:
!!x
In those languages, !x evaluates to 1 if x is zero and 0 otherwise. Therefore, !!x evaluates to 1 if x is nonzero and 0 otherwise.
Hope this helps!
Try return (int)(x > 0)
In every programming language I know, (int)(TRUE) == 1 and (int)(FALSE) == 0
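Assuming 32-bit integers:
unsigned int negX = -(unsigned int)x;
return negX >> 31;
Negating x puts a 1 in the highest bit. Shifting right by 31 places moves that 1 to the lowest bit and fills with 0s. The shift has to be a logical (unsigned) one, because an arithmetic shift of a negative signed value would fill with 1s and give -1 instead. This does nothing to a 0, but converts all positive integers to 1.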
This is basically the sign function, but since you specified a positive integer input, you can drop the part that converts negative numbers to -1.
Since virtually every system I know of uses IEEE-754 representation for floating-point numbers, you could just rely on its behavior (namely, that 0.0 / 0.0 is NaN, and NaN != NaN). Pseudo-C (-Java, ...) follows:
float oneOrNAN = (float)(x) / (float)(x);
return oneOrNAN == oneOrNAN;
Like I said, I wasn't clear enough in my problem description. When I said equation, I meant a purely algebraic equation.
I did find an acceptable solution: Y = X/(X - .001)
If X is zero you get 0/-.001, which is just 0. For any other number, say 5, you get 5/4.999, which is close enough to 1 for my particular situation.
However, this is interesting:
!!x
Thanks for the tip!
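For comparison, the tricks from this thread can be tried side by side in Python. This is only a sketch; the function names are invented for illustration, and the shift version emulates 32-bit unsigned arithmetic with a mask because Python integers have no fixed width.

```python
def indicator_compare(x):
    # Same idea as (int)(x > 0).
    return int(x > 0)

def indicator_double_not(x):
    # Same idea as !!x in C and C++.
    return int(not not x)

def indicator_shift(x):
    # Negate, keep the low 32 bits (emulating unsigned), then shift logically.
    # Assumes 0 <= x < 2**31.
    return ((-x) & 0xFFFFFFFF) >> 31

def indicator_algebraic(x):
    # The asker's approximation Y = X / (X - .001); only close to 1, not exactly 1.
    return x / (x - 0.001)

for x in (0, 1, 5):
    print(x, indicator_compare(x), indicator_double_not(x),
          indicator_shift(x), indicator_algebraic(x))
```

The first three give exactly 0 or 1, while the algebraic version gets closer to 1 as x grows.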

Number of ways of counting nothing is one?

This is something that I routinely get wrong while solving problems. How do we decide what the value of a recursive function is when the argument is at its lowest extreme? An example will help:
Given n, find the number of ways to tile a 3xN grid using 2x1 blocks only. Rotation of blocks is allowed.
The DP solution is easily found as
f(n): the number of ways of tiling a 3xN grid
g(n): the number of ways of tiling a 3xN grid with a 1x1 block cut off at the rightmost column
f(n) = f(n-2) + 2*g(n-1)
g(n) = f(n-1) + g(n-2)
I initially thought that the base cases would be f(0)=0, g(0)=0, f(1)=0, g(1)=1. However, this yields a wrong answer. I then read somewhere that f(0)=1 and reasoned it out as
The number of ways of tiling a 3x0 grid is one, because there is exactly one way to use no tiles (2x1 blocks) at all.
My question is, by that logic, shouldn't g(0) also be one? But in the correct solution, g(0)=0. In general, when can we say that the number of ways of using nothing is one?
About your specific question of tiling, think this way:
How many ways are there to "tile a 3*0 grid"?
I would say: Just one way, don't do anything! and you can't "do nothing" any other way. (f(0) = 1)
How many ways are there to "tile a 3*0 grid, cutting that specific block off"?
I would say: Hey! That's impossible! You can't cut the specific block off since there is nothing. So, there's no way one can solve the task anyhow. (g(0) = 0)
Now, let's get to the general case:
There's no "general" rule about zero cases.
Depending on your problem, you may be able to somehow "interpret" the situation, and find the reasonable value. Most of the time (depending on your definition of "ways") the number of ways of doing "nothing" is 1, and the number of ways of doing something impossible is 0!
Warning! Being able to somehow "interpret" the zero case is not enough for the relation to be correct! You should recheck your recursive relation (i.e. the way you get the n-th value from the previous ones) to be applicable for the zero-to-one case as well, since most of the time this would be a "tricky" case.
You may find it easier to base your recursive relation on some non-zero case, if you find the zero-case being tricky, or confusing.
The way I see it, g(0) is invalid, since there is no way to cut a 1x1 block out of a 3x0 grid.
Invalid values are typically represented as 0, -∞ or ∞, but this largely depends on the problem. The number of ways to place something would make sense to be 0 to represent invalid values (otherwise your counts will be off). When working with min, you'd generally use ∞. When working with max, you'd generally use -∞ (or possibly 0).
Generally, the number of ways to place 0 objects or objects in a 0-sized space makes sense to be 1 (i.e. placing no objects) (so f(0) = 1). In a lot of other cases valid values would be 0.
But these are far from rules (and avoid blindly following rules, because you'll get hurt with exceptions); the best advice I can give - when in doubt, throw a few values in and see what happens.
In this case you can easily determine what the actual values for g(1), f(1), g(2) and f(2) should be, and use these to calculate g(0) and f(0):
g(1) = 1
f(1) = 0
g(2): (all invalid, since ? is not populated)
|X |X ?X
|? || --
-- ?| --
g(2) = 0, thus g(0) = 0 - f(1) = 0 - 0 = 0
f(2):
|| -- --
|| -- ||
-- -- ||
f(2) = 3, thus f(0) = 3 - 2*g(1) = 3 - 2 = 1
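The advice to throw a few values in and see what happens can be followed with a short Python sketch. It uses the recurrences from the question with the base cases f(0)=1, g(0)=0, f(1)=0, g(1)=1; the function name tilings is an assumption of mine.

```python
def tilings(n):
    # f[k]: tilings of a 3 x k grid
    # g[k]: tilings of a 3 x k grid with one corner cell of the last column removed
    size = max(n, 1) + 1
    f = [0] * size
    g = [0] * size
    f[0], g[0] = 1, 0   # one way to tile nothing; impossible to cut a cell from nothing
    f[1], g[1] = 0, 1
    for k in range(2, n + 1):
        f[k] = f[k - 2] + 2 * g[k - 1]
        g[k] = f[k - 1] + g[k - 2]
    return f[n]

print([tilings(n) for n in range(0, 9)])
```

Changing f[0] to 0 makes tilings(2) come out as 2 instead of the 3 tilings drawn above, which is a quick way to see that the base case matters.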

Number of ways to add up to a sum S with N numbers

Say S = 5 and N = 3 the solutions would look like - <0,0,5> <0,1,4> <0,2,3> <0,3,2> <5,0,0> <2,3,0> <3,2,0> <1,2,2> etc etc.
In the general case, N nested loops can be used to solve the problem. Run N nested loops, and inside them check if the loop variables add up to S.
If we do not know N ahead of time, we can use a recursive solution. In each level, run a loop from 0 to S, and then call the function itself again. When we reach a depth of N, see if the numbers obtained add up to S.
Any other dynamic programming solution?
Try this recursive function:
f(s, n) = 1                                         if s = 0
        = 0                                         if s != 0 and n = 0
        = sum of f(s - i, n - 1) over i in [0, s]   otherwise
To use dynamic programming you can cache the value of f after evaluating it, and check if the value already exists in the cache before evaluating it.
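A minimal Python sketch of this recursion with caching might look like the following. The name count_ways is made up, and functools.lru_cache stands in for the hand-written cache the answer describes.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_ways(s, n):
    # Number of ordered ways to write s as a sum of n non-negative integers.
    if s == 0:
        return 1          # every remaining number must be 0
    if n == 0:
        return 0          # a positive sum is left but there are no numbers to use
    # Choose the value i of one number, then fill the remaining n - 1 numbers.
    return sum(count_ways(s - i, n - 1) for i in range(s + 1))

print(count_ways(5, 3))
```

With the cache, each pair (s, n) is evaluated only once, so the work is roughly S * N states with up to S + 1 terms each.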
There is a closed form formula : binomial(s + n - 1, s) or binomial(s+n-1,n-1)
Those numbers are the simplex numbers.
If you want to compute them, use the log gamma function or arbitrary precision arithmetic.
See https://math.stackexchange.com/questions/2455/geometric-proof-of-the-formula-for-simplex-numbers
I have my own formula for this. My friend Gio and I wrote an investigative report on it. The formula that we got is [2 raised to (n-1) - 1], where n is the number whose sums we are counting.
Let's try.
If n is 1, the count is 0: there are no two or more numbers that we can add to get a sum of 1 (excluding 0). Let's try a higher number.
Let's try 4. 4 has addends: 1+1+1+1, 1+2+1, 1+1+2, 2+1+1, 1+3, 2+2, 3+1. Its total is 7.
Let's check with the formula. 2 raised to (4-1) - 1 = 2 raised to (3) - 1 = 8-1 =7.
Let's try 15. 2 raised to (15-1) - 1 = 2 raised to (14) - 1 = 16384 - 1 = 16383. Therefore, there are 16383 ways to add numbers that will equal to 15.
(Note: Addends are positive numbers only.)
(You can try other numbers, to check whether our formula is correct or not.)
This can be calculated in O(s+n) (or O(1) if you don't mind an approximation) in the following way:
Imagine we have a string with n-1 X's in it and s o's. So for your example of s=5, n=3, one example string would be
oXooXoo
Notice that the X's divide the o's into three distinct groupings: one of length 1, length 2, and length 2. This corresponds to your solution of <1,2,2>. Every possible string gives us a different solution, by counting the number of o's in a row (a 0 is possible: for example, XoooooX would correspond to <0,5,0>). So by counting the number of possible strings of this form, we get the answer to your question.
There are s+(n-1) positions to choose for s o's, so the answer is Choose(s+n-1, s).
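The counting argument above can be checked against brute force with a short Python sketch. The function names are invented for illustration, and the closed form is assumed to be used with n ≥ 1.

```python
from itertools import product
from math import comb

def count_closed_form(s, n):
    # Choose(s + n - 1, s): positions of the s o's among s + n - 1 slots. Assumes n >= 1.
    return comb(s + n - 1, s)

def count_brute_force(s, n):
    # Enumerate every tuple of n values in 0..s and keep those summing to s.
    return sum(1 for values in product(range(s + 1), repeat=n) if sum(values) == s)

print(count_closed_form(5, 3), count_brute_force(5, 3))
```

Python integers are arbitrary precision, so math.comb gives the exact count without needing the log gamma approximation.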
There is a fixed formula for the answer. If you want to find the number of ways to get N as the sum of R elements, the answer is always:
(N+R-1)!/((R-1)!*(N)!)
or in other words:
(N+R-1) C (R-1)
This actually looks a lot like a Towers of Hanoi problem, without the constraint of stacking disks only on larger disks. You have S disks that can be in any combination on N towers. So that's what got me thinking about it.
What I suspect is that there is a formula we can deduce that doesn't require the recursive programming. I'll need a bit more time though.

Resources