What is meant by 'Polynomial in number of bits in the input'?

So I have a piece of coursework asking questions about complexity theory, where I have some problem, PRIMES, which is in the NPTime complexity class.
All good up to that point.
The thing that is throwing me is a question asking me to come up with a polynomial-time algorithm for computing (a^p)mod(b). It has to be polynomial in the size of the input (number of bits).
It's the latter sentence that confuses me.
Surely a brute-force attempt (trying all values between 2 and sqrt(n)) would take about 2^NoBits steps, which is exponential?!
Now, I don't want the answer! This is my coursework, so I can't ask for that. I just want clarity on what is meant by 'polynomial in the number of input bits'. Explain it like you would to a child ;)

You have 3 numbers, a, p, and b. Each one can be of any size. 1. 10. 123_456_789. No limits.
When we write them in base 2, each becomes a string of some number of bits: 1, 1010, 111010110111100110100010101. Their lengths are 1, 4, and 27 bits. The sum of the lengths is the total size of your input: 32.
Your algorithm should be efficient enough that there is some polynomial q(x), where x is the sum of the number of bits in the inputs, that is an upper bound on how long your computation will take.
In particular note that you do not want to multiply a by itself p times!
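To make the "size of the input" concrete, here is a small Python sketch (the function names are made up for illustration). It measures the input size in bits and counts how many multiplications the naive "multiply a by itself p times" approach would need. It deliberately does not show a faster algorithm, since the question asks not to be given the answer.

```python
def input_size_bits(a: int, p: int, b: int) -> int:
    """Total number of bits needed to write a, p and b in base 2."""
    return a.bit_length() + p.bit_length() + b.bit_length()


def naive_multiplication_count(p: int) -> int:
    """Multiplications used when a**p is computed as a * a * ... * a."""
    return max(p - 1, 0)


# The example numbers from the answer: 1 + 4 + 27 bits.
print(input_size_bits(1, 10, 123_456_789))

# Each extra bit in p roughly doubles the naive multiplication count,
# so the naive method is exponential in the number of input bits.
for bits in (8, 16, 32, 64):
    largest_p = (1 << bits) - 1  # largest exponent that fits in `bits` bits
    print(bits, naive_multiplication_count(largest_p))
```

A 64-bit exponent is a tiny input, yet the naive loop would need about 1.8e19 multiplications. That is what "not polynomial in the number of bits" looks like in practice.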

Related

Dynamic algorithm to multiply elements in a sequence two at a time and find the total

I am trying to find a dynamic programming approach to multiply each element in a linear sequence by the following element, do the same with the next pair of elements, and so on, and then find the sum of all of the products. Note that arbitrary pairs of elements cannot be multiplied: it must be the first with the second, the third with the fourth, and so on. All I know about the linear sequence is that it has an even number of elements.
I assume I have to store the numbers being multiplied, and their product each time, then check some other "multipliable" pair of elements to see if the product has already been calculated (perhaps they possess opposite signs compared to the current pair).
However, by my understanding of a linear sequence, the values must be increasing or decreasing by the same amount each time. But since there are an even amount of numbers, I don't believe it is possible to have two "multipliable" pairs be the same (with potentially opposite signs), due to the issue shown in the following example:
Sequence: { -2, -1, 0, 1, 2, 3 }
Pairs: -2*-1, 0*1, 2*3
Clearly, since there are an even amount of pairs, the only case in which the same multiplication may occur more than once is if the elements are increasing/decreasing by 0 each time.
I fail to see how this is a dynamic programming question, and if anyone could clarify, it would be greatly appreciated!

A quick google for define linear sequence gave
A number pattern which increases (or decreases) by the same amount each time is called a linear sequence. The amount it increases or decreases by is known as the common difference.
In your case the common difference is 1. And you are not considering any other case.
The same multiplication may occur in the following sequence
Sequence = {-3, -1, 1, 3}
Pairs = -3 * -1 , 1 * 3
with a common difference of 2.
However, this does not necessarily have to be solved by dynamic programming. You can just iterate over the numbers, store the product of each pair in a set (as a set contains only unique values), and then find the sum.

Probably not what you are looking for, but I've found a closed solution for the problem.
Suppose we observe the first two numbers. Denote the first number by a and the difference between consecutive numbers by d, and suppose there are 2n numbers in the whole sequence. Then the sum you defined is:
sum = na^2 + n(2n-1)ad + (4n^2 - 3n - 1)nd^2/3
That aside, I also failed to see how this is a dynamic problem, or at least this seems to be a problem where dynamic programming approach really doesn't do much. It is not likely that the sequence will go from negative to positive at all, and even then the chance that you will see repeated entries decreases the bigger your difference between two numbers is. Furthermore, multiplication is so fast the overhead from fetching them from a data structure might be more expensive. (mul instruction is probably faster than lw).
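As a sanity check on the closed form, here is a short Python sketch that compares it with the plain loop over pairs. The function names are hypothetical. The integer division by 3 is exact because (4n^2 - 3n - 1)n = n(n-1)(4n+1) is always a multiple of 3.

```python
def pair_product_sum(seq):
    """Sum of seq[0]*seq[1] + seq[2]*seq[3] + ... for an even-length sequence."""
    return sum(seq[i] * seq[i + 1] for i in range(0, len(seq), 2))


def closed_form(a, d, n):
    """n*a^2 + n(2n-1)*a*d + (4n^2 - 3n - 1)*n*d^2/3 for 2n terms a, a+d, ..."""
    return (n * a * a
            + n * (2 * n - 1) * a * d
            + (4 * n * n - 3 * n - 1) * n * d * d // 3)


seq = [-2, -1, 0, 1, 2, 3]           # a = -2, d = 1, n = 3 pairs
print(pair_product_sum(seq))          # 2 + 0 + 6 = 8
print(closed_form(a=-2, d=1, n=3))    # 12 - 30 + 26 = 8
```

Both are cheap: the loop is O(n) with no repeated subproblems to cache, which is consistent with the view that dynamic programming buys nothing here.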

constructing binary sequences with unique n-bit

A question that was asked during a job interview (which I pretty much failed) and
sadly, something I still cannot figure out.
Let's assume that you're given some positive integer, n.
Assume that you construct a sequence consisting of only 1 and 0, and
you want to construct a sequence of length 2^n + n-1 such that
every sequence of length n consisting of adjacent numbers is unique.
for instance
00110 (00, 01, 11, 10) for n=2
How would one construct such a sequence?
I think one should start with 0000..0 (n zeroes) and
do something about it.
If there is a constructive way of doing it, maybe
I could extend that method to constructing
a sequence consisting of only 0, 1, ..., k-1, and having
length k^n + n-1 such that
every sequence of length n consisting of adjacent numbers is unique
(or maybe not..)
(Sorry, my sequence for n=3 was wrong, so I deleted it.
Also, I had never heard of de Bruijn sequences. I know about them now!
Thanks for all the answers and comments.)

This strikes me as a very ambitious interview question; if you don't know the answer, you're unlikely to get it in a few minutes.
As mentioned in comments, this is really just the derivation of a de Bruijn sequence, only unwrapped. You can read the Wikipedia article linked above for more information, but the algorithms it proposes, while efficient, are not exactly easy to derive. There is a much simpler (but rather more storage-intensive) algorithm which I think is folkloric; at least, I don't know of a name attached to it. It's at least simple to describe:
Start with n 0s.
As long as possible:
  If you can append a 1 without repeating a previously-seen n-sequence, do so.
  If not, but you can append a 0 without repeating a previously-seen n-sequence, do so.
  Otherwise, you are done.
This requires you to either search the entire string on each iteration, requiring exponential time, or maintain a boolean array of all seen sequences (coded as binary numbers, presumably), requiring exponential space. The "concatenate all Lyndon words in lexicographical order" solution is much more efficient, but leaves open the question of generating all Lyndon words in lexicographical order.
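The greedy procedure above can be sketched in Python as follows. This is only an illustration of the set-based variant (exponential space), not the Lyndon-word construction, and the function name is made up.

```python
def prefer_one_sequence(n: int) -> str:
    """Greedy construction: start with n zeros, then prefer appending 1 over 0,
    never repeating a length-n window that has already appeared."""
    seq = [0] * n
    seen = {tuple(seq)}  # every length-n window seen so far
    while True:
        for bit in (1, 0):
            # Last n-1 symbols plus the candidate bit form the new window.
            window = tuple(seq[len(seq) - n + 1:]) + (bit,)
            if window not in seen:
                seen.add(window)
                seq.append(bit)
                break
        else:
            # Neither 1 nor 0 gives a new window: done.
            return "".join(map(str, seq))


print(prefer_one_sequence(2))  # 00110
print(prefer_one_sequence(3))  # 0001110100
```

For n=2 this reproduces the 00110 example from the question, and for n=3 the result has length 2^3 + 3 - 1 = 10. Extending it to an alphabet 0..k-1 would presumably mean trying the symbols from largest to smallest, but that variant is not tested here.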

Distinct digit count

Is it possible to count the distinct digits in a number in constant time O(1)?
Suppose n=1519; the output should be 3, as there are 3 distinct digits (1, 5, 9).
I have done it in O(N) time, but does anyone know how to do it in O(1) time?

I assume N is the number of digits of n. If the size of n is unlimited, it can't be done in general in O(1) time.
Consider the number n=11111...111, with 2 trillion digits. If I switch one of the digits from a 1 to a 2, there is no way to discover this without in some way looking at every single digit. Thus processing a number with 2 trillion digits must take (of the order of) 2 trillion operations at least, and in general, a number with N digits must take (of the order of) N operations at least.
However, for almost all numbers, the simple O(N) algorithm finishes very quickly because you can just stop as soon as you get to 10 distinct digits. Almost all numbers of sufficient length will have all 10 digits: e.g. the probability of not terminating with the answer '10' after looking at the first 100 digits is about 0.00027, and after the first 1000 digits it's about 1.7e-45. But unfortunately, there are some oddities which make the worst case O(N).
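A minimal Python sketch of the simple O(N) scan with the early exit described above (the function name is made up, and working on the decimal string is just one convenient way to get at the digits):

```python
def count_distinct_digits(n: int) -> int:
    """Count distinct decimal digits of n, stopping once all 10 have been seen."""
    seen = set()
    for ch in str(abs(n)):
        seen.add(ch)
        if len(seen) == 10:
            break  # no later digit can change the answer
    return len(seen)


print(count_distinct_digits(1519))  # 3 (digits 1, 5, 9)
```

The worst case is still O(N), for example a number made of a single repeated digit, but on typical long inputs the loop exits after a few dozen digits.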

After seeing that someone really posted a serious answer to this question, I'd rather repeat my own cheat here, which is a special case of the answer described by @SimonNickerson:
O(1) is not possible, unless you are on radix 2, because that way, every number other than 0 has both 1 and 0, and thus my "solution" works not only for integers...
EDIT
How about 2^k - 1? Isn't that all 1s?
Drat! True... I should have known that when something seems so easy, it is flawed somehow... If I got the all 0 case covered, I should have covered the all 1 case too.
Luckily this case can be tested quite quickly (if addition and bitwise AND are considered O(1) operations): if x is the number to be tested, compute y = (x+1) AND x. If y=0, then x = 2^k - 1, because this is the only case in which the addition flips all of the bits. Of course, this is quite a bit flawed: with bit lengths exceeding the bus width, the bitwise operators are not O(1) anymore, but rather O(N).
At the same time, I think it can be brought down to O(logN), by breaking the number into bus width size chunks, and AND-ing together the neighboring ones, repeating until only one is left: if there were no 0s in the number tested, the last one will be full 1s too...
EDIT2: I was wrong... This is still O(N).
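For what it's worth, the radix-2 trick can be written down in a few lines of Python. This is just a sketch for non-negative integers with a hypothetical function name. As conceded above, on arbitrarily long integers the addition and the AND are themselves O(N), so this is only "constant time" for numbers that fit in a machine word.

```python
def distinct_binary_digits(x: int) -> int:
    """Number of distinct digits in the binary representation of x (x >= 0)."""
    if x == 0:
        return 1  # only the digit 0
    if ((x + 1) & x) == 0:
        return 1  # x = 2^k - 1, so only the digit 1
    return 2      # anything else contains both a 0 and a 1


for x in (0, 1, 5, 7, 8, 255, 256):
    print(x, bin(x), distinct_binary_digits(x))
```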

Understanding assumptions about machine word size in analyzing computer algorithms

I am reading the book Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. In the second chapter, under "Analyzing Algorithms", it is mentioned that:
We also assume a limit on the size of each word of data. For example, when working with inputs of size n, we typically assume that integers are represented by c lg n bits for some constant c>=1. We require c>=1 so that each word can hold the value of n, enabling us to index the individual input elements, and we restrict c to be a constant so that the word size doesn't grow arbitrarily. (If the word size could grow arbitrarily, we could store huge amounts of data in one word and operate on it all in constant time - clearly an unrealistic scenario.)
My questions are why this assumption that each integer should be represented by c lg n bits and also how c>=1 being the case allows us to index the individual input elements ?

first, by lg they apparently mean log base 2, so lg n is (roughly) the number of bits needed to write n.
then what they are saying is that if they have an algorithm that takes a list of numbers (i am being more specific in my example to help make it easier to understand) like 1,2,3,...n then they assume that:
a "word" in memory is big enough to hold any of those numbers.
a "word" in memory is not big enough to hold all the numbers (in one single word, packed in somehow).
when calculating the number of "steps" in an algorithm, an operation on one "word" takes one step.
the reason they are doing this is to keep the analysis realistic (you can only store numbers up to some size in "native" types; after that you need to switch to arbitrary precision libraries) without choosing a particular example (like 32 bit integers) that might be inappropriate in some cases, or become outdated.

You need at least lg n bits to represent integers of size n, so that's a lower bound on the number of bits needed to store inputs of size n. Setting the constant c >= 1 makes it a lower bound. If the constant multiplier were less than 1, you wouldn't have enough bits to store n.
This is a simplifying step in the RAM model. It allows you to treat each individual input value as though it were accessible in a single slot (or "word") of memory, instead of worrying about complications that might arise otherwise. (Loading, storing, and copying values of different word sizes would take differing amounts of time if we used a model that allowed varying word lengths.) This is what's meant by "enabling us to index the individual input elements." Each input element of the problem is assumed to be accessible at a single address, or index (meaning it fits in one word of memory), simplifying the model.

This question was asked very long ago and the explanations really helped me, but I feel like there could still be a little more clarification about how the lg n came about. For me talking through things really helps:
Let's choose a random number in base 10, like 27. We need 5 bits to store this. Why? Well, because 27 is 11011 in binary. Notice 11011 has 5 digits, and each 'digit' is what we call a bit, hence 5 bits.
Think of each bit as being a slot. For binary, each of those slots can hold a 0 or 1. What's the largest number I can store with 5 bits? Well, the largest number would fill each slot: 11111
11111 = 31 = 2^5 - 1, so 5 bits can store every number up to 31, which is just below 2^5.
Generally (and I will use very explicit names for clarity):
numToStore ≈ 2 ^ numBitsNeeded
Since log (base 2 here) is the mathematical inverse of the exponent, we get:
log2(numToStore) ≈ numBitsNeeded
Since this is likely to not result in an integer, we use ceil to round our answer up. So applying our example to find how many bits are needed to store the number 31:
log2(31) = 4.954196310386876, which rounds up to 5 bits
(One caveat: for an exact power of two such as 32 the log is already an integer, 5, but 32 = 100000 needs 6 bits. The precise count is floor(log2(numToStore)) + 1.)
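To check the arithmetic, here is a small Python sketch; the helper name is made up. Note that rounding log2 up with ceil undercounts for exact powers of two (32 is 100000, which is 6 bits, while log2(32) is exactly 5), so the sketch uses floor(log2(n)) + 1 and compares it with Python's built-in int.bit_length.

```python
import math


def bits_needed(n: int) -> int:
    """Bits needed to write the positive integer n in binary."""
    return math.floor(math.log2(n)) + 1


for n in (27, 31, 32):
    print(n, bin(n)[2:], bits_needed(n), n.bit_length())
# 27 -> 11011  (5 bits)
# 31 -> 11111  (5 bits)
# 32 -> 100000 (6 bits)
```

For very large integers the floating-point log can be off by one near powers of two, so int.bit_length is the safer choice in real code.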

in a series of n elements of arithmetic progression, [n/2] elements are changed. Find the difference in the initial arithmetic progression

I have a list of size n which contains n consecutive members of an arithmetic progression which are not in order. I changed less than half of the elements in this list with some random integer. From this new list, how can I find the difference of the initial arithmetic progression?
I thought a lot about it but except brute force, I was not able to come up with any other thing :(
Thanks for thinking on this one :)

It's not possible to solve this in general and be 100% sure that your answer is correct. Let's say that the initial list is the following arithmetic progression (not in order):
1 3 2 4
Change less than half the elements at random... let's say for example that we changed 2 to 5:
1 3 5 4
If we could first find out which numbers we need to change to obtain a valid shuffled arithmetic sequence, then we could easily solve the problem stated in the question. However, we can see that there are multiple possible answers depending on which number we choose to change:
6, 3, 5, 4 (difference is 1)
1, 3, 2, 4 (difference is 1)
1, 3, 5, 7 (difference is 2)
There is no way to know which of these possible sequences is the original sequence, so you cannot be sure what the original difference was.
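The ambiguity is easy to confirm with a few lines of Python. This is only a sketch with a hypothetical helper that checks whether a list is a shuffled arithmetic progression and, if so, returns its common difference.

```python
def shuffled_ap_difference(values):
    """Return the common difference if `values` is a shuffled arithmetic
    progression (at least 2 elements), otherwise None."""
    s = sorted(values)
    d = s[1] - s[0]
    if all(s[i + 1] - s[i] == d for i in range(len(s) - 1)):
        return d
    return None


print(shuffled_ap_difference([1, 3, 5, 4]))  # None: the modified list is not a progression

# Three different single-element repairs, with two different answers.
for candidate in ([6, 3, 5, 4], [1, 3, 2, 4], [1, 3, 5, 7]):
    print(candidate, shuffled_ap_difference(candidate))
```

Each candidate differs from 1 3 5 4 in exactly one position, yet the recovered differences are 1, 1 and 2.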

Since there is no deterministic solution for the problem (as stated by @Mark Byers), you can try a probabilistic approach.
It's difficult to obtain the original progression, but its rate (the common difference) can be estimated by comparing the differences between elements. The difference between any two original elements will be a multiple of the rate.
Suppose you take 2 elements from the list (the probability that both of them belong to the original sequence is at least about 1/4, since more than half of the elements are unchanged) and compute their difference. This difference, with probability of about 1/4, will be a multiple of the rate. Decompose it into prime factors and count them (for example, 12 = 2^2 * 3 will add 2 to 2's counter and will increment 3's counter).
After many such iterations (it looks like a good problem for probabilistic methods, like Monte Carlo), you can analyze the counters.
If a prime factor belongs to the rate, its counter will be at least num_iterations/4 (or num_iterations/2 if it appears twice).
The main problem is that small factors will have a large probability on random input (for example, the difference between two random numbers has a 50% probability of being divisible by 2). So you'll have to compensate for it: since 3/4 of your differences were random, you'll have to consider that (3/8)*num_iterations of 2's counter must be ignored. Since this also applies to all powers of two, the simplest way is to pregenerate a "white noise mask" by taking the differences only between random numbers.
EDIT: let's take this approach further. Suppose you create this "white noise mask" (let's call it a spectrum) for random numbers, and consider it a base-1 spectrum, since their smallest "largest common factor" is 1. By computing it for the differences of the arithmetic sequence, you'll obtain a base-R spectrum, where R is the rate, and it will be equivalent to a shifted version of the base-1 spectrum. So you have to find the value of R such that
your_spectrum ~= spectrum(1)*3/4 + spectrum(R)*1/4
You could also look for the largest number R such that at least half of the elements are congruent modulo R.
