Unlimited integer increment running time - performance

I was asked in a job interview to build a 'simple' unlimited integer.
I was given endless space and time, so for 'simplicity' I just used a linked list of digits (0-9).
An 'increment' operation increments the first digit and cascades a carry to the next digit whenever the current one wraps around to 0 (i.e. it was 9).
So far so good.
Then I was asked what the running time of such an increment operation is, given a random initial state.
I got kind of stuck, and started talking about sums of logarithms in bases 10^n and such, until the interviewer got frustrated and told me it's constant time.
I can't figure it out by myself, but it seems to me that it can't be constant.
Please, can anyone explain this to me?
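The interviewer presumably meant expected (or amortized) constant time: only one increment in ten carries past the first digit, one in a hundred past the second, and so on, so the expected number of digits touched is at most 1 + 1/10 + 1/100 + ... = 10/9. For reference, a minimal sketch of the structure described, in Python (my choice of language, not the interview's), with the least significant digit first:

class Node:
    def __init__(self, digit, nxt=None):
        self.digit = digit
        self.next = nxt

def increment(head):
    node = head
    while True:
        node.digit += 1
        if node.digit < 10:          # no carry: done
            return head
        node.digit = 0               # digit was 9: wrap and carry on
        if node.next is None:
            node.next = Node(0)      # grow the number by one digit
        node = node.next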

Related

How to identify and count loops in R

Hi, please see the picture attached. I am trying to test out a trading strategy.
The data frame you see here is the result of it. I calculated the multiplier manually, and it took me 2 days since there are 60,000 rows.
So if you can, please teach me the code to calculate it automatically. Thanks in advance.
The rule is as follows:
The true profit is equal to the profit multiplied by the multiplier.
Every time I lose more than 1 point, a loop starts and I double the entry for the next trade, continuing until a trade wins back at least 90% of what I have lost in this loop. Then the multiplier returns to 1. (Please note: in the picture there are 2 entries where a multiplier of 4 occurs twice. That's because the first one did not win back at least 90% of my losses in the loop.)
The second thing I need help with is counting how many loops ended with each number:
how many ended with a multiplier of only 1, and how many ended with each of the other multipliers.
Thanks again
I haven't tried anything since I have no idea what to do.
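Since the picture isn't available, here is a sketch of one reading of that rule, in Python rather than R (the profit values are made up and the exact 90% bookkeeping is my assumption, so treat this as pseudocode for the logic, not a drop-in answer):

profits = [2.0, -1.5, -0.8, 3.0, -2.0, -1.2, 0.5, 4.0]   # made-up example data

multipliers = []
loop_end_counts = {}     # multiplier value -> number of loops ending there
mult = 1
loss_in_loop = 0.0       # cumulative loss since the current loop started

for p in profits:
    multipliers.append(mult)
    true_profit = p * mult                     # profit times the multiplier
    if mult == 1:
        if p < -1:                             # losing more than 1 point starts a loop
            loss_in_loop = -true_profit
            mult = 2
        else:
            loop_end_counts[1] = loop_end_counts.get(1, 0) + 1
    elif true_profit >= 0.9 * loss_in_loop:    # won back at least 90%: loop ends
        loop_end_counts[mult] = loop_end_counts.get(mult, 0) + 1
        mult = 1
    elif true_profit > 0:                      # partial win: same multiplier again
        loss_in_loop -= true_profit
    else:                                      # another loss: double the next entry
        loss_in_loop -= true_profit
        mult *= 2

print(multipliers)
print(loop_end_counts)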

Increase Speed of Wordle Bot (General For Any Programming Language)

I am working on a Wordle bot to calculate the best first (and subsequent) guess(es). I am assuming all 13,000 possible guesses are equally likely (for now), and I am running into a huge speed issue.
I cannot think of a valid way to avoid a triple for-loop, each over the 13,000 elements. Obviously, this is ridiculously slow and would take about 20 hours on my laptop to compute the best first guess (I assume it would be faster for subsequent guesses due to fewer valid words).
The way I see it, I definitely need the first loop to test each word as a guess.
I need the second loop to find the colors/results given each possible answer with that guess.
Then, I need the third loop to see which words remain valid given the guess and the colors.
Then, I find the average words remaining for each guess, and choose the one with the lowest average using a priority queue.
I think the first two loops are unavoidable, but maybe there's a way to get rid of the third?
Does anyone have any thoughts or suggestions?
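Not the poster's code, but a sketch of the computation with one standard trick: instead of a third 13,000-element loop testing every word for validity, bucket the answers by feedback pattern. The words still valid after a guess are exactly the ones sharing the true answer's pattern, so bucket sizes give the remaining counts directly. The feedback function below is deliberately simplified to greens only; real yellow handling with repeated letters is more involved.

from collections import Counter

def feedback(guess, answer):
    # simplified green/gray feedback, for illustration only
    return tuple(g if g == a else '.' for g, a in zip(guess, answer))

def expected_remaining(guess, answers):
    buckets = Counter(feedback(guess, a) for a in answers)   # second loop
    # an answer in a bucket of size s leaves s valid words, so the
    # average remaining count is sum(s * s) / n -- no third loop needed
    return sum(s * s for s in buckets.values()) / len(answers)

words = ["crane", "slate", "pious"]                            # stand-in word list
best = min(words, key=lambda g: expected_remaining(g, words))  # first loop
print(best)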
I did a similar thing for a list with 11,000 or so entries; it took 27 minutes. I used a pregenerated (built in one pass) record of the letters in each word, stored as a bitfield (i.e. one 32-bit integer), and then did crude testing using the AND instruction. If a word failed that test, the rest of the loop was skipped for it.
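A sketch of that bitfield idea in Python (word list and required letters are placeholders): each word's letters are packed into a 26-bit mask in one pass, and a single AND acts as a cheap pre-filter before any expensive per-word check.

def letter_mask(word):
    mask = 0
    for ch in word:
        mask |= 1 << (ord(ch) - ord('a'))   # one bit per letter a-z
    return mask

words = ["crane", "slate", "pious"]          # stand-in word list
masks = [letter_mask(w) for w in words]      # pregenerated in one pass

required = letter_mask("ae")                 # letters known to be present
for w, m in zip(words, masks):
    if m & required != required:             # crude AND test failed:
        continue                             # skip the expensive check
    print(w, "passes the cheap filter")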

Optimal sequence to brute force solve a keypad code lock [duplicate]

Imagine that you must open a locked door by inputting the correct 4-digit code on a keypad. After every keypress the lock evaluates the sequence of the last 4 digits inputted, i.e. by entering 123456 you have evaluated 3 codes: 1234, 2345 and 3456.
What is the shortest sequence of keypresses to evaluate all 10^4 different combinations?
Is there a method for traversing the entire space easy enough for a human to follow?
I have pondered this from time to time ever since a friend of mine had to brute-force such a lock so as not to spend the night outdoors in wintertime.
My feeble attempts at wrapping my head around it
With a code of length L = 4 digits and an "alphabet" of digits of size D = 10, the optimal sequence cannot be shorter than D^L + L - 1 keypresses. In simulations of smaller size than [L, D] = [4, 10] I have obtained optimal results by semi-randomly searching the space. However, I do not know whether a solution exists for an arbitrary [L, D] pair, and I would not be able to remember the solution if I ever had to use it.
Lessons learned so far
When planning to spend the night at a friend's house in another town, be sure not to arrive at 1 am if that person is going out to party and won't hear her cell phone.
I think you want a http://en.wikipedia.org/wiki/De_Bruijn_sequence - "a cyclic sequence of a given alphabet A with size k for which every possible subsequence of length n in A appears as a sequence of consecutive characters exactly once."
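For what it's worth, here is a Python sketch of the standard Lyndon-word construction of a de Bruijn sequence, unrolled for a linear (non-cyclic) keypad; it reproduces the D^L + L - 1 bound from the question:

def de_bruijn(k, n):
    # B(k, n): every length-n string over digits 0..k-1 appears exactly once
    a = [0] * k * n
    seq = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    s = "".join(str(d) for d in seq)
    return s + s[:n - 1]        # repeat the first n-1 digits for a linear keypad

print(len(de_bruijn(10, 4)))    # 10003 keypresses = 10^4 + 4 - 1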
The link Evgeny provided should answer both of your questions. This answer is a bit off-topic, but you asked for a solution a human can follow.
In the real world you should probably rely first on social engineering or heuristics, and only after that on mathematics. Here is a case from real life:
I went to visit an apartment and found out that my cellphone was dead. No way of contacting the person showing the place. I was about to head back when I saw that the door used a keypad with 0-9 plus A and B. I made several assumptions:
The code is 5 digits long. The length is pretty standard depending on the region you are in. I based this assumption on buildings I had access to before (legally :D).
The code starts with numbers, then either A or B (based on my own building).
The keypad was not brand new. Conclusion: the keys used in the code were a bit worn. I knew with certainty which numbers were not in the code, and three of the four numbers that were (given my previous assumptions).
From the number of worn keys I assumed the code didn't contain repeated digits (7 keys were worn; it was clear A was used and B was not).
In the end I had 3 numbers that were in the code for sure, 2 candidates for the last number, and I was sure A was at the end. One candidate key was only slightly worn compared to the other.
I just had to enumerate permutations, starting with the candidate that seemed more worn, which gives at most 4! + 4! = 48 tries. Believe me, on the 5th try the door opened. If I can add my 2 cents: the old put-in-a-key-and-turn-it approach is still the most reliable way to restrict access to a building.

About random number sequence generation

I am new to randomized algorithms and am teaching myself from books. I am reading Data Structures and Algorithm Analysis by Mark Allen Weiss.
Suppose we only need to flip a coin; thus, we must generate a 0 or 1 randomly. One way to do this is to examine the system clock. The clock might record time as an integer that counts the number of seconds since January 1, 1970 (at least on UNIX systems). We could then use the lowest bit. The problem is that this does not work well if a sequence of random numbers is needed. One second is a long time, and the clock might not change at all while the program is running. Even if the time were recorded in units of microseconds, if the program were running by itself the sequence of numbers that would be generated would be far from random, since the time between calls to the generator would be essentially identical on every program invocation. We see, then, that what is really needed is a sequence of random numbers. These numbers should appear independent. If a coin is flipped and heads appears, the next coin flip should still be equally likely to come up heads or tails.
Here are my questions on the above text snippet.
The author says we could count seconds and use the lowest bit, but that this does not work well because "one second is a long time, and the clock might not change at all". Why is one second a long time, given that the clock does change every second? In what context does the author mean that the clock does not change? Please help me understand with a simple example.
Also, why does the author say that even with microseconds we don't get a sequence of random numbers?
Thanks!
Programs using random (or, in this case, pseudo-random) numbers usually need plenty of them in a short time. That's one reason why simply using the clock doesn't really work: the system clock doesn't update as fast as your code requests new numbers, so you're quite likely to get the same result over and over again until the clock changes. It's probably more noticeable on Unix systems, where the usual method of getting the time only gives you second accuracy. And even microseconds don't really help, as computers are way faster than that by now.
The second problem you want to avoid is linear dependency of pseudo-random values. Imagine you want to place a number of dots in a square, randomly, so you pick an x and a y coordinate for each. If your pseudo-random values form a simple linear sequence (like what you'd obtain naïvely from a clock), you'd get a diagonal line with many points clumped together in the same place. That doesn't really work.
One of the simplest types of pseudo-random number generators, the Linear Congruential Generator, has a similar problem, even though it's not so readily apparent at first sight. Due to the very simple formula x_{n+1} = (a * x_n + c) mod m, you'll still get quite predictable results, albeit only visible once you plot points in 3D space: all the numbers lie on a limited number of distinct planes (a problem every pseudo-random generator exhibits at some dimension).
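A minimal LCG sketch in Python (the constants are the classic Numerical Recipes ones, chosen here purely for illustration), showing how every output comes from that one formula:

def lcg(seed):
    m, a, c = 2**32, 1664525, 1013904223    # modulus, multiplier, increment
    x = seed
    while True:
        x = (a * x + c) % m                 # the entire generator
        yield x / m                         # scale into [0, 1)

gen = lcg(42)
triples = [(next(gen), next(gen), next(gen)) for _ in range(5)]
print(triples)   # consecutive triples like these fall on a limited set of 3D planes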
Computers are fast. I'm oversimplifying, but if your clock speed is measured in GHz, a computer can do billions of operations in 1 second. Relatively speaking, 1 second is an eternity, so it is quite possible the clock value does not change between reads.
If your program is just doing its regular work, it is not guaranteed to sample the clock at a random moment, so you don't get a random number.
Don't forget that for a computer, a single second can be 'an eternity'. Programs and algorithms often execute in a matter of milliseconds (thousandths of a second).
The following pseudocode:

for (int i = 0; i < 1000; i++)
    n = rand(0, 1000)

fills n a thousand times with a random number between 0 and 1000. On a typical machine, this script finishes almost immediately.
You typically only initialize the seed once, at the beginning:

srand(time());
for (int i = 0; i < 1000; i++)
    n = rand(0, 1000)

This initializes the seed once and then executes the loop, generating a seemingly random set of numbers. The problem arises when you execute the code multiple times. Let's say the code executes in 3 milliseconds and is then executed again 3 milliseconds later, both within the same second: time() returns the same seed, and the result is the same set of numbers.
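A runnable illustration of that failure mode (Python here, though the pseudocode above is language-agnostic): two runs seeded from the same whole-second timestamp produce identical "random" sequences.

import random
import time

seed = int(time.time())       # second-resolution clock
a = random.Random(seed)       # first run
b = random.Random(seed)       # second run, started within the same second
print([a.randint(0, 1000) for _ in range(5)])
print([b.randint(0, 1000) for _ in range(5)])   # identical output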
For the second point: the author probably assumes a FAST computer. The above problem still holds even then.
What he means is that you cannot control how fast your computer, or any other computer, runs your code. Assuming 1 second per execution is far off the mark; if you run the code yourself you will see it executes in milliseconds, so even that resolution is not enough to ensure you get random numbers!

How do I find the running time given algorithm speed and computer speed?

I'm currently working through an assignment that deals with Big-O and running times. I have this one question presented to me that seems to be very easy, but I'm not sure if I'm doing it correctly. The rest of the problems have been quite difficult, and I feel like I'm overlooking something here.
First, you have these things:
Algorithm A, which has a running time of 50n^3.
Computer A, which has a speed of 1 millisecond per operation.
Computer B, which has a speed of 2 milliseconds per operation.
An instance of size 300.
I want to find how long it takes algorithm A to solve this instance on computer A, and how long it takes it on computer B.
What I want to do is substitute 300 for n, so you have 50 * (300^3) = 1,350,000,000 operations.
Then, multiply that by 1 for the first computer, and by 2 for the second computer.
This feels odd to me, though, because it says the "running time" is 50n^3, not, "the number of operations is 50n^3", so I get the feeling that I'm multiplying time by time, and would end up with units of milliseconds squared, which doesn't seem right at all.
I would like to know if I'm right, and if not, what the question actually means.
It wouldn't make sense if you had O(n^3), but you are not using big-O notation in your example. I.e. if you had O(n^3) you would know the complexity of your algorithm, but you would not know the exact number of operations, as you said.
Instead, it looks as though you are given the exact number of operations taken (even though it is not explicitly stated). So substituting for n makes sense.
Big-O notation describes how the size of the input affects your running time or memory usage, but with big-O alone you could not deduce an exact running time, even given the speed of each operation.
Adding an explanation of why your answer looks so simple (as described above) would also be a safe move, but I'm sure even without it you'll get the marks for the question.
Well, aside from the pointlessness of estimating how long something will take this way on most modern computers (though it might have some meaning in an embedded system), the way you did it does look right to me.
If the algorithm needs 50n^3 operations to complete something, where n is the number of elements to process, then substituting 300 for n will give you the number of operations to perform, not a time-unit.
So multiply with time per operation and you would get the time needed.
Looks right to me.
Your 50*n^3 figure is called "running time", but that's because the model used for speed evaluation assumes a machine with several basic operations, each of which takes 1 time unit.
In your case, running the algorithm takes 50 * 300^3 time units. On computer A each time unit is 1 ms, and on computer B it is 2 ms.
Hope this puts some sense into the units,
Asaf.
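For completeness, a quick check of the arithmetic under the stated model (50 * n^3 operations, 1 ms per operation on computer A and 2 ms on computer B):

n = 300
ops = 50 * n**3                          # 1,350,000,000 operations
print(ops * 1 / 1000, "seconds on A")    # 1,350,000 s, about 15.6 days
print(ops * 2 / 1000, "seconds on B")    # 2,700,000 s, about 31.3 days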
