How many times does a zero occur on an odometer - ruby

I am trying to work out how many times a zero occurs on an odometer. I count +1 every time I see a zero.
10 -> +1
100 -> +2 because in 100 I see 2 zeros
10004 -> +3 because I see 3 zeros
So I get,
1 - 100 -> +11
1 - 500 -> +91
1 - 501 -> +92
0 - 4294967295 -> +3825876150
I used rubydoctest for it. I am not doing anything with begin_number yet. Can anyone explain how to calculate it without a brute force method?
I made many attempts. They work for numbers like 10, 1000, 10000 and 100000000, but not for numbers like 522 or 2280. If I run rubydoctest, it fails on # >> algorithm_count_zero(1, 500)
# doctest: algorithm_count_zero(begin_number, end_number)
# >> algorithm_count_zero(1, 10)
# => 1
# >> algorithm_count_zero(1, 1000)
# => 192
# >> algorithm_count_zero(1, 10000000)
# => 5888896
# >> algorithm_count_zero(1, 500)
# => 91
# >> algorithm_count_zero(0, 4294967295)
# => 3825876150
def algorithm_count_zero(begin_number, end_number)
  power = Math::log10(end_number) - 1
  if end_number < 100
    return end_number/10
  else
    end_number > 100
    count = (9*(power)-1)*10**power+1
  end
  answer = ((((count / 9)+power)).floor) + 1
end
end_number = 20000
begin_number = 10000
puts "Algorithm #{algorithm_count_zero(begin_number, end_number)}"

As noticed in a comment, this is a duplicate of another question, where the solution gives you correct guidelines.
However, if you want to test your own solution for correctness, I'll put here a one-liner in the parallel array processing language Dyalog APL (which, by the way, I think everyone modelling mathematics and numbers should use).
Using tryapl.org you'll be able to get a correct answer for any integer value as argument. Tryapl is a web page with a backend that executes simple APL code statements ("one-liners", which are very typical of the APL language and its extremely compact code).
The APL one-liner is here:
{+/(c×1+d|⍵)+d×(-c←0=⌊(a|⍵)÷d←a×+0.1)+⌊⍵÷a←10*⌽⍳⌈10⍟⍵} 142857
Copy that and paste it into the edit row at tryapl.org, and press enter - you will quickly see an integer, which is the answer to your problem. In the code row above, you can see the argument rightmost; it is 142857 this time but you can change it to any integer.
Once you have pasted the one-liner and executed it with Enter, the easiest way to get it back for editing is to press [Up arrow]. This returns the most recently entered statement; then you can edit the number sitting rightmost (after the curly brace) and press Enter again to get the answer for a different argument.
Pasting the code row above will return 66765 - that many zeroes exist for 142857.
If you paste the row below, which is 2 characters shorter, you will see the individual components of the result - the sum of these components makes up the final result. You will be able to see a pattern, which possibly makes it easier to understand what happens.
Try for example
{(c×1+d|⍵)+d×(-c←0=⌊(a|⍵)÷d←a×+0.1)+⌊⍵÷a←10*⌽⍳⌈10⍟⍵} 1428579376
0 100000000 140000000 142000000 142800000 142850000 142857000 142857900 142857930 142857937
... and see how the intermediate results contain segments of the argument 1428579376, starting from the left! There are as many intermediate results as there are digits in the argument (10 this time).
The result for 1428579376 will be 1239080767, i.e. the sum of the 10 numbers above. This many zeroes appear in all numbers between 1 and 1428579376 :-).

Consider each odometer position separately. The position x places from the far right changes once every 10^x times. By looking at the numbers to its right, you know how long it will be until it next changes. It will then hold each value for 10^x times before changing, until it reaches the end of the range you are considering, when it will hold its value at that time for some number of times that you can work out given the value at the very end of the range.
Now you have a sequence of the form x...0123456789012...y where you know the length and you know the values of x and y. One way to count the number of 0s (or any other digit) within this sequence is to clip off the prefix from x.. to just before the first 0, and clip off the suffix from just after the last 9 to y. Look for 0s in this prefix and suffix, and measure the length of the long sequence between them. This will be of a length divisible by 10, and will contain each digit the same number of times.
Based on this you should be able to work out, for each position, how often within the range it will assume each of its 10 possible values. By summing up the values for 0 from each of the odometer positions you get the answer you want.
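The per-position idea above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from the linked duplicate: the helper names count_zeros_upto and count_zeros are made up here, numbers are written without leading zeros, and the number 0 itself is counted as one zero (which is what the 0 - 4294967295 figure in the question implies).

```python
def count_zeros_upto(n):
    """Count the zero digits written when listing 1..n (no leading zeros)."""
    total = 0
    d = 1  # weight of the position being examined: 1, 10, 100, ...
    while d <= n:
        high = n // (d * 10)   # digits to the left of this position
        cur = (n // d) % 10    # digit at this position in n
        low = n % d            # digits to the right of this position
        if cur == 0:
            # The last block of zeros at this position is only partly reached.
            total += (high - 1) * d + low + 1
        else:
            # Every complete block of d zeros at this position has been passed.
            total += high * d
        d *= 10
    return total


def count_zeros(begin_number, end_number):
    """Zeros in begin_number..end_number inclusive (hypothetical helper)."""
    if begin_number == 0:
        # Assumption: the single digit 0 counts as one zero.
        return count_zeros_upto(end_number) + 1
    return count_zeros_upto(end_number) - count_zeros_upto(begin_number - 1)


print(count_zeros(1, 500))
```

Each position costs a constant amount of work, so the running time grows with the number of digits in end_number rather than with its value.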

Related

Quick way to compute n-th sequence of bits of size b with k bits set?

I want to develop a way to represent all combinations of b bits with k bits set (equal to 1). Given an index, it needs to quickly produce the related binary sequence, and the other way around too. For instance, the traditional approach I thought of would be to generate the numbers in order, like:
For b=4 and k=2:
0- 0011
1- 0101
2- 0110
3- 1001
4- 1010
5- 1100
If I am given the sequence '1010', I want to be able to quickly generate the number 4 as a response, and if I give the number 4, I want to be able to quickly generate the sequence '1010'. However I can't figure out a way to do these things without having to generate all the sequences that come before (or after).
It is not necessary to generate the sequences in that order, you could do 0-1001, 1-0110, 2-0011 and so on, but there has to be no repetition between 0 and the (combination of b choose k) - 1 and all sequences have to be represented.
How would you approach this? Is there a better algorithm than the one I'm using?
pkpnd's suggestion is on the right track: essentially, process one digit at a time and, if it's a 1, count the number of options that exist below it via standard combinatorics.
nCr() can be replaced by a table precomputation requiring O(n^2) storage/time. There may be another property you can exploit to reduce the number of nCr's you need to store by leveraging the absorption property along with the standard recursive formula.
Even with 1000's of bits, that table shouldn't be intractably large. Storing the answer also shouldn't be too bad, as 2^1000 is ~300 digits. If you meant hundreds of thousands, then that would be a different question. :)
import math
def nCr(n,r):
    return math.factorial(n) // math.factorial(r) // math.factorial(n-r)
def get_index(value):
    b = len(value)
    k = sum(c == '1' for c in value)
    count = 0
    for digit in value:
        b -= 1
        if digit == '1':
            if b >= k:
                count += nCr(b, k)
            k -= 1
    return count
print(get_index('0011')) # 0
print(get_index('0101')) # 1
print(get_index('0110')) # 2
print(get_index('1001')) # 3
print(get_index('1010')) # 4
print(get_index('1100')) # 5
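The question also asks for the opposite direction, from index to sequence. A possible sketch, assuming the same ordering as the table in the question and Python 3.8+ for math.comb, is below; the name get_sequence is made up for this illustration and is not part of the answer's code.

```python
from math import comb


def get_sequence(index, b, k):
    """Return the b-bit string with k ones that sits at position `index`."""
    bits = []
    for remaining in range(b - 1, -1, -1):
        # `remaining` bits lie to the right of the current position.
        # comb(remaining, k) sequences keep a 0 here and fit all k ones below.
        below = comb(remaining, k)
        if k > 0 and index >= below:
            bits.append('1')
            index -= below
            k -= 1
        else:
            bits.append('0')
    return ''.join(bits)


for i in range(6):
    print(i, get_sequence(i, 4, 2))
```

It walks the bits from the left and undoes the additions that get_index performs, so each call needs only b binomial coefficients.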
Nice question, btw.

Complex Numbers Seemingly Arising from Non-Complex Logarithms

I have a simple program written in TI-BASIC that converts from base 10 to base 2
0->B
1->E
Input "DEC:",D
Repeat D=0
int(round(log(D)/log(2),1))->E
round(E)->E
B+10^E->B
D-2^E->D
End
Disp B
This will sometimes return the error 'ERR: DATA TYPE'. I checked, and this is because the variable D will sometimes become a complex number. I am not sure how this happens.
This happens with seemingly random numbers, like 5891570. It happens with this number, but not with something close to it like 5891590, which is strange. It also happens with 1e30, but not 1e25. Another example is 1111111111111111, but not 1111111111111120.
I haven't tested this thoroughly, and don't see any pattern in these numbers. Any help would be appreciated.
The error happens because you round the logarithm to one decimal place before taking the integer part; therefore, if log(D)/log(2) is something like 8.99, you will round E up rather than down, and 2^9 will be subtracted from D instead of 2^8, causing, in the next iteration, D to become negative and its logarithm to be complex. Let's walk through your code when D is 511, which has base-2 logarithm 8.9971:
Repeat D=0 ;Executes first iteration without checking whether D=0
log(D)/log(2 ;8.9971
round(Ans,1 ;9.0
int(Ans ;9.0
round(Ans)->E ;E = 9.0
B+10^E->B ;B = 1 000 000 000
D-2^E->D ;D = 511-512 = -1
End ;loops again, since D≠0
---next iteration:----
log(D ;log(-1) = 1.364i; throws ERR:NONREAL ANS in Real mode
Rounding the logarithm any more severely than nine decimal places (nine digits is the default for round( without a "digits" argument) is completely unnecessary, as on my TI-84+ rounding errors do not accumulate: round(int(log(2^X-1)/log(2)) returns X-1 and round(int(log(2^X)/log(2)) returns X for all integer X≤28, which is high enough that precision would be lost anyway in other parts of the calculation.
To fix your code, simply round only once, and only to nine places. I've also removed the unnecessary double-initialization of E, removed your close-parens (it's still legal code!), and changed the Repeat (which always executes one loop before checking the condition D=0) to a While loop to prevent ERR:DOMAIN when the input is 0.
0->B
Input "DEC:",D
While D
int(round(log(D)/log(2->E
B+10^E->B
D-2^E->D
End
B ;on the last line, so it prints implicitly
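The same rounding effect can be reproduced outside the calculator. The Python sketch below is only an analogue of the TI-BASIC programs above, with made-up function names; it compares rounding the logarithm to one decimal place against rounding to nine, and it has only been checked for small inputs.

```python
import math


def buggy_exponent(d):
    # Mirrors int(round(log(D)/log(2),1)): one decimal place, then truncate.
    return int(round(math.log10(d) / math.log10(2), 1))


def fixed_exponent(d):
    # Mirrors int(round(log(D)/log(2))) with nine decimal places.
    return int(round(math.log10(d) / math.log10(2), 9))


def to_binary_digits(d):
    """Decimal number whose digits spell d in base 2 (e.g. 5 -> 101)."""
    b = 0
    while d:
        e = fixed_exponent(d)
        b += 10 ** e
        d -= 2 ** e
    return b


print(buggy_exponent(511), 511 - 2 ** buggy_exponent(511))
print(fixed_exponent(511), to_binary_digits(511))
```

With one-decimal rounding, 511 is treated as if it were at least 2^9, so the subtraction leaves -1; the next logarithm is then taken of a negative number (Python raises ValueError there, where the calculator reports a complex result).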
Don't expect either your code or my fix to work correctly for D > 2^13 or so, because your calculator can only store 14 digits in its internal representation of any number. You'll lose the digits while you store the result into B!
Now for a trickier, optimized way of computing the binary representation (still only works for D < 2^13):
Input D
int(2fPart(D/2^cumSum(binomcdf(13,0
.1sum(Ans10^(cumSum(1 or Ans

Scope of variables and the digits function

My question is twofold:
1) As far as I understand, constructs like for loops introduce scope blocks; however, I'm having some trouble with a variable that is defined outside of said construct. The following code depicts an attempt to extract the digits from a number and place them in an array.
n = 654068
l = length(n)
a = Int64[]
for i in 1:(l-1)
    temp = n/10^(l-i)
    if temp < 1 # ith digit is 0
        a = push!(a,0)
    else # ith digit is != 0
        push!(a,floor(temp))
        # update n
        n = n - a[i]*10^(l-i)
    end
end
# last digit
push!(a,n)
The code executes fine, but when I look at the a array I get this result
julia> a
0-element Array{Int64,1}
I thought that anything that goes on inside the for loop is invisible to the outside, unless I'm operating on variables defined outside the for loop. Moreover, I thought that by using the ! syntax I would operate directly on a, this does not seem to be the case. Would be grateful if anyone can explain to me how this works :)
2) The second question is about the syntax used when explaining functions. There is apparently a function called digits that extracts digits from a number and puts them in an array. Using the help function I get
julia> help(digits)
Base.digits(n[, base][, pad])
Returns an array of the digits of "n" in the given base,
optionally padded with zeros to a specified size. More significant
digits are at higher indexes, such that "n ==
sum([digits[k]*base^(k-1) for k=1:length(digits)])".
Can anyone explain to me how to interpret the information given about functions in Julia? How am I to interpret digits(n[, base][, pad])? How does one correctly call the digits function? It can't be like this: digits(40125[, 10])?
I'm unable to reproduce your result; running your code gives me
julia> a
1-element Array{Int64,1}:
654068
There are a few mistakes and inefficiencies in the code:
length(n) doesn't give the number of digits in n, but always returns 1 (currently, numbers are iterable, and return a sequence that only contains one number: itself). So the for loop is never run.
/ between integers does floating point division. For extracting digits, you're better off with div(x,y), which does integer division.
There's no reason to write a = push!(a,x), since push! modifies a in place. So it will be equivalent to writing push!(a,x); a = a.
There's no reason to treat digits that are zero specially; they are handled just fine by the general case.
Your description of scoping in Julia seems to be correct, I think that it is the above which is giving you trouble.
You could use something like
n = 654068
a = Int64[]
while n != 0
    push!(a, n % 10)
    n = div(n, 10)
end
reverse!(a)
This loop extracts the digits in opposite order to avoid having to figure out the number of digits in advance, and uses the modulus operator % to extract the least significant digit. It then uses reverse! to get them in the order you wanted, which should be pretty efficient.
About the documentation for digits, [, base] just means that base is an optional parameter. The description should probably be digits(n[, base[, pad]]), since it's not possible to specify pad unless you specify base. Also note that digits will return the least significant digit first, which is what we get if we remove the reverse! from the code above.
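To make the bracket notation concrete, here is a rough Python analogue in which base and pad are optional arguments with defaults. It is an illustration only, not Julia's implementation: the default values are assumptions, and it is meant for positive n.

```python
def digits(n, base=10, pad=0):
    """Digits of a positive integer n, least significant digit first."""
    out = []
    while n != 0:
        n, remainder = divmod(n, base)
        out.append(remainder)
    # Pad with zeros at the most significant end, up to `pad` entries.
    while len(out) < pad:
        out.append(0)
    return out


print(digits(654068))       # only n given
print(digits(5, 2))         # n and base
print(digits(5, 2, 8))      # n, base and pad
```

The three calls correspond to digits(n), digits(n, base) and digits(n, base, pad); the square brackets in the help text are never typed.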
Is this cheating?:
n = 654068
nstr = string(n)
a = map((x) -> x |> string |> int , collect(nstr))
outputs:
6-element Array{Int64,1}:
6
5
4
0
6
8

Generate infinite stream of unique numbers between 0 and 1

Came across this question previously on an interview. The requirements are to write a function that
Generates a number between 0..1
Never returns the same number
Can scale (called every few milliseconds and continuously for years)
Can use only 1mb of heap memory
Does not need to return as a decimal, can render directly to stdout
My idea was hacky at best: it involved manipulating a string of the form "0.1", then "0.11", then "0.12", etc. Since the requirements did not mention that it had to be uniformly distributed, it does not need to be random. Another idea is to generate a timestamp of the form yyyyMMddhhmmssSSS (where SSS is msec), then convert that to a string and prefix it with "0.". This way the values will always be unique.
It's a pretty open ended question and I'm curious how other people would tackle it.
Pseudocode that does what you want, except that it cannot guarantee no repeats:
1. Take your 1 MB allocation.
2. Randomly set every byte.
3. Echo to stdout as "0.<bytes as integer string>" (will be very long)
4. Go to #2
Your "Never returns the same number" is not guaranteed, but a repeat is extremely unlikely (about 1 in 2^8388608 for any given pair of outputs, since 1 MB is 8,388,608 bits), assuming a good implementation of Random.
Allocate about a million characters and set them initially to all 0.
Then each call to the function simply increments the number and returns it, something like:
# Gives you your 1MB heap space.
num = new digit/byte/char/whatever[about a million]
# Initialise all digits to zero (1-based arrays).
def init():
    for posn ranges from 1 to size(num):
        set num[posn] to 0
# Print next value.
def printNext():
    # Carry-based add-1-to-number.
    # Last non-zero digit stored for truncated output.
    set carry to 1
    set posn to size(num)
    set lastposn to posn
    # Keep going until no more carry or out of digits.
    while posn is greater than 0 and carry is 1:
        # Detect carry and continue, or increment and stop.
        if num[posn] is '9':
            set num[posn] to '0'
            set lastposn to posn minus 1
        else:
            set num[posn] to num[posn] + 1
            set carry to 0
        set posn to posn minus one
    # Carry set after all digits means you've exhausted all numbers.
    if carry is 1:
        exit badly
    # Output the number.
    output "0."
    for posn ranges from 1 to lastposn
        output num[posn]
The use of lastposn prevents the output of trailing zeros. If you don't care about that, you can remove every line with lastposn in it and run the output loop from 1 to size(num) instead.
Calling this every millisecond will give you well over 10^(some big number) years of run time, far longer than the age of the universe.
I wouldn't go with your time-based solution because the time may change - think daylight savings or summer time and people adjusting clocks due to drift.
Here's some actual Python code which demonstrates it:
import sys
num = "00000"
def printNext():
    global num
    carry = 1
    posn = len(num) - 1
    lastposn = posn
    while posn >= 0 and carry == 1:
        if num[posn:posn+1] == '9':
            num = num[:posn] + '0' + num[posn+1:]
            lastposn = posn - 1
        else:
            num = num[:posn] + chr(ord(num[posn:posn+1]) + 1) + num[posn+1:]
            carry = 0
        posn = posn - 1
    if carry == 1:
        print("URK!")
        sys.exit(0)
    s = "0."
    for posn in range(0, lastposn + 1):
        s = s + num[posn:posn+1]
    print(s)
for i in range(0, 15):
    printNext()
And the output:
0.00001
0.00002
0.00003
0.00004
0.00005
0.00006
0.00007
0.00008
0.00009
0.0001
0.00011
0.00012
0.00013
0.00014
0.00015
Your method would eventually use more than 1 MB of heap memory. However you represent numbers, if you are constrained to 1 MB of heap then there is only a finite number of values. I would take the maximum amount of memory possible and increment the least significant bit by one on each call. That would ensure running as long as possible before returning a repeated number.
Yes, because there is no random requirement, you have a lot of flexibility.
The idea here I think is very close to that of enumerating all strings over the regular expression [0-9]* with a couple modifications:
the real string starts with the sequence 0.
you cannot end with a 0
So how would you enumerate? One idea is
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.11 0.12 0.13 0.14 0.15 ... 0.19 0.21 0.22 ... 0.29 0.31 ... 0.99 0.101 0.102 ...
The only state you need here is an integer I think. Just be clever in skipping those zeros at the end (not difficult really). 1 MB of memory should be fine. It stores a massive massive integer, so I think you would be good here.
(It is different from yours because I generate all one character strings, then all two character strings, then all three character strings, ... so I believe there is no need for state other than the last number generated.)
Then again I may be wrong; I haven't tried this.
ADDENDUM
Okay I will try it. Here is the generator in Ruby
i = 0
while true
  puts "0.#{i}" if i % 10 != 0
  i += 1
end
Looks okay to me....
If you are programming in C, the nextafter() family of functions are Posix-compatible functions useful for producing the next double after or before any given value. This will give you about 2^64 different values to output, if you output both positive and negative values.
If you are required to print out the values, use the %a or %A format for exact representation. From the printf(3) man page: "For 'a' conversion, the double argument is converted to hexadecimal notation (using the letters abcdef) in the style [-]0xh.hhhhp±d..." "The default precision suffices for an exact representation of the value if an exact representation in base 2 exists..."
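For readers who want to try the nextafter() idea without a C toolchain, here is a rough Python sketch. It assumes Python 3.9 or later, where math.nextafter is available, and uses float.hex() as the counterpart of the %a format; the generator name ascending_doubles is made up for this illustration.

```python
import math
from itertools import islice


def ascending_doubles(start=0.0, stop=1.0):
    """Yield every double strictly between start and stop, in ascending order."""
    x = start
    while True:
        x = math.nextafter(x, stop)  # next representable double towards stop
        if x >= stop:
            return
        yield x


for value in islice(ascending_doubles(), 3):
    # float.hex() gives an exact hexadecimal representation, like C's %a.
    print(value.hex())
```

The only state kept between calls is the last double returned, so the memory requirement is trivially met; the number of distinct outputs is limited by the number of doubles in the interval.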
If you want to generate random numbers rather than sequentially ascending ones, perhaps do a google search for 64-bit KISS RNG. Implementations in Java, C, Ada, Fortran, et al are available on the web. The period of 64-bit KISS RNG itself is ~ 2^250, but there are not that many 64-bit double-precision numbers, so some numbers will re-appear within 2^64 outputs, but with different neighbor values. On some systems, long doubles have 128-bit values; on others, only 80 or 96. Using long doubles, you could accordingly increase the number of different values output by combining two randoms into each output.
It may be that the point of this question in an interview is to figure out if you can recognize a silly spec when you see it.

String to Number and back algorithm

This is a hard one (for me) I hope people can help me. I have some text and I need to transfer it to a number, but it has to be unique just as the text is unique.
For example:
The word 'kitty' could produce 12432, but only the word kitty produces that number. The text could be anything and a proper number should be given.
One problem: the resulting integer must be a 32-bit unsigned integer, which means the largest possible number is 2147483647. I don't mind if there is a text length restriction, but I hope it can be as large as possible.
My attempts: you have the letters A-Z and 0-9, so one character can have a number between 1 and 36. But if A = 1 and B = 2 and the text is A(1)B(2) and you add them, you get 3. The problem is that the text BA produces the same result, so this algorithm won't work.
Any ideas to point me in the right direction or is it impossible to do?
Your idea is generally sane, only needs to be developed a little.
Let f(c) be a function converting character c to a unique number in range [0..M-1]. Then you can calculate result number for the whole string like this.
f(s[0]) + f(s[1])*M + f(s[2])*M^2 + ... + f(s[n])*M^n
You can easily prove that number will be unique for particular string (and you can get string back from the number).
Obviously, you can't use very long strings here (up to 6 characters for your case), as 36^n grows fast.
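A minimal Python sketch of this formula follows, assuming the 36-character alphabet A-Z then 0-9 (so M = 36). The names ALPHABET, encode and decode are illustrative only. One caveat the sketch makes visible: a trailing character that maps to 0 contributes nothing to the sum, so decoding here assumes the string length is known.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"  # assumed character set
M = len(ALPHABET)  # 36


def encode(s):
    """f(s[0]) + f(s[1])*M + f(s[2])*M^2 + ..."""
    total = 0
    for i, ch in enumerate(s):
        total += ALPHABET.index(ch) * M ** i
    return total


def decode(number, length):
    """Recover a string of the given length from its number."""
    chars = []
    for _ in range(length):
        number, remainder = divmod(number, M)
        chars.append(ALPHABET[remainder])
    return "".join(chars)


n = encode("KITTY")
print(n, decode(n, 5))
print(encode("AB"), encode("BA"))
```

Because each character is weighted by its position, AB and BA no longer collide, and any string of up to 6 characters stays below 2^32.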
Imagine you were trying to store Strings from the character set "0-9" only in a number (the equivalent of obtaining a number of a string of digits). What would you do?
Char 9 8 7 6 5 4 3 2 1 0
Str 0 5 2 1 2 5 4 1 2 6
Num = 6 * 10^0 + 2 * 10^1 + 1 * 10^2...
Apply the same thing to your characters.
Char 5 4 3 2 1 0
Str A B C D E F
L = 36
C(I): transforms character to number: C(0)=0, C(A)=10, C(B)=11, ...
Num = C(F) * L ^ 0 + C(E) * L ^ 1 + ...
Build a dictionary out of words mapped to unique numbers and use that, that's the best you can do.
I doubt there are more than 2^32 number of words in use, but this is not the problem you're facing, the problem is that you need to map numbers back to words.
If you were only mapping words over to numbers, some hash algorithm might work, although you'd have to work a bit to guarantee that you have one that won't produce collisions.
However, for numbers back to words, that's quite a different problem, and the easiest solution to this is to just build a dictionary and map both ways.
In other words:
AARDUANI = 0
AARDVARK = 1
...
If you want to map numbers to base 26 characters, you can only store 6 characters (or 5 or 7 if I miscalculated), but not 12 and certainly not 20.
Unless you only count actual words, and they don't follow any good countable rules. The only way to do that is to just put all the words in a long list, and start assigning numbers from the start.
If it's correctly spelled text in some language, you can have a number for each word. However you'd need to consider all possible plurals, place and people names etc. which is generally impossible. What sort of text are we talking about? There's usually going to be some existing words that can't be coded in 32 bits in any way without prior knowledge of them.
Can you build a list of words as you go along? Just give the first word you see the number 1, second number 2 and check if a word has a number already or it needs a new one. Then save your newly created dictionary somewhere. This would likely be the only workable solution if you require 100% reliable, reversible mapping from the numbers back to original words given new unknown text that doesn't follow any known pattern.
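The build-as-you-go dictionary could look roughly like the Python sketch below. The class and method names are hypothetical, and saving the dictionary to disk is left out.

```python
class WordNumbers:
    """Assigns 1, 2, 3, ... to words in the order they are first seen."""

    def __init__(self):
        self._number_of = {}  # word -> number
        self._words = []      # position (number - 1) -> word

    def number_for(self, word):
        if word not in self._number_of:
            self._words.append(word)
            self._number_of[word] = len(self._words)
        return self._number_of[word]

    def word_for(self, number):
        return self._words[number - 1]


mapping = WordNumbers()
print(mapping.number_for("kitty"))
print(mapping.number_for("aardvark"))
print(mapping.number_for("kitty"))
print(mapping.word_for(2))
```

Keeping both a dict and a list makes the lookup cheap in each direction, at the cost of storing every word once in each structure.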
With 64 bits and a sufficiently good hash like MD5 it's extremely unlikely to have collisions, but for 32 bits it doesn't seem likely that a safe hash would exist.
Just treat each character as a digit in base 36, and calculate the decimal equivalent?
So:
'A' = 0
'B' = 1
[...]
'Z' = 25
'0' = 26
[...]
'9' = 35
'AA' = 36
'AB' = 37
[...]
'CAB' = 46657
