LMC program to find the difference between double the median and the smallest of 3 inputs? - algorithm

I want to write an LMC program to find the difference between twice the median and the smallest of 3 distinct inputs efficiently. I would like some help in figuring out an algorithm for this.
Here is what I have so far:
INPUT 901 - Input first
STO 399 - Store in 99 (a)
INPUT 901 - Input second
STO 398 - Store in 98 (b)
INPUT 901 - Input third
STO 397 - Store in 97 (c)
LOAD 597 - Load 97 (a)
SUB 298 - Subtract 97 - 98 (a - b)
BRP 8xx - If value positive go to xx (if value is positive a > b else b > a)
LOAD 598 - Load 98 (b)
SUB 299 - Subtract 98 - 99 (b - c)
BRP 8xx - If value positive go to xx (if value is positive b > c else c > b)
LOAD 598 - Load 98 (b) which is the median
ADD 198 - Double to get "twice the median"
I realized at the end of the snippet I didn't know which input was the smallest and was assuming the inputs were already sorted (which they aren't).
I think I will need to somehow sort the inputs from smallest to largest to do this efficiently and determine the smallest input and the median within the same branch.

I don't know little-man-computer language, but it doesn't matter, it's an algorithm question.
First of all, you made a little confusion naming the three parameters (first you said that 99 was a, then you said 97 was a).
You must load the three parameters in 99, 98, 97 (say a, b, c).
Then, you load 99 (a) and subtract 98 (b) from 99 (a).
If the result is positive (99 is greater than 98), you have to swap 98 and 99, so the smallest between the two is in location 99.
Now load 98 (c) and subtract 97 from it. If the result is positive, swap 97 and 98, so the smallest between the two is in location 98.
Finally, you have the two smallest numbers in 98 and 99 locations, that is the smallest and the median.
Load 99 and subtract 98 from it. If the result is positive, 99 contains the median and 98 the smallest, otherwise the contrary.
Now you can double the median one, and calculate the difference between this number and the smallest.

Related

The 1000th element which is product of 2, 3, 5

There is a sequence S.
All the elements in S is product of 2, 3, 5.
S = {2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24 ...}
How to get the 1000th element in this sequence efficiently?
I check each number from 1, but this method is too slow.
A geometric approach:
Let s = 2^i . 3^j . 5^k, where the triple (i, j, k) belongs to the first octant of a 3D state space.
Taking the logarithm,
ln(s) = i.ln(2) + j.ln(3) + k.ln(5)
so that in the state space the iso-s surfaces are planes, which intersect the first octant along a triangle. On the other hand, the feasible solutions are the nodes of a square grid.
If one wants to produce the s-values in increasing order, one can keep a list of the grid nodes closest to the current s-plane*, on its "greater than" side.
If I am right, to move from one s-value to the next, it suffices to discard the current (i, j, k) and replace it by the three triples (i+1, j, k), (i, j+1, k) and (i, j, k+1), unless they are already there, and pick the next smallest s.
An efficient implementation will be by storing the list as a binary tree with the log(s)-value as the key.
If you are asking for the first N values, you will explore a pyramidal volume of state-space of height O(³√N), and base area O(³√N²), which is the number of tree nodes, hence the spatial complexity. Every query in the tree will take O(log(N)) comparisons (and O(1) operations to fetch the minimum), for a total of O(N.log(N)).
*More precisely, the list will contain all triples on the "greater than" side and such that no index can be decreased without getting on the other side of the plane.
Here is Python code that implements these ideas.
You will notice that the logarithms are converted to fixed point (7 decimals) to avoid floating-point inaccuracies that could result in the log(s)-values not being found equal. This causes the s values being inexact in the last digits, but this does not matter as long as the ordering of the values is preserved. Recomputing the s-values from the indexes yields exact values.
import math
import bintrees
# Constants
ln2= round(10000000 * math.log(2))
ln3= round(10000000 * math.log(3))
ln5= round(10000000 * math.log(5))
# Initial list
t= bintrees.FastAVLTree()
t.insert(0, (0, 0, 0))
# Find the N first products
N= 100
for i in range(N):
# Current s
s= t.pop_min()
print math.pow(2, s[1][0]) * math.pow(3, s[1][1]) * math.pow(5, s[1][2])
# Update the list
if not s[0] + ln2 in t:
t.insert(s[0] + ln2, (s[1][0]+1, s[1][1], s[1][2]))
if not s[0] + ln3 in t:
t.insert(s[0] + ln3, (s[1][0], s[1][1]+1, s[1][2]))
if not s[0] + ln5 in t:
t.insert(s[0] + ln5, (s[1][0], s[1][1], s[1][2]+1))
The 100 first values are
1 2 3 4 5 6 8 9 10 12
15 16 18 20 24 25 27 30 32 36
40 45 48 50 54 60 64 72 75 80
81 90 96 100 108 120 125 128 135 144
150 160 162 180 192 200 216 225 240 243
250 256 270 288 300 320 324 360 375 384
400 405 432 450 480 486 500 512 540 576
600 625 640 648 675 720 729 750 768 800
810 864 900 960 972 1000 1024 1080 1125 1152
1200 1215 1250 1280 1296 1350 1440 1458 1500 1536
The plot of the number of tree nodes confirms the O(³√N²) spatial behavior.
Update:
When there is no risk of overflow, a much simpler version (not using logarithms) is possible:
import math
import bintrees
# Initial list
t= bintrees.FastAVLTree()
t[1]= None
# Find the N first products
N= 100
for i in range(N):
# Current s
(s, r)= t.pop_min()
print s
# Update the list
t[2 * s]= None
t[3 * s]= None
t[5 * s]= None
Simply put, you just have to generate each ith number consecutively. Let's call the set {2, 3, 5} to be Z. At ith iteration, assume you have all (i-1) of the values generated in the previous iteration. While generating the next one, what you basically have to do is trying all the elements in Z and for each of them generating **the least element they can form that is larger than the element generated at (i-1)th iteration. Then, you simply consider the smallest one among them as the ith value. A simple and not so efficient implementation is given below.
def generate_simple(N, Z):
generated = [1]
for i in range(1, N+1):
minFound = -1
minElem = -1
for j in range(0, len(Z)):
for k in range(0, len(generated)):
candidateVal = Z[j] * generated[k]
if candidateVal > generated[-1]:
if minFound == -1 or minFound > candidateVal:
minFound = candidateVal
minElem = j
break
generated.append(minFound)
return generated[-1]
As you may observe, this approach has a time complexity of O(N2 * |Z|). An improvement in terms of efficiency would be to store where we left off scanning in the array of generated values for each element in a second array, indicesToStart. Then, for each element we would only scan all N values of the array generated for once(i.e. all through the algorithm), which means the time complexity after such an improvement would be O(N * |Z|).
A simple implementation of the improvement based on the simple version provided above, is given below.
def generate_improved(N, Z):
generated = [1]
indicesToStart = [0] * len(Z)
for i in range(1, N+1):
minFound = -1
minElem = -1
for j in range(0, len(Z)):
for k in range(indicesToStart[j], len(generated)):
candidateVal = Z[j] * generated[k]
if candidateVal > generated[-1]:
if minFound == -1 or minFound > candidateVal:
minFound = candidateVal
minElem = j
break
indicesToStart[j] += 1
generated.append(minFound)
indicesToStart[minElem] += 1
return generated[-1]
If you have a hard time understanding how complexity decreases with this algorithm, try looking into the difference in time complexity of any graph traversal algorithm when an adjacency list is used, and when an adjacency matrix is used. The improvement adjacency lists help achieve is almost exactly the same kind of improvement we get here. In a nutshell, you have an index for each element and instead of starting to scan from the beginning you continue from wherever you left the last time you scanned the generated array for that element. Consequently, even though there are N iterations in the algorithm(i.e. the outermost loop) the overall number of operations you make is O(N * |Z|).
Important Note: All the code above is a simple implementation for demonstration purposes, and you should consider it just as a pseudocode you can test. While implementing this in real life, based on the programming language you choose to use, you will have to consider issues like integer overflow when computing candidateVal.

How to calculate Sub matrix of a matrix

I was giving a test for a company called Code Nation and came across this question which asked me to calculate how many times a number k appears in the submatrix M[n][n]. Now there was a example which said Input like this.
5
1 2 3 2 5
36
M[i][j] is to calculated by a[i]*a[j]
which on calculation turn I could calculate.
1,2,3,2,5
2,4,6,4,10
3,6,9,6,15
2,4,6,4,10
5,10,15,10,25
Now I had to calculate how many times 36 appears in sub matrix of M.
The answer was 5.
I am unable to comprehend how to calculate this submatrix. How to represent it?
I had a naïve approach which resulted in many matrices of which I think none are correct.
One of them is Submatrix[i][j]
1 2 3 2 5
3 9 18 24 39
6 18 36 60 99
15 33 69 129 228
33 66 129 258 486
This was formed by adding all the numbers before it 0,0 to i,j
In this 36 did not appear 5 times so i know this is incorrect. If you can back it up with some pseudo code it will be icing on the cake.
Appreciate the help
[Edit] : Referred Following link 1 link 2
My guess is that you have to compute how many submatrices of M have sum equal to 36.
Here is Matlab code:
a=[1,2,3,2,5];
n=length(a);
M=a'*a;
count = 0;
for a0 = 1:n
for b0 = 1:n
for a1 = a0:n
for b1 = b0:n
A = M(a0:a1,b0:b1);
if (sum(A(:))==36)
count = count + 1;
end
end
end
end
end
count
This prints out 5.
So you are correctly computing M, but then you have to consider every submatrix of M, for example, M is
1,2,3,2,5
2,4,6,4,10
3,6,9,6,15
2,4,6,4,10
5,10,15,10,25
so one possible submatrix is
1,2,3
2,4,6
3,6,9
and if you add up all of these, then the sum is equal to 36.
There is an answer on cstheory which gives an O(n^3) algorithm for this.

Need to find lowest differences between first line of an array and the rest ones

Well, I've been given a number of pairs of elements (s,h), where s sends an h element on the s-th row of a 2d array.It is not necessary that each line has the same amount of elements, only known that there cannot be more than N elements on a line.
What I want to do is to find the lowest biggest difference(!) between a certain element of the first line and the rest ones.
Thus, if I have 3 lines with (101,92) (100,25,95,52,101) (93,108,0,65,200) what I want to find is 3, because I have to choose 92 and I have 95-92=3 from first to second and 93-92=1 form first to third.
I have reached a point where it is certain that if I have s lines with n(i) elements each and i=0..s, then n0<=n1<=...<=ns so as to have a good average performance scenario when picking the best-fit from 1st line towards the others.
However, I cannot think of a way lower than O(n2) or even maybe O(n3) in some cases. Does anyone have a suggestion about a fairly improved way to do this?
Combine all lines into a single list, also keeping track of which element comes from where.
Sort this list.
Have a last-value variable for each line.
For each item in the sorted list, update the last-value variable of the applicable list. If not all lines have a last-value set yet, do nothing. If it's an element from the first list:
Recalculate the biggest difference for all of the last-value variables. Store this difference.
If it's an element from any other list:
If all values have previous not been set, calculate the biggest difference. Otherwise, if the difference between the first list's last-value and this element is bigger than the biggest difference, update the biggest difference with this difference. Store this difference.
The smallest difference is the desired value.
Example:
Lists: (101,92) (100,25,95,52,101) (93,108,0,65,200)
Sorted 0 25 52 65 92 93 95 100 101 101 108 200
Source 2 1 1 2 0 2 1 1 0 1 2 2
Last[0] - - - - 92 92 92 92 101 101 101 101
Last[1] - 25 52 52 52 52 95 100 100 101 101 101
Last[2] 0 0 0 65 65 93 93 93 93 93 108 200
Diff - - - - 40 41 3 8 8 8 7 9
Best - - - - 40 40 3 3 3 3 3 3
Best = 3 as required. Storing the actual items or finding them afterwards should be easy enough.
Complexity:
Let n be the total number of items and k be the number of lists.
O(n log n) for the combine + sort.
O(nk) (worst case) for the scan through, since we're checking n items and, at each item, we do maximum O(k) work.
So O(n log n + nk).

Print Maximum List

We are given a set F={a1,a2,a3,…,aN} of N Fruits. Each Fruits has price Pi and vitamin content Vi.Now we have to arrange these fruits in such a way that the list contains prices in ascending order and the list contains vitamins in descending order.
For example::
N=4
Pi: 2 5 7 10
Vi: 8 11 9 2
This is the exact question https://cs.stackexchange.com/questions/1287/find-subsequence-of-maximal-length-simultaneously-satisfying-two-ordering-constr/1289#1289
I'd try to reduce the problem to longest increasing subsequent problem.
Sort the list according to first criteria [vitamins]
Then, find the longest increasing subsequent in the modified list,
according to the second criteria [price]
This solution is O(nlogn), since both step (1) and (2) can be done in O(nlogn) each.
Have a look on the wikipedia article, under Efficient Algorithms - how you can implement longest increasing subsequent
EDIT:
If your list allows duplicates, your sort [step (1)] will have to sort by the second parameter as secondary criteria, in case of equality of the primary criteria.
Example [your example 2]:
Pi::99 12 34 10 87 19 90 43 13 78
Vi::10 23 4 5 11 10 18 90 100 65
After step 1 you get [sorting when Vi is primary criteria, descending]:
Pi:: 013 43 78 12 90 87 87 99 10 34
Vi:: 100 90 65 23 18 11 10 10 05 04
Step two finds for longest increasing subsequence in Pi, and you get:
(13,100), (43,90), (78,65), (87,11), (99,10)
as a feasible solution, since it is an increasing subsequence [according to Pi] in the sorted list.
P.S. In here I am assuming the increasing subsequence you want is strictly increasing, otherwise the result is (13,100),(43,90),(78,65),(87,11),(87,10),(99,10) - which is longer subsequence, but it is not strictly increasing/decreasing according to Pi and Vi

Fastest gap sequence for shell sort?

According to Marcin Ciura's Optimal (best known) sequence of increments for shell sort algorithm,
the best sequence for shellsort is 1, 4, 10, 23, 57, 132, 301, 701...,
but how can I generate such a sequence?
In Marcin Ciura's paper, he said:
Both Knuth’s and Hibbard’s sequences
are relatively bad, because they are
defined by simple linear recurrences.
but most algorithm books I found tend to use Knuth’s sequence: k = 3k + 1, because it's easy to generate. What's your way of generating a shellsort sequence?
Ciura's paper generates the sequence empirically -- that is, he tried a bunch of combinations and this was the one that worked the best. Generating an optimal shellsort sequence has proven to be tricky, and the problem has so far been resistant to analysis.
The best known increment is Sedgewick's, which you can read about here (see p. 7).
If your data set has a definite upper bound in size, then you can hardcode the step sequence. You should probably only worry about generality if your data set is likely to grow without an upper bound.
The sequence shown seems to grow roughly as an exponential series, albeit with quirks. There seems to be a majority of prime numbers, but with non-primes in the mix as well. I don't see an obvious generation formula.
A valid question, assuming you must deal with arbitrarily large sets, is whether you need to emphasise worst-case performance, average-case performance, or almost-sorted performance. If the latter, you may find that a plain insertion sort using a binary search for the insertion step might be better than a shellsort. If you need good worst-case performance, then Sedgewick's sequence appears to be favoured. The sequence you mention is optimised for average-case performance, where the number of comparisons outweighs the number of moves.
I would not be ashamed to take the advice given in Wikipedia's Shellsort article,
With respect to the average number of comparisons, the best known gap
sequences are 1, 4, 10, 23, 57, 132, 301, 701 and similar, with gaps
found experimentally. Optimal gaps beyond 701 remain unknown, but good
results can be obtained by extending the above sequence according to
the recursive formula h_k = \lfloor 2.25 h_{k-1} \rfloor.
Tokuda's sequence [1, 4, 9, 20, 46, 103, ...], defined by the simple formula h_k = \lceil h'_k
\rceil, where h'k = 2.25h'k − 1 + 1, h'1 = 1, can be recommended for
practical applications.
guessing from the pseudonym, it seems Marcin Ciura edited the WP article himself.
The sequence is 1, 4, 10, 23, 57, 132, 301, 701, 1750. For every next number after 1750 multiply previous number by 2.25 and round down.
Sedgewick observes that coprimality is good. This rings true: if there are separate ‘streams’ not much cross-compared until the gap is small, and one stream contains mostly smalls and one mostly larges, then the small gap might need to move elements far. Coprimality maximises cross-stream comparison.
Gonnet and Baeza-Yates advise growth by a factor of about 2.2; Tokuda by 2.25. It is well known that if there is a mathematical constant between 2⅕ and 2¼ then it must† be precisely √5 ≈ 2.236.
So start {1, 3}, and then each subsequent is the integer closest to previous·√5 that is coprime to all previous except 1. This sequence can be pre-calculated and embedded in code. There follow the values up to 2⁶⁴ ≈ eighteen quintillion.
{1, 3, 7, 16, 37, 83, 187, 419, 937, 2099, 4693, 10499, 23479, 52501, 117391, 262495, 586961, 1312481, 2934793, 6562397, 14673961, 32811973, 73369801, 164059859, 366848983, 820299269, 1834244921, 4101496331, 9171224603, 20507481647, 45856123009, 102537408229, 229280615033, 512687041133, 1146403075157, 2563435205663, 5732015375783, 12817176028331, 28660076878933, 64085880141667, 143300384394667, 320429400708323, 716501921973329, 1602147003541613, 3582509609866643, 8010735017708063, 17912548049333207, 40053675088540303, 89562740246666023, 200268375442701509, 447813701233330109, 1001341877213507537, 2239068506166650537, 5006709386067537661, 11195342530833252689}
(Obviously, omit those that would overflow the relevant array index type. So if that is a signed long long, omit the last.)
On average these have ≈1.96 distinct prime factors and ≈2.07 non-distinct prime factors; 19/55 ≈ 35% are prime; and all but three are square-free (2⁴, 13·19² = 4693, 3291992692409·23³ ≈ 4.0·10¹⁶).
I would welcome formal reasoning about this sequence.
† There’s a little mischief in this “well known … must”. Choosing ∉ℚ guarantees that the closest number that is coprime cannot be a tie, but rational with odd denominator would achieve same. And I like the simplicity of √5, though other possibilities include e^⅘, 11^⅓, π/√2, and √π divided by the Chow-Robbins constant. Simplicity favours √5.
I've found this sequence similar to Marcin Ciura's sequence:
1, 4, 9, 23, 57, 138, 326, 749, 1695, 3785, 8359, 18298, 39744, etc.
For example, Ciura's sequence is:
1, 4, 10, 23, 57, 132, 301, 701, 1750
This is a mean of prime numbers. Python code to find mean of prime numbers is here:
import numpy as np
def isprime(n):
''' Check if integer n is a prime '''
n = abs(int(n)) # n is a positive integer
if n < 2: # 0 and 1 are not primes
return False
if n == 2: # 2 is the only even prime number
return True
if not n & 1: # all other even numbers are not primes
return False
# Range starts with 3 and only needs to go up the square root
# of n for all odd numbers
for x in range(3, int(n**0.5)+1, 2):
if n % x == 0:
return False
return True
# To apply a function to a numpy array, one have to vectorize the function
vectorized_isprime = np.vectorize(isprime)
a = np.arange(10000000)
primes = a[vectorized_isprime(a)]
#print(primes)
for i in range(2,20):
print(primes[0:2**i].mean())
The output is:
4.25
9.625
23.8125
57.84375
138.953125
326.1015625
749.04296875
1695.60742188
3785.09082031
8359.52587891
18298.4733887
39744.887085
85764.6216431
184011.130096
392925.738174
835387.635033
1769455.40302
3735498.24225
The gap in the sequence is slowly decreasing from 2.5 to 2.
Maybe this association could improve the Shellsort in the future.
I discussed this question here yesterday including the gap sequences I have found work best given a specific (low) n.
In the middle I write
A nasty side-effect of shellsort is that when using a set of random
combinations of n entries (to save processing/evaluation time) to test
gaps you may end up with either the best gaps for n entries or the
best gaps for your set of combinations - most likely the latter.
The problem lies in testing the proposed gaps such that valid conclusions can be drawn. Obviously, testing the gaps against all n! orderings that a set of n unique values can be expressed as is unfeasible. Testing in this manner for n=16, for example, means that 20,922,789,888,000 different combinations of n values must be sorted to determine the exact average, worst and reverse-sorted cases - just to test one set of gaps and that set might not be the best. 2^(16-2) sets of gaps are possible for n=16, the first being {1} and the last {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1}.
To illustrate how using random combinations might give incorrect results assume n=3 that can assume six different orderings 012, 021, 102, 120, 201 and 210. You produce a set of two random sequences to test the two possible gap sets, {1} and {2,1}. Assume that these sequences turn out to be 021 and 201. for {1} 021 can be sorted with three comparisons (02, 21 and 01) and 201 with (20, 21, 01) giving a total of six comparisons, divide by two and voilà, an average of 3 and a worst case of 3. Using {2,1} gives (01, 02, 21 and 01) for 021 and (21, 10 and 12) for 201. Seven comparisons with a worst case of 4 and an average of 3.5. The actual average and worst case for {1] is 8/3 and 3, respectively. For {2,1} the values are 10/3 and 4. The averages were too high in both cases and the worst cases were correct. Had 012 been one of the cases {1} would have given a 2.5 average - too low.
Now extend this to finding a set of random sequences for n=16 such that no set of gaps tested will be favored in comparison with the others and the result close (or equal) to the true values, all the while keeping processing to a minimum. Can it be done? Possibly. After all, everything is possible - but is it probable? I think that for this problem random is the wrong approach. Selecting the sequences according to some system may be less bad and might even be good.
More information regarding jdaw1's post:
Gonnet and Baeza-Yates advise growth by a factor of about 2.2; Tokuda by 2.25. It is well known that if there is a mathematical constant between 2⅕ and 2¼ then it must† be precisely √5 ≈ 2.236.
It is known that √5 * √5 is 5 so I think every other index should increase by a factor of five. So first index being 1 insertion sort, second being 3 then each other subsequent is of the factor 5. There follow the values up to 2⁶⁴ ≈ eighteen quintillion.
{1, 3,, 15,, 75,, 375,, 1 875,, 9 375,, 46 875,, 234 375,, 1 171 875,, 5 859 375,, 29 296 875,, 146 484 375,, 732 421 875,, 3 662 109 375,, 18 310 546 875,, 91 552 734 375,, 457 763 671 875,, 2 288 818 359 375,, 11 444 091 796 875,, 57 220 458 984 375,, 286 102 294 921 875,, 1 430 511 474 609 375,, 7 152 557 373 046 875,, 35 762 786 865 234 375,, 178 813 934 326 171 875,, 894 069 671 630 859 375,, 4 470 348 358 154 296 875,}
The values in the gaps can simply be calculated by taking the value before and multiply by √5 rounding to whole numbers giving the resulting array (using 2.2360679775 * 5 ^ n * 3):
{1, 3, 7, 15, 34, 75, 168, 375, 839, 1 875, 4 193, 9 375, 20 963, 46 875, 104 816, 234 375, 524 078, 1 171 875, 2 620 392, 5 859 375, 13 101 961, 29 296 875, 65 509 804, 146 484 375, 327 549 020, 732 421 875, 1 637 745 101, 3 662 109 375, 8 188 725 504, 18 310 546 875, 40 943 627 518, 91 552 734 375, 204 718 137 589, 457 763 671 875, 1 023 590 687 943, 2 288 818 359 375, 5 117 953 439 713, 11 444 091 796 875, 25 589 767 198 563, 57 220 458 984 375, 127 948 835 992 813, 286 102 294 921 875, 639 744 179 964 066, 1 430 511 474 609 375, 3 198 720 899 820 328, 7 152 557 373 046 875, 15 993 604 499 101 639, 35 762 786 865 234 375, 79 968 022 495 508 194, 178 813 934 326 171 875, 399 840 112 477 540 970, 894 069 671 630 859 375, 1 999 200 562 387 704 849, 4 470 348 358 154 296 875, 9 996 002 811 938 524 246}
(Obviously, omit those that would overflow the relevant array index type. So if that is a signed long long, omit the last.)

Resources