Compare 43 variables in all possible ways - performance

I am trying to figure out which method is the best way to cross compare 43 variables (data sets, data)
I need to compare variable 1 with variable 2,3,4,5,6,7....43 and then compare variable 2 with variable 1,3,4,5,6,7....43 and so on, to variable no. 43.
I think i should use some kind of a loop, but i am clueless how to perform this operation efficient.
I think I just need some kind of pseudo code. Either way I want to do this in a do-file in Stata.

Assuming e.g. variables var1-var43 and that the "comparison" between the first and the second differs from that between the second and the first, which is what your question implies, then
forval i = 1/43 {
forval j = 1/43 {
if `i' != `j' {
<code for comparison between var`i' and var`j'>
}
}
}
With other variable names, foreach might be better.

As #NickCox suggested, you could use a O(NxN) nested loop. If that takes too long, which it could if your "43" is actually 1000, then there's a better way. Sort each list (indirectly), which is O(N logN), and run a merge-order loop, which is O(N), so altogether it is O(N logN).

Related

Concise (one line?) binary search in Raku

Many common operations aren't built in to Raku because they can be concisely expressed with a combination of (meta) operators and/or functions. It feels like binary search of a sorted array ought to be expressable in that way (maybe with .rotor? or …?) but I haven't found a particularly good way to do so.
For example, the best I've come up with for searching a sorted array of Pairs is:
sub binary-search(#a, $target) {
when +#a ≤ 1 { #a[0].key == $target ?? #a[0] !! Empty }
&?BLOCK(#a[0..^*/2, */2..*][#a[*/2].key ≤ $target], $target)
}
That's not awful, but I can't shake the feeling that it could be an awfully lot better (both in terms of concision and readability). Can anyone see what elegant combo of operations I might be missing?
Here's one approach that technically meets my requirements (in that the function body it fits on a single normal-length line). [But see the edit below for an improved version.]
sub binary-search(#a, \i is copy = my $=0, :target($t)) {
for +#a/2, */2 … *≤1 {#a[i] cmp $t ?? |() !! return #a[i] with i -= $_ × (#a[i] cmp $t)}
}
# example usage (now slightly different, because it returns the index)
my #a = ((^20 .pick(*)) Z=> 'a'..*).sort;
say #a[binary-search(#a».key, :target(17))];
say #a[binary-search(#a».key, :target(1))];
I'm still not super happy with this code, because it loses a bit of readability – I still feel like there could/should be a concise way to do a binary sort that also clearly expresses the underlying logic. Using a 3-way comparison feels like it's on that track, but still isn't quite there.
[edit: After a bit more thought, I came up with an more readable version of the above using reduce.
sub binary-search(#a, :target(:$t)) {
(#a/2, */2 … *≤.5).reduce({ $^i - $^pt×(#a[$^i] cmp $t || return #a[$^i]) }) && Nil
}
In English, that reads as: for a sequence starting at the midpoint of the array and dropping by 1/2, move your index $^i by the value of the next item in the sequence – with the direction of the move determined by whether the item at that index is greater or lesser than the target. Continue until you find the target (in which case, return it) or you finish the sequence (which means the target wasn't present; return Nil)]

complexity of for Loop having same variable name

I was trying this below code to find the time complexity and somewhere Lecturer said that here you have same variable name so time complexity would be O(n^3) , I think it should be O(n^6) can someone help me Iam confused
int i ;
for(i=1;i<=n;++i)
{
for(i=1;i<=n^2;++i)
{
for(i=1;i<=n^3;++i)
{
x=y+z;
}
}
}
The first for loop starts, then the second and the third.
The third for loop runs until i>n³.
Once this happens, the second for loop will finish one iteration, it then tries to check if i<=n², but since i > n³, it ends the loop.
This happens for the first loop as well.
This happens mainly because the i variable is block scoped and since all three loops are nested, they all share the same variable.

Can someone explain me this code intuitively?

I understand recursion and what the advantages it brings to writing code efficiently. While I can code recursive functions, I cannot seem to wrap my head around how they work. I would like someone to explain me recursion instinctively.
For example, this code:
int fact(int n)
{ if n<0:
return -1
elif n==0:
return 1
else
return n*fact(n-1)
}
These are some of my questions:
Let's say n=5. On entering the function,the control goes to the last return statement since none of the previous conditions are satisfied.
Now, roughly, the computer 'writes' something like this: 5*(fact(4))
Again, the fact() function is called and the same process gets repeated except now we have n=4.
So, how exactly does the compiler multiply 5*4 and so on until 2 since its not exactly 5*4 but 5*fact(4). How does it 'remember' that it has to multiply two integers and where does it store the temporary value since we haven't provided any explicit data structure?
Again let's say n=5. The same process goes on and eventually n gets decremented to 0. My question is why/how doesn't the function simply return 1 as stated in the return statement. Similar to my previous question, how does the compiler 'remember' that it also has 180 stored for displaying?
I'd be really thankful if someone explains this to me completely so that can understand recursion better and intuitively.
Yeah, for beginners recursion can be quite confusing. But, you are already on the right track with your explanation under "1.".
The function will be called recursively until a break condition is satisfied. In this case, the break condition is satisfied when n equals 0. At this point, no recursive calls will be made anymore. The result of each recursive call is returned to the caller. The callers always "wait" until they get a result. That's how the algorithm "knows" the receiver of the results. The flow of this procedure is handled by the so called stack.
Hence, in your informal notation (in this example n equals 3):
3*(fact(2)) = 3*(2*fact(1)) = 3*(2*(1*fact(0))).
Now, n equals 0. The inner fact(0) therefore returns 1:
3*(2*(1*(1)))) = 3*(2*(1)) = 3*(2) = 6
You can see a bit like this
The function fact(int n) is like a class and every time you call fact(int n) you create an instance of that class. By creating them (calling them) from the same function, you are creating a chain of instances. Once you reach break condition, those functions start returning one by one and the value they returned to calculate a new value in the return statement return n*fact(n-1) e.g. return 3*fact(2);

Finding the minimum of mapped data

Given an array of complex objects, an algorithm for mapping each to Comparable values, and the desire to find the minimum such value, is there a built-in library method that will do this in a single pass?
Effective but not perfectly efficient solutions:
# Iterates through the array twice
min = objects.map{ |o| make_number o }.min
# Calls make_number one time more than is necessary
min = make_number( objects.min_by{ |o| make_number o } )
Efficient, but verbose solution:
min = nil
objects.each{ |o| n=make_number(o); min=n if !min || n<min }
No, no such library method already exists.
I don't really see an issue with either of your two original solutions. The enumerator code is written in C and is generally very fast. You can always just benchmark it and see what is fastest for your specific dataset and code (try https://github.com/acangiano/ruby-benchmark-suite)
However, if you really do want one pass, you can simplify your #each version by using #reduce:
min = objects.reduce(Float::INFINITY){ |min, o|
n = make_number(o)
min > n ? n : min
}
If your objects are already numbers of some form, you can omit the Float::INFINITY. Otherwise, in order to make sure we are only comparing number values, you will need to add it.

Loop with conditions in python

Consider the following code in C:
for(int i=0; i<10 && some_condition; ++i){
do_something();
}
I would like to write something similar in Python. The best version I can think of is:
i = 0
while some_condition and i<10:
do_something()
i+=1
Frankly, I don't like while loops that imitate for loops. This is due to the risk of forgetting to increment the counter variable. Another option, that addressess this risk is:
for i in range(10):
if not some_condition: break
do_something()
Important clarifications
some_condition is not meant to be calculated during the loop, but rather to specify whether to start the loop in the first place
I'm referring to Python2.6
Which style is preferred? Is there a better idiom to do this?
This might not be related, but there's what I'm used to do... If some_condition is simple enough, put it in a function and filter items you iterate over:
def some_condition(element):
return True#False
for i in filter(some_condition, xrange(10)):
pass
You can use this approach also when you iterate over some list of elements.
selected = filter(some_condition, to_process)
for i, item in enumerate(selected):
pass
Again, this might not be your case, you should choose method of filtering items depending on your problem.
In general, the "range + break" style is preferred - but in Python 2.x, use xrange instead of range for iteration (this creates the values on-demand instead of actually making a list of numbers).
But it always depends. What's special about the number 10 in this context? What exactly is some_condition? Etc.
Response to update:
It sounds as though some_condition is a "loop invariant", i.e. will not change during the loop. In that case, we should just test it first:
if some_condition:
for i in xrange(10):
do_something()
for loops with a constant upper bound are a bit rare in Python. If you are iterating over somearray, you might do:
for i in xrange(len(somearray)):
if not some_condition:
break
do_sth_with(i, somearray[i])
or, better:
for i, item in enumerate(somearray):
if not some_condition:
break
do_sth_with(i, item)

Resources