Array confusion - Quiz - sorting

I am a bit confused by one of my quiz answers and I was hoping somebody could provide me with an explanation. The question is:
You have two arrays named b and c. You call a static method that swaps the value in component k of one array with the value in component k of the other array. This swap method does not mention any variable declared outside the method except its parameters. Which of the following method calls could possibly accomplish this?
Answer: swap (c, b, k)
I am very confused why this is the answer. Any help would be greatly appreciated!

Most likely swap's declaration has c and b being passed by reference. That means all the function has to do is check that both arrays have enough elements (sizeof(c) > k && sizeof(b) > k, or something similar) before doing the actual swap.
Also: b,c, and k are defined in the question itself.

It swaps the element at index k in both b and c. Calling it with k-1 would swap the element just before index k, and calling it with 0 would swap the first element of each array.

Since swap() can't refer to anything outside itself, all the necessary references and information must be passed in as parameters.
The answer of swap (c, b, k) is correct since it gives a reference to both arrays and what index to swap. You just can't accomplish the task passing in fewer parameters.
If you only had swap (), swap (k), or swap (c, b), then the method would need information external to itself to complete the task.
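To make that concrete, here is a minimal sketch (in Python rather than the quiz's language, with lists standing in for the arrays):

def swap(c, b, k):
    # The method mentions nothing declared outside itself: both arrays and the
    # index to swap all arrive through its parameters.
    c[k], b[k] = b[k], c[k]

b = [1, 2, 3]
c = [9, 8, 7]
swap(c, b, 1)
print(b, c)   # [1, 8, 3] [9, 2, 7]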

Related

Best way to initialize a list of lists

Is there any difference in initializing a list of lists by
[[0 for i in xrange(n)] for j in xrange(m)]
or
[[0]*n for j in xrange(m)]
From the point of view of time performance the second way is roughly 4 times faster than the first, and I am wondering whether the second way has any computational/memory-use or allocation drawback.
List comprehensions provide a concise way to create lists. Common applications are to make new lists where each element is the result of some operations applied to each member of another sequence or iterable, or to create a subsequence of those elements that satisfy a certain condition.
The first method is slower than the second because its inner comprehension runs a Python-level loop that appends 0 element by element, whereas [0]*n builds each row with a single C-level repetition.
P.S. If you want to initialise a list of lists with a constant value (k), then the ideal and fastest way would be to use numpy: np.ones((m, n))*k.
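A quick way to check the relative speeds is to time all three forms yourself; here is a small sketch (Python 3, so range replaces xrange; the exact ratio depends on the machine):

import timeit
import numpy as np

m, n = 1000, 1000

# Inner comprehension runs a Python-level loop for every element.
t_comprehension = timeit.timeit(
    lambda: [[0 for i in range(n)] for j in range(m)], number=10)

# [0]*n builds each row with C-level repetition, so it is usually much faster.
t_repeat = timeit.timeit(lambda: [[0] * n for j in range(m)], number=10)

# NumPy allocates one contiguous block; note the result is an ndarray, not a list of lists.
k = 5
t_numpy = timeit.timeit(lambda: np.ones((m, n)) * k, number=10)

print(t_comprehension, t_repeat, t_numpy)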

Parse multiple array similarity query

I am working on an algorithm that will compare 2 objects, object 1 and object 2. Each object has attributes which are 5 different arrays, array A, B, C, D, and E.
In order for the two objects to be a match, at least one item from Object 1's array A must be in Object 2's array A, AND at least one item from Object 1's array B must be in Object 2's array B, and so on through array E. The more matches there are in each of arrays A through E, the higher the score the match will produce.
Am I going to have to pull Object 1 and Object 2, then do an n^2-complexity search on each array to determine which items exist in both arrays? Then I would derive a score from how many matches there were in each array, add them up, and the total would give me the score.
I feel like there has to be a better option for this, especially for Parse.com
Maybe I am going about this problem all wrong; can someone PLEASE help me with it? I would provide some code, but I have not started writing it yet because I cannot wrap my head around the best way to design it. The two object databases are in place already, though.
Thanks!
As I said, I may be thinking of this problem in the wrong way. If I am unclear about anything that I am trying to do, let me know and I will update accordingly.
Simplest solution:
Copy all elements of one array from object 1 into a hash table (unordered map), then iterate over the corresponding array in the second object and look up each element in the map. The time complexity is then O(N).
Smart solution:
Keep the elements of every object not in plain ("naive") arrays, but in arrays structured as hash tables (for example, using double hashing). That way the arrays in objects 1 and 2 are already pre-indexed, and all you need to do is iterate over whichever array contains fewer elements and match its elements against the longer, pre-indexed array.
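For illustration, here is a minimal Python sketch of the per-array overlap scoring using built-in sets (the dict-of-lists shape and the match_score name are assumptions for the example, not part of the Parse.com API):

def match_score(obj1, obj2, keys=("A", "B", "C", "D", "E")):
    # obj1 and obj2 are assumed to be dicts mapping each array name to a list of items.
    score = 0
    for key in keys:
        # Set intersection finds the overlap in roughly O(len(a) + len(b)) time.
        common = set(obj1[key]) & set(obj2[key])
        if not common:          # every array must share at least one item for a match
            return 0
        score += len(common)    # more overlap in each array means a higher score
    return score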

Given an object A and a list of objects L, how to find which objects on L are clones of A without testing all cases?

Using JavaScript notation:
A = {color:'red',size:8,type:'circle'};
L = [{color:'gray',size:15,type:'square'},
{color:'pink',size:4,type:'triangle'},
{color:'red',size:8,type:'circle'},
{color:'red',size:12,type:'circle'},
{color:'blue',size:10,type:'rectangle'}];
The answer for this case would be 2, because L[2] is identical to A. You could find the answer in O(n) by testing each possibility. What is a representation/algorithm that allows finding that answer faster?
I would just create a HashMap and put all the objects into it. We would also need to define a hash function which is a function of the data in the object (something similar to overriding Object.hashCode() in Java).
Suppose the given array L is [B, C, D] where B, C and D are objects. Then the HashMap would be {B=>1, C=>2, D=>3}. Now suppose D is a copy of A. We would just look up A in this map and get the answer. Also, as suggested by Eric P in a comment, we would need to keep the HashMap updated with respect to any change in array L; this can also be done in O(1) for every operation on L.
The cost of looking up an object in the HashMap is O(1), so we can achieve O(1) lookup complexity.
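To sketch the same idea in Python (a dict plays the role of the HashMap; the key() helper is an assumed stand-in for a hashCode()-style function over the object's data):

A = {"color": "red", "size": 8, "type": "circle"}
L = [
    {"color": "gray", "size": 15, "type": "square"},
    {"color": "pink", "size": 4, "type": "triangle"},
    {"color": "red", "size": 8, "type": "circle"},
    {"color": "red", "size": 12, "type": "circle"},
    {"color": "blue", "size": 10, "type": "rectangle"},
]

def key(obj):
    # Hash key derived from the object's data: a sorted, hashable tuple of its items.
    return tuple(sorted(obj.items()))

index = {}
for i, obj in enumerate(L):
    index.setdefault(key(obj), []).append(i)   # build once, keep updated as L changes

print(index.get(key(A), []))   # [2]  (average O(1) lookup)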
I think it's not possible to do it faster than O(n) with your preconditions.
It's possible to find an element in O(log n) using binary search, but:
A) you need a single key by which elements can be compared, and
B) the list must be sorted by that key.
Maybe with some techniques (ordering, skip lists, etc.) you can find the answer in fewer than N iterations, but the worst case is O(n).
Since the goal is to find all objects which are clones of A, you must test every object at least once to determine whether it is a clone of A, so the minimum number of tests is N. Passing through the list once and testing each object performs N tests, so since this method is the minimum number of tests, it is an optimal method.
First, I assume that you are talking about an array, not a list. The word 'list' is reserved for a specific type of data structure with O(n) indexing complexity, so the mean time for any search in it is at least linear.
For an unsorted array, the only algorithm is a full scan in linear time. However, if the array is sorted, you can use binary or interpolation search to get a better time.
The problem with sorted arrays is that they have linear insert time. No good. So if you wish to update your set often and both update and search times matter, you should look for an optimized container, which in C++ and Haskell is called a set (the std::set template in the <set> header and the Data.Set module in the containers package, respectively). I don't know whether there is an equivalent in JS.
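A small Python illustration of that trade-off (bisect for the sorted-array case, a built-in set for the hash-based case; the values are arbitrary):

import bisect

items = [4, 8, 15, 16, 23, 42]              # sorted array: O(log n) lookup...
i = bisect.bisect_left(items, 23)
print(i < len(items) and items[i] == 23)     # True
bisect.insort(items, 10)                     # ...but insertion shifts elements: O(n)

members = set(items)                         # hash-based set: O(1) average lookup and insert
print(23 in members)                         # True
members.add(99)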

quicksort quickie: the flow of control in quicksort

In what seems to me a common implementation of quicksort, the program is composed of a partitioning subroutine and two recursive calls to quicksort those (two) partitions.
So the flow of control, in the quickest and pseudo-est of pseudocode, goes something like this:
quicksort[list, some parameters]
.
.
.
q=partition[some other parameters]
quicksort[1,q]
quicksort[q+1,length[list]]
.
.
.
End
The q is the "pivot" after a partitioning. That second quicksort call (the one that will quicksort the second part of the list) also uses q. This is what I don't understand. If the "flow of control" goes through the first quicksort first, q is going to be updated. How is the same q going to work in the second quicksort, when it comes time to do the second parts of all those partitions?
I think my misunderstanding comes from the limitations of pseudocode. There are details that have been likely left out by expressing this implementation of the quicksort algorithm in pseudocode.
Edit 1 This seems related to my problem:
For[i = 1, i < 5, i = i + 1, Print[i]]
The first time through, we would get i=1, true, i=2, 1. Even though i was updated to 2, i is still 1 in body (i.e., Print[i]=1). This "flow of control" is what I don't understand. Where is the i=1 being stored when it increments to 2 and before it gets to body?
Edit 2
As an example of what I'm trying to get at, I'm pasting this here. It's from here.
Partition(A,p,r)
    x=A[r]
    i=p+1
    j=r+1
    while TRUE
        repeat j=j-1
        until A[j]<=x
        repeat i=i+1
        until A[i]>=x
        if i<j
            then exchange A[i] with A[j]
            else return j

Quicksort(A,1,length[A])

Quicksort(A,p,r)
    if p<r
        then q=Partition(A,p,r)
             Quicksort(A,p,q)
             Quicksort(A,q+1,r)
Another example can be found here.
Where or when in these algorithms is q being put onto a stack?
q is not updated. The pivot remains in its place. In each iteration of quicksort, the only element that is guaranteed to be in its correct place is the pivot.
Also, note that the q which is "changed" during the recursive call is NOT actually changed, since it is a different variable, stored in a different area. This is true because q is a local variable of the function and is created anew for each call.
EDIT: [response to the question edit]
In quicksort, the algorithm actually generates a number of qs, which are stored on the stack. Every variable is 'alive' only within its own function call and is accessible [in this example] only from it. When the function ends, its local variables are released automatically, so you don't have only one pivot; you have a number of pivots, one for each recursive step.
It turns out Quicksort demands extra memory precisely in order to do the bookkeeping you mentioned. Perhaps the following (pseudocode) iterative version of the algorithm might clear things up:
quicksort(array, begin, end) =
    intervals_to_sort = {(begin, end)}; //a set
    while there are intervals to sort:
        (begin, end) = remove an interval from intervals_to_sort
        if length of (begin, end) >= 2:
            q = partition(array, begin, end)
            add (begin, q) to intervals_to_sort
            add (q+1, end) to intervals_to_sort
You may notice that now the intervals to sort are being explicitly kept in a data structure (usually just an array, inserting and removing at the end, in a stack-like fashion) so there is no risk of "forgetting" about old intervals.
What might confuse you is that the most common description of Quicksort is recursive so the q variable appears multiple times. The answer to this is that every time a function is called it creates a new batch of local variables so it doesn't touch the old ones. In the end, the explicit stack from that previous imperative example ends up being implemented as an implicit stack with function variables.
(An interesting side note: some early programming languages didn't implement neat local variables like that, and Quicksort was actually first described using the iterative version with the explicit stack. It was only later that it was seen how Quicksort could be elegantly described as a recursive algorithm in Algol.)
As for the part after your edit, the i=1 is forgotten since assignment will destructively update the variable.
The partition code picks some value from the array (such as the value at the midpoint of the array ... your example code picks the last element) -- this is the pivot. It then puts all the values <= pivot on the left and all values >= pivot on the right, and then stores the pivot in the one remaining slot between them. At that point, the pivot is necessarily in the correct slot, q. Then the algorithm sorts the partition [p, q) and the partition [q+1, r), which are disjoint but cover all of A except q, resulting in the entire array being sorted.
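For completeness, here is a runnable Python sketch of both views described above: the explicit-stack version and the recursive version in which every call gets its own local q. (The partition shown is a standard Hoare-style partition that pivots on the first element; it is a stand-in for illustration, not necessarily the exact code from the links above.)

def partition(a, lo, hi):
    # Hoare-style partition of a[lo..hi]; the pivot is a[lo].
    x = a[lo]
    i, j = lo - 1, hi + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

def quicksort_iterative(a):
    # The intervals to sort live in an explicit stack, so no q is ever "forgotten".
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo < hi:
            q = partition(a, lo, hi)
            stack.append((lo, q))
            stack.append((q + 1, hi))

def quicksort_recursive(a, lo, hi):
    # Each call gets its own local q; the call stack does the bookkeeping implicitly.
    if lo < hi:
        q = partition(a, lo, hi)
        quicksort_recursive(a, lo, q)
        quicksort_recursive(a, q + 1, hi)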

Set transformation

I'm looking for a way to transform a set, and having trouble. This is because the requirements are rather rigorous.
Set A contains a bunch of integers, the details are really irrelevant.
Set B contains a bunch of integers, such that:
Each value in A directly maps to one and only one value in B.
Each bit is true in one, and only one, value in B.
The sum of any N values in B has a strict relation to (the sum of) its original values in A. This relation may not depend on knowing the actual N values in question, although other things, like knowing the number of values summed, are fine.
It's mainly a thought exercise rather than an actual implementation, so don't worry about practical details such as the memory constraints, which would grow hugely with the size of A.
For example, you could satisfy the first two requirements by simply saying that B[i] = 2^A[i]. But that's not useful, because if you did 2^x = 2^A[i] + 2^A[j], you can't infer that the sum of A[i] and A[j] is x or some other expression which does not involve A[i] or A[j].
I'm tending towards such a transformation being impossible, but thought I'd throw it out there just in case.
Edit: I've been unclear. Sorry. This idea exists mainly in my head.
I already know the sum of the B values. The problem is that I start with the sum of the B values and find the values in B which sum to it, which is trivial due to the unique-bits restriction. The trouble is that the sum is initially expressed in A values, so I have to be able to transform the sum from a sum of A values to a sum of B values. This is useless to me if I have to transform it separately for every possible sum because the transformation depends on the values I'm summing.
More edit: Also, my reverse mechanism from B[i] to A[i] is a lookup table. Don't need an actual existent mathematical function. Any A[i] is unique from any other A[j].
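To make the unique-bits decomposition concrete, here is a small Python sketch (the B[i] = 1 << i encoding is just an illustration of the first two requirements; as noted above, it does not satisfy the third):

A = [7, 12, 3, 9]
B = [1 << i for i in range(len(A))]        # each bit is true in exactly one value of B

def decompose(b_sum):
    # Recover which B values (and hence which A values) went into the sum.
    return [A[i] for i in range(len(A)) if b_sum & (1 << i)]

b_sum = B[0] + B[2]                        # the sum of the B values mapped from A[0] and A[2]
print(decompose(b_sum))                    # [7, 3]

# The catch (the third requirement): starting only from A[0] + A[2] = 10, there is
# no way to compute b_sum without already knowing that the summands were A[0] and A[2].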
I think your third constraint poses a problem. When you say A is one-to-one onto B, that means there exists an invertible mapping F: A->B and its inverse F': B->A such that F'(F(x))=x. Now, "the sum of any N values in B has a strict relation to the sum of the original values in A". This means that there exists some G such that G(A_1+A_2+...+A_n)=B_1+B_2+...+B_n; the sum of the B-values are related to the A-values. But, because of the first clause, we've established that A_1=F'(B_1), so "knowing the actual N values in question" (A_1 through A_n, although your original question leaves it ambiguous to which values you refer) is the same as "knowing" the B-values, due to the one-to-one correspondence. Thus it is not possible to satisfy constraints one and three simultaneously for a finite set of integers; if you are instructed to "sum these n B-values", you must already know the A-values - just apply the inverse transform.
Given (A_1 + ... + A_n), can we assume each A_i is unique? If not, the problem is impossible: adding A_1 to itself A_2 times gives the same result as adding A_2 to itself A_1 times. If each A_i IS unique, then what are we allowed to assume about the bijection between A and B? For example, if N is known, then A[i] = B[i] + d is trivially reversible for all d. Even if we can assume each A_i in the sum is unique, the problem of recovering the B_i is possible if (and only if) no two subsets of A sum to the same value. How easily the B_i can be recovered depends on the nature of the bijection.
The trouble is that the sum is initially expressed in A values, so I have to be able to transform the sum from a sum of A values to a sum of B values.
That translates to finding the A values from a sum of A values, which I don't think is possible if the A values are arbitrary.
EDIT: The way you described it, this problem appears to be the subset sum problem, which is NP-complete. You can use dynamic programming to improve its performance, but I don't know if you can go much beyond that.
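For reference, a minimal subset-sum check via dynamic programming (pseudo-polynomial in the target value; the function name and the inputs are just illustrative):

def subset_sum_exists(values, target):
    # reachable[s] is True if some subset of the values seen so far sums to s.
    reachable = [False] * (target + 1)
    reachable[0] = True
    for v in values:
        for s in range(target, v - 1, -1):   # go downwards so each value is used at most once
            if reachable[s - v]:
                reachable[s] = True
    return reachable[target]

print(subset_sum_exists([7, 12, 3, 9], 19))   # True: 7 + 12
print(subset_sum_exists([7, 12, 3, 9], 5))    # False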

Resources