find endpoints for range given a value within the range - algorithm

I am trying to solve a simple problem, but at the moment I cannot think of a better solution. I am testing an API that is not documented.
There is an ID used to fetch objects and it has a min and max value with random values missing in-between. I'm trying to test the responses I receive for random objects, but to find objects, I need to have valid IDs.
It would be very inefficient to test random numbers and hope that I get an object back. The best I can do is find a range, get a random number between that range and check if it exists before conducting tests.
A sample list of all of the IDs in the database might look like this:
[1005, 25984, 25986, 29587, 30000, ...]
Assuming the deviation from one value to another will never exceed C, e.g. from the first value to the next value, the difference will never be greater than a pre-defined constant, how would you calculate the min/max of the range given only one value in the range?
Starting from a given value and looping until the last value is found is horrible but that is how it was implemented by previous devs. Below is pseudocode that more or less covers what they do.
// this can be any valid object ID from the database
// assuming the IDs in the database are [1005, 25984, 25986, 29587, 30000]
// "i" could be any one of these values
var i = givenPredefinedObjectId;
var deviation = 100;
// objectWithIdExists() is going to lookup an object with the ID "i" in the database
// if there is no object with the ID "i" , it will return false
// otherwise the object will get tested and return true
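var endPoint = i;
// scan upward one ID at a time; stop once "deviation" consecutive IDs are missing
for (var probe = i + 1; probe <= endPoint + deviation; probe++) {
    if (objectWithIdExists(probe)) {
        // found a later valid ID, so the scan window moves forward
        endPoint = probe;
    }
}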
Assuming there is no knowledge about the possible values except you can check if they exist and you are given one valid value (there is no array with all possible IDs, that was just an example), how would you find the min/max values?

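Unbounded binary search is feasible, with a factor of C slowdown. Because two consecutive IDs are never more than C apart, every block of C consecutive values that lies between min and max contains at least one ID, while every block entirely above max contains none. Given an algorithm for unbounded binary search that, given access to the oracle less_equal(k) for some unknown natural number n, returns n in time O(log n), implement the oracle on input k by querying all of the IDs C*k, C*k+1, ..., C*k+C-1 and reporting that k is less than or equal to n if and only if at least one ID is found. The search yields the last non-empty block, and the largest ID found in that block is max. The running time is O(C*log((max-min)/C)).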
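A minimal sketch of this idea in Python, assuming a hypothetical exists(id) callable that stands in for the API lookup and a known bound c on the gap between consecutive IDs. It uses windows that start at the probe position rather than the aligned blocks described above, so that "the window contains an ID" is true for every start position up to max and false beyond it. Treat it as an illustration under those assumptions, not a tuned implementation.

```python
def find_max_id(start, exists, c):
    """Return the largest valid ID, given one valid ID `start`.

    `exists` is a hypothetical lookup (ID -> bool); `c` is the assumed
    upper bound on the difference between consecutive valid IDs.
    """
    def window_has_id(lo):
        # Costs up to c lookups: checks lo, lo+1, ..., lo+c-1.
        return any(exists(x) for x in range(lo, lo + c))

    # Exponential phase: double the step until a window comes back empty.
    good = start          # window_has_id(good) is known to be true
    step = 1
    while window_has_id(start + step):
        good = start + step
        step *= 2
    bad = start + step    # window_has_id(bad) is known to be false

    # Binary phase: narrow down the last position whose window is non-empty.
    while bad - good > 1:
        mid = (good + bad) // 2
        if window_has_id(mid):
            good = mid
        else:
            bad = mid
    return good


def find_min_id(start, exists, c):
    """Mirror image of find_max_id: search the negated ID space."""
    return -find_max_id(-start, lambda x: exists(-x), c)
```

For a quick check, build a set of IDs whose gaps do not exceed c and pass its membership test as exists, for example find_max_id(1100, ids.__contains__, 100).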


Hash Tables and Separate Chaining: How do you know which value to return from the bucket's list?

We're learning about hash tables in my data structures and algorithms class, and I'm having trouble understanding separate chaining.
I know the basic premise: each bucket has a pointer to a Node that contains a key-value pair, and each Node contains a pointer to the next (potential) Node in the current bucket's mini linked list. This is mainly used to handle collisions.
Now, suppose for simplicity that the hash table has 5 buckets. Suppose I wrote the following lines of code in my main after creating an appropriate hash table instance.
myHashTable["rick"] = "Rick Sanchez";
myHashTable["morty"] = "Morty Smith";
Let's imagine whatever hashing function we're using just so happens to produce the same bucket index for both string keys rick and morty. Let's say that bucket index is index 0, for simplicity.
So at index 0 in our hash table, we have two nodes with values of Rick Sanchez and Morty Smith, in whatever order we decide to put them in (the first pointing to the second).
When I want to display the corresponding value for rick, which is Rick Sanchez per our code here, the hashing function will produce the bucket index of 0.
How do I decide which node needs to be returned? Do I loop through the nodes until I find the one whose key matches rick?
Yes. To put or get an item whose hash collides with another one, you fall back on the data structure that backs each bucket, which is generally a linked list, and loop through it, as you said, until you find the entry with the matching key. Collisions are the worst case for a hash table: if every key lands in the same bucket, a lookup becomes an O(n) walk through that list. If the bucket is backed by a balanced tree instead, the lookup can take O(log n) time, as in the Java 8 implementation.
As JEP 180: Handle Frequent HashMap Collisions with Balanced Trees says:
The principal idea is that once the number of items in a hash bucket
grows beyond a certain threshold, that bucket will switch from using a
linked list of entries to a balanced tree. In the case of high hash
collisions, this will improve worst-case performance from O(n) to
O(log n).
This technique has already been implemented in the latest version of
the java.util.concurrent.ConcurrentHashMap class, which is also slated
for inclusion in JDK 8 as part of JEP 155. Portions of that code will
be re-used to implement the same idea in the HashMap and LinkedHashMap
classes.
I strongly suggest always looking at an existing implementation, for example the one in Java 7. Doing so improves your code-reading skills, which matter at least as much as writing code, since you read code more often than you write it. It takes more effort, but it pays off.
For example, take a look at the Hashtable.get method from Java 7:
public synchronized V get(Object key) {
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    return null;
}
Here we see that if ((e.hash == hash) && e.key.equals(key)) is trying to find the correct item with the matching key.
And here is the full source code: Hashtable.java
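The same lookup loop can be sketched in Python. This is a toy separate-chaining table written only for illustration: the class name ChainedHashTable and its methods are hypothetical, each bucket is a plain list of (key, value) pairs instead of linked nodes, and the hash function is injectable so that a collision can be forced.

```python
class ChainedHashTable:
    """Toy separate-chaining hash table; each bucket is a list of (key, value)."""

    def __init__(self, bucket_count=5, hash_function=hash):
        self._buckets = [[] for _ in range(bucket_count)]
        self._hash = hash_function

    def _bucket_for(self, key):
        # Map the key's hash to one of the buckets.
        return self._buckets[self._hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket_for(key)
        for position, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                # Key already present: overwrite its value in place.
                bucket[position] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        # Walk the chain and compare keys until the matching entry is found.
        for existing_key, value in self._bucket_for(key):
            if existing_key == key:
                return value
        raise KeyError(key)
```

Constructing the table with hash_function=lambda key: 0 sends "rick" and "morty" to bucket 0, and get("rick") still returns the right value because the chain is searched by key, not by position.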

how can I get the location for the maximum value in fortran?

I have a 250*2001 matrix. I want to find the location of the maximum value of a(:,i) for 5 different values of i, in steps of 256:
a(:,256)
a(:,512)
a(:,768)
a(:,1024)
a(:,1280)
I tried using MAXLOC, but since I'm new to fortran, I couldn't get it right.
Try this
maxloc(a(:,256:1280:256))
but be warned: the call returns the location within the 250*5 array section that you pass to it, so the second index will be in the range 1..5. To get the column index in the original array, multiply that second index by 256 (the columns of the section are 256, 512, ..., 1280). Also note that since the argument in the call to maxloc is a rank-2 array section, the call returns a 2-element vector.
Your question is a little unclear: it could be either of two things you want.
One value for the maximum over the entire 250-by-5 subarray;
One value for the maximum in each of the 5 250-by-1 subarrays.
Your comments suggest you want the latter, and there is already an answer for the former.
So, in case it is the latter:
b(1:5) = MAXLOC(a(:,256:1280:256), DIM=1)
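To make the index arithmetic concrete, here is a small Python sketch (not Fortran) that mimics MAXLOC(..., DIM=1) on a strided column section using nested lists, and maps a column index of the section back to the column of the original array. The helper names are hypothetical, and indices are kept 1-based to match Fortran.

```python
def maxloc_per_column(a, start, stop, stride):
    """Row index (1-based) of the maximum in each selected column.

    `a` is a list of rows; columns start, start+stride, ..., stop are
    selected, using 1-based inclusive bounds as in a(:, start:stop:stride).
    """
    result = []
    for col in range(start, stop + 1, stride):
        column = [row[col - 1] for row in a]
        # index() returns the first occurrence, like MAXLOC does for ties
        result.append(column.index(max(column)) + 1)
    return result


def section_to_original_column(j, start, stride):
    """Map the j-th column of the section (1-based) to the original column."""
    return start + (j - 1) * stride
```

With start = 256 and stride = 256, section column 3 maps to original column 768, which is the "multiplication" step mentioned in the first answer.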

Need some explanation about getting max in XPath

I'm kinda new to XPath and I've found that to get the max attribute number I can use the next statement: //Book[not(@id > //Book/@id)] and it works quite well.
I just can't understand why does it return max id instead of min id, because it looks like I'm checking whether id of a node greater than any other nodes ids and then return a Book where it's not.
I'm probably stupid, but, please, someone, explain :)
You're not querying for maximum values, but for minimum values. Your query
//Book[not(@id > //Book/@id)]
could be translated to natural language as "Find all books, which do not have an @id that is larger than any other book's @id". You probably want to use
//Book[not(@id < //Book/@id)]
Using <= instead would not help with shared maximum values: every book's @id is <= itself, so not(@id <= //Book/@id) would never match. As @ids must be unique, ties do not arise here anyway.
Be aware that //Book[@id > //Book/@id] is not equal to the query above, although math would suggest so. XPath's comparison operators adhere to a kind of set semantics: if any value on the left side is larger than any value on the right side, the predicate is true; thus it would include all books but the one with the minimum @id value.
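The set-style comparison described above can be emulated in plain Python, where "some value on the right satisfies the comparison" becomes any(...). This is only a model of the semantics on a list of numeric ids, not an XPath engine, and the function names are made up for the illustration.

```python
def not_greater_than_any(ids):
    """Models //Book[not(@id > //Book/@id)]: keeps the minimum."""
    return [i for i in ids if not any(i > other for other in ids)]


def not_less_than_any(ids):
    """Models //Book[not(@id < //Book/@id)]: keeps the maximum."""
    return [i for i in ids if not any(i < other for other in ids)]


def greater_than_any(ids):
    """Models //Book[@id > //Book/@id]: keeps everything but the minimum."""
    return [i for i in ids if any(i > other for other in ids)]
```

For the ids 3, 1 and 2, the first function keeps only 1, the second keeps only 3, and the third keeps 3 and 2.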
Your expression is correct for XPath 1.0. In XPath 2.0 you can use max() directly:
/Books/Book[@id = max(../Book/@id)]
The math:max function returns the maximum value of the nodes passed as the argument. The maximum value is defined as follows. The node set passed as an argument is sorted in descending order as it would be by xsl:sort with a data type of number. The maximum is the result of converting the string value of the first node in this sorted list to a number using the number function.
If the node set is empty, or if the result of converting the string values of any of the nodes to a number is NaN, then NaN is returned.
The math:max template returns a result tree fragment whose string value is the result of turning the number returned by the function into a string.

Count frequency of items in array - without two for loops

I need to know whether there is a way to count the frequency of items in an array without using two loops, and without knowing the size of the array in advance. If I knew the size of the array I could use a switch without looping, but I need something more versatile than that. I think modifying quicksort might give better results.
Array[n];
TwoDArray[n][2];
The first loop goes over Array[], while the second loop looks for the element in the two-dimensional array and increases its count.
int count = 0; // number of distinct values stored in TwoDArray so far
for (int i = 0; i < Array.length; i++) {
    boolean found = false;
    for (int j = 0; j < count; j++) {
        if (TwoDArray[j][0] == Array[i]) {
            TwoDArray[j][1]++;
            found = true;
            break;
        }
    }
    if (!found) {
        // first time this value is seen: add a new row with count 1
        TwoDArray[count][0] = Array[i];
        TwoDArray[count][1] = 1;
        count++;
    }
}
If you can comment or provide a better solution, that would be very helpful.
Use a map or hash table to implement this. Use the array item as the key and its frequency as the value.
Alternatively, you can use an array too if the range of the array elements is not too large: increment the count at the index corresponding to each array element.
I would build a map keyed by the item in the array and with a value that is the count of that item. One pass over the array builds the map that contains the counts. For each item, look its count up in the map, increment the count, and put the new count back into the map.
The map put and get operations can be constant time (e.g., if you use a hash map implementation with a good hash function and properly sized backing store). This means you can compute the frequencies in time proportional to the number of elements in your array.
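A minimal Python sketch of the map-based approach, using a dict as the hash map. The function name is hypothetical; the standard library's collections.Counter does the same job.

```python
def count_frequencies(items):
    """Count occurrences of each item in a single pass over the input."""
    counts = {}
    for item in items:
        # look the current count up (0 if unseen), increment, store it back
        counts[item] = counts.get(item, 0) + 1
    return counts
```

Each get and assignment on a dict takes constant time on average, so the whole pass is proportional to the number of elements.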
I'm not saying this is better than using a map or hash table (especially not when there are lots of duplicates, though in that case you can get close to O(n) sorting with certain techniques, so this is not too bad either), it's just an alternative.
Sort the array
Use a (single) for-loop to iterate through the sorted array
If you find the same element as the previous one, increment the current count
If you find a different element, store the previous element and its count and set the count to 1
At the end of the loop, store the previous element and its count
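The steps above can be sketched in Python as follows. The function name is hypothetical, and the result is returned as a list of (element, count) pairs in sorted order.

```python
def frequencies_by_sorting(items):
    """Count occurrences by sorting first, then scanning once."""
    ordered = sorted(items)
    result = []
    if not ordered:
        return result

    previous, count = ordered[0], 1
    for item in ordered[1:]:
        if item == previous:
            # same element as the previous one: extend the current run
            count += 1
        else:
            # a different element: store the finished run and start a new one
            result.append((previous, count))
            previous, count = item, 1
    # store the last run after the loop ends
    result.append((previous, count))
    return result
```

The sort dominates the cost, so this runs in O(n log n) time with a general-purpose comparison sort.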

Is there an easy way to have a "mode" function on an array of singles in vb6?

I need to run "mode" (which value occurs most frequently) on an array of singles in vb6. Is there a quick way to do this on large arrays?
Have a look online for a decent implementation of a sort algorithm for VB6 (I can't believe it doesn't have one built in!), sort the array, and then go through it counting the occurrences (which will be straightforward as you've all the same items together in the array) - keep a track of the most frequently occurring item on your way through and you're done. This should be O(n ln(n)) - that is, fast enough - if you've used a decent sort algorithm (quicksort or similar).
You could use a hash table. Hash all of the elements of your array (which is O(n)). You'll need a back-end data structure to hold the unique values that each hash bin contains and the number of occurrences (some sort of associative memory similar to the C++ std::map). As long as you can guarantee that there will be no more than a constant number, m, of collisions (for dissimilar hash input values) in any given bin, this is O(m log m), but since m is constant, this is really O(1). This assumption may not be reasonable, but the key is to get good enough spread for your input values.
To pull out the mode, examine all of the elements in the hash table, which will be the values that occur in your original input array and the number of times they occur. Find the value with the largest number of occurrences (again O(n)). Total complexity is O(n) if you can find a suitable hash function. Worst case performance will be O(n log n) if the hash function doesn't provide you with good collision performance.
On another note, .Net provides a large runtime library that might make this easier. If it's feasible, you might want to consider using a new version of VB.
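As a language-neutral illustration of the hash-table approach (Python here, not VB6), the sketch below tallies the values in a dict and then scans the tallies for the largest count. The function name is hypothetical; on ties it returns the value that was seen first.

```python
def mode(values):
    """Return the most frequent value; raises ValueError on empty input."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1   # O(1) on average per item

    if not counts:
        raise ValueError("mode() of an empty sequence")

    best_value, best_count = None, 0
    for value, count in counts.items():            # one pass over the tallies
        if count > best_count:
            best_value, best_count = value, count
    return best_value
```

With floating-point data, values that differ only by rounding noise count as distinct keys, so it may be worth rounding to a fixed precision before tallying.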
I included a reference to Microsoft Scripting Runtime and used a Dictionary object to keep a tally of the frequencies, then looked for the index of the highest frequency; the corresponding key is the mode. Not the quickest or most elegant solution, but I just needed something up fast that worked.
Function fnModeSingle(ByRef pValues() As Single) As Single
    Dim dict As Dictionary
    Set dict = New Dictionary
    dict.CompareMode = BinaryCompare
    Dim i As Long
    Dim pCurVal As Single
    For i = LBound(pValues) To UBound(pValues)
        'limit the values that have to be analyzed to the desired precision
        pCurVal = Round(pValues(i), 2)
        'note: zero and negative values are skipped
        If (pCurVal > 0) Then
            'this will create a dictionary entry if it doesn't exist
            dict.Item(pCurVal) = dict.Item(pCurVal) + 1
        End If
    Next
    'find the index of the first largest frequency
    Dim KeyArray, itemArray
    KeyArray = dict.Keys
    itemArray = dict.Items
    Dim pCount As Long
    pCount = 0
    Dim pModeIdx As Long
    For i = 0 To UBound(itemArray)
        If (itemArray(i) > pCount) Then
            pCount = itemArray(i)
            pModeIdx = i
        End If
    Next
    'get the value corresponding to the selected mode index
    fnModeSingle = KeyArray(pModeIdx)
    Set dict = Nothing
End Function
