CouchDB null value when sort descending - view

I have a CouchDB view that gives me the correct value in natural order and null when sorted descending. Here are the Futon screenshots:
[Screenshot: natural order]
[Screenshot: descending order]
Here is the view code:
"informe_precios": {
"map": "function(doc){if(doc.doc_type=='precio'){emit([doc.comprador,doc.fecha.substr(0,4),doc.fecha.substr(5,2)],{precio:doc.precio,litros:doc.litros});}}",
"reduce": "function(keys, values, rereduce){var importe= 0; var totallitros = 0;for(var i = 0; i < values.length; i++) {importe += values[i].precio*values[i].litros;totallitros += values[i].litros;}return importe/totallitros;}"
}
I need it descending because I want to get the last 12 values.
TIA
Diego

You're always assuming that your reduce function is called with the output of your map function, i.e. you're not handling the rereduce situation.
In a rereduce, your values will be the importe/totallitros results returned by previous reduce calls.
Your reduce function computes a "price per litre" average for each month. Because the result is an average, the rereduce step cannot combine several of them correctly: for the multiple values coming in, there's no way to know how much weight each one should carry in the combined average.
So you'll need to change your function to also return the weight behind each average, which here is the total litres, so that the rereduce step can use it to weight the averages (we're also using the built-in sum function to make things simpler):
function(keys, values, rereduce) {
  if (rereduce) {
    // values are [average, litres] pairs returned by earlier reduce calls
    var litres = sum(values.map(function(v){ return v[1]; }));
    var avg = sum(values.map(function(v){
      return v[0] * (v[1] / litres);
    }));
    return [avg, litres];
  }
  else {
    var importe = 0;
    var totallitros = 0;
    for (var i = 0; i < values.length; i++) {
      importe += values[i].precio * values[i].litros;
      totallitros += values[i].litros;
    }
    // return the weight (litres) alongside the average
    return [importe / totallitros, totallitros];
  }
}
The final result you'll see in your view here will be an array, so you'll always need to pick out the first element of that in your client code.
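To see why the weight has to travel with each average, here is a minimal Python sketch of the combining step only. It is an illustration, not CouchDB code: the `combine` helper and the sample numbers are hypothetical, and each partial result is assumed to be an (average, weight) pair like the arrays returned above.

```python
def combine(partials):
    """Merge (average, weight) pairs into a single (average, weight) pair."""
    total_weight = sum(weight for _, weight in partials)
    average = sum(avg * weight for avg, weight in partials) / total_weight
    return average, total_weight


# Two hypothetical partial results: an average of 1.0 carrying weight 1
# and an average of 2.0 carrying weight 3.
partials = [(1.0, 1), (2.0, 3)]

# Naive mean of the averages, ignoring the weights.
naive = sum(avg for avg, _ in partials) / len(partials)

# Weighted combination, which is what a rereduce step needs.
merged, weight = combine(partials)

print(naive)   # 1.5
print(merged)  # 1.75
```

The naive mean treats both partial results as equally important, while the weighted version gives the same answer no matter how the inputs were grouped before being combined.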

Related

Nested For Loops Explanation Needed

Basically, in this program, I was instructed to create an array of random numbers and then sort them from smallest to largest using a bubble sort with for loops. With a bunch of trial and error, my buddy and I were able to figure it out, but I just took a look back at my code and, honestly, it's very hard to comprehend. I'm not too familiar with nested loops, so if someone could explain how this method works, that would be awesome. More specifically, what do the values j and i stand for?
public void sort() {
int val = 0;
for(int i = 0; i < myArray.length; i++) {
for(int j = 1; j < (myArray.length - i); j++) {
if(myArray[j-1] > myArray[j]) {
val = myArray[j-1];
myArray[j-1] = myArray[j];
myArray[j] = val;
}
}
}
}
Any answers are greatly appreciated, thanks guys/gals!
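i and j are just short variable names; they have no inherent meaning beyond representing indexes. The outer loop (i) repeats the inner pass once for every item in the array. The inner loop (j) does the sorting: it walks through the array comparing each adjacent pair and swapping the two when they are out of order. After each pass the largest remaining value has moved to the end, which is why the inner loop only runs up to myArray.length - i.
if(myArray[j-1] > myArray[j]) { // Checks whether the element at index `j` is less than the one before it.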
val = myArray[j-1]; // Temporarily stores the greater value.
myArray[j-1] = myArray[j]; // Swap the numbers.
myArray[j] = val; // Swap the numbers.
}
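To watch the two loops at work, here is a small Python sketch of the same algorithm that records the array after every outer pass. The function name `bubble_sort` and the sample input are made up for illustration; the loop bounds mirror the Java code in the question.

```python
def bubble_sort(items):
    """Return a sorted copy of items plus a snapshot taken after each outer pass."""
    data = list(items)
    snapshots = []
    n = len(data)
    for i in range(n):             # i counts the passes completed so far
        for j in range(1, n - i):  # j is the right-hand index of the pair compared
            if data[j - 1] > data[j]:
                # Swap the out-of-order neighbours.
                data[j - 1], data[j] = data[j], data[j - 1]
        snapshots.append(list(data))
    return data, snapshots


result, passes = bubble_sort([5, 1, 4, 2])
for number, state in enumerate(passes):
    print(number, state)
# 0 [1, 4, 2, 5]
# 1 [1, 2, 4, 5]
# 2 [1, 2, 4, 5]
# 3 [1, 2, 4, 5]
```

After pass 0 the largest value (5) is already in its final slot, so later passes can skip it; that is the effect of subtracting i in the inner loop's bound.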

Algorithm for finding the closest set of measurements to a given measurement

I have a collection of measurements, for example:
measurement #1: { 200, 350, 712, 1023, 1430, 1555, 1800, 2036, 2569 }
measurement #2: { 165, 400, 974, 1124, 1600, 1893, 1919, 2032, 2654, 2932 }
...
measurement #N: { 234, 454, 879, 1432, 1877, 2000, 2543, 2876 }
The order of the elements in each measurement is important.
Each element has a higher value than the previous one.
The number of elements in each measurement may vary,
but it should not vary too much.
Now I get a new measurement as input
(let's say: { 212, 354, 978, 1222, 1454, 1922, 2013, 2432, 2987 })
and should find the closest measurement in the collection of measurements I already possess.
My question is: what algorithm should I use for this task?
More:
1. The task could also be extended so that, instead of a single measurement, the input is a small collection of measurements.
2. Each element in a measurement represents the time passed, in seconds, from the beginning.
Measuring stops at 3600 seconds (1 hour), so the minimum possible value is 0 and the maximum is 3599.
The events that create each element in a measurement are driven by human behaviour.
Thanks for your help :)
Assuming that your data is "fuzzy", one class of algorithms you may want to look into is dynamic programming. By fuzzy I mean that two sets are almost aligned, but one set may have extra elements inserted or removed compared to the other, and the matching elements only "almost" match.
In these types of algorithms you typically define a distance score by defining a penalty for inserting/removing an element in the alignment and a penalty score for two elements not quite matching.
In your case you may define an insert / delete penalty of "100" seconds for inserting an extra timing event, and a two-element distance score as the absolute distance in seconds.
Given that definition, you can easily find and modify a Needleman-Wunsch implementation or something similar. This will give you the distance between two small sets of measurements in an acceptable amount of time.
However, if your number of elements in your measurements is huge or your number of sets is huge, and you need the answer in say milliseconds, then it is a rather difficult problem unless you can find a lot of good constraints for your problem.
The above is just an example, it all boils down to the context. Is your data noisy? How "noisy", with extra elements in the middle, start or end or just slightly off in position? plus a ton of other questions.
Choosing and implementing fuzzy algorithms can range from pretty easy to nearly impossible, depending on the context and what you are going to use the result for. Does it need to be exact or "just good enough"? Does it need to be fast? And so on.
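The penalty scheme described above (a fixed cost for an inserted or removed event, plus the absolute difference in seconds for matched events) can be sketched in Python roughly as follows. This is a simplified global-alignment distance in the spirit of Needleman-Wunsch, not a tuned implementation; the names `alignment_distance` and `closest` and the default penalty of 100 are assumptions taken from the example numbers in this answer.

```python
def alignment_distance(a, b, gap_penalty=100):
    """Cost of the cheapest global alignment between two lists of timestamps."""
    rows, cols = len(a) + 1, len(b) + 1
    cost = [[0] * cols for _ in range(rows)]
    # Aligning against an empty prefix means every element is a gap.
    for i in range(1, rows):
        cost[i][0] = i * gap_penalty
    for j in range(1, cols):
        cost[0][j] = j * gap_penalty
    for i in range(1, rows):
        for j in range(1, cols):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]),  # match the pair
                cost[i - 1][j] + gap_penalty,                   # a[i-1] unmatched
                cost[i][j - 1] + gap_penalty,                   # b[j-1] unmatched
            )
    return cost[-1][-1]


def closest(query, collection, gap_penalty=100):
    """Return the stored measurement with the smallest alignment distance."""
    return min(collection, key=lambda m: alignment_distance(query, m, gap_penalty))


measurements = [
    [200, 350, 712, 1023, 1430, 1555, 1800, 2036, 2569],
    [165, 400, 974, 1124, 1600, 1893, 1919, 2032, 2654, 2932],
    [234, 454, 879, 1432, 1877, 2000, 2543, 2876],
]
query = [212, 354, 978, 1222, 1454, 1922, 2013, 2432, 2987]
print(closest(query, measurements))
```

Each comparison costs time proportional to the product of the two lengths, which is fine for measurements of this size; how to set the gap penalty depends on how noisy the data is.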
Find the sum of squared errors between your new measurement and each measurement in your collection. Then return the one from your collection with the smallest error.
var measures = [
[1, 2, 3, 4],
[10, 20, 30, 40],
[66, 77, 88, 99],
[101, 202, 303, 404]
];
// ignores measurements that aren't the same length as the data
// uses the squared sum of differences (errors)
function findClosest(data) {
var minError = 0x7FFFFFFF; // max 32bit signed int
var result = null;
for(var i=0; i < measures.length; i++) {
if(data.length !== measures[i].length) { continue; }
var error = 0;
for(var j=0; j < data.length; j++) {
error += Math.pow(measures[i][j] - data[j], 2);
}
if(error < minError) {
minError = error;
result = measures[i];
}
}
return result;
}
// allows data that is a different length than the measurements by fitting each element of data to its nearest element of the tested measurement
// uses the sum of squared differences (error)
function findClosestV2(data) {
var minError = 0x7FFFFFFF; // max 32bit signed int
var result = null;
for(var i=0; i < measures.length; i++) {
var measure = measures[i];
var error = 0;
for(var j=0; j < data.length; j++) {
// reset for every data element so each one finds its own best match
var minLocalError = 0x7FFFFFFF;
for(var k=0; k < measure.length; k++) {
var localError = Math.pow(measure[k] - data[j], 2);
if(localError < minLocalError) {
minLocalError = localError;
}
}
error += minLocalError;
}
if(error < minError) {
minError = error;
result = measures[i];
}
}
return result;
}
// allows data that is a different length than the measurements by fitting each element of data to its nearest element of the tested measurement
// uses the average of the absolute error % using the stored measurement as the ideal value
function findClosestV3(data) {
var minError = 0x7FFFFFFF; // max 32bit signed int
var result = null;
for(var i=0; i < measures.length; i++) {
var measure = measures[i];
var error = 0;
for(var j=0; j < data.length; j++) {
// reset for every data element so each one finds its own best match
var minLocalError = 0x7FFFFFFF;
for(var k=0; k < measure.length; k++) {
var localError = Math.abs( (measure[k] - data[j]) / measure[k] );
if(localError < minLocalError) {
minLocalError = localError;
}
}
error += minLocalError;
}
// average of sum of error percentages
error /= data.length;
if(error < minError) {
minError = error;
result = measures[i];
}
}
return result;
}
console.log(findClosest([2,3,4,5])); // [1,2,3,4]
console.log(findClosest([70,80,90,100])); // [66,77,88,99]
console.log(findClosest([9,19,304,405])); // [101,202,303,404]
console.log(findClosestV2([404])); // [101,202,303,404]
console.log(findClosestV2([66,67,68,69])); // [66,77,88,99]
console.log(findClosestV2([9,19,304,405])); // [101,202,303,404]
console.log(findClosestV3([404])); // [101,202,303,404]
console.log(findClosestV3([66,67,68,69])); // [66,77,88,99]
console.log(findClosestV3([9,19,304,405])); // [101,202,303,404]

Slickgrid Filtering without Dataview

Is it possible to filter a Slickgrid without using the DataView?
In case it isn't possible, how should the data array be structured in order to display correctly?
I don't have a working example atm. Thanks
Later edit:
After doing some more homework: a filterable data grid is all about getting the matching indexes from a nested array. To get a live result set that is updated with grid.setData(filterData); grid.render(); one should do the following:
function intersect(a, b) // find the intersection of 2 arrays (Google result on SO)
{
var ai=0, bi=0;
// sort numeric copies; the default sort() compares elements as strings
a = a.slice().sort(function(x, y){ return x - y; });
b = b.slice().sort(function(x, y){ return x - y; });
var result = [];
while( ai < a.length && bi < b.length )
{
if (a[ai] < b[bi] ){ ai++; }
else if (a[ai] > b[bi] ){ bi++; }
else /* they're equal */
{
result.push(a[ai]);
ai++;
bi++;
}
}
return result;
}
// given results sets are arrays of indexes matching search criteria
a = [1,2,3,4];
b = [2,3,4,5];
c = [3,4,5,6];
d = [4,5,6,7];
// should reunite in a nested array
array = [a,b,c,d];
// check intersections for each array[k] and array[k+1]
k = array[0];
for (var i = 0; i<array.length-1; i++){
k = intersect(k,array[i+1]);
}
console.log(k); // logs [4]
// k array is the index array that
// is used to build filterData[i] = data[j]
// depends if id is stored in data or in case
// of a database, it is stored in data
// tested in firebug
// thanks
Filter the underlying data array and call grid.setData(filteredData).

Algorithm to find first index where strings are different?

I've got a collection of strings, and I need to know the first index where they all differ. I can think of two ways to do this: (the following pseudo code is just off the top of my head and may be heavily bug-laden)
First Way:
var minLength = [go through all strings finding min length];
var set = new set()
for(i=0;i<minlength;i++)
{
for(str in strings)
{
var substring = str.substring(0,i);
if(set.contains(substring))
break; // not all different yet, increment i
set.add(substring)
}
set.clear(); // prepare for next length of substring
}
This strikes me as gross because of the use of a set data structure where it seems like one should not be needed.
Second Way:
var minLength = [go through all strings finding min length];
strings.sort();
for(i=0;i<minlength;i++)
{
boolean done = true;
char last = null;
for(str in strings)
{
char c = str[i];
if(c == last)
{
// not all different yet, increment i
done = false;
break;
}
last = c;
}
if(done)
return i;
}
But it annoys me that I have to run the sort first, because the sorting algorithm, by its very nature, has access to the information that I'm looking for.
Surely there must be a more efficient way than what I have listed above. Eventually I'd like to abstract it out to any type of array, but that will be trivial and it's simpler to think of it as a string problem.
Any help?
**UPDATE: I apparently didn't explain myself very well. If my strings are ["apple", "banana", "cucumber", "banking"], I want the function to return 3, because there were two strings ("banana" and "banking") that matched through index 0, 1, and 2, so 3 is the first index where they are all unique.
As Daniel mentioned below, a better way to state my needs is that: "I want to find index i where calling substring(0,i) on all my strings will result in all unique values."**
This is untested, but here's my attempt. (I may be making it more complicated than I have to, but I think it's a different way to look at it.)
The basic idea is to compile groups of items that match at the first element, then find the max unique index for each group, checking elements at each successive index.
int FirstUniqueIndex<T>(IEnumerable<IEnumerable<T>> myArrayCollection)
{
//just an overload so you don't have to specify index 0 all the time
return FirstUniqueIndex(myArrayCollection, 0);
}
int FirstUniqueIndex<T>(IEnumerable<IEnumerable<T>> myArrayCollection, int StartIndex)
{
/* Group the current collection by the element at StartIndex, and
* return a collection of these groups. Additionally, we're only interested
* in the groups with more than one element, so only get those.*/
// requires "using System.Linq;" (IEnumerable<T> has no Length or indexer, so use Count() and ElementAt())
var groupsWithMatches = from item in myArrayCollection //for each item in the collection (called "item")
where item.Count() > StartIndex //that are long enough
group item by item.ElementAt(StartIndex) into g //group them by the element at StartIndex, and call the group "g"
where g.Skip(1).Any() //only want groups with more than one element
select g; //add the group to the collection
/* Now "groupsWithMatches" is an enumeration of groups of inner matches of
* your original arrays. Let's process them... */
if(groupsWithMatches.Any())
//some matches were found - check the next index for each group
//(get the maximum unique index of all the matched groups)
return groupsWithMatches.Max(group => FirstUniqueIndex(group, StartIndex + 1));
else
//no matches found, all unique at this index
return StartIndex;
}
And for the non-LINQ version of the above (I'll change it to use a List collection, but any collection will do). I'll even remove the lambda. Again untested, so try not to aim sharp implements in my direction.
int FirstUniqueIndex<T>(List<List<T>> myArrayCollection, int StartIndex)
{
/* Group the current collection by the element at StartIndex, and
* return a collection of these groups. Additionally, we're only interested
* in the groups with more than one element, so only get those.*/
Dictionary<T, List<List<T>>> groups = new Dictionary<T, List<List<T>>>();
//group all the items by the element at StartIndex
foreach(var item in myArrayCollection)
{
if(item.Count > StartIndex)
{
List<List<T>> group;
if(!groups.TryGetValue(item[StartIndex], out group))
{
//new group, so make it first
group = new List<List<T>>();
groups.Add(item[StartIndex], group);
}
group.Add(item);
}
}
/* Now "groups" maps each element found at StartIndex to the arrays that
* contain it. Let's get the groups with more than one item. */
List<List<List<T>>> groupsWithMatches = new List<List<List<T>>>(groups.Count);
foreach(List<List<T>> group in groups.Values)
{
if(group.Count > 1)
groupsWithMatches.Add(group);
}
if(groupsWithMatches.Count > 0)
{
//some matches were found - check the next index for each group
//(get the maximum unique index of all the matched groups)
int max = -1;
foreach(List<List<T>> group in groupsWithMatches)
{
int index = FirstUniqueIndex(group, StartIndex + 1);
max = index > max ? index : max;
}
return max;
}
else
{
//no matches found, all unique at this index
return StartIndex;
}
}
Have you looked at a Patricia trie? (A Java implementation is available on Google Code.)
Build the trie, then traverse the data structure to find the maximum string position of all the internal nodes (black dots in the function above).
This seems like it should be an O(n) operation. I'm not sure whether your set implementation is O(n) or not. It "smells" like O(n^2), but I'm not sure.
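As a rough illustration of the trie idea, here is a Python sketch that uses a plain dictionary-based trie rather than a true Patricia (compressed) trie, which is a simplification on my part. The function name `first_unique_index` and the node layout are hypothetical. It counts how many strings pass through each node while inserting, and reports the depth of the deepest node shared by at least two strings.

```python
def first_unique_index(strings):
    """Depth of the deepest trie node that two or more strings pass through."""
    root = {}
    deepest_shared = 0
    for s in strings:
        node = root
        for depth, ch in enumerate(s, start=1):
            entry = node.setdefault(ch, {"count": 0, "children": {}})
            entry["count"] += 1
            if entry["count"] > 1:
                # A second string reached this node, so the strings still
                # agree on their first `depth` characters.
                deepest_shared = max(deepest_shared, depth)
            node = entry["children"]
    return deepest_shared


print(first_unique_index(["apple", "banana", "cucumber", "banking"]))  # 3
```

Every character is visited once, so the work is linear in the total length of the input. Note that for identical strings the result is simply their length, since no index ever separates them.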
Use the set as you proposed, that's exactly the right thing to do.
You should be able to do this without sorting, and with only looking at each character in each string once in the worst case.
Here is a Ruby script that prints the index to the console:
mystrings = ["apple", "banana", "cucumber", "banking"]
minlength = mystrings.map(&:length).min
char_set = {}
# exclusive ranges (...) so we never index past the end
(0...minlength).each do |char_index|
  char_set[mystrings[0][char_index].chr] = 1
  (1...mystrings.length).each do |string_index|
    comparing_char = mystrings[string_index][char_index].chr
    break if char_set[comparing_char]
    if string_index == (mystrings.length - 1) then
      # every string has a different character at this position
      puts char_index
      exit
    else
      char_set[comparing_char] = 1
    end
  end
  char_set.clear
end
puts minlength
The result is 3.
Here's the same general snippet in C#, if it is more legible for you:
string[] mystrings = { "apple", "banana", "cucumber", "banking" };
//defined elsewhere...
int minlength = GetMinStringLengthFromStringArray(mystrings);
Dictionary<char, int> charSet = new Dictionary<char, int>();
for (int char_index = 0; char_index < minlength; char_index++)
{
charSet.Add(mystrings[0][char_index], 1);
for (int string_index = 1; string_index < mystrings.Length; string_index++)
{
char comparing_char = mystrings[string_index][char_index];
if (charSet.ContainsKey(comparing_char))
{
break;
}
else
{
if (string_index == mystrings.Length - 1)
{
Console.Out.WriteLine("Index is: " + char_index.ToString());
return;
}
else
{
charSet.Add(comparing_char, 1);
}
}
}
charSet.Clear();
}
Console.Out.WriteLine("Index is: " + minlength.ToString());
// Java; needs java.util.Set and java.util.HashSet, and assumes strings is a String[]
int i = 0;
while(true)
{
    Set<Character> set = new HashSet<Character>();
    for(int j = 0; j < strings.length; j++)
    {
        if(i >= strings[j].length()) return i;
        char chr = strings[j].charAt(i);
        if(set.contains(chr))
            break;
        else
            set.add(chr);
    }
    if(set.size() == strings.length)
        return i;
    i++;
}
Gotta check preconditions first.
EDIT: Using a set now. Changed language.
Here's my solution in Python:
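from collections import defaultdict

def first_unique_index(words):
    # only scan as far as the shortest word
    for i in range(min(len(word) for word in words)):
        d = defaultdict(int)
        for word in words:
            d[word[i]] += 1
        if max(d.values()) == 1:
            return i

words = ["apple", "banana", "cucumber", "banking"]
print(first_unique_index(words))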
I didn't write in anything to handle the case where no minimum index is found by the time you reach the end of the shortest word, but I'm sure you get the idea.
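For comparison, the requirement as restated in the question's update (all prefixes unique) can be sketched directly with a set of prefixes. This is only one reading of that requirement: the function name `first_distinguishing_index` is made up, the input is assumed to be non-empty, and the returned index i means that the prefixes covering positions 0 through i are all different.

```python
def first_distinguishing_index(strings):
    """Smallest i such that the prefixes s[:i + 1] are all different."""
    longest = max(len(s) for s in strings)
    for i in range(longest):
        prefixes = {s[:i + 1] for s in strings}
        if len(prefixes) == len(strings):
            return i
    return None  # at least two strings are identical


print(first_distinguishing_index(["apple", "banana", "cucumber", "banking"]))  # 3
print(first_distinguishing_index(["ab", "ba", "aa"]))  # 1
```

The second call shows where this differs from checking a single character position: no column of ["ab", "ba", "aa"] holds three distinct characters, yet the two-character prefixes are already unique.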

Most efficient way to sort parallel arrays in a restricted-feature language

The environment: I am working in a proprietary scripting language where there is no such thing as a user-defined function. I have various loops and local variables of primitive types that I can create and use.
I have two related arrays, "times" and "values". They both contain floating point values. I want to numerically sort the "times" array but have to be sure that the same operations are applied on the "values" array. What's the most efficient way I can do this without the benefit of things like recursion?
You could maintain an index table and sort the index table instead.
This way you will not have to worry about times and values being consistent.
And whenever you need a sorted value, you can lookup on the sorted index.
And if in the future you decided there was going to be a third value, the sorting code will not need any changes.
Here's a sample in C#, but it shouldn't be hard to adapt to your scripting language:
static void Main() {
var r = new Random();
// initialize random data
var index = new int[10]; // the index table
var times = new double[10]; // times
var values = new double[10]; // values
for (int i = 0; i < 10; i++) {
index[i] = i;
times[i] = r.NextDouble();
values[i] = r.NextDouble();
}
// a naive bubble sort
for (int i = 0; i < 10; i++)
for (int j = 0; j < 10; j++)
// compare time value at current index
if (times[index[i]] < times[index[j]]) {
// swap index value (times and values remain unchanged)
var temp = index[i];
index[i] = index[j];
index[j] = temp;
}
// check if the result is correct
for (int i = 0; i < 10; i++)
Console.WriteLine(times[index[i]]);
Console.ReadKey();
}
Note: I used a naive O(n^2) sort there, so watch out. In your case, an insertion sort is probably a good candidate, since you want to avoid recursion.
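The insertion-sort variant suggested in the note might look like the following Python sketch. It deliberately uses only loops and primitive variables (apart from building the initial index list) to stay close to the restricted scripting language; the sample data is made up.

```python
times = [0.9, 0.1, 0.5, 0.3]
values = [9.0, 1.0, 5.0, 3.0]

# The index table starts out as 0, 1, 2, ...
index = list(range(len(times)))

# Insertion sort on the index table, comparing the times it points to.
# The times and values arrays themselves are never modified.
for i in range(1, len(index)):
    current = index[i]
    j = i - 1
    while j >= 0 and times[index[j]] > times[current]:
        index[j + 1] = index[j]  # shift larger entries one slot right
        j -= 1
    index[j + 1] = current

# Read both arrays through the sorted index table.
for k in index:
    print(times[k], values[k])
```

After the loop, index is [1, 3, 2, 0], so reading either array through it yields the rows in ascending time order.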
Just take your favourite sorting algorithm (e.g. Quicksort or Mergesort) and use it to sort the "times" array. Whenever two elements are swapped in "times", also swap the elements at the same indices in the "values" array.
So basically you can take any fast sorting algorithm and modify the swap() operation so that elements in both arrays are swapped.
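Here is a minimal Python sketch of that lockstep swap, sorting by times as the question asks. For brevity it uses a selection sort rather than Quicksort or Mergesort, which is my substitution and not what the answer recommends for speed; the point is only that every swap is applied to both arrays. The sample data is made up.

```python
times = [0.9, 0.1, 0.5, 0.3]
values = [9.0, 1.0, 5.0, 3.0]

for i in range(len(times) - 1):
    # Find the position of the smallest remaining time.
    smallest = i
    for j in range(i + 1, len(times)):
        if times[j] < times[smallest]:
            smallest = j
    if smallest != i:
        # The "modified swap": apply the same exchange to both arrays.
        times[i], times[smallest] = times[smallest], times[i]
        values[i], values[smallest] = values[smallest], values[i]

print(times)   # [0.1, 0.3, 0.5, 0.9]
print(values)  # [1.0, 3.0, 5.0, 9.0]
```

Only the comparison looks at times; values never influences the order and simply follows along.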
Take a look at the Bottom-Up mergesort at Algorithmist. It's a non-recursive way of performing a mergesort. The version presented there uses function calls, but that can be inlined easily enough.
Like martinus said, every time you change a value in one array, do the exact same thing in the parallel array.
Here's a C-like version of a stable, non-recursive mergesort that makes no function calls.
const int arrayLength = 40;
float times_array[arrayLength];
float values_array[arrayLength];
// Fill the two arrays....
// Allocate two buffers
float times_buffer[arrayLength];
float values_buffer[arrayLength];
int blockSize = 1;
while (blockSize <= arrayLength)
{
int i = 0;
while (i < arrayLength-blockSize)
{
int begin1 = i;
int end1 = begin1 + blockSize;
int begin2 = end1;
int end2 = begin2 + blockSize;
// the last block may be shorter when arrayLength is not a power of two
if (end2 > arrayLength) end2 = arrayLength;
int bufferIndex = begin1;
while (begin1 < end1 && begin2 < end2)
{
// compare times only; the values just follow along
if ( times_array[begin1] > times_array[begin2] )
{
times_buffer[bufferIndex] = times_array[begin2];
values_buffer[bufferIndex++] = values_array[begin2++];
}
else
{
times_buffer[bufferIndex] = times_array[begin1];
values_buffer[bufferIndex++] = values_array[begin1++];
}
}
while ( begin1 < end1 )
{
times_buffer[bufferIndex] = times_array[begin1];
values_buffer[bufferIndex++] = values_array[begin1++];
}
while ( begin2 < end2 )
{
times_buffer[bufferIndex] = times_array[begin2];
values_buffer[bufferIndex++] = values_array[begin2++];
}
// copy the merged run back, stopping at the (possibly clamped) end
for (int k = i; k < end2; ++k)
{
times_array[k] = times_buffer[k];
values_array[k] = values_buffer[k];
}
i += 2 * blockSize;
}
blockSize *= 2;
}
I wouldn't suggest writing your own sorting routine, as the sorting routines provided as part of the Java language are well optimized.
The way I'd solve this is to copy the code in the java.util.Arrays class into your own class, e.g. org.mydomain.util.Arrays, and add some comments telling yourself not to use the class except when you must have the additional functionality that you're going to add. The Arrays class is quite stable, so this is less bad than it would seem, but it's still less than ideal. However, the methods you need to change are private, so you've no real choice.
You then want to create an interface along the lines of:
public static interface SwapHook {
void swap(int a, int b);
}
You then need to add this to the sort method you're going to use, and to every subordinate method called in the sorting procedure that swaps elements in your primary array. You arrange for the hook to be called by your modified sorting routine, and you can then implement the SwapHook interface to achieve the behaviour you want in any secondary (e.g. parallel) arrays.
HTH.

Resources