How can this array be sorted? - sorting

I have an array with values as follows:
0-10
10-50
50-100
100-150
150-200
200+
This is actually an array of WordPress taxonomies, which WP sorts alphabetically (or ascending order of the first digit), giving me the following:
//notice how 50-100 gets pushed to the bottom, due to ascending order applied
0-10
10-50
100-150
150-200
200+
50-100
I just want to keep the order of the original array, as these are ranges for a particular situation and having 50-100 at the end disrupts the UI!
Does anyone know a way to sort this array?

In PHP, use natsort(), which sorts strings in natural order and keeps the original array keys:
$nums = array('10-50', '100-150', '0-10', '150-200', '50-100', '200+');
natsort($nums);
var_dump($nums);

Parse the strings, convert them to numbers, and sort on the first number. The actual mechanics of doing that depend on your programming language. In C# you can use LINQ:
var strings = new[] { "0-10", "10-50", "100-150", "50-100", "200+", "150-200" };
var sorted = strings.OrderBy(s => int.Parse(s.Split('+', '-')[0])).ToArray();
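The same parse-and-sort idea can be sketched in Ruby. This is only an illustration: the method name sort_ranges is made up here, and it assumes every label begins with its lower bound written as digits.

```ruby
# Sort range labels by their leading number ("50-100" -> 50, "200+" -> 200).
# sort_ranges is a hypothetical helper name, not part of any library.
def sort_ranges(labels)
  labels.sort_by { |label| label[/\d+/].to_i }
end

# The alphabetical order from the question.
labels = ['0-10', '10-50', '100-150', '150-200', '200+', '50-100']
puts sort_ranges(labels)
```

Because the sort key is the integer value rather than the string, 50-100 lands between 10-50 and 100-150.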

Related

Ruby storing data for queries

I have a string
"4813243948,1234433948,1.3,Type2
1234433948,4813243948,1.3,Type1
1234433948,6345635414,1.3,Type1
4813243948,2435677524,1.3,Type2
4813243948,5245654367,1.3,Type2
2345243524,6754846756,1.3,Type1
1234512345,2345124354,1.3,Type1
1342534332,4565346546,1.3,Type1"
This is telephone outbound call data where each new line represents a new phone call.
(Call From, Call To, Duration, Line Type)
I want to save this data in a way that allows me to query a specific number and get a string output of the number, its type, its total minutes used, and all the calls that it made (outbound calls). I just want to do this in a single Ruby file.
Thus typing in this
4813243948
Returns
4813243948, Type 2, 3.9 Minutes total
1234433948, 1.3
2435677524, 1.3
5245654367, 1.3
I am wondering if I should store the values in arrays, or create a custom class, make each number an object of that class, and append the calls to each number. I'm not sure how to do the class approach. Having a different array for each number seems like it would get cluttered, as there are thousands of numbers and millions of calls. Of course, the provided input string is a very small portion of the real source.
I have a string
"4813243948,1234433948,1.3,Type2
1234433948,4813243948,1.3,Type1
This looks like a CSV. If you slap some headers on top, you can parse it into an array of hashes.
str = "4813243948,1234433948,1.3,Type2
1234433948,4813243948,1.3,Type1"
require 'csv'
calls = CSV.parse(str, headers: %w[from to length type], header_converters: :symbol).map(&:to_h)
# => [{:from=>"4813243948", :to=>"1234433948", :length=>"1.3", :type=>"Type2"},
# {:from=>"1234433948", :to=>"4813243948", :length=>"1.3", :type=>"Type1"}]
This is essentially the same as your original string, only it trades some memory for ease of access. You can now "query" this dataset like this:
calls.select{ |c| c[:from] == '4813243948' }
And then aggregate for presentation however you wish.
Naturally, searching through this array takes linear time, so if you have millions of calls you might want to organize them in a more efficient search structure (like a B-Tree) or move the whole dataset to a real database.
If you only want to make queries for the number the call originated from, you could store the data in a hash where the keys are the "call from" numbers and the value is an array, or another hash, containing the rest of the data. For example:
{ '4813243948': { call_to: 1234433948, duration: 1.3, line_type: 'Type2' }, ... }
If the dataset is very large, or you need more complex queries, it might be better to store it in a database and just query it directly.
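A minimal sketch of the hash-keyed-by-caller idea, in Ruby. The method names index_calls and report are invented for this example, plain split is used instead of the csv library to keep it dependency-free, and it assumes that a number has the same line type on all of its calls.

```ruby
# Index the raw text by caller: "call from" number => array of that number's calls.
# index_calls and report are hypothetical helper names.
def index_calls(text)
  text.lines.each_with_object({}) do |line, index|
    from, to, minutes, type = line.strip.split(',')
    next if from.nil? # skip blank lines
    (index[from] ||= []) << { to: to, minutes: minutes.to_f, type: type }
  end
end

# Build the report string for one number, or nil if it made no calls.
def report(index, number)
  calls = index[number]
  return nil unless calls

  total = calls.sum { |call| call[:minutes] }
  # Assumption: the line type of the first call applies to the number.
  header = format('%s, %s, %.1f Minutes total', number, calls.first[:type], total)
  ([header] + calls.map { |call| "#{call[:to]}, #{call[:minutes]}" }).join("\n")
end

data = <<~CALLS
  4813243948,1234433948,1.3,Type2
  1234433948,4813243948,1.3,Type1
  4813243948,2435677524,1.3,Type2
  4813243948,5245654367,1.3,Type2
CALLS

puts report(index_calls(data), '4813243948')
```

Building the index is one pass over the input; after that, each query is a single hash lookup instead of a scan over every call.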

find endpoints for range given a value within the range

I am trying to solve a simple problem, but at the moment I cannot think of a better solution. I am testing an API that is not documented.
There is an ID used to fetch objects and it has a min and max value with random values missing in-between. I'm trying to test the responses I receive for random objects, but to find objects, I need to have valid IDs.
It would be very inefficient to test random numbers and hope that I get an object back. The best I can do is find a range, get a random number between that range and check if it exists before conducting tests.
A sample list of all of the IDs in the database might look like this:
[1005, 25984, 25986, 29587, 30000, ...]
Assuming the deviation from one value to another will never exceed C, e.g. from the first value to the next value, the difference will never be greater than a pre-defined constant, how would you calculate the min/max of the range given only one value in the range?
Starting from a given value and looping until the last value is found is horrible but that is how it was implemented by previous devs. Below is pseudocode that more or less covers what they do.
// this can be any valid object ID from the database
// assuming the IDs in the database are [1005, 25984, 25986, 29587, 30000]
// "i" could be any one of these values
var i = givenPredefinedObjectId;
var deviation = 100;
var lastFound = i; // highest ID known to exist so far
// objectWithIdExists() is going to look up an object with the ID "i" in the database
// if there is no object with the ID "i", it will return false
// otherwise the object will get tested and return true
// keep scanning upward until "deviation" consecutive IDs are missing
while (i - lastFound <= deviation) {
    if (objectWithIdExists(i)) {
        lastFound = i;
    }
    i++;
}
endPoint = lastFound;
Assuming there is no knowledge about the possible values except you can check if they exist and you are given one valid value (there is no array with all possible IDs, that was just an example), how would you find the min/max values?
Unbounded binary search is feasible, with a factor of C slowdown. Suppose you have an algorithm for unbounded binary search that, given access to the oracle less_equal(k) for some unknown natural number n, returns n in time O(log n). Implement the oracle on input k by querying all of the IDs C*k, C*k+1, ..., C*k+C-1 and reporting that k is less than or equal to n if and only if at least one ID is found. This works because consecutive IDs never differ by more than C, so every block of C consecutive values between min and max contains at least one ID. The running time is O(C*log((max-min)/C)).
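A rough Ruby sketch of that block-based search for the upper end of the range. The method name find_max_id and the exists callable are assumptions made for this example (exists stands in for the real API lookup), and IDs are assumed to be non-negative integers.

```ruby
require 'set'

# Largest existing ID, assuming consecutive IDs never differ by more than c.
# exists is any callable that returns true when an ID is present.
def find_max_id(known_id, c, exists)
  # Oracle: does block k (the IDs c*k .. c*k + c - 1) contain any ID?
  block_has_id = ->(k) { (c * k...c * (k + 1)).any? { |id| exists.call(id) } }

  lo = known_id / c # this block contains known_id, so it is non-empty
  step = 1
  # Gallop upward, doubling the step, until an empty block is hit.
  while block_has_id.call(lo + step)
    lo += step
    step *= 2
  end
  hi = lo + step # first block known to be empty

  # Binary search with the invariant: block lo is non-empty, block hi is empty.
  while hi - lo > 1
    mid = (lo + hi) / 2
    if block_has_id.call(mid)
      lo = mid
    else
      hi = mid
    end
  end

  # lo is the last non-empty block; the maximum is the highest ID inside it.
  (c * lo...c * (lo + 1)).select { |id| exists.call(id) }.max
end

ids = Set[1005, 1090, 1170, 1265, 1300] # made-up IDs with gaps of at most 100
exists = ->(id) { ids.include?(id) }
puts find_max_id(1090, 100, exists)
```

The minimum can be found the same way by galloping downward from the known block. Each oracle call costs up to C lookups, which is where the factor of C in the running time comes from.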

Ruby - Optimize code finding the optimal choice from an array

I asked a question that was basically a knapsack problem: I needed to find the combination from several different arrays of objects that gave the optimal output, for example the highest summed "value" of the objects with respect to a limit on their "cost". The answer I received here was the following:
a.product(b,c)
.select{ |arr| arr.reduce(0) { |sum,h| sum + h[:cost] } < 30 }
.max_by{ |arr| arr.reduce(0) { |sum,h| sum + h[:value] } }
Which works great, but as I get into 6 arrays with ~40 choices each, the possible combinations get upwards of 4 million and take too long to process. I made some changes to the code that made processing faster -
# creating the array doesn't take too long
combinations = a.product(b,c,d,e)
possibles = []
combinations.each do |array_of_objects|
  # max_cost is a numeric parameter, and I can't have the same exact object used twice
  if array_of_objects.sum(&:salary) <= max_cost && array_of_objects.uniq.count == array_of_objects.count
    possibles << array_of_objects
  end
end
possibles.max_by{ |ar| ar.sum(&:std_proj) }
Breaking it into two separate arrays helped the performance a lot, as max_by only has to run over the far fewer combinations that fit the criteria.
Does anyone see a way to optimize this code? Since I'm typically dealing with tens of thousands or millions of combinations, any little bit could greatly help. Thanks.
If we are talking about millions of rows and the operations are things like unique and max, I suggest solving this with DISTINCT and MAX() in your query. You can even filter by cost with WHERE.
Looping over the objects in Ruby is clearly more expensive.
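If the data has to stay in Ruby, one alternative that is not from the answers above is to avoid building every combination and instead prune partial combinations as soon as they exceed the budget. The sketch below is a plain depth-first search under stated assumptions: Item is a hypothetical stand-in for the question's objects, best_combination is an invented name, and salaries are assumed to be non-negative (otherwise the pruning is not valid).

```ruby
# Hypothetical stand-in for the question's objects.
Item = Struct.new(:name, :salary, :std_proj)

# Pick one item per group, maximizing total std_proj within max_cost.
# A partial pick is abandoned as soon as it goes over budget.
def best_combination(groups, max_cost)
  best = nil
  best_score = -Float::INFINITY

  search = lambda do |depth, picked, cost, score|
    if depth == groups.size
      best, best_score = picked.dup, score if score > best_score
      return
    end
    groups[depth].each do |item|
      next if cost + item.salary > max_cost # prune: already over budget
      next if picked.include?(item)         # the same object may not be used twice
      picked << item
      search.call(depth + 1, picked, cost + item.salary, score + item.std_proj)
      picked.pop
    end
  end

  search.call(0, [], 0, 0)
  best
end

a = [Item.new('a1', 10, 5), Item.new('a2', 20, 9)]
b = [Item.new('b1', 10, 3), Item.new('b2', 15, 8)]
p best_combination([a, b], 30).map(&:name)
```

How much this helps depends on the data: the tighter max_cost is relative to typical salaries, the more branches are cut before they are fully expanded.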

Count frequency of items in array - without two for loops

Is there a way to count the frequency of items in an array without using two loops? This is without knowing the size of the array. If I knew the size of the array, I could use a switch without looping, but I need something more versatile than that. I think modifying quicksort may give better results.
Array[n];
TwoDArray[n][2];
The first loop goes over Array[], while the second loop finds the element and increases its count in the two-dimensional array.
max = 0; // number of distinct items recorded so far
for (int i = 0; i < Array.length; i++) {
    found = false;
    for (int j = 0; j < max; j++) {
        if (TwoDArray[j][0] == Array[i]) {
            TwoDArray[j][1]++;
            found = true;
            break;
        }
    }
    if (!found) {
        TwoDArray[max][0] = Array[i];
        TwoDArray[max][1] = 1;
        max++;
    }
}
If you can comment or provide a better solution, that would be very helpful.
Use a map or hash table to implement this. Insert the array item as the key and its frequency as the value.
Alternatively, you can use an array if the range of the array elements is not too large: increment the count at the index corresponding to each element.
I would build a map keyed by the item in the array, with a value that is the count of that item. One pass over the array builds the map that contains the counts: for each item, look its count up in the map, increment the count, and put the new count back into the map.
The map put and get operations can be constant time (e.g., if you use a hash map implementation with a good hash function and properly sized backing store). This means you can compute the frequencies in time proportional to the number of elements in your array.
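The map-based counting described above might look like this in Ruby; frequencies is a name chosen for this sketch. A Hash with a default of 0 plays the role of the map.

```ruby
# Count occurrences in a single pass using a hash whose default value is 0.
def frequencies(items)
  counts = Hash.new(0)
  items.each { |item| counts[item] += 1 }
  counts
end

p frequencies([3, 1, 3, 2, 1, 3])
```

On Ruby 2.7 and later, the built-in tally method on arrays does the same job.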
I'm not saying this is better than using a map or hash table (especially not when there are lots of duplicates, though in that case you can get close to O(n) sorting with certain techniques, so this is not too bad either), it's just an alternative.
Sort the array
Use a (single) for-loop to iterate through the sorted array
If you find the same element as the previous one, increment the current count
If you find a different element, store the previous element and its count and set the count to 1
At the end of the loop, store the previous element and its count
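The sort-then-scan steps above can be sketched in Ruby as follows. The method name frequencies_by_sorting is made up for this example, and the items are assumed to be mutually comparable so that they can be sorted.

```ruby
# Count occurrences by sorting first, then scanning once and counting runs.
# Returns [item, count] pairs in sorted order.
def frequencies_by_sorting(items)
  result = []
  previous = nil
  count = 0
  items.sort.each do |item|
    if count > 0 && item == previous
      count += 1 # same element as the previous one
    else
      result << [previous, count] if count > 0 # store the finished run
      previous = item
      count = 1
    end
  end
  result << [previous, count] if count > 0 # store the final run
  result
end

p frequencies_by_sorting([3, 1, 3, 2, 1, 3])
```

The scan itself is a single loop; the overall cost is dominated by the sort.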

indexing and comparing string index or hash

I want to clean up my music library by giving attention to the songs that have the most duplicates on my system. I could just list them all, sort them and do it manually, but that would take too long. I want the list sorted by the number of possible duplicates. So if a song has 10 duplicates, it means there are 10 song names that resemble each other, and I would focus my attention on that song first to keep just the best version.
I could compare two song names using the Levenshtein string-comparison technique and gem:
require 'levenshtein'
Levenshtein.distance("string1", "string2") # => 1
But let's say I have x songs: I would have to compare each song x times, because I can't rely on normal file sorting; I would miss some duplicates then. For example:
The Beatles - Hey Jude
Beatles, The - hey jude
Beatles_-_Hey_Judy_(remastered)
should give beatles - hey judy (x3)
Is there a way to produce an index based on the filename that can then be sorted and would give all the duplicates in descending order? A kind of hash that can be compared?
I know of other music-comparing methods, but they have their flaws, and this would also be usable for comparing other types of files.
Try this code:
files is an array of filenames; max_distance is the distance below which two names are considered similar.
hash = {}
files.each do |file|
  similar = hash.keys.select { |f| Levenshtein.distance(f, file) < max_distance }
  if similar.any?
    hash[similar.first] += 1
  else
    hash[file] = 0
  end
end
After that you will have a hash with filenames as keys and "duplicates" counts as values, which you can sort however you want.
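For trying the grouping idea without installing the gem, here is a self-contained Ruby sketch. edit_distance is a hand-written dynamic-programming stand-in for Levenshtein.distance, and duplicate_counts is an invented name; both are assumptions of this example rather than part of the gem.

```ruby
# Classic dynamic-programming edit distance (stand-in for the levenshtein gem).
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      # deletion, insertion, substitution
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Group each file under the first earlier name that is close enough, then
# return [name, duplicate_count] pairs with the most duplicated names first.
def duplicate_counts(files, max_distance)
  counts = {}
  files.each do |file|
    key = counts.keys.find { |name| edit_distance(name, file) < max_distance }
    if key
      counts[key] += 1
    else
      counts[file] = 0
    end
  end
  counts.sort_by { |_name, count| -count }
end

p duplicate_counts(['hey jude', 'hey judy', 'let it be', 'hey jude!'], 3)
```

Each new name is still compared against every group seen so far, so this stays quadratic in the worst case; it only makes the counting and the final descending sort concrete.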
