I have a linear vector that contains the unit coordinates of any given block. The coordinate system is shown below. What would be the best method for sorting the list if I wanted the final list to start at the bottom left and end at the top right? Changing the data structure is not an option.
Any means of explanation will do: pseudocode, code, a rough idea, etc.
Your options are as follows.
Row major
Order by y descending, then by x increasing, using a dictionary ordering.
Column major
Order by x increasing, then by y descending, using a dictionary ordering.
To illustrate, here is a C-style example comparison function for option 1 (row major).
int compare(point a, point b)
{
    if (a.y > b.y)
        return -1;      /* larger y first: the bottom row comes first */
    else if (a.y < b.y)
        return 1;
    else if (a.x < b.x)
        return -1;      /* within a row, smaller x (leftmost) first */
    else if (a.x > b.x)
        return 1;
    else
        return 0;
}
It appears, based on the "#n" labels on the blocks, that you want option 1, row major.
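If it helps, the same ordering can be expressed as a single sort key in a higher-level language. A minimal Python sketch of option 1, assuming the points are (x, y) tuples with y growing downward as in the compare() above:

points = [(0, 0), (1, 2), (0, 2), (1, 0), (0, 1), (1, 1)]
# y descending (bottom row first), then x ascending (left to right).
points.sort(key=lambda p: (-p[1], p[0]))
print(points)  # [(0, 2), (1, 2), (0, 1), (1, 1), (0, 0), (1, 0)]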
I ran into a problem where I wanted to add a little feature to my homework, and it turned out to be overwhelming for me (read the bold sentences for the question without context).
My program has a list of about 35 items containing information about the map I'm supposed to work with. It can have the following elements:
"Wall" with coordinates (X, Y); in Dijkstra's algorithm it is supposed to have weight 100.
"Tree" with coordinates (X, Y); weight 3.
I have a 10x10 map laid out like a chessboard, which means 100 tiles and 35 items. "Nothing" in the list means weight 1 in Dijkstra's algorithm (a normal route).
To make Dijkstra's algorithm work and be able to find the shortest path between two tiles, I have to build an adjacency graph. My problem is: how do I define the tiles "around" the current tile if all I have is that list?
Only adjacent tiles in the shape of a "+" have edges between them in the graph. Do I really have to run through the whole list every time just to check whether there is something on a tile?
Any lead on the problem would be greatly appreciated; a pointer to a source with a code sample would also do. All I have found is really messy code with a lot of "if-elseif-elseif..." to solve this.
Thank you for your time!
Edit: I ended up using @kraskevich's suggested way, and it works excellently, but all the answers and suggestions were really useful. Thank you very much, everyone!
You don't really need to build a graph. Just create a 10x10 table and put the weight of the corresponding items into it:
board = 10x10 array filled with 1
for item in list:
    if item is a tree:
        board[item.row][item.column] = 3
    else if item is a wall:
        board[item.row][item.column] = 100
After that, you can treat pairs (row, col) of coordinates as vertices and update the distances of the 4 adjacent cells when you process a tile. That's it. Of course, you can also create a graph with 100 vertices explicitly, add edges from each tile to its 4 adjacent cells (the weight of an edge is the weight of the tile it ends on), and use a standard implementation.
The most convenient way to iterate over adjacent cells is as follows:
delta_rows = [-1, 1, 0, 0]
delta_cols = [0, 0, -1, 1]
...
for direction = 0 .. 3
    new_row = row + delta_rows[direction]
    new_col = col + delta_cols[direction]
    if is_valid(new_row, new_col)
        // do something
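Putting the weight table and the delta arrays together, here is a minimal Dijkstra sketch in Python; the names board, start and goal are mine, and board[r][c] is assumed to hold the weight of stepping onto tile (r, c):

import heapq

def dijkstra(board, start, goal):
    # board[r][c] = cost of entering tile (r, c); start and goal are (row, col).
    n = len(board)
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > dist[(r, c)]:
            continue  # stale queue entry
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # "+"-shaped moves
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n:
                nd = d + board[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return None  # goal unreachable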
It should be straightforward to implement Dijkstra based on this generic graph interface:
interface Graph<T> {
    Iterable<T> adjacentNodes(T node);
    double getDistance(T node, T adjacent);
}
So now all we need to do is implement it for your case:
class Field {
    final int x;
    final int y;
    int value = 1;

    Field(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
import java.util.ArrayList;
import java.util.List;

class ChessboardGraph implements Graph<Field> {
    Field[][] board = new Field[10][10];

    ChessboardGraph(List<Entry> list) {
        for (int x = 0; x < 10; x++) {
            for (int y = 0; y < 10; y++) {
                board[x][y] = new Field(x, y);
            }
        }
        for (Entry e : list) {
            board[e.x][e.y].value = e.value == TREE ? 3 : 100;
        }
    }

    public Iterable<Field> adjacentNodes(Field node) {
        List<Field> result = new ArrayList<>();
        int x = node.x;
        int y = node.y;
        // "+"-shaped adjacency: only the four orthogonal neighbours.
        if (x > 0) result.add(board[x - 1][y]);
        if (x < 9) result.add(board[x + 1][y]);
        if (y > 0) result.add(board[x][y - 1]);
        if (y < 9) result.add(board[x][y + 1]);
        return result;
    }

    public double getDistance(Field node, Field adjacent) {
        // Only valid for orthogonal neighbours.
        assert Math.abs(node.x - adjacent.x) + Math.abs(node.y - adjacent.y) == 1;
        return board[adjacent.x][adjacent.y].value;
    }
}
I'm about to optimize a problem that is defined by n (n >= 1, typically n = 4) non-negative variables. This is not an n-dimensional problem, since the sum of all the variables needs to be 1.
The most straightforward approach would be to scan the entire range 0 <= x_i < 1 for each x_i and then normalize all the values by their sum. However, this approach introduces redundancy, which is a problem for many optimization algorithms that rely on stochastic sampling of the solution space (genetic algorithms, tabu search and others). Is there an alternative algorithm that can perform this task?
What do I mean by redundancy?
Take the two-dimensional case as an example. Without the constraint, this would be a two-dimensional problem requiring the optimization of two variables. Due to the requirement that X1 + X2 == 1, however, one only needs to optimize one variable, since X2 is determined by X1 and vice versa. Had one decided to scan X1 and X2 independently and normalize them to a sum of 1, many solution candidates would have been identical with respect to the problem. For example, (X1==0.1, X2==0.1) is identical to (X1==0.5, X2==0.5) after normalization.
If you are dealing with real-valued variables, then arriving at two samples that become identical is quite unlikely. However, you do have the problem that your samples would not be uniform: you are much more likely to end up with (0.5, 0.5) than (1.0, 0). One way of fixing this is subsampling (rejection): when normalization shrinks a whole region of the cube onto a single point, you shrink the probability of choosing that point accordingly.
In other words, all the points inside the unit cube that lie in the same direction from the origin map to a single point after normalization. These points form a line segment, and the longer the segment, the more likely you are to land on its image. Hence you want to bias the probability of accepting a sample by the inverse of the length of that segment.
Here is code that does it (assuming you want the x_i to sum up to 1):
while (true) {
    maximum = 0;
    norm = 0;
    sum = 0;
    for (i = 0; i < N; i++) {
        x[i] = random(0, 1);
        maximum = max(x[i], maximum);
        sum += x[i];
        norm += x[i] * x[i];
    }
    norm = sqrt(norm);
    // Length of the segment inside the unit cube along the direction of x.
    length_of_line = norm / maximum;
    sample_probability = 1 / length_of_line;
    // Reject with probability 1 - sample_probability, otherwise normalize.
    if (sum == 0 || random(0, 1) > sample_probability) {
        continue;
    } else {
        for (i = 0; i < N; i++) {
            x[i] = x[i] / sum;
        }
        return x;
    }
}
Here is the same function provided earlier by Amit Prakash, translated to Python:
import numpy as np

def f(N):
    while True:
        x = np.random.rand(N)
        mxm = np.max(x)
        theSum = np.sum(x)
        nrm = np.sqrt(np.sum(x * x))
        length_of_line = nrm / mxm
        sample_probability = 1 / length_of_line
        # Reject the sample with probability 1 - sample_probability.
        if theSum == 0 or np.random.rand() > sample_probability:
            continue
        return x / theSum
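A quick sanity check of the sampler (my example, not part of the original answer):

sample = f(4)
print(sample, sample.sum())  # four non-negative values summing to ~1.0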
Suppose I have an int x = 54897, an old digit index (0-based), and the new value for that digit. What's the fastest way to compute the new number?
Example
x = 54897
index = 3
value = 2
y = f(x, index, value) // => 54827
Edit: by fastest, I definitely mean faster performance. No string processing.
In the simplest case, considering the digits are numbered from LSB to MSB (the first one being 0) AND knowing the old digit, we could do something as simple as:
num += (new_digit - old_digit) * 10**pos;
For instance, with num = 54897, pos = 1, old digit 9 and new digit 2: 54897 + (2 - 9) * 10 == 54827.
For the real problem we would need:
1) the MSB-first version of pos, which could cost you a log() or at most log10(MAX_INT) divisions by ten (this could be improved with a binary search);
2) the digit at that pos, which needs at most 2 divisions (or zero, reusing the results from step 1).
You could also use the special x87 FPU instruction (FBSTP) that can store a value as packed BCD (I have no idea how slow it is).
UPDATE: the first step could be done even faster, without any divisions, with a binary search like this:

int my_log10(unsigned short n) {
    // short: 0..64k -> 1..5 digits
    if (n < 1000) {        // 1..3 digits
        if (n < 10) return 1;
        if (n < 100) return 2;
        return 3;
    } else {               // 4..5 digits
        if (n < 10000) return 4;
        return 5;
    }
}
If your index started at the least significant digit, you could do something like:
p = pow(10, index);
x = x / (p * 10) * (p * 10) + value * p + x % p;
But since your index starts at the most significant digit, a string is probably the way to go. It would also be more readable and maintainable.
Calculate the "mask" M: 10 raised to the power of index, where index is a zero-based index from the right. If you need to index from the left, recalculate index accordingly.
Calculate the "prefix" PRE = x / (M * 10) * (M * 10) (integer division).
Calculate the "suffix" SUF = x % M.
Calculate the new "middle part" MID = value * M.
Generate the new number: new_x = PRE + MID + SUF.
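A small Python sketch of these steps, assuming a zero-based index counted from the right (the function name is mine):

def replace_digit(x, index, value):
    m = 10 ** index                   # the "mask" M
    pre = x // (m * 10) * (m * 10)    # PRE: digits above the target
    suf = x % m                       # SUF: digits below the target
    mid = value * m                   # MID: the new digit in place
    return pre + mid + suf

print(replace_digit(54897, 1, 2))     # 54827 (the question's index 3 from the left)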
P.S. ruslik's answer does it more elegantly :)
You need to start by figuring out how many digits are in your input. I can think of two ways of doing that: one with a loop and one with logarithms. Here's the loop version. It will fail for negative and zero inputs, and when the index is out of bounds, and probably under other conditions too, but it's a starting point.
def f(x, index, value):
    place = 1
    residual = x
    while residual > 0:
        if index < 0:
            place *= 10   # past the target digit: accumulate its place value
        index -= 1
        residual //= 10
    digit = (x // place) % 10
    return x - (place * digit) + (place * value)
P.S. This is working Python code. The principle of something simple like this is easy to work out, but the details are so tricky that you really need to iterate it a bit. In this case I started with the principle that I wanted to subtract out the old digit and add the new one; from there it was a matter of getting the correct multiplier.
You have to be specific about your compute platform if you're asking about performance.
I would approach this by converting the number into pairs of decimal digits, 4 bits each (i.e., BCD).
Then I would find and modify the pair that needs changing, as a byte.
Then I would put the number back together.
There are assemblers that do this very well.
Saw this question recently:
Given 2 arrays, where the 2nd array contains some of the elements of the 1st array, return the minimum window in the 1st array that contains all the elements of the 2nd array.
E.g.:
Given A = {1,3,5,2,3,1} and B = {1,3,2}
Output: 3, 5 (where 3 and 5 are indices in array A)
Even though the range 0 to 3 also contains all the elements of B, the range 3 to 5 is returned because its length is smaller ((5 - 3) < (3 - 0)).
I had devised a solution, but I am not sure it works correctly, and it is not efficient.
Can anyone give an efficient solution for this problem? Thanks in advance.
A simple solution: iterate through the list with two pointers.
Have a left and a right pointer, initially both at zero.
Move the right pointer forwards until [L..R] contains all the elements (or quit if the right pointer reaches the end).
Move the left pointer forwards until [L..R] no longer contains all the elements. Then check whether [L-1..R] is shorter than the current best.
This is clearly linear time. You simply need to keep track of how many occurrences of each element of B are inside the window to check whether the window is a potential solution.
Pseudocode of this algorithm:
size = A.length; bestL = size + 1;
needed = B.length;              // assumes the elements of B are distinct
found = 0; left = 0;
counts = {};                    // map of (number -> occurrences inside the window)
for (b in B) counts.put(b, 0);

for (right = 0; right < size; right++) {
    if (!counts.contains(A[right])) continue;   // not an element of B
    amt = counts.get(A[right]);
    counts.set(A[right], amt + 1);
    if (amt == 0) found++;
    if (found == needed) {
        // Shrink from the left while the window still covers B.
        while (found == needed) {
            if (counts.contains(A[left])) {
                amt = counts.get(A[left]);
                counts.set(A[left], amt - 1);
                if (amt == 1) found--;
            }
            left++;
        }
        // The smallest window ending at right is [left-1 .. right].
        if (right - left + 2 < bestL) {
            bestL = right - left + 2;
            bestRange = [left - 1, right];      // inclusive
        }
    }
}
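For reference, a compact runnable version of the same two-pointer idea in Python (this one also handles duplicates in B by tracking required counts):

from collections import Counter

def min_window(a, b):
    need = Counter(b)        # how many of each element the window still lacks
    missing = len(need)      # number of distinct elements not yet covered
    best = None
    left = 0
    for right, val in enumerate(a):
        if val in need:
            need[val] -= 1
            if need[val] == 0:
                missing -= 1
        while missing == 0:  # shrink while the window still covers b
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            if a[left] in need:
                need[a[left]] += 1
                if need[a[left]] == 1:
                    missing += 1
            left += 1
    return best

print(min_window([1, 3, 5, 2, 3, 1], [1, 3, 2]))  # (3, 5)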
I need to compare 2 strings and calculate their similarity, to filter down a list of the most similar strings.
e.g. searching for "dog" would return
dog
doggone
bog
fog
foggy
e.g. searching for "crack" would return
crack
wisecrack
rack
jack
quack
I have come across:
QuickSilver
LiquidMetal
What other string similarity algorithms are there?
The Levenshtein distance is the algorithm I would recommend. It calculates the minimum number of operations needed to change one string into another. The fewer the changes, the more similar the strings are...
It seems you need some kind of fuzzy matching. Here is a Java implementation of a set of similarity metrics: http://www.dcs.shef.ac.uk/~sam/stringmetrics.html. And here is a more detailed explanation of string metrics: http://www.cs.cmu.edu/~wcohen/postscript/ijcai-ws-2003.pdf. Which one to use depends on how fuzzy and how fast your implementation must be.
If the focus is on performance, I would implement an algorithm based on a trie structure
(it works well for finding words in a text or for helping correct a word, but in your case it lets you quickly find all words containing a given word, or all words off by one letter, for instance).
Please follow the Wikipedia link above first. Tries are the fastest word-lookup method: with n words and a search string s, it takes O(n) to build the trie and O(1) to search for s (or, if you prefer, with a the average word length, O(an) for the trie and O(s) for the search).
A fast and easy implementation (to be optimized) for your problem (similar words) consists of:
Building the trie from the list of words, with all letters indexed front and back (see the example below).
To search for s, iterating from s[0] to find the word in the trie, then from s[1], etc.
In the trie, if the number of letters found is len(s) - k, the word is reported, where k is the tolerance (1 letter missing, 2, ...).
The algorithm may be extended to the words in the list (see below).
Example, with the words car and vars.
Building the trie (a capital letter means a word ends there, while another word may continue). The > is a post-index (go forward) and the < is a pre-index (go backward). In another example we might also have to indicate the starting letter; it is omitted here for clarity.
The < and > would be, in C++ for instance, Mystruct *previous, *next, meaning that from a > c < r you can go directly from a to c and back, and also from a to R.
1. c < a < R
2. a > c < R
3. > v < r < S
4. R > a > c
5. > v < S
6. v < a < r < S
7. S > r > a > v
Looking strictly for car, the trie gives you access from 1., and you find car (you would also have found everything starting with car, and anything with car inside; that is not in this example, but vicar, for instance, would have been found from c > i > v < a < R).
To search with a tolerance of one wrong/missing letter, you iterate from each letter of s and count the number of letters of s found consecutively, or by skipping one letter, in the trie.
Looking for car:
c: search the trie for c < a and c < r (a letter missing in s). To accept a wrong letter in a word w, try at each iteration to jump over the wrong letter and see whether ar is behind; this is O(w). With two wrong letters, O(w²), etc. Another level of indexing could be added to the trie to account for jumps over letters, at the price of a more complex and memory-hungry trie.
a, then r: same as above, but searching backwards as well.
This is just to provide an idea about the principle - the example above may have some glitches (I'll check again tomorrow).
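As a concrete starting point, here is a bare-bones trie in Python with insertion and prefix search only; the bidirectional (front-and-back) indexing described above is left out for brevity:

class TrieNode:
    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.is_word = False  # a capital letter in the diagrams above

def insert(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True

def words_with_prefix(root, prefix):
    # Yield every stored word that starts with prefix.
    node = root
    for ch in prefix:
        if ch not in node.children:
            return
        node = node.children[ch]
    def walk(n, path):
        if n.is_word:
            yield prefix + path
        for c, child in n.children.items():
            yield from walk(child, path + c)
    yield from walk(node, "")

root = TrieNode()
for w in ("car", "vars", "vicar"):
    insert(root, w)
print(list(words_with_prefix(root, "car")))  # ['car']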
You could do this:
Foreach string in haystack Do
    offset := -1;
    matchedCharacters := 0;
    Foreach char in needle Do
        offset := PositionInString(string, char, offset + 1);
        If offset = -1 Then
            Break;
        End;
        matchedCharacters := matchedCharacters + 1;
    End;
    If matchedCharacters > 0 Then
        // (partial) match found
    End;
End;
With matchedCharacters you can determine the "degree" of the match. If it equals the length of needle, all characters in needle also appear in string, in order. If you also store the offset of the first matched character, you can sort the results by the "density" of the match: subtract the offset of the first matched character from the offset of the last one; the smaller the difference, the denser the match.
class Program {
    static int ComputeLevenshteinDistance(string source, string target) {
        // An empty string is |other| edits away from the other string.
        if (string.IsNullOrEmpty(source)) return (target ?? "").Length;
        if (string.IsNullOrEmpty(target)) return source.Length;
        if (source == target) return 0;

        int sourceLength = source.Length;
        int targetLength = target.Length;
        int[,] distance = new int[sourceLength + 1, targetLength + 1];

        // Initialize the first row and column.
        for (int i = 0; i <= sourceLength; distance[i, 0] = i++);
        for (int j = 0; j <= targetLength; distance[0, j] = j++);

        for (int i = 1; i <= sourceLength; i++) {
            for (int j = 1; j <= targetLength; j++) {
                // Substitution costs 1 unless the characters match.
                int cost = (target[j - 1] == source[i - 1]) ? 0 : 1;
                distance[i, j] = Math.Min(
                    Math.Min(distance[i - 1, j] + 1, distance[i, j - 1] + 1),
                    distance[i - 1, j - 1] + cost);
            }
        }
        return distance[sourceLength, targetLength];
    }

    static void Main(string[] args) {
        Console.WriteLine(ComputeLevenshteinDistance("Stackoverflow", "StuckOverflow"));
        Console.ReadKey();
    }
}
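To rank candidates as the question asks, the raw distance is usually normalized by string length so that 1.0 means identical and 0.0 means completely different. A small self-contained Python sketch (the normalization choice is mine, not from the answer above):

def levenshtein(a, b):
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]

def similarity(a, b):
    if not a and not b:
        return 1.0
    return 1 - levenshtein(a, b) / max(len(a), len(b))

print(similarity("dog", "fog"))  # 0.666...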