About multi-dimensional arrays - data structures

What is a multidimensional array?
Multidimensional arrays can be described as "arrays of arrays".
For example, a two-dimensional array can be imagined as a table made of elements, all of them of the same uniform data type.
int twoDMatrix[2][3] = { {5,3,2}, {8,4,1} };
I want to know whether this description is correct or not.

YES dude, your representation is correct.
In the case of array access,
int twoDMatrix[2][3] = { {5,3,2}, {8,4,1} };
twoDMatrix[0][0] is 5 and twoDMatrix[1][1] is 4.
I believe you got it.

Yes, what you mean is true. Multidimensional arrays are in fact arrays of arrays, which can be visualized in memory as below,
| 5 | 3 | 2 | 8 | 4 | 1 |
which is the same as
int twoDMatrix[2][3] = { {5,3,2}, {8,4,1} };
and equivalent, in terms of memory layout, to
int twoDMatrix[6] = { 5,3,2,8,4,1 };
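To make the equivalence concrete, here is a minimal sketch (purely illustrative) showing that twoDMatrix[i][j] and the flattened element at index i*3 + j are the same thing:

#include <stdio.h>

int main(void) {
    int twoDMatrix[2][3] = { {5, 3, 2}, {8, 4, 1} };
    /* View the same six ints as one flat run of memory, as described above. */
    int *flat = &twoDMatrix[0][0];

    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 3; j++) {
            /* twoDMatrix[i][j] and flat[i*3 + j] name the same element. */
            printf("twoDMatrix[%d][%d] = %d, flat[%d] = %d\n",
                   i, j, twoDMatrix[i][j], i * 3 + j, flat[i * 3 + j]);
        }
    }
    return 0;
}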

When using a 2D array it's best to think of it as a grid where the first index selects the row and the second selects the column, much like the data you would see in an Excel table. At least that's what I do. Anything larger than a 3D array (think of a cube) becomes very confusing and could probably be substituted with a different structure.

Related

Does a data structure like this exist?

I'm searching for a data structure that can be sorted as fast as a plain list and which should allow removing elements in the following way. Let's say we have a list like this:
[{2,[1]},
{6,[2,1]},
{-4,[3,2,1]},
{-2,[4,3,2,1]},
{-4,[5,4,3,2,1]},
{4,[2]},
{-6,[3,2]},
{-4,[4,3,2]},
{-6,[5,4,3,2]},
{-10,[3]},
{18,[4,3]},
{-10,[5,4,3]},
{2,[4]},
{0,[5,4]},
{-2,[5]}]
i.e. a list containing tuples (this is Erlang syntax). Each tuple contains a number and a list of the members that were used to compute that number. What I want to do with the list is the following. First, sort it, then take the head of the list, and finally clean the list. By clean I mean removing all the elements from the tail that contain elements that are in the head, or, in other words, all the elements from the tail whose intersection with the head is not empty. For example, after sorting, the head is {18,[4,3]}. The next step is removing all the elements of the list that contain 4 or 3, i.e. the resulting list should be this one:
[{6,[2,1]},
{4,[2]},
{2,[1]},
{-2,[5]}]
The process continues by taking the new head and cleaning again until the whole list is consumed. Note that if the clean process preserves the order, there is no need to re-sort the list on each iteration.
The bottleneck here is the clean process. I would need some structure which allows me to do the cleaning in a faster way than now.
Does anyone know of a structure that allows doing this efficiently without losing the order, or at least allowing fast sorting?
Yes, you can get faster than this. Your problem is that you are representing the second tuple members as lists. Searching them is cumbersome and quite unnecessary. They are all contiguous substrings of 5..1. You could simply represent them as a tuple of indices!
And in fact you don't even need a list with these index tuples. Put them in a two-dimensional array right at the position given by the respective tuple, and you'll get a triangular array:
h\l |  1    2    3    4    5
----+------------------------
 1  |  2
 2  |  6    4
 3  | -4   -6  -10
 4  | -2   -4   18    2
 5  | -4   -6  -10    0   -2
Instead of storing the data in a two-dimensional array, you might want to store them in a simple array with some index magic to account for the triangular shape (if your programming language only allows for rectangular two-dimensional arrays), but that doesn't affect complexity.
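If you do flatten the triangle into a one-dimensional array, one common piece of index magic is to start row h (1-based, holding h entries) at offset h*(h-1)/2. A minimal sketch in C, assuming the example's 5-row triangle and 1-based (h, l) coordinates (the names are just illustrative):

#include <stdio.h>

#define N 5

/* Flat storage for a lower-triangular N x N array: N*(N+1)/2 entries. */
static int tri[N * (N + 1) / 2];

/* Map 1-based (h, l) with l <= h onto the flat array. */
static int tri_index(int h, int l) {
    return h * (h - 1) / 2 + (l - 1);
}

int main(void) {
    tri[tri_index(4, 3)] = 18;   /* corresponds to the tuple {18,[4,3]} */
    printf("%d\n", tri[tri_index(4, 3)]);
    return 0;
}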
This is all the structure you need to quickly filter the "list" by simply looking things up.
Instead of sorting first and getting the head, we simply iterate once through the whole structure to find the maximum value and its indices:
max_val = 18
max = (4, 3) // the two indices
The filter is quite simple. If we don't use lists (not (any (substring `contains`) selection)) or sets (isEmpty (intersect substring selection)) but tuples, then it's just sel.high < substring.low || sel.low > substring.high. And we don't even need to iterate the whole triangular array, we can simply iterate the higher and the lower triangles:
result = []
for (i from 1 until max[1])
    for (j from i until max[1])
        result.push({array[j][i], (j,i)})
for (i from max[0] until 5)
    for (j from i until 5)
        result.push({array[j+1][i+1], (j+1,i+1)})
And you've got the elements you need:
[{ 2, (1,1)},
{ 6, (2,1)},
{ 4, (2,2)},
{-2, (5,5)}]
Now you only need to sort that and you've got your result.
Actually, the overall complexity doesn't get better with the triangular array. You still have O(n) from building the structure and finding the maximum. Whether you filter in O(n) by testing against every substring index tuple, or in O(|result|) by smart selection, doesn't matter any more, but you were specifically asking about a fast cleaning step. This still might be beneficial in practice if the data is large, or when you need to do multiple cleanings.
The only thing affecting overall complexity is to sort only the result, not the whole input.
I wonder if your original data structure can be seen as an adjacency list for a directed graph? E.g.,
{2,[1]},
{6,[2,1]}
means you have these nodes and edges;
node 2 => node 1
node 6 => node 2
node 6 => node 1
So your question can be rewritten as:
If I find a node that links to nodes 4 and 3, what happens to the graph if I delete nodes 4 and 3?
One approach would be to build an adjacency matrix: an NxN bit matrix where every edge is a 1 bit. Your problem now becomes:
set every bit in the 4-row, and every bit in the 4-column, to zero.
That is, nothing links in or out of this deleted node.
As an optimisation, keep a bit array of length N. The bit is set if the node hasn't been deleted. So if nodes 1, 2, 4, and 5 are 'live' and 3 and 6 are 'deleted', the array looks like
[1,1,0,1,1,0]
Now to delete '4', you just clear the bit:
[1,1,0,0,1,0]
When you're done deleting, go through the adjacency matrix, but ignore any edge whose row or column has a 0 in the mask.
Full example. Let's say you have
[ {2, [1,3]},
{3, [1]},
{4, [2,3]} ]
That's the adjacency matrix
    1 2 3 4
1   0 0 0 0   # no entry for 1
2   1 0 1 0   # 2, [1,3]
3   1 0 0 0   # 3, [1]
4   0 1 1 0   # 4, [2,3]
and the mask
[1 1 1 1]
To delete node 2, you just alter the mask;
[1 0 1 1]
Now, to recover the structure, use pseudocode like:
rows = []
for r in 1..4:
    if mask[r] == false:
        # this row was deleted
        continue
    targets = []
    for c in 1..4:
        if mask[c] == true && matrix[r,c]:
            # this node wasn't deleted and the edge was there before
            targets.add(c)
    if (!targets.empty):
        rows.add({r, targets})
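A rough C version of the same walk, hard-coding the 4-node example above (the names and the printing are only for illustration):

#include <stdio.h>
#include <stdbool.h>

#define N 4

int main(void) {
    /* Adjacency matrix from the example: 2->[1,3], 3->[1], 4->[2,3]. */
    bool matrix[N + 1][N + 1] = { false };
    matrix[2][1] = matrix[2][3] = true;
    matrix[3][1] = true;
    matrix[4][2] = matrix[4][3] = true;

    /* Mask of live nodes; deleting node 2 just clears its bit. */
    bool mask[N + 1] = { false, true, true, true, true };
    mask[2] = false;

    /* Walk the matrix, skipping deleted rows and columns. */
    for (int r = 1; r <= N; r++) {
        if (!mask[r]) continue;                 /* row was deleted */
        int printed = 0;
        for (int c = 1; c <= N; c++) {
            if (mask[c] && matrix[r][c]) {      /* edge survives the deletion */
                if (!printed) { printf("%d ->", r); printed = 1; }
                printf(" %d", c);
            }
        }
        if (printed) printf("\n");
    }
    return 0;
}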
Adjacency matrices can get large - it's NxN bits, after all - so this will only do better on small, dense matrices, not large, sparse ones.
If this isn't great, you might find that it's easier to google for graph algorithms than invent them yourself :)

Find max value 2d array N*N with fewer comparisons

I want to find the maximum value in a two-dimensional N*N array in C with fewer comparisons. I can do it simply with an O(N^2) algorithm, but I think it is too slow.
So I thought about another way: I loop once and search by row and column at the same time, trying to reduce the complexity (I guess to O(2(N-1))). You can see in this picture what I'm trying to do.
I use the same loop to check the content of the columns and the rows.
What I want to know is: is there anything faster? Like sorting the 2D array with O(N log N) complexity? Assume the values are unsorted.
If the 2d array of M x M elements is not sorted in any way, then you're not going to do better than O(M^2).
Keep in mind that the matrix has M^2 elements, so sorting them will have complexity of O(M^2 log M^2), since most decent sorts are O(N log N) and here N = M^2.
Divide it up into [no. of cores] chunks. Get the max of each chunk in parallel. Pick the bones out of the results.
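A minimal sketch of that idea in C with OpenMP (compile with -fopenmp); splitting by rows rather than into exactly [no. of cores] chunks is an assumption here, and the reduction just combines the per-thread maxima:

#include <stdio.h>
#include <limits.h>

#define M 1000

/* Each thread scans a chunk of rows; the reduction combines the per-thread maxima. */
int find_max(int a[M][M]) {
    int best = INT_MIN;
    #pragma omp parallel for reduction(max:best)
    for (int i = 0; i < M; i++)
        for (int j = 0; j < M; j++)
            if (a[i][j] > best)
                best = a[i][j];
    return best;
}

int main(void) {
    static int a[M][M];                          /* static so it doesn't blow the stack */
    for (int i = 0; i < M; i++)
        for (int j = 0; j < M; j++)
            a[i][j] = (i * 31 + j * 17) % 9973;  /* arbitrary test data */
    printf("max = %d\n", find_max(a));
    return 0;
}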
You could probably just cast the array to a 1D array and iterate over the flattened pointer...
I'll explain:
As you probably know, a 2D array is stored flat in memory. The array char c[4][2] looks like this:
| c[0][0] | c[0][1] | c[1][0] | c[1][1] | c[2][0] | ...
|  Byte 1 |  Byte 2 |  Byte 3 |  Byte 4 |  Byte 5 | ...
In this example, c[1][1] == ((char*)c)[3].
For this reason, when all members are of the same type, it's possible to safely cast a 2D array to a 1D array, i.e.
int my_array[20][20];
for (int i = 0; i < 400; i++) {
    ((int *)my_array)[i] = i;
}
// my_array[19][0] == 380  (flat index 19*20 + 0)
As dbush points out (upvote his answer), if your matrix is M x M elements, then M^2 is the best you're going to get, and flattening the array this way simply saves you from copying the memory over before any operations.
EDIT
Someone asked why casting the array to a 1D array might be better.
The idea is to avoid a nested inner loop, making the optimizer's work easier. It is more likely that the compiler will unroll the loop if it's only a single dimension loop and the array's size is fixed.
dbush certainly has the right answer in terms of complexity.
It should also be noted that if you want "faster" in terms of actual run time (not just complexity), you need to consider caching. Going down the rows and columns in parallel is very bad for data locality, and you will incur a cache miss when you iterate down a column if your data has relatively large rows. You have to touch every element at least once in order to find the max, and it would be fastest to touch them in a "row major" ordering.
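To illustrate the locality point, here is a small sketch of the two traversal orders (the size and test data are arbitrary). The row-major loop touches consecutive addresses; the column-major loop strides by a whole row on every step:

#include <stdio.h>

#define M 1024

/* Row-major: the inner loop walks consecutive addresses (cache-friendly). */
int max_row_major(int a[M][M]) {
    int best = a[0][0];
    for (int i = 0; i < M; i++)
        for (int j = 0; j < M; j++)
            if (a[i][j] > best) best = a[i][j];
    return best;
}

/* Column-major: each inner step jumps M*sizeof(int) bytes (cache-unfriendly). */
int max_col_major(int a[M][M]) {
    int best = a[0][0];
    for (int j = 0; j < M; j++)
        for (int i = 0; i < M; i++)
            if (a[i][j] > best) best = a[i][j];
    return best;
}

int main(void) {
    static int a[M][M];   /* static so the 4 MB array doesn't live on the stack */
    for (int i = 0; i < M; i++)
        for (int j = 0; j < M; j++)
            a[i][j] = (i * 131 + j * 7) % 100003;
    printf("%d %d\n", max_row_major(a), max_col_major(a));
    return 0;
}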

Is row-major and column-major order really a property of a programming language

I think I have discovered a widespread misunderstanding (professors do it wrong!). People say that C and C++ represent matrices in row-major order and Fortran in column-major order. But I doubt that C and C++ have a built-in major order, because there is no true matrix type. If I enter
int A[2][3] = { {1, 2, 3}
, {4, 5, 6} };
The order is row-major just because my editor is row-oriented rather than column-oriented. This has nothing to do with the language itself, or does it? If the editor were column-oriented:
i {
n { {
t 1 4
, ,
A 2 5
[ , ,
2 3 6
] } }
[ ;
3
]
=
Now the matrix A has two columns and three rows.
To illustrate further, consider a matrix print loop
for (int k = 0; k < M; ++k)
{
    for (int l = 0; l < N; ++l)
        { printf("%.7g\t", A[k][l]); }
    putchar('\n');
}
Why does it print by row? Because '\n' moves to the next row rather than the next column. If '\n' were interpreted as "go to the next column and first row" and '\t' as "go to the next row", then A would be printed column-wise. But I know that my terminal is row-oriented, so if I want to print column-wise, the only way is to swap these loops.
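For reference, here is the swapped version; A is made a concrete double matrix here purely so the snippet stands on its own:

#include <stdio.h>

int main(void) {
    enum { M = 2, N = 3 };
    double A[M][N] = { {1, 2, 3}, {4, 5, 6} };

    /* Swapped loops: outer over columns, inner over rows, so each printed
       line corresponds to one column of A. */
    for (int l = 0; l < N; ++l) {
        for (int k = 0; k < M; ++k)
            printf("%.7g\t", A[k][l]);
        putchar('\n');
    }
    return 0;
}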
Whether A[k] logically represents a row or a column depends on the functions that operate on A, and there is then a trade-off in what order to choose. For example, Gaussian elimination walks rows{column, rows{column}}. The advantage of placing the row index first is that it makes it easier to swap rows when pivoting. However, to perform the pivoting one has to loop through all rows in the same column, which would be faster with the opposite choice. The innermost elimination loop accesses two rows at a time, and neither order is really good for it.
A better terminology is probably first-index indexing and last-index indexing. This is a pure language feature: first-index indexing refers to the situation where the first given index is supposed to increment slowest, while last-index indexing is the opposite. "Rows" and "columns" are an interpretation issue, much like byte order and character encodings: the compiler will never know what a row or column is, but it may have a language-defined input order (most languages happen to accept numeric constants in big-endian order, but my computer wants little-endian). These terms come from conventions in the environment and library routines.
This has nothing to do with how your text editor works, and everything to do with how the elements of the 2D array are laid out in memory. That, in turn, determines whether nested loops over all the elements of the matrix are more efficient with the row loop or the column loop as the inner loop.
As one commenter suggested, it's really just the order of indices in the array access syntax that makes C row-major.
Here's a better example program that initialises a 2D array using a flat list of values.
#include <stdio.h>
#include <string.h>
int main() {
    int data[9] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int arr[3][3];
    memcpy(arr, data, sizeof(int)*9);
    printf("arr[0][1] = %d\n", arr[0][1]);
}
So now we can avoid any confusion added by the 2D array declaration syntax, or how that syntax is laid out in the text editor. We are just concerned with how C interprets the linear list of values that we have shoved into memory.
And if we run the program we will see:
$ ./a.out
arr[0][1] = 2
This is what makes C row-major. The fact that the array syntax is interpreted as [row][column] when accessing data in memory.

Traverse an n-dimensional array when dimensions are variable

I need to traverse an n-dimensional array. The array is built and passed from another function, and the number of dimensions is not known in advance. This needs to be done in a primitive language similar to VBA, so no Python sort of goodness is present.
Does anyone know how this can be accomplished?
A sample array could be a 5 x 6 x 1 x 8 array. So it is a 4-dimensional array with dimension1=5, dimension2=6, dimension3=1 and dimension4=8.
I need to traverse each of the 5*6*1*8= 240 elements and record my results somehow so that I can relate my results with elements.
EDIT: To make it more clear, at the end of the traversal I want to be able to say that the element at position (2,3,1,5) is x. So I need to record the position of the element within the array and the element itself.
The array in question is more like this:
Global multiArray as Variant
'\ Now, lots of other functions, when they find eligible candidates, add arrays to this array
'\ like below.
Redim multiArray(len(multiArray)+1)
multiArray(len(multiArray)) = newElementArray()
So I end up with something like below, only the dimensions will change at runtime, so I need generic logic to traverse it.
Let a coordinate represent a location of an element in an n-dimensional array. For example (2,1,3,4) corresponds to an element in position: array[2][1][3][4].
var array = ... // n-dimensional

function traverse(array, coordinate, dimension) {
    for (var i = 0; i < array.length; i++) {
        // assuming coordinate is immutable; append the current iteration's index
        currentCoordinate = coordinate.add(i);
        if (dimension == 1) {
            doSomething(currentCoordinate, array[i]);
        } else {
            traverse(array[i], currentCoordinate, dimension - 1);
        }
    }
}

coordinate = [];                 // at first, the top-level coordinate is empty
traverse(array, coordinate, 4);  // 4-dimensional
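Since the other examples here are in C, here is a rough C sketch of the same recursion over a flattened buffer whose dimensions are only known at runtime; the 5 x 6 x 1 x 8 shape and the printing are just illustration:

#include <stdio.h>

/* Recursively walk a flattened n-dimensional array stored in row-major order,
   building the coordinate vector as we descend. */
static void traverse(const int *data, const int *dims, int ndims,
                     int *coord, int depth, long offset) {
    if (depth == ndims) {
        /* Full coordinate built: relate the position to the element. */
        printf("(");
        for (int d = 0; d < ndims; d++)
            printf("%d%s", coord[d] + 1, d + 1 < ndims ? "," : "");
        printf(") = %d\n", data[offset]);
        return;
    }
    for (int i = 0; i < dims[depth]; i++) {
        coord[depth] = i;
        traverse(data, dims, ndims, coord, depth + 1, offset * dims[depth] + i);
    }
}

int main(void) {
    int dims[4] = { 5, 6, 1, 8 };   /* the 5 x 6 x 1 x 8 example */
    int data[5 * 6 * 1 * 8];
    for (int i = 0; i < 240; i++) data[i] = i;

    int coord[4];
    traverse(data, dims, 4, coord, 0, 0);
    return 0;
}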
The implementation will depend on whether it is a multi-dimensional array or a jagged array (array of arrays) as ggreiner shows.
If you only need to traverse the values of the array it can be as simple as:
(C#)
int[, ,] arr = new int[1, 3, 2] { { { 1, 2 }, { 3, 4 }, { 5, 6 } } };
foreach(int i in arr)
Console.WriteLine(i);

Efficient Indexing method for a 2 by 2 matrix

If I fill numbers from 1 to 4 in a 2 by 2 matrix, there are 16 possible combinations. What I want to do is store values in an array of size 24 corresponding to each matrix. So given a 2 by 2 matrix, I want an efficient indexing method to index into the array directly (I don't want to compare all 4 elements for each of the 16 positions). Something similar to a bit vector? But I'm not able to figure out how.
I also want it for a 4 by 4 matrix, filling from 1 to 9.
To clarify: you're looking for an efficient hash function for 2x2 matrices. You want to use the results of the hash function to compare matrices to see if they're the same.
First, let's assume you actually want the numbers 0 to 3 instead of 1 to 4 - this makes it easier and is more computer-sciency. Next, 16 is not right: there are 24 possible permutations of the numbers 0-3, and 4^4 = 256 possible strings of length 4 over a four-letter alphabet (if you can repeat already-used numbers).
Either one is trivial to encode into a single byte. Let the first 2 bits represent the (0,0) position, the next 2 bits represent (0,1), and so forth. Then, to hash your 2x2 matrix, simply do:
hash = m[0][0] | (m[0][1] << 2) | (m[1][0] << 4) | (m[1][1] << 6);
A random example: the number 54 in binary is 00110110, which represents a matrix like:
2 1
3 0
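A minimal sketch of both directions in C, just to check the arithmetic (the function names are made up for illustration):

#include <stdio.h>

/* Pack a 2x2 matrix of values 0-3 into one byte, 2 bits per cell. */
static unsigned char hash_2x2(int m[2][2]) {
    return (unsigned char)(m[0][0] | (m[0][1] << 2) | (m[1][0] << 4) | (m[1][1] << 6));
}

/* Unpack the byte back into a matrix. */
static void unhash_2x2(unsigned char h, int m[2][2]) {
    m[0][0] =  h       & 3;
    m[0][1] = (h >> 2) & 3;
    m[1][0] = (h >> 4) & 3;
    m[1][1] = (h >> 6) & 3;
}

int main(void) {
    int m[2][2];
    unhash_2x2(54, m);   /* 54 = 00110110 -> the "2 1 / 3 0" matrix above */
    printf("%d %d\n%d %d\n", m[0][0], m[0][1], m[1][0], m[1][1]);
    printf("hash back: %d\n", hash_2x2(m));
    return 0;
}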
When you need efficiency, sometimes code clarity goes out the window :)
First you need to be sure you want efficiency - do you have profiling info showing that the simple comparison code is too inefficient for you?
You can simply treat it as an array of bytes of the same size; memcmp does comparisons of arbitrary memory.
A data structure such as:
int matrix[2][2];
is stored the same as:
int matrix[2*2];
which could be dynamically allocated as:
typedef int matrix[2*2];
matrix* m = (matrix*)malloc(sizeof(matrix));
I'm not suggesting you dynamically allocate them; I'm illustrating how the bytes in your original type are actually laid out in memory.
Therefore, the following is valid:
matrix lookup[16];

int matrix_cmp(const void* a, const void* b) {
    return memcmp(a, b, sizeof(matrix));
}

void init_matrix_lookup() {
    int i;
    for (i = 0; i < 16; i++) {
        ...
    }
    qsort(lookup, 16, sizeof(matrix), matrix_cmp);
}

int matrix_to_lookup(matrix* m) {
    // in this example I'm sorting them so we can bsearch;
    // but with only 16 elements, it's probably not worth the effort,
    // and you could safely just loop over them...
    matrix* found = bsearch(m, lookup, 16, sizeof(matrix), matrix_cmp);
    return found ? (int)(found - lookup) : -1;
}
