Array Math on n dimensional array [closed] - ruby

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 9 years ago.
I've lost all my hair on this. I've got a 3-dimensional array,
initialized as Array.new(rows) { Array.new(columns) { Array.new(CHANNELS, 0) } }
Everything seems to work, but when I try to add columns of padding, I can't figure out how the 2nd dimension gets whacked.
I've done this about 5 different ways and keep coming up with the wrong size for the second dimension. The first part works okay: I initialize an array stack_edge to be a 1 x n array of pixels and push/unshift it onto the beginning and end of image_data, which then has original_height + 2*pads rows.
But then I try to push and unshift pixels onto the columns of each row and get an array that thinks it's wider than it is. It reports a width 110 pixels wider than the original. I can't figure out where the other 100 pixels come from. They're not there. I never noticed before since I calculated the width instead of interrogating for it: (old_width+2*pad_s) worked and all the data appears to be in place, but width = @image_data[row].size whacks out with the 110 pixel size. I'm guessing it's because the pixel I'm pushing on is a 10x1 array, and I put 5 in the front and 5 in the back, so 110 by some strange math. Can you tell me what I'm doing wrong?
(0...pad_s).each {
  @image_data.unshift(stack_edge)
  @image_data.push(stack_edge)
}
self.rows = @image_data.size
edge = Array.new(image_data[0][0].size)
a = 'whats up'
(0...@image_data.size).each { |i|
  (0...pad_s).each {
    @image_data[i].unshift(edge)
    @image_data[i].push(edge)
  }
}

You are pushing/unshifting the same stack_edge array 5 times onto the front and 5 times onto the back of @image_data, so the same stack_edge object appears in 10 different positions in @image_data. When you then run over @image_data and push/unshift 10 "edges" onto each row, those 10 edges are added to stack_edge 10 times, once for each position it appears in. That is where the 100 extra elements come from. Get it?
What you need is:
@image_data.unshift(stack_edge.dup)
@image_data.push(stack_edge.dup)
This is what is called an "aliasing bug". It tends to show up in languages where variables hold references to mutable objects.
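The same effect can be reproduced outside Ruby. Here is a minimal Python sketch, not the asker's code: the image size (4 rows of 20 pixels), the helper names and the copy_edge flag are all made up for illustration. It pads an image once with the same edge row reused and once with a copy per insertion, then collects the row widths.

```python
CHANNELS = 3
PAD = 5


def make_image(rows, cols):
    """Build a rows x cols image where every pixel is its own list of channels."""
    return [[[0] * CHANNELS for _ in range(cols)] for _ in range(rows)]


def pad_rows(image, pad, copy_edge):
    """Add `pad` edge rows on top and bottom (hypothetical helper).

    With copy_edge=False the very same list object is inserted every time,
    which is the aliasing bug described above.
    """
    width = len(image[0])
    stack_edge = [[0] * CHANNELS for _ in range(width)]
    for _ in range(pad):
        image.insert(0, list(stack_edge) if copy_edge else stack_edge)
        image.append(list(stack_edge) if copy_edge else stack_edge)


def pad_columns(image, pad):
    """Add `pad` fresh pixels at the front and back of every row."""
    for row in image:
        for _ in range(pad):
            row.insert(0, [0] * CHANNELS)
            row.append([0] * CHANNELS)


aliased = make_image(4, 20)
pad_rows(aliased, PAD, copy_edge=False)
pad_columns(aliased, PAD)
aliased_widths = sorted({len(row) for row in aliased})

copied = make_image(4, 20)
pad_rows(copied, PAD, copy_edge=True)
pad_columns(copied, PAD)
copied_widths = sorted({len(row) for row in copied})
```

With these sizes, aliased_widths comes out as [30, 120]: the original rows grow by 10, while the shared edge row grows by 10 for each of the 10 positions it occupies. copied_widths is just [30]. Note that list(...) here, like dup in Ruby, is a shallow copy, so the inner pixel lists are still shared between the copies; that does not affect the width, but it would matter if the padding pixels were later modified in place.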

Related

How To Empty a Dynamic Array [closed]

I need to reuse a dynamic array many times, as I expect that to perform better.
That way I don't need to create a new dynamic array every time I need one.
I want to ask whether it can lead to bugs or inefficiency if I use the same array for several instructions, then clear it and reuse it. And how can I correct my procedure so that it does what I need?
My code:
procedure Empty(local_array : array of Integer);
var
  i : Integer;
begin
  for i := 0 to High(local_array) do
    local_array[i] := nil;
  SetLength(local_array, 0);
end;
If you want to reuse your array, don't mess with its size. Changing the size of an array, or more specifically increasing it, is what can lead to the need for data reallocation.
What is array data reallocation?
In Delphi all arrays need to be stored in one contiguous memory block. This means that if you try to increase the size of your array and there is already some data right after the memory block currently assigned to it, the whole array has to be moved to another memory location where there is enough space to store the new array size in one contiguous block.
So instead of resizing your array, leave its size alone and just set the array items to some default value. Yes, this means that the array will keep occupying its allocated memory. But that is the whole point of reusing it: you avoid the overhead of allocating and deallocating memory for the array.
If you go this way, don't forget to keep your own count of the items in use, since the array's length may be larger than the number of items actually used.
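To illustrate the idea of a fixed-size array plus a separate count, here is a small Python sketch. It is not Delphi and not the asker's procedure; the ReusableBuffer class and its method names are invented for this example.

```python
class ReusableBuffer:
    """Fixed-capacity integer buffer that is cleared instead of reallocated."""

    def __init__(self, capacity):
        self.items = [0] * capacity  # allocated once, never resized
        self.count = 0               # number of slots currently in use

    def append(self, value):
        if self.count == len(self.items):
            raise OverflowError("buffer is full")
        self.items[self.count] = value
        self.count += 1

    def clear(self):
        # Reset the used slots to a default value and forget them;
        # the underlying storage keeps its size.
        for i in range(self.count):
            self.items[i] = 0
        self.count = 0

    def used(self):
        """Return only the slots that hold live data."""
        return self.items[:self.count]


buf = ReusableBuffer(4)
for value in (7, 8, 9):
    buf.append(value)
first_pass = buf.used()

buf.clear()
buf.append(42)
second_pass = buf.used()
```

The storage keeps its length of 4 across both passes; only count changes. Resetting the slots to 0 in clear is optional if the count is always respected.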

From Log value to Exponential value, huge Distortion for prediction of machine learning algorithm [closed]

I built a machine learning model to predict a value Y'. For this, I used Log value of Y for data scaling.
Once I have the predicted Y' and the actual Y, I have to convert the log values of Y and Y' back with the exponential function.
BUT there is a huge distortion for values above exp(7) (7 ≈ ln 1098), which produces a very large MSE.
How can I avoid this huge distortion? (Generally, I need to predict values over 1000.)
Thanks!!
"For this, I used Log value of Y for data scaling."
The log transform is not really for scaling; it is used to make the distribution of the target variable closer to normal.
If your MSE rises as the real target value rises, it means the model simply can't fit the big values well enough. Usually this can be solved by cleaning the data (removing outliers), or by trying another ML model.
UPDATE
You can run KFold and, for each fold, calculate the MSE/MAE between predicted and real values. Then take the cases with big errors and look at which parameters/features they have.
You can eliminate the cases with big errors, but that is usually dangerous.
In general, a bad fit on big values means that you did not remove outliers from your original dataset. Plot histograms and scatter plots and make sure that you don't have any.
Check categorical variables: maybe some categories are rare (<=5% of the rows). If so, group them.
Or you may need to create 2 models: one for small values, one for big ones.
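It may also help to see why the error looks so much bigger after converting back from the log. A fixed error in log space becomes a fixed relative error after exp, so the absolute error grows in proportion to the target. A small Python sketch, where the log error of 0.1 and the target values are arbitrary numbers picked for illustration:

```python
import math


def back_transformed_error(y_true, log_error):
    """Absolute error after exp() when the log-space prediction is off by log_error."""
    y_pred = math.exp(math.log(y_true) + log_error)
    return y_pred - y_true


LOG_ERROR = 0.1  # same error in log space for every target (assumed value)
targets = [10.0, 100.0, 1000.0, 5000.0]

abs_errors = [back_transformed_error(y, LOG_ERROR) for y in targets]
rel_errors = [err / y for err, y in zip(abs_errors, targets)]

for y, abs_err, rel_err in zip(targets, abs_errors, rel_errors):
    print(f"y={y:>7.1f}  absolute error={abs_err:8.2f}  relative error={rel_err:.4f}")
```

The relative error is the same for every target (exp(0.1) - 1, about 10.5%), but the absolute error, and therefore the MSE on the original scale, is dominated by the large targets. So comparing errors per fold on a relative scale, or on the log scale, may give a fairer picture of where the model actually fits badly.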

How to store set of numbers [closed]

I have a set of 1,000,000 unique numbers. The numbers are in the interval between 0 and 50,000,000. Assume the numbers are random. I need a data structure which would hold them all. The data structure should require as little memory as possible. It should be possible to find out quickly, and with no errors, whether a number is in the set.
I found a solution with a Bloom filter. Yes, a Bloom filter has a probability of false positives, but since there are "just" 50,000,000 possible numbers, I can find all the errors and keep them in a std::set. With this method, I'm able to store all the numbers in 2.3MB of memory.
Can you find a better method?
Rather than one range of 0 to 50,000,000, how about 1,024 separate ranges of 65,536 values each? That would cover 67,108,864 values (64M). I suppose you can make it 763 ranges rather than 1,024, which covers 50,003,968 values.
Something like ushort[763][];
Now you're storing 1,000,000 16-bit values rather than 32-bit values.
The values in each row are sorted. So to determine if a number is in the set, you divide the number by 65,536 to figure out which array to look in, and then do a binary search for number % 65536 in that array.
Storage for the numbers themselves is 2,000,000 bytes, plus a small amount of overhead for the arrays.
This will be faster in execution and smaller than your Bloom filter approach, has no false positives, and is a whole lot easier to implement.
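A rough Python sketch of this bucketed layout, using array("H") as a stand-in for ushort[] and the standard bisect module for the binary search. The function names and the tiny sample set are made up for illustration.

```python
from array import array
from bisect import bisect_left

BUCKET_SIZE = 65536                           # values per bucket, fits in 16 bits
MAX_VALUE = 50_000_000
NUM_BUCKETS = MAX_VALUE // BUCKET_SIZE + 1    # 763 buckets


def build_buckets(numbers):
    """Split the numbers into sorted buckets of 16-bit offsets."""
    buckets = [array("H") for _ in range(NUM_BUCKETS)]
    for n in sorted(numbers):
        buckets[n // BUCKET_SIZE].append(n % BUCKET_SIZE)
    return buckets


def contains(buckets, n):
    """Exact membership test: pick the bucket, then binary-search the offset."""
    if not 0 <= n <= MAX_VALUE:
        return False
    bucket = buckets[n // BUCKET_SIZE]
    offset = n % BUCKET_SIZE
    i = bisect_left(bucket, offset)
    return i < len(bucket) and bucket[i] == offset


sample = [0, 1, 65535, 65536, 65537, 49_999_999, 50_000_000]
buckets = build_buckets(sample)
```

Because the input is sorted before it is distributed, every bucket ends up sorted as well, which is what the binary search relies on.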
The minimum space to store such a set in general is 884,002 bytes. That stores an integer index (a very large integer) into the list of all possible choices of 1,000,000 out of 50,000,000.
You can get close to that with a simple, fast byte encoding. Given the sorted list of numbers, replace each number with the difference from the previous number. (Assume that -1 precedes the first number.) The differences are all one or more, so subtract one. If the result is 254 or less, code it as a single byte. Otherwise, write 255 and follow it with two bytes holding the result minus 255. If it doesn't fit in that, write three 255's and follow them with three bytes holding the result. This will almost always code the set in less than 1,012,000 bytes.
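Here is one way the byte encoding could look in Python. The answer leaves a few details open, so this sketch makes some assumptions of its own: multi-byte values are stored big-endian, the two-byte value 0xFFFF is reserved as the escape that forms the "three 255's", and a gap that does not fit in three bytes is rejected (with random input that should practically never happen).

```python
def encode(sorted_numbers):
    """Delta-encode a sorted list of unique non-negative integers into bytes."""
    out = bytearray()
    prev = -1
    for n in sorted_numbers:
        gap = n - prev - 1          # differences are >= 1, so subtract one
        prev = n
        if gap <= 254:
            out.append(gap)                          # 1 byte
        elif gap - 255 < 0xFFFF:
            out.append(255)                          # escape + 2 bytes
            out += (gap - 255).to_bytes(2, "big")
        elif gap < 1 << 24:
            out += b"\xff\xff\xff"                   # three 255's + 3 bytes
            out += gap.to_bytes(3, "big")
        else:
            raise ValueError("gap does not fit in three bytes")
    return bytes(out)


def decode(data):
    """Inverse of encode()."""
    numbers = []
    prev = -1
    i = 0
    while i < len(data):
        if data[i] < 255:
            gap = data[i]
            i += 1
        else:
            two = int.from_bytes(data[i + 1:i + 3], "big")
            if two < 0xFFFF:
                gap = two + 255
                i += 3
            else:
                gap = int.from_bytes(data[i + 3:i + 6], "big")
                i += 6
        prev += gap + 1
        numbers.append(prev)
    return numbers


sample = [0, 1, 5, 260, 261, 1000, 70000, 200000, 10_000_000]
encoded = encode(sample)
decoded = decode(encoded)
```

Note that this form only supports sequential decoding, so a membership test means walking the bytes from the start (or from some stored checkpoint); it trades lookup speed for space compared with the sorted 16-bit arrays.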

how to write algorithm to insert multiple elements in queue [closed]

How can we write an algorithm to add multiple elements, say 5 elements {1,2,3,4,5}, to a queue?
I searched a lot but only found an algorithm to insert one item, and I don't know how to run a loop to insert multiple elements.
The algorithm to insert one item which I found is:
1 Start
2 Check if the Queue is full or not: if (rear = N-1) THEN print “Queue is Full” and exit, else go to step 3
3 Increment the rear: ++rear;
4 Add the item at the ‘rear’ position: Q[rear] = item;
5 Exit
0 Start
1 Initialize index variable i to 0
2 Check if the number of elements inserted so far (i) is equal to M (where M is the number of elements to insert). If it is, go to step 7.
3 Check if the Queue is full or not: if (rear = N-1) THEN print “Queue is Full” and exit
4 Increment the rear: ++rear;
5 Add the item at the ‘rear’ position: Q[rear] = items[i];
6 Increment the index variable i and go to step 2
7 Exit
Alternatively, you could check whether the queue has space for M elements before the loop. Steps 1 to 6 can be implemented using a for loop (of course, any other loop should do the trick).
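The steps above could be sketched in Python like this. It is only an illustration of the loop: the ArrayQueue class and its method names are made up, and dequeuing is left out entirely.

```python
class ArrayQueue:
    """Very small array-backed queue with a fixed capacity N (insert side only)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.rear = -1                      # index of the last inserted item

    def is_full(self):
        return self.rear == len(self.slots) - 1

    def enqueue(self, item):
        """Single-item insert: the algorithm from the question."""
        if self.is_full():
            print("Queue is Full")
            return False
        self.rear += 1
        self.slots[self.rear] = item
        return True

    def enqueue_many(self, items):
        """Insert items one by one; stop early if the queue fills up."""
        inserted = 0
        for item in items:                  # steps 1, 2 and 6 collapse into the for loop
            if not self.enqueue(item):      # steps 3 to 5
                break
            inserted += 1
        return inserted


queue = ArrayQueue(8)
inserted = queue.enqueue_many([1, 2, 3, 4, 5])

small = ArrayQueue(3)
inserted_small = small.enqueue_many([1, 2, 3, 4, 5])
```

Returning the number of items actually inserted lets the caller notice a partial insert. The alternative mentioned above, checking for M free slots before the loop, would instead refuse the whole batch up front.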

Separate objects from binary volume [closed]

I'm using MATLAB.
I have a three-dimensional array filled with logicals. This array represents data of a cylinder with N uniformly shaped but arbitrarily oriented staples in it. The volume is discretized into voxels (3-dimensional pixels), and a logical '1' means 'at this point in the cylinder IS a part of a staple', while a '0' means 'at this point in the cylinder is air'.
The following picture contains ONE two-dimensional slice of the full volume. Imagine the complete volume composed of such slices. White means '1' and black means '0'.
Now to my problem: I have to separate each staple as well as possible.
The output should be N three-dimensional arrays in which only the voxels belonging to a certain staple are '1' and everything else is '0', so that each array contains the data of only one staple.
The biggest problem is that '1's of different staples can lie next to each other (touching each other and being entangled), making it difficult to decide which staple they belong to.
One simplification is that boundary voxels of a staple may be cut away; I can work with any output array which preserves the approximate shape of the original staple.
Maybe one of you can suggest an idea for how such a problem could be solved, or even name algorithms which I can take a look at.
Thanks in advance.
Since the staples are objects of many voxels, you can start by reducing noise using 3D median filtering or bwareaopen. Then bwlabeln can be used to label the connected components in the binary array. Then you can use REGIONPROPS to further analyze each connected object and see whether it is a standalone staple or several staples stuck together. This can be done using features such as 'Area' (the voxel count for a 3D label matrix) to identify the different cases; note that 'Perimeter' is, as far as I know, only defined for 2D input. You'll have to investigate these and other regionprops features yourself.
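To show what the labeling step does, here is a plain Python sketch of 3D connected-component labeling with a voxel count per component. It is not MATLAB and not a replacement for bwlabeln: it uses nested lists, a hand-written breadth-first search, and 6-connectivity (face neighbours only), whereas bwlabeln defaults to 26-connectivity for 3D input. The function name and the toy volume are invented.

```python
from collections import deque

# Face neighbours only (6-connectivity); an assumption made for this sketch.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def label_components(volume):
    """Label connected groups of 1-voxels; return (labels, sizes per label)."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    labels = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    sizes = {}
    next_label = 0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not volume[z][y][x] or labels[z][y][x]:
                    continue
                # Found an unlabeled staple voxel: flood-fill its component.
                next_label += 1
                labels[z][y][x] = next_label
                queue = deque([(z, y, x)])
                count = 0
                while queue:
                    cz, cy, cx = queue.popleft()
                    count += 1
                    for dz, dy, dx in NEIGHBORS:
                        z2, y2, x2 = cz + dz, cy + dy, cx + dx
                        if (0 <= z2 < nz and 0 <= y2 < ny and 0 <= x2 < nx
                                and volume[z2][y2][x2]
                                and not labels[z2][y2][x2]):
                            labels[z2][y2][x2] = next_label
                            queue.append((z2, y2, x2))
                sizes[next_label] = count
    return labels, sizes


# Toy volume: two slices of 3 x 3 voxels containing two separate objects.
volume = [
    [[1, 1, 0], [0, 0, 0], [0, 0, 1]],
    [[0, 1, 0], [0, 0, 0], [0, 0, 1]],
]
labels, sizes = label_components(volume)
```

Since all staples have the same shape, a component whose voxel count is roughly a multiple of the typical staple size is a hint that several touching staples were merged into one label. Those components would then need a further splitting step, which this sketch does not attempt.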
