Size limitation when drawing to CanvasRenderingContext2D - html5-canvas

I HAVE HEAVILY EDITED THIS QUESTION TO PROVIDE MORE DETAILS
I came across a limitation when drawing to CanvasRenderingContext2D via EaselJS framework. I have objects like this:
but when the position of those objects surpasses a couple of million pixels the drawings start to crumble apart. This is the same object with x position 58524928. (The parent container is moved to -58524928 so that we can see the object on stage.) The more I offset the object, the more it crumbles. Also, when I try to move the object - drag it with the mouse - it "jumps" as if it were snapped to a large grid.
This is the EaselJS framework, and the shapes are ultimately drawn to the CanvasRenderingContext2D via the drawImage() method. Here is a snippet from the code:
ctx.drawImage(cacheCanvas, this._cacheOffsetX+this._filterOffsetX, this._cacheOffsetY+this._filterOffsetY, cacheCanvas.width/scale, cacheCanvas.height/scale);
I suppose it has something to do with the limited number of real numbers in JavaScript:
Note that there are infinitely many real numbers, but only a finite
number of them (18437736874454810627, to be exact) can be represented
exactly by the JavaScript floating-point format. This means that when
you're working with real numbers in JavaScript, the representation of
the number will often be an approximation of the actual number.
Source: JavaScript: The Definitive Guide
Can someone confirm or reject my assumption? 58 million (58524928) does not seem like much to me - is it some inefficiency of EaselJS, or is it a limit of the canvas?
PS:
Scaling has no effect. I have drawn everything 1000 times smaller and 1000 times closer with no improvement. Equally, if you scale the object up 1000 times while keeping x at 58 million, it will not look crumbled. But move it to 50 billion and you are back where you started. Basically, offset divided by size is a constant limit for detail.
EDIT
Here is an example: jsfiddle.net/wzbsbtgc/2. Basically there are two separate problems:
If I use huge numbers as parameters for the drawing itself (red curve) it will be distorted. This can be avoided by using smaller numbers and moving the DisplayObject instead (blue curve).
In both cases it is not possible to move the DisplayObject by 1px. I think this is explained in GameAlchemist's post.
Any advice/workaround for the second problem is welcome.

It appears that Context2D uses lower precision numbers for transforms. I haven't confirmed the precision yet, but my guess is that it is using floats instead of doubles.
As such, with higher values, the transform method (and other similar C2D methods) that EaselJS relies on loses precision, similar to what GameAlchemist describes. You can see this issue reproduced using pure C2D calls here:
http://jsfiddle.net/9fLff2we/
The best workaround that I can think of would be to precalculate the "final" values outside the transform methods. Normal JS numbers are higher precision than what C2D is using, so this should solve the issue. Really rough example to illustrate:
http://jsfiddle.net/wzbsbtgc/3/
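For instance, a minimal sketch of that idea (the names worldX/viewOriginX below are made up for illustration, not EaselJS API): keep the huge world coordinates in ordinary JavaScript doubles, subtract the view origin yourself, and hand the canvas only small, view-relative numbers:

// Sketch: keep huge world coordinates in 64-bit JS doubles and convert them
// to small, view-relative coordinates before any Context2D call.
// worldX/worldY and viewOriginX/viewOriginY are hypothetical names.
function drawSprite(ctx, img, worldX, worldY, viewOriginX, viewOriginY) {
    // The subtraction happens in double precision, so nothing is lost here.
    var screenX = worldX - viewOriginX;
    var screenY = worldY - viewOriginY;

    // The canvas only ever sees small numbers, so the lower-precision
    // transform path never has to represent the multi-million offset.
    ctx.drawImage(img, screenX, screenY);
}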

The behavior that you see is related to the way figures are represented in the IEEE 754 standard.
While JavaScript uses 64-bit floats, WebGL uses only 32-bit floats, and since most (all?) canvases are WebGL accelerated, all your numbers will be converted down before the draw.
The IEEE 754 32-bit format uses 32 bits to represent a number: 1 bit for the sign, 8 exponent bits, and only 23 bits for the mantissa.
Let's call the limit IEEE_754_32max:
IEEE_754_32max = (1 << 23) - 1 = 8,388,607 (8+ million)
We can have full precision for integers only in the [-IEEE_754_32max, IEEE_754_32max] range.
Beyond that point, the exponent will be used, and we'll lose the low bits of the mantissa.
For instance, (10 million + 1) = 10,000,001 is too big: it can't fit into 23 bits, so it will be stored as 10,000,000 - we lost the final '1'.
The grid effect that you see is linked to the exponent being used and precision being lost: with a figure such as 58524928, we need 26 bits to represent it. So 3 bits are lost, and we have, for instance:
58524928 + 7 == 58524928
So a figure that is near 58524928 will either be rounded to 58524928 or 'jump' to the nearest possible figure: that is your grid effect.
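You can verify the snapping yourself in plain JavaScript with Math.fround, which rounds a double to the nearest 32-bit float (the exact grid step depends on whether you count the implicit leading mantissa bit, but the effect is the same):

Math.fround(16777217);  // 16777216 -- 2^24 + 1 has no exact 32-bit representation
Math.fround(58524929);  // 58524928 -- snaps to the nearest representable float
Math.fround(58524929) === Math.fround(58524928);  // true: both land on the same value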
Solution?
-->> Change the units you are using for your application to have much smaller figures. Maybe you're using mm --> use meters or kilometers instead.
Mind that the precision you are using is an illusion: display resolution is the first limit, and the mouse is at most 1 pixel precise, so even with a 4K display, 32-bit floats are not the limiting factor.
Choose the right measurement unit to fit all your coordinates in a smaller range and you'll solve your issue.
More clearly: you must change the units you are using for the display. This does not mean you have to trade away accuracy: you just have to do the translation + scaling yourself before drawing. That way you still use JavaScript's IEEE 754 64-bit accuracy, and you no longer hit the 32-bit rounding issue.
(You might override the x, y properties with getters/setters, e.g.
Object.defineProperty(targetObject, 'x', { get: function () { return view.pixelWidth * (this.rx - view.left) / view.width; } });
)

You can use any sized drawing coordinates that you desire.
Canvas will clip your drawing to the display area of the canvas element.
For example, here's a demo that starts drawing a line from x = -50000000 and finishes on the canvas. Only the visible portion of the line is rendered. All non-visible (off-canvas) points are clipped.
var canvas=document.getElementById("canvas");
var ctx=canvas.getContext("2d");
var cw=canvas.width;
var ch=canvas.height;
ctx.beginPath();
ctx.moveTo(-50000000,100);
ctx.lineTo(150,100);
ctx.stroke();
body{ background-color: ivory; padding:10px; }
#canvas{border:1px solid red;}
<h4>This line starts at x = negative 50 million!</h4>
<canvas id="canvas" width=300 height=300></canvas>

Remember that the target audience for the W3C standard is mainly browser vendors. The unsigned long value (i.e. 2^32) addresses more the underlying system for creating a bitmap by the browser. The standard says values in this range are valid, but there is no guarantee the underlying system will be able to provide a bitmap that large (most browsers today limit the bitmap to much smaller sizes than this). You stated that you don't mean the canvas element itself, but the linked reference is the interface definition of the element, so I just wanted to point that out in regard to the number range.
From the JavaScript side of things, where we developers usually are, and with the exception of typed arrays, there is no such thing as ulong etc. Only Number (aka unrestricted double) which is signed and stores numbers in 64-bit, formatted as IEEE-754.
The valid range for Number is:
Number.MIN_VALUE = 5e-324
Number.MAX_VALUE = 1.7976931348623157e+308
You can use any values in this range with canvas for your vector paths. Canvas will clip them to the bitmap based on the current transformation matrix when the paths are rasterized.
If by drawing you mean another bitmap (i.e. Image, Canvas, Video), then it will be subject to the same system and browser capabilities/restrictions as the target canvas itself. Positioning (direct or via transformation) is limited (in sum) by the range of a Number.

Related

Compressed representation of polygons

I have a lot (millions) of polygons from openstreetmap-data, mostly (more than 99%) with exactly four coordinates representing houses.
Example
I currently save the four coordinates for each house explicitly as Tuple of floats (Latitude and Longitude), hence taking 32 bytes of memory.
Is there a way to store this information in a compressed way (fewer than 32 bytes), since the four coordinates only differ very little in the last decimals?
If your map patch is not too large, you can store relative coordinates against some base point (for example, the bottom left corner). Get these differences and normalize them by the map size like this:
uint16_diff = (uint16) 65535 * (lat - latbottom) / (lattop - latbottom)
This approach allows to store 16-bit integer values.
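A rough sketch of that quantization (latBottom/latTop are the bounding box of the patch; longitude works the same way):

// Quantize an absolute latitude into a 16-bit offset relative to the patch.
function quantizeLat(lat, latBottom, latTop) {
    var t = (lat - latBottom) / (latTop - latBottom);  // 0..1 inside the patch
    return Math.round(t * 65535) & 0xFFFF;             // fits in a 16-bit integer
}

// Reverse the mapping; the error is bounded by half a quantization step.
function dequantizeLat(q, latBottom, latTop) {
    return latBottom + (q / 65535) * (latTop - latBottom);
}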
For rectangles (you can store them in a separate list) there is a way to store 5 16-bit values instead of 8 values - the coordinates of the top left corner, width, height, and angle of rotation (other sets of data are possible, for example, including the second corner).
Combining both of these methods, one might reduce the data size by up to 3.2 times.
As @MBo said, you can store one corner of each house and store the other three corners relative to that first corner.
Also, if the buildings are very similar, you can build a "dictionary" of buildings. For each building you store its index in the dictionary and some features, like its first corner coordinates and rotation.
You are giving no information on the resolution you want to keep.
Assuming 1 m accuracy is enough, 24 bits can cover up to 16000 km. Then 8 bits should also be enough to represent the size information (up to 256 m).
This would make 8 bytes per house.
More aggressive compression for instance with Huffman coding will probably not work on the locations (relatively uniform distribution); a little better on the sizes, but the benefit is marginal.
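As a sketch of that 8-bytes-per-house layout (the coordinates x, y are assumed to already be metre offsets from the patch origin, and w, h the sizes in metres; field widths as described above):

// Pack one house: 24-bit x, 24-bit y (1 m resolution, up to ~16,000 km)
// plus 8-bit width and height (up to 255 m) = 8 bytes total.
function packHouse(x, y, w, h) {
    var view = new DataView(new ArrayBuffer(8));
    view.setUint8(0, (x >> 16) & 0xFF);
    view.setUint8(1, (x >> 8) & 0xFF);
    view.setUint8(2, x & 0xFF);
    view.setUint8(3, (y >> 16) & 0xFF);
    view.setUint8(4, (y >> 8) & 0xFF);
    view.setUint8(5, y & 0xFF);
    view.setUint8(6, w & 0xFF);
    view.setUint8(7, h & 0xFF);
    return view.buffer;
}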

Summed area table in GLSL and GPU fragment shader execution

I am trying to compute the integral image (aka summed area table) of a texture I have in the GPU memory (a camera capture), the goal being to compute the adaptive threshold of said image. I'm using OpenGL ES 2.0, and still learning :).
I did a test with a simple gaussian blur shader (vertical/horizontal pass), which is working fine, but I need a way bigger variable average area for it to give satisfactory results.
I did implement a version of that algorithm on CPU before, but I'm a bit confused on how to implement that on a GPU.
I tried to do a (completely incorrect) test with just something like this for every fragment:
#version 100
#extension GL_OES_EGL_image_external : require
precision highp float;

uniform sampler2D u_Texture;        // The input texture.
varying lowp vec2 v_TexCoordinate;  // Interpolated texture coordinate per fragment.
uniform vec2 u_PixelDelta;          // Pixel delta

void main()
{
    // get neighboring pixels values
    float center = texture2D(u_Texture, v_TexCoordinate).r;
    float a = texture2D(u_Texture, v_TexCoordinate + vec2(u_PixelDelta.x * -1.0, 0.0)).r;
    float b = texture2D(u_Texture, v_TexCoordinate + vec2(0.0, u_PixelDelta.y * 1.0)).r;
    float c = texture2D(u_Texture, v_TexCoordinate + vec2(u_PixelDelta.x * -1.0, u_PixelDelta.y * 1.0)).r;

    // compute value
    float pixValue = center + a + b - c;

    // Result stores value (R) and original gray value (G)
    gl_FragColor = vec4(pixValue, center, center, 1.0);
}
And then another shader to get the area that I want and then get the average. This is obviously wrong, as there are multiple execution units operating at the same time.
I know that the common way of computing a prefix sum on a GPU is to do it in two passes (vertical/horizontal, as discussed on this thread or here), but isn't there a problem here, as there is a data dependency on each cell from the previous (top or left) one?
I can't seem to understand the order in which the multiple execution units on a GPU will process the different fragments, and how a two-pass filter can solve that issue. As an example, if I have some values like this :
2 1 5
0 3 2
4 4 7
The two pass should give (first columns then rows):
2 1 5 2 3 8
2 4 7 -> 2 6 13
6 8 14 6 14 28
How can I be sure that, as an example, the value [0;2] will be computed as 6 (2 + 4) and not 4 (0 + 4, if the 0 hasn't been computed yet) ?
Also, as I understand it, fragments are not pixels (if I'm not mistaken); would the values I store back into one of my textures in the first pass be the same in another pass if I use the exact same coordinates passed from the vertex shader, or will they be interpolated in some way?
Tommy and Bartvbl address your questions about a summed-area table, but your core problem of an adaptive threshold may not need that.
As part of my open source GPUImage framework, I've done some experimentation with optimizing blurs over large radii using OpenGL ES. Generally, increasing blur radii leads to a significant increase in texture sampling and calculations per pixel, with an accompanying slowdown.
However, I found that for most blur operations you can apply a surprisingly effective optimization to cap the number of blur samples. If you downsample the image before blurring, blur at a smaller pixel radius (radius / downsampling factor), and then linearly upsample, you can arrive at a blurred image that is the equivalent of one blurred at a much larger pixel radius. In my tests, these downsampled, blurred, and then upsampled images look almost identical to the ones blurred based on the original image resolution. In fact, precision limits can lead to larger-radii blurs done at a native resolution breaking down in image quality past a certain size, where the downsampled ones maintain the proper image quality.
By adjusting the downsampling factor to keep the downsampled blur radius constant, you can achieve near constant-time blurring speeds in the face of increasing blur radii. For an adaptive threshold, the image quality should be good enough to use for your comparisons.
I use this approach in the Gaussian and box blurs within the latest version of the above-linked framework, so if you're running on Mac, iOS, or Linux, you can evaluate the results by trying out one of the sample applications. I have an adaptive threshold operation based on a box blur that uses this optimization, so you can see if the results there are what you want.
As per the above, it's not going to be fantastic on a GPU. But assuming the cost of shunting data between the GPU and CPU is more troubling, it may still be worth persevering.
The most obvious prima facie solution is to split horizontal/vertical as discussed. Use an additive blending mode, create a quad that draws the whole source image, then e.g. for the horizontal step on a bitmap of width n issue a call that requests the quad be drawn n times, the 0th time at x = 0, the mth time at x = m. Then ping-pong via an FBO, switching the target buffer of the horizontal draw into the source texture for the vertical.
Memory accesses are probably O(n^2) (i.e. you'll probably cache quite well, but that's hardly a complete relief) so it's a fairly poor solution. You could improve it by divide and conquer by doing the same thing in bands — e.g. for the vertical step, independently sum individual rows of 8, after which the error in every row below the final is the failure to include whatever the sums are on that row. So perform a second pass to propagate those.
However an issue with accumulating in the frame buffer is clamping to avoid overflow — if you're expecting a value greater than 255 anywhere in the integral image then you're out of luck because the additive blending will clamp and GL_RG32I et al don't reach ES prior to 3.0.
The best solution I can think of to that, without using any vendor-specific extensions, is to split up the bits of your source image and combine channels after the fact. Supposing your source image were 4 bit and your image less than 256 pixels in both directions, you'd put one bit each in the R, G, B and A channels, perform the normal additive step, then run a quick recombine shader as value = A + (B*2) + (G*4) + (R*8). If your texture is larger or smaller in size or bit depth then scale up or down accordingly.
(platform specific observation: if you're on iOS then you've hopefully already got a CVOpenGLESTextureCache in the loop, which means you have CPU and GPU access to the same texture store, so you might well prefer to kick this step off to GCD. iOS is amongst the platforms supporting EXT_shader_framebuffer_fetch; if you have access to that then you can write any old blend function you like and at least ditch the combination step. Also you're guaranteed that preceding geometry has completed before you draw so if each strip writes its totals where it should and also to the line below then you can perform the ideal two-pixel-strips solution with no intermediate buffers or state changes)
What you attempt to do cannot be done in a fragment shader. GPUs are by nature very different from CPUs: they execute their instructions in parallel, in massive numbers at the same time. Because of this, OpenGL does not make any guarantees about execution order; the hardware physically doesn't allow it.
So there is not really any defined order other than "whatever the GPU thread block scheduler decides".
Fragments are pixels, sorta-kinda. They are pixels that potentially end up on screen. If one triangle ends up in front of another, the previously calculated colour value is discarded. This happens regardless of whatever colour was stored at that pixel in the colour buffer previously.
As for creating the summed area table on the GPU, I think you may first want to look at GLSL "Compute Shaders", which are specifically made for this sort of thing.
I think you may be able to get this to work by creating a single thread for each row of pixels in the table, then have every thread "lag behind" by 1 pixel compared to the previous row.
In pseudocode:
int row_id = thread_id()
for column_index in (image.cols + image.rows):
    int my_current_column_id = column_index - row_id
    if my_current_column_id >= 0 and my_current_column_id < image.width:
        // calculate sums
It may look surprising to a beginner, but the prefix sum or SAT calculation is suitable for parallelization. While the Hensley algorithm is the most intuitive to understand (and is also implemented in OpenGL), more work-efficient parallel methods are available; see CUDA scan. The paper from Sengupta discusses a parallel method with reduce and down-sweep phases, which seems to be the state-of-the-art efficient method. These are valuable materials, but they do not go into OpenGL shader implementations in detail. The closest document is the presentation you have found (it refers to the Hensley publication), since it has some shader snippets. This is a job that is doable entirely in a fragment shader with FBO ping-pong. Note that the FBO and its texture need to have their internal format set to high precision - GL_RGB32F would be best, but I am not sure if it is supported in OpenGL ES 2.0.

Detect uniform images that (most probably) are not photographs

Take a look at these two example images:
I would like to be able to identify these types of images inside large set of photographs and similar images. By photograph I mean a photograph of people, a landscape, an animal etc.
I don't mind if some photographs are falsely identified as these uniform images but I wouldn't really want to "miss" some of these by identifying them as photographs.
The simplest thing that came to my mind was to analyze the images pixel by pixel to find highest and lowest R,G,B values (each channel separately). If the difference between lowest and highest value is large, then there are large color changes and such image is probably a photograph.
Another idea was to analyze the hue value of each pixel in a similar fashion. The problem is that in the HSL model, orangish-red and pinkish-red have roughly a 350 degree difference when looking clockwise and a 10 degree difference when looking counterclockwise. So I can't just compare each pixel's hue component because I'll get some weird results.
Also, there is a problem of noise - one white or black pixel will ruin tests like that. So I would need to somehow exclude extreme values if there are only few pixels with such extremes. But at this point it gets more and more complicated and I'm feeling it's not the best approach.
I was also thinking about bumping contrast to the max and then running test like the RGB one I described above. It would probably make things easier but still one or two abnormal pixels would ruin the test anyway. How to deal with such cases?
I don't mind running a few different algorithms that would cover different image types. But please note that I'm dealing with images from digital cameras, so 6MP, 12MP or even 16MP are quite common. Because of that, running computation-intensive algorithms is not desirable. I deal with hundreds or even thousands of images and have only limited CPU resources for image processing. Let's say a second or two per large image is the maximum I can accept.
I'm aware that for example a photograph of a blue sky might trigger a false positive, but that's OK. False positives are better than misses.
This is how I would do it (the whole method is below, at the bottom of the post, but just read from top to bottom):
Your quote:
"By photograph I mean a photograph of people, a landscape, an animal
etc."
My response to your quote:
This means that such images have edges and contours. The images you are trying to separate out have no edges or few contours (for the second example image, at least).
Your quote:
one white or black pixel will ruin tests like that. So I would need to
somehow exclude extreme values if there are only few pixels with such
extremes
My response:
Minimizing the noise through methods like DoG (Difference of Gaussians) etc. will reduce the noisy, individual pixels.
So I have taken your images and run them through the following code:
cv::cvtColor(image, imagec, CV_BGR2GRAY); // where image is the example image you shown
cv::GaussianBlur(imagec,imagec, cv::Size(3,3), 0, 0, cv::BORDER_DEFAULT ); //blur image
cv::Canny(imagec, imagec, 20, 60, 3);
Results for example image 1 you gave:
As you can see, after going through the code the image became blank (totally black). The image is quite big, hence a bit difficult to show all in one window.
Results for example 2 you showed me:
The outline can be seen, but one method to solve this is to introduce an ROI inset about 20 to 30 pixels from the edges of the image; for instance, if the image dimension is 640x320, the ROI may be 610x290, placed in the center of the image.
So now, let me introduce you my real method:
1) Run all the images through the code above to find edges.
2) Check which images don't have any edges (images with no edges will have no pixels with values greater than 0, or only very few such pixels, so set a slightly higher threshold to play it safe; you adjust how many pixels are allowed according to your images).
3) Save/name all the images without edges; these will be the images you are trying to separate out from the rest.
4) The end.
EDIT (to answer the comment; I would have commented back, but my reply is too long):
True about the blurring part. To minimise the use of blurring, you can first do an "elimination-like" pass, so smooth images like example 1 will already be separated and categorised as the images you are looking for.
From there you do a second test for the remaining images, which will be the "blurring".
If you really wish to avoid blurring, what I notice is that your example image 1 can be categorised as a "smooth surface" while your example image 2 can be categorised as a "rough-like surface", meaning it is noisy, which is what led me to introduce the blurring in the first place.
From my experience, and if I remember correctly, such rough-like surfaces work very well with the "watershed" or "clustering through colour" methods; they blend very well, unlike the smooth images.
Since the leftover images are most likely rough images, you can try the watershed method; with Canny you will find it to be a black image, if I am not wrong. Try a line like this:
pyrMeanShiftFiltering( image, images, 10, 20, 3)
I am not sure whether such a method will be more expensive than Gaussian blurring, but you can try both and compare the computational speed of each.
In regard to your comment on grayscale images:
Converting to grayscale sounds risky - losing color information may trigger lots of false positives
My answer:
I don't really think so. If the images you are trying to segment out are of one colour, changing to grayscale doesn't matter. Of course, if you snap a photo of a blue sky, it might cause a false positive, but as you said, those are OK.
If you think about it, in images with people etc. in them, the intensity changes quite a lot (of course, unless your photograph has extreme cases, like a green ball on a field of grass).
I do admit that converting to grayscale loses information. But in your case, I doubt it will affect much; in fact, working with grayscale images is faster and less expensive.
I would use an entropy-based approach. I don't have any custom code to share, but the following blog entry should push you in the right direction.
http://envalo.com/image-cropping-php-using-entropy-explained/
The thing is that uniform images will have very low entropy compared to those with something interesting in them.
So the question is to find the correct threshold and process the whole set.
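If you want to try the entropy idea without the linked article, here is a rough sketch: build a grayscale histogram and compute the Shannon entropy; uniform images land near 0 bits, busy photographs come out much higher (the cutoff is something you would tune on your own set):

// Shannon entropy of an 8-bit grayscale image.
// `gray` is assumed to be an array of 0..255 luminance values.
function entropy(gray) {
    var hist = new Array(256).fill(0);
    for (var i = 0; i < gray.length; i++) hist[gray[i]]++;

    var h = 0;
    for (var v = 0; v < 256; v++) {
        if (hist[v] === 0) continue;
        var p = hist[v] / gray.length;
        h -= p * Math.log2(p);
    }
    return h;  // 0 for a perfectly uniform image, up to 8 for maximal variation
}

// e.g. treat anything below ~1 bit of entropy as "uniform" (tune the cutoff)
// var isUniform = entropy(gray) < 1.0;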
I would generate a color histogram for each image and compare how much they differ from a given pattern.
Maybe you want to normalize the brightness first to simplify the matching.
This is how I would solve it (a rough sketch follows the steps below):
Find the average R, G, and B values across the image
Calculate a value for each pixel that is the sum of the differences of each channel from the average
Remove the top 0.1% of values to ignore outliers
Check the largest remaining difference against a threshold (you'll probably need to determine this threshold by trial and error)
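A rough sketch of those four steps on canvas ImageData (the 0.1% cutoff and the final threshold are the values you would tune by trial and error, as noted above):

// data is ImageData.data (RGBA bytes). Returns the largest per-pixel
// deviation from the mean colour after dropping the top 0.1% of values.
function maxDeviation(data) {
    var n = data.length / 4, rSum = 0, gSum = 0, bSum = 0;
    for (var i = 0; i < data.length; i += 4) {
        rSum += data[i]; gSum += data[i + 1]; bSum += data[i + 2];
    }
    var rAvg = rSum / n, gAvg = gSum / n, bAvg = bSum / n;

    // Per-pixel sum of absolute channel differences from the average.
    var diffs = new Float64Array(n);
    for (var i = 0, p = 0; i < data.length; i += 4, p++) {
        diffs[p] = Math.abs(data[i] - rAvg) +
                   Math.abs(data[i + 1] - gAvg) +
                   Math.abs(data[i + 2] - bAvg);
    }

    // Drop the top 0.1% (stray noisy pixels) and take the largest remaining value.
    diffs.sort();
    return diffs[Math.max(0, Math.floor(n * 0.999) - 1)];
}

// var isUniform = maxDeviation(imageData.data) < THRESHOLD;  // tune THRESHOLD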
The following approach might be useful.
Derive the local binary pattern in a 5x5 window centered around every pixel. So for one pixel you have 15 boolean values. In some direction (clockwise or anticlockwise), count the number of 1-0 and 0-1 changes. This is the feature value of the center pixel.
For every 20x20 window, derive the variance of the pixel feature values.
If you take the variance of the variances, for a uniform image it should approach zero, whereas for other images it would be quite high. In this way there might be no need to fix thresholds, and the local binary pattern takes care of potential uneven illumination.
For each of the R, G, B channels, calculate the standard deviation of intensity. If it is low enough, you have a uniform image.
If you are worried about having different uniform areas, calculate the standard deviations for, say, each 20x20 square separately, then calculate the average of the standard deviations.
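A sketch of that per-channel standard deviation test on canvas ImageData (the threshold of 10 below is just a placeholder); the block-wise variant simply runs the same function on each 20x20 sub-region and averages the results:

// Standard deviation of one channel (offset 0 = R, 1 = G, 2 = B) of ImageData.data.
function channelStdDev(data, offset) {
    var n = data.length / 4, sum = 0, sumSq = 0;
    for (var i = offset; i < data.length; i += 4) {
        sum += data[i];
        sumSq += data[i] * data[i];
    }
    var mean = sum / n;
    return Math.sqrt(Math.max(0, sumSq / n - mean * mean));
}

// Uniform image: all three channels have a low spread.
// var isUniform = [0, 1, 2].every(function (c) { return channelStdDev(data, c) < 10; });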
You probably can solve your problem using machine learning (classification). It is easier than it sounds. I will give an example:
1 - Feature extraction: compute a color histogram from all images (a histogram of RGB values). Probably you will want to reduce the number of possible values of R, G and B so your histogram does not grow too large (this is known as requantization). For example, you could make a histogram that accepts 4 different values of R, G and B, yielding a histogram with 4*4*4 = 64 bins: [(R=1, G=1, B=1), (R=1, G=1, B=2), ... (R=4, G=4, B=4)].
2 - Manually mark some images that you know are not photographs.
3 - Train a classifier: now that you have examples of images that are photographs and images that are not photographs, you can use this information to train a classifier. This classifier, given a histogram, can be used to predict whether an image is a photograph or not.
If you do not want to spend time on the classifier, you could try a simpler approach:
Compute the histogram of the image It that you want to classify as a photograph or not;
Compare the histogram of It with the histograms of all marked images and find the most similar histogram (for example, you can sum the absolute differences between bins);
If the image with the most similar histogram is a photograph, then classify It as a photograph. Otherwise, classify It as not being a photograph (a rough sketch of the histogram part follows below).
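A sketch of the requantized 4x4x4 histogram and the simple bin-difference comparison described above (nothing here is library-specific; the nearest-neighbour step is left as a comment):

// 4x4x4 RGB histogram (64 bins) from ImageData.data, normalized to sum to 1.
function rgbHistogram(data) {
    var hist = new Float64Array(64);
    var n = data.length / 4;
    for (var i = 0; i < data.length; i += 4) {
        var r = data[i] >> 6, g = data[i + 1] >> 6, b = data[i + 2] >> 6;  // each 0..3
        hist[r * 16 + g * 4 + b] += 1 / n;
    }
    return hist;
}

// Distance between two histograms: sum of absolute bin differences.
function histogramDistance(a, b) {
    var d = 0;
    for (var i = 0; i < a.length; i++) d += Math.abs(a[i] - b[i]);
    return d;
}

// Nearest neighbour: give the new image the label of the marked image
// whose histogram has the smallest distance to it.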
Below is my answer. I wrote a simple demo in C to explain my idea; you can find it in a gist.
Assumptions:
one color/pixel contains three channels (four channels if you have alpha data)
every channel commonly uses 8 bits (256 levels)
Make some defines:
#define IMAGEWIDTH  20 // Assumed
#define IMAGEHEIGHT 20 // Assumed
#define CHANNELBIT  8
#define COLORLEVEL  256

typedef struct tagPixel
{
    unsigned int R : CHANNELBIT;
    unsigned int G : CHANNELBIT;
    unsigned int B : CHANNELBIT;
} Pixel;
Collect every count of color for every COLORLEVEL in each channel:
void TraverseAndCount(Pixel image_data[IMAGEWIDTH][IMAGEHEIGHT]
                      , unsigned int red_counts[COLORLEVEL]
                      , unsigned int green_counts[COLORLEVEL]
                      , unsigned int blue_counts[COLORLEVEL]);
The next step is very important: analyze the color counts.
// just a very simple way to smooth the curve of the counts of colors
// and you can replace it with another way you want
unsigned int CalculateRange(unsigned int min_count
                            , unsigned int blur_size
                            , unsigned int color_counts[COLORLEVEL]);
This function does:
smooths the curve of each channel's counts along the COLORLEVEL axis by blur_size (you can smooth it another way)
calculates the range of counts that are greater than min_count
Finally, calculate the average of the range in each channel:
// calculate the average of the range for each channel of color
// the value is bigger if the image is more probably a photograph
float AverageRange(unsigned int min_count, unsigned int blur_size
                   , unsigned int red_counts[COLORLEVEL]
                   , unsigned int green_counts[COLORLEVEL]
                   , unsigned int blue_counts[COLORLEVEL]);
Note:
the result depends on min_count; min_count should be bigger than 0.
a bigger result means the image is more probably a photograph.
for a photograph, a bigger result is more likely with a smaller min_count.

XNA 2D Camera losing precision

I have created a 2D camera (code below) for a top-down game. Everything works fine when the player's position is close to 0.0x and 0.0y.
Unfortunately, as distance increases the transform seems to have problems: at around 0.0x, 30e7y (yup, that's 30 million y) the camera starts to shudder when the player moves (the camera gets updated with the player's position at the end of each update). At really big distances, a billion plus, the camera won't even track the player, as I'm guessing whatever error is in the matrix is amplified by too much.
My question is: is there a problem in the matrix, or is this standard behavior for extreme numbers?
Camera Transform Method:
public Matrix getTransform()
{
    Matrix transform;
    transform = Matrix.CreateTranslation(new Vector3(-position.X, -position.Y, 0)) *
                Matrix.CreateRotationZ(rotation) *
                Matrix.CreateScale(new Vector3(zoom, zoom, 1.0f)) *
                Matrix.CreateTranslation(new Vector3((viewport.Width / 2.0f), (viewport.Height / 2.0f), 0));
    return transform;
}
Camera Update Method:
This requests the object's position given its ID; it returns a basic Vector2, which is then set as the camera's position.
if (camera.CameraMode == Camera2D.Mode.Track && cameraTrackObject != Guid.Empty)
{
    camera.setFocus(quadTree.getObjectPosition(cameraTrackObject));
}
If anyone can see an error or enlighten me as to why the matrix struggles, I would be most grateful.
I have actually found the reason for this, it was something I should have thought of.
I'm using single precision floating points, which only have precision to 7 digits. That's fine for smaller numbers (up to around the 2.5 million mark I have found). Anything over this and the multiplication functions in the matrix start to gain precision errors as the floats start to truncate.
The best solution for my particular problem is to introduce some artificial scaling (I need the very large numbers as the simulation is set in space). I have limited my worlds to 5 million units squared (+/- 2.5 million units) and will come up with another way of granulating the world.
I also found a good answer about this here:
Vertices shaking with large camera position values
And a good article that discusses floating points in more detail:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Thank you for the views and comments!!

Compressing/packing "don't care" bits into 3 states

At the moment I am working on an on-screen display project with black, white and transparent pixels. (This is an open source project: http://code.google.com/p/super-osd; that page shows the 256x192 pixel set/clear OSD in development, but I'm migrating to a white/black/clear OSD.)
Since each pixel is black, white or transparent I can use a simple 2 bit/4 state encoding where I store the black/white selection and the transparent selection. So I would have a truth table like this (x = don't care):
B/W T
x 0 pixel is transparent
0 1 pixel is black
1 1 pixel is white
However, as can be clearly seen, this wastes one bit when the pixel is transparent. I'm designing for a memory-constrained microcontroller, so whenever I can save memory it is good.
So I'm trying to think of a way to pack these 3 states into some larger unit (say, a byte). I am open to using lookup tables to decode and encode the data, so a complex algorithm can be used, but it cannot depend on the states of the pixels before or after the current unit/byte (this rules out any proper data compression algorithm), and the size must be consistent; that is, a scene with all transparent pixels must take the same space as a scene with random noise. I was imagining something on the level of densely packed decimal, which packs 3 x 4-bit (0-9) BCD digits in only 10 bits, with something like 24 states remaining out of the 1024, which is great. So does anyone have any ideas?
Any suggestions? Thanks!
In a byte (256 possible values) you can store 5 of your three-state values. One way to look at it: three to the fifth power is 243, slightly less than 256. The fact that it's only slightly less also shows that you're hardly wasting a fraction of a bit.
For encoding five of your 3-state "digits" into a byte, think of taking a number in base 3 made from your five "digits" in succession -- the resulting value is guaranteed to be less than 243 and therefore directly storable in a byte. Similarly, for decoding, do the base-3 conversion of a byte's value.
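As a sketch of that base-3 packing (here 0, 1 and 2 stand for transparent, black and white, or whichever assignment you choose; the same arithmetic works in C on the microcontroller):

// Pack five 3-state pixel values (each 0, 1 or 2) into one byte (0..242).
function packFive(p) {
    return p[0] + 3 * (p[1] + 3 * (p[2] + 3 * (p[3] + 3 * p[4])));
}

// Unpack a byte back into the five 3-state values.
function unpackFive(b) {
    var out = [];
    for (var i = 0; i < 5; i++) {
        out.push(b % 3);
        b = Math.floor(b / 3);
    }
    return out;
}

// packFive([2, 1, 0, 2, 1]) === 2 + 3*1 + 9*0 + 27*2 + 81*1 === 140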
