I'm not going to turn the images into files yet (and I don't know if I'm ever going to do this).
The drawings are made by a custom-made drawing program (where the user draws). When I resize the application, the drawing disappears, because it's not being redrawn. And that's because the image is not being memorized in any way. I need an algorithm for memorizing the drawing, so it can be redrawn after the whole application refreshes.
One algorithm I thought of is to memorize the location and color of every pixel. But I don't think this is a good idea.
I'm currently using Java, but I need a language-agnostic algorithm. Still, I would accept a solution explained with code.
What algorithm should I use for memorizing the whole drawing?
You could memorize the user's actions: for example, if s/he draws a line, then memorize the start and end coordinates. If s/he draws freehand, then you memorize the individual pixels (you have to!).
This lets you resize, rotate, etc. any drawing just by manipulating the coordinates.
The "drawing" then becomes a list of actions:
{
LINE_DRAWING,
x1, y1, x2, y2,
pen, color, thickness...
}
{
...
}
To redraw, just scan the same list and call the appropriate subroutines again. Depending on the language, you can represent the list as an array, a linked list, or a doubly linked list, and implement things such as element deletion.
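A minimal sketch of such an action list in Python, just to illustrate the replay idea. The class names (`LineAction`, `StrokeAction`, `Drawing`) and the `RecordingCanvas` stand-in are made up for this example; a real program would call its toolkit's drawing routines instead.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class LineAction:
    """One straight line the user drew (hypothetical record layout)."""
    start: Point
    end: Point
    color: str = "black"
    thickness: float = 1.0

    def replay(self, canvas) -> None:
        canvas.draw_line(self.start, self.end, self.color, self.thickness)


@dataclass
class StrokeAction:
    """A freehand stroke, stored as the sequence of points the pen visited."""
    points: List[Point]
    color: str = "black"
    thickness: float = 1.0

    def replay(self, canvas) -> None:
        # Redraw the stroke as short segments between consecutive points.
        for a, b in zip(self.points, self.points[1:]):
            canvas.draw_line(a, b, self.color, self.thickness)


class RecordingCanvas:
    """Stand-in for a real drawing surface: it only records the calls."""
    def __init__(self) -> None:
        self.calls = []

    def draw_line(self, start, end, color, thickness) -> None:
        self.calls.append((start, end, color, thickness))


@dataclass
class Drawing:
    """The drawing is just the ordered list of actions."""
    actions: list = field(default_factory=list)

    def add(self, action) -> None:
        self.actions.append(action)

    def redraw(self, canvas) -> None:
        # Called from the repaint/resize handler: replay everything in order.
        for action in self.actions:
            action.replay(canvas)
```

Calling `redraw` from the repaint handler is what keeps the picture alive after a resize; undo could then be as simple as removing the last action and redrawing.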
On file, I would suggest some sort of tagged format:
two bytes - element type
four bytes - this element's length
variable-size data depending on element type
Again, to "load" the drawing you just scan the file sequentially and populate the memory structures.
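As a rough illustration of the tagged layout above, here is a sketch using Python's `struct` module. The little-endian byte order, the type code `LINE = 1`, the four-float line payload, and the choice that the length field counts only the payload are all assumptions made for the example.

```python
import io
import struct

# 2-byte element type + 4-byte payload length, little-endian, no padding.
HEADER = struct.Struct("<HI")

LINE = 1                             # hypothetical type code for a line element
LINE_PAYLOAD = struct.Struct("<4f")  # x1, y1, x2, y2 as 32-bit floats


def write_record(stream, element_type, payload):
    """Append one tagged record: header first, then the raw payload bytes."""
    stream.write(HEADER.pack(element_type, len(payload)))
    stream.write(payload)


def read_records(stream):
    """Scan the stream sequentially, yielding (element_type, payload) pairs."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of file
        if len(header) < HEADER.size:
            raise ValueError("truncated record header")
        element_type, length = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record payload")
        yield element_type, payload
```

Because every record carries its own length, a reader can skip element types it does not understand, which makes it easier to add new tools later without breaking old files.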
You can google 'vector drawing' for more details and hints.
There are lots of options. One is, as you say, to remember the image pixels. You can also simply remember all the user actions that generated the drawing and replay them when you need to reconstruct the drawing.
Another approach, depending on the tools that the drawing program offers the user, would be to build a more compact representation of the image. For instance, if the drawing program only offered the possibility of drawing lines, you could remember the set of line endpoints (and colors, line thicknesses, and whatever other line data was relevant). This generalizes in an obvious way to a larger set of geometric primitives.
For free-hand drawing, you can remember the stroke paths along with whatever stroke settings were set at the time. Depending on the complexity of the stroke tools your program offers, this may end up being more data than simply remembering the drawing pixels. However, it does allow, for instance, scaling the drawing if the canvas expands.
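To show why stored stroke paths scale where stored pixels do not, here is a small sketch. The function name `scale_stroke` and the (width, height) tuples are assumptions for the example; it simply maps each recorded point from the old canvas size to the new one.

```python
def scale_stroke(points, old_size, new_size):
    """Map stroke points recorded on a canvas of old_size onto new_size.

    points   -- list of (x, y) tuples recorded while the user was drawing
    old_size -- (width, height) of the canvas when the stroke was recorded
    new_size -- (width, height) of the canvas after the resize
    """
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    return [(x * sx, y * sy) for x, y in points]
```

The stroke settings (color, thickness) stay attached to the path, so the redraw after scaling is still crisp rather than a blown-up bitmap.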
Related
I am writing a GUI that is supposed to display entities of a system in a 2D coordinate system, which the user can select and drag around. The system is mirror-symmetric w.r.t. the x and y axes. Currently I am subclassing an entity using a QGraphicsRectItem so that I can drag it around in the first quadrant (x>0, y>0) of the coordinate system. I reimplemented the paint method to draw the other three additional rectangles with painter.drawRectangle(). So when I move the entity in quadrant 1, the elements in the other three quadrants perform mirror motions. That works well.
In the next stage, each entity can be subdivided, i.e. consist of hundreds of rectangles. So I need to draw hundreds of rectangles and that four times, with mirror operations. The naive approach takes four for-loops, but I'm wondering if there is a smarter way of doing this in Qt. The for-loops hurt a little because I'm using PyQt.
If your drawing operations are that slow, the simplest thing you could do is draw to an image, and then simply draw the cached painting from the image 4 times, which will be very fast, since it will just be copying some pixel values.
It might be efficient to cache the drawing result not on item basis but to cache a quadrant of your grid. This way if you zoom in and the items get huge or numerous in count, you won't be wasting lots of memory, instead you will only need one image cache that's the screen size of the quadrant.
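The idea can be sketched without any toolkit: render one quadrant once into a pixel grid, then build the full picture purely by copying. This is a plain-Python illustration with a made-up `mirror_four_ways` helper, not Qt code; in Qt the cached quadrant would presumably live in an image or pixmap and be drawn under mirrored transforms.

```python
def mirror_four_ways(quadrant):
    """Build the full image from one cached quadrant by mirroring.

    quadrant[y][x] holds the pixels for x >= 0, y >= 0. The result has twice
    the width and height; the axes run through the middle of the image.
    """
    # Mirror across the y axis: reversed copy on the left, original on the right.
    rows = [row[::-1] + row for row in quadrant]
    # Mirror across the x axis: reversed row order first, then the original rows.
    return rows[::-1] + rows
```

The expensive part (drawing hundreds of rectangles) happens once per change; the three extra quadrants cost only copies.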
It really depends on what you want to achieve, which at this point is not entirely clear from the description, and your image isn't showing either.
I have multiple rectangles and they all share the same spot color. Is there a way to merge / group them into one vector object so the generated pdf has smaller size?
If you are creating the document from scratch, then the answer is trivial: yes!
It's sufficient to draw all the paths of the rectangles that share the same spot color and then use the operator that fills, strokes, or fills and strokes the paths.
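At the content-stream level this looks roughly like the sketch below, which emits one color selection, one `re` (rectangle) operator per rectangle, and a single `f` (fill) at the end. The helper name and the resource name `CS0` are assumptions; the spot color space itself would still have to be defined in the page resources by whatever PDF library you use.

```python
def rectangles_content_stream(rects, colorspace_name="CS0", tint=1.0):
    """Return PDF content-stream text filling all rects with one fill operator.

    rects           -- iterable of (x, y, width, height) in user-space units
    colorspace_name -- name under which the spot color space is registered in
                       the page resources (hypothetical; set up elsewhere)
    tint            -- tint value for the spot color, 0.0 to 1.0
    """
    lines = [
        f"/{colorspace_name} cs",  # select the non-stroking color space once
        f"{tint:g} scn",           # set the non-stroking color (tint) once
    ]
    for x, y, w, h in rects:
        lines.append(f"{x:g} {y:g} {w:g} {h:g} re")  # append a rectangle subpath
    lines.append("f")  # fill every subpath accumulated so far
    return "\n".join(lines) + "\n"
```

The saving compared with filling each rectangle separately is one color selection and one fill operator per rectangle, which is modest once the stream is compressed.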
If you are talking about optimizing an existing PDF document, you're in for some heavy programming. You would need to parse every content stream looking for rectangle operators (assuming that the rectangles aren't drawn using move-to and line-to operators), check where these shapes are filled and/or stroked, and then rearrange all these operators. This would require a lot of thought. I would know where to begin, but I can't predict where it would end. Maybe it would turn out that it makes more sense to define a single rectangle as a Form XObject and reuse that single external object, maybe not. It's hard to predict.
Moreover: you are talking about operators in a stream. These streams are compressed anyway, so you may be doing a lot of work to gain only a very small decrease in size.
I would say: what you are asking for may be possible, but it is unclear why you would do this, because it would result in only a limited decrease in file size.
If size is an issue, there may be other places where you are "wasting bytes" that could result in a more desirable result. I am very curious to hear why you think the rectangles using spot colors are the culprit. You are reusing the spot color instance, aren't you? If you are creating a new spot color instance for every rectangle you draw, you have found the real culprit and you can avoid having to group the rectangles.
I have a large set of Google Maps API v3 polylines and markers that need to be rendered as transparent PNGs (implemented as ImageMapType). I've done all the math/geometry regarding transformations from latLng to pixel and tile coordinates.
The problem is: at the maximum allowable zoom for my app, that is 18, the compound image would span at least 80000 pixels both in width and height. So rendering it in one piece, then splitting it into tiles becomes impossible.
I tried the method of splitting polylines beforehand and placing the parts into tiles, then rendering each tile alone, which up until now works almost fine. But it will become very difficult when I need to draw stylized markers / text and other fancy stuff.
So far I used C# GDI+ as the drawing methods (the ol' Bitmap / Graphics pair).
Many questions here are about splitting an already existing image, storing, and linking it to the API. I already know how to do that.
My problem is: how do I draw the initial very large image and then split it up? It doesn't really need to be a true image/bitmap/call it whatever you want solution. A friend suggested that I use SVG, but I don't know any good rendering solutions to suit my needs.
To make it a little easier to comprehend, think it in terms of input/output. My input is the data that I need to draw (lines, circles, text, etc) that spreads across tens of thousands of pixels, and the output must be the tiles. I really don't care what the 'magic box' is, and I don't even care what the platform is.
I ran into the same problem when creating custom tiles, and you are on the right track with your solution of creating one tile at a time. You just need to add some strategy to the process. What I do is like this:
Pseudo code:
for each tile {
    - determine the lat/lon corners of the tile.
    - query the database and load the objects that are within this tile.
    for each object {
        - calculate the tile pixels on which the object should be painted. [*A*]
        - draw the object on the tile.
    }
    - Save the tile. (you're done with this tile).
}
alternatively:
Pseudo code:
- for each object to be drawn {
- determine what tile the object should be painted on.
- calculate the tile pixels on which the object should be painted.[*A*]
- get that tile, if it doesn't yet exist create a new one.
- draw the object on the tile.
- Save the tile. (you might need to draw more on this tile later)
}
I do this with Perl and the GD library.
[*A*] When painting objects that span more than one tile: if the object begins on the current tile, the part that falls outside is left out automatically, because you'll be attempting to paint outside the tile. If the object began on the previous tile and you're drawing the second part, the starting pixel numbers will be negative, meaning that it began on the neighbor tile.
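A small sketch of the coordinate bookkeeping behind [*A*], assuming 256-pixel tiles and objects already projected to global pixel coordinates at the current zoom. The function names are invented for the example; the drawing library is expected to clip whatever falls outside the tile.

```python
TILE_SIZE = 256  # assumed tile edge length in pixels


def tiles_for_bbox(min_x, min_y, max_x, max_y, tile_size=TILE_SIZE):
    """List the (tile_x, tile_y) indices touched by a global-pixel bounding box."""
    return [
        (tx, ty)
        for ty in range(min_y // tile_size, max_y // tile_size + 1)
        for tx in range(min_x // tile_size, max_x // tile_size + 1)
    ]


def to_tile_pixels(x, y, tile_x, tile_y, tile_size=TILE_SIZE):
    """Convert global pixel coordinates to coordinates local to one tile.

    The result may be negative or larger than the tile size; that simply
    means the point lies on a neighboring tile and will be clipped.
    """
    return (x - tile_x * tile_size, y - tile_y * tile_size)
```

For a line from global pixel (200, 100) to (300, 100), `tiles_for_bbox` reports tiles (0, 0) and (1, 0); on tile (1, 0) the line is drawn from local (-56, 100) to (44, 100), and the negative part is clipped.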
This is a bit hard to explain in a written post so please feel free to ask for further clarification if you need it and I'll edit the answer.
I'd recommend getting to know GDAL (http://gdal.org) and its libraries. It has libraries for rasterization, tiling, data conversion, projections, warping, and much more.
I'm searching for a certain object in my photograph:
Object: Outline of a rectangle with an X in the middle. It looks like a rectangular checkbox. That's all. So, no fill, just lines. The rectangle will have the same ratios of length to width but it could be any size or any rotation in the photograph.
I've looked at a whole bunch of image recognition approaches. But I'm trying to determine the best for this specific task. Most importantly, the object is made of lines and is not a filled shape. Also, there is no perspective distortion, so the rectangular object will always have right angles in the photograph.
Any ideas? I'm hoping for something that I can implement fairly easily.
Thanks all.
You could try using a corner detector (e.g. Harris) to find the corners of the box, the ends and the intersection of the X. That simplifies the problem to finding points in the right configuration.
Edit (response to comment):
I'm assuming you can find the corner points in your image, the 4 corners of the rectangle, the 4 line endings of the X and the center of the X, plus a few other corners in the image due to noise or objects in the background. That simplifies the problem to finding a set of 9 points in the right configuration, out of a given set of points.
My first try would be to look at each corner point A. Then I'd iterate over the points B close to A. Now if I assume that (e.g.) A is the upper left corner of the rectangle and B is the lower right corner, I can easily calculate, where I would expect the other corner points to be in the image. I'd use some nearest-neighbor search (or a library like FLANN) to see if there are corners where I'd expect them. If I can find a set of points that matches these expected positions, I know where the symbol would be, if it is present in the image.
You'll have to test whether that is good enough for your application. If you have too many false positives (sets of corners of other objects that accidentally form a rectangle + X), you could check if there are lines (i.e. high contrast in the right direction) where you would expect them. And you could check if there is low contrast where there are no lines in the pattern. This should be relatively straightforward once you know the points in the image that correspond to the corners/line endings in the object you're looking for.
I'd suggest the Generalized Hough Transform. It seems you have a fairly simple, fixed shape. The generalized Hough transform should be able to detect that shape at any rotation or scale in the image. You may need to threshold the original image, or pre-process it in some way, for this method to be useful though.
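Here is a rough sketch of the "predict where the other corners should be" step from the corner-based approach, under several assumptions: A and B are taken to be diagonally opposite corners, the width/height ratio is known, and only one of the two mirror-image orientations is tested (the other follows by swapping A and B or negating the angle). The X endpoints are not checked here; they would need a second known ratio. All names are made up, and the brute-force search stands in for a real nearest-neighbor structure.

```python
import math


def expected_corners(a, b, aspect):
    """Predict the two remaining corners and the center of the rectangle.

    a, b   -- detected points assumed to be diagonally opposite corners
    aspect -- known width / height ratio of the rectangle
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    diag = math.hypot(dx, dy)
    height = diag / math.sqrt(1.0 + aspect * aspect)
    width = aspect * height
    theta = math.atan2(height, width)  # angle between width side and diagonal
    # Rotate the unit diagonal by -theta to get the direction of the width side.
    ux = (dx * math.cos(theta) + dy * math.sin(theta)) / diag
    uy = (-dx * math.sin(theta) + dy * math.cos(theta)) / diag
    vx, vy = -uy, ux  # height direction, perpendicular to the width side
    c = (a[0] + width * ux, a[1] + width * uy)
    d = (a[0] + height * vx, a[1] + height * vy)
    center = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
    return c, d, center


def has_point_near(points, target, tol):
    """Brute-force nearest-neighbor test (a k-d tree or FLANN would scale better)."""
    return any(math.hypot(p[0] - target[0], p[1] - target[1]) <= tol for p in points)


def matches_pattern(a, b, aspect, points, tol=3.0):
    """True if detected corner points exist at every predicted position."""
    return all(has_point_near(points, t, tol) for t in expected_corners(a, b, aspect))
```

Running `matches_pattern` over each pair (A, B) of nearby corner points gives candidate locations, which can then be verified with the contrast checks described above.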
You can use local features to identify the object in image. Feature detection wiki
For example, you can calculate features on a reference image that contains only the object you're looking for and save the results to, say, a plain text file. After that you can search for the object just by comparing newly calculated features (on images with complex scenes containing the object) with the reference ones.
Here's some good resource on local features:
Local Invariant Feature Detectors: A Survey
I have nothing useful to do and was playing with a jigsaw puzzle like this:
http://manual.gimp.org/nl/images/filters/examples/render-taj-jigsaw.jpg
and I was wondering if it'd be possible to make a program that assists me in putting it together.
Imagine that I have a small puzzle, like 4x3 pieces, but the little tabs and blanks are non-uniform - different pieces have these tabs at different heights, with different shapes and sizes. What I'd do is take pictures of all of these pieces, let a program analyze them and store their attributes somewhere. Then, when I pick up a piece, I could ask the program to tell me which pieces should be its 'neighbours' - or if I have to fill in a blank, it'd tell me what the wanted puzzle piece(s) look like.
Unfortunately I've never done anything with image processing and pattern recognition, so I'd like to ask you for some pointers - how do I recognize a jigsaw piece (basically a square with tabs and holes) in a picture?
Then I'd probably need to rotate it so it's in the right position, scale to some proportion and then measure tab/blank on each side, and also each side's slope, if present.
I know that it would be too time consuming to scan/photograph 1000 pieces of puzzle and use it, this would be just a pet project where I'd learn something new.
Data acquisition
(This is known as Chroma Key, Blue Screen or Background Color method)
Find a well-lit room, with the least lighting variation across the room.
Find a color (hue) that is rarely used in the entire puzzle / picture.
Get a color paper that has that exactly same color.
Place as many puzzle pieces on the color paper as it'll fit.
You can categorize the puzzles into batches and use it as a computer hint later on.
Make sure the pieces do not overlap or touch each other.
Do not worry about orientation yet.
Take picture and download to computer.
Color calibration may be needed because the Chroma Key background may have upset the built-in color balance of the digital camera.
Acquisition data processing
Get some computer vision software
OpenCV, MATLAB, C++, Java, Python Imaging Library, etc.
Perform connected-component labeling on the chroma key color in the image.
Ask for the contours of the holes of the connected component, which are the puzzle pieces.
Fix errors in the detected list.
Choose the indexing vocabulary (cf. Ira Baxter's post) and measure the pieces.
If the pieces are rectangular, find the corners first.
If the pieces are slightly-off quadrilaterals, the side lengths (measured corner to corner) are also a valuable signature.
Search for "Shape Context" on SO or Google or here.
Finally, get the color histogram of the piece, so that you can query pieces by color later.
To make them searchable, put them in a database, so that you can query pieces with any combinations of indexing vocabulary.
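To give a feel for the segmentation step in the list above, here is a toolkit-free sketch of connected-component labeling. It assumes the photo has already been reduced to a boolean mask that is True wherever a pixel is not the chroma key color, and it labels the piece pixels directly rather than extracting holes from the background component. In practice a library such as OpenCV would do this far faster; `label_pieces` is a name invented for this example.

```python
from collections import deque


def label_pieces(mask):
    """Group non-background pixels into 4-connected components.

    mask[y][x] is True where the pixel is NOT the chroma key color.
    Returns a list of components, each a list of (x, y) pixel coordinates.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    pieces = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            pixels = []
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            pieces.append(pixels)
    return pieces
```

Each returned component is one candidate puzzle piece; very small components are probably noise and can be dropped in the "fix errors" step.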
A step back to the problem itself. The problem of building a puzzle can be easy (P) or hard (NP-hard), depending on whether each piece fits only one neighbour or many. If there is only one fit for each edge, then you just find, for each piece/side, its neighbour and you're done (O(#pieces*#sides) lookups). If some pieces allow multiple fits with different neighbours, then, in order to complete the whole puzzle, you may need backtracking (because you made a wrong choice and got stuck).
However, the first problem to solve is how to represent pieces. If you want to represent arbitrary shapes, then you can probably use transparency or masks to represent which areas of a tile are actually part of the piece. If you use square shapes then the problem may be easier. In the latter case, you can consider the last row of pixels on each side of the square and match it with the most similar row of pixels that you find across all other pieces.
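For the square-tile case, matching border pixels can be sketched as below. It assumes each tile is a 2D list of grayscale values of equal height and uses a sum of squared differences as the similarity measure; both the measure and the function names are choices made for this example, not something the approach prescribes.

```python
def right_edge(tile):
    """Last column of pixels of a tile given as rows of grayscale values."""
    return [row[-1] for row in tile]


def left_edge(tile):
    """First column of pixels of a tile."""
    return [row[0] for row in tile]


def edge_distance(edge_a, edge_b):
    """Sum of squared differences between two edges; lower means more similar."""
    return sum((p - q) ** 2 for p, q in zip(edge_a, edge_b))


def best_right_neighbour(tile, candidates):
    """Index of the candidate whose left edge best continues tile's right edge."""
    target = right_edge(tile)
    return min(
        range(len(candidates)),
        key=lambda i: edge_distance(target, left_edge(candidates[i])),
    )
```

The same comparison applies to top/bottom edges using the first and last rows. With uniformly coloured regions such as sky, many candidates will score almost equally, which is where the backtracking mentioned earlier comes in.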
You can use the second approach to actually help you solve a real puzzle, despite the fact that you use square tiles. Real puzzles are normally built upon a NxM grid of pieces. When scanning the image from the box, you split it into the same NxM grid of square tiles, and get the system to solve that. The problem is then to visually map the actual squiggly piece that you hold in your hand with a tile inside the system (when they are small and uniformly coloured). But you get the same problem if you represent arbitrary shapes internally.