Graphviz ignores size attribute (A4 page) - graphviz

Consider the following minimal example graph that should fit on an A4 page
digraph G{
size="8.3,11.7!" ratio=fill;
foo->bar;
}
Compile with neato -Tpdf -o min_ex.pdf min_ex.gv
The resulting pdf file has dimensions of 236mm x 115mm and not, as intended, 210mm x 297mm.
Graphviz ignores this attribute both for graphs that are smaller than the page (like this one) and the ones that have to be scaled down to fit.
I have tried any combinations of size and ratio attributes, I can't get the graph to be put on an A4 page with any of them.
So, what have I to specify that the graph is always put on an A4 page, regardless of its size?
Documentation:
size:
Maximum width and height of drawing, in inches. If only a single number is given, this is used for both the width and the height.
If defined and the drawing is larger than the given size, the drawing is uniformly scaled down so that it fits within the given size.
If size ends in an exclamation point (!), then it is taken to be the desired size. In this case, if both dimensions of the drawing are less than size, the drawing is scaled up uniformly until at least one dimension equals its dimension in size.
ratio
Sets the aspect ratio (drawing height/drawing width) for the drawing. Note that this is adjusted before the size attribute constraints are enforced. In addition, the calculations usually ignore the node sizes, so the final drawing size may only approximate what is desired.
If ratio is numeric, it is taken as the desired aspect ratio. Then, if the actual aspect ratio is less than the desired ratio, the drawing height is scaled up to achieve the desired ratio; if the actual ratio is greater than that desired ratio, the drawing width is scaled up.
If ratio = "fill" and the size attribute is set, node positions are scaled, separately in both x and y, so that the final drawing exactly fills the specified size. If both size values exceed the width and height of the drawing, then both coordinate values of each node are scaled up accordingly. However, if either size dimension is smaller than the corresponding dimension in the drawing, one dimension is scaled up so that the final drawing has the same aspect ratio as specified by size. Then, when rendered, the layout will be scaled down uniformly in both dimensions to fit the given size, which may cause nodes and text to shrink as well. This may not be what the user wants, but it avoids the hard problem of how to reposition the nodes in an acceptable fashion to reduce the drawing size.
If ratio = "compress" and the size attribute is set, dot attempts to compress the initial layout to fit in the given size. This achieves a tighter packing of nodes but reduces the balance and symmetry. This feature only works in dot.
If ratio = "expand", the size attribute is set, and both the width and the height of the graph are less than the value in size, node positions are scaled uniformly until at least one dimension fits size exactly. Note that this is distinct from using size as the desired size, as here the drawing is expanded before edges are generated and all node and text sizes remain unchanged.
If ratio = "auto", the page attribute is set and the graph cannot be drawn on a single page, then size is set to an ``ideal'' value. In particular, the size in a given dimension will be the smallest integral multiple of the page size in that dimension which is at least half the current size. The two dimensions are then scaled independently to the new size. This feature only works in dot.

The problem lies in the details about the ratio:
Note that this is adjusted before the size attribute constraints are
enforced. In addition, the calculations usually ignore the node sizes,
so the final drawing size may only approximate what is desired.
It seems that graphviz
lays out the nodes as points (ignoring size)
adjusts for the ratio of the point nodes (still no size for nodes)
applies the graph's size constraints (in our case, upscaling the image): here we have already reached the desired dimensions, but we're not finished...
then point nodes become nodes with a real size (by default 0.5 inch high and 0.75 inch wide)
and finally the whole output gets the margin added
The result would be bigger than A4.
Therefore if we were to make nodes and margin as small as possible, then the output should come relatively close to A4.
Setting margin to 0 and the node's shape to point as well as their width and height to the minimum values with the following graph:
digraph G{
ratio="fill";
size="8.3,11.7!";
margin=0;
node[shape=point, height=0.02, width=0.01];
foo->bar;
}
neato -Tpdf with this graph results in a PDF with dimensions 211x297mm (using 8.267 inches as width will result in a clean 210x297mm).
Unfortunately, even knowing how graphviz works in respect to ratio=fill, I don't think there's an easy way to make sure the final result is always A4 when using nodes which actually have a width and height.

Related

Drawing pixel circle of given area

I have some area X by Y pixels and I need to fill it up pixel by pixel. The problem is that at any given moment the drawn shape should be as round as possible.
I think that this algorithm is subset of Ordered Dithering, when converting grayscale images to one-bit, but I could not find any references nor could I figure it out myself.
I am aware of Bresenham's Circle, but it is used to draw circle of certain radius not area.
I created animation of all filling percents for 10 by 10 pixel grid. As full area is 10x10=100px, then each frame is exactly 1% inc.
A filled disk has the equation
(X - Xc)² + (Y - Yc)² ≤ C.
When you increase C, the number of points that satisfies the equation increases, but because of symmetry it increases in bursts.
To obtain the desired filling effect, you can compute (X - Xc)² + (Y - Yc)² for every pixel, sort on this value, and let the pixels appear one by one (or in a single go if you know the desired number of pixels).
You can break ties in different ways:
keep the original order as when you computed the pixels, by using a stable sort;
shuffle the runs of equal values;
slightly alter the center coordinates so that there are no ties.
Filling with the de-centering trick.
Values:
Order:

Footprint finding algorithm

I'm trying to come up with an algorithm to optimize the shape of a polygon (or multiple polygons) to maximize the value contained within that shape.
I have data with 3 columns:
X: the location on the x axis
Y: the location on the y axis
Value: Value of the block which can have positive and negative values.
This data is from a regular grid so the spacing between each x and y value is consistent.
I want to create a bounding polygon that maximizes the contained value with the added condition.
There needs to be a minimum radius maintained at all points of the polygon. This means that we will either lose some positive value blocks or gain some negative value blocks.
The current algorithm I'm using does the following
Finds the maximum block value as a starting point (or user defined)
Finds all blocks within the minimum radius and determines if it is a viable point by checking the overall value is positive
Removes all blocks in the minimum search radius from further value calculations and flags them as part of the final shape
Moves onto the next point determined by a spiraling around the original point. (center is always a grid point so moves by deltaX or deltaY)
This appears to be picking up some cells that aren't needed. I'm sure there are shape algorithms out there but I don't have any idea what to look up to find help.
Below is a picture that hopefully helps outline the question. Positive cells are shown in red (negative cells are not shown). The black outline shows the shape my current routine is returning. I believe the left side should be brought in more. The minimum radius is 100m the bottom left black circle is approximately this.
Right now the code is running in R but I will probably move to something else if I can get the algorithm correct.
In response to the unclear vote the problem I am trying to solve without the background or attempted solution is:
"Create a bounding polygon (or polygons) around a series of points to maximize the contained value, while maintaining a minimum radius of curvature along the polygon"
Edit:
Data
I should have included some data it can be found here.
The file is a csv. 4 columns (X,Y,Z [not used], Value), length is ~25k size is 800kb.
Graphical approach
I would approach this graphically. My intuition tells me that the inside points are fully inside the casted circles with min radius r from all of the footprint points nearby. That means if you cast circle from each footprint point with radius r then all points that are inside at least half of all neighboring circles are inside your polygon. To be less vague if you are deeply inside polygon then you got Pi*r^2 such overlapping circles at any pixel. if you are on edge that you got half of them. This is easily computable.
First I need the dataset. As you did provide just jpg file I do not have the vales just the plot. So I handle this problem like a binary image. First I needed to recolor the image to remove jpg color distortions. After that this is my input:
I choose black background to easily apply additive math on image and also I like it more then white and leave the footprint red (maximally saturated). Now the algorithm:
create temp image
It should be the same size and cleared to black (color=0). Handle its pixels like integer counters of overlapping circles.
cast circles
for each red pixel in source image add +1 to each pixel inside the circle with minimal radius r around the same pixel but in the temp image. The result is like this (Blue are the lower bits of my pixelformat):
As r I used r=24 as that is the bottom left circle radius in your example +/-pixel.
select inside pixels only
so recolor temp image. All the pixels with color < 0.5*pi*r^2 recolor to black and the rest to red. The result is like this:
select polygon circumference points only
Just recolor all red pixels near black pixels to some neutral color blue and the rest to black. Result:
Now just polygonize the result. To compare with the input image you can combine them both (I OR them together):
[Notes]
You can play with the min radius or the area treshold property to achieve different behavior. But I think this is pretty close match to your problem.
Here some C++ source code for this:
//picture pic0,pic1;
// pic0 - source
// pic1 - output/temp
int x,y,xx,yy;
const int r=24; // min radius
const int s=float(1.570796*float(r*r)); // half of min radius area
const DWORD c_foot=0x00FF0000; // red
const DWORD c_poly=0x000000FF; // blue
// resize and clear temp image
pic1=pic0;
pic1.clear(0);
// add min radius circle to temp around any footprint pixel found in input image
for (y=r;y<pic1.ys-r;y++)
for (x=r;x<pic1.xs-r;x++)
if (pic0.p[y][x].dd==c_foot)
for (yy=-r;yy<=r;yy++)
for (xx=-r;xx<=r;xx++)
if ((xx*xx)+(yy*yy)<=r*r)
pic1.p[y+yy][x+xx].dd++;
pic1.save("out0.png");
// select only pixels which are inside footprint with min radius (half of area circles are around)
for (y=0;y<pic1.ys;y++)
for (x=0;x<pic1.xs;x++)
if (pic1.p[y][x].dd>=s) pic1.p[y][x].dd=c_foot;
else pic1.p[y][x].dd=0;
pic1.save("out1.png");
// slect only outside pixels
pic1.growfill(c_foot,0,c_poly);
for (y=0;y<pic1.ys;y++)
for (x=0;x<pic1.xs;x++)
if (pic1.p[y][x].dd==c_foot) pic1.p[y][x].dd=0;
pic1.save("out2.png");
pic1|=pic0; // combine in and out images to compare
pic1.save("out3.png");
I use my own picture class for images so some members are:
xs,ys size of image in pixels
p[y][x].dd is pixel at (x,y) position as 32 bit integer type
clear(color) - clears entire image
resize(xs,ys) - resizes image to new resolution
[Edit1] I got a small bug in source code
I noticed some edges were too sharp so I check the code and I forgot to add the circle condition while filling so it filled squares instead. I repaired the source code above. I really just added line if ((xx*xx)+(yy*yy)<=r*r). The results are slightly changed so I also updated the images with new results
I played with the inside area coefficient ratio and this one:
const int s=float(0.75*1.570796*float(r*r));
Leads to even better match for you. The smaller it is the more the polygon can overlap outside footprint. Result:
If the solution set must be a union of disks of given radius, I would try a greedy approach. (I suspect that the problem might be intractable - exponential running time - if you want an exact solution.)
For all pixels (your "blocks"), compute the sum of values in the disk around it and take the one with the highest sum. Mark this pixel and adjust the sums of all the pixels that are in its disk by deducing its value, because the marked pixel has been "consumed". Then scan all pixels in contact with it by an edge or a corner, and mark the pixel with the highest sum.
Continue this process until all sums are negative. Then the sum cannot increase anymore.
For an efficient implementation, you will need to keep a list of the border pixels, i.e. the unmarked pixels that are neighbors of a marked pixel. After you have picked the border pixel with the largest sum and marked it, you remove it from the list and recompute the sums for the unmarked pixels inside its disk; you also add the unmarked pixels that touch it.
On the picture, the pixels are marked in blue and the border pixels in green. The highlighted pixels are
the one that gets marked,
the ones for which the sum needs to be recomputed.
The computing time will be proportional to the area of the image times the area of a disk (for the initial computation of the sums), plus the area of the shape times the area of a disk (for the updates of the sums), plus the total of the lengths of the successive perimeters of the shape while it grows (to find the largest sum). [As the latter terms might be costly - on the order of the product of the area of the shape by its perimeter length -, it is advisable to use a heap data structure, which will reduce the sum of the lengths to the sum of their logarithm.]

Aligning to height/depth maps

I have 2 2D depth/height maps of equal dimension (256 x 256). Each pixel/cell of the depth map image, contains a float value. There are some pixels that have no information so are set to nan currently. The percentage of non nan cells can vary from ~ 20% to 80%. The depth maps are taken of the same area through point sampling an underlying common surface.
The idea is that the images represent a partial, yet overlapping, sampling of an underlying surface. And I need to align these images to create a combined sampled representation of the surface. If done blindly then the combined images have discontinuities especially in the z dimension (the float value).
What would be a fast method of aligning the 2 images? Translation in the x and y direction should be minimal only a few pixels (~ 0 to 10 pixels). But the float values of one image may need to be adjusted to align the images better. So minimizing the difference between the 2 images is the goal.
Thnx for any advice.
If your images are lacunar, one way is the exhaustive computation of a matching score in the window of overlap, ruling out the voids. FFT convolution will not apply. (Workload = overlap area * X-range * Y-range).
If both images differ in noise only, use the SAD matching score. If they also differ by the reference zero, subtract the average height before comparing.
You can achieve some acceleration by using an image pyramid, but you'll need to handle the voids.
Other approach could be to fill in the gaps by some interpolation method that somehow ensures that the interpolated values are compatible between both images.

Locating "black rectangles" on an image - language independent

I am writing a small image analysis program just for fun. Image analysis has always fascinated me. I am trying to locate regions on a scanned document. These regions are going to be marked by clearly defined filled black rectangles (pre-printed on the page).
My problem is locating the rectangles. I know SIFT\SURF find "features" but I am trying to find something specific. Here is what I was thinking of doing. I am not sure if this is the "right" way or there is a better idea.
First off using some library I will turn the image into greyscale, perhaps a PGM since that is what I'm used to working with in school. For the analysis I first plan to run the image through a state of the art deskew algorithm in OpenCV or something else that I find. Once I have my deskewed image I will then threshhold it at some pretty high thresshold. The rectangles are going to be straight black hence me using a pretty high threshhold. I will then experimentally determine a good size black rectangle to slide across the image. While sliding my rectangle across the image I will determine the areas where the greatest percentage of pixles are the same. I will have a cutoff, say 90%. If 90% of the pixles contained in my window are black I must have found a rectangle. My reasoning is that a true black rectangle slid over something that is "pretty much" a black rectangle is most likely a black rectangle. Since I deskewed the image I can assume that the rectangles are straight up and down "enough". I can then track the (x,y) offsets where the rectangles are found on the image and mark them.
Would anyone suggest a better approach?
There are many approaches that might work. (One can easily come up with 10 or more approaches.)
Idea #1 - Canny edge detection; find rectangle fit to contours
cv::Canny
cv::findContours
cv::minAreaRect, or
cv::boundingRect might also work, if the deskewing works as advertised.
Idea #2 - Find all lines using Hough transform; Iterates through all regions created from line intersections.
Idea #3 - (Improvement on #2) Restrict the Hough transform to horizontal and vertical lines by pre-processing.
Idea #4 - Compute Horizontal and Vertical profiles on the entire image; find dips; iterate through all candidate regions.
This idea is based on the assumption that the black rectangles are large enough that they leave a "depression" in both the horizontal and vertical projection profiles, which would be detectable despite other noise objects in the image.
cv::reduce
With dim = 0 or 1 for reducing to a row or column respectively,
With CV_REDUCE_AVG flag
Apply cv::threshold to the horizontal and vertical projection profiles, separately.
For each profile now thresholded into zero/non-zero, find runs of zeroes. These are the possible row ranges and column ranges that could contain the dark rectangles.
For each combination of candidate row range and column range, calculate the average pixel value to decide if it is a true dark rectangle.
Idea #5 - Use integral image (summed area table) to quickly calculate the average pixel value in arbitrary rectangles
cv::integral
To compute the sum (and average) of a rectangle from an integral image, see the Wikipedia article on Summed Area Table
Preprocessing idea - use morphological dilation (or erosion) to "erase" things that cannot be the large continuous black box.
Preprocessing idea - use pre-processing to enhance horizontal and vertical edges; suppress edges in other directions.
I don't know if it is a better approach, but the first thing that came to mind would be a scan-line solution (assuming black or white pixels): I'd check each scanline from top to bottom. In each scanline I'd check each pixel from left to right. A "first" black pixel would be a possible upperleft corner of a rect. If there were enough following contiguous black pixels on the line to meet my desired minimum width, keep the [left, width] in a list of possible rects. Find all possible rect starts and widths on the line.
For a rect to stay in the list and grow in height, the next scanline would have to have the same [left, width] occurrence, otherwise the rect is finished (if its height meets my desired minimum height) or discarded or ignored as too short in height.
You can easily add logic for situations like two rectangles too close to one another vertically or horizontally. Overlapping rectangles would be trickier but still possible to detect with added code.
Here's some pseudocode:
for s := 1 to scanlinecount do
begin
pixel := 1
while pixel <= scanlinewidth do
if black(s, pixel) then // possible rect
begin
left := pixel
repeat
inc(pixel)
until (pixel > scanlinewidth) or white(s, pixel)
width := pixel - left
if width >= MINWIDTH then // wide enough
rememberrect(s, left, width) // bumps height if already in list
end
else inc(pixel)
end
Your list of found rects stores the starting scanline, leftmost pixel, width, and height for each rect found. The "rememberrect" routine checks each rect in the list:
rememberrect(currentline, left, width):
for r := 1 to rectlist.count do
if rectlist[r].left = left
& rectlist[r].width = width
& rectlist[r].y + rectlist[r].height = currentline then
begin // found rect continuing on scanline
inc(rectlist[r].height)
exit
end
inc(rectlist.count) // add new rect to list
rectlist[rectlist.count].left := left
rectlist[rectlist.count].width := width
rectlist[rectlist.count].y := currentline
rectlist[rectlist.count].height := 1
If the group of black pixels on the current scanline has the same leftmost pixel and width as a group on the previous scanline (you'll know they're vertically contiguous because the starting scanline of the rect in the list plus its height will equal the current scanline) then rememberrect bumps the height of the found and remembered rect by 1. Otherwise, remember the new rect with initial height 1.
After the last scanline you'll have a long list of rect candidates, many of them only 1 pixel high. Delete or ignore any rects in the list that aren't high enough. To avoid growing a long list of futile candidates: at the start of each scanline mark all rects found so far as "finished". If rememberrect grows an existing rect or adds a new rect, mark that rect as "grown". At the end of each scanline, any rect still marked as finished that isn't tall enough can be deleted from the list.

Minimum number of rectangles in shape made from rectangles?

I'm not sure if there's an algorithm that can solve this.
A given number of rectangles are placed side by side horizontally from left to right to form a shape. You are given the width and height of each.
How would you determine the minimum number of rectangles needed to cover the whole shape?
i.e How would you redraw this shape using as few rectangles as possible?
I've can only think about trying to squeeze as many big rectangles as i can but that seems inefficient.
Any ideas?
Edit:
You are given a number n , and then n sizes:
2
1 3
2 5
The above would have two rectangles of sizes 1x3 and 2x5 next to each other.
I'm wondering how many rectangles would i least need to recreate that shape given rectangles cannot overlap.
Since your rectangles are well aligned, it makes the problem easier. You can simply create rectangles from the bottom up. Each time you do that, it creates new shapes to check. The good thing is, all your new shapes will also be base-aligned, and you can just repeat as necessary.
First, you want to find the minimum height rectangle. Make a rectangle that height, with the width as total width for the shape. Cut that much off the bottom of the shape.
You'll be left with multiple shapes. For each one, do the same thing.
Finding the minimum height rectangle should be O(n). Since you do that for each group, worst case is all different heights. Totals out to O(n2).
For example:
In the image, the minimum for each shape is highlighted green. The resulting rectangle is blue, to the right. The total number of rectangles needed is the total number of blue ones in the image, 7.
Note that I'm explaining this as if these were physical rectangles. In code, you can completely do away with the width, since it doesn't matter in the least unless you want to output the rectangles rather than just counting how many it takes.
You can also reduce the "make a rectangle and cut it from the shape" to simply subtracting the height from each rectangle that makes up that shape/subshape. Each contiguous section of shapes with +ve height after doing so will make up a new subshape.
If you look for an overview on algorithms for the general problem, Rectangular Decomposition of Binary Images (article by Tomas Suk, Cyril Höschl, and Jan Flusser) might be helpful. It compares different approaches: row methods, quadtree, largest inscribed block, transformation- and graph-based methods.
A juicy figure (from page 11) as an appetizer:
Figure 5: (a) The binary convolution kernel used in the experiment. (b) Its 10 blocks of GBD decomposition.

Resources