Pruning displayed cells - filter

I have a dataset with particle traces as polylines (vtkPolyData with vtkPoints and vtkCellArray). There are sometimes too many traces, so I would like to display only some of them using a filter in ParaView, e.g. only every 10th cell.
The Glyph filter has the options "Mask Points" (limit the maximum number of displayed points) and "Random Mode" (pick the displayed points randomly rather than sequentially); I am looking for something similar.
Is there a ready-made filter for this? If not, how can I write one with the Programmable Filter?

Since you mention cells, you may be interested in the Cell Centers filter, which converts all your cells to points (and the corresponding cell arrays to point arrays) so that you can apply the Glyph filter to them.
For more masking options, apply the MaskPoints filter before the Glyph filter and see whether its options give you the effect you want (the advanced properties are shown by clicking the gear next to the search box).
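The question also asks about the Programmable Filter. The part of such a filter that decides which cells to keep can be sketched in plain Python as below. This is only a sketch of the index selection: the function name and its parameters (stride, max_cells, random_mode, seed) are hypothetical, loosely modelled on the Glyph options mentioned in the question, and copying the chosen cells into the filter's output is not shown.

```python
import random


def select_cell_ids(n_cells, stride=10, max_cells=None, random_mode=False, seed=0):
    """Return the sorted ids of the cells to keep out of n_cells.

    Hypothetical helper: keeps every `stride`-th cell, or (in random mode)
    roughly the same number of cells picked at random.
    """
    if n_cells <= 0 or stride <= 0:
        return []
    if random_mode:
        # Keep about one cell in `stride`, chosen without replacement.
        count = max(1, n_cells // stride)
        if max_cells is not None:
            count = min(count, max_cells)
        rng = random.Random(seed)  # fixed seed so the picture is reproducible
        return sorted(rng.sample(range(n_cells), count))
    ids = list(range(0, n_cells, stride))
    if max_cells is not None:
        ids = ids[:max_cells]
    return ids


if __name__ == "__main__":
    print(select_cell_ids(35, stride=10))
```

Inside a filter script, n_cells would come from the input dataset's cell count, and only the cells with the returned ids would be copied to the output.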

What is the main idea of creating click heatmap?

In one of my projects I would like to create a heatmap of user clicks. After searching for a while I found this library: http://www.patrick-wied.at/static/heatmapjs/examples.html . That is basically exactly what I want to build; the only difference is that I would like to create the heatmap in SVG, if possible.
I would like to create my own heatmap and am wondering how to do that. I have the XY positions of the clicks. Most clicks have different XY positions, but occasionally a few clicks share the same XY position.
I found a few grid-based solutions on the web, where you check which clicks fall into the same column of the grid and then, based on that information, fill the most-clicked columns with red, orange and so on. But that seems a bit complicated to me, and possibly slow for bigger grids.
So I'm wondering whether there is another way to "calculate" the heatmap colors, or what the main idea behind the library above is.
Many thanks
To make this kind of heat map, you need some kind of writable array (or, as you put it, a "grid"). User clicks are added onto this array in a cumulative fashion, by adding a small "filter" sub-array (aligned around each click) to the writable array.
Unfortunately, this "grid" method seems to be the easiest, simplest way to get that kind of smooth, blobby appearance. Fortunately, this kind of operation is well-supported by software and hardware, under the name "computer graphics".
When considered as a computer graphics operation, the writable array is called an "accumulation buffer". The filter is what gives you the nice blobby appearance, even with a relatively small number of clicks -- you can tweak the size of the filter according to the needs of your application.
After accumulating the user clicks, you will need to convert from the raw accumulated values to some kind of visible color scale. This may involve looking through the entire accumulation buffer to find the largest value, and mapping your chosen color scale accordingly. Alternately, you could adjust your scale according to the number of mouse clicks, or (as in the demo you linked to) just choose a fixed scale regardless of the content of the buffer.
Finally, I should mention that SVG is not well-adapted to representing this kind of graphic. It should probably be saved as some kind of image file (.jpg or .png) instead.
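The accumulation and scaling steps described above can be sketched roughly as follows. This is a minimal sketch and not how the linked library works internally: the Gaussian-shaped filter, the function names, and the simple blue-to-red color ramp are all assumptions made for illustration.

```python
import math


def gaussian_kernel(radius, sigma):
    """Square (2*radius+1) filter with a Gaussian falloff and peak value 1.0."""
    return [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
             for dx in range(-radius, radius + 1)]
            for dy in range(-radius, radius + 1)]


def accumulate_clicks(width, height, clicks, radius=2, sigma=1.0):
    """Add the filter, centred on each (x, y) click, into an accumulation buffer."""
    kernel = gaussian_kernel(radius, sigma)
    buffer = [[0.0] * width for _ in range(height)]
    for cx, cy in clicks:
        for ky, row in enumerate(kernel):
            y = cy + ky - radius
            if not 0 <= y < height:
                continue  # filter row falls outside the buffer
            for kx, weight in enumerate(row):
                x = cx + kx - radius
                if 0 <= x < width:
                    buffer[y][x] += weight
    return buffer


def normalize(buffer):
    """Scale the buffer to 0..1 using its largest accumulated value."""
    peak = max(max(row) for row in buffer)
    if peak == 0:
        return [[0.0] * len(row) for row in buffer]
    return [[value / peak for value in row] for row in buffer]


def to_color(t):
    """Map 0..1 to an (r, g, b) tuple on a plain blue-to-red ramp (assumed scale)."""
    return (int(255 * t), 0, int(255 * (1.0 - t)))


if __name__ == "__main__":
    heat = normalize(accumulate_clicks(10, 10, [(5, 5), (5, 5), (1, 1)]))
    print(to_color(heat[5][5]), to_color(heat[1][1]))
```

Clicks at the same position simply add up in the buffer, so repeated XY positions need no special handling. Replacing normalize with a fixed divisor gives the fixed-scale variant mentioned above.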

Invoice / OCR: Detect two important points in invoice image

I am currently working on OCR software and my idea is to use templates to try to recognize data inside invoices.
However scanned invoices can have several 'flaws' with them:
Not all invoices, based on a single template, are correctly aligned under the scanner.
People can write on invoices
etc.
Example of invoice: (Have to google it, sadly cannot add a more concrete version as client data is confidential obviously)
I find my data in the invoices based on the x-values of the text.
However I need to know the scale of the invoice and the offset from left/right, before I can do any real calculations with all data that I have retrieved.
What have I tried so far?
1) Making the image monochrome and use the left and right bounds of the first appearance of a black pixel. This fails due to the fact that people can write on invoices.
2) Divide the invoice up in vertical sections, use the sections that have the highest amount of black pixels. Fails due to the fact that the distribution is not always uniform amongst similar templates.
I could really use your help on (1) how to identify important points in invoices and (2) on what I should focus as the important points.
I hope the question is clear enough as it is quite hard to explain.
Detecting rotation
I would suggest you start by detecting straight lines.
Look (perhaps randomly) for small areas with high contrast, i.e. mostly white but a fair amount of very black pixels as well. Then try to fit a line to these black pixels, e.g. using least squares method. Drop the outliers, and fit another line to the remaining points. Iterate this as required. Evaluate how good that fit is, i.e. how many of the pixels in the observed area are really close to the line, and how far that line extends beyond the observed area. Do this process for a number of regions, and you should get a weighted list of lines.
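The fit, drop outliers, refit loop can be sketched as below. One substitution should be named plainly: instead of an ordinary least-squares fit of y against x, this sketch uses an orthogonal (total) least-squares fit, because it also handles vertical lines. The function names, the distance threshold, and the simple inlier-fraction score are assumptions; the "how far does the line extend beyond the area" check is not included.

```python
import math


def fit_line(points):
    """Orthogonal least-squares fit; returns (centre_x, centre_y, angle_radians)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    # Direction of the principal axis of the point cloud.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return cx, cy, angle


def distance_to_line(point, line):
    """Perpendicular distance from a point to the fitted line."""
    cx, cy, angle = line
    x, y = point
    return abs(-(x - cx) * math.sin(angle) + (y - cy) * math.cos(angle))


def robust_fit(points, threshold=1.5, rounds=3):
    """Fit, drop points farther than `threshold`, refit; returns (line, score)."""
    inliers = list(points)
    line = fit_line(inliers)
    for _ in range(rounds):
        kept = [p for p in points if distance_to_line(p, line) <= threshold]
        if len(kept) < 2 or kept == inliers:
            break  # too few points left, or the inlier set is stable
        inliers = kept
        line = fit_line(inliers)
    score = len(inliers) / len(points)  # fraction of pixels close to the line
    return line, score


if __name__ == "__main__":
    pixels = [(i, i) for i in range(20)] + [(5, 15)]
    (cx, cy, angle), score = robust_fit(pixels)
    print(math.degrees(angle), score)
```

The score can serve as the weight of the line in the later steps.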
For each line, you can compute the direction of the line itself and the direction orthogonal to that. One of these numbers can be chosen from the interval [0°, 90°), the other will be 90° plus that value, so storing one is enough. Take all these directions, and find one angle which best matches all of them. You can do that using a sliding window of e.g. 5°: slide across that (cyclic) region and find a position where the maximal number of lines fall within the window, then compute the average or median of the angles within that window. All of this computation can be done taking the weights of the lines into account.
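The sliding-window search over the cyclic [0°, 90°) range might look like the sketch below. The function name, the 0.5° step, and the choice of a weighted average (rather than a median) are assumptions for illustration.

```python
def dominant_angle(lines, window=5.0, step=0.5):
    """Find the direction in [0, 90) degrees that most weighted lines agree on.

    `lines` is a list of (angle_degrees, weight) pairs; angles are folded
    into [0, 90) because a line and its orthogonal count as the same direction.
    """
    if not lines:
        raise ValueError("need at least one line")
    folded = [(angle % 90.0, weight) for angle, weight in lines]

    best_start, best_weight = 0.0, -1.0
    start = 0.0
    while start < 90.0:
        # (a - start) % 90 is the cyclic offset of a line from the window start.
        total = sum(w for a, w in folded if (a - start) % 90.0 < window)
        if total > best_weight:
            best_start, best_weight = start, total
        start += step

    # Weighted average of the offsets inside the winning window.
    members = [((a - best_start) % 90.0, w) for a, w in folded
               if (a - best_start) % 90.0 < window]
    weight_sum = sum(w for _, w in members)
    if weight_sum == 0:
        raise ValueError("all line weights are zero")
    mean_offset = sum(offset * w for offset, w in members) / weight_sum
    return (best_start + mean_offset) % 90.0


if __name__ == "__main__":
    print(dominant_angle([(2.0, 1), (3.0, 1), (92.5, 2), (181.0, 1), (47.0, 1)]))
```

Averaging offsets relative to the window start, rather than the raw angles, keeps the result correct when the window wraps around from just below 90° to just above 0°.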
Once you have found the direction of lines, you can rotate your image so that the lines are perfectly aligned to the coordinate axes.
Detecting translation
Assuming the image wasn't scaled at any point, you can then use an FFT-based correlation to match the image to the template. Convert both images to gray and pad them with zeros until the originals take up at most 1/2 the edge length of the padded image, which should preferably be a power of two. FFT both images in both directions, multiply one transform element-wise by the complex conjugate of the other, and iFFT back. The resulting image encodes how well the two images agree for a given shift relative to one another. Simply find the maximum, and you know how to make them match.
Added text will cause no problems at all. This method will work best for large areas, like the company logo and gray background boxes. Thin lines will provide a poorer match, so in those cases you might have to blur the picture before doing the correlation, to broaden the features. You don't have to use the blurred image for further processing; once you know the offset you can return to the rotated but unblurred version.
Now you know both rotation and translation, and assumed no scaling or shearing, so you know exactly which portion of the template corresponds to which portion of the scan. Proceed.
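As a rough illustration of the FFT-based correlation, here is a small pure-Python sketch. It multiplies the scan's spectrum by the complex conjugate of the template's spectrum, which is what makes the product a correlation rather than a convolution. The tiny recursive FFT is only there to keep the example self-contained; in practice a library FFT would be used. All function names are hypothetical, and `size` must be a power of two at least twice the image edge length.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft2(grid, inverse=False):
    """2D transform: FFT every row, then every column."""
    rows = [fft(row, inverse) for row in grid]
    cols = [fft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def pad(image, size):
    """Place a gray image in the top-left corner of a size x size zero grid."""
    padded = [[0.0] * size for _ in range(size)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            padded[y][x] = value
    return padded


def find_shift(template, scan, size):
    """Return (dy, dx) such that scan[y + dy][x + dx] best matches template[y][x]."""
    t = fft2(pad(template, size))
    s = fft2(pad(scan, size))
    product = [[s[y][x] * t[y][x].conjugate() for x in range(size)]
               for y in range(size)]
    corr = fft2(product, inverse=True)
    _, dy, dx = max((corr[y][x].real, y, x)
                    for y in range(size) for x in range(size))
    # Shifts past the halfway point are negative shifts (the result is cyclic).
    if dy > size // 2:
        dy -= size
    if dx > size // 2:
        dx -= size
    return dy, dx


if __name__ == "__main__":
    template = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
    scan = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 0]]
    print(find_shift(template, scan, 8))
```

The inverse transform is left unnormalized because only the position of the maximum matters, not its value.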
If rotation is solved already, I'd just sum up all pixel color values horizontally and vertically to a single horizontal / vertical "line". This should provide clear spikes where you have horizontal and vertical lines in the form.
p.s. Generated a corresponding horizontal image with Gimp's scaling capabilities, attached below (it's a bit hard to see because it's only one pixel high and may get scaled down because it's > 700 px wide; the url is http://i.stack.imgur.com/Zy8zO.png ).
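The row and column sums can be computed as in the sketch below. It assumes the image is given as rows of "darkness" values (0 for white, larger for darker), so that ruled lines show up as large sums; the function names and the 0.8 threshold for calling something a spike are assumptions.

```python
def projection_profiles(image):
    """Sum pixel darkness along each row and each column.

    `image` is a list of rows of numbers where 0 means white.
    Returns (row_sums, column_sums).
    """
    row_sums = [sum(row) for row in image]
    column_sums = [sum(column) for column in zip(*image)]
    return row_sums, column_sums


def find_spikes(profile, fraction=0.8):
    """Indices whose sum reaches at least `fraction` of the largest sum."""
    peak = max(profile)
    if peak == 0:
        return []
    return [i for i, value in enumerate(profile) if value >= fraction * peak]


if __name__ == "__main__":
    # 6 x 8 test image with one horizontal line (row 2) and one vertical line (column 5).
    image = [[1 if (y == 2 or x == 5) else 0 for x in range(8)] for y in range(6)]
    rows, columns = projection_profiles(image)
    print(find_spikes(rows), find_spikes(columns))
```

Spikes in row_sums mark horizontal lines of the form, and spikes in column_sums mark vertical ones.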

Geometry of fonts

If I want to draw text on a control, I can first get a bounding rectangle (using the GetTextExtentPoint32 function) and place it appropriately.
But I also need to know where certain baselines are, e.g. the two red lines in the picture.
(Their positions are measured relative to the top of the bounding rectangle.)
I haven't figured out how to get this information. Please help.
The function GetTextMetrics will get you this. Select your font into the DC first, then call GetTextMetrics. The fields tmAscent and tmDescent of the TEXTMETRIC structure are probably the ones you need.

improve cartographic visualization

I need some advice about how to improve the visualization of cartographic information.
The user can select different species, and the web mapping app shows each one's geographical distribution (polygonal degree cells), each species with its own color range (e.g. darker orange where there is more information, lighter orange where there is less).
The problem arises when more than one species overlaps. What I currently do is simply calculate the additive color mix of the two colors using http://www.xarg.org/project/jquery-color-plugin-xcolor/
As you can see in the image below, the resulting color where two species overlap (mixed blue and yellow) is not intuitive at all.
Does anyone have ideas, or know of similar tools I could take inspiration from? I create the polygons with d3.js, so if more complex SVG features are needed I can give them a try.
Some ideas I had are...
1) The more data on a polygon, the thicker the border (or each part of the border with its corresponding color)
2) add a label at the center of polygon saying how many species overlap.
3) Divide polygon in different parts, each one with corresponding species color.
thanks in advance,
Pere
My suggestion is something along the lines of option #3 that you listed, with a twist. Rather than painting the entire cell with species colors, place dots in each cell, one for each species. You can vary the color of each dot in the same way that you currently do: darker for more, lighter for less. This doesn't require you to blend colors, and it will expose more of your map to provide more context to the data. I'd try this approach with and without the cell border, and see which works best.
Your visualization might also benefit from some interactivity. A tooltip providing more detailed information, and perhaps a further breakdown, could be displayed when the user hovers the mouse over a cell.
All of this is very subjective. However one thing's for sure: when you're dealing with multi-dimensional data as you are, the less you project dimensions down onto the same visual/perceptual axis, the better. I've seen some examples of "4-dimensional heatmaps" succeed in doing this (here's an example of visualizing latency on a heatmap, identifying different sources with different colors), but I don't think any attempt's made to combine colors.
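The one-dot-per-species idea could be prototyped with a small layout helper like the one below before porting it to d3.js. Everything here is an assumption for illustration: the function name, laying the dots out in a single row, and the lightness range of 0.8 (few records) down to 0.3 (many records).

```python
def dot_layout(cell, counts, max_count):
    """Place one dot per species present in a cell.

    cell      -- (x, y, width, height) of the cell's bounding box
    counts    -- {species_name: number_of_records} for this cell
    max_count -- count that maps to the darkest shade
    Returns a list of dicts with the dot centre and a lightness in 0.3..0.8.
    """
    x, y, width, height = cell
    present = [(name, count) for name, count in sorted(counts.items()) if count > 0]
    dots = []
    for index, (name, count) in enumerate(present):
        # Spread the dots evenly along the horizontal midline of the cell.
        cx = x + width * (index + 1) / (len(present) + 1)
        cy = y + height / 2
        share = min(count / max_count, 1.0)
        lightness = 0.8 - 0.5 * share  # darker for more records
        dots.append({"species": name, "cx": cx, "cy": cy, "lightness": lightness})
    return dots


if __name__ == "__main__":
    for dot in dot_layout((0, 0, 30, 10), {"a": 10, "b": 5, "c": 0}, 10):
        print(dot)
```

Each species keeps its own hue and only the lightness changes with the count, so no colors are ever blended.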
My initial thoughts about what you are trying to create (a customized variant of a heat map for a slightly crowded data set, I believe):
One strategy is to use the suggested formula of n + 1 breaks for the bin spacing. This causes me some concern regarding how many outliers your set has.
"Equally-spaced breaks are ideal for compact data sets without outliers. In many real data sets, especially proteomics data sets, outliers can make this representation less effective."
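To see why outliers matter for equally spaced breaks, here is a small sketch. Note that n color bins need n + 1 break values. The quantile-based breaks are one common alternative added here purely for comparison and are not something the quoted text prescribes; all function names are hypothetical.

```python
def equal_breaks(values, n_bins):
    """n_bins + 1 equally spaced breaks from the smallest to the largest value."""
    low, high = min(values), max(values)
    step = (high - low) / n_bins
    return [low + i * step for i in range(n_bins + 1)]


def quantile_breaks(values, n_bins):
    """n_bins + 1 breaks taken at evenly spaced ranks of the sorted values.

    Repeated values in the data can produce duplicate breaks.
    """
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[round(i * last / n_bins)] for i in range(n_bins + 1)]


def bin_counts(values, breaks):
    """Count the values per bin; the last bin also includes its upper break."""
    counts = [0] * (len(breaks) - 1)
    for value in values:
        for i in range(len(counts)):
            if value < breaks[i + 1] or i == len(counts) - 1:
                counts[i] += 1
                break
    return counts


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # one outlier
    print(bin_counts(data, equal_breaks(data, 3)))
    print(bin_counts(data, quantile_breaks(data, 3)))
```

With the single outlier at 100, equally spaced breaks put nine of the ten values into the first bin, so almost every cell would get the same color; the rank-based breaks spread the values across all three bins.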
One suggestion I have would be to consider the idea of adding some filters to your categories if you have not yet. This would allow slimming down the rendered data for faster reading by the user.
Another solution would be to use something like (Comprehensive) R, or maybe even DanteR.
Tutorial in displaying mass spectrometry-based proteomic data using heat maps (particularly worth noting, I felt, was 'Color mapping').

Can I link actions between two d3 charts?

Very casual JavaScript user here. Hopefully this makes sense.
I am tracking 20 different series on a stacked area chart with nvd3.js. These series fluctuate between positive and negative values, some on a huge base and some on a small base. The result is that - when one of the really big series is below the x axis - it pushes everything else underneath too, and the positive series won't appear above the x axis until you filter out the bigger players using the key.
The technically inelegant but good looking solution I have come up with is to split all of my negative values into one array, and all of my positives into another. The top half of the page is a positive values graph, the bottom half is negative values and they line up pretty nicely.
The weakness with this approach is when you go to interact with it as an end user. If I filter out a series (by unchecking it in the key) or change the graph mode (with the type selector) or zoom in on a series (by clicking it so the graph refocuses to that series only) then it will only affect whichever graph you clicked on. I would like to adjust these three click events (and any others I've missed?) so that your action is synchronised across both graphs.
Is this achievable? Any reading material I can dig through where somebody has done something similar? I imagine linking two representations of one data set (e.g a pie and column graph) is vaguely analogous.
