Histogram With Percent Split For Each Bar? - d3.js

Just discovered this incredible library, but am a bit overwhelmed with all the options. Consider a visualization of players, let's say basketball players. We want to compare them across on a number of qualities, say passing, scoring, rebounding, etc.
On any given quality, we compare each players score relative to the other's. So if one player had 2 points while the other only had 1 point, the first player would get 66% and the other player 33%. If there scores were equal, they'd each get 50%.
The qualities, however, vary in importance. So now imagine each quality as a histogram bar. It's height will tell you how important that quality is relative to the other qualities, ie, relative to the other histogram bar. And each bar will be shaded half one color, half another color, showing you each player's relative score on that quality.
First Question: Could I create such a graph using D3? Could I make the histogram horizontal?
Is there a better way to visualize these two simultaneous comparisons?
Thanks!
EDIT:
I found this example which could work perfectly for my situations -- size of circles showing importance of quality, and division of circles into red/blue showing each player's score. But I can't figure out what kind of graph that is?

The bubble chart is a force directed layout of nodes, I don't think there is a special name for the fact that the nodes show a relationship between two data points. Maybe if Mike sees this he can let you know since he is the one that built the example for the Times.
There are quite a few tricks to get this to work, you can check out the code at http://graphics8.nytimes.com/newsgraphics/2012/09/04/convention-speeches/ac823b240e99920e91945dbec49f35b268c09c38/index.js which has thankfully been left unminified.

Related

Invoice / OCR: Detect two important points in invoice image

I am currently working on OCR software and my idea is to use templates to try to recognize data inside invoices.
However scanned invoices can have several 'flaws' with them:
Not all invoices, based on a single template, are correctly aligned under the scanner.
People can write on invoices
etc.
Example of invoice: (Have to google it, sadly cannot add a more concrete version as client data is confidential obviously)
I find my data in the invoices based on the x-values of the text.
However I need to know the scale of the invoice and the offset from left/right, before I can do any real calculations with all data that I have retrieved.
What have I tried so far?
1) Making the image monochrome and use the left and right bounds of the first appearance of a black pixel. This fails due to the fact that people can write on invoices.
2) Divide the invoice up in vertical sections, use the sections that have the highest amount of black pixels. Fails due to the fact that the distribution is not always uniform amongst similar templates.
I could really use your help on (1) how to identify important points in invoices and (2) on what I should focus as the important points.
I hope the question is clear enough as it is quite hard to explain.
Detecting rotation
I would suggest you start by detecting straight lines.
Look (perhaps randomly) for small areas with high contrast, i.e. mostly white but a fair amount of very black pixels as well. Then try to fit a line to these black pixels, e.g. using least squares method. Drop the outliers, and fit another line to the remaining points. Iterate this as required. Evaluate how good that fit is, i.e. how many of the pixels in the observed area are really close to the line, and how far that line extends beyond the observed area. Do this process for a number of regions, and you should get a weighted list of lines.
For each line, you can compute the direction of the line itself and the direction orthogonal to that. One of these numbers can be chosen from an interval [0°, 90°), the other will be 90° plus that value, so storing one is enough. Take all these directions, and find one angle which best matches all of them. You can do that using a sliding window of e.g. 5°: slide accross that (cyclic) region and find a value where the maximal number of lines are within the window, then compute the average or median of the angles within that window. All of this computation can be done taking the weights of the lines into account.
Once you have found the direction of lines, you can rotate your image so that the lines are perfectly aligned to the coordinate axes.
Detecting translation
Assuming the image wasn't scaled at any point, you can then try to use a FFT-based correlation of the image to match it to the template. Convert both images to gray, pad them with zeros till the originals take up at most 1/2 the edge length of the padded image, which preferrably should be a power of two. FFT both images in both directions, multiply them element-wise and iFFT back. The resulting image will encode how much the two images would agree for a given shift relative to one another. Simply find the maximum, and you know how to make them match.
Added text will cause no problems at all. This method will work best for large areas, like the company logo and gray background boxes. Thin lines will provide a poorer match, so in those cases you might have to blur the picture before doing the correlation, to broaden the features. You don't have to use the blurred image for further processing; once you know the offset you can return to the rotated but unblurred version.
Now you know both rotation and translation, and assumed no scaling or shearing, so you know exactly which portion of the template corresponds to which portion of the scan. Proceed.
If rotation is solved already, I'd just sum up all pixel color values horizontally and vertically to a single horizontal / vertical "line". This should provide clear spikes where you have horizontal and vertical lines in the form.
p.s. Generated a corresponding horizontal image with Gimp's scaling capabilities, attached below (it's a bit hard to see because it's only one pixel high and may get scaled down because it's > 700 px wide; the url is http://i.stack.imgur.com/Zy8zO.png ).

improve cartographic visualization

I need some advice about how to improve the visualization of cartographic information.
User can select different species and the webmapping app shows its geographical distribution (polygonal degree cells), each specie with a range of color (e.g darker orange color where we find more info, lighter orange where less info).
The problem is when more than one specie overlaps. What I am currently doing is just to calculate the additive color mix of two colors using http://www.xarg.org/project/jquery-color-plugin-xcolor/
As you can see in the image below, the resulting color where two species overlap (mixed blue and yellow) is not intuitive at all.
Someone has any idea or knows similar tools where to get inspiration? for creating the polygons I use d3.js, so if more complex SVG features have to be created I can give a try.
Some ideas I had are...
1) The more data on a polygon, the thicker the border (or each part of the border with its corresponding color)
2) add a label at the center of polygon saying how many species overlap.
3) Divide polygon in different parts, each one with corresponding species color.
thanks in advance,
Pere
My suggestion is something along the lines of option #3 that you listed, with a twist. Rather painting the entire cell with species colors, place a dot in each cell, one for each species. You can vary the color of each dot in the same way that you currently are: darker for more, ligher for less. This doesn't require you to blend colors, and it will expose more of your map to provide more context to the data. I'd try this approach with the border of the cell and without, and see which one works best.
Your visualization might also benefit from some interactivity. A tooltip providing more detailed information and perhaps a further breakdown of information could be displayed when the user hovers his mouse over each cell.
All of this is very subjective. However one thing's for sure: when you're dealing with multi-dimensional data as you are, the less you project dimensions down onto the same visual/perceptual axis, the better. I've seen some examples of "4-dimensional heatmaps" succeed in doing this (here's an example of visualizing latency on a heatmap, identifying different sources with different colors), but I don't think any attempt's made to combine colors.
My initial thoughts about what you are trying to create (a customized variant of a heat map for a slightly crowded data set, I believe:
One strategy is to employ a formula suggested for
n + 1
with regards to breaks in bin spacing. This causes me some concern regarding how many outliers your set has.
Equally-spaced breaks are ideal for compact data sets without
outliers. In many real data sets, especially proteomics data sets,
outliers can make this representation less effective.
One suggestion I have would be to consider the idea of adding some filters to your categories if you have not yet. This would allow slimming down the rendered data for faster reading by the user.
another solution would be to use something like (Comprehensive) R
or maybe even DanteR
Tutorial in displaying mass spectrometry-based proteomic data using heat maps
(Particularly worth noting I felt, was 'Color mapping'.)

Counting object on image algorithm

I got school task again. This time, my teacher gave me task to create algorithm to count how many ducks on picture.
The picture is similar to this one:
I think I should use pattern recognition for searching how many ducks on it. But I don't know which pattern match for each duck.
I think that you can solve this problem by segmenting the ducks' beaks and counting the number of connected components in the binary image.
To segment the ducks' beaks, first convert the image to HSV color space and then perform a binarization using the hue component. Note that the ducks' beaks hue are different from other parts of the image.
Here's one way:
Hough transform for circles:
Initialize an accumulator array indexed by (x,y,radius)
For each pixel:
calculate an edge (e.g. Sobel operator will provide both magnitude and direction), if magnitude exceeds some threshold then:
increment every accumulator for which this edge could possibly lend evidence (only the (x,y) in the direction of the edge, only radii between min_duck_radius and max_duck_radius)
Now smooth and threshold the accumulator array, and the coordinates of highest accumulators show you where the heads are. The threshold may leap out at you if you histogram the values in the accumulators (there may be a clear difference between "lots of evidence" and "noise").
So that's very terse, but it can get you started.
It might be just because I'm working with SIFT right now, but to me it looks like it could be good for your problem.
It is an algorithm that matches the same object on two different pictures, where the objects can have different orientations, scales and be viewed from different perspectives on the two pictures. It can also work when an object is partially hidden (as your ducks are) by another object.
I'd suggest finding a good clear picture of a rubber ducky ( :D ) and then use some SIFT implementation (VLFeat - C library with SIFT but no visualization, SIFT++ - based on VLFeat, but in C++ , Rob Hess in C with OpenCV...).
You should bear in mind that matching with SIFT (and anything else) is not perfect - so you might not get the exact number of rubber duckies in the picture.

Automatic tracking algorithm

I'm trying to write a simple tracking routine to track some points on a movie.
Essentially I have a series of 100-frames-long movies, showing some bright spots on dark background.
I have ~100-150 spots per frame, and they move over the course of the movie. I would like to track them, so I'm looking for some efficient (but possibly not overkilling to implement) routine to do that.
A few more infos:
the spots are a few (es. 5x5) pixels in size
the movement are not big. A spot generally does not move more than 5-10 pixels from its original position. The movements are generally smooth.
the "shape" of these spots is generally fixed, they don't grow or shrink BUT they become less bright as the movie progresses.
the spots don't move in a particular direction. They can move right and then left and then right again
the user will select a region around each spot and then this region will be tracked, so I do not need to automatically find the points.
As the videos are b/w, I though I should rely on brigthness. For instance I thought I could move around the region and calculate the correlation of the region's area in the previous frame with that in the various positions in the next frame. I understand that this is a quite naïve solution, but do you think it may work? Does anyone know specific algorithms that do this? It doesn't need to be superfast, as long as it is accurate I'm happy.
Thank you
nico
Sounds like a job for Blob detection to me.
I would suggest the Pearson's product. Having a model (which could be any template image), you can measure the correlation of the template with any section of the frame.
The result is a probability factor which determine the correlation of the samples with the template one. It is especially applicable to 2D cases.
It has the advantage to be independent from the sample absolute value, since the result is dependent on the covariance related with the mean of the samples.
Once you detect an high probability, you can track the successive frames in the neightboor of the original position, and select the best correlation factor.
However, the size and the rotation of the template matter, but this is not the case as I can understand. You can customize the detection with any shape since the template image could represent any configuration.
Here is a single pass algorithm implementation , that I've used and works correctly.
This has got to be a well reasearched topic and I suspect there won't be any 100% accurate solution.
Some links which might be of use:
Learning patterns of activity using real-time tracking. A paper by two guys from MIT.
Kalman Filter. Especially the Computer Vision part.
Motion Tracker. A student project, which also has code and sample videos I believe.
Of course, this might be overkill for you, but hope it helps giving you other leads.
Simple is good. I'd start doing something like:
1) over a small rectangle, that surrounds a spot:
2) apply a weighted average of all the pixel coordinates in the area
3) call the averaged X and Y values the objects position
4) while scanning these pixels, do something to approximate the bounding box size
5) repeat next frame with a slightly enlarged bounding box so you don't clip spot that moves
The weight for the average should go to zero for pixels below some threshold. Number 4 can be as simple as tracking the min/max position of anything brighter than the same threshold.
This will of course have issues with spots that overlap or cross paths. But for some reason I keep thinking you're tracking stars with some unknown camera motion, in which case this should be fine.
I'm afraid that blob tracking is not simple, not if you want to do it well.
Start with blob detection as genpfault says.
Now you have spots on every frame and you need to link them up. If the blobs are moving independently, you can use some sort of correspondence algorithm to link them up. See for instance http://server.cs.ucf.edu/~vision/papers/01359751.pdf.
Now you may have collisions. You can use mixture of gaussians to try to separate them, give up and let the tracks cross, use any other before-and-after information to resolve the collisions (e.g. if A and B collide and A is brighter before and will be brighter after, you can keep track of A; if A and B move along predictable trajectories, you can use that also).
Or you can collaborate with a lab that does this sort of stuff all the time.

Looking for a good world map generation algorithm [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
I'm working on a Civilization-like game and I'm looking for a good algorithm for generating Earth-like world maps. I've experimented with a few alternatives, but haven't hit on a real winner yet.
One option is to generate a heightmap using Perlin noise and add water at a level so that about 30% of the world is land. While Perlin noise (or similar fractal-based techniques) is frequently used for terrain and is reasonably realistic, it doesn't offer much in the way of control over the number, size and position of the resulting continents, which I'd like to have from a gameplay perspective.
A second option is to start with a randomly positioned one-tile seed (I'm working on a grid of tiles), determine the desired size for the continent and each turn add a tile that is horizontally or vertically adjacent to the existing continent until you've reached the desired size. Repeat for the other continents. This technique is part of the algorithm used in Civilization 4. The problem is that after placing the first few continents, it's possible to pick a starting location that's surrounded by other continents, and thus won't fit the new one. Also, it has a tendency to spawn continents too close together, resulting in something that looks more like a river than continents.
Does anyone happen to know a good algorithm for generating realistic continents on a grid-based map while keeping control over their number and relative sizes?
You could take a cue from nature and modify your second idea. Once you generate your continents (which are all about the same size), get them to randomly move and rotate and collide and deform each other and drift apart from each other. (Note: this may not be the easiest thing ever to implement.)
Edit: Here's another way of doing it, complete with an implementation — Polygonal Map Generation for Games.
I've created something similar to your first image in JavaScript. It's not super sophisticated but it works :
http://jsfiddle.net/AyexeM/zMZ9y/
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
<style type="text/css">
#stage{
font-family: Courier New, monospace;
}
span{
display: none;
}
.tile{
float:left;
height:10px;
width:10px;
}
.water{
background-color: #55F;
}
.earth{
background-color: #273;
}
</style>
</head>
<body>
<div id="stage">
</div>
<script type="text/javascript">
var tileArray = new Array();
var probabilityModifier = 0;
var mapWidth=135;
var mapheight=65;
var tileSize=10;
var landMassAmount=2; // scale of 1 to 5
var landMassSize=3; // scale of 1 to 5
$('#stage').css('width',(mapWidth*tileSize)+'px');
for (var i = 0; i < mapWidth*mapheight; i++) {
var probability = 0;
var probabilityModifier = 0;
if (i<(mapWidth*2)||i%mapWidth<2||i%mapWidth>(mapWidth-3)||i>(mapWidth*mapheight)-((mapWidth*2)+1)){
// make the edges of the map water
probability=0;
}
else {
probability = 15 + landMassAmount;
if (i>(mapWidth*2)+2){
// Conform the tile upwards and to the left to its surroundings
var conformity =
(tileArray[i-mapWidth-1]==(tileArray[i-(mapWidth*2)-1]))+
(tileArray[i-mapWidth-1]==(tileArray[i-mapWidth]))+
(tileArray[i-mapWidth-1]==(tileArray[i-1]))+
(tileArray[i-mapWidth-1]==(tileArray[i-mapWidth-2]));
if (conformity<2)
{
tileArray[i-mapWidth-1]=!tileArray[i-mapWidth-1];
}
}
// get the probability of what type of tile this would be based on its surroundings
probabilityModifier = (tileArray[i-1]+tileArray[i-mapWidth]+tileArray[i-mapWidth+1])*(19+(landMassSize*1.4));
}
rndm=(Math.random()*101);
tileArray[i]=(rndm<(probability+probabilityModifier));
}
for (var i = 0; i < tileArray.length; i++) {
if (tileArray[i]){
$('#stage').append('<div class="tile earth '+i+'"> </div>');
}
else{
$('#stage').append('<div class="tile water '+i+'"> </div>');
}
}
</script>
</body>
</html>
I'd suggest you back up and
Think about what makes "good" continents.
Write an algorithm that can tell a good continental layout from a bad one.
Refine the algorithm so that you can quantify how good a good layout is.
Once you have that in place, you can start to implement an algorithm which should be shaped like this:
Generate crappy continents and then improve them.
For improvement you can try all sorts of standard optimization tricks, whether it's simulated annealing, genetic programming, or something completely ad hoc, like moving a randomly chosen edge square from whereever it is on the continent to the edge opposite the continent's center of mass. But the key is to be able to write a program that can tell good continents from bad ones. Start out with hand-drawn continents as well as your test continents, until you get something you like.
I wrote something similar to what you're after for an automated screensaver-style clone of Civilization 1. For the record I wrote this in VB.net but since you don't mention anything about language or platform in your question I'll keep it abstract.
The "map" specifies the number of continents, continent size variance (eg 1.0 would keep all continents with the same approximate land area, down to 0.1 would allow continents to exist with 1/10th the mass of the largest continent), maximum land area (as a percentage) to generate, and the central land bias. A "seed" is distributed randomly around the map for each continent, weighted towards the centre of the map as per the central bias (eg a low bias produces distributed continents more similar to Earth, where as a high central bias will resemble more of a Pangaea). Then for each iteration of growth, the "seeds" assign land tiles according to a distribution algorithm (more on that later) until a maximum land area has been reached.
The land distribution algorithm can be as precise as you want but I found more interesting results applying various genetic algorithms and rolling the dice. Conway's "Game of Life" is a really easy one to start out with. You'll need to add SOME globally aware logic to avoid things like continents growing into each other but for the most part things take care of themselves. The problem I found with more fractal-based approaches (which was my first inclination) was the results either looked too patterned, or lead to too many scenarios requiring hacky-feeling workaround rules to get a result which still didn't feel dynamic enough. Depending on the algorithm you use, you may want to apply a "blurring" pass over the result to eliminate things like abundant single-square ocean tiles and checkered coastlines. In the event something like a continent being spawned surrounded by several others and having nowhere left to grow, relocate the seed to a new point on the map and continue the growth passes. Yes, it can mean you sometimes end up with more continents than planned, but if it's really something you firmly don't want then another way to help avoid it is bias the growth algorithms so they favour growth in the direction with least proximity to other seeds. At worst (in my opinion anyway), you can flag a series as invalid when a seed has nowhere left to grow and generate a new map. Just make sure you set a maximum number of attempts so if anything unrealistic is specified (like fitting 50 even-weighted continents on a 10x10 board) it doesn't spend forever trying to find a valid solution.
I can't vouch for how Civ etc do it, and of course doesn't cover things like climate, land age etc but by playing around with the seed growth algorithm you can get pretty interesting results that resemble continents, archipelagos etc. You can use the same approach to produce 'organic' looking rivers, mountain ranges etc too.
Just thinking off the cuff here:
Pick some starting points, and assign each a randomly drawn (hoped for) size. You can can maintain a separate size draw for planned continents and planned islands if you want.
Loop over the land elements, and where they are not yet at the planned size add one square. But the fun part is weighing the chance that each neighboring element will be the one. Some suggested thing that might factor in:
Distance to the nearest "other" land. Further is better generates wide oceanic spaces. Nearer is better makes narrow channels. You have to decide if you're going to let bits merge as well.
Distance from the seed. Nearer is better means compact land masses, farther is better means long strung out bits
Number of existing land squares adjacent. Weighting in favor of many adjacent squares gives you smooth coast, preferring few gives you lots of inlets and peninsulas.
Presence of "resources" squares nearby? Depends on the game rules, when you generate resource square, and if you want to make it easy.
Will you allow bits to approach or join with the poles?
??? don't know what else
Continue until all land masses have reached the planned size or can't grow anymore for some reason.
Notice that diddling the parameter to these weighting factors allows you to tune the kind of world generated , which is a feature I liked about some of the Civs.
This way you'll need to do terrain generation on each bit separately.
You could try a diamond square algorithm or perlin noise to generate something like a height map. Then, assign ranges values to what shows up on the map. If your "height" goes from 0 to 100, then make 0 - 20 water, 20 - 30 beach, 30 - 80 grass, 80 - 100 mountains. I think notch did something similar to this in minicraft, but I'm not an expert, I'm just in a diamond square mindset after finally getting it working.
I think you can use "dynamic programming" style approach here.
Solve small problems first and combine
solutions smartly to solve bigger
problem.
A1= [elliptical rectangular random ... ]// list of continents with area A1 approx.
A2= [elliptical rectangular random ... ]// list of continents with area A2 approx.
A3= [elliptical rectangular random ... ]// list of continents with area A3 approx.
...
An= [elliptical rectangular random ... ]// list of continents with area An approx.
// note that elliptical is approximately elliptical in shape and same for the other shapes.
Choose one/more randomly from each of the lists (An).
Now you have control over number and area of continents.
You can use genetic algorithm for positioning them
as you see "fit" ;)
It will be very good to take a look at some "Graph Layout Algorithms"
Force Based Algorithms
Genetic Algorithm for Graph Layout
You can modify these to suit your purpose.
I had an idea for map creation similar to the tectonic plates answer. It went something like this:
sweep through the grid squares giving each square a "land" square if rnd <= 0.292 (the actual percentage of dry land on planet earth).
Migrate each land chunk one square toward its nearest larger neighbour. If neighbours are equidistant, go toward the larger chunk. If chunks are equal size, choose one randomly.
if two land squares touch, group them into a chunk, moving all squares as one from now on.
repeat from step 2. Stop when all chunks are connected.
This is similar to how gravity works in a 3D space. It's pretty complicated. A simpler algorithm for your needs would work as follows:
Drop in n starter land squares at random x,y positions and acceptable distances from each other. These are seeds for your continents. (Use the Pythagorean theorem to ensure the seeds have a minimum distance between themselves and all others.)
spawn a land square from an existing land square in a random direction, if that direction is an ocean square.
repeat step 2. Stop when land squares fill 30% of total map size.
if continents are close enough to each other, drop in land bridges as desired to simulate a Panama type effect.
Drop in smaller, random islands as desired for a more natural look.
for each extra "island" square you add, cut out inland seas and lake squares from the continents using the same algorithm in reverse. This will maintain the land percentage at the desired amount.
Let me know how this works out. I've never tried it myself.
PS. I see this is similar to what you tried. Except it sets up all the seeds at once, before beginning, so the continents will be far enough apart and will stop when the map is sufficiently filled.
I haven't actually tried this but it was inspired by David Johnstone's answer regarding tectonic plates. I tried implementing it myself in my old Civ project and when it came to handling collisions I had another idea. Instead of generating tiles directly, each continent consists of nodes. Distribute mass to each node then generate a series of "blob" continents using a 2D metaball approach. Tectonics and continental drift would be ridiculously easy to "fake" simply by moving the nodes around. Depending on how complex you want to go, you could even apply things like currents to handle the node movement and generate mountain ranges that correspond to plate boundaries overlapping. Probably wouldn't add that much to the gameplay side of things, but it could make for an interesting map generation from a purely academic perspective :)
A good explanation of metaballs if you haven't worked with them before:
http://www.gamedev.net/page/resources/_//feature/fprogramming/exploring-metaballs-and-isosurfaces-in-2d-r2556
Here's what I'm thinking, since I'm about to implement something like this that I have for a game in development. :
The world divided into regions. depending on the size of the world, it will determine how many regions. For this example, we'll assume a medium sized world, with 6 regions. Each grid zone breaks into 9 grid zones. those grid zones break into 9 grids each. (this is not for character movement, but merely for map creation) The Grids are for biomes, grid zones are for over arching land features, (continent vs ocean) and the regions are for overall climate. The grids break down into tiles.
Randomly generated, the regions get assigned logical climate sets. Grid zones get randomly assigned to, for instance; ocean or land. Grids get assigned biomes randomly with modifiers based on their grid zones and climate, these being forest, desert, plains, glacial, swamp or volcanic. Once all those basics are assigned, it's time to blend them together, using a random percentage based function that fills in tile sets. For example; if you have a forest biome, next to a desert biome, you have an algorithm that decreases the likely hood that a tile will be "foresty" and increases that it will be "deserty." So, about half way between them, you'll see a sort of blended affect combining the two biomes to off a somewhat smooth transition between them. Transition from one grid zone to the next would probably take a little more work to insure logic landmass formations.Like, for example, a biome from one grid zone that touches the biome from another, instead of having a simple switching percentage based on proximity. For example, there are 50 tiles from the center of the biome to the edge of the biome, meaning, there are 50 from the edge it touches to the center of the next biome. That would logically leave a 100% change from one biome to the next. So as the tiles get nearer to the border of the two biomes, the percentage narrows out to around 60% or so. It'd, I think, be unwise to give too much probability of crossing biomes far from the border, but you'll want the border to be somewhat blended. For the grid zones, the percentage change will be much more pronounced. Instead of the % going down to around 60%, it'd only drop down to around 80%. And a secondary check would then have to be performed to ensure that there's not a random water tile in the middle of a land biome next to the ocean without some logic to it. So, either, connect that water tile to the ocean mass to make a channel to explain the water tile, or remove it altogether. Land in a water based biome is easier to explain using rock outcrops and such.
Oh, kinda dumb, sorry.
I'd place fractal terrain according to some layout that you know "works" (e.g. 2x2 grid, diamond, etc, with some jitter) but with a Gaussian distribution damping peaks down towards the edges of the continent centers. Place the water level lower so that is mostly land until you get near the edges.

Resources