Normalizing dataset with ruby


I have a data set that ranges from 1 to 30,000
I want to normalize it, so that it becomes 0.1 to 10
What is the best method/function to do that?
Would greatly appreciate it if you could give some sample code!

Here's a code snippet, assuming you want a linear normalization. It's a very simplistic version (just straight code, no methods), so you can see "how it works" and can apply it to anything.
xmin = 1.0
xmax = 30000.0
ymin = 0.1
ymax = 10.0
xrange = xmax-xmin
yrange = ymax-ymin
y = ymin + (x-xmin) * (yrange / xrange)
And here it is done as a function:
def normalise(x, xmin, xmax, ymin, ymax)
xrange = xmax - xmin
yrange = ymax - ymin
ymin + (x - xmin) * (yrange.to_f / xrange)
end
puts normalise(2000, 1, 30000, 0.1, 10)
(Note: the to_f ensures we don't fall into the black hole of integer division)

This is a well-known way to scale a collection of numbers. It has a more precise name, but I can't remember it and failed to find it with Google.
def scale(numbers, min, max)
current_min = numbers.min
current_max = numbers.max
numbers.map {|n| min + (n - current_min) * (max - min) / (current_max - current_min)}
end
dataset = [1,30000,15000,200,3000]
result = scale(dataset, 0.1, 10.0)
=> [0.1, 10.0, 5.04983499449982, 0.165672189072969, 1.08970299009967]
scale(result, 1, 30000)
=> [1.0, 30000.000000000004, 15000.0, 199.99999999999997, 3000.0000000000005]
As you can see, you have to be aware of rounding issues. You should probably also make sure that you don't get integers as min & max because integer division will damage the result.

Here's the Ruby Way for the common case of setting an array's min to 0.0 and max to 1.0.
class Array
def normalize!
xMin,xMax = self.minmax
dx = (xMax-xMin).to_f
self.map! {|x| (x-xMin) / dx }
end
end
a = [3.0, 6.0, 3.1416]
a.normalize!
=> [0.0, 1.0, 0.047199999999999985]
For a min and max other than 0 and 1, add arguments to normalize! in the manner of Elfstrom's answer.
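As a rough illustration of that last suggestion, here is a minimal non-mutating sketch that takes the target range as arguments. The name rescale, its defaults, and the handling of empty or constant input are assumptions made for this example, not part of any answer above.

```ruby
# Hypothetical helper: linearly rescale numbers onto [new_min, new_max].
def rescale(numbers, new_min = 0.0, new_max = 1.0)
  return [] if numbers.empty?

  old_min, old_max = numbers.minmax
  span = (old_max - old_min).to_f # to_f avoids integer division
  # All values identical: there is no spread to map, so return the
  # midpoint of the target range (an arbitrary choice for this sketch).
  return numbers.map { (new_min + new_max) / 2.0 } if span.zero?

  numbers.map { |x| new_min + (x - old_min) * (new_max - new_min) / span }
end

p rescale([1, 30_000, 15_000, 200, 3_000], 0.1, 10.0)
p rescale([3.0, 6.0, 3.1416])
```

Returning a new array instead of patching Array keeps the core class untouched; whether that is preferable to normalize! is a matter of taste.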

x = (x - 1) / 3030.2020 + 0.1

Related

How to generate points on a line? in Processing

I’m trying to make 6 dots along a line(0, random(height), width, random(height)). The dots should be evenly spaced.

You can use lerp(start, end, t) to linearly interpolate between two values by specifying t: where between the start and end values you'd like the result to be.
This t value is between 0.0 and 1.0 (a normalised value). You can think of it as a percentage (e.g. 0.0 is the start value (0%), 1.0 is the end value (100%), and 0.5 is halfway between them (50%)).
In your case, you would:
store the randomly generated values first (before interpolation)
iterate 6 times
for each iteration, map the iteration index to the normalised value (t)
Finally, use lerp() by plugging in the from/to values and the t value at the current iteration.
Here's a basic example:
float fromX = 0;
float fromY = random(height);
float toX = width;
float toY = random(height);
int numPoints = 6;
for(int i = 0 ; i < numPoints; i++){
float interpolationAmount = map(i, 0, numPoints - 1, 0.0, 1.0);
float interpolatedX = lerp(fromX, toX, interpolationAmount);
float interpolatedY = lerp(fromY, toY, interpolationAmount);
ellipse(interpolatedX, interpolatedY, 9, 9);
}
Alternatively, you can use PVector's lerp() to easily interpolate between points in 2D (or 3D), without having to interpolate every component:
PVector start = new PVector(0 , random(height));
PVector end = new PVector(width, random(height));
for(float t = 0.0 ; t <= 1.0 ; t += 1.0 / 5){
PVector inbetween = PVector.lerp(start, end, t);
ellipse(inbetween.x, inbetween.y, 9, 9);
}
Update
The slope is the ratio (division) between the difference on the Y axis (called rise, Δy = y2 - y1 (e.g. toY - fromY)) and the difference on the X axis (called run, Δx = x2 - x1 (e.g. toX - fromX)).
You can use this difference between start and end points (defining the slope) to draw the points in between.
If you divide this difference into equal sections, each for a point you'd like to draw, then you can multiply it as you iterate and simply translate/offset it from the start position:
// start point
float fromX = 0;
float fromY = random(height);
// end point
float toX = width;
float toY = random(height);
// difference between each component
float diffY = toY - fromY;
float diffX = toX - fromX;
// slope = ratio between Y and X difference
float slope = diffY / diffX;
println("slope as ratio", slope, "as degrees", degrees(atan2(diffY, diffX) + PI));
// start drawing 6 points
int numPoints = 6;
// precalculate the spacing: 6 points span 5 equal sections, so each step is a fifth
float sectionIncrement = 1.0 / (numPoints - 1);
for(int i = 0 ; i < numPoints; i++){
// a fifth incremented (e.g. 1/5 * 0, * 1, * 2, ...)
float section = sectionIncrement * i;
// a fifth incremented and multiplied by the difference
// e.g. 1/5 of the difference, 2/5 of the difference, etc.
// which we then offset from the start location (fromX, fromY)
float x = fromX + (diffX * section);
float y = fromY + (diffY * section);
// render
ellipse(x, y, 9, 9);
}
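For readers following along in Ruby rather than Processing, the same interpolation idea can be sketched as below. The helper names lerp and points_on_line are made up for this example, and nothing is drawn; the points are only printed.

```ruby
# Linear interpolation: t = 0.0 gives from, t = 1.0 gives to.
def lerp(from, to, t)
  from + (to - from) * t
end

# Hypothetical helper: count evenly spaced points from (x1, y1) to (x2, y2),
# both endpoints included.
def points_on_line(x1, y1, x2, y2, count)
  raise ArgumentError, "count must be at least 2" if count < 2

  (0...count).map do |i|
    t = i.to_f / (count - 1) # count points span count - 1 sections
    [lerp(x1, x2, t), lerp(y1, y2, t)]
  end
end

points_on_line(0, 40, 500, 240, 6).each { |x, y| puts "#{x}, #{y}" }
```

Dividing by count - 1 rather than count is what makes the last point land exactly on the end of the line.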

point(0, random(height))
point(width/5, random(height))
point(width/5*2, random(height))
point(width/5*3, random(height))
point(width/5*4, random(height))
point(width, random(height))

How to generate Random coordinates within a circle with specified radius?

I am trying to generate random coordinates (lat, long) that lie within a circle with a 5 kilometer radius, where the center point is located at some coordinates (x, y). I am trying to code this in Ruby using the method below, but somehow I get results that are NOT within the specified 5 km radius.
def location(lat, lng, max_dist_meters)
max_radius = Math.sqrt((max_dist_meters ** 2) / 2.0)
lat_offset = rand(10 ** (Math.log10(max_radius / 1.11)-5))
lng_offset = rand(10 ** (Math.log10(max_radius / 1.11)-5))
lat += [1,-1].sample * lat_offset
lng += [1,-1].sample * lng_offset
lat = [[-90, lat].max, 90].min
lng = [[-180, lng].max, 180].min
[lat, lng]
end

Your code
max_radius = Math.sqrt((max_dist_meters ** 2) / 2.0)
This is just max_dist_meters.abs / Math.sqrt(2) or max_dist_meters * 0.7071067811865475.
10 ** (Math.log10(max_radius / 1.11)-5)
This can be written 9.00901E-6 * max_radius, so it's 6.370325E−6 * max_dist_meters.
rand(10 ** (Math.log10(max_radius / 1.11)-5))
Now for the fun part : rand(x) is just rand() if x is between -1 and 1. So if max_dist_meters is smaller than 1/6.370325E−6 ~ 156977.86, all your 3 first lines do is :
lat_offset = rand()
lng_offset = rand()
So for max_dist_meters = 5000, your method will return a random point that could be 1° longitude and 1° latitude away. At most, it would be a bit more than 157km.
Worse, if max_dist_meters is between 156978 and 313955, your code is equivalent to :
lat_offset = lng_offset = 0
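The truncation behaviour described above is easy to check directly. This is a small sketch; the sample size of 1000 is arbitrary, and the variable names are only for illustration.

```ruby
# Kernel#rand converts a numeric argument with to_i before using it as a bound.
fractional = Array.new(1000) { rand(0.5) }      # 0.5.to_i is 0, so this acts like plain rand
truncated  = Array.new(1000) { rand(1.9) }      # 1.9.to_i is 1, so this acts like rand(1)
ranged     = Array.new(1000) { rand(0.0..0.5) } # a Range argument is honoured as given

puts "rand(0.5) stays in [0, 1):    #{fractional.all? { |v| v >= 0.0 && v < 1.0 }}"
puts "rand(1.9) distinct values:    #{truncated.uniq.inspect}"
puts "rand(0.0..0.5) stays in range: #{ranged.all? { |v| v.between?(0.0, 0.5) }}"
```

Passing a Range is the simplest way to get a bounded random float from Kernel#rand.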
Since Ruby 2.4
[[-90, lat].max, 90].min
can be written lat.clamp(-90, 90)
Possible solution
To get a uniform distribution of random points on the disk of radius max_radius, you need a non-uniform distribution of random radii :
def random_point_in_disk(max_radius)
r = max_radius * rand ** 0.5
theta = rand * 2 * Math::PI
[r * Math.cos(theta), r * Math.sin(theta)]
end
Here's a plot with a million random points :
Here's the same plot with @Schwern's code :
Once you have this method, you can apply some basic math to convert meters to latitude and longitude. Just remember that 1° of latitude is always 111.2km but 1° of longitude is 111.2km at the equator but 0km at the poles :
def random_point_in_disk(max_radius)
r = max_radius * rand**0.5
theta = rand * 2 * Math::PI
[r * Math.cos(theta), r * Math.sin(theta)]
end
EarthRadius = 6371 # km
OneDegree = EarthRadius * 2 * Math::PI / 360 * 1000 # 1° latitude in meters
def random_location(lon, lat, max_radius)
dx, dy = random_point_in_disk(max_radius)
random_lat = lat + dy / OneDegree
random_lon = lon + dx / ( OneDegree * Math::cos(lat * Math::PI / 180) )
[random_lon, random_lat]
end
For this kind of calculation, there's no need to install an 800-pound GIS gorilla.
A few points:
We usually talk about latitude first and longitude second, but in GIS, it's usually lon first because x comes before y.
cos(lat) is considered to be constant, so max_radius shouldn't be too big. A few dozen kilometers shouldn't pose any problem. The shape of a disk on a sphere becomes weird with a large radius.
Don't use this method too close to the poles; you'd get arbitrarily large coordinates otherwise.
To test it, let's create random points on 200km disks at different locations:
10_000.times do
[-120, -60, 0, 60, 120].each do |lon|
[-85, -45, 0, 45, 85].each do |lat|
puts random_location(lon, lat, 200_000).join(' ')
end
end
end
With gnuplot, here's the resulting diagram:
Yay! I just reinvented Tissot's indicatrix, 150 years too late :
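One way to sanity-check the approach for the original 5 km case is to measure each generated point's great-circle distance from the center. The sketch below repeats the two methods from this answer so that it runs on its own; the haversine helper, the constant names, and the sample coordinates are assumptions added for the check.

```ruby
EARTH_RADIUS_M = 6_371_000.0
METERS_PER_DEGREE = EARTH_RADIUS_M * 2 * Math::PI / 360 # 1 degree of latitude

def random_point_in_disk(max_radius)
  r = max_radius * Math.sqrt(rand) # sqrt gives a uniform density over the disk
  theta = rand * 2 * Math::PI
  [r * Math.cos(theta), r * Math.sin(theta)]
end

def random_location(lon, lat, max_radius)
  dx, dy = random_point_in_disk(max_radius)
  [lon + dx / (METERS_PER_DEGREE * Math.cos(lat * Math::PI / 180)),
   lat + dy / METERS_PER_DEGREE]
end

# Hypothetical helper: great-circle distance in meters (haversine formula).
def haversine_m(lon1, lat1, lon2, lat2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a))
end

center = [2.17340, 41.38506] # lon, lat (sample values)
worst = Array.new(10_000) { haversine_m(*center, *random_location(*center, 5_000)) }.max
puts "largest distance from center: #{worst.round(1)} m"
```

Because cos(lat) is treated as constant, the largest distance can exceed 5000 m by a small amount; at this radius and latitude the excess should be a matter of meters.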

def location(x_origin, y_origin, radius)
x_offset, y_offset = nil, nil
rad = radius.to_f
begin
x_offset = rand(-rad..rad)
y_offset = rand(-rad..rad)
end until Math.hypot(x_offset, y_offset) < radius
[x_origin + x_offset, y_origin + y_offset]
end

I'd suggest generating a random radius and a random angle. Then you can use those to generate a coordinate with Math.sin and Math.cos.
def location(max_radius)
# 0 to max radius.
# Using a range ensures max_radius is included.
radius = Random.rand(0.0..max_radius)
# 0 to 2 Pi excluding 2 Pi because that's just 0.
# Using Random.rand() because Kernel#rand() does not honor floats.
radians = Random.rand(2 * Math::PI)
# Math.cos/sin work in radians, not degrees.
x = radius * Math.cos(radians)
y = radius * Math.sin(radians)
return [x, y]
end
I'll leave converting this to lat/long and adding a center for you.
Really what I'd suggest is finding a geometry library that supports operations like "give me a random point inside this shape", "is this point inside this shape" and will do lat/long conversions for you because this stuff is very easy to get subtly wrong.
You could build on top of the Ruby geometry gem which provides you with classes for basic shapes. Or if your data is in a database, many support geometric types like PostgreSQL's geometry types, the more powerful PostGIS add-on, or even MySQL has spatial data types.

There's a gem that does what you need.
https://github.com/sauloperez/random-location
Just install the gem and require it in your code:
gem install random-location
require 'random-location'
RandomLocation.near_by(41.38506, 2.17340, 10000)

How do I translate and scale points within a bounding box?

I have a number of points P of the form (x, y) where x,y are real numbers. I want to translate and scale all these points within a bounding box (rectangle) which begins at the point (0,0) (top left) and extends to the point (1000, 1000) (bottom right).
Why is it that the following algorithm does not produce points in that bounding box?
for Point p in P:
max = greatest(p.x, p.y, max)
scale = 1000 / max
for Point p in P:
p.x = (p.x - 500) * scale + 500
p.y = (p.y - 500) * scale + 500
I fear that this won't work when p.x or p.y is a negative number.
I would also like to maintain the "shape" of the points.

Find all of yMin, yMax, xMin, xMax, xDelta = xMax-xMin and yDelta = yMax-yMin for your set of points.
Set max = greatest(xDelta,yDelta) and scale = 1000 / max.
Foreach Point p set p.X = (p.X - xMin) * scale and p.Y = (p.Y - yMin) * scale
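The steps above can be sketched in Ruby as follows. Points are assumed to be [x, y] pairs, and the name fit_to_box and the handling of empty or degenerate input are choices made for this example.

```ruby
# Hypothetical helper: translate and uniformly scale points into a
# size x size box whose top-left corner is (0, 0).
def fit_to_box(points, size = 1000.0)
  return [] if points.empty?

  x_min, x_max = points.map(&:first).minmax
  y_min, y_max = points.map(&:last).minmax
  largest = [x_max - x_min, y_max - y_min].max
  # All points coincide: nothing to scale, so place them at the origin.
  return points.map { [0.0, 0.0] } if largest.zero?

  scale = size / largest.to_f # one scale for both axes keeps the shape
  points.map { |x, y| [(x - x_min) * scale, (y - y_min) * scale] }
end

p fit_to_box([[-50, 20], [150, 120], [50, -80]])
# → [[0.0, 500.0], [1000.0, 1000.0], [500.0, 0.0]]
```

Subtracting the minimum before scaling is what makes negative coordinates harmless, and using the larger of the two extents keeps every point inside the box.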

Choosing an attractive linear scale for a graph’s Y Axis - more

Further to: Choosing an attractive linear scale for a graph's Y Axis
And what to do when some of the points are negative?
I believe this part of the question was not answered, but it seems I can't comment on or extend that question, so I've created a new one.
Values -100, 0, 100 with 5 ticks:
lower bound = -100
upper bound = 100
range = 100 - (-100) = 200
tick range = 40
Divide by 10^2 for 0.4, translates to 0.4, which gives (multiplied by 10^2) 40.
new lower bound = 40 * round(-100/40) = -80
new upper bound = 40 * round(1+100/40) = 120
or
new lower bound = 40 * floor(-100/40) = -120
new upper bound = 40 * floor(1+100/40) = 120
Now the range has been increased to 240 (an extra tick!), with 5 ticks at 40 each.
It will take 6 steps to fill the new range!
Solution?

I use the following code. It produces nicely-spaced steps for human viewers and caters for ranges that pass through zero.
public static class AxisUtil
{
public static float CalculateStepSize(float range, float targetSteps)
{
// calculate an initial guess at step size
float tempStep = range/targetSteps;
// get the magnitude of the step size
float mag = (float)Math.Floor(Math.Log10(tempStep));
float magPow = (float)Math.Pow(10, mag);
// calculate most significant digit of the new step size
float magMsd = (int)(tempStep/magPow + 0.5);
// promote the MSD to either 1, 2, or 5
if (magMsd > 5.0)
magMsd = 10.0f;
else if (magMsd > 2.0)
magMsd = 5.0f;
else if (magMsd > 1.0)
magMsd = 2.0f;
return magMsd*magPow;
}
}
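To show how the step size can be combined with floor and ceil so that a range crossing zero gets clean bounds, here is a Ruby sketch of the same 1, 2, 5 promotion. The nice_ticks helper and the bound calculation are additions for illustration and are not part of the C# code above; the target step count is treated as a hint, not a guarantee.

```ruby
# Ruby port of the step-size calculation above (1, 2, 5 promotion).
def nice_step(range, target_steps)
  raise ArgumentError, "range and target_steps must be positive" unless range > 0 && target_steps > 0

  rough = range.to_f / target_steps
  magnitude = 10.0**Math.log10(rough).floor # power of ten at or below rough
  msd = (rough / magnitude + 0.5).to_i      # rounded most significant digit
  msd = if msd > 5 then 10
        elsif msd > 2 then 5
        elsif msd > 1 then 2
        else msd
        end
  msd * magnitude
end

# Hypothetical helper: snap the bounds outwards to multiples of the step.
def nice_ticks(min, max, target_steps)
  step = nice_step(max - min, target_steps)
  lower = (min / step).floor * step # floor works for negative values too
  upper = (max / step).ceil * step
  count = ((upper - lower) / step).round
  (0..count).map { |i| lower + i * step }
end

p nice_ticks(-100, 100, 5)
# → [-100.0, -50.0, 0.0, 50.0, 100.0]
```

For the example in the question this yields a step of 50 and four intervals instead of the requested five, which avoids the extra tick at the cost of not hitting the target count exactly.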

Smooth spectrum for Mandelbrot Set rendering

I'm currently writing a program to generate really enormous (65536x65536 pixels and above) Mandelbrot images, and I'd like to devise a spectrum and coloring scheme that does them justice. The Wikipedia featured Mandelbrot image seems like an excellent example, especially how the palette remains varied at all zoom levels of the sequence. I'm not sure if it's rotating the palette or doing some other trick to achieve this, though.
I'm familiar with the smooth coloring algorithm for the Mandelbrot set, so I can avoid banding, but I still need a way to assign colors to output values from this algorithm.
The images I'm generating are pyramidal (i.e., a series of images, each of which has half the dimensions of the previous one), so I can use a rotating palette of some sort, as long as the change in the palette between subsequent zoom levels isn't too obvious.

This is the smooth color algorithm:
Let's say you start with the complex number z0 and iterate n times until it escapes. Let the end point be zn.
A smooth value would be
nsmooth := n + 1 - Math.log(Math.log(zn.abs()))/Math.log(2)
This only works for the Mandelbrot set. If you want to compute a smooth function for Julia sets, then use
Complex z = new Complex(x,y);
double smoothcolor = Math.exp(-z.abs());
for(i=0;i<max_iter && z.abs() < 30;i++) {
z = f(z);
smoothcolor += Math.exp(-z.abs());
}
Then smoothcolor is in the interval (0,max_iter).
Divide smoothcolor with max_iter to get a value between 0 and 1.
To get a smooth color from the value:
This can be called, for example (in Java):
Color.HSBtoRGB((float) (0.95 + 10 * smoothcolor), 0.6f, 1.0f);
since the first value in HSB color parameters is used to define the color from the color circle.
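As a rough Ruby illustration of the Mandelbrot formula above, the sketch below uses Ruby's built-in Complex type. The function name, the iteration limit, and the bailout radius of 1000 are assumptions for this example; the formula is applied to the iteration count at which the point escaped.

```ruby
# Hypothetical helper: smooth (fractional) escape count for the point c,
# or nil if c has not escaped after max_iter iterations.
def smooth_escape(c, max_iter = 200, bailout = 1e3)
  z = Complex(0.0, 0.0)
  max_iter.times do |i|
    z = z * z + c
    next unless z.abs > bailout

    n = i + 1 # number of iterations performed so far
    # n + 1 - log(log|zn|) / log(2), as in the formula above
    return n + 1 - Math.log(Math.log(z.abs)) / Math.log(2)
  end
  nil
end

[Complex(0.0, 0.0), Complex(0.3, 0.5), Complex(0.5, 0.5), Complex(2.0, 2.0)].each do |c|
  value = smooth_escape(c)
  puts "#{c}: #{value.nil? ? 'did not escape' : value.round(3)}"
end
```

The returned value can then be fed to whatever palette or hue mapping is in use; points that return nil are usually drawn in a fixed interior color.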

Use the smooth coloring algorithm to calculate all of the values within the viewport, then map your palette from the lowest to highest value. Thus, as you zoom in and the higher values are no longer visible, the palette will scale down as well. With the same constants for n and B you will end up with a range of 0.0 to 1.0 for a fully zoomed out set, but at deeper zooms the dynamic range will shrink, to say 0.0 to 0.1 at 200% zoom, 0.0 to 0.0001 at 20000% zoom, etc.

Here is a typical inner loop for a naive Mandelbrot generator. To get a smooth colour you want to pass in the squared real and imaginary parts of z (zx2 and zy2) and the iteration you bailed out at. I've included the Mandelbrot code so you can see which vars to use to calculate the colour.
for (ix = 0; ix < panelMain.Width; ix++)
{
cx = cxMin + (double )ix * pixelWidth;
// init this go
zx = 0.0;
zy = 0.0;
zx2 = 0.0;
zy2 = 0.0;
for (i = 0; i < iterationMax && ((zx2 + zy2) < er2); i++)
{
zy = zx * zy * 2.0 + cy;
zx = zx2 - zy2 + cx;
zx2 = zx * zx;
zy2 = zy * zy;
}
if (i == iterationMax)
{
// interior, part of set, black
// set colour to black
g.FillRectangle(sbBlack, ix, iy, 1, 1);
}
else
{
// outside, set colour proportional to the time/distance it took to escape
// set colour not black
SolidBrush sbNeato = new SolidBrush(MapColor(i, zx2, zy2));
g.FillRectangle(sbNeato, ix, iy, 1, 1);
}
}
and MapColor below: (see this link to get the ColorFromHSV function)
private Color MapColor(int i, double r, double c)
{
double di=(double )i;
double zn;
double hue;
zn = Math.Sqrt(r + c);
hue = di + 1.0 - Math.Log(Math.Log(Math.Abs(zn))) / Math.Log(2.0); // 2 is escape radius
hue = 0.95 + 20.0 * hue; // adjust to make it prettier
// the hsv function expects values from 0 to 360
while (hue > 360.0)
hue -= 360.0;
while (hue < 0.0)
hue += 360.0;
return ColorFromHSV(hue, 0.8, 1.0);
}
MapColor "smooths" the bailout values so that they can be used to map a colour without horrible banding. Playing with MapColor and/or the HSV function lets you alter which colours are used.

Seems simple to do by trial and error. Assume you can define HSV1 and HSV2 (hue, saturation, value) of the endpoint colors you wish to use (black and white; blue and yellow; dark red and light green; etc.), and assume you have an algorithm to assign a value P between 0.0 and 1.0 to each of your pixels. Then that pixel's color becomes
(H2 - H1) * P + H1 = HP
(S2 - S1) * P + S1 = SP
(V2 - V1) * P + V1 = VP
With that done, just observe the results and see how you like them. If the algorithm to assign P is continuous, then the gradient should be smooth as well.
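The three formulas above amount to a component-wise linear blend, which might look like this in Ruby. The function name and the sample endpoint colors are arbitrary; hue is taken in degrees here, which is an assumption.

```ruby
# Blend two [hue, saturation, value] triples; fraction is P in the text above.
def lerp_hsv(hsv1, hsv2, fraction)
  hsv1.zip(hsv2).map { |a, b| (b - a) * fraction + a }
end

blue   = [240.0, 1.0, 0.5] # sample endpoint HSV1
yellow = [60.0, 0.5, 1.0]  # sample endpoint HSV2

p lerp_hsv(blue, yellow, 0.0) # → [240.0, 1.0, 0.5]
p lerp_hsv(blue, yellow, 0.5) # → [150.0, 0.75, 0.75]
p lerp_hsv(blue, yellow, 1.0) # → [60.0, 0.5, 1.0]
```

Note that blending hue linearly takes the path through the intermediate hues (green, in this sample), which may or may not be the gradient you want.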

My eventual solution was to create a nice looking (and fairly large) palette and store it as a constant array in the source, then interpolate between indexes in it using the smooth coloring algorithm. The palette wraps (and is designed to be continuous), but this doesn't appear to matter much.
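A minimal sketch of that idea in Ruby is shown below. The five RGB entries are placeholder values, not the palette actually used, and palette_color is a made-up name; the input is assumed to be a smooth value already scaled so that one unit corresponds to one palette entry.

```ruby
# Placeholder palette of [r, g, b] entries (sample values only).
PALETTE = [[0, 7, 100], [32, 107, 203], [237, 255, 255], [255, 170, 0], [0, 2, 0]].freeze

# Interpolate between neighbouring palette entries, wrapping at the end.
def palette_color(palette, value)
  position = value % palette.size       # wrap the value around the palette
  index = position.floor % palette.size # guard against position == size after rounding
  fraction = position - position.floor
  from = palette[index]
  to = palette[(index + 1) % palette.size]
  from.zip(to).map { |a, b| (a + (b - a) * fraction).round }
end

p palette_color(PALETTE, 0.0)  # → [0, 7, 100]
p palette_color(PALETTE, 5.25) # → [8, 32, 126]
```

Because the last entry blends back into the first, the wrap is only invisible if the palette is designed to be continuous, as described above.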

What's going on with the color mapping in that image is that it's using a 'log transfer function' on the index (according to the documentation). Exactly how it does that, I still haven't figured out. The program that produced it uses a palette of 400 colors, so the index ranges over [0, 400), wrapping around if needed. I've managed to get pretty close to matching its behavior. I use an index range of [0,1) and map it like so:
double value = Math.log(0.021 * (iteration + delta + 60)) + 0.72;
value = value - Math.floor(value);
It's kind of odd that I have to use these special constants in there to get my results to match, since I doubt they do any of that. But whatever works in the end, right?

Here is a JavaScript version.
Usage:
var rgbcol = MapColor(Iteration, Zy2, Zx2);
point(ctx, iX, iY, rgbcol[0], rgbcol[1], rgbcol[2]);
Functions:
/*
* The Mandelbrot Set, in HTML5 canvas and javascript.
* https://github.com/cslarsen/mandelbrot-js
*
* Copyright (C) 2012 Christian Stigen Larsen
*/
/*
* Convert hue-saturation-value/luminosity to RGB.
*
* Input ranges:
* H = [0, 360] (integer degrees)
* S = [0.0, 1.0] (float)
* V = [0.0, 1.0] (float)
*/
function hsv_to_rgb(h, s, v)
{
if ( v > 1.0 ) v = 1.0;
var hp = h/60.0;
var c = v * s;
var x = c*(1 - Math.abs((hp % 2) - 1));
var rgb = [0,0,0];
if ( 0<=hp && hp<1 ) rgb = [c, x, 0];
if ( 1<=hp && hp<2 ) rgb = [x, c, 0];
if ( 2<=hp && hp<3 ) rgb = [0, c, x];
if ( 3<=hp && hp<4 ) rgb = [0, x, c];
if ( 4<=hp && hp<5 ) rgb = [x, 0, c];
if ( 5<=hp && hp<6 ) rgb = [c, 0, x];
var m = v - c;
rgb[0] += m;
rgb[1] += m;
rgb[2] += m;
rgb[0] *= 255;
rgb[1] *= 255;
rgb[2] *= 255;
rgb[0] = parseInt ( rgb[0] );
rgb[1] = parseInt ( rgb[1] );
rgb[2] = parseInt ( rgb[2] );
return rgb;
}
// http://stackoverflow.com/questions/369438/smooth-spectrum-for-mandelbrot-set-rendering
// alex russel : http://stackoverflow.com/users/2146829/alex-russell
function MapColor(i,r,c)
{
var di= i;
var zn;
var hue;
zn = Math.sqrt(r + c);
hue = di + 1.0 - Math.log(Math.log(Math.abs(zn))) / Math.log(2.0); // 2 is escape radius
hue = 0.95 + 20.0 * hue; // adjust to make it prettier
// the hsv function expects values from 0 to 360
while (hue > 360.0)
hue -= 360.0;
while (hue < 0.0)
hue += 360.0;
return hsv_to_rgb(hue, 0.8, 1.0);
}
