When using D3.js, is there a method to automatically get the closest `key` that has data? - d3.js

When we use D3.js, let's say for simplicity, we have data, which are stock prices:
2021-03-18 $38.10
2021-03-19 $38.60
2021-03-22 $38.80
and we use D3 to plot a line chart for the stock price, and then move the mouse around to "hover" above the prices, and it'd show the price for that date.
Right now I am using
d3.select("svg").on("mousemove", (ev) => {
const hour = xScale.invert(ev.offsetX - dimensions.margin.left).getHours();
to get the hour at the position where the user is hovering. The xScale is a scaleTime() scale function from domain to range, and xScale.invert is the function that converts a range value back to the domain.
If the hour is 12pm or later, I consider it the next day, and if it is earlier, I consider it the same day. This is because the stock price of 2021-03-19 is considered to be at 12:00am (midnight), so if I get 9pm, for example, the mouse cursor is really close to the next day.
And then, let's say I identified that it is 2021-03-20, then I check whether there is stock price data. But since it was a Saturday and has no stock data, I use a function to check by the following method:
I first would go back to 2021-03-19 and see if there is a stock price. (I first build a lookup table to map date to data). If there is, then use it.
But if there isn't, I use a delta of 1 day and move further and further out, so it would go to 2021-03-21, then increment the delta and use -delta to check 2021-03-18. In short, I start from a point and go "later" and "earlier" with an increasing delta until I am able to find a price.
In other words, I have a "first candidate" and a "second candidate". If the first candidate has data, I use it. Otherwise, I try the second candidate. If that still does not work, I start from the first candidate and move "later" or "earlier" with a delta of 1 day, then 2 days, then 3 days, until I find a date with data.
Then I show this price on screen, reporting the date and the price.
But this method is a bit low level. Does D3.js already have a method that does this directly: given an inverted value, return the closest key that has data in the dataset?

There are several functions provided by d3.js which can be used, depending on the exact situation:
1. You operate in screen space and want a mapping of the current mouse position on the closest point of the visualization which represents a single data object
In that case, you would probably want to use d3-delaunay.
d3-delaunay is a fast library for computing the Voronoi diagram of a set of two-dimensional points.
One can use delaunay.find to identify the data point closest to the pointer. Here is one example.
2. If you operate in the data domain (e.g. because you have already inverted the mouse position to the data domain)
As @Gerardo Furtado points out, you can use d3.bisect.
d3.bisect finds the position into which a given value can be inserted into a sorted array while maintaining sorted order. If the value already exists in the array, d3.bisect will find its position efficiently. Here is one example.
See also: D3: What is a Bisector? and d3.bisector using Date() Object does not resolve
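To make the bisect idea concrete, here is a dependency-free sketch of a "closest value" lookup built on a binary search. The helper name closestIndex is made up for illustration and is not part of D3. I believe newer d3-array releases also ship a "center" bisect variant that does this for you, but check the documentation for your version before relying on it.

```javascript
// Sketch: return the index of the element in a sorted array that is closest to x.
// `closestIndex` is a hypothetical helper, not a D3 function.
function closestIndex(sorted, x) {
  if (sorted.length === 0) return -1;
  // Binary search for the first index whose value is >= x (a "bisect left").
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < x) lo = mid + 1;
    else hi = mid;
  }
  if (lo === 0) return 0; // x is before the first element
  if (lo === sorted.length) return sorted.length - 1; // x is after the last element
  // x lies between sorted[lo - 1] and sorted[lo]; pick the nearer one (ties go left).
  return x - sorted[lo - 1] <= sorted[lo] - x ? lo - 1 : lo;
}

// The dates from the question, parsed as UTC midnights.
const dates = ["2021-03-18", "2021-03-19", "2021-03-22"].map((s) => new Date(s));
// Stand-in for xScale.invert(...): a Saturday morning with no stock data.
const hovered = new Date("2021-03-20T09:00:00Z");
const nearest = dates[closestIndex(dates, hovered)];
console.log(nearest.toISOString().slice(0, 10)); // 2021-03-19
```

In the mousemove handler, hovered would be the Date returned by xScale.invert, and the resulting index can be used directly to look up the price in the (sorted) data array, so no hour check or lookup table is needed.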
Another option d3.js provides is d3.scaleThreshold.
Threshold scales allow you to map arbitrary subsets of the domain to discrete values in the range. The input domain is still continuous, and divided into slices based on a set of threshold values.
The idea is the following:
You create a d3.scaleThreshold to map any date (= continuous domain) to the fixed set of valid dates given your data by mapping it to the closest date. For that you have to specify the domain as an array of n - 1 dates which are residing in between the n valid dates. The range is the array of the valid dates.
It might not be as efficient as d3.bisect depending on your data.
const data_original = [{ date: "2021-03-18", value: "38.10"},
{ date: "2021-03-19", value: "38.60"},
{ date: "2021-03-22", value: "38.80"},
];
const data_types_converted = data_original.map(d => ({"date": new Date(d.date), "value": +d.value}));
const data_just_dates = data_types_converted.map(d => d.date);
let newDate = new Date("2021-03-17");
console.log(newDate + " -> " + getClosestDate(newDate, data_just_dates));
newDate = new Date("2021-03-18");
console.log(newDate + " -> " + getClosestDate(newDate, data_just_dates));
newDate = new Date("2021-03-19");
console.log(newDate + " -> " + getClosestDate(newDate, data_just_dates));
newDate = new Date("2021-03-20");
console.log(newDate + " -> " + getClosestDate(newDate, data_just_dates));
newDate = new Date("2021-03-21");
console.log(newDate + " -> " + getClosestDate(newDate, data_just_dates));
newDate = new Date("2021-03-22");
console.log(newDate + " -> " + getClosestDate(newDate, data_just_dates));
newDate = new Date("2021-03-23");
console.log(newDate + " -> " + getClosestDate(newDate, data_just_dates));
function getClosestDate(newDate, validDates) {
const domain = [];
let midday_local;
let midday_UTC;
validDates.forEach((d,i) => {
if (i < validDates.length - 1) {
midday_local = new Date((validDates[i].getTime() + validDates[i + 1].getTime()) / 2); // midday in local time
midday_UTC = convertDateToUTC(midday_local); // midday in UTC time
domain.push(midday_UTC);
}
});
const scale = d3.scaleThreshold()
.domain(domain)
.range(validDates);
return scale(newDate)
}
function convertDateToUTC(date) {
return new Date(
date.getUTCFullYear(),
date.getUTCMonth(),
date.getUTCDate(),
date.getUTCHours(),
date.getUTCMinutes(),
date.getUTCSeconds()
);
}
<script src="https://d3js.org/d3.v6.min.js"></script>

Related

Automatically fill the current date in a cell if the data is not empty in Google sheet

I have 9 columns of data. When column 9 contains data, the first column should automatically be filled with the current date.
I use the formula =ArrayFormula(IF(ISTEXT(D7), TODAY(), " ")), but the problem is that TODAY() is recalculated, so on the next day the cell changes to the next day's date. I do not want the date to change once it has been filled in. What should I do?
I am not sure if it is possible using a simple google sheet function.
If you want to use a google script to solve this then you can apply below:
function onEdit() {
  dateStamp();
}
function dateStamp() {
  const ss = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Data'); // change sheet name per your requirement
  const date = new Date();
  const lr = ss.getDataRange().getLastRow();
  const dataRange = ss.getRange(2, 9, lr - 1).getValues(); // change range per your requirement
  const dateRange = ss.getRange(2, 2, lr - 1).getValues(); // change range per your requirement
  for (let i = 0; i < dataRange.length; i++) {
    // getValues() returns a 2D array, so read the single cell of each row
    if (dataRange[i][0] !== '' && dateRange[i][0] === '') {
      // stamp only rows that have data but no date yet, so existing dates never change
      ss.getRange(i + 2, 2).setValue(date);
    }
  }
}

Sort a range or array based on two columns that contain the date and time

Currently I'm trying to create a Google Apps Script for Google Sheets which will allow adding weekly recurring events, batchwise, for upcoming events. My colleagues will then make minor changes to these added events (e.g. make date and time corrections, change the contact person, add materials necessary for the event and so forth).
So far, I have written the following script:
function CopyWeeklyEventRows() {
var ss = SpreadsheetApp.getActiveSheet();
var repeatingWeeks = ss.getRange(5,1).getValue(); // gets how many weeks it should repeat
var startDate = ss.getRange(6, 1).getValue(); // gets the start date
var startWeekday = startDate.getDay(); // gives the weekday of the start date
var regWeek = ss.getRange(9, 2, 4, 7).getValues(); // gets the regular week data
var regWeekdays = new Array(regWeek.length); // creates an array to store the weekdays of the regWeek
var ArrayStartDate = new Array(startDate); // helps to store the We
for (var i = 0; i < regWeek.length; i++){ // calculates the difference between startWeekday and each regWeekdays
regWeekdays[i] = regWeek[i][1].getDay() - startWeekday;
Logger.log(regWeekdays[i]);
// Add 7 to move to the next week and avoid negative values
if (regWeekdays[i] < 0) {
regWeekdays[i] = regWeekdays[i] + 7;
}
// Add days according to difference between startWeekday and each regWeekdays
regWeek[i][0] = new Date(ArrayStartDate[0].getTime() + regWeekdays[i]*3600000*24);
}
// I'm struggling with this line. The array regWeek is not sorted:
//regWeek.sort([{ column: 1, ascending: true }]);
ss.getRange(ss.getLastRow() + 1, 2, 4, 7).setValues(regWeek); // copies weekly events after the last row
}
It allows adding one week of recurring events to the overview section of the spreadsheet based on a start date. If the start date is a Tuesday, the regular week is added starting from a Tuesday. However, the rows are not sorted according to the dates.
How can the rows be sorted by ascending date (followed by time) before adding them to the overview?
My search for similar questions revealed Google Script sort 2D Array by any column which is the closest hit I've found. The same error message is shown when running my script with the sort line. I don't understand the difference between Range and array yet which might help to solve the issue.
To give you a broader picture, here's what I'm currently working on:
I've noticed that the format will not necessarily remain when adding new recurring events. So far I haven't found the rule and formatted by hand in a second step.
A drawback is currently that the weekly recurring events section is fixed. I've tried to find the last filled entry and use it to set the range of regWeek, but got stuck.
Use the column A to exclude recurring events from the addition process using a dropdown.
Allow my colleagues to add an event to the recurring events using a dropdown (e.g. A26). This event should then be added with sorting to the right day of the week and start time. The sorting will come in handy.
Thanks in advance for your input regarding the sorting as well as suggestions on how to improve the code in general.
A demo version of the spreadsheet
UpdateV01:
Here are the code lines which copy and sort (first by date, then by time):
ss.getRange(ss.getLastRow()+1,2,4,7).setValues(regWeek); // copies weekly events after the last row
ss.getRange(ss.getLastRow()-3,2,4,7).sort([{column: 2, ascending: true}, {column: 4, ascending: true}]); // sorts only the copied weekly events chronologically
As @tehhowch pointed out, this is slow. Better to sort BEFORE writing.
I will implement this method and post it here.
UpdateV02:
regWeek.sort(function (r1, r2) {
// sorts ascending on the third column, which is index 2
return r1[2] - r2[2];
});
regWeek.sort(function (r1, r2) {
// r1 and r2 are elements in the regWeek array, i.e.
// they are each a row array if regWeek is an array of arrays:
// Sort ascending on the first column, which is index 0:
// if r1[0] = 1, r2[0] = 2, then 1 - 2 is -1, so r1 sorts before r2
return r1[0] - r2[0];
});
UpdateV03:
Here is an attempt to repeat the recurring events over several weeks. I don't know yet how to include the push for the whole "week".
// Repeat week for "A5" times and add to start/end date
for (var j = 0; j < repeatingWeeks; j++){
for (var i = 0; i < numFilledRows; i++){
regWeekRepeated[i+j*6][0] = new Date(regWeek[i][0].getTime() + j*7*3600000*24); // <-This line leads to an error message
regWeekRepeated[i+j*6][3] = new Date(regWeek[i][3].getTime() + j*7*3600000*24);
}
}
My question was answered and I was able to make the code work as intended.
Given your comment - you want to sort the written chunk - you have two methods available. One is to sort written data after writing, by using the Spreadsheet service's Range#sort(sortObject) method. The other is to sort the data before writing, using the JavaScript Array#sort(sortFunction()) method.
Currently, your sort code //regWeek.sort([{ column: 1, ascending: true }]); is attempting to sort a JavaScript array using the sorting object expected by the Spreadsheet service, which is why it fails. To sort after writing, you can simply chain this .sort(...) call to your write call, as Range#setValues() returns the same Range, allowing repeated Range method calling (e.g. to set values, then apply formatting, etc.).
This looks like:
ss.getRange(ss.getLastRow() + 1, 2, regWeek.length, regWeek[0].length)
.setValues(regWeek)
/* other "chainable" Range methods you want to apply to
the cells you just wrote to. */
.sort([{column: 1, ascending: true}, ...]);
Here I have updated the range you access to reference the data you are attempting to write - regWeek - so that it is always the correct size to hold the data. I've also visually broken apart the one-liner so you can better see the "chaining" that is happening between Spreadsheet service calls.
The other method - sorting before writing - will be faster, especially as the size and complexity of the sort increases. The idea behind sorting an array is that you need to use a function that returns a negative value when the first element should come before the second, a positive value when the first element should come after the second, and zero if they are equivalent. This means a function that returns a boolean is NOT going to sort as one thinks, since false and 0 are equivalent in JavaScript, while true and 1 are also equivalent.
Your sort looks like this, assuming regWeek is an array of arrays and you are sorting on numeric values (or at least values which will cast to numbers, like Dates).
regWeek.sort(function (r1, r2) {
// r1 and r2 are elements in the regWeek array, i.e.
// they are each a row array if regWeek is an array of arrays:
// Sort ascending on the first column, which is index 0:
// if r1[0] = 1, r2[0] = 2, then 1 - 2 is -1, so r1 sorts before r2
return r1[0] - r2[0];
});
I strongly recommend reviewing the Array#sort documentation.
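Since the question asks for sorting by date and then by time, the comparator contract described above can be extended to two columns in a single pass. This is only a sketch: the column indices (0 for the date, 2 for the start time), the helper name byDateThenTime and the sample rows are assumptions for illustration, not taken from your sheet.

```javascript
// Sketch: compare rows by date (index 0) first, then by start time (index 2).
// The indices and the sample data below are assumptions for illustration.
function byDateThenTime(r1, r2) {
  // Subtracting two Dates compares their millisecond timestamps.
  const dateDiff = r1[0] - r2[0];
  if (dateDiff !== 0) return dateDiff;
  // Same date: fall back to the time column.
  return r1[2] - r2[2];
}

const rows = [
  [new Date(2021, 2, 23), "Workshop", new Date(1899, 11, 30, 14, 0)],
  [new Date(2021, 2, 22), "Standup", new Date(1899, 11, 30, 9, 0)],
  [new Date(2021, 2, 23), "Briefing", new Date(1899, 11, 30, 8, 30)],
];
rows.sort(byDateThenTime);
console.log(rows.map((r) => r[1])); // [ 'Standup', 'Briefing', 'Workshop' ]
```

Using one comparator with a tie-breaker avoids calling sort twice and relying on the second sort preserving the order produced by the first.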
You could sort the "Weekly Events" range before you set the regWeek variable. Then the range would be in the order you want before you process it. Or you could sort the whole "Overview" range after setting the data. Here's a quick function you can call to sort the range by multiple columns. You can of course tweak it to sort the "Weekly Events" range instead of the "Overview" range.
function sortRng() {
var ss = SpreadsheetApp.getActiveSheet();
var firstRow = 22; var firstCol = 1;
var numRows = ss.getLastRow() - firstRow + 1;
var numCols = ss.getLastColumn();
var overviewRng = ss.getRange(firstRow, firstCol, numRows, numCols);
Logger.log(overviewRng.getA1Notation());
overviewRng.sort([{column: 2, ascending: true}, {column: 4, ascending: true}]);
}
As for getting the number of filled rows in the Weekly Events section, you need to search a column that will always have data if any row has data (like the start date column B), loop through the values and, the first time it finds a blank, return that number. That will give you the number of rows that it needs to copy. Warning: if you don't have at least one blank value in column B between the Weekly Events section and the Overview section, you will probably get unwanted results.
function getNumFilledRows() {
var ss = SpreadsheetApp.getActiveSheet();
var eventFirstRow = 9; var numFilledRows = 0;
var colToCheck = 'B';//the StartDate col which should always have data if the row is filled
var vals = ss.getRange(colToCheck + eventFirstRow + ":" + colToCheck).getValues();
for (var i = 0; i < vals.length; i++) {
if (vals[i][0] == '') {
numFilledRows = i;
break;
}
}
Logger.log(numFilledRows);
return numFilledRows;
}
EDIT:
If you just want to sort the array in JavaScript before writing, and you want to sort by Start Date first, then by Time of day, you could make a temporary array and add a column to each row that is date and time combined. Without a compare function, array.sort() compares elements as strings, so dates would be sorted alphabetically; you therefore need to convert the date to an integer. Then you could sort the array by the new column and delete the new column from each row. I included a function that does this below. It could be a lot more compact, but I thought it might be more legible like this.
function sortDates() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var vals = ss.getActiveSheet().getRange('B22:H34').getDisplayValues(); //get display values because getValues returns time as weird date 1899 and wrong time.
var theDate = new Date(); var newArray = []; var theHour = ''; var theMinutes = '';
var theTime = '';
//Create a new array that inserts date and time as the first column in each row
vals.forEach(function(aRow) {
theTime = aRow[2];//hardcoded - assumes time is the third column that you grabbed
//get the hours (before colon) as a number
theHour = Number(theTime.substring(0,theTime.indexOf(':')));
//get the minutes(after colon) as a number
theMinutes = Number(theTime.substring(theTime.indexOf(':')+1));
theDate = new Date(aRow[0]);//hardcoded - assumes date is the first column you grabbed.
theDate.setHours(theHour);
theDate.setMinutes(theMinutes);
aRow.unshift(theDate.getTime()); //Add the date and time as integer to the first item in the aRow array for sorting purposes.
newArray.push(aRow);
});
//Sort the newArray based on the first item of each row (date and time as number)
newArray.sort((function(index){
return function(a, b){
return (a[index] === b[index] ? 0 : (a[index] < b[index] ? -1 : 1));
};})(0));
//Remove the first column of each row (date and time combined) that we added in the first step
newArray.forEach(function(aRow) {
aRow.shift();
});
Logger.log(newArray);
}

Shopify Time difference

I am a newbie to Shopify. I have to display a countdown to 9 PM every day. The logic is to subtract the current time from the specified time and display the difference as a counter (remaining time).
I am able to retrieve the current timestamp in Shopify using the code below:
{% assign timestamp = 'now' | date %}
Now I have a date "21-03-2016 21:00:00" and want to convert it to a timestamp, but I have not been able to find a solution.
Let me know if anyone can help me with this. Thank you.
Note that the value will be the current time of when the page was last
generated from the template, not when the page is presented to a user
if caching or static site generation is involved.
http://shopify.github.io/liquid/filters/date/
This is probably best done using Javascript, not pure-Liquid. You generally want your pages to be as cache-able as possible for Shopify's servers to keep loading times as lean as possible.
You will want to use Liquid to compensate for timezones, however. You can get the store's configured timezone and drop that into a Javascript variable:
var timezone = {{ 1 | date: '%z' | json }};
The above liquid statement takes some arbitrary input, runs it through the date filter and outputs only the timezone component configured for that store (using '%z'), then runs that through the json filter to ensure that whatever the output will always be javascript-legal.
Now that you know your timezone offset, you can assemble your timer logic in Javascript using all your normal date/time tricks. For example:
var now = new Date();
var nowStr = now.toISOString(); // Output format: '2019-01-01T00:00:00.000Z' (always in UTC)
// Chop off the time portion, then append your desired time & store timezone
var closingTimeStr = nowStr.split('T')[0] + 'T21:00' + timezone;
// We make a new date using today's date/time string
var closingTime = new Date(closingTimeStr);
// And now we can do math based on the difference
var difference_ms = closingTime - now;
if(difference_ms > 0){
// Parse the difference into hours, minutes and seconds if a positive amount of time remains. Writing a better parsing function is left as an exercise for the reader.
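// Math.floor is used instead of parseInt, which converts the number to a string first and misreads very small quotients such as 2.7e-7
var hours = Math.floor(difference_ms / (60 * 60 * 1000));
difference_ms %= (60 * 60 * 1000);
var minutes = Math.floor(difference_ms / (60 * 1000));
difference_ms %= (60 * 1000);
var seconds = Math.floor(difference_ms / 1000);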
console.log('Hurry down! Closing in ' + hours + ' hours, ' + minutes + ' minutes, and ' + seconds + ' seconds!');
} else {
// Show appropriate message if we're out-of-bounds
console.log('Closed for today - try again tomorrow!');
}
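As a possible take on the "better parsing function" mentioned in the comments above, here is a small sketch that splits a millisecond difference into parts and formats it as HH:MM:SS. The helper names splitDuration, pad2 and formatCountdown are made up for this example.

```javascript
// Hypothetical helper: split a non-negative millisecond count into h/m/s parts.
function splitDuration(ms) {
  const totalSeconds = Math.floor(ms / 1000);
  return {
    hours: Math.floor(totalSeconds / 3600),
    minutes: Math.floor((totalSeconds % 3600) / 60),
    seconds: totalSeconds % 60,
  };
}

// Hypothetical helper: zero-pad a number to two digits.
function pad2(n) {
  return String(n).padStart(2, "0");
}

// Format the remaining time as HH:MM:SS; negative input is clamped to zero.
function formatCountdown(ms) {
  const parts = splitDuration(Math.max(0, ms));
  return pad2(parts.hours) + ":" + pad2(parts.minutes) + ":" + pad2(parts.seconds);
}

console.log(formatCountdown(3 * 3600000 + 5 * 60000 + 9000)); // 03:05:09
```

To turn this into a live counter you could recompute closingTime - new Date() inside a setInterval callback once per second and write the formatted string into the page.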

How can I sum binned time series using d3.js?

I want a simple graph like:
The data I have is a simple list of transactions with two properties:
timestamp
amount
I tried d3.layout.histogram().bins() but it seems it only supports counting the transactions.
I mustn't be the only one looking for that, am I?
Ok, so the IRC folks helped me out and pointed to nest, which works great (this is CoffeeScript):
nested_data = d3.nest()
.key((d) -> d3.time.day(d.timestamp))
.rollup((a) -> d3.sum(a, (d) -> d.amount))
.entries(incoming_data) # An array of {timestamp: ..., amount: ...} objects
# Optional
nested_data.map (d) ->
d.date = new Date(d.key)
The trick here is d3.time.day, which takes a timestamp and tells you which day (starting at 12 a.m.) that timestamp belongs to. This function and the other ones like d3.time.week, etc. can bin time series very well.
The other trick is the nest().rollup() function, which, after the data has been grouped by key(), sums all of the events on a given day.
The last thing I wanted was to interpolate empty values on the days where I had no transactions. This is the last part of the code:
# Interpolate empty vals
nested_data.sort((a, b) -> d3.descending(a.date, b.date))
ex = d3.extent(nested_data, (d) -> d.date)
each_day = d3.time.days(ex[0], ex[1])
# Build a hashmap with the days we have
data_hash = {}
angular.forEach(data, (d) ->
data_hash[d.date] = d.values
)
# Build a new array for each day, including those where we didn't have transactions
new_data = []
angular.forEach(each_day, (d) ->
val = 0
if data_hash[d]
val = data_hash[d]
new_data.push({date: d, values: val})
)
final_data = new_data
Hope this helps somebody!
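For anyone not using CoffeeScript or Angular, the zero-filling step can be sketched in plain JavaScript as below. It assumes entries shaped like {date, values} with each date at local midnight, as in the code above; the helper name fillMissingDays and the sample data are made up. Unlike the code above, this sketch includes both the first and the last day.

```javascript
// Sketch: produce one entry per day between the first and last date,
// using 0 for days without transactions. `fillMissingDays` is a hypothetical helper.
function fillMissingDays(entries) {
  if (entries.length === 0) return [];
  // Index the known totals by timestamp for quick lookup.
  const byDay = new Map(entries.map((e) => [e.date.getTime(), e.values]));
  const times = entries.map((e) => e.date.getTime());
  const start = new Date(Math.min(...times));
  const end = new Date(Math.max(...times));
  const filled = [];
  // Step one calendar day at a time in local time.
  for (let d = new Date(start); d <= end; d.setDate(d.getDate() + 1)) {
    const key = d.getTime();
    filled.push({ date: new Date(key), values: byDay.has(key) ? byDay.get(key) : 0 });
  }
  return filled;
}

// Made-up sample: totals for two days with a three-day gap in between.
const sparse = [
  { date: new Date(2021, 2, 18), values: 10 },
  { date: new Date(2021, 2, 22), values: 7 },
];
const dense = fillMissingDays(sparse);
console.log(dense.map((e) => e.values)); // [ 10, 0, 0, 0, 7 ]
```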
The histogram code doesn't support this, but you can easily do the binning yourself. Assuming that you have a date and an amount for each transaction, you can bin by day like this.
var bins = {};
transactions.forEach(function(t) {
var key = t.date.toDateString();
bins[key] = bins[key] || 0;
bins[key] += t.amount;
});
You can obviously parse the date string back into a date if you need it; the point of using .toDateString() here is that the time part is chopped off and everything binned by day. If you want to bin by another time interval, you can use the same technique and extract a different part of the date.
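As an illustration of binning by a different interval, here is a sketch that sums amounts per calendar month using a "YYYY-MM" key built from the local year and month. The function name sumByMonth and the sample transactions are made up for the example.

```javascript
// Sketch: sum transaction amounts per calendar month (local time).
// `sumByMonth` and the sample data are hypothetical.
function sumByMonth(transactions) {
  const bins = {};
  for (const t of transactions) {
    // Key like "2021-03": year plus zero-padded month; the day and time are dropped.
    const month = String(t.date.getMonth() + 1).padStart(2, "0");
    const key = t.date.getFullYear() + "-" + month;
    bins[key] = (bins[key] || 0) + t.amount;
  }
  return bins;
}

const transactions = [
  { date: new Date(2021, 2, 18, 9, 30), amount: 10 },
  { date: new Date(2021, 2, 19, 16, 0), amount: 20 },
  { date: new Date(2021, 3, 2, 11, 15), amount: 5 },
];
const monthly = sumByMonth(transactions);
console.log(monthly); // { '2021-03': 30, '2021-04': 5 }
```

The only thing that changes between intervals is how the key is derived from the date, so the same loop works for weeks or hours as long as the key identifies the bin uniquely.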

Stata: how to get observation value 5 minutes ahead with gapped time data

I got high frequency data from a limit order book in Stata. Time does not have a regular interval, and some observations are at the same time (in milliseconds). For each observation I need to get the midpoint 5 minutes later in a separate column. So for observation 1 the midpoint would be 10.49, because the last midpoint closest to 09:05:02.579 would be 10.49.
How to do this in Stata?
datetime midpoint
12/02/2012 09:00:02.579 10.5125
12/02/2012 09:00:03.471 10.5125
12/02/2012 09:00:03.471 10.5125
12/02/2012 09:00:03.471 10.51
12/02/2012 09:00:03.471 10.51
12/02/2012 09:00:03.549 10.505
12/02/2012 09:00:03.549 10.5075
......
12/02/2012 09:04:59.785 10.495
12/02/2012 09:05:00.829 10.4925
12/02/2012 09:05:01.209 10.49
12/02/2012 09:05:03.057 10.4875
12/02/2012 09:05:05.055 10.485
.....
My approach would be
generate a new data set shifted by five minutes
append this shifted data set
find closest before and after observations to your five minute delta
use some criteria to pick the better of these two values
You specified closest, but you might want to add some other criteria depending on your book. Also, you mentioned more than one value at a given ms tick, but without more information I'm not sure how to handle that. Do you want to combine those midpoints first? Or are they different stocks?
Here's some code that implements the basics of the approach above.
clear
version 11.2
set seed 2001
* generate some data
set obs 100000
generate double dt = ///
tc(02dec2012 09:00:00.000) + 1000*_n + int(100*rnormal())
format dt %tcDDmonCCYY_HH:MM:SS.sss
sort dt
generate midpt = 100
replace midpt = ///
round(midpt[_n - 1] + 0.1*rnormal(), 0.005) if (_n != 1)
* add back future midpts
preserve
tempfile future
rename midpt fmidpt
rename dt fdt
generate double dt = fdt - tc(00:05:00.000)
save `future'
restore
append using `future'
* generate midpoints before and after 5 minutes in the future
sort dt
foreach v of varlist fdt fmidpt {
clonevar `v'_b = `v'
replace `v'_b = `v'_b[_n - 1] if missing(`v'_b)
}
gsort -dt
foreach v of varlist fdt fmidpt {
clonevar `v'_a = `v'
replace `v'_a = `v'_a[_n - 1] if missing(`v'_a)
}
format fdt* %tcDDmonCCYY_HH:MM:SS.sss
* use some algorithm to pick correct value
sort dt
generate choose_b = ///
((dt + tc(00:05:00.000)) - fdt_b) < (fdt_a - (dt + tc(00:05:00.000)))
generate fdt_c = cond(choose_b, fdt_b, fdt_a)
generate fmidpt_c = cond(choose_b, fmidpt_b, fmidpt_a)
format fdt_c %tcDDmonCCYY_HH:MM:SS.sss
// Construct a variable to look for in the dataset
gen double midpoint_5 = (datetime + 5*60000)
format midpoint_5 %tcNN/DD/CCYY_HH:MM:SS.sss
// will contain the closest observation number and midpoint 5 minutes ahead
gen _t = .
gen double midpoint_at5 = .
// How many observations in the sample?
local N = _N
// We will use these variables to skip some observations in the loop
egen obs_in_minute = count(minutes_filter), by(minutes_filter)
egen max_obs_in_minute = max(obs_in_minute)
set more off
// For each observation
forvalues i = 1/`N' {
// If it is a trade
if type[`i'] == "Trade" {
// Set the time to lookup in the data
local lookup = midpoint_5[`i']
// The time should be between the min and max(*5)
local min = `i' + obs_in_minute[`i'] // this might cause errors
local max = `i' + max_obs_in_minute[`i']*5
// For each of these observations
forvalues j = `min'/`max' {
// Check if the lookup date is smaller than the datetime of the observation
if `lookup' < datetime[`j'] {
// Set the observation ID at the lookup ID 1 observation before
quietly replace _t = `j'-1 in `i'
// Set the midpoint at the lookup ID 1 observation before
quietly replace midpoint_at5 = midpoint[`j'-1] in `i'
// We have found the closest 5th min ahead... now stop loop and continue to next observation.
continue, break
}
}
// This is to indicate where we are in the loop
display "`i'/`N'"
}
}

Resources