Why is there consistent variation in execution time on my timed trigger?

I have a timed trigger that runs every 15 minutes. A simplified partial version of the script is shown below. The script compiles data from about 50 other spreadsheets and records a row for each spreadsheet, then writes that summary data to the active spreadsheet.
I noticed that in the logs, there is an alternating pattern in the execution times for this script: half the executions take 200-400 seconds, and the other half typically take 700-900 seconds. It's a pretty significant difference, and the pattern persists over the past several days of logs.
There's nothing in the script itself that changes from one execution to the next, so I'm curious if anyone can suggest a reason this would happen (even better if it's a documented reason). For example, is there some sort of caching of the spreadsheet reads so that the next execution gets those values faster?
// The triggered function.
function updateRankings()
{
  var rankingSheet = SS.getSheetByName(RANKING_SHEET_NAME) // SS is the active spreadsheet
  // Read the IDs of the target spreadsheets, which are stored on an external spreadsheet
  var gyms = getRowsData(SpreadsheetApp.openById(ADMIN_PANEL_ID).getSheetByName(ADMIN_PANEL_SHEET_NAME))
  // Iterate over gyms
  gyms.forEach(getGymStats)
  // Write the compiled data back to the active sheet
  setRowsData(rankingSheet, gyms)
}
function getGymStats(gym)
{
  var gymSpreadsheet = SpreadsheetApp.openById(gym.spreadsheetId)
  // Force spreadsheet formulas to calculate before reading values
  SpreadsheetApp.flush()
  var metricsSheet = gymSpreadsheet.getSheetByName('Detailed Metrics')
  var statsColumn = metricsSheet.getRange('E:E').getValues()
  var roasColumn = metricsSheet.getRange('J:J').getValues()
  // Get stats
  var gymStats = {
    facebookAdSpend: getFacebookAdSpend(gymSpreadsheet),
    scheduling: statsColumn[8][0],
    showup: statsColumn[9][0],
    closing: statsColumn[10][0],
    costPerLead: statsColumn[25][0],
    costPerAppointment: statsColumn[26][0],
    costPerShow: statsColumn[27][0],
    costPerAcquisition: statsColumn[28][0],
    leadCount: statsColumn[13][0],
    frontEndRoas: (roasColumn[21][0] / statsColumn[5][0]) || 0,
    totalRoas: (roasColumn[35][0] / statsColumn[5][0]) || 0,
    totalProjectedRoas: (roasColumn[36][0] / statsColumn[5][0]) || 0,
    conversionRate: (gym.currency ?
      '=IFS(ISBLANK(INDIRECT("R[0]C[-4]", FALSE)),,ISBLANK(INDIRECT("R[0]C[-2]", FALSE)), 1,TRUE, IFERROR(GOOGLEFINANCE("Currency:"&INDIRECT("R[0]C[-2]", FALSE)&"USD")))' :
      1)
  }
  Object.assign(gym, gymStats)
}
function getFacebookAdSpend(spreadsheet)
{
  var range = spreadsheet.getRangeByName('FacebookAdSpend')
  if (!range) return ''
  return range.getValue()
}
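One way to narrow down where the extra time goes, without assuming any particular cause, is to record how long each spreadsheet takes and compare a fast run with a slow one. The sketch below is only an illustration: timeEach and slowest are hypothetical helpers, not part of Apps Script, and the clock is passed in as a parameter so the code can also be run outside Apps Script.

```javascript
// Hypothetical helper: calls fn(item) for every item and records how long
// each call took. `now` defaults to Date.now and can be replaced for testing.
function timeEach(items, fn, now) {
  now = now || Date.now;
  return items.map(function (item, index) {
    var start = now();
    fn(item);
    return { index: index, elapsedMs: now() - start };
  });
}

// Hypothetical helper: returns the n slowest timings, slowest first.
function slowest(timings, n) {
  return timings.slice().sort(function (a, b) {
    return b.elapsedMs - a.elapsedMs;
  }).slice(0, n);
}
```

In the script above this would mean replacing gyms.forEach(getGymStats) with var timings = timeEach(gyms, getGymStats) and logging slowest(timings, 5). If the same few spreadsheets dominate every slow run, that points at those files rather than at the trigger itself.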


Sending an email each time a condition is met from a Spreadsheet in Apps script

I'm trying to write some code in Apps Script that triggers an email each time a condition is fulfilled in a Spreadsheet.
This spreadsheet contains the age of different transgenic lines of fish (the ages of the fish are automatically updated in the spreadsheet) and each of these transgenic lines has an associated caretaker with an email address. My idea is to trigger an automatic email using Apps script that is sent to the assigned caretaker each time one of these transgenic lines becomes older than 2 years old. However, I haven't been able to make it work yet. I'm not really sure which part of my code is preventing it from working properly.
Below I attach an example of what the spreadsheet looks like, as well as an example of the code that I've been trying to use (I'm a beginner when it comes to coding, so it's possible that there are many basic errors in it):
function fishalert() {
  var subject = 'Fish aging alert';
  var years = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1").getRange("C2:C10").getValues();
  if (years > 2){
    for(r=2;r<20;r++){
      var name = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1").getRange(r,1).getValue();
      var emailaddress = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1").getRange(r,4).getValue();
      var message = 'Line ' + name + ' is more than 2 years old';
      MailApp.sendEmail(emailaddress, subject, message);
    }
  }
}
Sending Email when conditions are met by sampling once a day
function fishalert(e) {
  if (e['day-of-week'] < 6) { // sends emails Mon through Fri between 9 and 10 in the morning
    const ss = SpreadsheetApp.getActive();
    const sh = ss.getSheetByName('Sheet1');
    const vs = sh.getRange(2, 1, sh.getLastRow() - 1, 4).getValues();
    vs.forEach(r => {
      if (r[2] > 2) {
        MailApp.sendEmail(r[3], 'Fish aging alert', `Line ${r[0]} is more than 2 years old`);
      }
    });
  }
}
Run this once:
function createTimeBasedTrigger() {
  if (ScriptApp.getProjectTriggers().filter(t => t.getHandlerFunction() == 'fishalert').length == 0) {
    ScriptApp.newTrigger('fishalert').timeBased().everyDays(1).atHour(9).create();
  }
}
Time Driven Triggers
Time Driven Trigger Event Object
Class Range getValues() Method
For future enhancements:
You will probably want to send these emails on a less frequent schedule, ideally only once when a line first crosses the threshold, and you'll probably want to collect everything for the same email address into one email. But this is an answer to your current question.
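The "one email per unique address" idea can be sketched as a pure function. This is an illustration only: buildAlerts is a hypothetical name, and it assumes the same column layout as the answer's code (name in r[0], age in years in r[2], email address in r[3]).

```javascript
// Hypothetical helper: groups the names of all lines older than
// thresholdYears by caretaker address, producing one message per address.
// Assumed row layout: [name, <unused>, ageInYears, email].
function buildAlerts(rows, thresholdYears) {
  var byEmail = {};
  rows.forEach(function (r) {
    var name = r[0], age = r[2], email = r[3];
    if (!email || !(age > thresholdYears)) return; // skip blanks and young lines
    (byEmail[email] = byEmail[email] || []).push(name);
  });
  return Object.keys(byEmail).map(function (email) {
    return {
      to: email,
      body: 'Lines older than ' + thresholdYears + ' years: ' + byEmail[email].join(', ')
    };
  });
}
```

Each returned object could then be passed to MailApp.sendEmail(alert.to, 'Fish aging alert', alert.body). Sending only once per line would additionally need some record of which lines have already been reported, which this sketch does not cover.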

Sort a range or array based on two columns that contain the date and time

Currently I'm trying to create a Google Apps Script for Google Sheets which will allow adding weekly recurring events, batchwise, for upcoming events. My colleagues will then make minor changes to these added events (e.g. make date and time corrections, change the contact person, add materials necessary for the event and so forth).
So far, I have written the following script:
function CopyWeeklyEventRows() {
  var ss = SpreadsheetApp.getActiveSheet();
  var repeatingWeeks = ss.getRange(5,1).getValue(); // gets how many weeks it should repeat
  var startDate = ss.getRange(6, 1).getValue(); // gets the start date
  var startWeekday = startDate.getDay(); // gives the weekday of the start date
  var regWeek = ss.getRange(9, 2, 4, 7).getValues(); // gets the regular week data
  var regWeekdays = new Array(regWeek.length); // creates an array to store the weekdays of the regWeek
  var ArrayStartDate = new Array(startDate); // helps to store the We
  for (var i = 0; i < regWeek.length; i++){ // calculates the difference between startWeekday and each regWeekdays
    regWeekdays[i] = regWeek[i][1].getDay() - startWeekday;
    Logger.log(regWeekdays[i]);
    // Add 7 to move to the next week and avoid negative values
    if (regWeekdays[i] < 0) {
      regWeekdays[i] = regWeekdays[i] + 7;
    }
    // Add days according to difference between startWeekday and each regWeekdays
    regWeek[i][0] = new Date(ArrayStartDate[0].getTime() + regWeekdays[i]*3600000*24);
  }
  // I'm struggling with this line. The array regWeek is not sorted:
  //regWeek.sort([{ column: 1, ascending: true }]);
  ss.getRange(ss.getLastRow() + 1, 2, 4, 7).setValues(regWeek); // copies weekly events after the last row
}
It adds one week of recurring events to the overview section of the spreadsheet based on a start date. If the start date is a Tuesday, the regular week is added starting from a Tuesday. However, the rows are not sorted according to the dates.
How can the rows be sorted by ascending date (followed by time) before adding them to the overview?
My search for similar questions revealed Google Script sort 2D Array by any column which is the closest hit I've found. The same error message is shown when running my script with the sort line. I don't understand the difference between Range and array yet which might help to solve the issue.
To give you a broader picture, here's what I'm currently working on:
I've noticed that the format will not necessarily remain when adding new recurring events. So far I haven't found the rule and formatted by hand in a second step.
A drawback is currently that the weekly recurring events section is fixed. I've tried to find the last filled entry and use it to set the range of regWeek, but got stuck.
Use the column A to exclude recurring events from the addition process using a dropdown.
Allow my colleagues to add an event to the recurring events using a dropdown (e.g. A26). This event should then be added with sorting to the right day of the week and start time. The sorting will come in handy.
Thanks in advance for your input regarding the sorting as well as suggestions on how to improve the code in general.
A demo version of the spreadsheet
UpdateV01:
Here are the code lines which copy and sort (first by date, then by time):
ss.getRange(ss.getLastRow()+1,2,4,7).setValues(regWeek); // copies weekly events after the last row
ss.getRange(ss.getLastRow()-3,2,4,7).sort([{column: 2, ascending: true}, {column: 4, ascending: true}]); // sorts only the copied weekly events chronologically
As @tehhowch pointed out, this is slow. Better to sort BEFORE writing.
I will implement this method and post it here.
UpdateV02:
regWeek.sort(function (r1, r2) {
  // sorts ascending on the third column, which is index 2
  return r1[2] - r2[2];
});
regWeek.sort(function (r1, r2) {
  // r1 and r2 are elements in the regWeek array, i.e.
  // they are each a row array if regWeek is an array of arrays:
  // Sort ascending on the first column, which is index 0:
  // if r1[0] = 1, r2[0] = 2, then 1 - 2 is -1, so r1 sorts before r2
  return r1[0] - r2[0];
});
UpdateV03:
Here is an attempt to repeat the recurring events over several weeks. I don't know yet how to include the push for the whole "week".
// Repeat week for "A5" times and add to start/end date
for (var j = 0; j < repeatingWeeks; j++){
  for (var i = 0; i < numFilledRows; i++){
    regWeekRepeated[i+j*6][0] = new Date(regWeek[i][0].getTime() + j*7*3600000*24); // <-This line leads to an error message
    regWeekRepeated[i+j*6][3] = new Date(regWeek[i][3].getTime() + j*7*3600000*24);
  }
}
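A possible way to do the repetition with push is sketched below. It is an assumption, not a confirmed diagnosis, that the error on the marked line comes from regWeekRepeated[i+j*6] not existing yet when its [0] element is assigned; building each row first and pushing it avoids that. repeatWeeks and dateColumns are hypothetical names, and adding a fixed number of milliseconds per week ignores daylight-saving shifts, just like the snippet above.

```javascript
var MS_PER_DAY = 24 * 3600000;

// Hypothetical helper: returns a new array containing every row of regWeek
// once per week. The Date values in the listed column indexes are shifted
// by 7 days per repetition; all other cells are copied unchanged.
function repeatWeeks(regWeek, repeatingWeeks, dateColumns) {
  var repeated = [];
  for (var j = 0; j < repeatingWeeks; j++) {
    regWeek.forEach(function (row) {
      var copy = row.slice(); // copy so the original week is not modified
      dateColumns.forEach(function (c) {
        copy[c] = new Date(row[c].getTime() + j * 7 * MS_PER_DAY);
      });
      repeated.push(copy);
    });
  }
  return repeated;
}
```

With the columns used above this would be called as repeatWeeks(regWeek, repeatingWeeks, [0, 3]), and the result has regWeek.length * repeatingWeeks rows.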
My question was answered and I was able to make the code work as intended.
Given your comment - you want to sort the written chunk - you have two methods available. One is to sort written data after writing, by using the Spreadsheet service's Range#sort(sortObject) method. The other is to sort the data before writing, using the JavaScript Array#sort(sortFunction()) method.
Currently, your sort code //regWeek.sort([{ column: 1, ascending: true }]); is attempting to sort a JavaScript array, using the sorting object expected by the Spreadsheet service. Thus, you can simply chain this .sort(...) call to your write call, as Range#setValues() returns the same Range, allowing repeated Range method calling (e.g. to set values, then apply formatting, etc.).
This looks like:
ss.getRange(ss.getLastRow() + 1, 2, regWeek.length, regWeek[0].length)
  .setValues(regWeek)
  /* other "chainable" Range methods you want to apply to
     the cells you just wrote to. */
  .sort([{column: 1, ascending: true}, ...]);
Here I have updated the range you access to reference the data you are attempting to write - regWeek - so that it is always the correct size to hold the data. I've also visually broken apart the one-liner so you can better see the "chaining" that is happening between Spreadsheet service calls.
The other method - sorting before writing - will be faster, especially as the size and complexity of the sort increases. The idea behind sorting an array is that you pass a function that returns a negative value when the first element should come before the second, a positive value when the first element should come after the second, and zero if they are equivalent. This means a function that returns a boolean is NOT going to sort as one thinks, since false and 0 are equivalent in JavaScript, while true and 1 are also equivalent.
Your sort looks like this, assuming regWeek is an array of arrays and you are sorting on numeric values (or at least values which will cast to numbers, like Dates).
regWeek.sort(function (r1, r2) {
  // r1 and r2 are elements in the regWeek array, i.e.
  // they are each a row array if regWeek is an array of arrays:
  // Sort ascending on the first column, which is index 0:
  // if r1[0] = 1, r2[0] = 2, then 1 - 2 is -1, so r1 sorts before r2
  return r1[0] - r2[0];
});
I strongly recommend reviewing the Array#sort documentation.
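To sort by date and then by time in a single pass, the same comparator idea can be extended to several columns. The following is a minimal sketch, assuming every compared cell is a number or a Date (anything that subtracts to a number); byColumns is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a comparator that compares two rows on the
// given column indexes in order, moving on to the next column only when
// the current one is a tie.
function byColumns(indexes) {
  return function (r1, r2) {
    for (var k = 0; k < indexes.length; k++) {
      var diff = r1[indexes[k]] - r2[indexes[k]];
      if (diff !== 0) return diff;
    }
    return 0; // equal on every listed column
  };
}
```

For example, regWeek.sort(byColumns([0, 2])) would sort on index 0 first and use index 2 only to break ties, which does not depend on running two separate sorts in the right order.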
You could sort the "Weekly Events" range before you set the regWeek variable. Then the range would be in the order you want before you process it. Or you could sort the whole "Overview" range after setting the data. Here's a quick function you can call to sort the range by multiple columns. You can of course tweak it to sort the "Weekly Events" range instead of the "Overview" range.
function sortRng() {
  var ss = SpreadsheetApp.getActiveSheet();
  var firstRow = 22; var firstCol = 1;
  var numRows = ss.getLastRow() - firstRow + 1;
  var numCols = ss.getLastColumn();
  var overviewRng = ss.getRange(firstRow, firstCol, numRows, numCols);
  Logger.log(overviewRng.getA1Notation());
  overviewRng.sort([{column: 2, ascending: true}, {column: 4, ascending: true}]);
}
As for getting the number of filled rows in the Weekly Events section, you need to search a column that will always have data if any row has data (like the start date column b), loop through the values and the first time it finds a blank, return that number. That will give you the number of rows that it needs to copy. Warning: if you don't have at least one blank value in column B between the Weekly Events section and the Overview section, you will probably get unwanted results.
function getNumFilledRows() {
  var ss = SpreadsheetApp.getActiveSheet();
  var eventFirstRow = 9; var numFilledRows = 0;
  var colToCheck = 'B'; // the StartDate col which should always have data if the row is filled
  var vals = ss.getRange(colToCheck + eventFirstRow + ":" + colToCheck).getValues();
  for (var i = 0; i < vals.length; i++) { // declare i so it does not leak into the global scope
    if (vals[i][0] == '') {
      numFilledRows = i;
      break;
    }
  }
  Logger.log(numFilledRows);
  return numFilledRows;
}
EDIT:
If you just want to sort the array in JavaScript before writing, and you want to sort by Start Date first, then by Time of day, you could make a temporary array, and add a column to each row that is date and time combined. Without a compare function, array.sort() compares elements as strings, so you would need to convert that date to an integer and sort with a compare function. Then you could sort the array by the new column, then delete the new column from each row. I included a function that does this below. It could be a lot more compact but I thought it might be more legible like this.
function sortDates() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var vals = ss.getActiveSheet().getRange('B22:H34').getDisplayValues(); //get display values because getValues returns time as weird date 1899 and wrong time.
  var theDate = new Date(); var newArray = []; var theHour = ''; var theMinutes = '';
  var theTime = '';
  //Create a new array that inserts date and time as the first column in each row
  vals.forEach(function(aRow) {
    theTime = aRow[2];//hardcoded - assumes time is the third column that you grabbed
    //get the hours (before colon) as a number
    theHour = Number(theTime.substring(0,theTime.indexOf(':')));
    //get the minutes(after colon) as a number
    theMinutes = Number(theTime.substring(theTime.indexOf(':')+1));
    theDate = new Date(aRow[0]);//hardcoded - assumes date is the first column you grabbed.
    theDate.setHours(theHour);
    theDate.setMinutes(theMinutes);
    aRow.unshift(theDate.getTime()); //Add the date and time as integer to the first item in the aRow array for sorting purposes.
    newArray.push(aRow);
  });
  //Sort the newArray based on the first item of each row (date and time as number)
  newArray.sort((function(index){
    return function(a, b){
      return (a[index] === b[index] ? 0 : (a[index] < b[index] ? -1 : 1));
    };
  })(0));
  //Remove the first column of each row (date and time combined) that we added in the first step
  newArray.forEach(function(aRow) {
    aRow.shift();
  });
  Logger.log(newArray);
}

group.all() call required for data to populate correctly

So I've encountered a weird issue when dealing with making Groups based on a variable when the crossfilter is using an array, instead of a literal number.
I currently have an output array of a date, then 4 values, that I then map into a composite graph. The problem is that the 4 values can fluctuate depending on the input given to the page. What I mean is that based on what it receives, I can have 3 values, or 10, and there's no way to know in advance. They're placed into an array which is then given to a crossfilter. When in testing, I was accessing using
dimension.group().reduceSum(function(d) { return d[0]; });
Where 0 was changed to whatever I needed. But I've finished testing, for the most part, and began to adapt it into a dynamic system where it can change, but there's always at least the first two. To do this I created an integer that keeps track of what index I'm at, and then increases it after the group has been created. The following code is being used:
var range = crossfilter(results);
var dLen = 0;
var curIndex = 0;
var dateDimension = range.dimension(function(d) { dLen = d.length; return d[curIndex]; });
curIndex++;
var aGroup = dateDimension.group().reduceSum(function(d) { return d[curIndex]; });
curIndex++;
var bGroup = dateDimension.group().reduceSum(function(d) { return d[curIndex]; });
curIndex++;
var otherGroups = [];
for(var h = 0; h < dLen-3; h++) {
  otherGroups[h] = dateDimension.group().reduceSum(function(d) { return d[curIndex]; });
  curIndex++;
}
var charts = [];
for(var x = 0; x < dLen - 3; x++) {
  charts[x] = dc.barChart(dataGraph)
    .group(otherGroups[x], "Extra Group " + (x+1))
    .hidableStacks(true);
}
charts[charts.length] = dc.lineChart(dataGraph)
  .group(aGroup, "Group A")
  .hidableStacks(true);
charts[charts.length] = dc.lineChart(dataGraph)
  .group(bGroup, "Group B")
  .hidableStacks(true);
The issue is this:
The graph gets built empty. I checked the curIndex variable multiple times and it was always correct. I finally decided to instead check the actual group's resulting data using the .all() method.
The weird thing is that AFTER I used .all(), now the data works. Without a .all() call, the graph cannot determine the data and outputs absolutely nothing, however if I call .all() immediately after the group has been created, it populates correctly.
Each Group needs to call .all(), or only the ones that do will work. For example, when I first was debugging, I used .all() only on aGroup, and only aGroup populated into the graph. When I added it to bGroup, then both aGroup and bGroup populated. So in the current build, every group has .all() called directly after it is created.
Technically there's no issue, but I'm really confused on why this is required. I have absolutely no idea what the cause of this is, and I was wondering if there was any insight into it. When I was using literals, there was no issue, it only happens when I'm using a variable to create the groups. I tried to get output later, and when I do I received NaN for all the values. I'm not really sure why .all() is changing values into what they should be especially when it only occurs if I do it immediately after the group has been created.
Below is a screenshot of the graph. The top is when everything has a .all() call after being created, while the bottom is when the Extra Groups (the ones defined in the for loop) do not have the .all() call anymore. The data is just not there at all, I'm not really sure why. Any thoughts would be great.
http://i.stack.imgur.com/0j1ey.jpg
It looks like you may have run into the classic "generating lambdas from loops" JavaScript problem.
You are creating a whole bunch of functions that reference curIndex but unless you call those functions immediately, they will refer to the same instance of curIndex in the global environment. So if you call them after initialization, they will probably all try to use a value which is past the end.
Instead, you might create a function which generates your lambdas, like so:
function accessor(curIndex) {
  return function(d) { return d[curIndex]; };
}
And then each time call .reduceSum(accessor(curIndex))
This will cause the value of curIndex to get copied each time you call the accessor function (or you can think of each generated function as having its own environment with its own curIndex).
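The difference can be reproduced without crossfilter. The sketch below is plain JavaScript; sharedAccessors and copiedAccessors are hypothetical names used only for this demonstration. The first builds functions that all read one shared variable when they are called, the second uses the accessor factory so each function keeps its own index.

```javascript
// Shared variable: every function reads curIndex at the time it is CALLED,
// not at the time it was created.
function sharedAccessors(count) {
  var accessors = [];
  var curIndex = 0;
  for (var h = 0; h < count; h++) {
    accessors.push(function (d) { return d[curIndex]; });
    curIndex++;
  }
  return accessors;
}

// Factory: each generated function closes over its own copy of the index.
function accessor(index) {
  return function (d) { return d[index]; };
}

function copiedAccessors(count) {
  var accessors = [];
  for (var h = 0; h < count; h++) {
    accessors.push(accessor(h));
  }
  return accessors;
}

var row = [10, 20, 30];
var shared = sharedAccessors(3).map(function (f) { return f(row); });
var copied = copiedAccessors(3).map(function (f) { return f(row); });
console.log(shared); // every entry is undefined: curIndex is already 3
console.log(copied); // [ 10, 20, 30 ]
```

Summing undefined values gives NaN, which matches the NaN output described in the question. If the group only runs its reduce function when its data is first requested, that would also explain why an immediate .all() call appears to fix things: the accessor then runs while curIndex still has the intended value.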

What is faster: ScriptDb or SpreadsheetApp?

Let's say I have a script that iterates over a list of 400 objects.
Each object has anywhere from 1 to 10 properties.
Each property is a reasonable size string or a somewhat large integer.
Is there a significant difference in performance between saving these objects into ScriptDB and saving them into a Spreadsheet (without doing it in one bulk operation)?
Executive Summary
Yes, there is a significant difference! Huge! And I have to admit that this experiment didn't turn out the way I expected.
With this amount of data, writing to a spreadsheet was always much faster than using ScriptDB.
These experiments support the assertions regarding bulk operations in the Google Apps Script Best Practices. Saving data in a spreadsheet using a single setValues() call was 75% faster than line-by-line, and two orders of magnitude faster than cell-by-cell.
On the other hand, recommendations to use SpreadsheetApp.flush() should be considered carefully, due to the performance impact. In these experiments, a single write of a 4000-cell spreadsheet took less than 50ms, and adding a call to flush() increased that to 610ms - still less than a second, but an order of magnitude tax seems ludicrous. Calling flush() for each of the 400 rows in the sample spreadsheet made the operation take almost 12 seconds, when it took just 164 ms without it. If you've been experiencing Exceeded maximum execution time errors, you may benefit from both optimizing your code AND removing calls to flush().
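For the bulk approach, the objects first have to be turned into a rectangular two-dimensional array. A minimal sketch of that step follows; objectsToRows is a hypothetical helper, and filling missing properties with an empty string is an assumption made here so that every row has the same length.

```javascript
// Hypothetical helper: converts an array of objects into a header row plus
// one row per object, suitable for a single Range#setValues() call.
// Properties missing from an object are written as ''.
function objectsToRows(objects) {
  var headers = [];
  objects.forEach(function (obj) {
    Object.keys(obj).forEach(function (key) {
      if (headers.indexOf(key) === -1) headers.push(key);
    });
  });
  var rows = objects.map(function (obj) {
    return headers.map(function (key) {
      return Object.prototype.hasOwnProperty.call(obj, key) ? obj[key] : '';
    });
  });
  return [headers].concat(rows);
}
```

In Apps Script the result could then be written in one call, for example sheet.getRange(1, 1, values.length, values[0].length).setValues(values) where values = objectsToRows(objects), provided the list contains at least one object with at least one property.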
Experimental Results
All timings were derived following the technique described in How to measure time taken by a function to execute. Times are expressed in milliseconds.
Here are the results from a single pass of five different approaches, two using ScriptDB, three writing to Spreadsheets, all with the same source data. (400 objects with 5 String & 5 Number attributes)
Experiment 1
Elapsed time for ScriptDB/Object test: 53529
Elapsed time for ScriptDB/Batch test: 37700
Elapsed time for Spreadsheet/Object test: 145
Elapsed time for Spreadsheet/Attribute test: 4045
Elapsed time for Spreadsheet/Bulk test: 32
Effect of SpreadsheetApp.flush()
Experiment 2
In this experiment, the only difference from Experiment 1 was that we called SpreadsheetApp.flush() after every setValue/s call. The cost of doing so is dramatic (around 700%), but it does not change the recommendation to use a spreadsheet over ScriptDB for speed reasons, because writing to spreadsheets is still faster.
Elapsed time for ScriptDB/Object test: 55282
Elapsed time for ScriptDB/Batch test: 37370
Elapsed time for Spreadsheet/Object test: 11888
Elapsed time for Spreadsheet/Attribute test: 117388
Elapsed time for Spreadsheet/Bulk test: 610
Note: This experiment was often killed with Exceeded maximum execution time.
Caveat Emptor
You're reading this on the interwebs, so it must be true! But take it with a grain of salt.
These are results from very small sample sizes, and may not be completely reproducible.
These results are measuring something that changes constantly - while they were observed on Feb 28 2013, the system they measured could be completely different when you read this.
The efficiency of these operations is affected by many factors that are not controlled in these experiments; caching of instructions & intermediate results and server load, for example.
Maybe, just maybe, someone at Google will read this, and improve the efficiency of ScriptDB!
The Code
If you want to perform (or better yet, improve) these experiments, create a blank spreadsheet, and copy this into a new script within it. This is also available as a gist.
/**
 * Run experiments to measure speed of various approaches to saving data in
 * Google Apps Script (GAS).
 */
function testSpeed() {
  var numObj = 400;
  var numAttr = 10;
  var doFlush = false; // Set true to activate calls to SpreadsheetApp.flush()
  var arr = buildArray(numObj,numAttr);
  var start, stop; // time catchers
  var db = ScriptDb.getMyDb();
  var sheet;
  // Save into ScriptDB, Object at a time
  deleteAll(); // Clear ScriptDB
  start = new Date().getTime();
  for (var i=1; i<=numObj; i++) {
    db.save({type: "myObj", data:arr[i]});
  }
  stop = new Date().getTime();
  Logger.log("Elapsed time for ScriptDB/Object test: " + (stop - start));
  // Save into ScriptDB, Batch
  var items = [];
  // Restructure data - this is done outside the timed loop, assuming that
  // the data would not be in an array if we were using this approach.
  for (var obj=1; obj<=numObj; obj++) {
    var thisObj = new Object();
    for (var attr=0; attr < numAttr; attr++) {
      thisObj[arr[0][attr]] = arr[obj][attr];
    }
    items.push(thisObj);
  }
  deleteAll(); // Clear ScriptDB
  start = new Date().getTime();
  db.saveBatch(items, false);
  stop = new Date().getTime();
  Logger.log("Elapsed time for ScriptDB/Batch test: " + (stop - start));
  // Save into Spreadsheet, Object at a time
  sheet = SpreadsheetApp.getActive().getActiveSheet().clear();
  start = new Date().getTime();
  for (var row=0; row<=numObj; row++) {
    var values = [];
    values.push(arr[row]);
    sheet.getRange(row+1, 1, 1, numAttr).setValues(values);
    if (doFlush) SpreadsheetApp.flush();
  }
  stop = new Date().getTime();
  Logger.log("Elapsed time for Spreadsheet/Object test: " + (stop - start));
  // Save into Spreadsheet, Attribute at a time
  sheet = SpreadsheetApp.getActive().getActiveSheet().clear();
  start = new Date().getTime();
  for (var row=0; row<=numObj; row++) {
    for (var cell=0; cell<numAttr; cell++) {
      sheet.getRange(row+1, cell+1, 1, 1).setValue(arr[row][cell]);
      if (doFlush) SpreadsheetApp.flush();
    }
  }
  stop = new Date().getTime();
  Logger.log("Elapsed time for Spreadsheet/Attribute test: " + (stop - start));
  // Save into Spreadsheet, Bulk
  sheet = SpreadsheetApp.getActive().getActiveSheet().clear();
  start = new Date().getTime();
  sheet.getRange(1, 1, numObj+1, numAttr).setValues(arr);
  if (doFlush) SpreadsheetApp.flush();
  stop = new Date().getTime();
  Logger.log("Elapsed time for Spreadsheet/Bulk test: " + (stop - start));
}
/**
* Create a two-dimensional array populated with 'numObj' rows of 'numAttr' cells.
*/
function buildArray(numObj,numAttr) {
  // Use logical OR for the defaults; bitwise | would combine the bits of the
  // argument with the default instead of falling back to it.
  numObj = numObj || 400;
  numAttr = numAttr || 10;
  var array = [];
  for (var obj = 0; obj <= numObj; obj++) {
    array[obj] = [];
    for (var attr = 0; attr < numAttr; attr++) {
      var value;
      if (obj == 0) {
        // Define attribute names / column headers
        value = "Attr"+attr;
      }
      else {
        value = ((attr % 2) == 0) ? "This is a reasonable sized string for testing purposes, not too long, not too short." : Number.MAX_VALUE;
      }
      array[obj].push(value);
    }
  }
  return array;
}
function deleteAll() {
  var db = ScriptDb.getMyDb();
  while (true) {
    var result = db.query({}); // get everything, up to limit
    if (result.getSize() == 0) {
      break;
    }
    while (result.hasNext()) {
      var item = result.next();
      db.remove(item);
    }
  }
}
ScriptDB has been deprecated. Do not use.

Google calendar query returns at most 25 entries

I'm trying to delete all calendar entries from today forward. I run a query then call getEntries() on the query result. getEntries() always returns 25 entries (or less if there are fewer than 25 entries on the calendar). Why aren't all the entries returned? I'm expecting about 80 entries.
As a test, I tried running the query, deleting the 25 entries returned, running the query again, deleting again, etc. This works, but there must be a better way.
Below is the Java code that only runs the query once.
CalendarQuery myQuery = new CalendarQuery(feedUrl);
DateFormat dfGoogle = new SimpleDateFormat("yyyy-MM-dd'T00:00:00'");
Date dt = Calendar.getInstance().getTime();
myQuery.setMinimumStartTime(DateTime.parseDateTime(dfGoogle.format(dt)));
// Make the end time far into the future so we delete everything
myQuery.setMaximumStartTime(DateTime.parseDateTime("2099-12-31T23:59:59"));
// Execute the query and get the response
CalendarEventFeed resultFeed = service.query(myQuery, CalendarEventFeed.class);
// !!! This returns 25 (or less if there are fewer than 25 entries on the calendar) !!!
int test = resultFeed.getEntries().size();
// Delete all the entries returned by the query
for (int j = 0; j < resultFeed.getEntries().size(); j++) {
    CalendarEventEntry entry = resultFeed.getEntries().get(j);
    entry.delete();
}
PS: I've looked at the Data API Developer's Guide and the Google Data API Javadoc. These sites are okay, but not great. Does anyone know of additional Google API documentation?
You can increase the number of results with myQuery.setMaxResults(). There will still be an upper limit on that value, though, so you can make multiple queries ('paged' results) by varying myQuery.setStartIndex().
http://code.google.com/apis/gdata/javadoc/com/google/gdata/client/Query.html#setMaxResults(int)
http://code.google.com/apis/gdata/javadoc/com/google/gdata/client/Query.html#setStartIndex(int)
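The paging idea itself is independent of the client library. It is sketched here in JavaScript rather than Java, purely as an illustration: fetchPage is a hypothetical function standing in for "set the start index and max results, run the query, return the entries", and the start index is assumed to be 1-based.

```javascript
// Collects every result from a paged source.
// fetchPage(startIndex, maxResults) is hypothetical: it must return an array
// with at most maxResults entries, beginning at the 1-based startIndex, and
// an empty array once there is nothing left.
function fetchAll(fetchPage, maxResults) {
  var all = [];
  var startIndex = 1;
  while (true) {
    var page = fetchPage(startIndex, maxResults);
    if (page.length === 0) break; // reached the end of the feed
    all = all.concat(page);
    startIndex += page.length; // advance by what was actually returned
  }
  return all;
}
```

Advancing by the number of entries actually returned, rather than by maxResults, keeps the loop correct even when the server hands back fewer results than were requested.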
Based on the answers from Jim Blackler and Chris Kaminski, I enhanced my code to read the query results in pages. I also do the delete as a batch, which should be faster than doing individual deletions.
I'm providing the Java code here in case it is useful to anyone.
CalendarQuery myQuery = new CalendarQuery(feedUrl);
DateFormat dfGoogle = new SimpleDateFormat("yyyy-MM-dd'T00:00:00'");
Date dt = Calendar.getInstance().getTime();
myQuery.setMinimumStartTime(DateTime.parseDateTime(dfGoogle.format(dt)));
// Make the end time far into the future so we delete everything
myQuery.setMaximumStartTime(DateTime.parseDateTime("2099-12-31T23:59:59"));
// Set the maximum number of results to return for the query.
// Note: A GData server may choose to provide fewer results, but will never provide
// more than the requested maximum.
myQuery.setMaxResults(5000);
int startIndex = 1;
int entriesReturned;
List<CalendarEventEntry> allCalEntries = new ArrayList<CalendarEventEntry>();
CalendarEventFeed resultFeed;
// Run our query as many times as necessary to get all the
// Google calendar entries we want
while (true) {
    myQuery.setStartIndex(startIndex);
    // Execute the query and get the response
    resultFeed = service.query(myQuery, CalendarEventFeed.class);
    entriesReturned = resultFeed.getEntries().size();
    if (entriesReturned == 0)
        // We've hit the end of the list
        break;
    // Add the returned entries to our local list
    allCalEntries.addAll(resultFeed.getEntries());
    startIndex = startIndex + entriesReturned;
}
// Delete all the entries as a batch delete
CalendarEventFeed batchRequest = new CalendarEventFeed();
for (int i = 0; i < allCalEntries.size(); i++) {
    CalendarEventEntry entry = allCalEntries.get(i);
    BatchUtils.setBatchId(entry, Integer.toString(i));
    BatchUtils.setBatchOperationType(entry, BatchOperationType.DELETE);
    batchRequest.getEntries().add(entry);
}
// Get the batch link URL and send the batch request
Link batchLink = resultFeed.getLink(Link.Rel.FEED_BATCH, Link.Type.ATOM);
CalendarEventFeed batchResponse = service.batch(new URL(batchLink.getHref()), batchRequest);
// Ensure that all the operations were successful
boolean isSuccess = true;
StringBuffer batchFailureMsg = new StringBuffer("These entries in the batch delete failed:");
for (CalendarEventEntry entry : batchResponse.getEntries()) {
    String batchId = BatchUtils.getBatchId(entry);
    if (!BatchUtils.isSuccess(entry)) {
        isSuccess = false;
        BatchStatus status = BatchUtils.getBatchStatus(entry);
        batchFailureMsg.append("\nID: " + batchId + " Reason: " + status.getReason());
    }
}
if (!isSuccess) {
    throw new Exception(batchFailureMsg.toString());
}
There is a small quote on the API page
http://code.google.com/apis/calendar/data/1.0/reference.html#Parameters
Note: The max-results query parameter for Calendar is set to 25 by default, so that you won't receive an entire calendar feed by accident. If you want to receive the entire feed, you can specify a very large number for max-results.
So to get all events from a google calendar feed, we do this:
google.calendarurl.com/.../basic?max-results=999999
In the API you can do the same with setMaxResults(999999).
I got here while searching for a Python solution;
Should anyone be stuck in the same way, the important line is the fourth:
query = gdata.calendar.service.CalendarEventQuery(cal, visibility, projection)
query.start_min = start_date
query.start_max = end_date
query.max_results = 1000
Unfortunately, Google is going to limit the maximum number of results you can retrieve in one query. This is so that the query governor stays within their guidelines (HTTP requests are not allowed to take more than 30 seconds, for example). They've built their whole architecture around this, so you might as well keep the paging logic you have.
