Bing Custom Search API returning only limited results from one location and full results from a different location

I am trying to use Bing Custom Search's API for documents from Cognitive Services. The strange thing is that when I run it from India, it gives me more than a thousand results, but when I run it from a US server, it returns only 25 (sometimes 50) results. Here is the sample code for that:
var totalCount = 0;
var filetypes = new List<string> { "pdf", "docx", "doc" };
foreach (var filetype in filetypes)
{
var searchTerm = "microsoft%20.net%20resume+filetype%3a" + filetype;
Console.WriteLine("Searching for : " + filetype);
for (var i = 0; i < 40; i++)
{
var nextCount = 0;
var url = "https://api.cognitive.microsoft.com/bingcustomsearch/v7.0/search?" +
"q=" + searchTerm +
"&customconfig=" + customConfigId +
"&count=25" + "&offset=" + ((i * 25) + nextCount);
using (var client = new HttpClient())
{
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
var httpResponseMessage = client.GetAsync(url).Result;
var responseContent = httpResponseMessage.Content.ReadAsStringAsync().Result;
BingCustomSearchResponse response =
JsonConvert.DeserializeObject<BingCustomSearchResponse>(responseContent);
if (response.webPages == null || response.webPages.value.Length <= 0)
{
Console.WriteLine("response.webPages is null ");
break;
}
foreach (var webPage in response.webPages.value)
{
Console.WriteLine("name: " + webPage.name);
Console.WriteLine("url: " + webPage.url);
Console.WriteLine("displayUrl: " + webPage.displayUrl);
Console.WriteLine("snippet: " + webPage.snippet);
Console.WriteLine("dateLastCrawled: " + webPage.dateLastCrawled);
Console.WriteLine();
}
totalCount = totalCount + response.webPages.value.Length;
}
}
}
The subscription key I am using is a trial key.

I found the reason for this behavior, and it had nothing to do with region, country, or market.
After looking into the response, I found this message:
"Rate limit is exceeded. Try again in 1 seconds"
This means that after each call in the loop I have to wait one second before making the next call. What I still need to find out is whether this limit applies only to trial subscriptions or to all calls, for example to prevent DDoS attacks.
It probably worked from India because each iteration already took a second or more there.
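The pacing idea described above can be sketched as follows. This is only an illustration, written in JavaScript to match the other examples on this page; in the C# loop from the question the equivalent pause would be an await Task.Delay(...). The names fetchAllPages, fetchPage, and the rateLimited flag are hypothetical: how a rate-limited response is detected depends on what the API actually returns, which is not shown in the question.

```javascript
// Resolves after `ms` milliseconds.
function sleep(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// fetchPage(offset) is a hypothetical caller-supplied function returning a
// promise of { rateLimited: boolean, results: array }.
async function fetchAllPages(fetchPage, options) {
  var pageSize = options.pageSize;     // e.g. 25, as in the question
  var maxPages = options.maxPages;     // e.g. 40, as in the question
  var delayMs = options.delayMs;       // pause between calls, e.g. 1000
  var maxRetries = options.maxRetries; // retries per page when rate-limited
  var all = [];
  for (var page = 0; page < maxPages; page++) {
    var response = null;
    for (var attempt = 0; attempt <= maxRetries; attempt++) {
      response = await fetchPage(page * pageSize);
      if (!response.rateLimited) break;
      await sleep(delayMs); // back off, then retry the same offset
    }
    // Stop when retries are exhausted or there are no more results.
    if (response.rateLimited || response.results.length === 0) break;
    all = all.concat(response.results);
    await sleep(delayMs); // stay under the per-second limit
  }
  return all;
}
```

The point of retrying the same offset is that a rate-limited call is no longer mistaken for "no more results", which is what made the result count look location-dependent.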

Two things you can try: 1) In searchTerm, there is no need to use %20 and %3a; just use punctuation as you would type it in Bing, e.g. var searchTerm = "microsoft .net resume filetype:" + filetype, and 2) enforce the market by appending mkt=en-in (for India) or mkt=en-us (for US) to the query. You can do this by appending + "&mkt=en-in" to the end of url.
I presume for custom search you have selected domains (for both en-in and en-us markets) that return thousands of results for this query.


Console.Log and Discord Message displaying different results

I have a relatively basic set of code here that is programmed to roll 3 dice and give me the results whenever I type "/roll" in discord.
However, the results displayed on my command terminal are always different from those in the discord message from the bot.
async execute(interaction)
{
var num = 3;
num = Number(num);
function rollingDice(num) {
// make the dice rolling "num" times.
// return the results.
var diceResults = "";
for (var i = 0; i < num; i++) {
var resultOfEachDice = "";
resultOfEachDice = Math.floor((Math.random() * 6) + 1);
diceResults += resultOfEachDice + ", ";
}
var lastComma = diceResults.lastIndexOf(", ");
diceResults = diceResults.slice(0, lastComma);
return diceResults;
}
var diceResults = rollingDice()
console.log("Rolled " + num + " dice: " + rollingDice(num));
console.log(process.argv);
await interaction.reply('You rolled: ' + rollingDice(num));
So I will type /roll and my terminal will say I rolled "3, 5, 2" while the message would have something entirely different like "1, 6, 4".
This happens every time I run the command and I am not sure what the issue is.
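A likely explanation, based only on the code shown: rollingDice(num) is called once inside console.log and again inside interaction.reply, and each call generates fresh random numbers, so the two outputs cannot be expected to match. A minimal sketch of rolling once and reusing the result; rollDice and rollOnce are hypothetical names, not part of the original command:

```javascript
// Rolls `count` six-sided dice and returns the individual results.
function rollDice(count) {
  var results = [];
  for (var i = 0; i < count; i++) {
    results.push(Math.floor(Math.random() * 6) + 1);
  }
  return results;
}

// Rolls ONCE, then builds both outputs from the same stored value.
function rollOnce(count) {
  var text = rollDice(count).join(", ");
  return {
    logLine: "Rolled " + count + " dice: " + text,
    replyText: "You rolled: " + text
  };
}
```

Inside the command handler this would be used as var out = rollOnce(3); console.log(out.logLine); await interaction.reply(out.replyText); so the terminal and the Discord message show the same roll.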

Why is this getting the same row twice in second loop?

The code below runs with no apparent errors, but despite checking every possible cause I could think of, I couldn't find the reason for this behavior: it gets the correct row, marks it as processed ("Sim"), and from the 2nd iteration on, it gets the new row but also repeats the rows it has already iterated over.
function formToData() {
var sheet = SpreadsheetApp.getActiveSpreadsheet();
var ss = sheet.getSheetByName("Form Responses 1");
var targetSheet = sheet.getSheetByName("Cadastro de Cliente");
var StartRow = 2;
var RowRange = ss.getLastRow() - StartRow + 1;
var WholeRange = ss.getRange(StartRow, 1, RowRange, 30);
var AllValues = WholeRange.getValues();
var message = "";
for (var i = 0; i < AllValues.length; i++) {
var currentRow = AllValues[i];
//if row has been sent, then continue to next iteration
if (currentRow[0] != "" && currentRow[29] != "Sim") {
//set the row to look at
var setRow = parseInt(i) + StartRow;
var data = currentRow[0];
var dataFormatted = Utilities.formatDate(data, SpreadsheetApp.getActive().getSpreadsheetTimeZone(), "dd/MM/yyyy', às 'HH:mm") + "hs";
//set HTML template for information
message +=
"<p><b>Data: </b>" + dataFormatted + "</p>" +
"<p><b>Unidade: </b>" + currentRow[1] + "</p>"
//mark row as "sent"
ss.getRange(setRow, 30).setValue("Sim");
var values = targetSheet.getRange("A:A").getValues();
var maxIndex = values.reduce(function (maxIndex, row, index) {
return row[0] === "" ? maxIndex : index;
}, 0);
targetSheet.getRange(maxIndex + 2, 1, 1, 30)
.setNumberFormat("#")
.setValues([currentRow]);
var sendTo = "email";
var subject = "Cadastro de cliente novo";
if (message) {
MailApp.sendEmail({
to: sendTo,
subject: subject,
name: "Comercial - Emape",
htmlBody: message,
});
}
}
}//For loop closes
}
I'd appreciate it if you could help me find the flaw.
I tried to replicate this behavior; however, on a static sheet the script performs as expected:
Set unmarked rows in column 30 as 'Sim'.
Copy these rows to a separate sheet starting from the first empty row (or whichever row has an empty Column A).
Given the names of the functions and the sheets, this may be a Sheet generated by a Google Form. These sheets are dynamic, and there is a possibility that the contents may change while your script is running, especially if users are allowed to edit responses.
As a workaround, I suggest locking the form before running the script.
Also, check the contents of the sheet and form for any possibility that the "Sim" mark might be overwritten by new or edited form data; for example, the sheet may be inserting 30 columns instead of 29 or fewer.
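One more thing worth checking in the question's code, offered as a possibility rather than a confirmed diagnosis: message is declared before the loop and extended with += for every matching row, while MailApp.sendEmail runs inside the loop. Each email would then contain every row processed so far in that run, which would look like earlier rows being repeated. A minimal sketch of building the body per row instead; buildRowMessage and pendingMessages are hypothetical helper names, and the date formatting from the question is left out for brevity:

```javascript
// Hypothetical helper: builds the HTML body for ONE row only.
function buildRowMessage(row) {
  return "<p><b>Data: </b>" + row[0] + "</p>" +
         "<p><b>Unidade: </b>" + row[1] + "</p>";
}

// Returns one { sheetRow, message } entry per row not yet marked "Sim".
// allValues is the 2D array from getValues(); startRow is the first data row.
function pendingMessages(allValues, startRow) {
  var pending = [];
  for (var i = 0; i < allValues.length; i++) {
    var row = allValues[i];
    if (row[0] !== "" && row[29] !== "Sim") {
      // A fresh message per row, so earlier rows never leak into later emails.
      pending.push({ sheetRow: i + startRow, message: buildRowMessage(row) });
    }
  }
  return pending;
}
```

Each entry can then be emailed and marked "Sim" on its own, or the messages can be joined and sent as a single email after the loop.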

Fastest way to search for a row in a large Google Sheet using/in Google Apps Script

GAS is quite powerful and you could write a full fledged web-app using a Google Sheet as the DB back-end. There are many reasons not to do this but I figure in some cases it is okay.
I think the biggest issue will be performance issues when looking for rows based on some criteria in a sheet with a lot of rows. I know there are many ways to "query" a sheet but I can't find reliable information on which is the fastest.
One of the complexities is that many people can edit a sheet which means there are a variable number of situations you'd have to account for. For the sake of simplicity, I want to assume the sheet:
Is locked down so only one person can see it
The first column has the row number (=row())
The most basic query is finding a row where a specific column equals some value.
Which method would be the fastest?
I have a sheet with ~19k rows and ~38 columns, filled with all sorts of unsorted real-world data. That is roughly 700k cells, so I figured it would be a good sheet to time a few methods and see which is the fastest.
method 1: get sheet as a 2D array then go through each row
method 2: get sheet as a 2D array, sort it, then using a binary search algorithm to find the row
method 3: make a UrlFetch call to Google visualization query and don't provide last row
method 4: make a UrlFetch call to Google visualization query and provide last row
Here are my query functions.
function method1(spreadsheetID, sheetName, columnIndex, query)
{
// get the sheet values excluding header,
var rowValues = SpreadsheetApp.openById(spreadsheetID).getSheetByName(sheetName).getSheetValues(2, 1, -1, -1);
// loop through each row
for(var i = 0, numRows = rowValues.length; i < numRows; ++i)
{
// return it if found
if(rowValues[i][columnIndex] == query) return rowValues[i]
}
return false;
}
function method2(spreadsheetID, sheetName, columnIndex, query)
{
// get the sheet values excluding header
var rowValues = SpreadsheetApp.openById(spreadsheetID).getSheetByName(sheetName).getSheetValues(2, 1, -1, -1);
// sort it
rowValues.sort(function(a, b){
if(a[columnIndex] < b[columnIndex]) return -1;
if(a[columnIndex] > b[columnIndex]) return 1;
return 0;
});
// search using binary search
var foundRow = matrixBinarySearch(rowValues, columnIndex, query, 0, rowValues.length - 1);
// return if found
if(foundRow != -1)
{
return rowValues[foundRow];
}
return false;
}
function method3(spreadsheetID, sheetName, queryColumnLetterStart, queryColumnLetterEnd, queryColumnLetterSearch, query)
{
// SQL like query
var myQuery = "SELECT * WHERE " + queryColumnLetterSearch + " = '" + query + "'";
// the query URL
// don't provide last row in range selection
var qvizURL = 'https://docs.google.com/spreadsheets/d/' + spreadsheetID + '/gviz/tq?tqx=out:json&headers=1&sheet=' + sheetName + '&range=' + queryColumnLetterStart + ":" + queryColumnLetterEnd + '&tq=' + encodeURIComponent(myQuery);
// fetch the data
var ret = UrlFetchApp.fetch(qvizURL, {headers: {Authorization: 'Bearer ' + ScriptApp.getOAuthToken()}}).getContentText();
// remove some crap from the return string
return JSON.parse(ret.replace("/*O_o*/", "").replace("google.visualization.Query.setResponse(", "").slice(0, -2));
}
function method4(spreadsheetID, sheetName, queryColumnLetterStart, queryColumnLetterEnd, queryColumnLetterSearch, query)
{
// find the last row in the sheet
var lastRow = SpreadsheetApp.openById(spreadsheetID).getSheetByName(sheetName).getLastRow();
// SQL like query
var myQuery = "SELECT * WHERE " + queryColumnLetterSearch + " = '" + query + "'";
// the query URL
var qvizURL = 'https://docs.google.com/spreadsheets/d/' + spreadsheetID + '/gviz/tq?tqx=out:json&headers=1&sheet=' + sheetName + '&range=' + queryColumnLetterStart + "1:" + queryColumnLetterEnd + lastRow + '&tq=' + encodeURIComponent(myQuery);
// fetch the data
var ret = UrlFetchApp.fetch(qvizURL, {headers: {Authorization: 'Bearer ' + ScriptApp.getOAuthToken()}}).getContentText();
// remove some crap from the return string
return JSON.parse(ret.replace("/*O_o*/", "").replace("google.visualization.Query.setResponse(", "").slice(0, -2));
}
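The clean-up step in method3 and method4 relies on the exact prefix text and on the response ending in exactly two trailing characters. A slightly more tolerant sketch, assuming only that the JSON sits between the first "(" and the last ")" of the response; parseGvizResponse is a hypothetical helper name:

```javascript
// Extracts and parses the JSON payload from a wrapped response such as
// /*O_o*/ google.visualization.Query.setResponse({...});
function parseGvizResponse(text) {
  var start = text.indexOf("(");
  var end = text.lastIndexOf(")");
  if (start === -1 || end <= start) {
    throw new Error("Unexpected response format");
  }
  // Everything between the wrapper's parentheses is the JSON body.
  return JSON.parse(text.slice(start + 1, end));
}
```

With this helper, the last line of either method could become return parseGvizResponse(ret); without changing what the callers receive.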
My binary search algorithm:
function matrixBinarySearch(matrix, columnIndex, query, firstIndex, lastIndex)
{
// find the value using binary search
// https://www.w3resource.com/javascript-exercises/javascript-array-exercise-18.php
// first make sure the query string is valid
// if it is less than the smallest value
// or larger than the largest value
// it is not valid
if(query < matrix[firstIndex][columnIndex] || query > matrix[lastIndex][columnIndex]) return -1;
// if its the first row
if(query == matrix[firstIndex][columnIndex]) return firstIndex;
// if its the last row
if(query == matrix[lastIndex][columnIndex]) return lastIndex;
// now start doing binary search
var middleIndex = Math.floor((lastIndex + firstIndex)/2);
while(matrix[middleIndex][columnIndex] != query && firstIndex < lastIndex)
{
if(query < matrix[middleIndex][columnIndex])
{
lastIndex = middleIndex - 1;
}
else if(query > matrix[middleIndex][columnIndex])
{
firstIndex = middleIndex + 1;
}
middleIndex = Math.floor((lastIndex + firstIndex)/2);
}
return matrix[middleIndex][columnIndex] == query ? middleIndex : -1;
}
This is the function I used to test them all:
// each time this function is called it will try one method
// the first time it is called it will try method1
// then method2, then method3, then method4
// after it does method4 it will start back at method1
// we will use script properties to save which method is next
// we also want to use the same query string for each batch so we'll save that in script properties too
function testIt()
{
// get the sheet where we're saving run times
var runTimesSheet = SpreadsheetApp.openById("...").getSheetByName("times");
// we want true speed tests without server-side caching, so we query a fresh copy of our data sheet
// make a copy of our data sheet and get its ID
var tempSheetID = SpreadsheetApp.openById("...").copy("temp sheet").getId();
// get script properties
var scriptProperties = PropertiesService.getScriptProperties();
// the counter
var searchCounter = Number(scriptProperties.getProperty("searchCounter"));
// index of search list we want to query for
var searchListIndex = Number(scriptProperties.getProperty("searchListIndex"));
// if we're at 0 then we need to get the index of the query string
if(searchCounter == 0)
{
searchListIndex = Math.floor(Math.random() * searchList.length);
scriptProperties.setProperty("searchListIndex", searchListIndex);
}
// query string
var query = searchList[searchListIndex];
// save relevant data
var timerRow = ["method" + (searchCounter + 1), searchListIndex, query, 0, "", "", "", ""];
// run the appropriate method
switch(searchCounter)
{
case 0:
// start time
var start = (new Date()).getTime();
// run the query
var ret = method1(tempSheetID, "Extract", 1, query);
// end time
timerRow[3] = ((new Date()).getTime() - start) / 1000;
// if we found the row save its values in the timer output so we can confirm it was found
if(ret)
{
timerRow[4] = ret[0];
timerRow[5] = ret[1];
timerRow[6] = ret[2];
timerRow[7] = ret[3];
}
break;
case 1:
var start = (new Date()).getTime();
var ret = method2(tempSheetID, "Extract", 1, query);
timerRow[3] = ((new Date()).getTime() - start) / 1000;
if(ret)
{
timerRow[4] = ret[0];
timerRow[5] = ret[1];
timerRow[6] = ret[2];
timerRow[7] = ret[3];
}
break;
case 2:
var start = (new Date()).getTime();
var ret = method3(tempSheetID, "Extract", "A", "AL", "B", query);
timerRow[3] = ((new Date()).getTime() - start) / 1000;
if(ret.table.rows.length)
{
timerRow[4] = ret.table.rows[0].c[0].v;
timerRow[5] = ret.table.rows[0].c[1].v;
timerRow[6] = ret.table.rows[0].c[2].v;
timerRow[7] = ret.table.rows[0].c[3].v;
}
break;
case 3:
var start = (new Date()).getTime();
var ret = method4(tempSheetID, "Extract", "A", "AL", "B", query);
timerRow[3] = ((new Date()).getTime() - start) / 1000;
if(ret.table.rows.length)
{
timerRow[4] = ret.table.rows[0].c[0].v;
timerRow[5] = ret.table.rows[0].c[1].v;
timerRow[6] = ret.table.rows[0].c[2].v;
timerRow[7] = ret.table.rows[0].c[3].v;
}
break;
}
// delete the temp file
DriveApp.getFileById(tempSheetID).setTrashed(true);
// save run times
runTimesSheet.appendRow(timerRow);
// start back at 0 if we're the end
if(++searchCounter == 4) searchCounter = 0;
// save the search counter
scriptProperties.setProperty("searchCounter", searchCounter);
}
I have a global variable searchList that is an array of various query strings -- some are in the sheet, some are not.
I ran testIt on a trigger to run every minute. After 152 iterations I had 38 batches. Looking at the results, this is what I see for each method:
| Method | Minimum Seconds | Maximum Seconds | Average Seconds |
|---------|-----------------|-----------------|-----------------|
| method1 | 8.24 | 36.94 | 11.86 |
| method2 | 9.93 | 23.38 | 14.09 |
| method3 | 1.92 | 5.48 | 3.06 |
| method4 | 2.20 | 11.14 | 3.36 |
So it appears that, at least for my data set, using the Google visualization query is the fastest.
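For anyone repeating the test, the minimum, maximum, and average per method can be computed from the logged rows with something like the sketch below. It assumes the method name and the elapsed seconds have already been pulled out of the "times" sheet as [methodName, seconds] pairs; summarizeTimes is a hypothetical name and is not how the table above was produced.

```javascript
// rows: array of [methodName, seconds] pairs.
// Returns one { method, min, max, average } object per method.
function summarizeTimes(rows) {
  var stats = {};
  rows.forEach(function (row) {
    var name = row[0];
    var seconds = row[1];
    if (!stats[name]) {
      stats[name] = { min: seconds, max: seconds, total: 0, count: 0 };
    }
    var s = stats[name];
    s.min = Math.min(s.min, seconds);
    s.max = Math.max(s.max, seconds);
    s.total += seconds;
    s.count += 1;
  });
  return Object.keys(stats).map(function (name) {
    var s = stats[name];
    return { method: name, min: s.min, max: s.max, average: s.total / s.count };
  });
}
```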

How to reduce execution time of this Google Apps Script?

I wrote a script that gets row data from a spreadsheet and loops through the rows, calling a function to send an SMS if a row's data meets certain conditions (for example, having a phone number and not having been sent already).
However, after adding about 600 rows, the script execution time exceeds its limit, which seems to be 5 minutes according to my research. I'm using JavaScript objects to read data and a for loop to iterate through the rows.
Can anyone tell me if it is possible to make it faster? I'm very new to programming, but this seems such a light task for all this computing power that I can't understand why it takes so long.
Thanks in advance!
Here's the code of the function I'm using:
// Will send SMS on the currently active sheet
function sendSms() {
// Use the send sms menu to trigger reconcile
var user = ScriptProperties.getProperty(PROPERTY_USER_RECONCILE);
if (user == null)
reconcileUser();
// The sheets
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Registo");
var settingsSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Settings");
// Fetch values for each row in the Range.
var startRow = 2;
var apiKey = settingsSheet.getRange("B2").getValue();
var apiSecret = settingsSheet.getRange("B3").getValue();
var prefix = settingsSheet.getRange("B4").getValue();
var numRows = sheet.getMaxRows() - 1;
var numCols = 16;
var statusColNum = 15; // IMPT: To keep track status in col 15
var dataRange = sheet.getRange(startRow, 1, numRows, numCols);
// Make sure there is API key and secret
if (apiKey == "" || apiSecret == "") {
Browser.msgBox("You MUST fill in your API key and secret in Settings sheet first!");
return;
}
// Create one JavaScript object per row of data.
var objects = getRowsData(sheet, dataRange);
var totalSent = 0;
for (var i = 0; i < objects.length; ++i) {
// Get a row object
var rowData = objects[i];
var ss = SpreadsheetApp.getActiveSpreadsheet();
var templateSheet = ss.getSheetByName("SMS Modelo");
var template = templateSheet.getRange("A1").getValue();
// jump loop iteration if conditions not satisied
if (rowData.resolv == "x" || rowData.contactoUtente == null || rowData.contactoUtente == "" || rowData.reserv == null || rowData.reserv == "" || rowData.cont == "x" || rowData.sms !== null) continue;
var message = fillInTemplateFromObject(template, rowData);
var senderName = "Farm Cunha"
var mobile = rowData.contactoUtente;
// Send via Nexmo API
var response = nexmoSendSms(apiKey, apiSecret,"+351" + mobile, message, senderName);
if (response.getResponseCode() == 200) {
var object = JSON.parse(response.getContentText());
if (object.messages[0]['status'] == "0") {
// Set to QUEUE status - We assumed SENT, as we don't handle delivery status.
//sheet.getRange(startRow + i, statusColNum).setValue(STATUS_QUEUE);
sheet.getRange(startRow + i, statusColNum).setValue(STATUS_SENT);
// Set the reference id
sheet.getRange(startRow + i, 19).setValue(object.messages[0]['message-id']);
// sheet.getRange(startRow + i, statusColNum+3).setValue(new Date()); linha pode ser activada para fazer timestamp do envio
totalSent++;
}
else {
// If status is not 0, then it is an error.
// Set status to the error text
sheet.getRange(startRow + i, statusColNum).setValue(object.messages[0]['error-text']);
}
}
else {
// Non 200 OK response
sheet.getRange(startRow + i, statusColNum).setValue("Error Response Code: " + response.getResponseCode());
}
SpreadsheetApp.flush();
// Need a wait. Need to throttle else will have "Route Busy" error.
Utilities.sleep(2000);
}
// Update total sent
var lastTotalSent = parseInt(ScriptProperties.getProperty(PROPERTY_SMS_SENT_FOR_RECONCILE));
if (isNaN(lastTotalSent)) lastTotalSent = 0;
ScriptProperties.setProperty(PROPERTY_SMS_SENT_FOR_RECONCILE, (lastTotalSent + totalSent).toString());
Logger.log("Last sent: " + lastTotalSent + " now sent: " + totalSent);
reconcileApp();
}
You have a few things in your loop that are too time-consuming: spreadsheet reads, API calls, and a 2-second sleep.
I would advise you to take these out of the loop where possible (especially the template sheet read, which always returns the same value). A possible solution would be to check the conditions on the row objects first and save the valid entries in an array, then iterate over that array to call the API.
If this is still too slow, proceed in small batches: save the end position of the partial iteration in script properties and use a timer trigger that continues the process every 5 minutes until it is completed (and delete the trigger at the end).
There are a few examples of this kind of "mechanics" on this forum; one recent example I suggested is here (it's more of a draft, but the idea is there).
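The batching idea can be sketched roughly like this. It is only an outline under stated assumptions: processBatch is a hypothetical name, handleItem stands in for the per-row work (the API call and status update), and saving the returned index to script properties and creating or deleting the trigger are left out.

```javascript
// Processes items from startIndex until the list ends or the time budget
// is used up. Returns the index to resume from (items.length when done).
// `now` is an optional clock function, injectable for testing.
function processBatch(items, startIndex, budgetMs, handleItem, now) {
  var clock = now || Date.now;
  var startedAt = clock();
  var i = startIndex;
  while (i < items.length && clock() - startedAt < budgetMs) {
    handleItem(items[i], i);
    i++;
  }
  return i;
}
```

Each triggered run would read the saved index, call processBatch with a budget safely below the execution limit, and store the returned index; once the index equals the number of items, the trigger can be removed.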
OK, I've solved it by taking these 3 lines out of the loop, as Serge (thanks) told me to:
var ss = SpreadsheetApp.getActiveSpreadsheet();
var templateSheet = ss.getSheetByName("SMS Modelo");
var template = templateSheet.getRange("A1").getValue();
It's so simple that I don't know how I was not seeing that.
This simple change made the script much faster. For example, going through 600 rows would take more than 5 minutes. Now, more than 5000 rows only take seconds.

Microsoft Outlook Interop (extract attachments) very slow

I'm using Microsoft.Office.Interop.Outlook to extract e-mail attachments:
var MAPI = new Application().GetNamespace("MAPI");
var ExampleFolder = MAPI.GetDefaultFolder(OlDefaultFolders.olFolderSentMail);
foreach (dynamic i in ExampleFolder.Items)
if (i.Attachments.Count > 0)
; // DoSomething();
Unfortunately this is extremely slow.
Is there any faster way to check for attachments?
Is it possible to filter/sort e-mails by date: loop through the last n items only?
Sure, you can sort the collection using Items.Sort.
You can also use Items.Find/FindNext or Items.Restrict to look for items with attachments only. The property you need is PR_HASATTACH (DASL name http://schemas.microsoft.com/mapi/proptag/0x0E1B000B).
@Kiquenet (I can't add a comment below yours), here is the code to get items with attachments from Items.Restrict:
//fanti's code
var MAPI = new Application().GetNamespace("MAPI");
var ExampleFolder = MAPI.GetDefaultFolder(OlDefaultFolders.olFolderSentMail);
Urn way (tested, ok -> source https://social.msdn.microsoft.com/Forums/windowsapps/en-US/b6fef244-756c-4ab0-a22b-78137cfb4349/datereceived-filter-nor-happeinig?forum=outlookdev):
var itemsWithAttachment = ExampleFolder.Items.Restrict("@SQL= urn:schemas:httpmail:hasattachment = True");
DASL way (tested, ko with the property name written as https://...; the DASL name given in the answer above uses http://, which may explain the failure, so that form is shown here -> 'should work' source https://learn.microsoft.com/en-us/office/client-developer/outlook/pia/how-to-filter-and-efficiently-enumerate-items-in-a-folder):
const string PR_HAS_ATTACH = "http://schemas.microsoft.com/mapi/proptag/0x0E1B000B";
var itemsWithAttachment = ExampleFolder.Items.Restrict("@SQL=\"" + PR_HAS_ATTACH + "\" = 1");
To filter by a date, just add "AND"s or "OR"s like this (Urn way):
var itemsWithAttachmentAndDate = ExampleFolder.Items.Restrict("@SQL= urn:schemas:httpmail:hasattachment = True"
+ " AND urn:schemas:httpmail:datereceived <= '" + DateTime.Now.AddMonths(-3) + "'");
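To loop through the last n items only:
int n = 3;
int count = itemsWithAttachmentAndDate.Count;
// Outlook collections are 1-based, so the last item is at index count
for (int i = count; i > Math.Max(count - n, 0); i--)
{
//current item: itemsWithAttachmentAndDate[i] //Beware: "dynamic" typed!
; //DoSomething();
}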
