Google Spreadsheet API - returns remote 500 error

Has anyone battled 500 errors with the Google Spreadsheet API for Google Apps domains?
I have copied the code in this post (2-legged OAuth): http://code.google.com/p/google-gdata/source/browse/trunk/clients/cs/samples/OAuth/Program.cs, substituted in my domain's API id and secret and my own credentials, and it works.
So it appears my domain setup is fine (at least for the Contacts/Calendar APIs).
However, swapping the code out for a new Spreadsheet service/query instead, it reverts to type: the remote server returned an internal server error (500).
var ssq = new SpreadsheetQuery();
ssq.Uri = new OAuthUri("https://spreadsheets.google.com/feeds/spreadsheets/private/full", "me", "mydomain.com");
ssq.OAuthRequestorId = "me@mydomain.com"; // can do this instead of using OAuthUri for queries
var feed = ssservice.Query(ssq); //boom 500
Console.WriteLine("ss:" + feed.Entries.Count);
I are befuddled

I had to make sure to use the "correct" class:
not
//using SpreadsheetQuery = Google.GData.Spreadsheets.SpreadsheetQuery;
but
using SpreadsheetQuery = Google.GData.Documents.SpreadsheetQuery;
stinky-malinky
It seems you need the Google Docs API to query for spreadsheets, but the Spreadsheets API to query inside a spreadsheet. Nowhere on the internet until now will you find this undeniably important titbit. Google sucks hard on that one.

Related

Google sheets IMPORTXML fails for ASX data

I am trying to extract the "Forward Dividend & Yield" value from https://finance.yahoo.com/ for multiple companies in different markets, into Google Sheets.
This is successful:
=IMPORTXML("https://finance.yahoo.com/quote/WBS", "//*[#id='quote-summary']/div[2]/table/tbody/tr[6]/td[2]")
But this fails with #N/A:
=IMPORTXML("https://finance.yahoo.com/quote/CBA.AX", "//*[#id='quote-summary']/div[2]/table/tbody/tr[6]/td[2]")
I cannot work out what needs to be different for ASX ticker codes; why does CBA.AX cause a problem?
Huge thanks for any help
When I tested the formula =IMPORTXML("https://finance.yahoo.com/quote/CBA.AX", "//*"), it returned the error Resource at url not found. I thought this might be the cause of your issue.
Fortunately, when I tried to retrieve the HTML from the same URL using Google Apps Script, it could be retrieved. So in this answer I propose retrieving the value with a custom function written in Google Apps Script. The sample script is as follows.
Sample script:
Copy and paste the following script into the script editor of your Google Spreadsheet and save it. Then put the formula =SAMPLE("https://finance.yahoo.com/quote/CBA.AX") in a cell, and the value will be retrieved.
function SAMPLE(url) {
  const res = UrlFetchApp.fetch(url).getContentText().match(/DIVIDEND_AND_YIELD-value.+?>(.+?)</);
  return res && res.length > 1 ? res[1] : "No value";
}
Result:
When the above script is used, the value is retrieved into the cell.
Note:
You can also use =SAMPLE("https://finance.yahoo.com/quote/WBS") with this script.
If the HTML structure of the page changes, the script may stop working; the same applies to IMPORTXML and its XPath. Please be careful about this.
References:
Custom Functions in Google Sheets
Class UrlFetchApp
Another solution is to decode the JSON contained in the source of the web page. Of course you can't use IMPORTXML, since the page is built on your side by JavaScript and not on the server's side. You can access the data this way and get a lot of information:
var source = UrlFetchApp.fetch(url).getContentText()
var jsonString = source.match(/(?<=root.App.main = ).*(?=}}}})/g) + '}}}}'
For example, for the value you are looking for, you can use:
function trailingAnnualDividendRate() {
  var url = 'https://finance.yahoo.com/quote/CBA.AX';
  var source = UrlFetchApp.fetch(url).getContentText();
  var jsonString = source.match(/(?<=root.App.main = ).*(?=}}}})/g) + '}}}}';
  var data = JSON.parse(jsonString);
  var dividendRate = data.context.dispatcher.stores.QuoteSummaryStore.summaryDetail.trailingAnnualDividendRate.raw;
  Logger.log(dividendRate);
}
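If you want to call this second approach directly from a cell, like the SAMPLE() function in the previous answer, here is a minimal sketch of a wrapper custom function (the name DIVIDEND is my own choice, not part of the original answer):
// Sketch only: exposes the JSON-decoding approach above as a custom function,
// usable in a cell as =DIVIDEND("https://finance.yahoo.com/quote/CBA.AX")
function DIVIDEND(url) {
  var source = UrlFetchApp.fetch(url).getContentText();
  var jsonString = source.match(/(?<=root.App.main = ).*(?=}}}})/g) + '}}}}';
  var data = JSON.parse(jsonString);
  return data.context.dispatcher.stores.QuoteSummaryStore.summaryDetail.trailingAnnualDividendRate.raw;
}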

How can I get ALL records from Route53?

I'm referring to the code snippet here, which seemed to work for someone but isn't clear to me: https://github.com/aws/aws-sdk-ruby/issues/620
I'm trying to get all of them (I have about 7,000 records) via resource record sets, but I can't seem to get the pagination to work with list_resource_record_sets. Here's what I have:
route53 = Aws::Route53::Client.new
response = route53.list_resource_record_sets({
  start_record_name: fqdn(name),
  start_record_type: type,
  max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
})
response.last_page?
response = response.next_page until response.last_page?
I verified I'm hooked into the right region, and I can see the record I'm trying to get (so I can delete it later) in the AWS console, but I can't seem to get it through the API.
Any ideas on what I'm doing wrong? Or is there an easier way, perhaps another method in the API I'm not finding, for me to get just the record I need given the hosted_zone_id, type and name?
The issue you linked is for the Ruby AWS SDK v2, but the latest is v3. It also looks like things have changed a bit since 2014, as I'm not seeing the #next_page or #last_page? methods in either the v2 API or the v3 API.
Consider using #next_record_name and #next_record_type from the response when #is_truncated is true. That's more consistent with how other paginations work in the Ruby AWS SDK, such as DynamoDB scans.
Something like the following should work (though I don't have an AWS account with records to test it out):
route53 = Aws::Route53::Client.new
hosted_zone = ? # Required field according to the API docs
next_name = fqdn(name)
next_type = type
loop do
  response = route53.list_resource_record_sets(
    hosted_zone_id: hosted_zone,
    start_record_name: next_name,
    start_record_type: next_type,
    max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
  )
  records = response.resource_record_sets
  # Break here if you find the record you want
  # Also break if we've run out of pages
  break unless response.is_truncated
  next_name = response.next_record_name
  next_type = response.next_record_type
end

Credentials for Google Knowledge Graph

I am trying to use the Google Knowledge Graph API. I already have an API key, and I am using the client library rather than the RESTful API.
kgSearch = Kgsearch::KgsearchService.new
response = kgSearch.search_entities(query: query)
I tried to instantiate the service as below:
kgSearch = Kgsearch::KgsearchService.new(api: 'klfkdlfkdlm')
but it's rejected because the initializer expects no arguments.
Any idea how to add the API key?
I also tried:
response = kgSearch.search_entities(query: query, api: 'fjfkjfl')
with the same result.
Any ideas?
According to the Ruby Docs for the Google Api Client, key is an instance method where you can assign your api key (http://www.rubydoc.info/github/google/google-api-ruby-client/Google/Apis/KgsearchV1/KgsearchService#key-instance_method).
So I believe you'd do something like the following:
kgSearch = Kgsearch::KgsearchService.new
kgSearch.key = 'your_key_here'
response = kgSearch.search_entities(query: query) # and any other options that are necessary

Direct (and simple!) AJAX upload to AWS S3 from (AngularJS) Single Page App

I know there's been a lot of coverage on upload to AWS S3. However, I've been struggling with this for about 24 hours now and I have not found any answer that fits my situation.
What I'm trying to do
Upload a file to AWS S3 directly from my client to my S3 bucket. The situation is:
It's a Single Page App, so upload request must be in AJAX
My server and my client are not on the same domain
The S3 bucket is of the newest sort (Frankfurt), for which some signature-generating libraries don't work (see below)
Client is in AngularJS
Server is in ExpressJS
What I've tried
Heroku's article on direct upload to S3. Doesn't fit my client/server configuration (plus it really does not fit harmoniously with Angular)
Ready-made directives like ng-s3upload. These don't work because their signature-generating algorithm is not accepted by recent S3 buckets.
Manually creating a file upload directive and logic on the client like in this article (using FormData and Angular's $http). It consisted of getting a signed URL from AWS on the server (and that part worked), then AJAX-uploading to that URL. It failed with some mysterious CORS-related message (although I did set a CORS config on Heroku)
It seems I'm facing 2 difficulties: having a file input that works in my Single Page App, and getting AWS's workflow right.
The kind of solution I'm looking for
If possible, I'd like to avoid 'all included' solutions that manage the whole process while hiding all of the complexity, making them hard to adapt to special cases. I'd much rather have a simple explanation breaking down the flow of data between the various components involved, even if it requires some more plumbing from me.
I finally managed. The key points were:
Let go of Angular's $http, and use native XMLHttpRequest instead.
Use the getSignedUrl feature of AWS's SDK, instead of implementing my own signature-generating workflow like many libraries do.
Set the AWS configuration to use the proper signature version (v4 at the time of writing) and region ('eu-central-1' in the case of Frankfurt).
Below is a step-by-step guide of what I did; it uses AngularJS and NodeJS on the server, but should be rather easy to adapt to other stacks, especially because it deals with the most pathological cases (an SPA on a different domain than the server, with a bucket in a recent - at the time of writing - region).
Workflow summary
The user selects a file in the browser; your JavaScript keeps a reference to it.
The client sends a request to your server to obtain a signed upload URL.
Your server chooses a name for the object to put in the bucket (make sure to avoid name collisions!).
The server obtains a signed URL for your object using the AWS SDK, and sends it back to the client. This involves the object's name and the AWS credentials.
Given the file and the signed URL, the client sends a PUT request directly to your S3 Bucket.
Before you start
Make sure that:
Your server has the AWS SDK
Your server has AWS credentials with proper access rights to your bucket
Your S3 bucket has a proper CORS configuration for your client (a sketch follows this list).
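For reference, here is a minimal sketch of the kind of CORS rule I mean, applied with the Node SDK; the bucket name and allowed origin are placeholders, and you can just as well paste the equivalent rule in the S3 console instead.
// Sketch only: allow the browser at your SPA's origin to send the cross-origin PUT.
// 'MY_BUCKET_NAME' and the origin below are placeholders - adjust to your setup.
var aws = require('aws-sdk');
var s3 = new aws.S3();
s3.putBucketCors({
  Bucket: 'MY_BUCKET_NAME',
  CORSConfiguration: {
    CORSRules: [{
      AllowedOrigins: ['https://my-spa.example.com'],
      AllowedMethods: ['PUT', 'GET'],
      AllowedHeaders: ['*'],
      MaxAgeSeconds: 3000
    }]
  }
}, function (err, data) {
  if (err) console.error(err);
  else console.log('CORS configuration applied');
});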
Step 1: set up a SPA-friendly file upload form/widget.
All that matters is to have a workflow that eventually gives you programmatic access to a File object - without uploading it.
In my case, I used the ng-file-select and ng-file-drop directives of the excellent angular-file-upload library. But there are other ways of doing it (see this post for example.).
Note that you can access useful information in your file object such as file.name, file.type etc.
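If you'd rather not pull in a library for this step, here is a minimal sketch of getting hold of the File object from a plain file input without uploading it (the element id is my own placeholder):
// Sketch only: keep a reference to the selected File; nothing is uploaded yet.
var selectedFile = null;
document.getElementById('my-file-input').addEventListener('change', function (event) {
  selectedFile = event.target.files[0]; // a File object, with .name, .type, .size, ...
  console.log('selected', selectedFile.name, selectedFile.type);
});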
Step 2: Get a signed URL for the file on your server
On your server, you can use the AWS SDK to obtain a secure, temporary URL to PUT your file from someplace else (like your frontend).
In NodeJS, I did it this way:
// ---------------------------------
// some initial configuration
var aws = require('aws-sdk');
aws.config.update({
  accessKeyId: process.env.AWS_ACCESS_KEY,
  secretAccessKey: process.env.AWS_SECRET_KEY,
  signatureVersion: 'v4',
  region: 'eu-central-1'
});
// ---------------------------------
// now say you want to fetch a URL for an object named `objectName`
var s3 = new aws.S3();
var s3_params = {
  Bucket: MY_BUCKET_NAME,
  Key: objectName,
  Expires: 60,
  ACL: 'public-read'
};
s3.getSignedUrl('putObject', s3_params, function (err, signedUrl) {
  // send signedUrl back to client
  // [...]
});
You'll probably also want to know the URL from which to GET your object (typically if it's an image). To do this, I simply removed the query string from the signed URL:
var url = require('url');
// ...
var parsedUrl = url.parse(signedUrl);
parsedUrl.search = null;
var objectUrl = url.format(parsedUrl);
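One piece of glue not shown above: the client still has to ask your server for that signed URL before it can upload. A minimal sketch, assuming a hypothetical /signed-upload-url endpoint on your server that returns { "signedUrl": ... } as JSON (the endpoint name and response shape are my own, not part of the original answer):
// Sketch only: fetch the signed PUT URL for this file from your own server.
// Using Angular's $http here is fine - it's only the PUT to S3 that goes through XMLHttpRequest.
function getSignedUrl($http, file) {
  return $http.get('/signed-upload-url', {
    params: { name: file.name, type: file.type }
  }).then(function (response) {
    return response.data.signedUrl;
  });
}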
Step 3: send the PUT request from the client
Now that your client has your File object and the signed URL, it can send the PUT request to S3. My advice in Angular's case is to just use XMLHttpRequest instead of the $http service:
var signedUrl, file;
// ...
var d_completed = $q.defer(); // since I'm working with Angular, I use $q for asynchronous control flow, but it's not mandatory
var xhr = new XMLHttpRequest();
xhr.file = file; // not necessary if you create scopes like this
xhr.onreadystatechange = function (e) {
  if (4 == this.readyState) {
    // done uploading! HURRAY!
    d_completed.resolve(true);
  }
};
xhr.open('PUT', signedUrl, true);
xhr.setRequestHeader("Content-Type", "application/octet-stream");
xhr.send(file);
Acknowledgements
I would like to thank emil10001 and Will Webberley, whose publications were very valuable to me for this issue.
You can use the ng-file-upload $upload.http method in conjunction with the aws-sdk getSignedUrl to accomplish this. After you get the signedUrl back from your server, this is the client code:
var fileReader = new FileReader();
fileReader.readAsArrayBuffer(file);
fileReader.onload = function (e) {
  $upload.http({
    method: 'PUT',
    headers: {'Content-Type': file.type != '' ? file.type : 'application/octet-stream'},
    url: signedUrl,
    data: e.target.result
  }).progress(function (evt) {
    var progressPercentage = parseInt(100.0 * evt.loaded / evt.total);
    console.log('progress: ' + progressPercentage + '% ' + file.name);
  }).success(function (data, status, headers, config) {
    console.log('file ' + file.name + ' uploaded. Response: ' + data);
  });
};
To do multipart uploads, or those larger than 5 GB, this process gets a bit more complicated, as each part needs its own signature. Conveniently, there is a JS library for that:
https://github.com/TTLabs/EvaporateJS
via https://github.com/aws/aws-sdk-js/issues/468
Alternatively, use the open-source s3FileUpload directive, which has dynamic data binding and auto-callback functions: https://github.com/vinayvnvv/s3FileUpload

Logging Into Google To Scrape A Private Google Group (over HTTPS)

I'm trying to log into Google, so that I can scrape & migrate a private Google Group.
It doesn't seem to log in over SSL. Any ideas appreciated. I'm using Mechanize and the code is below:
group_signin_url = "https://login page to Google, with referrer url to a private group here"
user = ENV['GOOGLE_USER']
password = ENV['GOOGLE_PASSWORD']
scraper = Mechanize.new
scraper.user_agent = Mechanize::AGENT_ALIASES["Linux Firefox"]
scraper.agent.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
page = scraper.get group_signin_url
google_form = page.form
google_form.Email = user
google_form.Passwd = password
group_page = scraper.submit(google_form, google_form.buttons.first)
pp group_page
I worked with Ian (the OP) on this problem and just felt we should close this thread with some answers based on what we found when we spent some more time on the problem.
1) You can't scrape a Google Group with Mechanize. We managed to get logged in, but the content of the Google Group pages is all rendered in-browser, meaning that HTTP requests, such as those issued by Mechanize, are returned with a few links and no actual content.
We found that we could get page content by the use of Selenium (we used Selenium in Firefox, using the Ruby bindings).
2) The HTML element IDs/classes in Google Groups are obfuscated, but we found that these Selenium commands will pull out the bits you need (until Google changes them):
Message snippets (click on them to expand messages):
find_elements(:class, 'GFP-UI5CCLB')
Elements with the name of the author:
find_elements(:class, 'GFP-UI5CA1B')
Elements with the content of the post:
find_elements(:class, 'GFP-UI5CCKB')
Elements containing the date:
find_elements(:class, 'GFP-UI5CDKB') (and then use the attribute [:title] for a full-length date string)
3) I have some Ruby code here which scrapes the content programmatically and uploads it into a Discourse forum (which is what we were trying to migrate to).
It's hacky but it kind of works. I recently migrated two commercially important Google Groups using this script. I'm up for taking on 'We Scrape Your Google Group' type work; please PM me.
