Number of restaurants with specific cuisine in each country - google-api

I am trying to figure out how many restaurants of a specific cuisine (seafood) there are in each country. I have looked at the Google Places API and the TripAdvisor API, but cannot find these numbers. I don't need the list of restaurants, only the number of restaurants. I found OpenStreetMap, which looked very promising. I downloaded the data for Norway, but the numbers are not correct: osmium tags-filter norway-latest.osm.pbf cuisine=seafood gives 62, which is way too low.
Any suggestion for how and where I can find what I am looking for?

Extrapolate.
You won't get an accurate answer. How do you even define what a seafood restaurant is?
Find out roughly how many restaurants there are in the area you are interested in and then decide what % of them might be seafood restaurants.

You can use this approach to extract the data from OpenStreetMap:
https://gis.stackexchange.com/questions/363474/aggregate-number-of-features-by-country-in-overpass
You can run the query on http://overpass-turbo.eu/ (go to settings and choose the kumi-systems server).
The query could look like this:
// Define fields for csv output
[out:csv(name, total)][timeout:2500];
// All countries
area["admin_level"=2];
// Count in each area
foreach->.regio(
  // Collect all nodes, ways and relations with cuisine=seafood in the current area
  (
    node(area.regio)[cuisine=seafood];
    way(area.regio)[cuisine=seafood];
    rel(area.regio)[cuisine=seafood];
  );
  // Assemble the output
  make count name = regio.set(t["name:en"]),
             total = count(nodes) + count(ways) + count(relations);
  out;
);
This query can take a long time (at the time of writing, mine had not yet finished).
You can also run the query via curl on a server and have the results mailed to you: curl ....... | mail -s "Overpass Result" yourmail@example.com. You can get the curl command from the browser's network tab using "Copy as cURL".
I also considered Taginfo (https://taginfo.openstreetmap.org/tags/cuisine=seafood), but it cannot break the count for a tag down by country.
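As an alternative to one long-running query over all countries, the counting can be sketched as one small request per country from Python. This is only a sketch under assumptions: the endpoint URL, the ISO3166-1 area filter, and the JSON layout of an "out count" response reflect my understanding of the Overpass API and should be checked against its documentation. The names build_query, parse_total and count_for_country are hypothetical helpers, not part of any library.

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint; other Overpass servers expose the same interface
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_query(iso_code, tag="cuisine=seafood"):
    """Build an Overpass QL query that counts tagged objects in one country."""
    return (
        "[out:json][timeout:600];\n"
        f'area["ISO3166-1"="{iso_code}"]["admin_level"="2"]->.country;\n'
        f"nwr(area.country)[{tag}];\n"
        "out count;"
    )


def parse_total(response_text):
    """Read the total from an 'out count' response (assumed JSON layout)."""
    elements = json.loads(response_text)["elements"]
    return int(elements[0]["tags"]["total"])


def count_for_country(iso_code):
    """Send the query to the server and return the count (needs network)."""
    data = urllib.parse.urlencode({"data": build_query(iso_code)}).encode()
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=700) as resp:
        return parse_total(resp.read().decode("utf-8"))
```

Calling count_for_country("NO") would then send the query for Norway. Looping over a list of ISO codes with a pause between requests keeps the load on the public server low.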


Is it possible to pair the affiliation history with a year in which a researcher served in given institution, as it appears in the scopus website?

I used au.affiliation_history to get the affiliation history for a list of author IDs. It worked great, but now I am trying to pair each entry in the affiliation history with the years in which the researcher served at that institution.
However, I cannot find a way to do this. Is it possible? If so, can you please give me a hint or an idea of how I can achieve this?
Unfortunately, the information Scopus shares on an author profile on scopus.com is not the same they share via the Author Retrieval API. I think the only way to get to yearly affiliations is to extract them from the publications that you get from the Scopus Search API.
from collections import defaultdict

from pybliometrics.scopus import ScopusSearch

AUTHOR = "7004212771"
q = f"AU-ID({AUTHOR})"
s = ScopusSearch(q)
# Map each publication year to the affiliation IDs reported in that year
yearly_affs = defaultdict(list)
for pub in s.results:
    year = pub.coverDate[:4]
    # Position of the author in the publication's author list
    auth_idx = pub.author_ids.split(";").index(AUTHOR)
    # Multiple affiliations of one author are joined with "-"
    affs = pub.author_afids.split(";")[auth_idx].split("-")
    yearly_affs[year].extend(affs)
yearly_affs then contains, for each year, a list of all affiliations recorded in that year's publications.
Naturally, the list will contain duplicates. If you don't like that, use set() and update() instead.
The .split("-") part for affs handles multiple affiliations (when the researcher reports more than one affiliation on a paper). You might want to use only the first reported affiliation instead. In that case, add [0] at the end and use append() in the next line.
Also, there will likely be gaps. I recommend turning yearly_affs into a pandas DataFrame, selecting the main affiliation for each year, and then filling gaps forward or backward.
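The gap-filling step can be sketched as follows. This is a plain-Python stand-in for the pandas approach, assuming that "main affiliation" means the most frequently reported affiliation in a year and that gaps are filled forward only. main_affiliation_by_year is a hypothetical helper, and the affiliation IDs in the example are made up.

```python
from collections import Counter


def main_affiliation_by_year(yearly_affs):
    """Return {year: affiliation} with one affiliation per year, gaps filled forward."""
    # Most frequently reported affiliation for each year with publications
    main = {int(year): Counter(affs).most_common(1)[0][0]
            for year, affs in yearly_affs.items() if affs}
    if not main:
        return {}
    filled = {}
    last_seen = None
    for year in range(min(main), max(main) + 1):
        # Carry the previous affiliation into years without publications
        last_seen = main.get(year, last_seen)
        filled[year] = last_seen
    return filled


# Hypothetical affiliation IDs, for illustration only
example = {"2010": ["60000001", "60000001", "60000002"], "2013": ["60000002"]}
filled = main_affiliation_by_year(example)
```

Here the years 2011 and 2012 have no publications, so they inherit the 2010 affiliation. Whether forward or backward filling is more plausible depends on when in a year a move typically shows up in publications.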

Search Console API: Impressions don't add up comparing totals to contains / not contains keywords

We are using the Search Console (webmaster tools) API to download search performance results for our site to compare search performance on people searching using our company name vs non company name searches. We have found a problem where the impressions don't add up when comparing "all search results" to "search results via specific keywords".
For example, if we do a report to show all web results for all devices for our site on a specific date, we get 189,491 impressions. If we then report to show results with the keyword "Our Name" we get 61,046. If we report on "OurName" (same keyword but without spaces) we get 1,086. If we then report not contains "Our Name" and not contains "OurName" we get 65,827, which adds up to 127,959, meaning somewhere we have 61,532 impressions missing.
Interestingly, if we change the filter on not contains to also include device equals DESKTOP, it increases to 65,997, yet I would have expected this to be equal to or less than all device impressions.
From the data we have this seemed to have stopped working on the 27th November 2015 (before this the 3 figures always added up to the total, on this date and afterwards they don't). The impressions add up fine if we only do one contains and one not contains. Clicks always seem to add up correctly, so I'm wondering if one of these queries is excluding data with zero clicks?
We are using the .NET library to access the Search Console data, but we get the same results when using the API Explorer. It is hard to replicate using the Search Console itself, as it doesn't allow you to include multiple "not contains" keywords. The total figures and the contains "our name" / "ourname" figures match between the API and the Search Console.
I've found a few other posts on here where people are having similar problems, but they are dated over a year ago, and we've only just noticed the problem in the last 3 weeks, so I don't know if this is a new problem.
The query for the not contains is as follows:
POST https://www.googleapis.com/webmasters/v3/sites/{YOUR_SITE_URL}/searchAnalytics/query?fields=rows&key={YOUR_API_KEY}
{
"startDate": "2015-12-07",
"endDate": "2015-12-07",
"searchType": "web",
"dimensionFilterGroups": [
{
"filters": [
{
"dimension": "query",
"expression": "our name",
"operator": "notContains"
},
{
"dimension": "query",
"expression": "ourname",
"operator": "notContains"
}
]
}
]
}
Many thanks in advance for any help
cross posted from Google Search Console Forum
From the API reference, there is no OR operation available for multiple filter expressions:
"Whether all filters in this group must return true ("and"), or one or more must return true (not yet supported)."
BOTH filters must be passed to get into the total.
Does not contain "our name" AND Does not contain "ourname".
https://developers.google.com/webmaster-tools/v3/searchanalytics/query
Having said that, you probably are even more at a loss to explain some of your results...maybe you have a number of queries that contain both "our name" AND "ourname"??
I'm working on the same topic at the moment (excluding brand searches). As Google says, they exclude search queries that may contain private information:
To protect user privacy, Search Analytics doesn't show all data. For example, we might not track some queries that are made a very small number of times or those that contain personal or sensitive information.
https://support.google.com/webmasters/answer/6155685?hl=en#tablegone
With this in mind, you have a big block of data with no query information, so if you filter by query in any way, that whole block isn't included.
For example, we had about 325.000 total impressions on 01.07., but if I run two separate queries, one including and one excluding a keyword, and add the clicks and impressions together, I only get the totals for the block of data that carries query information.
In our case that is around 180.000 impressions, so 145k impressions came from queries I can't see and can't filter.
In your case, the 127,959 could be your total impressions with query information (depending on your keywords). So your non-brand traffic with 65,827 impressions is more like 50% than 30%.
I hope it's more or less understandable.
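The arithmetic behind this can be worked through with the figures quoted in the question. The sketch assumes the explanation above is right, that is, that the gap consists of impressions without query information. unattributed_impressions is a hypothetical helper name.

```python
def unattributed_impressions(total, filtered_parts):
    """Impressions in the unfiltered total that appear in no filtered report."""
    return total - sum(filtered_parts)


# Figures quoted in the question
total = 189_491
brand = 61_046 + 1_086        # contains "our name" + contains "ourname"
non_brand = 65_827            # notContains both spellings
visible = brand + non_brand   # impressions that carry query information

missing = unattributed_impressions(total, [brand, non_brand])
share_of_visible = non_brand / visible
share_of_total = non_brand / total

print(missing)                         # 61532
print(round(share_of_visible * 100))   # 51
print(round(share_of_total * 100))     # 35
```

Measured against the 127,959 impressions that carry query information, non-brand traffic is about 51%. Measured against the unfiltered total, it is about 35%.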

How to get google places api to return nearby hotels

I'm trying to put together a Google Places call to do a nearby search just for hotels. I tried using the types parameter, but the closest type to hotel I found was lodging, which produced no results. So then I tried using the name parameter and got the same zero results. Am I missing something, or is Places not meant to perform a search of this type? I increased the radius to a number at which I'm certain there are other hotels (2 should have been fine).
//maps.googleapis.com/maps/api/place/nearbysearch/json?location=32.800870,-96.830803&radius=25&name=Marriot%20Sheraton%20W&key=
Result from call:
{
html_attributions: [ ],
results: [ ],
status: "ZERO_RESULTS"
}
Thanks!
The radius of 25m appears to be too small and the name is not a subset of any place names in the area. Were you trying to get any places with a name of Marriot, Sheraton, or W? I don't believe the implementation works that way and instead it looks for a place with "Marriot Sheraton W" in the name.
If you change radius to 400 and name to Sheraton, then you do get the nearby "Sheraton Suites Market Center Dallas".
https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=32.800870,-96.830803&radius=400&name=Sheraton&key=[key]
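If the goal was to find any of several hotel brands, one option is to issue one request per brand, since the name appears to be matched as a whole. A minimal sketch of building those URLs follows, using only the parameters from the call above. nearby_search_url and urls_for_brands are hypothetical helper names, and YOUR_API_KEY is a placeholder.

```python
from urllib.parse import urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"


def nearby_search_url(lat, lng, radius_m, name, api_key):
    """Build a Nearby Search URL for places whose name matches `name`."""
    params = {
        "location": f"{lat},{lng}",
        "radius": radius_m,  # metres
        "name": name,
        "key": api_key,
    }
    # Keep the comma in "lat,lng" readable instead of encoding it as %2C
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


def urls_for_brands(lat, lng, radius_m, brands, api_key):
    """One request per brand, because a single name is matched as a whole."""
    return [nearby_search_url(lat, lng, radius_m, b, api_key) for b in brands]


urls = urls_for_brands("32.800870", "-96.830803", 400,
                       ["Marriott", "Sheraton", "W"], "YOUR_API_KEY")
```

The results of the individual requests would then need to be merged and de-duplicated on the client side.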

Yahoo Pipes: Extracting number from feed item for use in URL builder

Been looking all over the place for a solution to this issue. I have a Yahoo Pipe (http://pipes.yahoo.com/pipes/pipe.info?_id=e5420863cfa494ee40e4c9be43f0e812) that I've created to pull back image content from the Bing Search API. The URL builder includes a $skip attribute that takes an integer and uses it to select the starting (index) point for the result set that the query returns.
My initial plan had been to use the math engine in the Wolfram Alpha API to generate a random number (randomInteger[1000]) that I could use to seed the $skip value each time the pipe is run. I have an earlier version of the pipe where I was able to get the query / result steps working using either "XPath Fetch" or "Fetch Data". However, regardless of how I fetch the result, the response comes back as an attribute / value pair in a list item. Even when I use "Emit items as string" in XPath Fetch, I still get a list with a single item, when what I really want is the integer that I can plug into my $skip attribute.
I've tried everything in Pipes I can think of, and spent a lot of time online looking for an answer. Is there any way to extract text (in this case, a number) from a single list item and then use the output as input to "wire" a text parameter in another Pipes block? Any suggestions / ideas welcome. In the meantime, I'm generating a sorta-random number by manipulating a timecode hash, but it just feels tacky :-)
Thanks!
All the sources are for repeated items. You can't have a source that just makes a single number.
I'm not really clear what you're trying to do. You want to put a random number into part of the URL string that gets an RSS feed?

How to get the 'current observation data' from the NDFD (NOAA, NWS) REST service?

I'm trying to use the NDFD (National Digital Forecast Database) to get current temperature and relative humidity given a Lat and Long using their REST based service.
The issue at hand:
I can't match the 'current observation data' WITH the 'results' I get back from the REST-service.
The setup:
Location:
* Apple (1 Infinite Loop, Cupertino, California)
* Lat = 37.33; Lon = -122.03
If I issue the following REST-call:
http://www.weather.gov/forecasts/xml/sample_products/browser_interface/ndfdXMLclient.php?lat=37.33&lon=-122.03&product=time-series&begin=2009-06-21T17:12:35&end=2009-06-21T17:12:35&appt=appt&rh=rh&temp_r=temp_r&temp=temp
Note 1: I'm passing the begin and end time in UTC. They're the same because I'm
looking for just a single-point-in-time: the latest observed
temp and relative humidity.
AND then compare it to the closest reporting station (San Jose International Airport, CA - KSJC - 37.37N 121.93W) at http://www.weather.gov/xml/current_obs/KSJC.xml
** I can never get them to MATCH. **
Note 2: The nearest reporting station is returned by the REST call
as well, so I know I'm comparing Location apples to Location apples.
I've had two ideas:
1: I'm doing something wrong with how I'm passing in the begin/end times into the REST call...
2: You can't get 'current observed data' the way I'm trying to...
Lastly:
I've found a solution using outoftime's NOAA ruby lib , [it parses an observation stations YAML file to find the nearest one given Lat/Lng then goes directly to that station via its identifier i.e. http://www.weather.gov/xml/current_obs/KSJC.xml].... but it just feels like I may be missing something obvious here and would like to use the REST-based interface ;)
Any help or pointers would be appreciated!
Thanks!
It looks like the service you are calling isn't for current data. Judging by the URL and the XML results it seems to be for forecasts. You can also put in future dates to get future forecast data. It expects the dates to be in the -0700 time zone according to the response. I'm not sure which service you should be calling to get the data you want though.
I know that this is an old question, but this is what I'm using to get current weather conditions: http://forecast.weather.gov/MapClick.php?lat=43.09110&lon=-79.0162&unit=0&lg=english&FcstType=dwml
Found this API/link yesterday. It's still developmental (operation-mode="developmental"):
http://forecast.weather.gov/MapClick.php?lat=37.33&lon=-122.03&FcstType=dwml
If you want the "current" observation, you use the XML here:
http://w1.weather.gov/xml/current_obs/seek.php?state=or&Find=Find
e.g.,:
http://w1.weather.gov/xml/current_obs/KAST.xml
If you click on the link, you'll get a rendered page. However, if you fetch it with a normal HTTP client or just wget, it delivers an XML file.
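Reading temperature and relative humidity from such a station file can be sketched as below. The element names station_id, temp_f and relative_humidity are assumptions about the current_obs XML layout and should be checked against an actual file. parse_current_obs and fetch_current_obs are hypothetical helper names.

```python
import urllib.request
import xml.etree.ElementTree as ET


def _to_float(text):
    """Convert element text to float, or None if the element is missing or empty."""
    return float(text) if text not in (None, "") else None


def parse_current_obs(xml_text):
    """Extract temperature and relative humidity from a current_obs document."""
    root = ET.fromstring(xml_text)
    return {
        "station_id": root.findtext("station_id"),
        "temp_f": _to_float(root.findtext("temp_f")),
        "relative_humidity": _to_float(root.findtext("relative_humidity")),
    }


def fetch_current_obs(station_id):
    """Download and parse the observation file for one station (needs network)."""
    url = f"http://w1.weather.gov/xml/current_obs/{station_id}.xml"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_current_obs(resp.read())
```

For the station from the question, fetch_current_obs("KSJC") would request the KSJC file. Fields that are absent from the file come back as None instead of raising an error.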
