Azure Face Identify more than 10 faces in a Person Group - limit

So, a straightforward question. My first on SO. I'm asking here because the Azure docs say to ask here.
I understand that the Face API can identify at most 10 faces in an API call. Is it possible to get this limit raised to, say, 50? Maybe through some specific pricing agreement?
Thanks and regards.

I think the best way to handle this is on your side: divide your 50-face image into 5 pieces, each containing at most 10 faces, and make an API call for each piece. Note that the paid tier is limited to 10 calls per second, so if you have more than 10 pieces you'll have to queue them and have the view load the results using the async/await pattern, or simply show a loading state for a few seconds until all the results are in.
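For illustration, a minimal Python sketch of that batching idea against the Face v1.0 REST Identify endpoint; the endpoint, key, and person group ID are placeholders, and the request shape should be checked against the current Face API docs:

```python
import time
import requests

ENDPOINT = "https://YOUR-RESOURCE.cognitiveservices.azure.com"  # placeholder
KEY = "YOUR_FACE_API_KEY"                                       # placeholder
PERSON_GROUP_ID = "my-person-group"                             # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "application/json"}

def identify_all(face_ids, batch_size=10, calls_per_second=10):
    """Identify an arbitrary number of detected faces by batching them
    into groups of at most 10, the per-call limit of the Identify API."""
    results = []
    for i in range(0, len(face_ids), batch_size):
        batch = face_ids[i:i + batch_size]
        resp = requests.post(
            f"{ENDPOINT}/face/v1.0/identify",
            headers=HEADERS,
            json={"faceIds": batch, "personGroupId": PERSON_GROUP_ID},
        )
        resp.raise_for_status()
        results.extend(resp.json())
        time.sleep(1.0 / calls_per_second)  # crude throttle to stay under the rate limit
    return results
```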

We have been using Cognitive Services for a while and haven't found a way to increase some of the hard limits, even after several discussions with representatives from MS (if you can find the right person they are very helpful, but it's hard to find the right person :). Similar to the 10-face-per-call limit, you also can't change the 24-hour face retention duration. At least we were not able to find a way.
Of course, this is only true for the Create Person Group / Detect / Identify scenario. If you instead identify against a face list, you don't have this limit, since you can save those faces in a face list and then query it, which searches the whole list whether it holds 10 or 100 faces.

Related

What are best known algorithms/techniques for updating edges on huge graph structures like social networks?

On social networks like Twitter, where millions follow a single account, it must be very challenging to update all followers instantly when a new tweet is posted. Similarly, on Facebook there are fan pages with millions of followers, and we see their updates instantly when they post on the page. I am wondering what the best known techniques and algorithms are to achieve this. I understand that with a billion accounts they have huge data centers across the globe, but even if we reduce the problem to a single computer, say 100,000 nodes with an average of 200 edges per node, every single node update would require 200 edge updates. So what are the best techniques/algorithms to optimize such large updates? Thanks!
The best way is usually just to do all the updates. You say they can be seen "instantly", but actually the updates probably propagate through the network and can take up to a few seconds to show up in followers' feeds.
Having to do all those updates may seem like a lot, but on average a follower will check for updates much more often than the person being followed will produce them, and checking for updates has to be much faster.
The choices are:
Update 1 million followers, a couple times a day, within a few seconds; or
Respond to checks from 1 million followers, a couple hundred times a day, within 1/10 second or so.
There are in-between strategies involving clustering users and stuff, but usage patterns like you see on Facebook and Twitter are probably so heavily biased toward option (1) that such strategies don't pay off.
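As a rough illustration of the trade-off described above (often called fan-out-on-write versus fan-out-on-read), here is a hedged Python sketch; the data structures are invented for the example and not taken from any real system:

```python
from collections import defaultdict, deque

followers = defaultdict(set)   # account -> set of follower ids
feeds = defaultdict(deque)     # follower id -> their precomputed feed (newest first)
posts = defaultdict(list)      # account -> posts they have made

def post_fanout_on_write(author, tweet):
    """Option (1): push the new post into every follower's feed at write time.
    Writes are expensive, but reading a feed is a cheap lookup."""
    posts[author].append(tweet)
    for follower in followers[author]:
        feeds[follower].appendleft((author, tweet))

def read_feed_fanout_on_read(user, following):
    """Option (2): build the feed lazily at read time by pulling from each
    followed account. Writes are cheap, but every read does the merging work."""
    merged = []
    for author in following:
        merged.extend((author, t) for t in posts[author][-10:])  # last few posts each
    return merged
```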

Gameanalytics, days since install/session number filter

In "look up metrics” I’m trying to know how my players improve in playing my game.
I have the score (both as desing event and progression, just to try) and in look up metrics I try to “filter” with session number or days since install but, even if I group by Dimension, this doesn’t produce any result.
For instance if I do the same but with device filter it shows me the right histogram with score's mean per device.
What am I doing wrong?
From customer care:
The session filter works only on core metrics at this point (like DAU). We hope to make this filter compatible with custom metrics as well but this might take time as we first need to include this improvement to our roadmap and then evaluate it by comparing it with our other tasks. As a result, there is no ETA on making a release.
I would recommend downloading the raw data (go to "Export data" in the settings of the game) and performing this sort of "per user" analysis on your own. You should be able to create stats per user. GA does not do this, since your game can reach millions of users and there's no way to plot that many entries in a browser.
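If it helps, a minimal sketch of that kind of per-user analysis on an exported raw-data file; the column names (user_id, session_num, score) are hypothetical and would need to be mapped to whatever the real export actually contains:

```python
import pandas as pd

# Hypothetical columns: user_id, session_num, score (adapt to the real export schema).
events = pd.read_csv("raw_export.csv")

# Mean score per session number, i.e. "do players improve as they play more sessions?"
progression = (
    events.groupby("session_num")["score"]
    .mean()
    .reset_index()
    .rename(columns={"score": "mean_score"})
)
print(progression.head(20))
```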

CEP (Apama) Increase performance for geofencing

I need to know if it is possible (and whether it would have any effect) to implement a B-tree within the CEP (single correlator). The problem we face is that we cannot handle more than 1000 messages per second. I think this is caused by the way the solution has been implemented.
We want to detect whether a position is within a zone and raise an event on entering, stopping, starting, and leaving a zone. We currently have around 500 zones and up to more than 1000 positions per second, and we want to increase the number of zones. Messages are now backing up. I think the solution would be to introduce a B-tree within the CEP: first detect whether a position is in a head zone, and then query whether the position is in the zones within that head zone. I think this could increase performance, but I'm not really sure whether it is possible or wise.
Has anyone had any experience?
Firstly, we're deprecating the community forum in favour of answering questions here, so you're in the right place.
Secondly, the answer to your question probably needs a bit more detail about what you're doing at the moment. How are you managing your geofencing at the moment? Apama has built-in support for matching locations against rectangular areas with the location type. Using that in a hypertree expression with a listener should be very fast.
To manage other shaped geofences we'd recommend starting by using the bounding box in a listener and then doing your specific geofence calculation on events that fall within the bounding box.
To answer your question about a hierarchical approach - if the above does not help enough, then you could start with coarse-grained bounding boxes in an ingestion context which then delegates to multiple secondary contexts with more detailed bounding boxes, again using the hypertree. These secondary contexts would be able to work in parallel.
On a single large machine we've managed to handle hundreds of thousands of location updates across thousands of geofences, although this will very much depend on what action you're taking when you get a match and what your match rate is.
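To make the two-stage idea concrete, here is a rough Python sketch of the general pattern (a cheap bounding-box prefilter followed by an exact point-in-polygon test); it is not Apama EPL, and the geometry helpers are simplified for illustration:

```python
from dataclasses import dataclass

@dataclass
class Zone:
    name: str
    polygon: list   # list of (x, y) vertices
    bbox: tuple     # (min_x, min_y, max_x, max_y), precomputed

def in_bbox(x, y, bbox):
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y

def in_polygon(x, y, polygon):
    """Standard ray-casting point-in-polygon test."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def matching_zones(x, y, zones):
    # Cheap rectangular test first (what the hypertree gives you in Apama),
    # then the exact geofence check only for the few candidates that pass.
    return [z.name for z in zones
            if in_bbox(x, y, z.bbox) and in_polygon(x, y, z.polygon)]
```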
HTH,
Matt

What is the proper way to use the radius parameter in the Google Places API?

I am using the Google Places API to retrieve all the POIs (Places of Interest) around the current location. It works OK, but I have noticed that whatever the value of the radius is, I always get the same number of results (~20). As a result, if I give a radius that is too big, I don't necessarily get the nearest POIs. If I reduce the radius enough, I will retrieve those nearest places again (from experimentation, 100 meters seems to be a proper value), but that means I will not get any POIs beyond 100 meters, which is not quite what I want.
My question is: is there any way by which I can get all the POIs (with no limitations) within a certain radius?
Thank you!
The Google Places API always returns 20 results by design, selecting the 20 results that best fit the criteria you define in your request. The Developer's Guide / Docs don't explicitly cite that number anywhere that I have seen. I learned about the limit watching the Autocomplete Demo & Places API Demo & Discussion Video, given by Paul Saxman, a Developer Advocate at Google, and Marcelo Camelo, Google's Technical Lead for the Places API.
The entire video is worth watching, but more specific to your question, if you set the playback timer at about 11:50, Marcelo Camelo is contrasting the Autocomplete tool versus the general Places API, and that's the portion of the video where he mentions the 20 result limit. He mentions 20 as the standard result count several times.
There are many other good Places API and Google Maps videos linked to that area on YouTube as well.
As mentioned on the Google Places Issue Tracker here: http://code.google.com/p/gmaps-api-issues/issues/detail?id=3425
We are restricted by our data provider licenses to enable apps to display no more than 20 places results at a time. Consequently we are not able to increase this limit at this time.
It does sound, however, like you are trying to return the results that are closest to a specified location; this is now possible by using the 'rankby=distance' parameter instead of 'radius' in your request.
e.g.
https://maps.googleapis.com/maps/api/place/search/json?location=-33.8670522,151.1957362&rankby=distance&types=food&name=harbour&sensor=false&key=YOUR_API_KEY
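For example, a small Python sketch of the same Nearby Search request using rankby=distance; the key is a placeholder and the parameters simply mirror the example URL above:

```python
import requests

# Placeholder key; parameters mirror the example request above.
params = {
    "location": "-33.8670522,151.1957362",
    "rankby": "distance",          # instead of "radius"
    "types": "food",
    "name": "harbour",
    "sensor": "false",
    "key": "YOUR_API_KEY",
}
resp = requests.get(
    "https://maps.googleapis.com/maps/api/place/search/json", params=params
)
for place in resp.json().get("results", []):
    print(place.get("name"), place.get("vicinity"))
```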
Try google.maps.places.RankBy.DISTANCE, as the default is google.maps.places.RankBy.PROMINENCE.
An easy example of this is shown here (Chrome only).

An easy way with Twitter API to get the list of Followings for a very large list of users?

I have about 200,000 Twitter followers across a few Twitter accounts. I am trying to find the Twitter accounts that a large proportion of my followers are following.
Having looked over the Search API, I think this is going to be very slow, unless I am missing something.
That's 40 calls using GET followers/ids to get the list of 200,000 accounts. Then all I can think of is making 200,000 calls to GET friends/ids. But at the current rate limit of 150 calls/hour, that would take 55 days. Even if I could get Twitter to raise my limit slightly, this is still going to be slow going. Any ideas?
The short answer to your question is no, there is indeed no quick way to do this. Furthermore, API v1.0 is being deprecated sometime in March, with v1.1 becoming the law of the land, which makes things worse (more on this in a moment).
As I understand it, what you want to do is compile a list of followed accounts for each of the initial 200,000 follower accounts. You then want to count each one of these 200,000 original accounts as a "voter", and then the total set of accounts followed by any of these 200,000 as "candidates". Ultimately, you want to be able to rank this list of candidates by "votes" from the list of 200,000.
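As a minimal sketch of that tallying step (the fetch_friend_ids helper is hypothetical and would wrap whatever GET friends/ids client you use, subject to the rate limits discussed below):

```python
from collections import Counter

def rank_candidates(follower_ids, fetch_friend_ids):
    """Count, for every account followed by any of your followers,
    how many of your followers ("voters") follow it."""
    votes = Counter()
    for follower in follower_ids:
        # fetch_friend_ids is a hypothetical wrapper around GET friends/ids,
        # which must itself respect Twitter's per-hour rate limits.
        for candidate in fetch_friend_ids(follower):
            votes[candidate] += 1
    return votes.most_common(50)   # top 50 most-followed candidate accounts
```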
A few things:
1.) I believe you're actually referencing the REST API, not the Search API.
2.) Based on what you've said about getting 150 requests per hour, I can infer that you're making unauthenticated requests to the API endpoints in question. That limits you to only 150 calls per hour. As a short-term fix (i.e., in the next few weeks, prior to v1.0 being retired), you could make authenticated requests instead, which will boost your hourly rate limit to 350 (source: Twitter API Documentation). That alone would more than double your calls per hour.
3.) If this is something you expect to need to do on an ongoing basis, things get much worse. Once API 1.0 is no longer available, you'll be subject to the v1.1 API limits, which a.) require authentication, no matter what, and b.) are limited per API method/endpoint. For GET friends/ids and GET followers/ids in particular, you will only be able to make 15 calls per 15 minutes, or 60 per hour. That means the sort of analysis you want to do will basically become unfeasible (unless you were to skirt the Twitter API terms of service by using multiple apps/IP addresses, etc.). You can read all about this here. Suffice it to say, researchers and developers that rely on these API endpoints for network analysis are less than happy about these changes, but Twitter doesn't appear to be moderating its position on this.
Given all of the above, my best advice would be to use API version 1.0 while you still can, and start making authenticated requests.
Another thought: not sure what your use case is, but you might consider pulling in, say, the 1,000 most recent tweets from each of the 200,000 followers and then leveraging the metadata each tweet contains about mentions. Mentions of other users are potentially more informative than knowing that someone simply follows someone else. You could still tally the most-mentioned accounts. The benefit here is that in moving from API 1.0 to 1.1, the endpoint for pulling in user timelines will actually have its rate limit raised from 350 per hour to 720 (source: Twitter API 1.1 documentation).
Hope this helps, and good luck!
Ben
