Google Proximity Beacon API: how to register iBeacon?

Google's Proximity Beacon API documentation uses Eddystone as the example everywhere:
https://developers.google.com/beacons/proximity/register
However, the documentation mentions two more beacon types, AltBeacon and iBeacon.
If I understand correctly, something like this should be used (adapted from Google's example):
{
  "advertisedId": {
    "type": "IBEACON",
    "id": "base 64 of what???"
  },
  "status": "ACTIVE",
  "latLng": {
    "latitude": 51.4935657,
    "longitude": -0.1465538
  }
}
However, what is the acceptable binary format for the iBeacon's UUID, major, and minor (which should then be base64-encoded)?

The id of the advertisedId is the 20 bytes of the iBeacon identifier (16-byte UUID + 2-byte major + 2-byte minor), base64-encoded directly from the binary form. In other words, don't print the value as hex or text before base64 encoding; just take the raw 20-byte blob and base64 that.
Otherwise your request looks right!
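As a rough illustration of that encoding, here is a minimal Python sketch. The function name and the beacon values are made up for the example, and it assumes major and minor are packed as 2-byte big-endian unsigned integers, which is how they appear in the iBeacon advertisement; check the byte order against your own beacons before relying on it.

```python
import base64
import struct
import uuid


def ibeacon_advertised_id(proximity_uuid: str, major: int, minor: int) -> str:
    """Return a base64 string suitable for advertisedId.id of type IBEACON."""
    # 16 raw UUID bytes, then major and minor as 2-byte big-endian unsigned
    # integers (byte order is an assumption, see the note above).
    raw = uuid.UUID(proximity_uuid).bytes + struct.pack(">HH", major, minor)
    assert len(raw) == 20
    # Encode the binary blob directly, not its hex or text representation.
    return base64.b64encode(raw).decode("ascii")


# Hypothetical beacon values, for illustration only.
beacon_id = ibeacon_advertised_id("e2c56db5-dffb-48d2-b060-d0f5a71096e0", 1, 2)
```

The resulting string is what would go into the "id" field of the request body; 20 bytes always produce a 28-character base64 string.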

Decode Protobuf Text

I have some Protobuf text that I'm receiving via an http response from a website. The text roughly looks like this:
1 {
2: some value
7: {
12: some value
}
8: some value
}
except the content is much larger. I don't want to paste the actual text for security purposes.
Anyways, how can I "decode" this so that I can see the schemas?
At the moment it is impossible to obtain a perfectly accurate schema from a protobuf message.
That being said, you can get semi-close. There are some tools like protobuf-inspector that can print out a bit more information about the structure of the message.
Some important caveats about this tool (and in general) as to why it's not possible to obtain the full schema, taken from the README of the tool:
[...] the field names are obviously lost, together with some high-level details such as:
whether a varint uses zig-zag encoding or not (will assume no zig-zag by default)
whether a 32-bit/64-bit value is an integer or float (both shown by default)
signedness (auto-detect by default)
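To make those caveats concrete, here is a small Python sketch (not taken from protobuf-inspector; the function names are invented for illustration) showing that the same raw wire value has several equally valid readings when the schema is missing.

```python
import struct


def zigzag_decode(n: int) -> int:
    """Undo protobuf zig-zag encoding (used by sint32/sint64 fields)."""
    return (n >> 1) ^ -(n & 1)


def varint_candidates(n: int) -> dict:
    """A decoded varint could be a plain integer or a zig-zag encoded one."""
    return {"plain": n, "zigzag": zigzag_decode(n)}


def fixed32_candidates(raw: bytes) -> dict:
    """A 4-byte little-endian value could be unsigned, signed, or a float."""
    return {
        "uint32": struct.unpack("<I", raw)[0],
        "int32": struct.unpack("<i", raw)[0],
        "float": struct.unpack("<f", raw)[0],
    }


varint_guess = varint_candidates(3)                            # 3 or -2
fixed32_guess = fixed32_candidates(bytes.fromhex("0000803f"))  # 1065353216 or 1.0
```

Nothing in the message itself says which reading the sender intended, which is why a tool can only guess or show all candidates.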

Is there a JSON equivalent of the html_strip filter?

We receive data from application forms in JSON and need to be able to search on it - but only the text entered by the user. Some of our data from other sources comes in as XML and this is fine - the html_strip (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-htmlstrip-charfilter.html) character filter does the job.
But is there an equivalent for JSON - you send serialized JSON as text and it strips out the tags just leaving the data?
A very simplistic example:
The application form sends back this data:
{
"ed_hist1": "Glasgow High School",
"ed_hist2": "Edinburgh University"
}
This gets serialized and added to our document as a text field:
{
"Type": "applicationform",
"Id": 1,
"Name": "Margaret Blenkinsop",
"Email": "JohnB232#myCompany.COM",
"Text": "{\"ed_hist1\":\"Glasgow High School\",\"ed_hist2\":\"Edinburgh University\"}"
}
And that gets sent to ES.
When I search the text field I don't want to be able to find "ed_hist1" or "ed_hist2", only "Glasgow High School" and "Edinburgh University".
Or is the only way to pre-process the JSON? (Which is fine but I don't want to manually code something if ES will take care of it for me.)
Solution #1
There are a couple of ways to accomplish what you want. The most "idiomatic" way would be to pre-process the JSON coming from your application, transforming the JSON doc you have into a list.
E.g.:
{
"ed_hist1": "Glasgow High School",
"ed_hist2": "Edinburgh University"
}
becomes
[
"Glasgow High School",
"Edinburgh University"
]
And then your document would look as follows:
{
"Type": "applicationform",
"Id": 1,
"Name": "Margaret Blenkinsop",
"Email": "JohnB232#myCompany.COM",
"Text": ["Glasgow High School", "Edinburgh University"]
}
You could then search the Text field for the education history entry you are looking for and get back the documents that match your query. This is the simplest solution for your case, but there are other ways of structuring your data depending on what questions you want to ask of it.
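A minimal sketch of that pre-processing step in Python, assuming the form arrives as a JSON string and that only string values are worth searching; the helper name extract_values is made up for this example.

```python
import json


def extract_values(node):
    """Collect every string value from parsed JSON, dropping the keys."""
    if isinstance(node, dict):
        return [v for child in node.values() for v in extract_values(child)]
    if isinstance(node, list):
        return [v for child in node for v in extract_values(child)]
    # Non-string leaves (numbers, booleans, null) are ignored in this sketch.
    return [node] if isinstance(node, str) else []


form_json = '{"ed_hist1": "Glasgow High School", "ed_hist2": "Edinburgh University"}'

document = {
    "Type": "applicationform",
    "Id": 1,
    "Name": "Margaret Blenkinsop",
    "Text": extract_values(json.loads(form_json)),
}
```

Because the helper recurses, nested forms flatten into the same list, so the keys never reach the index.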
Solution #2
You seem to be thinking a fair bit about keeping the original JSON document as text in a field of another document. Without additional context, I do not particularly like this solution, but I assume you have your reasons. For this second, not recommended approach, I would use char filters in addition to storing your original field value. The process would look like this:
the document is indexed
the original field value is `store`d
a custom char filter strips out the unwanted JSON syntax characters
the remaining text is indexed
At query time you can get back the original stored value, but also leverage the "full text" search on the text contained in the JSON document after the characters have been stripped. I really believe you can get more out of your data if you do a solution #1 variant. Stripping HTML for text search is a good idea to search through markup documents, but stripping JSON makes less sense as JSON is ES's bread and butter, and will give you more tools to work with as JSON.
EDIT: I forgot to link to the store documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-store.html
EDIT #2: Of additional note is the _source field which is enabled by default: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html Your original JSON document lives there, for free, as well. This can get you back the original document you sent to ES, while still giving you the more idiomatic data structure in the DB.
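For what it's worth, here is a rough sketch of what the solution #2 settings could look like, written as a Python dict that you would send as the index-creation body. It relies on Elasticsearch's built-in pattern_replace char filter; the names json_keys_strip and json_values_only are invented, the mapping syntax assumes a version with typeless mappings, and the regex assumes keys contain no escaped quotes. The re.sub preview only approximates what Elasticsearch's own regex engine would do.

```python
import re

# Matches a quoted JSON key together with its trailing colon.
KEY_PATTERN = r'"[^"]+"\s*:'

index_body = {
    "settings": {
        "analysis": {
            "char_filter": {
                # Hypothetical name; replaces every key with a space.
                "json_keys_strip": {
                    "type": "pattern_replace",
                    "pattern": KEY_PATTERN,
                    "replacement": " ",
                }
            },
            "analyzer": {
                # Hypothetical name; the standard tokenizer then discards
                # the leftover braces, quotes, and commas.
                "json_values_only": {
                    "type": "custom",
                    "char_filter": ["json_keys_strip"],
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "Text": {"type": "text", "analyzer": "json_values_only", "store": True}
        }
    },
}


def preview_char_filter(text: str) -> str:
    """Approximate locally what the char filter leaves for the tokenizer."""
    return re.sub(KEY_PATTERN, " ", text)


preview = preview_char_filter(
    '{"ed_hist1":"Glasgow High School","ed_hist2":"Edinburgh University"}'
)
```

Values that themselves contain quotes followed by a colon would trip this pattern, which is one more reason to prefer solution #1.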

How to highlight filler words (Um, uh, ah) in transcript?

Is there a configuration in Google Cloud Speech that allows me to see the filler words in the returned JSON transcript? Currently, it seems like the transcript returned by Google Cloud Speech automatically filters out such words (uh, um, ah, like, etc.).
I've attempted to use the 'phrases' parameter in my audio recognize call, which puts emphasis on detecting specific phrases in the audio.
wordsToDetect = ["um", "like", "so", "honestly", "basically", "actually", "uh"]
audio = speech.audio output_filepath,
                     encoding: :flac,
                     language: "en-US"
results = audio.recognize phrases: wordsToDetect
Is it the case that Google Cloud Speech simply automatically filters out filler words like "um" and "uhhh"?
Almost all speech recognition APIs are unable to detect filler sounds such as "ah", "um", and "uh". To detect those sounds, the algorithm has to be trained on those particular sounds.

Broadcast text using eddystone url layout (or altbeacon)

I have a 16-character string that I would like to broadcast as the identifier, which the app uses to trigger certain actions.
I am relatively new to the different beacon layouts, so I would love to get the right opinion. I was thinking of doing it in one of the following ways:
Hex-encode the string and use it as the identifier in the Eddystone-URL layout.
But the 16-character string becomes 32 hex characters, which doesn't help.
Another option is to use the AltBeacon library, like below:
byte[] dataBytes = "16 length string".getBytes(StandardCharsets.US_ASCII); // needs java.nio.charset.StandardCharsets
Identifier identifier = Identifier.fromBytes(dataBytes, 0, dataBytes.length, false);
I am not sure either works, mainly because of the 16-character length. Is there a better way to achieve this, or is it possible at all in the first place?
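To put numbers on the size concern, here is a small Python sketch of the arithmetic only, not of any beacon library. It assumes the string is plain ASCII and that the chosen layout offers a 16-byte identifier slot (the size of a UUID); whether a given layout does is something to confirm in its spec.

```python
import uuid

payload = "16 length string"       # the example string from the question
data = payload.encode("ascii")     # ASCII text is one byte per character

hex_form = data.hex()              # 32 hex digits describing the same 16 bytes
as_uuid = uuid.UUID(bytes=data)    # 16 raw bytes fit a UUID-sized identifier

# The receiving side can turn the identifier bytes back into the string.
decoded = as_uuid.bytes.decode("ascii")
```

The 32 in the hex form counts hex digits; the payload on the air is still 16 bytes as long as the raw bytes, not the hex text, are broadcast.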

Google Translate API Detecting Wrong Language with Greek Translation

I have set up functionality in one of my PHP applications to use the Google Translate API. It seems to be working pretty well for single and multiple translation requests using both the POST and GET methods...
However, when I don't supply a source language, the API seems to have issues auto-detecting many of my Greek strings, reporting them as English with very low confidence... One example is this string:
Σε προσωπικό επίπεδο συζητάμε με τους πελάτες. Διοργανώνουμε πεζοπορίες στα μονοπάτια που έχουμε διανοίξει και σηματοδότηση και εξηγούμε όλα τα περιβαλλοντικά ζητηματα
My application is not always aware of the actual source language, so I'm trying to get this to work with auto language detection.
The translate.google.com interface seems to have no problem detecting these strings however...
Does the API use a different detection algorithm from translate.google.com?
Any info or suggestions on helping the auto-detection work better?
Actual request for the above Greek string:
https://www.googleapis.com/language/translate/v2?q=%CE%A3%CE%B5+%CF%80%CF%81%CE%BF%CF%83%CF%89%CF%80%CE%B9%CE%BA%CF%8C+%CE%B5%CF%80%CE%AF%CF%80%CE%B5%CE%B4%CE%BF+%CF%83%CF%85%CE%B6%CE%B7%CF%84%CE%AC%CE%BC%CE%B5+%CE%BC%CE%B5+%CF%84%CE%BF%CF%85%CF%82+%CF%80%CE%B5%CE%BB%CE%AC%CF%84%CE%B5%CF%82.+%CE%94%CE%B9%CE%BF%CF%81%CE%B3%CE%B1%CE%BD%CF%8E%CE%BD%CE%BF%CF%85%CE%BC%CE%B5+%CF%80%CE%B5%CE%B6%CE%BF%CF%80%CE%BF%CF%81%CE%AF%CE%B5%CF%82+%CF%83%CF%84%CE%B1+%CE%BC%CE%BF%CE%BD%CE%BF%CF%80%CE%AC%CF%84%CE%B9%CE%B1+%CF%80%CE%BF%CF%85+%CE%AD%CF%87%CE%BF%CF%85%CE%BC%CE%B5+%CE%B4%CE%B9%CE%B1%CE%BD%CE%BF%CE%AF%CE%BE%CE%B5%CE%B9+%CE%BA%CE%B1%CE%B9+%CF%83%CE%B7%CE%BC%CE%B1%CF%84%CE%BF%CE%B4%CF%8C%CF%84%CE%B7%CF%83%CE%B7+%CE%BA%CE%B1%CE%B9+%CE%B5%CE%BE%CE%B7%CE%B3%CE%BF%CF%8D%CE%BC%CE%B5+%CF%8C%CE%BB%CE%B1+%CF%84%CE%B1+%CF%80%CE%B5%CF%81%CE%B9%CE%B2%CE%B1%CE%BB%CE%BB%CE%BF%CE%BD%CF%84%CE%B9%CE%BA%CE%AC+%CE%B6%CE%B7%CF%84%CE%B7%CE%BC%CE%B1%CF%84%CE%B1&target=en&key={YOUR_API_KEY}
And Response:
{
  "data": {
    "translations": [
      {
        "translatedText": "Σε προσωπικό επίπεδο συζητάμε με τους πελάτες. Διοργανώνουμε πεζοπορίες στα μονοπάτια που έχουμε διανοίξει και σηματοδότηση και εξηγούμε όλα τα περιβαλλοντικά ζητηματα",
        "detectedSourceLanguage": "en"
      }
    ]
  }
}
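For anyone trying to reproduce this, here is a minimal Python sketch of how a request URL like the one above can be built. It only constructs the URL and sends nothing; the helper name is made up. The optional source parameter is part of the v2 API and bypasses auto-detection, which only helps in the cases where the application does know the language.

```python
from urllib.parse import urlencode

ENDPOINT = "https://www.googleapis.com/language/translate/v2"


def build_translate_url(text, target, api_key, source=None):
    """Build a GET URL for the v2 translate endpoint (no request is sent)."""
    params = {"q": text, "target": target, "key": api_key}
    if source is not None:
        # An explicit source language skips auto-detection entirely.
        params["source"] = source
    # urlencode percent-encodes the UTF-8 bytes and turns spaces into '+'.
    return ENDPOINT + "?" + urlencode(params)


auto_url = build_translate_url("Σε προσωπικό επίπεδο", "en", "YOUR_API_KEY")
greek_url = build_translate_url("Σε προσωπικό επίπεδο", "en", "YOUR_API_KEY", source="el")
```

Comparing the responses for the two URLs would at least show whether the untranslated output comes from the detection step or from the translation itself.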
