YouTube API v3: restricting results to one language (UTF-8)

Is there any way to restrict YouTube search results to one language? My code works for the most part, but it returns some videos whose title or description is in another language. For example, in Turkish the word "iş" means work and "aşk" means love, yet YouTube returns videos whose title or description contains "is" or "ask". Turkish has special characters, but YouTube seems to treat them as their closest English letters. How can I avoid this?
Could it be because my code and database are encoded as UTF-8?

You can restrict the search with a region code, which should help.
For example, for Turkish videos you can use regionCode = "TR":
GET https://www.googleapis.com/youtube/v3/search?part=snippet&q=is&regionCode=TR&key={YOUR_API_KEY}
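A minimal sketch of how that request URL could be built from Python so that the Turkish query is sent as percent-encoded UTF-8. The helper name build_search_url and the placeholder key are illustrative, not part of the answer above. The sketch also sets relevanceLanguage, another documented search.list parameter; as far as I know it only biases results toward a language rather than strictly filtering them.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(query, api_key, region="TR", language="tr"):
    """Build a search.list URL; urlencode percent-encodes the query as UTF-8."""
    params = {
        "part": "snippet",
        "q": query,
        "regionCode": region,           # from the answer above
        "relevanceLanguage": language,  # assumption: extra hint, not a hard filter
        "key": api_key,
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder, as in the GET example above.
url = build_search_url("iş", "YOUR_API_KEY")
print(url)
```

The "ş" is sent as %C5%9F, so the query reaches the API with its Turkish letter intact; whether YouTube still matches "is" is up to its search backend.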

Related

translate language in image preserving structure

We are looking to translate images found in PDF documents from different languages to English.
They are scanned images and often contain tables or some other structure. We would like to translate them to English while preserving the structure of the document as much as possible, so a pure text-based translation doesn't suffice.
We saw that the Google Translate app on Android seems to do something similar with photos on a phone. Is there a Google Cloud API which does the same?
To do this on Google Cloud, which API should we use? Can you point us to the API and the documentation for it?
Thanks
Using Google Cloud products, you can achieve this by running OCR to extract the text and then using the Translate API to translate that text to English.
I suggest using Document AI for the OCR step, since that API is designed to parse forms and tables. You can check Document AI Table parsing and Document AI Document parsing for examples of how to use the API. You can then pass the extracted text to the Translate API.
High level steps:
Use Document AI to extract data from pdf files
Use Translate API to translate the extracted data to English
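The two steps above can be sketched as a small pipeline. This is only an illustration of the "translate cell by cell, keep the layout" idea: the table is assumed to have already been extracted (for example by Document AI table parsing) into a list of rows, and fake_translate is a stand-in for a real Translate API call. Neither function is part of any Google client library.

```python
from typing import Callable, List

Table = List[List[str]]


def translate_table(rows: Table, translate: Callable[[str], str]) -> Table:
    """Translate every non-empty cell while keeping the row/column layout."""
    return [
        [translate(cell) if cell.strip() else cell for cell in row]
        for row in rows
    ]


# Stand-in translator for illustration only; a real pipeline would call
# the Translate API here instead of a hard-coded dictionary.
FAKE_DICTIONARY = {"Nom": "Name", "Prix": "Price", "Pomme": "Apple"}


def fake_translate(text: str) -> str:
    return FAKE_DICTIONARY.get(text, text)


# Hypothetical OCR output: one list per table row.
scanned_table = [["Nom", "Prix"], ["Pomme", "1,20"]]
translated = translate_table(scanned_table, fake_translate)
print(translated)
```

Because each cell is translated independently, the output has exactly the same shape as the input and can be rendered back into the original table layout.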

How can I change the pronunciation of a specific word by Alexa in a custom skill?

Sometimes, when developing an Alexa skill and programming the responses from my service, Alexa mispronounces one of the words in my reply, confusing the user.
For example, if I want Alexa to say the word live, how can I tell Alexa which pronunciation to use, given that live has two pronunciations?
Is there a way to dictate to Alexa the correct pronunciation, or replace it with a custom sound that is correct? Do I need to use additional markup or an API call?
Alexa supports SSML, which is an XML-like markup language for speech. Instead of returning plain text from your service, you can use SSML responses. The <phoneme> tag is what you need in particular:
phoneme
Provides a phonemic/phonetic pronunciation for the contained text. For example, people may pronounce words like “pecan” differently.
For English words (especially US English), Alexa should be able to pronounce any word if you give it the correct phonetic pronunciation:
The following tables list the supported symbols for use with the phoneme tag. These symbols provide full coverage for the sounds of US English. Note that many non-English languages require the use of symbols not included in this list, which are not supported. Using symbols not included in this list is discouraged, as it may result in suboptimal speech synthesis.
Quotes from Amazon documentation on SSML.
Here's an example of giving Alexa a specific pronunciation for your word live:
<speak>
<phoneme alphabet="ipa" ph="lɪv">live 1</phoneme>.
<phoneme alphabet="ipa" ph="laɪv">live 2</phoneme>.
</speak>
The <phoneme> tag supports the IPA and X-SAMPA phonetic alphabets. You can typically find IPA spellings for any word on Wiktionary or through Google.
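As a rough sketch, a Python skill backend could assemble that markup with a small helper and return it as SSML output speech. The helper names phoneme and ssml_output_speech are made up for this example; the {"type": "SSML", "ssml": ...} shape is the outputSpeech object of the Alexa response format.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme(text, ipa):
    """Wrap text in a <phoneme> tag carrying an IPA pronunciation."""
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(text)}</phoneme>'


def ssml_output_speech(*fragments):
    """Join already-escaped SSML fragments into an outputSpeech object."""
    return {"type": "SSML", "ssml": "<speak>" + " ".join(fragments) + "</speak>"}


output_speech = ssml_output_speech(
    escape("The show is"),
    phoneme("live", "laɪv"),  # the adjective, as in "a live show"
    escape("tonight."),
)
print(output_speech["ssml"])
```

Plain text is passed through escape so that characters such as & or < in a reply do not break the markup.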
For longer messages, it may be best to use the <audio> tag and record a custom voice:
The audio tag lets you provide the URL for an MP3 file that the Alexa service can play while rendering a response. You can use this to embed short, pre-recorded audio within your service’s response. For example, you could include sound effects alongside your text-to-speech responses, or provide responses using a voice associated with your brand.
Quoted from Amazon documentation on <audio>.

Google Translate API - English subscripts to translated script

I am trying to convert English to Hindi via Google's API but I also need the English translation of the Hindi string.
To illustrate, if I convert
"a quick brown fox...."
to Hindi, it reads
"फुर्तीली भूरी लोमड़ी आलसी कुत्ते के उपर से कूद गई।"
But if you look at the web interface, Google also translates the Hindi version as
"phurtilee bhoori lomdi ..".
This doesn't show up in the response format of Translate API. I tried searching all their docs but this is all I got https://cloud.google.com/translate/docs/reference/translate#translatetextresponsetranslation and it just has a translated text in the response.
The Google Translation API does not currently offer phonetic translation, even though it is available in the web interface.
You can file a feature request to have it included in the API by following the procedure explained in this forum, where the same question was asked.

Can the Google Speech API be configured to return only numbers / letters?

Can the Google Speech API be configured to only return numbers and letters, as opposed to full words?
The use case is transcribing Canadian postal codes.
For example, for "M 1 B 0 R 3" Google may return "Em 1 Be 0 Are 3".
We have tried:
Using speechContexts and feeding in the letters A - Z as individual phrases. This improved the accuracy for us. We did not have much success passing in individual numbers (e.g. 1, 2, 3).
Specifying the codec and sample rate of our WAV file using the encoding and sampleRateHertz configuration options. We saw no improvement from doing this, as we believe Google already does a great job of auto-recognizing the sample rate and encoding.
Our audio file is 8000 Hz and encoded with μ-law ("M-ULAW"). We have no flexibility in changing the sample rate or encoding.
Is there a way to get a more accurate response from Google for this use case? Even ideas for better speechContexts phrases are welcome.
Thank you
We are seeing the same results, and we would love to have a syntax-based "context" suggestion or a parameter that forces digit-only output.
Changing the API version does not fix the way digits are recognised, not even with model: phone_call.
What actually worked better for recognising some kinds of numbers was switching to the en_US locale, which in turn made the recognition engine identify a list of numbers as a phone number. It was returned in phone-like syntax as +XXX-XXX-XXX-XXXX, and this made detection really good.
So I don't understand why Google has syntax matching behind the curtains and doesn't make it available through their API.
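For reference, the settings discussed in this thread could be combined into one recognition config like the sketch below. It only builds the JSON-style config dictionary for the REST API (encoding, sampleRateHertz, languageCode, model and speechContexts are fields of RecognitionConfig); the function name and the choice of phrases are assumptions, and nothing here guarantees letter-only or digit-only output.

```python
import string


def build_recognition_config(language_code="en-US"):
    """Assemble a RecognitionConfig-style dict for 8 kHz mu-law telephone audio."""
    # Phrase hints: single letters and digits, as tried in the question.
    phrases = list(string.ascii_uppercase) + list(string.digits)
    return {
        "encoding": "MULAW",
        "sampleRateHertz": 8000,
        "languageCode": language_code,  # BCP-47 form of the en_US locale
        "model": "phone_call",
        "speechContexts": [{"phrases": phrases}],
    }


config = build_recognition_config()
print(config["speechContexts"][0]["phrases"][:5])
```

The dictionary would go in the "config" field of a speech:recognize request body, next to the "audio" field.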

Is it possible to create INTERNATIONAL permalinks?

I was wondering how you deal with permalinks on international sites. By permalink I mean a link that is unique and human-readable.
For English phrases it's no problem, e.g. /product/some-title/
But what do you do if the product title is in, say, Chinese?
How do you deal with this problem?
I am implementing an international site and one requirement is to have human-readable URLs.
Thanks for every comment
Characters outside the ISO Latin-1 set are not permitted in URLs according to this spec, so Chinese strings would be out immediately.
Where the product name can be localised, you can use urls like <DOMAIN>/<LANGUAGE>/DIR/<PRODUCT_TRANSLATED>, e.g.:
http://www.example.com/en/products/cat/
http://www.example.com/fr/products/chat/
accompanied by a mod_rewrite rule to the effect of:
RewriteRule ^([a-z]+)/products/([a-z]+)/?$ product_lookup.php?lang=$1&product=$2 [L]
For the first example above, this rule will call product_lookup.php?lang=en&product=cat. Inside this script is where you would access the internal translation engine (from the lang parameter, en in this case) to do the same translation you do on the user-facing side to translate, say, "Chat" on the French page, "Cat" on the English, etc.
Using an external translation API would be a good idea, but tricky to get a reliable one which works correctly in your business domain. Google have opened up a translation API, but it currently only supports a limited number of languages.
English <=> Arabic
English <=> Chinese
English <=> Russian
Take a look at Wikipedia.
They use national characters in URLs.
For example, the Russian home page URL is: http://ru.wikipedia.org/wiki/Заглавная_страница. The browser transparently percent-encodes all non-ASCII characters when sending the URL to the server.
But on the web page all URLs are human-readable.
So you don't need to do anything special -- just put your product names into URLs as is.
The webserver should be able to decode them for your application automatically.
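A short sketch of what that encoding looks like, using Python's standard urllib.parse. It shows the percent-encoded form that goes over the wire and the decoding a server-side application would apply; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

title = "Заглавная_страница"

# What the browser sends: UTF-8 bytes, each written as %XX.
encoded = quote(title)
url = "http://ru.wikipedia.org/wiki/" + encoded

# What the application sees after decoding the path segment.
decoded = unquote(encoded)

print(url)
print(decoded == title)
```

The encoded URL contains only ASCII characters, so it is valid on the wire, while the page can still display the readable form.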
I usually transliterate the non-ascii characters. For example "täst" would become "taest". GNU iconv can do this for you (I'm sure there are other libraries):
$ echo täst | iconv -t 'ascii//translit'
taest
Alas, these transliterations are locale dependent: in languages other than German, 'ä' could be transliterated as simply 'a', for example. On the other hand, there should be a transliteration into ASCII for every (commonly used) character set.
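A comparable sketch in Python, using only the standard library. Note that this is not the same technique as iconv's //translit: it decomposes characters and drops the accents, so "täst" becomes "tast" rather than "taest", and scripts such as Chinese produce an empty slug because they have no ASCII decomposition. The function name slugify is just a label for this example.

```python
import re
import unicodedata


def slugify(title):
    """Strip accents, lowercase, and join the remaining ASCII words with hyphens."""
    # NFKD splits "ä" into "a" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", title)
    # Dropping non-ASCII code points removes the combining marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


print(slugify("täst"))
print(slugify("Crème Brûlée!"))
print(repr(slugify("中文")))
```

An empty result is a signal that the title needs a real transliteration library or a different URL scheme.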
How about some scheme like /productid/{product-id-number}/some-title/
where the site looks at the {number} and ignores the 'some-title' part entirely. You can put that into whatever language or encoding you like, because it's not being used.
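A minimal sketch of that routing idea, assuming paths of the form /productid/{number}/{anything}/. The pattern and function name are illustrative and not tied to any web framework.

```python
import re
from typing import Optional

# Only the numeric id matters; whatever follows it is ignored.
PRODUCT_PATH = re.compile(r"^/productid/(\d+)(?:/.*)?$")


def product_id_from_path(path: str) -> Optional[int]:
    """Return the product id from the path, or None if the path does not match."""
    match = PRODUCT_PATH.match(path)
    return int(match.group(1)) if match else None


print(product_id_from_path("/productid/42/some-title/"))
print(product_id_from_path("/productid/42/猫/"))
print(product_id_from_path("/other/42/"))
```

Since the title segment is never parsed, it can be in any language or encoding without affecting which product is served.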
If memory serves, you're only able to use English letters in URLs. There's a discussion about changing that, but I'm fairly positive that it hasn't been implemented yet.
That said, you'd need a lookup table where you assign translations of products/titles to whatever word they'll be in the other language. For example:
foo.com/cat will need a translation lookup for "cat", "gato", "neko", etc.
Then your HTTP module, which parses those human-readable names into an exact URL, will know which page to serve based on the translations.
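Such a lookup could be as simple as a dictionary keyed by language and slug. The table contents, the language codes and the function name below are made up for illustration; a real site would load them from its product database.

```python
from typing import Optional

# Hypothetical table: (language, localized slug) -> canonical product key.
SLUG_TO_PRODUCT = {
    ("en", "cat"): "cat",
    ("es", "gato"): "cat",
    ("ja", "neko"): "cat",
}


def resolve_product(lang: str, slug: str) -> Optional[str]:
    """Map a localized slug to the canonical product key, if it is known."""
    return SLUG_TO_PRODUCT.get((lang, slug.lower()))


print(resolve_product("es", "gato"))
print(resolve_product("en", "dog"))
```

Unknown slugs return None, which the caller can turn into a 404 response.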
Creating a lookup for such a thing seems like overkill to me. I cannot create a lookup for all the different words in all languages. Maybe accessing a translation API would be a good idea.
So as far as I can see, it's not possible to use foreign characters in the permalink, as the URL specs do not allow it.
What do you think of encoding the special characters? Are those URLs recognized by Google then?
