List of SSML tags and features supported with Google Actions

I'm trying to write a function that will generate a script to be read aloud, marked up with SSML tags, with elements of the script inserted dynamically based on data pulled from the Twitter API.
I can't find a list of SSML tags and features that Actions on Google support. Is there a comprehensive list somewhere?
Appreciate any input.

The basic tags are listed here: https://developers.google.com/actions/reference/ssml
But there are actually more supported tags than that page lists, for example:
<voice gender="male" variant="1"> </voice>
You can look at https://www.w3.org/TR/speech-synthesis11/#S3.2.1 to see what else is there.
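As an aside, since the question mentions building the script dynamically: here is a minimal sketch in Python (the surrounding text, voice attributes, and pause length are placeholder assumptions; escaping the dynamic content matters because tweets can contain characters like & and <):

import html

def build_ssml(tweet_text):
    # escape dynamic content so & and < don't break the markup
    safe = html.escape(tweet_text)
    return ('<speak>'
            '<voice gender="male" variant="1">'
            'Here is the latest tweet: <break time="300ms"/>' + safe +
            '</voice>'
            '</speak>')

print(build_ssml('Hello <world> & friends'))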

Related

Correct way to search videos with multiple keywords with an OR condition using the YouTube search API

I'm trying to use the YouTube Data API's search and videos endpoints in my web application to display the top view-counted videos related to several keywords. I'm planning to make two calls in total: the first gets an ID list with the search API, and the second gets details for the IDs returned by the first call, with the videos API.
My question is with regard to the search API. Based on my trial and error, if I put multiple keywords separated by spaces in the search API's q parameter, it appears to behave as an AND condition, which is not the common behavior of, say, a Google search. As far as I have tried, searching multiple keywords with an OR condition seems to work if I include OR between the keywords, but I would like to confirm that my assumption is correct, officially if possible. A minimal sketch of the two-call flow I have in mind is below.
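For reference, this is the rough shape of the two calls in Python (the OR between keywords is exactly the behavior I'm trying to confirm; the API key is a placeholder):

import requests

API_KEY = 'YOUR_API_KEY'  # placeholder

# call 1: search.list; OR between keywords appeared to behave as an OR
# condition in my tests, while space-separated keywords behaved as AND
search = requests.get(
    'https://www.googleapis.com/youtube/v3/search',
    params={'part': 'id', 'q': 'keyword1 OR keyword2', 'type': 'video',
            'order': 'viewCount', 'maxResults': 10, 'key': API_KEY},
).json()
video_ids = [item['id']['videoId'] for item in search.get('items', [])]

# call 2: videos.list for details on the IDs returned by call 1
details = requests.get(
    'https://www.googleapis.com/youtube/v3/videos',
    params={'part': 'snippet,statistics', 'id': ','.join(video_ids),
            'key': API_KEY},
).json()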
I expected to find this kind of specification in the official documentation, but so far I have had no luck. It would be very helpful if you could share such links, if they exist, or give me the official answer.
By the way, this is my first post to Stack Overflow. If anything is missing from my question, please kindly advise.

Azure DevOps Wiki: What is the value of YAML atop a page's markdown?

What is the value in adding YAML atop an Azure DevOps Wiki page's markdown, as supported by its markdown syntax: Syntax guidance for Markdown usage in Wiki, YAML tags?
It seems to offer nothing more than an alternative syntax with which to specify tables. Perhaps it allows more elaborate tables, but they'll only render atop the page. What am I missing?
As the introduction in the documentation says,
Any file that contains a YAML block in a Wiki is processed into a table with one head and one row.
So, I think the value of YAML tags in the Wiki markdown is to convert the abstract YAML statements into a visual table on the Wiki page to increase readability and quick understanding.
Especially for a complex YAML block that may contain multiple items or multiple sub-items, the YAML tags should be very helpful.
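For illustration (these field names are invented for the example), a YAML block like the following at the top of a page:

---
tag: release-notes
owner: docs-team
reviewed: 2020-01-15
---

would be rendered atop the page as a table with one header row (tag, owner, reviewed) and one row of values.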
[UPDATE]
I found an issue ticket (MicrosoftDocs/azure-devops-docs#9976) reported by another user on the GitHub repository "MicrosoftDocs/azure-devops-docs". It reports a similar question.
In that issue ticket, you can also see that @amitkumariiit has given an explanation:
Yaml tags are used for general search engine optimisation. Our plan was to add the basic support for it first and then ingest this in the azure devops wiki search for optimise search. However we could not prioritise the search side of work.
If you need a more detailed explanation, you can follow this issue ticket and add your comments to it.
I am going to propose my own answer. It just occurred to me that this is likely intended to replace markdown, not to be used with markdown. That is to say, to support documentation written purely in YAML. That could make some sense, add value for some, and explain why it's ONLY supported atop the page. You use it instead of the markdown, not with the markdown.
The documentation just doesn't make it clear why/how you might want to use this feature.

How to download full article text from PubMed?

I am working on a project that requires working with the Genia corpus. According to the literature, the Genia corpus is made from articles extracted by searching three MeSH terms, “transcription factor”, “blood cell” and “human”, on Medline/PubMed. I want to extract the full-text articles (which are freely available) for the articles in the Genia corpus from PubMed. I have tried many approaches, but I am not able to find a way to download the full text in text, XML, or PDF format.
Using the Entrez utilities provided by NCBI:
I have tried using the approach mentioned here -
http://www.hpa-bioinformatics.org.uk/bioruby-api/classes/Bio/NCBI/REST/EFetch/Methods.html#M002197
which uses the Ruby gem Bio like this to get the information for a given PubMed ID -
Bio::NCBI::REST::EFetch.pubmed(15496913)
But, it doesn't return the full text for the PMID.
Internally, it makes a call like this -
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=1372388&retmode=text&rettype=medline
But neither the Ruby gem nor the above call returns the full text.
On searching the Internet further, I found that the allowed values of rettype and retmode for PubMed don't include an option to get the full text, as shown in the table here -
http://www.ncbi.nlm.nih.gov/books/NBK25499/table/chapter4.T._valid_values_of__retmode_and/?report=objectonly
All the examples and other scripts I have seen on the Internet only extract abstracts, authors, etc., and none of them discuss extracting the full text.
Here is another link I found that uses the Python package Bio, but it only accesses information about the authors -
https://www.biostars.org/p/172296/
How can I download the full text of the articles in text, XML, or PDF format using the Entrez utilities provided by NCBI? Or are there existing scripts or web crawlers that I can use?
You can use Biopython to find articles that are on PubMed Central and then fetch the PDF from there. For all articles that are hosted somewhere else, it is difficult to find a generic solution for getting the PDF.
It seems that PubMed Central does not want you to download articles in bulk: requests via urllib are blocked, but the same URL works from a browser.
from Bio import Entrez

Entrez.email = "Your.Name.Here@example.org"

# id is a comma-separated string of PubMed IDs;
# records with a public PMC article will yield a PDF URL
handle = Entrez.efetch(db="pubmed", id="19304878,19088134", retmode="xml")
records = Entrez.parse(handle)

# check for each record whether it has a PMC identifier,
# and print the URL for downloading the PDF
for record in records:
    if record.get('MedlineCitation'):
        if record['MedlineCitation'].get('OtherID'):
            for other_id in record['MedlineCitation']['OtherID']:
                if other_id.title().startswith('Pmc'):
                    print('http://www.ncbi.nlm.nih.gov/pmc/articles/%s/pdf/'
                          % other_id.title().upper())
I'm working on the exact same problem using Ruby. So far I have achieved moderate success by doing the following:
use Mechanize plus esearch from the E-utilities to get an XML of your PubMed search, then use Mechanize/Nokogiri to parse the PMIDs out of the XML
use Mechanize plus the ID converter to convert the PMIDs to PMCIDs (when available); if you really are only interested in the papers available on PMC, you can set up the esearch to return PMCIDs as well (a sketch of this step follows below)
once you have the PMCIDs, use Mechanize to access the web page, follow the PDF link on the page, and use Mechanize to save the file
It's by no means straightforward, but still not that bad. There is a gem that claims to do the same (https://github.com/billgreenwald/Pubmed-Batch-Download); I plan to test that out soon.
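For reference, a rough Python sketch of the PMID-to-PMCID conversion step, using NCBI's PMC ID Converter service (the tool and email values are placeholders to replace with your own):

import requests

# the ID Converter maps PMIDs to PMCIDs where a PMC article exists
resp = requests.get(
    'https://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/',
    params={'ids': '19304878,19088134', 'format': 'json',
            'tool': 'my-tool', 'email': 'me@example.org'},
)
for rec in resp.json()['records']:
    print(rec['pmid'], rec.get('pmcid', 'no PMC article'))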
If you want XML or JSON for a PubMed ID or PMCID, then you want to use the "BioC API" to access PubMed Central (PMC) Open Access articles.
(see https://www.ncbi.nlm.nih.gov/research/bionlp/APIs/BioC-PMC/ )
Here is a code example:
https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/pmcoa.cgi/BioC_xml/19088134/ascii
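For example, a minimal fetch of that document in Python (the URL pattern is BioC_<format>/<id>/<encoding>, following the example above):

import requests

# fetch the BioC XML for PMID 19088134, ASCII-encoded
url = ('https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/'
       'pmcoa.cgi/BioC_xml/19088134/ascii')
bioc_xml = requests.get(url).text
print(bioc_xml[:200])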

How do I take each line of a text file and insert it into a web form? Specifically, for testing domain name availability

I wrote a Ruby script that appended "data" to the beginning of every word of the English dictionary and then filtered out various strings using different parameters. Now I want to use a site like Namecheap or Gandi.net to take each of these strings and insert it into the domain name availability checker, to determine which ones are available.
It is my understanding that this will involve making a POST HTTP request of some kind, as well as grabbing the element in question, but I don't really understand what I should read about in order to do this kind of thing.
I imagine that after a few requests I will be rate-limited, but as a learning exercise I am still curious how I would go about doing this; a rough sketch of what I have in mind follows below.
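For illustration only, the general shape in Python (the endpoint, form field, and CSS selector are hypothetical; a real checker would need the site's actual form action or, better, a public API):

import requests
from bs4 import BeautifulSoup

# read the candidate strings produced by the dictionary script
with open('candidates.txt') as f:
    names = [line.strip() + '.com' for line in f if line.strip()]

for name in names:
    # hypothetical endpoint and field name; the real form must be inspected
    resp = requests.post('https://example.com/check', data={'domain': name})
    soup = BeautifulSoup(resp.text, 'html.parser')
    result = soup.select_one('.availability')  # hypothetical selector
    if result is not None:
        print(name, result.get_text(strip=True))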
I inspected the element (on namecheap) to see what the tag looked like, to find any uniquely identifiable class/id names that I could use to grab that specific part of the source, and found that inside a fieldset tag, there was a line of HTML that I can't seem to paste here, so here is a picture:
Thanks in advance for any guidance in helping me learn about web scripting!

Can RapidMiner extract XPaths from a list of URLs, instead of first saving the HTML pages?

I've recently discovered RapidMiner, and I'm very excited about its capabilities. However, I'm still unsure whether the program can help me with my specific needs. I want the program to scrape XPath matches from a URL list I've generated with another program (which has more options than the 'crawl web' operator in RapidMiner).
I've seen the following tutorial from Neil McGuigan: http://vancouverdata.blogspot.com/2011/04/web-scraping-rapidminer-xpath-web.html. But the websites I try to scrape have thousands of pages, and I don't want to store them all on my PC. And the web crawler simply lacks critical features, so I'm unable to use it for my purposes. Is there a way I can just make it read the URLs and scrape the XPath matches from each of those URLs?
I've also looked at other tools for extracting HTML from pages, but I've been unable to figure out how they work (or even how to install them) since I'm not a programmer. RapidMiner, on the other hand, is easy to install and the operator descriptions make sense, but I've been unable to connect them in the right order.
I need some input to keep my motivation going. I would like to know what operator I could use instead of 'process documents from files'. I've looked at 'process documents from web', but it doesn't have an input, and it still needs to crawl. Any help is much appreciated.
Looking forward to your replies.
Web scraping without saving the HTML pages locally using RapidMiner is a two-step process:
Step 1: Follow the video at http://vancouverdata.blogspot.com/2011/04/rapidminer-web-crawling-rapid-miner-web.html by Neil McGuigan, with the following difference:
instead of the Crawl Web operator, use the Process Documents from Web operator. There will not be an option to specify the output directory, because the results will be loaded into the ExampleSet.
The ExampleSet will contain the links matching the crawling rules.
Step 2: Follow the video at http://vancouverdata.blogspot.com/2011/04/web-scraping-rapidminer-xpath-web.html, but only from 7:40 onward, with the following difference:
put the Extract Information subprocess inside the Process Documents from Web operator created previously.
The ExampleSet will contain the links and the attributes matching the XPath queries.
I have pretty much the same problem as you, and maybe these posts from the RapidMiner forum will help you a little:
http://rapid-i.com/rapidforum/index.php/topic,2753.0.html
and
http://rapid-i.com/rapidforum/index.php?topic=3851.0.html
See ya ;)
