Extract nested JSON data from variable key element - ruby

I'm using the Wikipedia API to get an article summary. The data is nested under a variable page ID key that is unique to each article.
How do I get the value of the extract key if I don't know the page ID key under pages in advance?
Example API response for 'Stack Overflow':
{
"query": {
"pages": {
"21721040": {
"pageid": 21721040, # this is a unique key value for each article
"ns": 0,
"title": "Stack Overflow",
"extract": "Stack Overflow is a privately held website, the flagship site of the Stack Exchange Network, created in 2008 by Jeff Atwood and Joel Spolsky, as a more open alternative to earlier Q&A sites such as Experts Exchange. The name for the website was chosen by voting in April 2008 by readers of Coding Horror, Atwood's popular programming blog.\nIt features questions and answers on a wide range of topics in computer programming. The website serves as a platform for users to ask and answer questions, and, through membership and active participation, to vote questions and answers up or down and edit questions and answers in a fashion similar to a wiki or Digg. Users of Stack Overflow can earn reputation points and \"badges\"; for example, a person is awarded 10 reputation points for receiving an \"up\" vote on an answer given to a question, and can receive badges for their valued contributions, which represents a kind of gamification of the traditional Q&A site or forum. All user-generated content is licensed under a Creative Commons Attribute-ShareAlike license. Questions are closed in order to allow low quality questions to improve. Jeff Atwood stated in 2010 that duplicate questions are not seen as a problem but rather they constitute an advantage if such additional questions drive extra traffic to the site by multiplying relevant keyword hits in search engines.\nAs of April 2014, Stack Overflow has over 2,700,000 registered users and more than 7,100,000 questions. Based on the type of tags assigned to questions, the top eight most discussed topics on the site are: Java, JavaScript, C#, PHP, Android, jQuery, Python and HTML."
}
}
}
}
Update: Solution based on Uri's response...
key = response['query']['pages'].keys # => ["21721040"]
response['query']['pages'][key[0]]['extract'] # data

You can look at the keys of the hash:
response['query']['pages'].keys
# => ["21721040"]

If you're using Ruby 2.3 or later, you could use #dig:
Get the value by extracting the key first:
key = response.dig('query', 'pages')&.keys&.first
# => "21721040"
response.dig('query', 'pages', key, 'extract')
# => "Stack Overflow is a privately held website..."
Grab the extract value directly:
response.dig('query', 'pages')&.values&.dig(0, 'extract')
# => "Stack Overflow is a privately held website..."
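For completeness, here is a minimal self-contained sketch that puts the pieces together, assuming the response arrives as a JSON string and is parsed with Ruby's standard json library. The raw string is a shortened stand-in for the real API response, and extract_summary is a hypothetical helper name, not part of any library.

```ruby
require 'json'

# Shortened stand-in for the API response shown in the question.
raw = '{"query":{"pages":{"21721040":{"pageid":21721040,"ns":0,' \
      '"title":"Stack Overflow",' \
      '"extract":"Stack Overflow is a privately held website..."}}}}'

# Hypothetical helper: returns the extract of the first page, or nil if
# any level of the nested structure is missing.
def extract_summary(response)
  pages = response.dig('query', 'pages')
  return nil unless pages.is_a?(Hash)

  first_page = pages.values.first
  first_page.is_a?(Hash) ? first_page['extract'] : nil
end

response = JSON.parse(raw)
summary = extract_summary(response)
puts summary
```

Using values.first avoids looking up the page ID at all, which is handy when the request is for a single title.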


Azure QnA Merging Same Questions

We are uploading a PDF with semi-structured questions and answers. QnA Maker merges identical questions when they are consecutive. If another question appears between them, QnA Maker does not merge them. For example:
Q Machine was not able to be started
Answer 1
Q Machine was not able to be started
Answer 2
Q Burning plastic smell on machine
Answer 3
QnA Maker then trains on it like this:
Q Machine was not able to be started
Answer 1
Answer 2
Q Burning plastic smell on machine
Answer 3
Why is QnA Maker behaving like this, and how can we keep identical questions separate? Any help is appreciated.
This is expected behavior. QnA works on the 'one question (and related, similar questions)' to one answer idea, and expects unique questions for the queries. The QnA documentation states:
A knowledge base consists of question and answer (QnA) sets. Each set has one answer and a set contains all the information associated with that answer. An answer can loosely resemble a database row or a data structure instance.
The required settings in a question-and-answer (QnA) set are:
a question - text of user query, used to QnA Maker's machine-learning, to align with text of user's question with different wording but the same answer
the answer - the set's answer is the response that's returned when a user query is matched with the associated question
Each set is represented by an ID.
The optional settings for a set include:
Alternate forms of the question - this helps QnA Maker return the correct answer for a wider variety of question phrasings
Metadata: Metadata are tags associated with a QnA pair and are represented as key-value pairs. Metadata tags are used to filter QnA pairs and limit the set over which query matching is performed.
Multi-turn prompts, used to continue a multi-turn conversation
QnA Maker doesn't differentiate between the two questions because it isn't two questions. It's literally the same question with two different answers.
This particular case would be a good use of QnAMaker's multi-turn prompt feature, where, after the customer has put in the query 'The machine won't start', QnA can follow up with a prompt that says "Which machine did you mean? Machine A or Machine B", and whichever they choose leads to the correct answer. I would look into Multi-Turn Conversations for your knowledgebase.
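To make the multi-turn idea concrete, here is a rough sketch of the data shape: one question whose answer is a follow-up prompt, with each choice pointing at a separate QnA set. This is a plain in-memory model for illustration only, not the QnA Maker API; the IDs, the hash layout, and the follow_up_answer helper are all assumptions.

```ruby
# Hypothetical in-memory knowledge base (NOT the QnA Maker API).
# Set 1 answers with a clarifying prompt; each prompt choice maps to
# the id of the set holding the real answer.
knowledge_base = {
  1 => { question: 'Machine was not able to be started',
         answer: 'Which machine did you mean? Machine A or Machine B',
         prompts: { 'Machine A' => 2, 'Machine B' => 3 } },
  2 => { question: 'Machine A was not able to be started',
         answer: 'Answer 1', prompts: {} },
  3 => { question: 'Machine B was not able to be started',
         answer: 'Answer 2', prompts: {} }
}

# Returns the answer reached by picking `choice` at set `set_id`,
# or nil when the choice is not one of that set's prompts.
def follow_up_answer(kb, set_id, choice)
  next_id = kb.fetch(set_id)[:prompts][choice]
  next_id ? kb.fetch(next_id)[:answer] : nil
end

puts knowledge_base[1][:answer]
puts follow_up_answer(knowledge_base, 1, 'Machine B')
```

Each answer still belongs to exactly one set, which is what the one-question-to-one-answer model expects.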

Wrong answer from QnAMaker with keyword

I have been working with the Microsoft Bot Framework v4 and QnA Maker (GA). A problem I have come across occurs when the user types a keyword like 'leave absence'. There are 10+ kinds of leave-of-absence questions. QnA Maker will send back the one with the highest score no matter what kind of leave it is (not the right answer).
I have a tree of questions to answer that looks something like this:
Leave of absence
Parental leave
Maternity leave
Care leave
etc.
Each kind can have one or more related questions, and a leave can also have a sub-leave.
When the user asks 'leave absence', the bot should answer 'Which kind of leave absence?', and the user can then ask a question about it.
When the user asks 'How many days can I have for a parental leave', the bot should answer straight from the QnA: 'You can have 10 free days'.
My question is: how can I implement this in v4 so the user receives the right answer? Is LUIS an option for this? Any suggestions?
Thank you.
It's difficult if you have question after question to ask the user. For this, you may need a separate Dialog class with a List<string> for the set of questions, built at runtime of course. At the end it could return to the original Dialog class. I have implemented something similar for job openings for different posts, each post having its own set of questions. Control remains in this QuestionnaireDialog (the separate Dialog class), which asks the next question once the user answers the current one. I don't think QnA Maker will help with this. I have not used QnA Maker or v4 much. I did the above on v3, and the intent-response mapping was in a database table.
My suggestion is to flatten your structure, if possible, from multiple levels to just two levels to avoid the tree.
For example:
Leaves --> Care Leave --> Medical Care Leave
                      --> Family Care Leave
Change the structure to
Leaves --> Medical Care Leave
       --> Family Care Leave
That way you could manage it with LUIS entities. Simply asking about leaves will bring a response listing all the types of leave available, and asking specifically about a leave type will bring a different response specific to that type. Again, I have done something similar without QnA Maker in v3. If you can't flatten the structure, you will probably have to combine the two approaches, because you want to respond to the user's specific leave-type query (LUIS entities) and also take the user through a questionnaire.
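As a rough illustration of the flattened two-level idea, the sketch below uses a simple substring match as a stand-in for LUIS entity extraction (an assumption; a real bot would call LUIS). The answer texts other than the parental leave one are placeholders, and respond_to_leave_query is a hypothetical helper name.

```ruby
# Flattened structure: every leave type sits directly under "Leaves".
# Answer texts are placeholders except the parental leave one, which
# comes from the question.
leave_answers = {
  'medical care leave' => 'Placeholder answer for medical care leave',
  'family care leave'  => 'Placeholder answer for family care leave',
  'parental leave'     => 'You can have 10 free days'
}

# Substring matching stands in for LUIS entity extraction here.
# A recognised leave type returns its specific answer; anything else
# returns a prompt listing the available types.
def respond_to_leave_query(answers, utterance)
  text = utterance.downcase
  leave_type = answers.keys.find { |type| text.include?(type) }
  return answers[leave_type] if leave_type

  "Which kind of leave? #{answers.keys.join(', ')}"
end

puts respond_to_leave_query(leave_answers, 'leave absence')
puts respond_to_leave_query(leave_answers,
                            'How many days can I have for a parental leave')
```

The generic query falls through to the list of types, while a specific leave type short-circuits straight to its answer.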

Link to a random page on Yahoo Answers?

I want to link to a random question within the "resolved questions" section of Yahoo Answers.
I've found some JS techniques that involve assigning a number to each URL, so the clicked link chooses randomly from a list I'd create that way. But there are tens of thousands of resolved questions, with new ones added every day, so that method won't work for me.
How can I link to a random "resolved question?"
You could use the Yahoo! Answers API to get the data you need (in either XML or JSON).
The documentation is available here: http://developer.yahoo.com/answers/

#! (hashbang) and Google SEO [closed]

I've read over the Google specification for crawling AJAX-enabled pages. Since part of Google's indexing method uses the URL itself, will converting to #! negatively affect SEO?
For instance, if I have a page at www.mysite.com/surfing, Google will be likely to rate it highly if a user searches for "surfing" because it has "surfing" in the URL. Would the same be true for www.mysite.com/#!surfing or does it ignore the hash fragments for the purposes of weighting the URL itself?
Perhaps you have already read in the Google AJAX-crawling instructions that the #! is actually transformed into ?_escaped_fragment_= by the Google crawler. So let's use your example:
For www.mysite.com/#!surfing, the Google crawler will see the link as www.mysite.com/?_escaped_fragment_=surfing. So the question becomes: which is better for Google SEO, a link with the parameter ?_escaped_fragment_=surfing or one without it, /surfing?
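The rewrite described above can be sketched as a small function. This is only an illustration of the mapping, not Google's code: escaped_fragment_url is a hypothetical helper, and Ruby's form encoding is used as an approximation of the scheme's escaping rules for special characters.

```ruby
require 'uri'

# Maps a hashbang URL to the form a crawler would request under the
# AJAX crawling scheme. URLs without "#!" are returned unchanged.
def escaped_fragment_url(url)
  base, fragment = url.split('#!', 2)
  return url if fragment.nil?

  # Append to an existing query string if there is one.
  separator = base.include?('?') ? '&' : '?'
  # Form encoding approximates the scheme's escaping (assumption).
  encoded = URI.encode_www_form_component(fragment)
  "#{base}#{separator}_escaped_fragment_=#{encoded}"
end

puts escaped_fragment_url('www.mysite.com/#!surfing')
# prints www.mysite.com/?_escaped_fragment_=surfing
```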
Search engine representatives have confirmed on numerous occasions that URLs with more than 2 dynamic parameters may not be spidered unless they are perceived as significantly important (i.e. have many, many links pointing to them). So unless you're using too many parameters in the URL, you don't have much to worry about. If you haven't done so already, you can read the detailed Google documentation: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started . One piece of advice: don't rely on # in your AJAX website. Use history.pushState() to change your URL to whatever you wish. I use #! only on browsers that don't support history.pushState(), like IE. The SEO problem with #! doesn't come from the URL but from the difficulty of the server-side processing needed to provide an HTML snapshot for the crawler.
The question is old.
Google no longer supports AJAX crawling:
https://webmasters.googleblog.com/2015/10/deprecating-our-ajax-crawling-scheme.html
And this document is officially deprecated:
https://developers.google.com/search/docs/ajax-crawling/docs/getting-started
So don't use hashbangs in URLs.
Traditionally, from an SEO perspective, the hash (#) is used to avoid the following issues:
-Cannibalization issues
-Affiliate URLs (Here is a good article about how to use a hash for tracking purposes instead of a question mark in the URL)
-Showing limited content on the page (pagination issues)
The usage you are referring to is what Google recommends for making AJAX pages readable by Google - https://support.google.com/webmasters/answer/174992?hl=en
For more information about the hash and its SEO benefits, check this blog post - https://digitalreadymarketing.com/adding-hash-in-urls-seo-benefits/
In my personal opinion, after 8 years in SEO and development, it won't do harm. It depends more on the site's other parameters, so adding the #! won't hurt.
Do you have the site URL so I can take a more in-depth look?
That could cause a problem if Google's crawler thought that there could be an infinite number of possibilities. Like with a ? in the url. But the answer beyond that is clear.
website.com/oreo-cookies
is more semantic and easier to understand for both people and crawlers than
website.com/#!oreo-cookies
But is this going to have a major impact? If you were a client paying me for SEO, I would tell you that your incoming text links with relevant keyword phrases from relevant related websites are far more important. I would also say that if you are submitting an XML sitemap for Google to digest, and lots of popular websites are using the #!, Google will figure it out and ignore it.
So bottom line, if my content was worth linking to, and I made sure Google was finding all my pages and indexing them, I would not worry about it.
I don't think it will harm your SEO in any way. I have been in SEO for the last 5 years and haven't experienced such a problem yet, so don't worry about it. In my opinion, you can add the #! without harm.

Searching a datastore for related topics by keyword

For example, how does StackOverflow decide other questions are similar?
When I typed in the question above and then tabbed to this memo control I saw a list of existing questions which might be the same as the one I am asking.
What technique is used to find similar questions?
I got an email from team@stackoverflow.com on Mar 20 that mentions how it works:
the "ask a question" search is exclusively on title and will not match anything in the body. It is a mystery to me why people think it's better.
The last sentence refers to the search bar, which I've found is less useful when I'm trying to find a specific question I've already seen.
I think it's plain old word matching. However, I might add that this feature does not work as well as I would like it to. It's much better to do a Google search with the site:stackoverflow.com prefix than to rely on SO to provide relevant suggestions.
Poorly -- using MS SQL Full Text Search, I believe. You'll have better luck using Lucene, IMO. For more background on the topic see the Wikipedia article on Lucene or the general topic of information retrieval.
The matching program would store an index of all questions. When you ask a question, all keywords in your question are matched against the index. This is similar to Google Search. Lucene open source search can be (and with high probability has been) used for this. Since the results are not quite accurate, I presume they index just the headlines of the questions, as an approximation.
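A minimal sketch of that kind of title-only keyword index might look like the following. It is a guess at the general technique, not how Stack Overflow or Lucene actually does it. The sample titles, the minimum word length of 3, and all helper names are assumptions made for illustration.

```ruby
# Splits a title into unique lowercase keywords, dropping very short words.
def keywords(text)
  text.downcase.scan(/[a-z0-9#+]+/).reject { |word| word.length < 3 }.uniq
end

# Builds an inverted index: keyword => ids of questions whose title has it.
def build_index(titles)
  index = Hash.new { |hash, key| hash[key] = [] }
  titles.each do |id, title|
    keywords(title).each { |word| index[word] << id }
  end
  index
end

# Ranks indexed questions by how many keywords they share with the new
# title (ties broken by id), returning at most `limit` ids.
def similar_questions(index, new_title, limit = 5)
  scores = Hash.new(0)
  keywords(new_title).each do |word|
    index.fetch(word, []).each { |id| scores[id] += 1 }
  end
  scores.sort_by { |id, score| [-score, id] }.first(limit).map(&:first)
end

# Sample titles (assumed data for the example).
titles = {
  1 => 'Extract nested JSON data from variable key element',
  2 => 'Searching a datastore for related topics by keyword',
  3 => 'Wrong answer from QnAMaker with keyword'
}
index = build_index(titles)
p similar_questions(index, 'Search related topics using a keyword')
# => [2, 3]
```

Note that 'Search' does not match 'Searching' here. Exact word matching like this is one plausible reason such suggestions can feel imprecise, and stemming is what a full-text engine would add.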
The other related keyword is collaborative filtering, the algorithm popularized by Amazon to recommend products based on behavior of other similar customers. In the current case, an alternative algorithm based on collaborative filtering is: keywords are extracted from the question, then tags associated (in the history) with the keywords are found. Questions which have those tags are returned. Well, experiments are needed to see whether it works well at all.
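The tag-based alternative described above could be sketched like this. It is only a toy version of the idea (keywords, then historically associated tags, then questions carrying those tags), not real collaborative filtering. Both lookup tables are made-up sample data and related_by_tags is a hypothetical helper.

```ruby
# Made-up history: keyword => tags seen on past questions containing it.
keyword_tags = {
  'json'     => %w[ruby json],
  'hashbang' => %w[seo ajax],
  'crawler'  => %w[seo]
}

# Made-up tagged questions: question id => its tags.
question_tags = {
  1 => %w[ruby json],
  2 => %w[seo ajax],
  3 => %w[search]
}

# Collects the tags associated with the given keywords, then returns
# the ids of questions that carry at least one of those tags.
def related_by_tags(keyword_tags, question_tags, words)
  tags = words.flat_map { |word| keyword_tags.fetch(word, []) }.uniq
  question_tags.select { |_id, q_tags| (q_tags & tags).any? }.keys
end

p related_by_tags(keyword_tags, question_tags, %w[hashbang crawler])
# => [2]
```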
