How do you insert text/value in google doc using Watir/ruby - ruby

The google doc is embedded on a website inside an iframe.
Here is the code I use to try to insert random text into the Google Doc:
sample_text = Faker::Lorem.sentence
$browser.iframe(id: 'google_iframe').div(xpath: "//div[@class='kix-lineview']//div[contains(@class, 'kix-lineview-content')]").send_keys(sample_text)
$advanced_text_info['google_doc_article'] = sample_text
But I'm getting an error when I run the test:
element located, but timed out after 30 seconds, waiting for #<Watir::Div: located: true; {:id=>"google_iframe", :tag_name=>"iframe"} --> {:xpath=>"//div[@class='kix-lineview']//div[contains(@class, 'kix-lineview-content')]", :tag_name=>"div"}> to be present (Watir::Exception::UnknownObjectException)

Problem
The root of the problem is how Google Docs has implemented their application. The div you are writing to does not include the contenteditable attribute:
<div class="kix-lineview-content" style="margin-left: 0px; padding-top: 0px; top: -1px;">
As a result, Selenium does not consider this element in an interactable state. A Selenium::WebDriver::Error::ElementNotInteractableError exception is raised. You can see this by bypassing Watir and calling Selenium directly:
browser.div(class: 'kix-lineview-content').wd.send_keys('text')
#=> Selenium::WebDriver::Error::ElementNotInteractableError (element not interactable)
Watir confuses the matter by hiding the exception. As seen in the following code, the not-interactable error is re-raised as an element-not-present exception, UnknownObjectException. I am not sure if this was intentional (e.g., for backwards compatibility) or an oversight.
rescue Selenium::WebDriver::Error::ElementNotInteractableError
raise_present unless browser.timer.remaining_time.positive?
raise_present unless %i[wait_for_present wait_for_enabled wait_for_writable].include?(precondition)
retry
In summary, the error occurs because the element is not considered interactable, not because it isn't present.
Solution
I think the best approach here is to make the div interactable before trying to set the text. You can do this by adding the contenteditable attribute yourself:
content = browser.div(class: 'kix-lineview-content') # This was sufficient when using Google Docs directly - update this for your specific iframe
content.execute_script('arguments[0].setAttribute("contenteditable", "true")', content)
content.send_keys(sample_text)

Related

Is there something special about a file type input that makes it not able to find?

I'm looking to add files to an <input type="file">
Here is a snippet of the html
<span class="btn btn-xs btn-primary btn-file"> #found
<span class="blahicon blahicon-upload"></span>
Browse
<input type="file" data-bind="value: fileName, event: { change: uploadImagesOnChange }"
accept="blah/txt" multiple=""> #not found
</span>
and here's the capybara and ruby
within_frame('frame1') do
within_frame('frame2') do
within(:xpath, [containing span xpath]) do # finds this
find(:xpath, './/*[@type="file"]').send_keys('C:\Users\...\blah.txt') # ElementNotFound
end
end
end
I see no hidden block and it's super scoped. Any thoughts?
Instead of using within(:xpath, [containing span xpath]) can you directly check the xpath of input like: has_xpath?(".//span[contains(text(),'Browse')]/input")
and if it is returning true then try with using
find(:xpath, ".//span[contains(text(),'Browse')]/input").send_keys('C:\Users\...\blah.txt')
If you are familiar with the 'pry' gem, you can debug this by trying various XPath combinations interactively instead of rerunning the whole script. That way you will find the actual problem.
I think there is a mistake in your XPath:
.//*[@type="file"]
Change it to
.//*[@type='file']
as the browser can detect an attribute value in double quotes (" "), but inside a script you need to use single quotes (' ').
You can also try different variations of your XPath, like
//input[@type='file']
Judging by the class btn-file on the wrapper element, it's probable you're using Bootstrap and one of the "standard" methods of hiding the actual file input element so that it can be styled the same across multiple browsers. There are a number of ways of hiding the input, from just setting display:none on it to the more "modern" method of expanding it to the same size as the replacement button and setting its opacity to 0 so it becomes a transparent overlay on the replacement.
The basic strategy for dealing with this kind of setup in Capybara is to use execute_script to make the element visible first and then use attach_file or set as normal. For instance, if your site is using the opacity method of hiding the file element, you could do something like
within_frame('frame1') do
within_frame('frame2') do
within(:xpath, [containing span xpath]) do # finds this
file_input = find(:file, visible: :all)
page.driver.browser.execute_script("$(arguments[0]).css('opacity',1)", file_input.native)
file_input.set('C:\Users\...\blah.txt')
end
end
end
Note - this code assumes you're using jQuery in your page and will only work with the Selenium driver, since it relies on Selenium's driver-specific ability to pass elements from Capybara to Selenium in the execute_script call. If you're not using jQuery, the JS will need to change, and if you're using another driver, you would need to find the element in the JS script using DOM methods and then modify its opacity.

Web scraping from youtube with nokogiri

I want to scrape all the names of the users who commented below a youtube video.
I'm using ruby and nokogiri.
require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "https://www.youtube.com/watch?v=tntOCGkgt98"
doc = Nokogiri::HTML(open(url))
doc.css(".comment-thread-renderer > .comment-renderer").each do |comment|
name = comment.css("#comment-section-renderer-items .g-hovercard").text
puts name
end
But it's not working, I'm not getting any output, no error either.
I won't be able to give you a solution, but at least I can give you a couple of hints that may help you to move forward.
The code you have is not working because the comments section is loaded via an AJAX call after the page is loaded. If you do a hard reload in your browser, you will see a spinner icon and a Loading... text in the comments section while it waits for the content to be loaded. When Nokogiri gets the page via the HTTP request, it gets the HTML content that you see before the comments are loaded. As a matter of fact, the place where the comments will later be added looks like:
<div id="watch-discussion" class="branded-page-box yt-card">
<div id="comment-section-renderer"
class="comment-section-renderer vve-check"
data-visibility-tracking="CCsQuy8iEwjr3P3u1uzNAhXIepAKHRV9D8Ao-B0=">
<div class="action-panel-loading">
<p class="yt-spinner ">
<span class="yt-spinner-img yt-sprite" title="Loading icon">
</span>
<span class="yt-spinner-message">Loading...</span>
</p>
</div>
</div>
</div>
That is the reason why you won't find the divs you are looking for, because they aren't part of the html you have.
Looking at the network console in the browser, it seems that the ajax request to get the comments data is being sent to https://www.youtube.com/watch_fragments_ajax?v=tntOCGkgt98&tr=time&distiller=1&ctoken=EhYSC3RudE9DR2tndDk4wAEAyAEA4AEBGAY%253D&frags=comments&spf=load. As you can see the v parameter is the video id, however there are a couple of caveats:
There is a ctoken param, which you can get by scraping the original page contents. It is inside a <script> tag, in the form of
'COMMENTS_TOKEN': "<token>".
However, you still need to send a session_token as form data in the body of the AJAX request (which is a POST). I don't know where that token comes from :(.
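The first caveat can be sketched in plain Ruby. This is only an illustration under assumptions: the `extract_comments_token` helper and the sample page fragment are made up, and the regex assumes the token appears exactly in the 'COMMENTS_TOKEN': "<token>" form quoted above. It does not solve the session_token problem.

```ruby
# Hypothetical helper: pull the comments token out of the raw page source.
# Assumes the token sits in a <script> tag as 'COMMENTS_TOKEN': "<token>".
def extract_comments_token(html)
  match = html.match(/'COMMENTS_TOKEN':\s*"([^"]+)"/)
  match && match[1]
end

# Made-up page fragment standing in for the real response body.
SAMPLE_PAGE = <<~HTML
  <script>
    var config = {'COMMENTS_TOKEN': "FAKE_TOKEN_123", 'OTHER_KEY': "x"};
  </script>
HTML

puts extract_comments_token(SAMPLE_PAGE)
```

The helper returns nil when the pattern is not found, so the caller can tell "page layout changed" apart from an empty token.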
I think that you will be pushing the limits of Nokogiri here, as AFAIK it is not intended to follow AJAX requests or handle JavaScript. Maybe the Ruby Selenium driver is better suited for this.
HTH
I think you need name.css("#comment-section..."
The each statement will iterate over the elements, using the variable name.
You may want to use node instead of name:
doc.css(".comment-thread-renderer > .comment-renderer").each do |node|
name = node.css("#comment-section-renderer-items .g-hovercard").text
puts name
end
I wrote this rails app using nokogiri to see all the tags that a page has before any javascript is run in the browser. The source code is here, so you can adjust it if you need to add more info about the node in the view.
That can easily tell you if the particular tag element that you are looking for is something you can retrieve without having to do some JS eval.
Most web crawlers don't support client-side rendering, which gives you an idea that it's not a trivial task to execute JS when scraping content.
YouTube is a dynamically rendered JavaScript website, though it can be parsed with Nokogiri without using Selenium or another package. Try opening the Network tab in dev tools, scrolling to the comment section, and seeing what request is being sent.
You need to make a post request in order to fetch comments data. You can preview the output in the "Preview" tab.
[Screenshots: the response in the "Preview" tab and the equivalent comment on the video page]
Note: Since this comment brings very little value, this answer will be updated with the attached code once there will be an available solution.

How to get the last element in Selenium IDE thats always changing

In the code:
<img style="cursor: pointer; background-color: transparent;" onclick="changeTeamHint(2155);" src="/images/icon_edit_inactive.png">
onclick="changeTeamHint(2155);" is always changing. The number increases as I add more fields to my form.
How do I get the last element in Selenium IDE? For example, if I add a text field with "changeTeamHint(2156)", how do I use Selenium IDE to select the latest one? If it's 2157, it selects 2157, and so on.
Here's what I've got so far: xpath=(//img[@style='cursor:pointer;'])[last()]
But my coworker told me to try to find it by onclick, and here's what I've got that works: xpath=(//img[@onclick='changeTeamHint(2155);'])[last()]
I tried: xpath=(//img[@onclick='changeTeamHint();'])[last()], but it gives me an error.
I think you want to do a starts-with on the value of the onclick attribute:
xpath=(//img[starts-with(@onclick, 'changeTeamHint')])[last()]
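To see why the starts-with predicate picks the newest element, here is a small Ruby sketch that runs the same predicate outside Selenium IDE. It assumes the REXML library that ships with standard Ruby installs; the markup and the `last_onclick` helper are made up for illustration, and taking the last match in Ruby stands in for the [last()] step.

```ruby
require 'rexml/document'

# Made-up markup: edit icons whose onclick number keeps growing,
# followed by an unrelated image that must not be picked.
FORM_XML = <<~XML
  <form>
    <img onclick="changeTeamHint(2155);" src="/images/icon_edit_inactive.png"/>
    <img onclick="changeTeamHint(2156);" src="/images/icon_edit_inactive.png"/>
    <img onclick="changeTeamHint(2157);" src="/images/icon_edit_inactive.png"/>
    <img onclick="otherHandler(1);" src="/images/other.png"/>
  </form>
XML

# Hypothetical helper: onclick value of the last <img> whose onclick
# attribute starts with the given prefix, or nil if none matches.
def last_onclick(xml, prefix)
  doc = REXML::Document.new(xml)
  matches = REXML::XPath.match(doc, "//img[starts-with(@onclick, '#{prefix}')]")
  # Matches come back in document order, so the last one mirrors [last()].
  last = matches.last
  last && last.attributes['onclick']
end

puts last_onclick(FORM_XML, 'changeTeamHint')
```

Because the predicate matches on the prefix only, the locator keeps working as the number in the argument grows.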

Watin - Cannot find link element (Javascript)

I'm using watin to automate a process on a web system (internal).
I can open the website and access some links, but others cannot be found. I think this may be either because they are deeply nested or because the href is javascript. This is the format they are in:
<frame>
<html>
<frameset>
<frame>
<html>
<body>
<div>
<table>
<table>
<tr>
<td>
<a id="1_1_1_a" href="javascript:blah"></a>
I've tried various different ways to find by id, element type etc. But I'm stuck on this.
Can anyone help?
Thanks
Elements within a frame have to be searched through that frame because each frame is considered a separate namespace in WatiN. So first get the correct frame (by calling the Frame method or through the Frames property; note that in your example you must do it twice, as you have two nested frames) and then search for the link, e.g.:
ie.Frames[0].Frames[0].Link("1_1_1_a")
Try this
using (var browser = new IE("your_web_site_here"))
{
try
{
Frame first_frame = browser.Frame(Find.ById("frame_1_id"));
Frame second_frame = first_frame.Frame(Find.ById("frame_2_id"));
var first_div = second_frame.Div(Find.ById("first_div_id"));
Assert.IsTrue(first_div.Exists);
var Link_to_click = first_div.Link(Find.ById("1_1_1_a"));
Assert.IsTrue(Link_to_click.Exists);
Link_to_click.Click();
}
catch (Exception ex)
{
}
}
Sometimes WatiN cannot find elements. I'm still trying to find out why and what the preconditions are.
I had a similar issue where my test would browse to a page that had 40 links on it and even though you could see all the links on the page the following statement would show links (and other elements collections) had a count of zero.
ie.Links.Count();
I'm still not sure exactly what the underlying issue is but I discovered that if you start Gallio Icarus / Gallio Echo / Visual Studio (or whatever test runner you are using) by Right Click -> Run as administrator the test works as expected and the elements are loaded into the ie browser object correctly.

Preserving Ajax page state with URL hash

There is a page on my site with two sets of tabs, each tab's link is ajax-driven but has a proper href in case javascript is not enabled. I'm about to implement an ajax 'back-button' solution using a plugin such as jQuery Address.
My problem/confusion with this solution is that a page's default content is still loaded before the javascript has a chance to parse the hash and load the correct content. If I initially hide the content, non-javascript users will never see anything. If I don't initially hide the content, the user will see the wrong page for a moment before it gets updated (besides the extra overhead of first loading the wrong tab and then the correct tab).
What are the best / most common approaches to dealing with this?
Thanks, Brian
If you use hashes, you will always have the wrong content first. You need to use a server-side solution with the HTML5 History API to avoid this.
You can use:
https://github.com/browserstate/ajaxify
And have the tabs render on the server side with something like if ( $_GET['tab'] === '2' ) // render 2
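The server-side half of that idea could look roughly like this in Ruby. It is a sketch under assumptions: the `active_tab` helper, the example.com URLs, and the set of allowed tab values are hypothetical; only the tab parameter name comes from the snippet above.

```ruby
require 'uri'

# Hypothetical helper: decide which tab the server should render first,
# based on the ?tab= query parameter of the requested URL.
def active_tab(url, default: '1', allowed: %w[1 2])
  query = URI.parse(url).query.to_s
  params = URI.decode_www_form(query).to_h
  tab = params['tab']
  # Fall back to the default tab for missing or unknown values.
  allowed.include?(tab) ? tab : default
end

puts active_tab('https://example.com/page?tab=2')
puts active_tab('https://example.com/page')
```

Since the History API keeps the tab in a real URL rather than a hash, the server sees the parameter on the first request and can render the correct tab immediately, with or without JavaScript.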
I think this is a good question. Have you tried using the <noscript> tag to include CSS that shows the content that's initially hidden for JS users? Something like this:
<style type="text/css">
#area-1, #area-2 { display: none; }
</style>
<noscript>
<style type="text/css">
#area-1, #area-2 { display: block; }
</style>
</noscript>
Hope this helps!
