Hello, is there a way to select (mark) only a certain part of the text?
I cannot find the right solution anywhere.
I tried double_click, flash, and select_text, but none of them worked for me.
This works, but it selects everything: browser.send_keys [:control, 'a']
I added a picture showing what I want to do.
Thank you for your answers.
The red rectangle shows the markings
You can use the Element#select_text method. Note that prior to Watir 6.8, you will need to manually include the extension (method).
Here is a working example using the Wikipedia page:
require 'watir'
require 'watir/extensions/select_text' # only include this if using Watir 6.7 or prior
browser = Watir::Browser.new
browser.goto('https://en.wikipedia.org/wiki/XPath')
browser.body.select_text('XPath may be used')
sleep(5) # so that you can see the selection
Note that this will highlight only the first match. You may want to restrict the search to a specific element rather than the entire body.
Here is another example using ckeditor.com:
require 'watir'
require 'watir/extensions/select_text' # only include this if using Watir 6.7 or prior
browser = Watir::Browser.new
browser.goto('ckeditor.com/')
frame = browser.iframe(class: 'cke_wysiwyg_frame')
frame.p.select_text('Bake the brownies')
browser.link(href: /Bold/).click
sleep(10)
I know how to find an element using Nokogiri. I know how to click a link using Mechanize. But I can't figure out how to find a specific link and click it. This seems like it should be really easy, but for some reason I can't find a solution.
Let's say I'm just trying to click on the first result on a Google search. I can't just click the first link with Mechanize, because the Google page has a bunch of other links, like Settings. The search result links themselves don't seem to have class names, but they're enveloped in <h3 class="r"></h3>.
I could just use Nokogiri to follow the href value of the link like so:
document = open("https://www.google.com/search?q=stackoverflow")
parsed_content = Nokogiri::HTML(document.read)
href = parsed_content.css('.r').children.first['href']
new_document = open(href)
# href is equal to "/url?sa=t&rct=j&q=&esrc=s&source=web&url=https%3A%2F%2Fstackoverflow.com%2F"
but it's not a direct url, and going to that url gives an error. The data-href value is a direct url, but I can't figure out how to get that value - doing the same thing except with ...first['data-href'] returns nil.
Anyone know how I can just find the first .r element on the page and click the link inside it?
Here's the start to my action:
require 'open-uri'
require 'nokogiri'
require 'mechanize'
document = open("https://www.google.com/search?q=stackoverflow")
parsed_content = Nokogiri::HTML(document.read)
Here's the .r element on the Google search results page:
<h3 class="r">
Stack Overflow
</h3>
You should make sure the code in your question is the code you actually ran. It looks like it is not, because you don't surround the URL in quotes, and the CSS selector should be .r a, not r. You use .r a because you want to access the link inside elements with the r class.
Anyway, you can use the approach detailed here like so:
require 'open-uri'
require 'nokogiri'
require 'uri'
base_url = "https://www.google.com/search?q=stackoverflow"
document = URI.open(base_url) # Kernel#open no longer opens URLs on Ruby 3.0+
parsed_content = Nokogiri::HTML(document.read)
href = parsed_content.css('.r').first.children.first['href']
new_url = URI.join base_url, href
new_document = URI.open(new_url)
I tested this and following new_url does redirect to StackOverflow as expected.
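To see what URI.join does with the relative href from the question, here is a small sketch that runs without any network access. The href string is the one quoted in the question. Reading the target from the `url` query parameter is an assumption based on that one sample, not a documented Google behaviour.

```ruby
require 'uri'

base_url = 'https://www.google.com/search?q=stackoverflow'
# The relative href quoted in the question above.
href = '/url?sa=t&rct=j&q=&esrc=s&source=web&url=https%3A%2F%2Fstackoverflow.com%2F'

# Resolve the relative href against the page it was found on.
new_url = URI.join(base_url, href)
puts new_url

# Assumption: the real target is carried in the `url` query parameter,
# as it is in this sample href.
params = URI.decode_www_form(new_url.query).to_h
direct_url = params['url']
puts direct_url
```

The first form is the redirect URL that can be followed as shown above. The second skips the redirect, but only works as long as the href keeps that shape.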
I am new to Ruby, and we are using a Ruby Selenium framework to automate PDF verification testing.
I want to verify the content of PDF, like text and also get the position of the text. Along with that I also need to get the text at a given position.
Something like this maybe
require 'pdf-reader'
require 'open-uri'
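reader = PDF::Reader.new(URI.open("SAMPLE_URL")) # my resume pdf
page = reader.pages.first
lines = page.text.split("\n") # Page#text returns the page's text as a String
# indexes of the lines that contain the search text
text_match_line_numbers = (0...lines.length).select do |i|
  lines[i].include?("text")
end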
Look at their docs here, there are more advanced options for navigating the PDF page.
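Once the page text is available as a string, finding where a phrase sits and reading back what is at a given spot can be sketched in plain Ruby. This is only a rough illustration: `page_text` is a made-up stand-in for the extracted text, `find_text` and `text_at` are hypothetical helper names, and "position" here means line and column within the extracted text, not x/y coordinates on the PDF page.

```ruby
# Stand-in for the extracted page text; a real run would get this from pdf-reader.
page_text = <<~TEXT
  Jane Doe
  Skills: Ruby, Selenium
  Contact: jane@example.com
TEXT

lines = page_text.split("\n")

# Return [line_index, column_index] pairs for every line containing needle.
def find_text(lines, needle)
  lines.each_with_index.filter_map do |line, i|
    col = line.index(needle)
    [i, col] if col
  end
end

# Return the text at a given line and column, or nil if the line does not exist.
def text_at(lines, line_index, column, length)
  line = lines[line_index]
  line && line[column, length]
end

positions = find_text(lines, 'Ruby')
puts positions.inspect
puts text_at(lines, 1, 8, 4)
```

Both indexes are zero-based, and filter_map needs Ruby 2.7 or later.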
In the past, I have successfully used Nokogiri to scrape websites using a simple Ruby script. For a current project, I need to scrape a website that only uses inline CSS. As you can imagine, it is an old website.
What possibilities do I have to target specific elements on the page based on the inline CSS of the elements? It seems this is not possible with Nokogiri or have I overlooked something?
UPDATE: An example can be found here. I basically need the main content without the footnotes. The latter have a smaller font size and are grouped below each section.
I'm going to teach you how to fish. Instead of trying to find what I want, it's sometimes a lot easier to find what I don't want and remove it.
Start with this code:
require 'nokogiri'
require 'open-uri'
URL = 'http://www.eximsystems.com/LaVerdad/Antiguo/Gn/Genesis.htm'
FOOTNOTE_ACCESSORS = [
'span[style*="font-size: 8.0pt"]',
'span[style*="font-size:8.0pt"]',
'span[style*="font-size: 7.5pt"]',
'span[style*="font-size:7.5pt"]',
'font[size="1"]'
].join(',')
doc = Nokogiri.HTML(URI.open(URL)) # Kernel#open no longer opens URLs on Ruby 3.0+
doc.search(FOOTNOTE_ACCESSORS).each do |footnote|
footnote.remove
end
File.write(File.basename(URI.parse(URL).path), doc.to_html)
Run it, then open the resulting HTML file in your browser. Scroll through the file looking for footnotes you want to remove. Select part of their text, then use "Inspect Element", or whatever tool you have that will find that selected text in the source of the page. Find something unique in that text that makes it possible to isolate it from the text you want to keep. For instance, I locate footnotes using the font-sizes in <span> and <font> tags.
Keep adding accessors to the FOOTNOTE_ACCESSORS array until you have all undesirable elements removed.
This code isn't complete, nor is it written as tightly as I'd normally do it for this sort of task, but it will give you an idea how to go about this particular task.
This is a version that is a bit more flexible:
require 'nokogiri'
require 'open-uri'
URL = 'http://www.eximsystems.com/LaVerdad/Antiguo/Gn/Genesis.htm'
FOOTNOTE_ACCESSORS = [
'span[style*="font-size: 8.0pt"]',
'span[style*="font-size:8.0pt"]',
'span[style*="font-size: 7.5pt"]',
'span[style*="font-size:7.5pt"]',
'font[size="1"]',
]
doc = Nokogiri.HTML(URI.open(URL)) # Kernel#open no longer opens URLs on Ruby 3.0+
FOOTNOTE_ACCESSORS.each do |accessor|
doc.search(accessor).each do |footnote|
footnote.remove
end
end
File.write(File.basename(URI.parse(URL).path), doc.to_html)
The major difference is the previous version assumed all entries in FOOTNOTE_ACCESSORS were CSS. With this change XPath can also be used. The code will take a little bit longer to run as the entries are iterated over, but the ability to dig in with XPath might make it worthwhile for you.
You can do something like:
doc.css('*[style*="foo"]')
That will select any element with foo appearing anywhere in its style attribute.
I use watir in radrails IDE.
How can I find and print all existing window titles?
I have only been able to get all windows using win32ole. Note that this only works for IE (which I assume you are using since the question's tag is watir not watir-webdriver).
The following is an example of outputting all titles:
require 'win32ole'
WIN32OLE.new('Shell.Application').Windows.each do |window|
if window.path =~ /Internet Explorer/
puts window.Document.Title
end
end
I'm writing a sample test with Watir where I navigate around a site with the IE class, issue queries, etc.
That works perfectly.
I want to continue by using PageContainer's methods on the last page I landed on.
For instance, using its HTML method on that page.
Now I'm new to Ruby and just started learning it for Watir.
I tried asking this question on OpenQA, but for some reason the Watir section is restricted to normal members.
Thanks for looking at my question.
edit: here is a simple example
require "rubygems"
require "watir"
test_site = "http://wiki.openqa.org/"
browser = Watir::IE.new
browser.goto(test_site)
# now if I want to get the HTML source of this page, I can't use the IE class
# because it doesn't have a method which supports that
# the PageContainer class, does have a method that supports that
# I'll continue what I want to do in pseudo code
Store HTML source in text file
# I know how to write to a file, so that's not a problem;
# retrieving the HTML is the problem.
# more specifically, using another Watir class is the problem.
Close browser
# end
Currently, the best place to get answers to your Watir questions is the Watir-General email list.
For this question, it would be nice to see more code. Is the application under test (AUT) opening a new window/tab that you were having trouble getting to and therefore wanted to try the PageContainer, or is it just navigating to a second page?
If it is the first, you want to look at #attach. If it is the second, I would recommend reading the quick start tutorial.
Edit after code added above:
What I think you missed is that Watir::IE includes the Watir::PageContainer module. So you can call browser.html to get the html displayed on the page to which you've navigated.
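If module inclusion is new to you, this toy sketch shows the mechanism. ToyPageContainer and ToyBrowser are hypothetical stand-ins, not the real Watir classes, so it runs without IE or Watir installed. It also writes the HTML to a file, which is the step from the pseudo code in the question.

```ruby
require 'tmpdir'

# Hypothetical stand-in for Watir::PageContainer.
module ToyPageContainer
  def html
    "<html><body>#{page_text}</body></html>"
  end
end

# Hypothetical stand-in for Watir::IE.
class ToyBrowser
  include ToyPageContainer # every ToyBrowser instance gains #html

  def page_text
    'Hello from the toy browser'
  end
end

browser = ToyBrowser.new

# Store the HTML source in a text file.
path = File.join(Dir.tmpdir, 'page_source.html')
File.write(path, browser.html)
puts File.read(path)
```

With the real library the same idea should apply: call browser.html on the object returned by Watir::IE.new and write the result to a file.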
I agree. It seems to me that browser.html is what you want.