Watir: Retrieve all dynamic HTML elements that match an attribute? - ruby

I am trying to scrape dynamic content with Watir and I am stuck.
Basically, I know that I can use
browser.element(css: ".some_class").wait_until_present
in order to scrape only when "some_class" is loaded.
The problem is that it is only giving me the first element having this class name and I want all of them.
I also know I can use
browser.spans(css: ".some_class")
in order to collect ALL the elements having this class name. The problem is that I can't combine it with "wait_until_present" (it gives me an error), and spans on its own is not working because the content is not loaded yet; the page is using JavaScript.
Is there a way to combine both? That means waiting for the class_name to be loaded AND select all the elements matching this class name, not just the first one?
I've been stuck for ages...
Thanks a lot for your help

There currently isn't anything in Watir for waiting for a collection of elements (though I had been recently thinking about adding something). For now, you just have to manually wait for an element to appear and then get the collection.
The simplest way is to call both of your lines:
browser.element(css: ".some_class").wait_until_present
browser.spans(css: ".some_class")
If you wanted to one-liner it, you could use #tap:
browser.spans(css: ".some_class").tap { |c| c[0].wait_until_present }
#=> Watir::SpanCollection
Note that if you are just checking the class name, you might want to avoid writing the CSS selector. Not only is the class locator easier to read, the CSS selector also won't be as performant.
browser.spans(class: "some_class").tap { |c| c[0].wait_until_present }
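If you would rather wait until the collection itself is non-empty, the same idea can be sketched as a small polling helper. This is only an illustration: wait_for_collection is a hypothetical name, not part of Watir, and the array below is a stand-in for a page that renders its spans late, so the snippet runs without a browser.

```ruby
# Hypothetical helper (not part of Watir): poll a block until it returns a
# non-empty collection, or raise once the timeout has elapsed.
def wait_for_collection(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    found = yield
    return found unless found.empty?
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "no matching elements after #{timeout} seconds"
    end
    sleep interval
  end
end

# Stand-in for a page that loads its content late: the first two lookups
# find nothing, the third finds two elements.
lookups = [[], [], %w[first second]]
found = wait_for_collection(timeout: 2, interval: 0.01) { lookups.shift || [] }
puts found.length
```

With Watir, the block would presumably be something like browser.spans(class: "some_class").to_a, so that the page is queried again on every attempt.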

Related

How to get the actual Hyperlink element inside the main document part using docx4j

So I have a case where I need to be able to work on the actual Hyperlink element inside the body of the docx, not just the target URL or the internal/externality of the link.
As a possible additional wrinkle this hyperlink wasn't present in the docx when it was opened but instead was added by the docx4j-xhtmlImporter.
I've iterated the list of relationships here: wordMLPackage.getMainDocumentPart().getRelationshipsPart().getRelationships().getRelationship()
And found the relationship ID of the hyperlink I want. I'm trying to use an XPath query: List<Object> results = wordMLPackage.getMainDocumentPart().getJAXBNodesViaXPath("//w:hyperlink[@r:id='rId11']", false);
But the list is empty. I also thought that it might need a refresh because I added the hyperlink at runtime so I tried with the refreshXMLFirst parameter set to true. On the off chance it wasn't a real node because it's an inner class of P, I also tried getJAXBAssociationsForXPath with the same parameters as above and that doesn't return anything.
Additionally, even XPath like "//w:hyperlink" fails to match anything.
I can see the hyperlinks in the XML if I unzip it after saving to a file, so I know the ID is right: <w:hyperlink r:id="rId11">
Is XPath the right way to find this? If it is, what am I doing wrong? If it's not, what should I be doing?
Thanks
XPathHyperlinkTest.java is a simple test case which works for me
You might be having problems because of JAXB, or possibly because of the specific way in which the binder is being set up in your case (do you start by opening an existing docx, or creating a new one?). Which docx4j version are you using?
Which JAXB implementation are you using? If it's the Sun/Oracle implementation (the reference implementation, or the one included in their JDK/JRE), it might be this which is causing the problem, in which case you might try using MOXy instead.
An alternative to using XPath is to traverse the docx; see finders/ClassFinder.java
Try without namespace binding
List<Object> results = wordMLPackage.getMainDocumentPart().getJAXBNodesViaXPath("//*:hyperlink[@*:id='rId11']", false);

How can I efficiently pull specific information from JSON

I currently have a public Google calendar from which I am successfully pulling JSON data using Google's API.
I am using HTTParty to convert the JSON to a Ruby object.
response = HTTParty.get('http://www.google.com/calendar/feeds/colorado.edu_mdpltf14q21hhg50qb3e139fjg@group.calendar.google.com/public/full?alt=json&orderby=starttime&max-results=15&singleevents=true&sortorder=ascending&futureevents=true')
I want to retrieve many titles, event names, start times, end times, etc. I can get these with commands like
response["feed"]["title"]["$t"]
for the calendar's title, and
response["feed"]["entry"][0]["title"]["$t"]
for the event's title.
My question is two-fold. One, is there a simpler way to pull this data? Two, how can I go about pulling multiple events' information? I tried:
response.each do |x| response["feed"]["title"]["$t"]
but that spits out a "no implicit conversion of String into Integer" error.
Based on your examples this should do it
response["feed"]["entry"].map {|entry| entry["title"]["$t"] }
response['feed']['entry'] is a simple array of hashes. It is probably best to extract that array to a temporary variable with
entries = response['feed']['entry']
Thereafter your code depends entirely on what you need to achieve. For instance, using the URL that you have provided
puts entries.length
shows
2
And
entries.each do |entry|
  puts entry['title']['$t']
end
gives
NEW EVENT
Future EVENT
If we can help you to achieve something specific then please edit your question or ask for clarification in a comment.
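As a rough sketch of the same approach, the lookups can be wrapped in a small method that uses Hash#dig, which returns nil instead of raising when a key is missing. The sample JSON and the event_titles name are made up for illustration and contain only the keys mentioned above; the real feed has many more.

```ruby
require 'json'

# Hypothetical sample shaped like the feed in the question; only the keys
# used in the question and answers are included.
SAMPLE_FEED = <<~'JSON'
  {
    "feed": {
      "title": { "$t": "Sample calendar" },
      "entry": [
        { "title": { "$t": "NEW EVENT" } },
        { "title": { "$t": "Future EVENT" } }
      ]
    }
  }
JSON

# Returns the "$t" text of every entry title, or an empty array when the
# feed has no entries.
def event_titles(response)
  entries = response.dig('feed', 'entry') || []
  entries.map { |entry| entry.dig('title', '$t') }
end

response = JSON.parse(SAMPLE_FEED)
calendar_title = response.dig('feed', 'title', '$t')
titles = event_titles(response)

puts calendar_title
puts titles
```

With HTTParty you would presumably pass response.parsed_response, which is the parsed hash, instead of calling JSON.parse yourself.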

Is it possible to drill down to an element with page object

I'm trying to use Cheezy's page-object gem for everything in order to be consistent. However I haven't been able to find how to drill down to an element like this. The situation here is that there would be more than one link with all the same tags so you have to drill down from something identifiable.
@browser.p(:text => /#{app_name}/i).link(:text => 'Add').click
The code I'm looking for would be something like this to click on a link located inside of a paragraph but it doesn't work.
p(:pgraph, id: => 'some-pgraph')
link(:lnk, text: => 'add')
self.pgraph.lnk
Is there a way to do this with page object?
Thanks,
Adam
You can use blocks to define accessors with more complicated locating strategies.
If you want to also keep a reference to the paragraph:
p(:pgraph, id: 'some-pgraph')
link(:lnk){ pgraph_element.link_element(text: 'add') }
Or if you do not need the paragraph for other things, you might do:
link(:lnk){ paragraph_element(id: 'some-pgraph').link_element(text: 'add') }
Basically you can use a block with nested elements, to define accessors similar to how you would in Watir.
Note that if you want to specify the id dynamically at run time, you can always define a method to click the link instead of using the accessors:
def click_link_in(paragraph_id)
  paragraph_element(id: paragraph_id).link_element(text: 'add').click
end

How to get content between HTML tags that have been loaded by jQuery?

I'm loading data using jQuery (AJAX), which is then being loaded into a table (so this takes place after page load).
In each table row there is a 'select' link allowing users to select a row from the table. I then need to grab the information in this row and put it into a form further down the page.
$('#selection_table').on('click', '.select_link', function() {
$('#booking_address').text = $(this).closest('.address').text();
$('#booking_rate').text = $(this).closest('.rate').val();
});
As I understand it, the 'closest' function traverses up the DOM tree, so since my link is in the last cell of each row, it should get the elements 'address' and 'rate' from the previous row (the classes are assigned to the correct cells).
I've tried debugging myself using quick and dirty 'alert($(this).closest(etc...' in many variations, but nothing seems to work.
Do I need to do something differently to target data that was loaded after the original page load? where am I going wrong?
You are making wrong assumptions about .closest() and how .text() works. Please make a habit of studying the documentation when in doubt, it gives clear descriptions and examples on how to use jQuery's features.
.closest() will traverse the parents of the given element, trying to match the selector you have provided it. If your .select_link is not "inside" .address, your code will not work.
Also, .text() is a method, not a property (in the semantic sense, because methods are in fact properties in JavaScript). x.text = 1; simply overwrites the method on this element, which is not a good idea; you want to invoke the method: x.text(1);.
Something along these lines might work:
var t = $(this).closest('tr').find('.address').text();
$('#booking_address').text(t);
If #booking_address is a form element, use .val() on it instead.
If it does not work, please provide the HTML structure you are using (edit your question, use jsFiddle or a similar service) and I will help you. When asking questions like this, it is a good habit anyways to provide the relevant HTML structure.
You can try using parent() and find() functions and locate the data directly, the amount of parent() and find() methods depends on your HTML.
Ex. to get previous row data that would be
$('#selection_table').on('click', '.select_link', function(){
    $('#booking_address').text($(this).parent().parent().prev().find('.address').text());
});
Where parent stands for parent element (tr), then prev() as previous row and find finds the element.
Is there a demo of the code somewhere? Check when you are calling the code. It should be after the 'success' of the AJAX call.

Duplicated Zend_Form Element ID in a page with various forms

How do I tell Zend_Form that I want an element (and its ID-label, etc.) to use another ID value instead of the element's name?
I have several forms in a page. Some of them have repeated names. So as Zend_Form creates elements' IDs using names I end up with multiple elements with the same ID, which makes my (X)HTML document invalid.
What is the best solution to fix this, given that I really have to stick with using the same element names (they are a hash common to all forms and using the Zend_Form Hash Element is really out of question)?
Zend_Form_Element has a method called setAttribs that takes an array. You may be able to do something like $element->setAttribs(array('id' => "some_id"));
or you can do $element->setAttrib('id', 'some_id');
Thanks, Chris Gutierrez.
However, as I said, I needed to get rid of the default decorator-generated IDs like -label. With $element->setAttribs() that is not possible.
So based on http://framework.zend.com/issues/browse/ZF-7125 I just did the following:
$element->clearDecorators();
$element->setAttrib('id', 'some_id');
$element->addDecorator("ViewHelper");
Whoever sees this: please note this was enough for what I needed, but it may not be for you (the default settings have more than the ViewHelper decorator).
