REXML - Having trouble asserting a value is present in XML response - ruby

I got help here this morning using REXML and the answer helped me understand more about it. However I've encountered another problem and can't seem to figure it out.
My response is like this:
<?xml version="1.0"?>
<file>
<link a:size="2056833" a:mimeType="video/x-flv" a:bitrate="1150000.0" a:height="240" a:width="320" rel="content.alternate"> https://link.com</link>
</file>
So, what I want to do is assert that a:mimeType is video/x-flv
Here's what I have tried:
xmlDoc = REXML::Document.new(xml)
assert_equal xmlDoc.elements().to_a("file/link['@a:mimeType']").first.text, 'video/x-flv'
and also:
assert xmlDoc.elements().to_a("file/link['@a:mimeType']").include? 'video/x-flv'
and various combinations. I actually get lots of these links back but I only really care if one of them has this mimeType. Also, some of the links don't have mimeType.
Any help greatly appreciated.
Thanks,
Adrian

text retrieves the text content of an element (the text between its tags). You want to access an attribute instead. Try:
xmlDoc.elements().to_a("file/link").first.attributes['a:mimeType']
To see if any of the links has the correct mimeType, you can convert the array of elements into an array of mimeType attribute values and check whether it contains the right value:
xmlDoc.elements().to_a("file/link").map { |elem| elem.attributes['a:mimeType'] }.include? 'video/x-flv'
UPDATE
Or, much simpler, check whether there is an element whose a:mimeType attribute has the right value:
xmlDoc.elements().to_a("file/link[@a:mimeType='video/x-flv']") != []
Thanks for teaching me something about XPath ;-)
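A minimal, self-contained sketch of the attribute check described above. The xmlns:a declaration and its URI are assumptions added so the a: prefix is well-formed (the response in the question does not show one, and recent REXML versions may reject an undeclared prefix). The second link element is also made up, to show what happens when a link has no mimeType.

```ruby
require 'rexml/document'

# Sample response. The xmlns:a declaration and the second <link> are
# assumptions for illustration; they are not part of the original response.
xml = <<~XML
  <?xml version="1.0"?>
  <file xmlns:a="http://example.com/a">
    <link a:size="2056833" a:mimeType="video/x-flv" rel="content.alternate">https://link.com</link>
    <link rel="content.alternate">https://link.com/other</link>
  </file>
XML

doc = REXML::Document.new(xml)

# Collect the a:mimeType value of every link; links without it yield nil.
links = doc.elements.to_a('file/link')
mime_types = links.map { |link| link.attributes['a:mimeType'] }

# True if at least one link carries the expected MIME type.
has_flv = mime_types.include?('video/x-flv')
puts has_flv
```

In a Test::Unit or Minitest test case, the last step would become assert mime_types.include?('video/x-flv').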

Related

Can I get the value of a specific attribute using only XPath?

I have a code like this:
doc = Nokogiri::HTML("<a href='foo.html'>foo</a><a href='bar.html'>bar</a>")
doc.xpath('//a/@href').map(&:value) # => ["foo.html", "bar.html"]
It works as I expected.
But just out of curiosity I want to know, can I also get the value of href attributes only by using XPath?
Locate the element first, then step to its attribute. For example, on this site:
https://www.easymobilerecharge.com/
suppose we want to locate the "MTS" link.
To locate this element, we can use an XPath like:
//a[contains(text(),'MTS')]
Now, to get its href attribute, use:
//a[contains(text(),'MTS')]/@href
Judging from the first answer to this question the answer seems to be yes and no. It offers
xml.xpath("//Placement").attr("messageId")
which is quite close to "only XPath", but not entirely. Up to you to judge if that is enough for you.
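To make the "yes and no" concrete, here is a small sketch that uses REXML instead of Nokogiri (an assumption, chosen only so the example runs without extra gems). The attribute step returns attribute nodes, and the calling code still has to ask each node for its value.

```ruby
require 'rexml/document'

# The original snippet has two sibling <a> elements; the <div> wrapper is an
# assumption added here because an XML parser needs a single root element.
doc = REXML::Document.new(
  "<div><a href='foo.html'>foo</a><a href='bar.html'>bar</a></div>"
)

# //a/@href selects attribute nodes, not plain strings.
href_nodes = REXML::XPath.match(doc, '//a/@href')

# The host language turns each attribute node into its string value.
hrefs = href_nodes.map(&:value)
puts hrefs.inspect
```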

Correct xpath returns empty result

I want to scrape data from the table on this webpage http://www.changning.sh.cn/jact/front/front_mailpublist.action?sysid=9
Before writing a spider, I tested my Xpath expressions in Scrapy shell, but ran into one problem: Xpath can't get any text out of the table.
Say I want to extract the text LM2015122827458 in the upper-left cell. I used response.xpath("//tr[@class = 'tr_css']/td[1]/text()").extract(), but only an empty list was returned. I tried alternative XPath expressions, including ones inspired by Chrome's "Copy XPath", but had no luck. I even used response.xpath("//text()") to extract all the text on the page to see if LM2015122827458 is there. It wasn't. So, is this a page that XPath can't deal with? Or did I do something wrong? Thank you very much!
This XPath works fine for me:
//tr[@class='tr_css'][1]/td[@class='text-center'][1]
The Java code below works fine for me:
driver.get("http://www.changning.sh.cn/jact/front/front_mailpublist.action?sysid=9");
driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
String a = driver.findElement(By.xpath("//tr[@class='tr_css'][1]/td[@class='text-center'][1]")).getText();
System.out.println(a);
Hope it will help you :)

How to use substring() with Import.io?

I'm having some issues with XPath and import.io and I hope you'll be able to help me. :)
The html code:
<a href="page.php?var=12345">
For the moment, I manage to extract the content of the href ( page.php?var=12345 ) with this:
./td[3]/a[1]/@href
Though, I would like to just collect: 12345
substring might be the solution, but it does not seem to work on import.io the way I use it:
substring(./td[3]/a[1]/@href,13)
Any idea what the problem is?
Thanks a lot in advance!
Try using this for the XPath (with the field type set to Text):
.//*[@class='oeil']/a/@href
Then use this for your regex:
([^=]*)$
This will get you the ISBN number you are looking for.
import.io only supports XPath functions when they return a node list.
Your path expression is fine, but perhaps it should be
substring(./td[3]/a[1]/@href,14)
"Does not seem to work" is not a very clear description of what is wrong. Do you get error messages? Is the output wrong? Do you have any code surrounding the path expression you could show?
You can use substring, but using substring-after() would be even better.
substring-after(/a/@href,'=')
assuming as input the tiny snippet you have shown:
<a href="page.php?var=12345"/>
will select
12345
and taking into account the structure of your input
substring-after(./td[3]/a[1]/@href,'=')
The leading . makes the path relative, so ./td[3] selects only td elements that are immediate children of the current context node. I trust you know what you are doing.
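The difference between position 13 and 14 comes from XPath's substring() counting characters from 1. The plain Ruby sketch below mirrors the three expressions on the sample href from the question. It only illustrates the indexing and does not say anything about how import.io evaluates XPath.

```ruby
href = 'page.php?var=12345'

# XPath's substring() counts from 1, Ruby strings count from 0,
# so XPath position 14 corresponds to Ruby index 13.
from_13 = href[12..-1] # mirrors substring(href, 13)
from_14 = href[13..-1] # mirrors substring(href, 14)

# Mirrors substring-after(href, '='): everything after the first '='.
after_equals = href.partition('=').last

puts from_13      # prints "=12345"
puts from_14      # prints "12345"
puts after_equals # prints "12345"
```

substring-after() has the advantage of not depending on the length of the part before the "=" sign.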

@driver.find_element(:id=>"body").text.include?(textcheck) not verifying the text only the id

I am using Selenium-WebDriver for Ruby and I am trying to verify that text is present on a page. I have done many searches and tried many things and the best answer I have found is to use something like
def check_page(textcheck)
  if verify {@driver.find_element(:id=>"body").text.include?(textcheck)}
    yield it_to "fail"
  else
    yield it_to "pass"
  end
end
The expected outcome if the value of textcheck is present in the body would be pass and if the value of textcheck is not present in the body it would be fail. What is actually happening is if :id=>"body" is present then it is pass and if it is not present then it is fail regardless of .text.include?(textcheck)
If anyone could point me in the right direction for how to verify text is present on a page using Selenium-WebDriver in Ruby it would be greatly appreciated. I have found workarounds for certain cases where I can do
verify {@driver.find_element(:tag_name, 'h1').text!=(textcheck)}
but the element I am trying to verify I can't get to so easily. I looked into css locators and was very confused on how to simplify the tag so I could use it. Any help would be greatly appreciated. Thank you very much. If you require any more information from me please let me know and I will provide it as soon as possible.
I am using Ruby 1.9.3 with Selenium-WebDriver 2.25, testing in Firefox 14.0.1.
I do it this way
@wait = Selenium::WebDriver::Wait.new(:timeout => 30)
begin
  @wait.until { @driver.find_element(:tag_name => "body").text.include?("your text") }
rescue
  puts "Failure! text is not present on the page"
  #Or do one of the options below
  #raise
  #assert_match "true","false", "The text is not present"
end
UPDATE
Answer to your question in the comments section.
There are two kinds of waits: implicit and explicit. You can read more about it here. Your code failed because you were searching by :id => "body" rather than :tag_name => "body". Usually all of a page's text is enclosed in the HTML body tag of the DOM.

XPath-REXML-Ruby: Selecting multiple siblings/ancestors/descendants

This is my first post here. I have just started working with Ruby and am using REXML for some XML handling. I present a small sample of my xml file here:
<record>
<header>
<identifier>oai:lcoa1.loc.gov:loc.gmd/g3195.ct000379</identifier>
<datestamp>2004-08-13T15:32:50Z</datestamp>
<setSpec>gmd</setSpec>
</header>
<metadata>
<titleInfo>
<title>Meet-konstige vertoning van de grote en merk-waardige zons-verduistering</title>
</titleInfo>
</metadata>
</record>
My objective is to match the last part of the value in the identifier tag against a list of values that I have in an array. I have achieved this with the following code snippet:
ids = XPath.match(xmldoc, "//identifier[text()='oai:lcoa1.loc.gov:loc.gmd/"+mapid+"']")
Having got a particular identifier that I wish to investigate, I now want to go back up to record, select metadata, and then select titleInfo to get the value in the title node for that particular identifier.
I have looked at the XPath tutorials and expressions and many of the related questions on this website as well and learnt about axes and the different concepts such as ancestor/following sibling etc. However, I am really confused and cannot figure this out easily.
I was wondering if I could get any help or if someone could point me towards an online resource "easy" to read.
Thank you.
UPDATE:
I have been trying various combinations of code such as:
idss = XPath.match(xmldoc, "//identifier[text()='oai:lcoa1.loc.gov:loc.gmd/"+mapid+"']/parent::header/following-sibling::metadata/child::mods/child::titleInfo/child::title")
The code runs but does not output anything. I am wondering what I am doing wrong.
Here's a way to accomplish it using XPath, then going up to the record, then XPath to get the title:
require 'rexml/document'
include REXML
xml=<<END
<record>
<header>
<identifier>oai:lcoa1.loc.gov:loc.gmd/g3195.ct000379</identifier>
<datestamp>2004-08-13T15:32:50Z</datestamp>
<setSpec>gmd</setSpec>
</header>
<metadata>
<titleInfo>
<title>Meet-konstige</title>
</titleInfo>
</metadata>
</record>
END
doc=Document.new(xml)
mapid = "ct000379"
text = "oai:lcoa1.loc.gov:loc.gmd/g3195.#{mapid}"
identifier_nodes = XPath.match(doc, "//identifier[text()='#{text}']")
record_node = identifier_nodes.first.parent.parent
record_node.elements['metadata/titleInfo/title'].text
=> "Meet-konstige"
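As an alternative sketch, the same lookup can be written as a single XPath that climbs from the matched identifier to its record with the ancestor axis and then descends to the title. In the sample shown there is no mods element between metadata and titleInfo, which may be why the longer path in the question matched nothing. The variable names below are illustrative only.

```ruby
require 'rexml/document'

xml = <<~XML
  <record>
    <header>
      <identifier>oai:lcoa1.loc.gov:loc.gmd/g3195.ct000379</identifier>
      <datestamp>2004-08-13T15:32:50Z</datestamp>
      <setSpec>gmd</setSpec>
    </header>
    <metadata>
      <titleInfo>
        <title>Meet-konstige</title>
      </titleInfo>
    </metadata>
  </record>
XML

doc = REXML::Document.new(xml)
mapid = 'ct000379'
id_text = "oai:lcoa1.loc.gov:loc.gmd/g3195.#{mapid}"

# Match the identifier, climb to the enclosing record, then walk down to title.
path = "//identifier[text()='#{id_text}']/ancestor::record/metadata/titleInfo/title"
title_node = REXML::XPath.first(doc, path)

# title_node is nil when no identifier matches, so guard before calling text.
title_text = title_node && title_node.text
puts title_text
```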
