XPath returning IndexError: list index out of range with Specific Example - xpath

In order to test lxml's XPath, I have created a test file called output1. This test file is a small, isolated snippet of XML. Running XPath on output1 works as expected. However, when I try to run my code on the full test file that output1 is extracted from, the code throws an error that does not make sense.
This is the file:
<ADMIN-DATA>
<SOLAR>
<PLANET GID="SOLAR-SYSTEM">
<MOON GID="EDITABLE">
<EARTH GID="XPATH-EDITABLE">
<NUM GID="NAME">BongiornoPizza</NUM>
</EARTH>
</MOON>
</PLANET>
</SOLAR>
</ADMIN-DATA>
My script begins parsing PLANET's children (starting with MOON, who has GID="EDITABLE"). The issue is that I can't use XPath to get the direct and only child of MOON, which is EARTH. However, I can iterate using a for loop and manually get to EARTH. This is strange, considering I also tested it on an even smaller example using XPath and it worked.
The for loop for inner_node in node: works fine, and will get me EARTH.
Breaking code: (Using XPath to get EARTH)
if "function Comment" not in str(node.tag):
if node.attrib["GID"] == "EDITABLE": # passes
xpath_editable_node = node.xpath('./EARTH')[0] # BREAKS ON THIS LINE
name_node = xpath_editable_node.xpath('./NUM')[0]
print(name_node.text) # should contain BongiornoPizza
Working code: (Uses for loop to get EARTH)
if "function Comment" not in str(node.tag):
if node.attrib["GID"] == "EDITABLE": # passes
for inner_node in node:
print(inner_node)
if inner_node.attrib["GID"] == "XPATH-EDITABLE":
print("INSIDE")
name_node = inner_node.xpath('./NUM')[0]
print(name_node.text) # should contain BongiornoPizza
I can achieve what I want with for loops and manual iteration, but it would be nice to be able to consistently use the XPath library in a clean manner. I think I'm probably missing a simple thing and that's why it is not working.

Related

Logging and asserting the number of previously-unknown DOM elements

I'ts my first tme using Cypress and I almost finalized my first test. But to do so I need to assert against a unknown number. Let me explain:
When the test starts, a random number of elements is generated and I shouldn't have control on such a number (is a requirement). So, I'm trying to get such number in this way:
var previousElems = cy.get('.list-group-item').its('length');
I'm not really sure if I'm getting the right data, since I can not log it (the "cypress console" shows me "[Object]" when I print it). But let's say such line returns (5) to exemplify.
During the test, I simulate a user creating extra elements (2) and removing an (1) element. Let's say the user just creates one single extra element.
So, at the end os the test, I need to check if the number of eements with the same class are equals to (5+2-1) = (6) elements. I'm doing it in this way:
cy.get('.list-group-item').its('length').should('eq', (previousTasks + 1));
But I get the following message:
CypressError: Timed out retrying: expected 10 to equal '[object Object]1'
So, how can I log and assert this?
Thanks in advance,
PD: I also tryed:
var previousTasks = (Cypress.$("ul").children)? Cypress.$("ul").children.length : 0;
But it always returns a fixed number (2), even if I put a wait before to make sure all the items are fully loaded.
I also tryed the same with childNodes but it always return 0.
Your problem stems from the fact that Cypress test code is run all at once before the test starts. Commands are queued to be run later, and so storing variables as in your example will not work. This is why you keep getting objects instead of numbers; the object you're getting is called a chainer, and is used to allow you to chain commands off other commands, like so: cy.get('#someSelector').should('...');
Cypress has a way to get around this though; if you need to operate on some data directly, you can provide a lambda function using .then() that will be run in order with the rest of your commands. Here's a basic example that should work in your scenario:
cy.get('.list-group-item').its('length').then(previousCount => {
// Add two elements and remove one...
cy.get('.list-group-item').its('.length').should('eq', previousCount + 1);
});
If you haven't already, I strongly suggest reading the fantastic introduction to Cypress in the docs. This page on variables and aliases should also be useful in this case.

Trying to exclude a portion of an xPath

I have looked through several posts about this, but have failed to apply the principles used to get the result I desire, so I'm going to just post my specific problem.
I am building a Google Sheet that enables the user to pull up Bible verses.
I have it all working, however I am running into an issue with a hidden element being pulled into my text().
FUNCTION:
=IMPORTXML("http://www.biblestudytools.com/ESV/Numbers/5-3.html",
"//*[#class='scripture']//span[2]//text()")
RESULT: You shall put out both male and female, putting them outside the camp, that they may not defile their camp, 1in the midst of which I dwell."
You can see the "1" that is showing up before the word "in"
I have found the xPath that pulls only that "1"
//*[#class='scripture']//span[2]//sup//text()
I am trying to remove that "1" from the text.
HELP PLEASE!!! :)
You can add a predicate to the end to exclude text nodes that are inside sup elements:
=IMPORTXML("http://www.biblestudytools.com/ESV/Numbers/5-3.html",
"//*[#class='scripture']//span[2]//text()[not(ancestor::sup)]")
This will retrieve only the text nodes that are not inside a sup element, but it will still result in having the verse spread out across two cells, because there are two text nodes. You can rectify this by wrapping this expression in a JOIN():
=JOIN("", IMPORTXML("http://www.biblestudytools.com/ESV/Numbers/5-3.html",
"//*[#class='scripture']//span[2]//text()[not(ancestor::sup)]"))

How to make nightwatch.js test more DRY

I'm learning nightwatch.js and am finding a lot of repeated code. For example,
// works
objects.expect.element('#user').to.be.present
objects.expect.element('#user').value.to.match(/\S/)
objects.expect.element('#user').value.to.not.equal(username)
Where objects.expect.element('#user') is obviously repeated. In this case, if I don't repeat that for each line, then I end up failing.
For example, if instead I use
// fails
objects.expect.element('#user').to.be.present
.value.to.match(/\S/)
.value.to.not.equal(username)
results in a fail message starting with, Expected element <#who> to be present not equal: "Ned the Nighthawk" not match: "/\S/" - expected "not present" but got: present
Is there a way to make this code more DRY?
Unfortunately nightwatch's expect namespace doesn't offer chaining.
In order to chain your assertions you need to use classic assert/verify library.
http://nightwatchjs.org/api#assertions

getting attribute via xpath query succesfull in browser, but not in Robot Framework

I have a certain XPATH-query which I use to get the height from a certain HTML-element which returns me perfectly the desired value when I execute it in Chrome via the XPath Helper-plugin.
//*/div[#class="BarChart"]/*[name()="svg"]/*[name()="svg"]/*[name()="g"]/*[name()="rect" and #class="bar bar1"]/#height
However, when I use the same query via the Get Element Attribute-keyword in the Robot Framework
Get Element Attribute//*/div[#class="BarChart"]/*[name()="svg"]/*[name()="svg"]/*[name()="g"]/*[name()="rect" and #class="bar bar1"]/#height
... then I got an InvalidSelectorException about this XPATH.
InvalidSelectorException: Message: u'invalid selector: Unable to locate an
element with the xpath expression `//*/div[#class="BarChart"]/*[name()="svg"]/*
[name()="svg"]/*[name()="g"]/*[name()="rect" and #class="bar bar1"]/`
So, the Robot Framework or Selenium removed the #-sign and everything after it. I thought it was an escape -problem and added and removed some slashes before the #height, but unsuccessful. I also tried to encapsulate the result of this query in the string()-command but this was also unsuccessful.
Does somebody has an idea to prevent my XPATH-query from getting broken?
It looks like you can't include the attribute axis in the XPath itself when you're using Robot. You need to retrieve the element by XPath, and then specify the attribute name outside that. It seems like the syntax is something like this:
Get Element Attribute xpath=(//*/div[#class="BarChart"]/*[name()="svg"]/*[name()="svg"]/*[name()="g"]/*[name()="rect" and #class="bar bar1"])#height
or perhaps (I've never used Robot):
Get Element Attribute xpath=(//*/div[#class="BarChart"]/*[name()="svg"]/*[name()="svg"]/*[name()="g"]/*[name()="rect" and #class="bar bar1"])[1]#height
This documentation says
attribute_locator consists of element locator followed by an # sign and attribute name, for example "element_id#class".
so I think what I've posted above is on the right track.
You are correct in your observation that the keyword seems to removes everything after the final #. More correctly, it uses the # to separate the element locator from the attribute name, and does this by splitting the string at that final # character.
No amount of escaping will solve the problem as the code isn't doing any parsing at this point. This is the exact code (as of this writing...) that performs that operation:
def _parse_attribute_locator(self, attribute_locator):
parts = attribute_locator.rpartition('#')
...
The simple solution is to drop that trailing slash, so your xpath will look like this:
//*/div[#class="BarChart"]/... and #class="bar bar1"]#height`

Using Nokogiri with multiple search elements

In this XML snippet I need to replace the data in the UID for some of the blocks. The actual file contains more than 100 similar blocks.
Although I have been able to extract subsets based on name="Track (Timeline)", I am struggling to reduce this subset to the specific block I need by also using the data in the <TrackID>, if name="Track (TimeLine)" and the text of <TrackID> is 0x1200 then set UID to xxxx.
I am new to Nokogiri and, although I write test scripts, I do not consider myself a programmer.
<StructuralMetadata key="06.0E.2B.34.02.53.01.01.0D.01.01.01.01.01.3B.00" length="116" name="Track (TimeLine)">
<EditRate>25/1</EditRate>
<Origin>0</Origin>
<Sequence>32-04-25-67-E7-A7-86-4A-9B-28-53-6F-66-74-65-6C</Sequence>
<TrackID>0x1200</TrackID>
<TrackName>Softel VBI Data</TrackName>
<TrackNumber>0x17010101</TrackNumber>
<UID>34-C1-B9-B9-5F-07-A4-4E-8F-F4-53-6F-66-74-65-6C</UID>
</StructuralMetadata>
<StructuralMetadata key="06.0E.2B.34.02.53.01.01.0D.01.01.01.01.01.3B.00" length="116" name="Track (TimeLine)">
<EditRate>25/1</EditRate>
<Origin>0</Origin>
<Sequence>35-12-2D-86-E6-74-0B-4C-B4-24-53-6F-66-74-65-6C</Sequence>
<TrackID>0x1300</TrackID>
<TrackName>Softel VBI Data</TrackName>
<TrackNumber>0x0</TrackNumber>
<UID>37-0C-80-34-4C-8D-CE-41-85-F3-53-6F-66-74-65-6C</UID>
</StructuralMetadata>
Using xpath:
//StructuralMetadata
will select all StructuralMetadata elements in your XML. The double slash at the start means to select nodes wherever they appear in the document.
You don't want all the nodes though, you can filter the ones you want with a predicate:
//StructuralMetadata[#name="Track (TimeLine)" and TrackID="0x1200"]
This will select all StructuralMetadata elements that have a name attribute with the value Track (TimeLine), and a TrackID child element with contents 0x1200.
As you're interested in the UID element, you can further refine the expression:
//StructuralMetadata[#name="Track (TimeLine)" and TrackID="0x1200"]/UID
This expression will match all the UID elements that are children of StructuralMetadata elements that match the predicate described above.
Putting this to use:
require 'nokogiri'
# Parse the document, assuming xml_file is a File object containing the XML
doc = Nokogiri::XML(xml_file)
# I'm assuming there is only one element in the document that matches
# the criteria, so I'm using at_xpath
node = doc.at_xpath('//StructuralMetadata[#name="Track (TimeLine)" and TrackID="0x1200"]/UID')
# At this point, doc contains a representation of the xml, and node points to
# the UID node within that representation. We can update the contents of
# this node
node.content = 'XXX'
# Now write out the updated XML. This just writes it to standard output,
# you could write it to a file or elsewhere if needed
puts doc.to_xml
A great way to approach this problem is with the ‘map reduce’ style of programming, which works to take a large list of things and narrow it down and combine it into the result you're after. Specifically, Array#find and Array#select are really useful for this sort of problem. Check out this example:
require 'nokogiri'
xml = Nokogiri::XML.parse(File.read "sample.xml")
element = xml.css('StructuralMetadata').find { |item|
item['name'] == "Track (TimeLine)" and item.css('TrackID').text == "0x1200"
}
puts element.to_xml
This little program first uses the CSS selector to get all of the <StructuralMetadata> elements in the document. It returns an array, which we can filter to just what we want using the Array#find method. Array#select is its cousin which returns an array of all the matching objects instead of the first one it happens to find.
Inside the block we have a test to check if the <StructuralMetadata> tag is the one we’re after. Then it puts the element.to_xml string to the console so you can see which thing it found if you run this as a command-line script. Now you can find the element, you can modify it in the usual way and save out a new XML file or whatever.

Resources