Getting nodes under a specific node element - xpath

I need help, or at least some advice, with the following problem. I am parsing an HTML document using HtmlCleaner and XPath.
I have something like this:
<html>
[code and other <h4> tags]
<h4>Random name</h4>
<a>Text I want to get</a>
<a>Text I want to get 2</a>
<a>Text I want to get 3</a>
<a>Text I want to get 4</a>
<h4> Random name 2 </h4>
<a>Text I don't want to get</a>
[code and other <h4> tags]
</html>
Ok. I have several <h4> tags, each one followed by <a> tags that contain some text. My problem is that I don't know how to get all the text belonging to a specific <h4>, something like "h4[i]". I tried something like this, but it didn't work:
String xpath = "h4[" + number + "]//a"; // where number is incremented
Thank you in advance for your help!

Use:
/*/h4[1]/following-sibling::a[not(preceding-sibling::h4[2])]/text()
XSLT - based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/h4[1]/following-sibling::a[not(preceding-sibling::h4[2])]/text()"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the following XML document (the provided fragment, wrapped in a single top element to become a well-formed XML document):
<html>
<h4>Random name</h4>
<a>Text I want to get</a>
<a>Text I want to get 2</a>
<a>Text I want to get 3</a>
<a>Text I want to get 4</a>
<h4> Random name 2 </h4>
<a>Text I don't want to get</a>
</html>
The XPath expression is evaluated and all selected (text) nodes are copied to the output:
Text I want to get Text I want to get 2 Text I want to get 3 Text I want to get 4
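The expression above is hard-wired to the first h4. For an arbitrary position i, an XPath 1.0 expression such as /*/a[count(preceding-sibling::h4) = i] should select the same kind of group. As a rough cross-check outside XSLT, here is a minimal Python sketch. It assumes the texts are wrapped in <a> elements as the question describes, uses a shortened sample document, and walks the children by hand because the standard library's ElementTree does not support sibling axes. The function name links_under_heading is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumed sample: a shortened version of the question's fragment,
# with each piece of text wrapped in an <a> element.
DOC = """<html>
<h4>Random name</h4>
<a>Text I want to get</a>
<a>Text I want to get 2</a>
<h4> Random name 2 </h4>
<a>Text I don't want to get</a>
</html>"""


def links_under_heading(root, index):
    """Return the text of the <a> siblings that follow the index-th <h4>
    (1-based) and come before the next <h4>."""
    seen = 0  # number of <h4> elements passed so far
    texts = []
    for child in root:
        if child.tag == "h4":
            seen += 1
            if seen > index:
                break  # reached the next heading, stop collecting
        elif child.tag == "a" and seen == index:
            texts.append(child.text or "")
    return texts


root = ET.fromstring(DOC)
print(links_under_heading(root, 1))
```

Calling links_under_heading(root, 2) returns only the link text that follows the second heading.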

Related

XSLT3 joining values with separator

I'm pretty new to XSLT and I've been struggling to replicate the solution mentioned here
XSL for-each: how to detect last node?
for longer than I'm willing to admit :(
I've set up this fiddle: https://xsltfiddle.liberty-development.net/naZXVFi
I was hoping I could use just the value-of + separator, vs choose / when xslt tools, as it did seem more idiomatic.
I can't get the separator to show up;
nor can I select just the child of skill, I always get the descendants too. That's to say, I shouldn't see any detail in the output.
bonus: not sure why that meta tag is not self closing (warning in the html section)
Desired output:
skill1, skill2, skill3, skill4, skill5 (no comma space for the last one)
Any help would be greatly appreciated. Thanks.
EDIT: including the code here too:
xml: (need to add ref to xslt):
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?> <!-- not in fiddle -->
<skills>
<skill>skill1</skill>
<skill>skill2</skill>
<skill>skill3
<details>
<detail>detail1</detail>
<detail>detail2</detail>
</details>
</skill>
<skill>skill4</skill>
<skill>skill5</skill>
</skills>
And test.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
xmlns:map="http://www.w3.org/2005/xpath-functions/map"
xmlns:array="http://www.w3.org/2005/xpath-functions/array"
exclude-result-prefixes="#all"
version="3.0">
<xsl:mode on-no-match="shallow-copy"/>
<xsl:output method="html" indent="yes" html-version="5"/>
<xsl:template match="/">
<html>
<head>
<title>.NET XSLT Fiddle Example</title>
</head>
<body>
<xsl:for-each select="/skills/skill">
<xsl:value-of select="." separator=", "/>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
In general, with XSLT 2/3 to output a sequence separated by some separator string, you simply use xsl:value-of select="$sequence" with the appropriate separator string in the separator attribute (and no for-each):
<xsl:template match="skills">
<xsl:value-of select="skill/text()[normalize-space()]/normalize-space()" separator=", "/>
</xsl:template>
https://xsltfiddle.liberty-development.net/naZXVFi/1
In most cases you would just need select="skill" separator=", ", but given the descendants and the white space you seem to want to eliminate, the select expression above is a bit more complicated.
Martin has given you the detailed work-through to get the final result including getting rid of the extra spaces etc, but at a high level, here's how to use xsl:value-of with separator correctly.
You have:
<body>
<xsl:for-each select="/skills/skill">
<xsl:value-of select="." separator=", "/>
</xsl:for-each>
</body>
This says that for each skill node, take the content of that node and display it. Notably, the value-of only sees one skill at a time, so there is nothing to join with the comma separator.
The answer which would get you what you want is:
<body>
<xsl:value-of select="/skills/skill" separator=", "/>
</body>
This says to take the set of skill nodes and display them joined by comma separators. You can see the output at https://xsltfiddle.liberty-development.net/naZXVFi/4
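The difference between the two versions is the same as the difference between printing each item on its own and joining a whole list. A minimal Python sketch of the "join the whole sequence" idea, using the skills document from the question: the function name join_skills is hypothetical, and strip() only trims leading and trailing white space, which is enough for this data but is not a full equivalent of normalize-space().

```python
import xml.etree.ElementTree as ET

SKILLS_XML = """<skills>
<skill>skill1</skill>
<skill>skill2</skill>
<skill>skill3
<details>
<detail>detail1</detail>
<detail>detail2</detail>
</details>
</skill>
<skill>skill4</skill>
<skill>skill5</skill>
</skills>"""


def join_skills(xml_text, separator=", "):
    """Join the direct text of every <skill>, ignoring nested <details>."""
    root = ET.fromstring(xml_text)
    # skill.text is only the text before the first child element,
    # so the <detail> descendants never show up in the result.
    names = [(skill.text or "").strip() for skill in root.findall("skill")]
    return separator.join(name for name in names if name)


print(join_skills(SKILLS_XML))
```

The separator is applied once to the whole list, so nothing trails the last item.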

Xpath how do I grab contents of an href based on the contents of the href

How do I grab the contents of an href if it includes a specific word, example:
<a href="contacts.asp">click here</a>
How do I grab 'contacts.asp' based on that it has the word 'contact' in it?
I tried variations of //a/@href[contains(@href,'contact')] but don't seem to be getting anywhere.
You are nearly there.
In the contains test, you are already in the context of the href attribute, so your test should be against . rather than @href. As written, your XPath looks for an href attribute on the href attribute itself, which of course won't work.
Try
//a/@href[contains(.,'contact')]
This says "find all href attributes on a elements, such that the href attribute value itself contains contact".
Note that this returns the href attribute; the library you're using will then have a way to pick out the value.
In your path you are already below @href, so your contains won't work.
Try it like this:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes" version="1.0" encoding="utf-8"/>
<xsl:template match="/">
<xsl:value-of select="//a[contains(@href,'contact')]"/>
</xsl:template>
</xsl:stylesheet>
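For readers working outside XSLT, the same "href contains a word" filter can be sketched in Python. The standard library's ElementTree has no contains() function, so the substring test is done in plain Python here. The sample markup, including the about.asp link, and the function name hrefs_containing are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumed sample markup; only contacts.asp comes from the question.
PAGE = """<div>
<a href="contacts.asp">click here</a>
<a href="about.asp">about us</a>
</div>"""


def hrefs_containing(root, word):
    """Return the href value of every <a> whose href contains `word`."""
    return [a.get("href") for a in root.iter("a") if word in a.get("href", "")]


root = ET.fromstring(PAGE)
print(hrefs_containing(root, "contact"))
```

This returns the attribute values themselves, which corresponds to selecting //a/@href[contains(.,'contact')] rather than the a elements.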

why does this xpath selector fail?

given the following html
<p>
<div class="allpricing">
<p class="priceadorn">
<FONT CLASS="adornmentsText">NOW: </FONT>
<font CLASS="adornmentsText">$1.00</font>
</p>
</div>
</p>
why does
//div[@class="allpricing"]/p[@class="priceadorn"][last()]/font[@class="adornmentsText"][last()]
return the expected value of $1.00
but adding the p element
//p/div[@class="allpricing"]/p[@class="priceadorn"][last()]/font[@class="adornmentsText"][last()]
returns nothing?
You cannot place a div inside a p. The opening div tag closes the p automatically. See
Nesting block level elements inside the <p> tag... right or wrong?
I've often found that fixing the cases was the culprit. XPath 1.0 is case sensitive and unless you take care of the mixed cases explicitly, it will fail in a lot of cases.
XPath is case-sensitive.
None of the provided XPath expressions selects any node, because in the provided XML document there is no font element with an attribute named class (the element font has a CLASS attribute and this is different from having a class attribute due to the different capitalization).
For the same reason, font and FONT are elements with different names.
These two XPath expressions, when evaluated against the provided XML document, produce the same wanted result:
//div[@class="allpricing"]
/p[@class="priceadorn"]
[last()]
/font[@CLASS="adornmentsText"]
[last()]
and
//p/div[@class="allpricing"]
/p[@class="priceadorn"]
[last()]
/font[@CLASS="adornmentsText"]
[last()]
XSLT - based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy-of select=
'//div[@class="allpricing"]
/p[@class="priceadorn"]
[last()]
/font[@CLASS="adornmentsText"]
[last()]'/>
=============
<xsl:copy-of select=
'//p/div[@class="allpricing"]
/p[@class="priceadorn"]
[last()]
/font[@CLASS="adornmentsText"]
[last()]
'/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<p>
<div class="allpricing">
<p class="priceadorn">
<FONT CLASS="adornmentsText">NOW: </FONT>
<font CLASS="adornmentsText">$1.00</font>
</p>
</div>
</p>
the two expressions are evaluated and the results of this evaluation are copied to the output:
<font CLASS="adornmentsText">$1.00</font>
=============
<font CLASS="adornmentsText">$1.00</font>
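The case-sensitivity point can also be checked quickly in Python, as a small sketch that treats the fragment as XML (no HTML repair) and uses the limited XPath subset of the standard library's ElementTree. The helper name font_texts is made up for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<p>
<div class="allpricing">
<p class="priceadorn">
<FONT CLASS="adornmentsText">NOW: </FONT>
<font CLASS="adornmentsText">$1.00</font>
</p>
</div>
</p>"""

root = ET.fromstring(DOC)


def font_texts(attribute_name):
    """Text of every lowercase <font> whose given attribute is 'adornmentsText'."""
    path = ".//font[@%s='adornmentsText']" % attribute_name
    return [element.text for element in root.findall(path)]


print(font_texts("class"))  # lowercase attribute name: no match
print(font_texts("CLASS"))  # attribute name spelled as in the document
```

The uppercase FONT element is never matched by the lowercase font name test, and the lowercase class attribute name matches nothing because the document only has CLASS.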
You describe your source as an HTML rather than an XML document, but you haven't explained how you parsed it. If you parse it using an HTML parser, the parser will "repair" it to turn it into valid HTML, which means that the tree it constructs doesn't directly reflect what you wrote in the source. XPath sees this "repaired" tree, not the original.

Find an element that only has one other kind of child

I want to use XPath to find every <blockquote> element that has at least one child <pre> element, no other kinds of child elements, and optionally text nodes as children:
<body><div><!-- arbitrary nesting -->
<blockquote><pre>YES</pre></blockquote>
<blockquote><p>NO</p></blockquote>
<blockquote><pre>NO</pre><p>NO</p></blockquote>
<blockquote><p>NO</p><pre>NO</pre></blockquote>
<blockquote><pre>YES</pre> <pre>YES</pre></blockquote>
<blockquote>NO</blockquote>
</div></body>
This XPath appears to work, but I suspect that it's overly complicated:
//blockquote[pre][not(*[not(name()="pre")])]
Is there a better (less code, more efficient, more DRY) way to select what I want?
//blockquote[pre][count(pre)=count(*)]
Use:
//blockquote[* and not(*[not(self::pre)])]
This selects all blockquote elements in the XML document that have at least one element child and don't have any element child that isn't a pre element.
This is just an application of the double negation law :).
Note that this expression can be more efficient than one that counts all element children, because the evaluation can stop as soon as a non-pre child is found.
XSLT - based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select="//blockquote[* and not(*[not(self::pre)])]"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<body><div><!-- arbitrary nesting -->
<blockquote><pre>YES</pre></blockquote>
<blockquote><p>NO</p></blockquote>
<blockquote><pre>NO</pre><p>NO</p></blockquote>
<blockquote><p>NO</p><pre>NO</pre></blockquote>
<blockquote><pre>YES</pre> <pre>YES</pre></blockquote>
<blockquote>NO</blockquote>
</div></body>
the XPath expression is evaluated and the selected nodes are copied to the output:
<blockquote>
<pre>YES</pre>
</blockquote>
<blockquote>
<pre>YES</pre>
<pre>YES</pre>
</blockquote>
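The same "at least one element child, and no child that isn't a pre" test can be sketched in Python for comparison. This is only an illustration with the standard library's ElementTree: the helper name only_pre_children is hypothetical, and the comment from the sample document is left out for brevity. Like the XPath version, all() stops at the first non-pre child.

```python
import xml.etree.ElementTree as ET

DOC = """<body><div>
<blockquote><pre>YES</pre></blockquote>
<blockquote><p>NO</p></blockquote>
<blockquote><pre>NO</pre><p>NO</p></blockquote>
<blockquote><p>NO</p><pre>NO</pre></blockquote>
<blockquote><pre>YES</pre> <pre>YES</pre></blockquote>
<blockquote>NO</blockquote>
</div></body>"""


def only_pre_children(blockquote):
    """True if the element has at least one element child and every
    element child is a <pre>. Text nodes are ignored."""
    children = list(blockquote)  # element children only
    return bool(children) and all(child.tag == "pre" for child in children)


root = ET.fromstring(DOC)
matches = [bq for bq in root.iter("blockquote") if only_pre_children(bq)]
print(len(matches))
```

Each matching blockquote contains only pre children whose text is YES.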

how to display an image using xslt

I have the following code
<xsl:template name="toggle">
<xsl:param name="target"/>
<xsl:param name="show"/>
<input type="image" src="glass.png" />
<xsl:attribute name="onclick">
toggle('<xsl:value-of select="$target"/>','<xsl:value-of select="$show"/>');
</xsl:attribute>
</input>
</xsl:template>
I want to add an external image which is not part of xml file. I want to replace my Submit button with an image.
When using the above code I am unable to get the image as output.
Any ideas on how to do it?
Do you know what HTML you want to generate?
If you do, then tell us.
If you don't, then you have an HTML problem, not an XSLT problem.
Never try to write code in XSLT until you know what output HTML you want it to produce. Actually, I think it was Dijkstra who said you should never start writing any program until you know what output you want it to produce. A good principle. When applied to XSLT, remember that the output in this sense is an HTML document, not a screen displayed by the browser.
