From the given HTML:
<span class="flag_16 left_16 armenia_16_left"> First League</span>
How can I get only the (armenia) string, or at least (armenia_16_left)?
Thanks in advance.
Use this XPath 1.0 expression:
substring-before(substring-after(substring-after(/span/@class, ' '), ' '), '_')
In XPath 2.0 one can simply use:
tokenize(tokenize(/span/@class, ' ')[last()], '_')[1]
XSLT-based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
"<xsl:value-of select=
"substring-before(substring-after(substring-after(/span/@class, ' '), ' '), '_')
"/>"
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<span class="flag_16 left_16 armenia_16_left"> First League</span>
the XPath expression is evaluated and the result is copied to the output:
"armenia"
When this XSLT 2.0 transformation is applied on the same XML document (above):
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
"<xsl:sequence select=
"tokenize(tokenize(/span/@class, ' ')[last()], '_')[1]"/>"
</xsl:template>
</xsl:stylesheet>
again the same correct result is produced:
"armenia"
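For readers who end up doing this step outside XPath, here is a minimal Python sketch of the same idea as the XPath 2.0 expression: take the last whitespace-separated class token, then the part before its first underscore. The function name flag_country is hypothetical, and the sketch assumes the country token is always the last class, as in the sample.

```python
def flag_country(class_attr):
    """Return the part of the last class token before its first underscore."""
    last_token = class_attr.split()[-1]   # e.g. 'armenia_16_left'
    return last_token.split("_")[0]       # e.g. 'armenia'


print(flag_country("flag_16 left_16 armenia_16_left"))
```

Note that the XPath 1.0 expression picks the third token (the text after the second space) rather than the last one, so the two approaches only agree when there are exactly three classes.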
I am trying to report values that are duplicated across different nodes using XSLT. I want the element name to be handled dynamically, so that it can track the values of whatever follows the namespace prefix, for example car:ID, car:Name, car:Location_name, or more. I know I can use the function local-name(.), but I am not sure how to apply it to my XSLT logic. Please help.
The sample XML is as follows:
<car:root xmlns:car="com.sample">
<Car_Input_Request>
<car:Car_Details>
<car:ID>Car_001</car:ID>
<car:Name>Fastmobile</car:Name>
<car:Local_Name>New York</car:Local_Name>
<car:Transmission_Reference_Type>
<car:ID car:type="Transmission_Reference_Type">Automatic</car:ID>
</car:Transmission_Reference_Type>
</car:Car_Details>
</Car_Input_Request>
<Car_Input_Request>
<car:Car_Details>
<car:ID>Car_002</car:ID>
<car:Name>Slowmobile</car:Name>
<car:Local_Name>New York</car:Local_Name>
<car:Transmission_Reference_Type>
<car:ID car:type="Transmission_Reference_Type">Manual</car:ID>
</car:Transmission_Reference_Type>
</car:Car_Details>
</Car_Input_Request>
<Car_Input_Request>
<car:Car_Details>
<car:ID>Car_001</car:ID>
<car:Name>Fastmobile</car:Name>
<car:Local_Name>New York</car:Local_Name>
<car:Transmission_Reference_Type>
<car:ID car:type="Transmission_Reference_Type">Automatic</car:ID>
</car:Transmission_Reference_Type>
</car:Car_Details>
</Car_Input_Request>
</car:root>
The XSLT used:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:car="com.sample"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="3.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:value-of select="//car:ID[ let $v:=string(.),$t:=@car:type return not( preceding::car:ID[string(.) = $v and @car:type=$t]) ]/(let $v:=string(.), $t:=@car:type,$c:=1+count(following::car:ID[string(.)=$v and $t=@car:type]) ,$c:=1+count(following::car:*[string(.)=$v]) return if ($c > 1) then concat( string(.), ' occurs ', $c, ' times for type ', $t, '&#10;') else () )"/>
</xsl:template>
</xsl:stylesheet>
Output from the XSLT:
Car_001 occurs 2 times for type
Automatic occurs 2 times for type Transmission_Reference_Type
But I want it to show:
Car_001 occurs 2 times for type ID
Fastmobile occurs 2 times for type Name
Automatic occurs 2 times for type Transmission_Reference_Type
New York occurs 3 times for type Local_Name
If you are looking for an XSLT solution (rather than a single line XPath expression), you could make use of xsl:for-each-group with a composite key:
<xsl:stylesheet
xmlns:car="com.sample"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
expand-text="yes"
version="3.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each-group select="//car:Car_Details/*" group-by="local-name(), normalize-space()" composite="yes">
<xsl:if test="current-group()[2]">
<xsl:text>{normalize-space()} occurs {count(current-group())} times for {local-name()}
</xsl:text>
</xsl:if>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
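To make the composite grouping key concrete, here is a rough Python sketch of the same idea using only the standard library: count each child of car:Car_Details by the pair (local name, whitespace-normalized string value) and report the pairs seen more than once. The function name duplicate_counts is hypothetical, and this is an illustration of the grouping logic, not a replacement for the stylesheet.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SAMPLE = """<car:root xmlns:car="com.sample">
 <Car_Input_Request><car:Car_Details>
  <car:ID>Car_001</car:ID><car:Name>Fastmobile</car:Name>
  <car:Local_Name>New York</car:Local_Name>
  <car:Transmission_Reference_Type>
   <car:ID car:type="Transmission_Reference_Type">Automatic</car:ID>
  </car:Transmission_Reference_Type>
 </car:Car_Details></Car_Input_Request>
 <Car_Input_Request><car:Car_Details>
  <car:ID>Car_002</car:ID><car:Name>Slowmobile</car:Name>
  <car:Local_Name>New York</car:Local_Name>
  <car:Transmission_Reference_Type>
   <car:ID car:type="Transmission_Reference_Type">Manual</car:ID>
  </car:Transmission_Reference_Type>
 </car:Car_Details></Car_Input_Request>
 <Car_Input_Request><car:Car_Details>
  <car:ID>Car_001</car:ID><car:Name>Fastmobile</car:Name>
  <car:Local_Name>New York</car:Local_Name>
  <car:Transmission_Reference_Type>
   <car:ID car:type="Transmission_Reference_Type">Automatic</car:ID>
  </car:Transmission_Reference_Type>
 </car:Car_Details></Car_Input_Request>
</car:root>"""


def duplicate_counts(xml_text):
    """Count children of car:Car_Details by (local name, normalized text)."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for details in root.iter("{com.sample}Car_Details"):
        for child in details:
            # ElementTree spells qualified names as '{namespace}local'.
            local_name = child.tag.rpartition("}")[2]
            # Join all descendant text, then collapse whitespace,
            # roughly what normalize-space() does on the string value.
            value = " ".join("".join(child.itertext()).split())
            counts[(local_name, value)] += 1
    return counts


counts = duplicate_counts(SAMPLE)
for (name, value), n in counts.items():
    if n > 1:
        print(f"{value} occurs {n} times for {name}")
```

Because the key includes the element's local name, the same text under two different element names would be counted separately, which is what the composite key in the stylesheet does as well.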
I'm using XSLT 2.0 with the latest Saxon HE.
I'm trying to pass multiple coordinate parameters from a script to XSL in order to filter results based on a location bounding box.
Script:
java -jar saxon9he.jar -s:litter_bins.xml -o:"bins.xml" -xsl:"Split xml coords.xsl" Coord_2=51.3725 Coord_4=51.3751 Coord_1=-2.3615 Coord_3=-2.3572
XSL:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="Coord_2" select="Coord_2"/>
<xsl:param name="Coord_4" select="Coord_4"/>
<xsl:param name="Coord_1" select="Coord_1"/>
<xsl:param name="Coord_3" select="Coord_3"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="node[@lat[ . &lt; $Coord_2 or . &gt; $Coord_4 ] or @lon[ . &lt; $Coord_1 or . &gt; $Coord_3]]"/>
</xsl:stylesheet>
The above returns:
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="JOSM"/>
However if I hard code the coordinates into the match xpath, it returns the expected results.
Xpath:
<xsl:template match="node[@lat[ . &lt; 51.3725 or . &gt; 51.3751 ] or @lon[ . &lt; -2.3615 or . &gt; -2.3572]]"/>
Results:
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="JOSM">
<node id="-102973" visible="true" lat="51.37283499216" lon="-2.359890029">
<tag k="date_creat" v="17/07/2014 07:59:04 AM UTC"/>
<tag k="form_recor" v="888"/>
</node>
<snip...>
</osm>
What am I misunderstanding?
Try declaring a numeric type for the parameters, e.g. <xsl:param name="Coord_2" as="xs:double"/> or <xsl:param name="Coord_2" as="xs:decimal"/>. For that, your stylesheet needs the namespace declaration xmlns:xs="http://www.w3.org/2001/XMLSchema" on the root element.
Without a numeric type, I think both operands of the comparison are xs:untypedAtomic values, and for that case https://www.w3.org/TR/xpath-31/#id-general-comparisons requires:
If both atomic values are instances of xs:untypedAtomic, then the
values are cast to the type xs:string
A string comparison of negative numbers then fails to give you the result you want.
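A small Python illustration of why the string comparison goes wrong for negative numbers. It uses the lon value of the node from the expected results and the Coord_1 value from the command line. Python compares strings by code point, which is used here only as a stand-in for XPath's string comparison; that equivalence is an assumption of the sketch.

```python
lon = "-2.359890029"   # @lon of the node shown in the expected results
coord_1 = "-2.3615"    # lower longitude bound passed on the command line

# Character by character: '-2.35...' sorts before '-2.36...' because '5' < '6'.
as_strings = lon < coord_1

# Numerically, -2.359890029 is greater than -2.3615 (it is less negative).
as_numbers = float(lon) < float(coord_1)

print(as_strings, as_numbers)
```

Compared as strings, the node appears to lie below the lower bound and is dropped by the empty template; compared as numbers, it is inside the box and is kept.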
I'd like to use XPath to retrieve the longer of two nodes.
E.g., if my XML is
<record>
<url1>http://www.google.com</url1>
<url2>http://www.bing.com</url2>
</record>
And I do document.SelectSingleNode(your XPath here)
I would expect to get back the url1 node. If url2 is longer, or there is no url1 node, I'd expect to get back the url2 node.
Seems simple but I'm having trouble figuring it out. Any ideas?
This works for me, but it is ugly. Can't you do the comparison outside XPath?
record/*[starts-with(name(),'url')
and string-length(.) > string-length(preceding-sibling::*[1])
and string-length(.) > string-length(following-sibling::*[1])]/text()
<xsl:for-each select="*">
<xsl:sort select="string-length(.)" data-type="number"/>
<xsl:if test="position() = last()">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
Even works in XSLT 1.0!
Use this single XPath expression:
/*/*[not(string-length(preceding-sibling::*|following-sibling::*)
>
string-length()
)
]
XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/*[not(string-length(preceding-sibling::*|following-sibling::*)
>
string-length()
)
]"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<record>
<url1>http://www.google.com</url1>
<url2>http://www.bing.com</url2>
</record>
the XPath expression is evaluated and the result of this evaluation (the selected element) is copied to the output:
<url1>http://www.google.com</url1>
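If the comparison can be done outside XPath, as one of the answers suggests, it takes only a few lines. A minimal sketch with Python's standard library follows; the helper name longest_child is hypothetical, and on a tie it returns the first of the longest children.

```python
import xml.etree.ElementTree as ET

record = ET.fromstring(
    "<record>"
    "<url1>http://www.google.com</url1>"
    "<url2>http://www.bing.com</url2>"
    "</record>"
)


def longest_child(parent):
    """Return the child element with the longest string value, or None."""
    children = list(parent)
    if not children:
        return None
    # The string value of an element is all of its descendant text joined.
    return max(children, key=lambda child: len("".join(child.itertext())))


longest = longest_child(record)
print(longest.tag, longest.text)
```

When url1 is missing, url2 is the only candidate and is returned, which matches the behavior the question asks for.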
I have an HTML document that contains tags like these:
<div id="SNT">text1</div>
<div id="SNT">text2</div>
<div id="SNT">textbase1<span style='color: #EFFFFF'>text3</span></div>
<div id="SNT">textbase2<span style='color: #EFFFFF'>text4</span></div>
How can I get all the text inside each <div> using XPath, ignoring the span tags but keeping their text?
i.e.:
text1
text2
textbase1text3
textbase2text4
This cannot be specified with a single XPath 1.0 expression.
You need to first select all relevant div elements:
//div[@id='SNT']
then for each selected node get its string value:
string(.)
In XPath 2.0 this can be specified with a single expression:
//div[@id='SNT']/string(.)
XSLT-based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="div[@id='SNT']">
<xsl:copy-of select="string()"/>
========
</xsl:template>
</xsl:stylesheet>
When this XSLT 1.0 transformation is applied on the following XML document (the provided XML fragment, wrapped into a single top element):
<t>
<div id="SNT">text1</div>
<div id="SNT">text2</div>
<div id="SNT">textbase1<span style='color: #EFFFFF'>text3</span></div>
<div id="SNT">textbase2<span style='color: #EFFFFF'>text4</span></div>
</t>
the relevant div elements are selected (matched) and processed by the only specified template, in which the string(.) XPath expression is evaluated and its result is copied to the output:
text1
========
text2
========
textbase1text3
========
textbase2text4
========
And for the XPath 2.0 expression:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:copy-of select="//div[@id='SNT']/string(.)"/>
</xsl:template>
</xsl:stylesheet>
When this XSLT 2.0 transformation is applied on the same XML document (above), the XPath 2.0 expression is evaluated and the result (four strings) is copied to the output:
text1 text2 textbase1text3 textbase2text4
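The two-step approach (select the div elements, then take the string value of each) can also be sketched with Python's standard library. This assumes the fragment is wrapped in a single top element and is well-formed XML, as in the verification above; itertext() plays the role of string(.) here.

```python
import xml.etree.ElementTree as ET

DOC = """<t>
<div id="SNT">text1</div>
<div id="SNT">text2</div>
<div id="SNT">textbase1<span style='color: #EFFFFF'>text3</span></div>
<div id="SNT">textbase2<span style='color: #EFFFFF'>text4</span></div>
</t>"""

root = ET.fromstring(DOC)

# Step 1: select the relevant div elements.
# Step 2: join all descendant text of each div (its string value).
texts = ["".join(div.itertext()) for div in root.findall(".//div[@id='SNT']")]

for text in texts:
    print(text)
```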
You could simply use:
//div/text()
or
div/text()
Hope this helps.
See The lxml.etree Tutorial and search for "Using XPath to find text".
For example:
from lxml import etree

html = """
<span class='demo'>
Hi,
<span>Tom</span>
</span>
"""
tree = etree.HTML(html)
# Select the outer span, then take its string value (all descendant text).
node = tree.xpath('//span[@class="demo"]')[0]
print(node.xpath('string()'))
If there is no other content in the HTML files, just those <div>s inside the usual HTML root elements, the following stylesheet will be sufficient to extract the text:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
</xsl:stylesheet>
If you only want the <div>s, and only with those particular IDs, use the following code - it also makes sure the linebreaks are like in your example:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="//div[@id='SNT']">
<xsl:copy-of select="node()|text()"/><xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
Using only an XPath expression (and not in XSLT or DOM - just pure XPath), I'm trying to create a relative path from the current node (in a td) to an associated td in the same column of the same HTML table.
For example, suppose I have this type of data:
<table>
<tr> <td><a>Blue Jeans</a></td> <td><a>Shirt</a></td> </tr>
<tr> <td><span>$21.50</span></td> <td><span>$18.99</span></td> </tr>
</table>
and I'm on the a with "Blue Jeans" and want to find the price ($21.50). In XSLT, I could use the current() function to get the answer like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="/">
<xsl:apply-templates select="//a" />
</xsl:template>
<xsl:template match="a">
Name: <xsl:value-of select="."/>
Price: <xsl:value-of select="../../following-sibling::tr[1]/td[position() = count(current()/../preceding-sibling::td) + 1]" />
</xsl:template>
</xsl:stylesheet>
But the problem I'm running into is that there is no current() defined in XPath 1.0. I tried using the self:: axis, but like the "." shorthand, that only points to the "context" node, not the "current" node. The language that I'm seeing in the XPath standard suggests that XPath doesn't have a concept of "current node."
Is there perhaps another way to form this path or is this a limitation of XPath?
In XPath 1.0 you could do:
/table/tr/td/a[.='Blue Jeans']/following::td[count(../td)]/span
Of course, this assumes there is no colspan.
EDIT: The proof. This stylesheet:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:param name="pProduct" select="'Blue Jeans'"/>
<xsl:template match="/">
<xsl:value-of select="/table/tr/td/a[.=$pProduct]
/following::td[count(../td)]/span"/>
</xsl:template>
</xsl:stylesheet>
Output:
$21.50
With param pProduct set to 'Shirt', output:
$18.99
Note: Of course, you need the a element in context in order to select the span element. So, with your stylesheet:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:template match="text()"/>
<xsl:template match="a">
Name: <xsl:value-of select="."/>
Price: <xsl:value-of select="following::td[count(../td)]/span" />
</xsl:template>
</xsl:stylesheet>
Output:
Name: Blue Jeans
Price: $21.50
Name: Shirt
Price: $18.99
This cannot be achieved with a single XPath 1.0 expression.
In XPath 2.0 one could write:
for $vPreceeding in count(../preceding-sibling::td)
return ../../following-sibling::tr[1]/td[$vPreceeding + 1]
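For comparison, here is the same "cell in the same column of the next row" lookup done procedurally in Python with the standard library. The function name price_for is hypothetical, and like the XPath answers it assumes there is no colspan and that the next row has at least as many cells as the current one.

```python
import xml.etree.ElementTree as ET

TABLE = """<table>
<tr> <td><a>Blue Jeans</a></td> <td><a>Shirt</a></td> </tr>
<tr> <td><span>$21.50</span></td> <td><span>$18.99</span></td> </tr>
</table>"""


def price_for(table, product):
    """Find the td holding the product link and return the span text
    of the td at the same column index in the following row."""
    rows = table.findall("tr")
    for row_index, row in enumerate(rows[:-1]):
        for col_index, cell in enumerate(row.findall("td")):
            link = cell.find("a")
            if link is not None and link.text == product:
                below = rows[row_index + 1].findall("td")[col_index]
                return below.findtext("span")
    return None  # product not found


table = ET.fromstring(TABLE)
print(price_for(table, "Blue Jeans"))
print(price_for(table, "Shirt"))
```

The zero-based col_index corresponds to count(../preceding-sibling::td); XPath positions are one-based, which is why a positional predicate built from that count needs the extra 1.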