How to escape double quote for Ruby inside of XML - ruby

Although I know that I can use &quot;, I was wondering if there was a shorter, less clumsy way, such as \", or the like.
Here is an example of the XML:
<root name="test" type="Node" action="{puts :ROOT.to_s}">
<leaf type="Node" decider="{print :VAL1.to_s; gets.chomp.to_i}" action="{puts :ONE.to_s}" />
<leaf type="Node" decider="{print :VAL2.to_s; gets.chomp.to_i}" action="{puts :TWO.to_s}" />
<branch type="Node" decider="{100}" action="{}">
<leaf type="LikelihoodNode" decider="{100}" action="{puts :HI.to_s}" arg="0"/>
</branch>
</root>
The attributes that need this are decider and action. Right now the embedded code is using a little :sym.to_s hack, but that is not a solution.
NOTE: Although the action attribute contains only a block in braces, the processing code prepends the lambda.

A double quote inside an XML attribute is written as &quot; (or &#34; or &#x22;). You'll have similar issues with single quotes too so you can't use those. However, you can use % as-is in an XML attribute, so %|...|, %Q|...|, and %q|...| are available, and they're as easy to read and type as quotes:
<root name="test" type="Node" action="{puts %|ROOT|}">
<leaf type="Node" decider="{print %|VAL1|; gets.chomp.to_i}" action="{puts %|ONE|}" />
<!-- ... -->
</root>
Choose whichever delimiters you find the easiest to type and read.
You can also use single quotes for your attributes in XML so you can have:
<leaf type='Node' decider='{print "VAL1"; gets.chomp.to_i}' ...
But then you'd have to use &apos; inside the attribute if you needed to include a single quote.
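To see what the embedded code looks like once an XML parser has decoded each form, here is a minimal sketch using Python's standard xml.etree.ElementTree. The id attributes are hypothetical labels added only to tell the three variants apart; they are not part of the original document.

```python
import xml.etree.ElementTree as ET

# Three ways to carry the Ruby string "VAL1" inside an attribute.
# The id attributes are hypothetical labels for this demonstration.
doc = """<root>
  <leaf id="entity" decider="{print &quot;VAL1&quot;}"/>
  <leaf id="percent" decider="{print %|VAL1|}"/>
  <leaf id="single" decider='{print "VAL1"}'/>
</root>"""

root = ET.fromstring(doc)

# The parser resolves &quot; before the application sees the value,
# while % and | pass through untouched.
deciders = {leaf.get("id"): leaf.get("decider") for leaf in root}

for name, code in deciders.items():
    print(f"{name}: {code}")
```

The entity form and the single-quoted attribute should hand the same string to the application; the %|...| form arrives exactly as typed and relies on Ruby to interpret it as a string literal.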
Alternatively, you could switch to elements instead of attributes:
<leaf type="Node">
<decider><![CDATA[
print "VAL1"
gets.chomp.to_i
]]></decider>
<action><![CDATA[
puts "ONE"
]]></action>
</leaf>
but that's a bit verbose, ugly, and not as easy to work with as attributes (IMHO).

Related

How to skip paragraphs with comments in XPath expression?

I'm trying to scrape websites like this with the following Xpath expression:
.//div[@class="tresc"]/p[not(starts-with(text(), "<!--"))]
The thing is that the first paragraph is a comment section, so I'd like to skip it:
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:HyphenationZone>21</w:HyphenationZone>
<w:PunctuationKerning />
<w:ValidateAgainstSchemas />
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:Compatibility>
<w:BreakWrappedTables />
<w:SnapToGridInCell />
<w:WrapTextWithPunct />
<w:UseAsianBreakRules />
<w:DontGrowAutofit />
</w:Compatibility>
<w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
</w:WordDocument>
</xml><![endif]-->
Unfortunately, my expression does not skip the paragraph with comments. Anyone know what I'm doing wrong?
Comments are not part of text(); they are nodes of their own, matched by comment(). To exclude p elements that contain comments, use
p[not(comment())]
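The same idea, that a comment is a separate node rather than text, can be sketched without an XPath engine. The example below uses only Python's standard library (the insert_comments option needs Python 3.8 or later); the sample markup is a shortened, hypothetical stand-in for the scraped page.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for the scraped page.
html = """<div class="tresc">
  <p><!-- [if gte mso 9] Word markup would sit here --></p>
  <p>First real paragraph</p>
  <p>Second real paragraph</p>
</div>"""

# ElementTree drops comments by default; insert_comments=True keeps
# them in the tree as child nodes whose tag is ET.Comment.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(html, parser=parser)

def has_comment(element):
    """Return True if any direct child of element is a comment node."""
    return any(child.tag is ET.Comment for child in element)

# Rough equivalent of p[not(comment())] applied to the div's children.
kept = [p.text for p in root.findall("p") if not has_comment(p)]
print(kept)
```

Note that the first paragraph's text is empty as far as the tree is concerned, which is why a starts-with(text(), "<!--") test never matches it.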

XMLUNIT 2 using comparison with ignore element order with diffbuilder and namespaces fails

I am trying to use DiffBuilder to ignore XML elements order when comparing two .xml files but it fails. I have tried every possible combination and read many articles before posting this question.
For example:
<Data:Keys>
<Data:Value Key="1" Name="Example1" />
<Data:Value Key="2" Name="Example2" />
<Data:Value Key="3" Name="Example3" />
</Data:Keys>
<Data:Keys>
<Data:Value Key="2" Name="Example2" />
<Data:Value Key="1" Name="Example1" />
<Data:Value Key="3" Name="Example3" />
</Data:Keys>
I want these two treated as the same XML. Notice that the elements are empty; they have only attributes.
What I did so far:
def diff = DiffBuilder.compare(Input.fromString(xmlIN))
.withTest(Input.fromString(xmlOUT))
.ignoreComments()
.ignoreWhitespace()
.checkForSimilar()
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.conditionalBuilder()
.whenElementIsNamed("Data:Keys").thenUse(ElementSelectors.byXPath("./Data:Value",
ElementSelectors.byNameAndText))
.elseUse(ElementSelectors.byName)
.build()))
But it fails every time. I don't know if the issue is the namespace, or that the elements are empty.
Any help will be appreciated. Thank you in advance.
If you aim to match the Data:Value tags by their attributes, you should start with this:
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.conditionalBuilder()
.whenElementIsNamed("Data:Value")
and since that tag doesn't have any text, the byNameAndText won't work. You can only work on names and attributes. My advice is to do it like this:
.thenUse(ElementSelectors.byNameAndAttributes("Key"))
or
.thenUse(ElementSelectors.byNameAndAllAttributes())
//equivalent
.thenUse(ElementSelectors.byNameAndAttributes("Key", "Name"))
As for the namespace issue, checkForSimilar() should output SIMILAR, which means the documents are not DIFFERENT, so this is what you need. If you didn't use checkForSimilar(), the differences in namespaces would be reported as DIFFERENT.
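To illustrate why matching on name plus attributes makes the comparison order-insensitive, here is a rough Python analogue. It is not XMLUnit and does not reproduce its API; the namespace URI and the helper names signature and same_children_any_order are assumptions made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace declaration; the question does not show the real URI.
NS = 'xmlns:Data="urn:example:data"'

xml_in = f"""<Data:Keys {NS}>
  <Data:Value Key="1" Name="Example1"/>
  <Data:Value Key="2" Name="Example2"/>
  <Data:Value Key="3" Name="Example3"/>
</Data:Keys>"""

xml_out = f"""<Data:Keys {NS}>
  <Data:Value Key="2" Name="Example2"/>
  <Data:Value Key="1" Name="Example1"/>
  <Data:Value Key="3" Name="Example3"/>
</Data:Keys>"""

def signature(element):
    """Identify an element by its qualified name plus all of its attributes."""
    return (element.tag, tuple(sorted(element.attrib.items())))

def same_children_any_order(a, b):
    """Compare the children of two elements while ignoring their order."""
    return sorted(signature(c) for c in a) == sorted(signature(c) for c in b)

result = same_children_any_order(ET.fromstring(xml_in), ET.fromstring(xml_out))
print(result)
```

Because the elements carry no text, the attributes are the only thing that can tell one Data:Value from another, which is the same reason byNameAndText cannot pair them up.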

Using Variable in Node Attribute

I am writing an XSL template. I can't work out how to specify a variable for a custom attribute inside the XSL file.
I am trying this code in XSL:
<xsl:variable name="var1" select="DEF"/>
<frequency myAttr="ABC"+$var1 >
<xsl:value-of select="frequency"/>
</frequency>
Expected result is
<frequency myAttr="ABCDEF" >20</frequency>
I am getting this error:
Unable to generate the XML document using the provided XML/XSL input. org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 24; Element type "sourceId" must be followed by either attribute specifications, ">" or "/>"
The issue is that the way I am concatenating is wrong. Any help in achieving this?
I think you mean:
<frequency myAttr="ABC{$var1}" >
This will concatenate the literal string "ABC" with the content of the $var1 variable - see: Attribute Value Templates.
Note that the way you populate the variable suggests there is an element named DEF in the source XML. The expected result will be obtained only if this element has a string value of "DEF".
XSL is a declarative language in XML format. An XSLT file must be well-formed XML.
What you tried looks like you have programming knowledge of an imperative language where you can write expressions like "ABC" + $var1. This violates the XML format rules in XSL however. Elements follow the pattern <elementname attribute1="value1" attributeN="valueN">...</elementname>. In your code, you put +$var1 after the end of the myAttr attribute, which ends at the second quote mark - and this is invalid: <frequency myAttr="ABC"+$var1 >
<frequency myAttr="ABC+$var1"> would result in a literal attribute value of ABC+$var1, which is not what you want. The same happens if you write an XPath expression directly as the attribute value: <frequency myAttr="concat('ABC', $var1)"> yields the literal string concat('ABC', $var1).
You can use the Attribute Value Template syntax as michael.hor257k suggested, which essentially means to wrap XPath expressions in curly braces inside of attribute value strings.
Another way would be to not write the element as literal in your code, but rather declare it:
<xsl:variable name="var1" select="'DEF'"/>
<xsl:element name="frequency">
<xsl:attribute name="myAttr">
<xsl:value-of select="concat('ABC', $var1)"/>
</xsl:attribute>
</xsl:element>
Note that I corrected the variable definition: In the select attribute, you must put DEF in single quote marks if this is supposed to be a string, like select="'DEF'". Without the single quote marks you define the variable to refer to <DEF> elements in the source XML.
In the select attribute of <xsl:value-of> I used the XPath function concat() to concatenate the string 'ABC' and the content of the variable $var1. The attribute value template syntax can not be used here: select="'ABC{$var1}'" would result in the string ABC{$var1}.
A variation of above example would be to use <xsl:text> for the string ABC and <xsl:value-of> to output the content of $var1:
<xsl:variable name="var1" select="'DEF'"/>
<xsl:element name="frequency">
<xsl:attribute name="myAttr">
<xsl:text>ABC</xsl:text>
<xsl:value-of select="$var1"/>
</xsl:attribute>
</xsl:element>
So there are three solutions, a concise one and two declarative ones. Which one you choose is up to you. The quickest to type is certainly the first:
<xsl:variable name="var1" select="'DEF'"/>
<frequency myAttr="ABC{$var1}">
<xsl:value-of select="frequency"/>
</frequency>
... but you should also be aware of both declarative ways to achieve the same result and understand how they work.

XPath remove single node (via Saxon CLI)

I want to remove a node from an XML file (using SaxonHE9-8-0-11J):
<project name="Build">
<property name="src" value="src/main/resources" />
<property name="target" value="target/classes" />
<condition property="target.exists">
<available file="target" />
</condition>
</project>
Apparently there are two ways I can do this:
XPath 1: using a not function
XPath 2: using an except clause
But both simply return the entire node-set.
With a not function:
saxonb-xquery -s:test.xml -qs:'*[not(local-name()="condition")]'
With an except clause:
saxonb-xquery -s:test.xml -qs:'* except condition'
With -explain switch the queries are:
<query>
<body>
<filterExpression>
<axis name="child" nodeTest="element()"/>
<operator op="ne (on empty return true())">
<functionCall name="local-name">
<dot/>
</functionCall>
<literal value="condition" type="xs:string"/>
</operator>
</filterExpression>
</body>
</query>
and
<query>
<body>
<operator op="except">
<axis name="child" nodeTest="element()"/>
<path>
<root/>
<axis name="descendant" nodeTest="element(condition, xs:anyType)"/>
</path>
</operator>
</body>
</query>
In general, XPath selects nodes from one or more input documents; it doesn't allow you to construct new ones. For that you need XSLT or XQuery. Removing the condition child of the project root, if that is what you want to achieve, is therefore a job for XSLT or XQuery. With XPath alone, even if you use /*/(* except condition), you get all children except the condition element, but as a sequence, not wrapped in a root element.
So with XQuery you could use
/*/element {node-name()} { * except condition }
as a compact but generic way to reconstruct any root with all child elements except the condition: https://xqueryfiddle.liberty-development.net/948Fn5b
Whether you can pass such an expression through a command-line shell is a different problem. On Windows, in both a PowerShell window and the cmd shell, this works for me:
-qs:"/*/element {node-name()} { * except condition }"
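The select-versus-construct distinction can also be sketched in Python with the standard library: selecting the children is one step, and wrapping them in a new root is a separate, explicit step. This is only an illustration of the idea, not a replacement for the Saxon command. One difference from the XQuery expression: `* except condition` selects child elements only, whereas this sketch also carries over the root's attributes.

```python
import copy
import xml.etree.ElementTree as ET

source = """<project name="Build">
  <property name="src" value="src/main/resources" />
  <property name="target" value="target/classes" />
  <condition property="target.exists">
    <available file="target" />
  </condition>
</project>"""

root = ET.fromstring(source)

# Selection alone: a plain list of nodes with no enclosing root.
selected = [child for child in root if child.tag != "condition"]

# Construction: build a new root and copy the selected children into it.
rebuilt = ET.Element(root.tag, root.attrib)
rebuilt.extend([copy.deepcopy(child) for child in selected])

print(ET.tostring(rebuilt, encoding="unicode"))
```

The original tree is left untouched because the children are deep-copied rather than moved.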

XPATH expression that Matches on the attribute value "true"

I have some XML like this:
<engine-set>
<engine host-ref="blah1.com">
<property name="foo" value="true"/>
<property name="bar" value="true"/>
</engine>
<engine host-ref="blah2.com">
<property name="foo" value="true"/>
<property name="bar" value="false"/>
</engine>
</engine-set>
I want to match on all engine elements that have a child node property with a name equal to "bar" and value equal to "true". I'm finding that the fact that "true" appears in my XML causes my condition to always evaluate to true in an XPath expression. Is there a way around this? I'm using Python and lxml.
EDIT:
My XPath expression (which isn't working) is:
//engine[(property/@name='bar' and property/@value="true")]
Thanks,
I want to match on all engine elements
This is:
//engine
that have a child node property
Now this becomes:
//engine[property]
with a name equal to "bar"
Still more specific:
//engine[property[@name = 'bar']]
and value equal to "true".
Finally:
//engine[property[@name = 'bar' and @value = 'true']]
So you're saying
//engine[property[@name='bar' and @value='true']]
gives you too many results? Because for me it gives just one.
What XPath expression did you try?
The following seems to work well in getting "blah1.com" but not "blah2.com":
//engine[property[@value="true"][@name="bar"]]
Remember that you need to encase your parameter test values in quotes.
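The expression in the question matches both engines because property/@name='bar' and property/@value="true" are independent tests: each is satisfied if any property child qualifies, so one child can supply the name and a different child the value. Nesting both tests inside a single property[...] predicate ties them to the same element. The sketch below shows the difference using only Python's standard library; ElementTree's limited XPath subset has no `and`, so the combined test is written as two chained predicates, and the variable names strict and loose are made up for this example.

```python
import xml.etree.ElementTree as ET

doc = """<engine-set>
  <engine host-ref="blah1.com">
    <property name="foo" value="true"/>
    <property name="bar" value="true"/>
  </engine>
  <engine host-ref="blah2.com">
    <property name="foo" value="true"/>
    <property name="bar" value="false"/>
  </engine>
</engine-set>"""

root = ET.fromstring(doc)

# Both tests must hold for the SAME property element.
strict = [
    engine.get("host-ref")
    for engine in root.iter("engine")
    if engine.find("property[@name='bar'][@value='true']") is not None
]

# Each test may be satisfied by a DIFFERENT property element,
# which mirrors the expression from the question.
loose = [
    engine.get("host-ref")
    for engine in root.iter("engine")
    if engine.find("property[@name='bar']") is not None
    and engine.find("property[@value='true']") is not None
]

print(strict)
print(loose)
```

With lxml, the nested-predicate XPath from the answers can be passed to the xpath() method unchanged.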
