Is it possible to specify an OR in an XPath in an XSD? - xpath

I want to be able to guarantee uniqueness for two types of elements : MainQuestion and AlternateQuestion.
In the select query for my xsd:key, can I specify something that would do "//MainQuestion or //AlternateQuestion"? Someone told me that something like this existed, but it seems that XSD only supports a subset of the XPath syntax...

You should be able to use | as usual. Paths in an XSD selector must be relative, so start each one with .// rather than //:
.//MainQuestion | .//AlternateQuestion
The syntax is indeed restricted: it is roughly the same as the restrictions on template patterns in XSLT 1.0, and in addition it cannot have any predicates (filters) in path steps. However, | is explicitly listed as supported.

Supporting Pavel's answer that you can use "|" in an XPath in XML Schema.
XML Schema supports a subset of XPath (which I think of as "fake XPath"). What it supports is explicitly stated in the spec. You have to trace through a few sections to find it. This is a link to the exact section:
http://www.w3.org/TR/xmlschema-1/#c-selector-xpath

Have you tried the or operator?
http://w3schools.com/xpath/xpath_operators.asp

Related

exist-db: XQuery and documents with XInclude

I'm embarking on a new project with eXist. We'll be storing a few hundred TEI XML documents that represent manuscripts. A number of things we want to capture are repetitive, mainly people and places. My colleague has asked the TEI community about strategies for representing what we want to capture, and using XInclude was suggested as a way of reducing duplication.
I've had a quick play with adding an XInclude into a document, and the serialized XML does render the included XML file. However, the included text was missing from an XQuery. I notice in the eXist docs (http://exist-db.org/exist/apps/doc/xinclude.xml) that:
eXist-db expands XIncludes at serialization time, which means that the
query engine will see the XInclude tags before they are expanded. You
therefore cannot query across XIncludes - unless you create your own
code (e.g. an XQuery function) for it. We would certainly like to
support queries over xincluded content in the future though.
What is the best practice for querying files that use XInclude?
I'm wondering whether I should have a 'job' that serializes the source TEI XML files to expand the XIncludes and store these files in a separate collection? In that case, would file:serialize be the correct function for this task?
We are at the start of the project, so any advice appreciated.
Can you describe what kind of query you tried that was missing the text?
Generally, since the files referenced via XInclude are well-formed XML documents, you can use collections (folders) to organise your queries in eXist-db. So instead of for $search in doc("mydoc.xml") you could use for $search in collection('/app/mydata')/*
A more elaborate answer would follow the attribute of the unexpanded XInclude statement in the source document and find the matching element in the target, but it's difficult to abstract that without a concrete MWE.
Have you tried creating a temporary, expanded fragment in a let clause and querying that instead of the stored XML?
Beware of namespaces!
Hope this helps, and greetings to Sebastiaan.

Does Jackrabbit support the XPath union (|) operator?

I am trying to search under 2 different nodes for a specific name. This works
/jcr:root/db067409/libraries/bd0b868d/_x0030_//*[@name="FIRST"]
But when I try to OR it with the second node like so...
/jcr:root/db067409/libraries/bd0b868d/_x0030_//*[@name="FIRST"]|/jcr:root/db067409/libraries/_x0033_78d57e4/_x0031_//*[@name="FIRST"]
I no longer get any search results. Please could someone point out what I've done wrong.
What I'd really like to do is along these lines: if I have /a/b/ID1/VERSION1 and /a/b/ID2/VERSION2, I'd like an XPath something like this: /a/b/(ID1/VERSION1 or ID2/VERSION2)//*[@name="some name"].
The answer is no. Unfortunately, it doesn't throw an UnsupportedOperationException like you'd expect. There was an item in Jira, but I guess they ignored it as XPath is now deprecated.
Use JCR_SQL2 if you do need a union.
Edit
This thread indicates that a union-like feature is available in Jackrabbit 2.0, but not earlier.
[Joins] are not possible with JCR
Xpath or JCR-SQL, but with the new query model in JCR 2.0 (JCR-SQL2).
This is supported since CQ 5.3 / CRX 2.0 / Jackrabbit 2.0. Please note
that these joins aren't optimized very much.
Indeed, XPath is deprecated in JCR 2.0.
JCR 1.0 defines a dialect of SQL different from JCR-SQL2, as well as a dialect of XPath. Support for these languages is deprecated.

How to use not contains() in XPath?

I have some XML that is structured like this:
<whatson>
  <productions>
    <production>
      <category>Film</category>
    </production>
    <production>
      <category>Business</category>
    </production>
    <production>
      <category>Business training</category>
    </production>
  </productions>
</whatson>
And I need to select every production with a category that doesn't contain "Business" (so just the first production in this example).
Is this possible with XPath? I tried working along these lines but got nowhere:
//production[not(contains(category,'business'))]
XPath queries are case sensitive. Having looked at your example (which, by the way, is awesome, nobody seems to provide examples anymore!), I can get the result you want just by changing "business" to "Business":
//production[not(contains(category,'Business'))]
I have tested this by opening the XML file in Chrome and using the Developer Tools to execute that XPath query, and it gave me just the Film category back.
I need to select every production with a category that doesn't contain "Business"
Although I upvoted @Arran's answer as correct, I would also add this...
Strictly interpreted, the OP's specification would be implemented as
//production[category[not(contains(., 'Business'))]]
rather than
//production[not(contains(category, 'Business'))]
The latter selects every production whose first category child doesn't contain "Business". The two XPath expressions will behave differently when a production has no category children, or more than one.
It doesn't make any difference in practice as long as every <production> has exactly one <category> child, as in your short example XML. Whether you can always count on that being true or not, depends on various factors, such as whether you have a schema that enforces that constraint. Personally, I would go for the more robust option, since it doesn't "cost" much... assuming your requirement as stated in the question is really correct (as opposed to e.g. 'select every production that doesn't have a category that contains "Business"').
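To make the difference between the two expressions concrete, here is a rough Python sketch. Python's standard xml.etree.ElementTree does not implement contains(), so the two predicates are emulated in plain Python rather than evaluated as XPath. The helper names and the two extra productions (one with two categories, one with none) are assumptions added for illustration and are not part of the original XML.

```python
import xml.etree.ElementTree as ET

# The question's XML plus two hypothetical edge cases:
# a production with two categories and one with no category at all.
XML = """
<whatson>
  <productions>
    <production><category>Film</category></production>
    <production><category>Business</category></production>
    <production><category>Business training</category></production>
    <production><category>Business</category><category>Film</category></production>
    <production/>
  </productions>
</whatson>
"""


def categories(production):
    """Return the text of every <category> child, in document order."""
    return [c.text or "" for c in production.findall("category")]


def first_category_lacks(production, needle):
    """Emulates //production[not(contains(category, needle))].

    XPath 1.0 converts the node-set to the string value of its first
    node, or to the empty string when the node-set is empty.
    """
    cats = categories(production)
    first = cats[0] if cats else ""
    return needle not in first


def some_category_lacks(production, needle):
    """Emulates //production[category[not(contains(., needle))]]."""
    return any(needle not in c for c in categories(production))


root = ET.fromstring(XML)
productions = root.findall(".//production")

# Zero-based indexes of the productions each variant would select.
loose = [i for i, p in enumerate(productions) if first_category_lacks(p, "Business")]
strict = [i for i, p in enumerate(productions) if some_category_lacks(p, "Business")]

print("not(contains(category, ...)):", loose)
print("category[not(contains(., ...))]:", strict)
```

Both variants agree on the three productions from the question and only diverge on the two added edge cases, which is why the choice only matters when a production can have zero or several category children.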
You can use the not(expression) function.
not() is a function in XPath (as opposed to an operator).
Example:
//a[not(contains(@id, 'xx'))]
OR
expression != true()
It should be an XPath that wraps contains() in not(), with the category text matched case-sensitively: //production[not(contains(category,'Business'))]

XPATH remove attribute

Hi, does anyone know how to remove an attribute using XPath? In particular, the rel attribute and its text from a link, i.e. <a href='http://google.com' rel='some text'>Link</a>, and I want to remove rel='some text'.
There will be multiple links in the html i am parsing.
You can select items using xpath, but that's all it can do - it is a query language.
You need to use XSLT or an XML parser in order to remove attributes/elements.
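As one possible illustration of the "XML parser" route, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It assumes the markup is well-formed XML (real-world HTML often is not, in which case an HTML-aware parser would be needed). The sample document and the strip_attribute helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup with several links.
MARKUP = """<div>
  <a href="http://google.com" rel="some text">Link</a>
  <a href="http://example.com" rel="nofollow">Other</a>
  <a href="http://example.org">No rel</a>
</div>"""


def strip_attribute(root, tag, attr):
    """Delete `attr` from every `tag` descendant that has it.

    The XPath only *selects* the nodes; the removal itself is done
    through the parser's API. Returns how many attributes were removed.
    """
    matches = root.findall(f".//{tag}[@{attr}]")
    for element in matches:
        del element.attrib[attr]
    return len(matches)


root = ET.fromstring(MARKUP)
count = strip_attribute(root, "a", "rel")
result = ET.tostring(root, encoding="unicode")

print(count)
print(result)
```

The XPath expression .//a[@rel] is used purely as a query, which matches the point above: the selection is XPath, the modification is done by the surrounding tooling.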
As pointed out by Oded, Xpath merely identifies XML nodes. To remove/edit XML, you need some additional tooling.
One solution is the Ant-based plugin XMLTask (disclaimer - I wrote this). It provides a simple mechanism to read an XML file, identify parts of that using XPath, and change it (including removing nodes).
e.g.
<remove path="web/servlet/context[@id='redundant']"/>
Have you already tried using JavaScript for this, if that is applicable in your scenario?
var allLinks = document.getElementsByTagName("a");
// Strip the rel attribute from every link on the page.
for (var i = 0; i < allLinks.length; i++) {
  allLinks[i].removeAttribute("rel");
}

What's the difference between XSL Pattern and XPath in syntax?

I'm updating code to move from MSXML 3.0 to MSXML 6.0.
However, I noticed that for MSXML 3.0 the default "SelectionLanguage" is "XSL Pattern", while MSXML 6.0 only supports XPath.
I'm concerned that this change will introduce differences in the query syntax.
Can somebody list the syntax differences between the two?
The one thing that has tripped me up is selecting the first node in a node set. For example, we'd been using MSXML 3.0 (which uses XSLPattern) and had queries like this:
/root/book[0]
This query was supposed to select the first book. This works with XSLPattern. But with XPath, this is correct:
/root/book[1]
So when I switched us to using MSXML 6.0, which uses correct XPath, all those queries with "[0]" stopped working.
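A small Python sketch of the index shift, using the standard library's xml.etree.ElementTree, whose path syntax follows XPath in counting positions from 1. The to_xpath helper is hypothetical: it naively bumps bare integer predicates by one and makes no attempt to cover any other XSL Pattern difference, so treat it as an aid for finding affected queries rather than a converter.

```python
import re
import xml.etree.ElementTree as ET

DOC = "<root><book>A</book><book>B</book><book>C</book></root>"


def to_xpath(xsl_pattern):
    """Hypothetical helper: shift zero-based [n] predicates to one-based.

    Only plain integer predicates such as [0] are rewritten; anything
    else in the pattern is left untouched.
    """
    return re.sub(
        r"\[(\d+)\]",
        lambda m: f"[{int(m.group(1)) + 1}]",
        xsl_pattern,
    )


root = ET.fromstring(DOC)

# Positions are 1-based, so book[1] is the first book.
first = root.find("book[1]")
last = root.find("book[last()]")

# A relative XSL Pattern-style query, converted before use.
converted = to_xpath("book[0]")
via_converted = root.find(converted)

print(first.text, last.text, converted, via_converted.text)
```

Applied to the query from the answer above, to_xpath("/root/book[0]") yields "/root/book[1]".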
Update:
I just found this link that talks some more about XSLPattern and XPath:
MSDN Magazine: MSXML 3.0 Supports XPath 1.0, XSLT 1.0, XDR, and SAX2
http://msdn.microsoft.com/en-us/magazine/cc302348.aspx
Update #2:
Here's the W3C Spec on XSLT which includes XSL Patterns:
http://www.w3.org/TR/1998/WD-xsl-19981216.html#AEN376
Update #3
Here's another post that describes the same thing I mentioned above:
http://www.eggheadcafe.com/software/aspnet/29579789/xml-parsing.aspx
XSL Pattern, if I remember correctly, was a selection language like XPath but was implemented by Microsoft before XPath was standardised (possibly even created). I don't think anyone even has anything that documents XSL Pattern any more. You can basically forget about it and concentrate on XPath. It has the same purpose but is supported and standardised.
XSL Patterns appear to be part of WD-XSL, "working draft XSL", which means versions predating the XSL recommendation (1999), which differ significantly from the final 1.0 version.
Microsoft has the relevant info on "XSL Patterns". Here's a quote from the section XPath 1.0 APIs:
MSXML 2.0 provides support for XSL Patterns, the precursor to XPath 1.0. The notion of an XML addressing language was introduced into the original W3C XSL Working Drafts (http://www.w3.org/TR/1998/WD-xsl-19981216.html) and called XSL Patterns. MSXML 2.0 implements the XSL Patterns language as described in the original XSL specification with a few minor exceptions.
MSXML 3.0 provides support for the legacy XSL Patterns syntax as well as XPath 1.0.
XPath, in my experience, is much easier to get your head around. I avoid XSL like the plague if I can. But you are right, the syntax is very different, so if you want to switch from XSL to XPath you have some work ahead of you. I cannot explain the differences easily, but this tutorial should give you some idea of what XPath is about:
http://www.w3schools.com/XPath/xpath_examples.asp
