Exist-db: specify order of transformation? - exist-db

I have more than 100 files in my eXist-db instance.
I transform them using a function (https://pastebin.com/7Q2g4TPM).
I need several things:
I need them to be transformed in order, from the lowest number to the highest (which will be 162). The file names are zero-padded: 00001.xml, the tens begin with 00010.xml, and the hundreds with 00100.xml.
I tried adding one file at a time (up to 15 files), and I tried adding batches of files. All files are in the directory edition; the first file at the moment is 00029.xml, which is hardcoded as the starting point for my Bootstrap carousel. (https://pastebin.com/WNKAgihw is where I want them to be displayed for now. The structure will probably change a little, but the general idea is this.)
Most of the time it seems to work fine. However, file 36 is displayed not at the expected position but two elements later. Further on, 142 is inserted after 38, followed by several files from the mid-hundreds, and then the list returns to the intended order. I did not check all files, but I saw this quite a few times.
Another question I have is this one:
Can I somehow get a
<ol class="carousel-indicators">
<li data-target="#carouselIndicators" data-slide-to="0" class="active">File 1</li>
<li data-target="#carouselIndicators" data-slide-to="1">File 2</li>
<li data-target="#carouselIndicators" data-slide-to="2"> File 3</li>
</ol>
where data-slide-to counts up (0, 1, 2, etc.) without hardcoding it for every file?
I guess the function (first pastebin) can serve as a starting point, but how do I make the numbers go up?
I hope I am clear with these questions and that someone knows how to help :-)
best wishes and many thanks in advance,
K

You likely need to use an order by clause in the FLWOR expression of your XQuery.
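The ordering and numbering logic can be sketched outside XQuery. The following Python is only an illustration of the idea, not the code from the pastebin: the function name carousel_indicators and the sample file names are assumptions. It sorts the names by their numeric value and then derives data-slide-to from the position in the sorted list, so nothing is hardcoded per file. In XQuery the same two steps would be an order by clause followed by numbering the already sorted sequence.

```python
def carousel_indicators(file_names):
    """Build Bootstrap carousel indicators for files named like 00029.xml."""
    # Sort by the numeric value of the name, so 36 comes before 142
    # regardless of the order in which the files were stored.
    ordered = sorted(file_names, key=lambda name: int(name.split(".")[0]))

    items = []
    # enumerate() supplies the running number for data-slide-to.
    for index, name in enumerate(ordered):
        number = int(name.split(".")[0])
        # Only the first indicator is marked active.
        active = ' class="active"' if index == 0 else ""
        items.append(
            f'<li data-target="#carouselIndicators" '
            f'data-slide-to="{index}"{active}>File {number}</li>'
        )
    return '<ol class="carousel-indicators">\n' + "\n".join(items) + "\n</ol>"


if __name__ == "__main__":
    # Hypothetical input in a scrambled order like the one described above.
    print(carousel_indicators(["00038.xml", "00142.xml", "00029.xml", "00036.xml"]))
```

The important design point is the order of operations: sort first, then number. Numbering before sorting would attach the counter to the unsorted positions.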

Related

Avoid parentheses in path using XPath 1.0

The following XML structure represents a website with many articles. Every article contains, among many other things, date of its creation and possibly arbitrarily many dates of its modification. I want to get the date of the last access (either creation or last modification) to every article using XPath 1.0.
<website>
<article>
<date><strong>22.11.2017</strong></date>
<edits>
<edit><strong>17.12.2017</strong></edit>
</edits>
</article>
<article>
<date><strong>17.4.2016</strong></date>
<edits></edits>
</article>
<article>
<date><strong>3.5.2011</strong></date>
<edits>
<edit><strong>4.5.2011</strong></edit>
<edit><strong>12.8.2012</strong></edit>
</edits>
</article>
<article>
<date><strong>12.2.2009</strong></date>
<edits></edits>
</article>
<article>
<date><strong>23.11.1987</strong></date>
<edits>
<edit><strong>3.4.2001</strong></edit>
<edit><strong>11.5.2006</strong></edit>
<edit><strong>13.9.2012</strong></edit>
</edits>
</article>
</website>
In other words, the expected output is:
<strong>17.12.2017</strong>
<strong>17.4.2016</strong>
<strong>12.8.2012</strong>
<strong>12.2.2009</strong>
<strong>13.9.2012</strong>
So far I've only created this path:
//article/*[self::date or self::edits/edit][last()]
that looks for date and nonempty edits nodes in every article and selects the last one. But I don't know how to access the last strong of every such selection, and the naive //strong[last()] appended to the end of the path doesn't work.
I found a solution in XPath 2.0. Either of these paths should work, if I'm not mistaken:
//article/(*[self::date or self::edits/edit][last()]//strong)[last()]
//article/(*//strong)[last()]
Such use of parentheses within path is invalid in XPath 1.0 though.
This XPath 1.0 expression
/website/article/descendant::strong[parent::date|parent::edit][last()]
Selects the nodes:
<strong>17.12.2017</strong>
<strong>17.4.2016</strong>
<strong>12.8.2012</strong>
<strong>12.2.2009</strong>
<strong>13.9.2012</strong>
Tested in http://www.xpathtester.com/xpath/56d8f7bc4b9c8c064fdad16f22469026
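Do note: positional predicates such as [last()] are evaluated against the node list that the step selects for each context node (here, each article), not against the whole document.
Here is a simpler XPath that produces your output.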
//article/descendant-or-self::strong[last()]

Reversing order in ng-repeat output using TamperMonkey

So I have this line on a page that provides a list of serial numbers as they are scanned in:
<div class='animate' ng-repeat='serialIdEntry in scan.scannedSerialIds track by $index'>
Presently each new scanned serial is appended to the bottom of the list, and it would be wildly more efficient and helpful were the order reversed.
Meaning, I need the most recent scan to be at top, and the whole of the list (as displayed) in reverse order.
I believe I can accomplish that by adding | orderBy:'':true" after the track by $index argument. Or by simply replacing that line with one that included the orderBy argument.
I suspect the method of updating on that page is AJAX/Angular, and I'm wondering how best to modify the output by way of a TamperMonkey script. I've been reading up on `waitForKeyElements', but that seems to largely speak to appending a new line of code rather than inserting a new snippet inline, or replacing a single line with one that has been modified.

How does empty start tag work in HTML4?

The HTML4 specification mentions various SGML shorthand markup constructs. While I understand what others do, with a help of HTML validator, I cannot find understand why anyone would want an empty start tag. It cannot even have attributes, so it's not a shorter <span>.
The SGML definition of HTML4 enables the empty start feature. In it, there is an interesting section with features.
FEATURES
MINIMIZE
DATATAG NO
OMITTAG YES
RANK NO
SHORTTAG YES
LINK
SIMPLE NO
IMPLICIT NO
EXPLICIT NO
OTHER
CONCUR NO
SUBDOC NO
FORMAL YES
APPINFO NONE
The important section of features is MINIMIZE section. It enables OMITTAG which is a standard feature of HTML, which allows start or end tags to be ommited. This is particular allows you to write code like <p> a <p> b, without closing paragraphs.
The more important part is SHORTTAG feature, which is actually a category. However, because it's not expanded, the SGML automatically assumed YES for all entries in it. It has the following categories in it. Feel free to skip this list, if you aren't interested in other shorthand features in SGML.
ATTRIB, which deals with attributes, and has following options.
DEFAULT - defines whether attributes can contain default values. This allows writing <p> without defining every single attribute. Nobody would want to write <p id="" class="" style="" title="" lang="en" dir="ltr" onclick="" ondblclick="" ...></p> after all. Hey, I even gave up trying to write all that. This is a commonly supported feature.
OMITNAME - if the attribute and value have the same name, the value is optional. This allows writing <input type="checkbox" checked> for instance. This is a commonly supported feature (although, HTML5 defines default to be empty string, not an attribute name).
VALUE - allows writing values without quotes. This allows writing code like <p class=warning></p> for instance. This is a commonly supported feature.
ENDTAG, which is a category for end tags containing the following options.
UNCLOSED - allows starting a new tag before ending the previous tag, allowing code like <p><b></b</p>.
EMPTY - allows unnamed end tags, such as <b>something</>. They close most recent element which is still open.
STARTTAG, which is a category for start tags containing the following options.
NETENABL - allows using Null End Tag notation. It's worth noting this notation is incompatible with XHTML. Anyway, the feature allows writing code like <b/<i/hello//, which means the same thing as <b><i>hello</i></b>.
UNCLOSED - allows starting a new tag before ending the previous tag, allowing code like <p<b></b></p>.
EMPTY - this is the asked feature.
Now, it's important to understand what EMPTY does. While <> may appear useless at first (hey, how could you determine what it does, when nothing aside of Validator supports it), it's actually not. It opens the previous sibling, allowing code like the following.
<ul>
<li class=plus> hello world
<> another list element
<> yet another
<li class=minus> nope
<> what am I doing?
</ul>
In this example, the list has two classes, plus and minus for positive and negative arguments. However, the webmaster was lazy (and doesn't care about that HTML4 doesn't support this), and decided to use empty start tag in order to not specify the class for next elements. Because <li> has optional end tag, this automatically closed previous <li> tag.

Select either A or B with Or Operator in XPath

I'm trying to crawl some websites, and the data I want can be found either of these places depending on the site:
Page 1:
<div>
<ul>
<li class="asd"> SomeText1 </li>
</ul>
</div>
Page 2:
<div>
<ul>
<li class="dsa"> SomeText2 </li>
</ul>
</div>
I would like an XPath expression which tries to select SomeText1 first, and if it doesn't exist, tries to get SomeText2.
I've tried //li[#class="asd"]/text() or //li[#class="dsa"]/text(), but this doesn't seem to cut it.
Am I using the or operator wrong? If so, how is it supposed to be used?
EDIT
I'm trying to feed a crawler an XPath in order to find information to store in a DB. On a given webpage, can the information I'm trying to get be two different places?
Which means webpage 1 could be:
<AA>
<BB>
<CC> Test </CC>
</BB>
</AA>
and on another there could be
<DD>
<EE>
<FF> Test </FF>
</EE>
</DD>
How can I construct an XPath expression which can say either do
AA/BB/CC or (if it fails/doesn't exist) DD/EE/FF?
You can shorten it to:
//li[#class = 'asd' or #class = 'dsa']/text()
Having said that, "not working" is never an accurate description of what went wrong. A potential source of error is double quotes instead of single quotes. If there are double quotes arround the expression, any quotes inside must be single.
Am I using the or operator wrong ?
No, your usage of the or operator is fine. Something else went wrong. (To really diagnose your problem, we'd need more context).
Try...
//li[#class="asd" or #class="dsa"]/text()

combining XPATH axes (preceding-sibling & following-sibling)

Say I have the following UL:
<ul>
<li>barry</li>
<li>bob</li>
<li>carl</li>
<li>dave</li>
<li>roger</li>
<li>steve</li>
</ul>
I need to grab all the LIs between bob & roger. I can grab everything after bob with //ul/li[contains(.,"bob")]/following-sibling::li, and I can grab everything before roger with //ul/li[contains(.,"roger")]/preceding-sibling::li. The problem is when I try to combine the two, I end up getting extra results.
For example, //ul/li[contains(.,"bob")]/following-sibling::li[contains(.,"roger")]/preceding-sibling::li will of course get everything before roger, instead of ignoring the items before bob.
Is there a way to chain these two xpaths together?
Try:
/ul/li[preceding-sibling::li='bob' and following-sibling::li='roger']

Resources