XSLT FOP: force page break within fo:block - pdf-generation

I'm generating a table with strings of different lengths in one column.
The page break within an <fo:block/> in the Arbeitsgangbeschreibung column is causing me problems.
As you can see, the string normally starts on the same line as the numbers and takes as many rows as it needs.
But the string after the page break (starting with 'Schleifen Stirnseite Gewinde sauber...') doesn't start on the same line as the numbers it belongs to. It seems the <fo:block/> is being kept together on one page.
But I just want the <fo:block/> to break at the end of the page, which means
Schleifen Stirnseite Gewinde sauber | -
should be on the first page. And the rest
Leisten unter Vorrichtung unterlegen (damit Schraubenkopf frei ist)
should be on the next page.
There aren't any keep-together settings that it could accidentally inherit.
This is the <fo:block/>:
<fo:table-cell border-right="{$Standardrand}">
<fo:block margin-top="4pt" font-size="9pt" font-weight="bold" margin-left="3pt">
<xsl:value-of select="beschreibung" />
</fo:block>
</fo:table-cell>

Add widows="1" and orphans="1" to the fo:block.
orphans (https://www.w3.org/TR/xsl11/#orphans) and widows (https://www.w3.org/TR/xsl11/#widows) set the minimum number of lines of a block of text that must be left at the bottom or top of a page, respectively. The initial value for both is 2, so a three-line block of text cannot be split: any break would leave only one line on one of the pages. With the defaults, a block needs at least 4 lines before it can be split.
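The arithmetic behind this can be sketched in a few lines of Python. This is only a simplified model of the widows/orphans rule, not FOP's actual line-breaking algorithm, and the function name allowed_breaks is made up for illustration.

```python
def allowed_breaks(line_count, orphans=2, widows=2):
    """Return the possible numbers of lines left on the first page.

    Simplified model: a break after k lines is allowed only if at least
    `orphans` lines stay at the bottom of the first page and at least
    `widows` lines move to the top of the next page.
    """
    return [k for k in range(1, line_count)
            if k >= orphans and line_count - k >= widows]


print(allowed_breaks(3))                       # defaults: []
print(allowed_breaks(4))                       # defaults: [2]
print(allowed_breaks(3, orphans=1, widows=1))  # [1, 2]
```

With both properties set to 1, even a two-line block has a valid break position in this model.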

Related

Two Columns - One Ordered List Jekyll

How can I adapt my current two-column (Bootstrap) ordered list in a Jekyll site? The goal is to have Jekyll take my .md file of ordered list items and split it into two columns. At the moment I have to edit the ordered list in each of the two columns by hand, which will become a problem in the long run because this document is constantly updated and extended.
Here is my current code:
<!-- Left Column -->
<div class="col-sm">
<ol>
<li>Zebras</li>
<li>Lions</li>
</ol>
</div>
<!-- Right Column -->
<div class="col-sm">
<ol start="3">
<li>Zebras</li>
<li>Lions</li>
</ol>
</div>
Here is a visual of what I would like to accomplish via Jekyll.
Input (Markdown file):
---
layout: default
---
1. Zebras
2. Lions
3. Tigers
4. Gorillas
Ideally, this would then be parsed into:
1. Zebras    3. Tigers
2. Lions     4. Gorillas
I found two on-topic StackOverflow questions, but neither fits this use case. The first uses YAML front matter to build the list. However, since I often have links in my ordered list, I do not think this method would work, and it would be just as tedious as adding each list item by hand. The second gets closer, but I think it uses the post files, whereas I am using list items.
Can I have Jekyll take the number of ordered list items, split them in half (or, if the count is odd, make the left column the longer one), and then put the list items in their respective columns? Finally, the counter should carry over from the left column (either using <ol start="x"> or something else).
Here's an idea: Use one column and add style="column-count: 2" to that. Done!
More CSS column options: https://css-tricks.com/almanac/properties/c/columns/#article-header-id-0
You can try using card columns from Bootstrap. By default there are 3 columns, but you can change the default value.
Check this link: https://getbootstrap.com/docs/4.0/components/card/#card-columns
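The split the question asks for (left column gets the extra item, right column continues the numbering) comes down to a little arithmetic. Here is a minimal sketch of it in Python, used only to show the logic; in a Jekyll site the same calculation would have to be written in Liquid or a plugin. The function names are hypothetical, and the items are assumed to be already-rendered HTML fragments so that links survive.

```python
import math


def split_into_columns(items):
    """Split items into (left, right, right_start).

    The left column gets the extra item when the count is odd, and
    right_start is the number the right-hand <ol> should start at.
    """
    half = math.ceil(len(items) / 2)
    return items[:half], items[half:], half + 1


def render_column(items, start=1):
    """Render one Bootstrap column containing an ordered list."""
    lis = "\n".join(f"<li>{item}</li>" for item in items)
    return f'<div class="col-sm">\n<ol start="{start}">\n{lis}\n</ol>\n</div>'


left, right, right_start = split_into_columns(
    ["Zebras", "Lions", "Tigers", "Gorillas", "Pandas"])
print(render_column(left))
print(render_column(right, start=right_start))
```

With five items, the left column holds three and the right column starts at 4.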

How does this script generate a random image?

I found this script by accident: http://lorempixel.com/250/250/business/?a=7
But I have no idea how it works. There is nothing in the code. If I try to save the page, I only get the image save pop-up. Can someone give me a hint? Because this is absolutely fantastic!
In the code I found this line:
img ...(some crappy CSS styles)... src="http://lorempixel.com/250/250/business/?a=7"
So, how it works:
You choose a size and a category of pictures (the address suggests that you are navigating to the folders "250", "250" and then "business"; I guess the numbers stand for the size).
Then, in this folder, lies a script that returns an image. You are passing the parameter "7" into it; it could stand for a visitor ID, or it may just be a seed for the generator.
The script generates a random number and returns the image associated with that number (you can imagine there are 1000 JPEGs named 1.jpg, 2.jpg and so on). Alternatively, it asks some search engine for pictures that match those criteria (size, category) and returns one of them (again, "show me picture number x after searching for the business category and a certain size").
What can be confusing, though, is that even refreshing the page with the same parameter (in this case 7) loads different pictures. Perhaps it remembers recently shown pictures for a given IP address or something else that identifies you (e.g. a session ID stored in a cookie) and prevents the page from loading the same pictures again. Or the randomizing algorithm may be built in such a way that the same outcome is very improbable even with the same seed or parameter (imagine adding the number of milliseconds elapsed since last Monday: it changes constantly, right?).
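To make the difference between a seeded and an unseeded pick concrete, here is a small Python sketch. It is purely illustrative: nothing here is known about how lorempixel.com actually works, and the file names and function names are made up.

```python
import random

IMAGES = ["1.jpg", "2.jpg", "3.jpg", "4.jpg", "5.jpg"]  # hypothetical file names


def pick_seeded(seed):
    """Use the request parameter as a seed: same seed, same image."""
    return random.Random(seed).choice(IMAGES)


def pick_unseeded():
    """Ignore the request parameter: every call draws independently."""
    return random.choice(IMAGES)


print(pick_seeded(7) == pick_seeded(7))  # True: the pick is reproducible
print(pick_unseeded())                   # may differ on every call
```

If the server behaved like pick_seeded, refreshing with ?a=7 would always show the same picture; since it does not, the parameter is probably not used as a plain seed.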
You could have an array of image names, perhaps. Each time the page is opened, a random index from 0 up to (but not including) the size of the array is generated, and the page displays the image whose name is stored at that index.
Say you have an array with 3 names: pic1.png, pic2.jpg, pic3.gif.
var pics = ["pic1.png", "pic2.jpg", "pic3.gif"];
var pic = pics[Math.floor(pics.length * Math.random())]; // random index 0..2
document.getElementById("randomPic").src = pic; // assumes an <img id="randomPic"> on the page

Selecting text from multiple paragraphs using xpath

I have a situation where my end points and mid points can vary.
I always have:
<p style="margin-top: 0px;" >
and
<p class="contactAdvisor">
In between, I will have varying items including <b>, <i>, <strong>, <br> and headings 1, 2 or 3. I might also have one or more <p> in between the two fixed items.
What I'm trying to get is all of the text in between these two elements, no matter whether it is wrapped in headings, various stylings or nested paragraph elements.
I've messed around with contains and preceding/following-sibling, but my best attempt has been to write a separate expression based on preceding/following for each use case. Even that leaves me with some issues, because if there are multiple <p> inside and I'm trying to select all of them, I only get one.
Depending on the hierarchy, you can use either preceding:: or preceding-sibling::.
Try selecting something along the lines of:
//*[preceding::p[@style="margin-top: 0px;"] and not (preceding::p[@class="contactAdvisor"])]
This should exclude everything before the first p with the first condition and everything after the second with the second. Untested, so you may have to tweak the check a little.
//p[@style="margin-top: 0px;"]/following::*[following::p[@class="contactAdvisor"]]
//*[preceding::p[@style="margin-top: 0px;"] and following::p[@class="contactAdvisor"]]

WatiN: How to iterate over table rows when there is nothing to identify them

I have tried different solutions to get the table node, which I can identify as the next sibling of the text node in the existing DOM.
I used the following code, but NextSibling is always null.
var element = browser.Element(Find.ByText(t => t.Contains("Individual Notices")));
if (element != null)
{
    var table = element.NextSibling as Table;
}
I would appreciate it if anyone could show me how to iterate through the rows of the table next to the node "Individual Notices".
Thanks
You're having trouble as the Contains ends up not finding the element you want. Put Console.WriteLine("START" + element.Text+ "END"); in there right after the variable declaration/assignment, and I bet you'll see a whole lot of text output besides "Individual Notices".
If the DOM element you need contains ONLY the text "Individual Notices", simply remove the lambda call and use Find.ByText("Individual Notices"); then table will hold your table.
If this is not an option because the text isn't a known value, you might be able to search on a specific element type (e.g. Div) so that parent nodes aren't returned as the result of the lambda's Contains check.
Edit:
Sometimes searching for an individual element by text is problematic due to browser oddities. At times the text shown to the user doesn't exactly equal the text seen by the DOM, because whitespace has been added or removed. Basically, you might think you have "Individual Notices" but WatiN might see "Individual Notices " <- see the space at the end. When I can't find a particular element after the easy/obvious methods are exhausted, I just iterate through the elements in WatiN code, searching for what I think should find it and then flashing the elements found and/or writing them to the console. If it's not found, widen the search. Repeat as needed.
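The whitespace mismatch described above is easy to reproduce outside the browser. The following Python sketch is not WatiN code; it only shows why an exact comparison fails and how normalizing whitespace first makes it succeed. The sample string and the helper name normalize are made up.

```python
def normalize(text):
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return " ".join(text.split())


dom_text = "  Individual\n Notices "  # hypothetical text as reported by the DOM

print(dom_text == "Individual Notices")             # False
print(normalize(dom_text) == "Individual Notices")  # True
```

The same idea applies in C#: normalize or trim the element text inside the predicate before comparing it.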

How to find the first link on the page containing this text?

If I have two links:
<div class="abc">
<a id="def1" href="/definitely">Definitely 1</a>
<a id="def2" href="/definitely">Definitely 2</a>
</div>
And I want to identify the first (def1), I thought this would work:
var linkXPath = "//div[@class='abc']//a[contains(@href,'def')][1]";
But it doesn't seem to.
What am I doing wrong?
It is a FAQ why
//someName[1]
doesn't select the first element of //someName.
Looking at the definition of the // abbreviation, one would realize that in fact
//someName[1]
is equivalent to:
/descendant-or-self::node()/someName[1]
and this selects every someName element that is the first someName child of its parent node.
Thus, if there are two or more someName elements that are the first someName child of their parent, all of them are selected.
Solution:
Instead of
//someName[1]
use:
(//someName)[1]
So, in your particular case use:
(//div[@class='abc']//a[contains(@href,'def')])[1]
Apart from this, none of the above expressions would select any node, if in the actual XML document a default namespace was specified. Selecting nodes in a document with a default namespace is the biggest XPath FAQ. To find the solution just search for "default namespace" in this SO tag and anywhere on the Internet.
Your XPath expression selects the first a element (with the right href) of every div (that has the right class) that contains one. So if there were two divs that matched, each with multiple a elements that matched, you'd get a result set containing two elements: the first a in the first div, and the first a in the second div.
To select just the first element of the entire result set, use parentheses like so:
(//div[@class='abc']//a[contains(@href,'def')])[1]
Other than that, your expression works fine for me (tested here).
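The difference between the two forms can be observed with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree supports a limited XPath subset (no contains() and no parenthesised expressions), so the (...)[1] form is emulated by indexing the result list, and the second div in the sample document is an addition made up for the demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the question's markup plus a second matching div.
doc = ET.fromstring("""
<body>
  <div class="abc">
    <a id="def1" href="/definitely">Definitely 1</a>
    <a id="def2" href="/definitely">Definitely 2</a>
  </div>
  <div class="abc">
    <a id="def3" href="/definitely">Definitely 3</a>
  </div>
</body>
""")

# Like //div[@class='abc']/a[1]: every <a> that is the first <a> of its parent.
first_per_parent = [a.get("id") for a in doc.findall(".//div[@class='abc']/a[1]")]

# Like (//div[@class='abc']/a)[1]: the first <a> of the whole result set.
first_overall = doc.findall(".//div[@class='abc']/a")[0].get("id")

print(first_per_parent)  # ['def1', 'def3']
print(first_overall)     # def1
```

With only one matching div, as in the question's snippet, both forms return the same single element, which is why the problem only shows up on larger pages.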

Resources