I have the following html structure:
<p><b> Some bold text which starts with xy </b>
<p> text
<p> text
<p><b> Next bold text </b>
<p> text
<p> text
I need to construct an XPath which selects all text after the bold text which starts with xy, but ONLY until the next bold text, which does not start with xy. My attempt so far:
"//p/*[starts-with(text(),'xy')]/following::text()"
Yet this selects all text, including the text after the next bold text which does not start with xy. Any suggestions?
I have found a solution which seems to work:
"//p/b[starts-with(.,'xy')]/following::p[count(preceding::b) = 1]"
So the trick here is the counter. Requiring count(preceding::b) = 1 stops the selection at the first b after the b which starts with xy; the trade-off is that the p containing that next b is still selected, so its bold text is included too. It also assumes that the b which starts with xy is the first b in the document. This can certainly be improved, but it is OK for my purposes for now.
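For comparison, here is a minimal sketch of the same selection done outside XPath, with Python's standard html.parser. It is a different technique from the expression above, not a translation of it: it starts collecting at the bold text that begins with the prefix and stops at the next bold text, so that next bold text is not included. The names SectionCollector and text_after_heading are made up for this illustration, and the sample markup assumes the bold elements are closed with </b> and that the heading text really begins with xy.

```python
from html.parser import HTMLParser


class SectionCollector(HTMLParser):
    """Collect <p> text that follows a <b> heading starting with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.in_bold = False     # currently inside a <b> element
        self.bold_text = []      # text pieces of the current <b>
        self.collecting = False  # True after the matching heading was seen
        self.paragraphs = []     # collected paragraph texts
        self.current = None      # text pieces of the paragraph being read

    def _flush(self):
        # Store the paragraph read so far, if it contains any text.
        if self.current is not None:
            text = "".join(self.current).strip()
            if text:
                self.paragraphs.append(text)
        self.current = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()
            if self.collecting:
                self.current = []
        elif tag == "b":
            self.in_bold = True
            self.bold_text = []
            self.current = None  # a bold paragraph is a heading, not body text

    def handle_endtag(self, tag):
        if tag == "b":
            self.in_bold = False
            heading = "".join(self.bold_text).strip()
            # Start at the matching heading, stop at any other heading.
            self.collecting = heading.startswith(self.prefix)
        elif tag == "p":
            self._flush()

    def handle_data(self, data):
        if self.in_bold:
            self.bold_text.append(data)
        elif self.current is not None:
            self.current.append(data)

    def close(self):
        super().close()
        self._flush()


def text_after_heading(html, prefix):
    parser = SectionCollector(prefix)
    parser.feed(html)
    parser.close()
    return parser.paragraphs


sample = (
    "<p><b>xy heading</b>\n<p>first\n<p>second\n"
    "<p><b>Next heading</b>\n<p>third\n"
)
print(text_after_heading(sample, "xy"))
```

Unlike the counter-based XPath, this does not depend on how many b elements come earlier in the document, at the cost of leaving XPath altogether.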
I have an AsciiDoc page which has a number of images. I am converting this into HTML via Antora.
On my AsciiDoc page, some of the images have a caption and some do not.
For the images with a caption, the first one is named "Figure 1. Some interesting caption", then the second one is named "Figure 2. Some fascinating caption" and so on. In fact, the "Figure 1", "Figure 2" text is added automatically by Antora. In AsciiDoc itself, the markup is as follows:
.Some interesting caption;
image::images/image1.png
.Some fascinating caption;
image::images/image2.png
However, now I have a third image which has no caption to display. I would like this image to simply read "Figure 3". However, I do not know how to do this. The only thing I could come up with is to put some character after the "." symbol just above it (I chose a semi-colon), as follows:
.;
image::images/image3.png
This produces "Figure 3;" once converted into html.
It's better than nothing, but I would like to be able to use, for example, a whitespace character, instead of the semi-colon, so that I could simply produce text that reads "Figure 3 " (with an invisible whitespace character that nobody can see). Unfortunately, if I try to do that, the whitespace is ignored and I just see the '.' character in the generated html.
You can use an attribute for the non-breaking space: {nbsp}
For example:
.Some interesting caption;
image::images/image1.png[]
.Some fascinating caption;
image::images/image2.png[]
.{nbsp}
image::images/image3.png[]
Note that I added square brackets to each image macro invocation, because without them those lines are just text. Also, there doesn't need to be a blank line between the caption and its associated image.
When I paste an SVG image to Word, the Y labels get disoriented.
Here is how it looks in browser:
Here is how it looks pasted into a Word document:
Edit 1: thank you for the answer.
Now the picture looks like this, still misaligned:
Part of the reason you are having this issue is that there are two text elements,
one for 10 and one for ² etc.
If you know how to navigate the code, you would:
find each instance of 10 and replace it with (copy & paste):
10⁴, 10², 10⁰, 10⁻²
And while you're at each 10, find the
4, 2, 0, -2
and just leave an empty element there.
example of what you have:
<text>10</text>
<text>2</text>
example of what you want:
<text>10²</text>
<text></text>
note the second element might be
<tspan></tspan>
If this doesn't work,
you could try putting in the Alt codes for superscript if using Windows.
Otherwise you could try:
write the superscript character in Word
copy that superscript character into the SVG code
save the SVG
import the SVG back into Word
if it works, do the same for the rest of the characters
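If there are many labels, the manual merge of the 10 and its exponent could also be scripted. This is only a sketch in Python with the standard xml.etree.ElementTree module, and it rests on assumptions I cannot check against your file: that each base and its exponent are adjacent sibling elements (text or tspan), that the base reads exactly 10, and that the exponent contains only digits and a sign. The function name merge_exponents is made up for this example.

```python
import xml.etree.ElementTree as ET

# Keep the SVG namespace as the default one when serializing.
ET.register_namespace("", "http://www.w3.org/2000/svg")

# Map plain digits and signs to their Unicode superscript forms.
SUPERSCRIPTS = str.maketrans("0123456789-+", "⁰¹²³⁴⁵⁶⁷⁸⁹⁻⁺")


def merge_exponents(svg_source):
    """Fold '<text>10</text><text>2</text>' pairs into '<text>10²</text><text/>'."""
    root = ET.fromstring(svg_source)
    for parent in root.iter():
        children = list(parent)
        # Look at every pair of adjacent sibling elements.
        for base, exponent in zip(children, children[1:]):
            base_text = (base.text or "").strip()
            exp_text = (exponent.text or "").strip()
            is_exponent = exp_text != "" and set(exp_text) <= set("0123456789-+")
            if base_text == "10" and is_exponent:
                base.text = "10" + exp_text.translate(SUPERSCRIPTS)
                exponent.text = ""  # leave an empty element behind
    return ET.tostring(root, encoding="unicode")


svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    "<text>10</text><text>-2</text>"
    "</svg>"
)
print(merge_exponents(svg))
```

Check the result in the browser before pasting it into Word, since the emptied elements keep their original positions and the merged label may need a small shift.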
I have the following text:
$3.00 x 2 = $6.00
When I apply direction: rtl; to the body of the page, this text displays as
x 2 = $6.00 $3.00
Can anyone explain why it's displaying this way?
A full answer to your question would have to explain how the Unicode Bidirectional Algorithm works, and it's immensely complicated.
From my limited understanding, the algorithm has detected that "x 2 = $6.00" and "$3.00" are two separate "runs" of text that should be displayed in left-to-right order. As the whole block is right-to-left, you see the two runs in RTL order.
It's not clear if your question is trying to solve a problem, or if you're just curious. However, if you need to display your equation fully LTR in the middle of some other RTL text, you can use Unicode control characters.
e.g. In this RTL block, the text between the two markers (which are invisible control characters) displays as a continuous run of LTR text.
<body dir="RTL">
$3.00 x 2 = $6.00
</body>
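Because the control characters are invisible, here is a small sketch that makes them explicit. It assumes the markers are U+202A (LEFT-TO-RIGHT EMBEDDING) and U+202C (POP DIRECTIONAL FORMATTING); other pairs exist, and which one fits best depends on the surrounding text. The helper name embed_ltr is made up for this example, and Python is used only to show the code points.

```python
import unicodedata

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING: opens an embedded LTR run
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING: closes that run


def embed_ltr(text):
    """Wrap text so a bidi-aware renderer treats it as one LTR run."""
    return f"{LRE}{text}{PDF}"


equation = embed_ltr("$3.00 x 2 = $6.00")

# The markers do not show up when printed, so list their names instead.
for ch in (equation[0], equation[-1]):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

In HTML the same two characters can be written as the numeric references &#x202A; and &#x202C; around the equation.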
Simpler in most cases (if you can) is to isolate the LTR text with HTML elements, e.g.:
<body dir="RTL">
<span dir="LTR">$3.00 x 2 = $6.00</span>
</body>
As I can't read Arabic or Hebrew I can't tell you how your example text should appear when embedded into RTL script. However, you do have control over the rendering.
I want to extract text from a PDF while identifying bold and italics. For example, bold text needs to be extracted like this: <b>TEST</b>, and italics must be enclosed like <i> test </i>
Currently I am using texttopdf.exe to extract the text. The accuracy is good, but it is not able to identify bold or italics.
Does anyone have another idea, or does the same pdftoexe have this feature?
Thanks in Advance
Hi I have a textbox containing some text. I am looking to replicate the red spelling mistake squiggle type behaviour.
Using WinAPI I can
draw the squiggle between 2 points.
find out the height and width of the word to be "squiggled".
What is the API call (or perhaps the methodology, if it is more than a single API call) to find the position of that word in the text box, so that I can position the squiggle underneath it?
Also, what are the messages I need to trap to ensure that the squiggle is redrawn. I'm currently only using WM_PAINT, which obviously isn't good enough.
EDIT (3 Sept 2012):
FYI, Here's where I got to so far. Needs a lot of refining but shows basic principles
https://gist.github.com/3607272
Many thx
S
What might work is using an auto-sized label. Make sure the fonts in the label and textbox are identical.
Detect the number of rows that are before the sentence containing the misspelled word.
Fill the label caption with the number of line feeds (vbCrLf) you got from step 1.
Append the words from the misspelled line (up to the misspelled word) to the label's caption.
The label's size should now match the position where the misspelled word begins.
Example text:
This is my first line.
And my second line.
And over here i have my mispeled word.
Label caption output should be (ignore the dots, they are empty lines):
.
.
And over here i have my
The label's height and width should match the position in the textbox, unless you have scrollbars. If the textbox has borders, then you should add a fixed value to the height and width to get an exact match.
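The caption-building steps can be sketched as follows. This is Python rather than VB, purely to show the string handling; label_caption is a made-up name, and the sketch assumes the textbox does not word-wrap (every row ends in a hard line break) and that the first occurrence of the word is the one you want.

```python
def label_caption(text, misspelled, newline="\r\n"):
    """Build the caption whose end marks where `misspelled` starts."""
    for row, line in enumerate(text.splitlines()):
        column = line.find(misspelled)
        if column != -1:
            # One line break per preceding row, then the text before the word.
            return newline * row + line[:column]
    return None  # the word does not occur in the text


text = (
    "This is my first line.\r\n"
    "And my second line.\r\n"
    "And over here i have my mispeled word."
)
caption = label_caption(text, "mispeled")
print(repr(caption))
```

Assigning this caption to the auto-sized label and reading back its height and width gives the offsets described above.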