Make Xpath choose parts of an attribute - xpath

I have XML documents with tables. Every table has an attribute hsdl-percent.
First of all, I'd like to know what exactly that is. I have never come across it, and Google didn't yield any useful results.
Now, this attribute contains the widths of the table-columns in percentages, e.g. hsdl-percent="23.5 36.7 39.7".
Is there any way that I could get XPath to use these values for the widths of the table-columns? So 23.5% width for the first column and so on...
The problem is that each table is different, many of them with rowspans and colspans. Since I'm using Apache FOP, which doesn't support table-layout="auto", my tables just have width="100%" with no column widths specified, so some columns are wider than they should be.
Thanks for your help and suggestions!

FYI: I've got it. I managed to merge FOP with Saxon9he and am thus able to use XPath 2.0 and XSLT 2.0. I solved it like this:
<xsl:for-each select="tokenize(@hsdl-percent,'\s+')">
<fo:table-column column-width="{concat(.,'%')}" />
</xsl:for-each>
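For readers who want to check the result outside a stylesheet, the same split can be sketched in Python. This is only an illustration, not part of the FOP/Saxon setup: the helper name column_elements is hypothetical, and it assumes the attribute value is a whitespace-separated list of numbers as in the example above.

```python
def column_elements(hsdl_percent):
    """Return one fo:table-column element string per width in the attribute value."""
    # str.split() with no argument splits on runs of whitespace, which
    # mirrors tokenize(@hsdl-percent, '\s+') for input without leading
    # or trailing whitespace.
    return [f'<fo:table-column column-width="{width}%" />'
            for width in hsdl_percent.split()]


columns = column_elements("23.5 36.7 39.7")
for line in columns:
    print(line)
```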

XPath 1.0 has rather limited string manipulation support, so splitting a string is rather awkward. Use substring($string, $start[, $length]), substring-before($string, $needle) and substring-after($string, $needle):
substring(...) will be fine if the values are of fixed length (e.g., no value such as 4.2, which is only three characters long):
substring(//@hsdl-percent, 1, 4)
substring(//@hsdl-percent, 6, 4)
substring(//@hsdl-percent, 11, 4)
If the length can vary, you need to split at the space characters:
substring-before(//@hsdl-percent, ' ')
substring-before(substring-after(//@hsdl-percent, ' '), ' ')
substring-after(substring-after(//@hsdl-percent, ' '), ' ')
If you've got support for XPath 2.0 (or newer), use tokenize($string, $needle):
tokenize(//@hsdl-percent, ' ') (: returns sequence of individual values :)
tokenize(//@hsdl-percent, ' ')[1]
tokenize(//@hsdl-percent, ' ')[2]
tokenize(//@hsdl-percent, ' ')[3]
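To see why the nested substring-before/substring-after calls pick out the three values, here is a rough Python emulation. The two helper functions are hypothetical stand-ins written for this sketch; they follow the XPath 1.0 rule that both functions return an empty string when the needle is absent.

```python
def substring_before(s, needle):
    # XPath returns '' when the needle does not occur; partition() reports
    # that case through an empty separator.
    head, sep, _ = s.partition(needle)
    return head if sep else ""


def substring_after(s, needle):
    _, sep, tail = s.partition(needle)
    return tail if sep else ""


value = "23.5 36.7 39.7"

# Same nesting as the XPath 1.0 expressions above.
first = substring_before(value, " ")
second = substring_before(substring_after(value, " "), " ")
third = substring_after(substring_after(value, " "), " ")

# Rough counterpart of tokenize(..., ' ') in XPath 2.0.
tokens = value.split(" ")

print(first, second, third, tokens)
```

The nesting only works for exactly three values; with tokenize (or split) the number of columns does not need to be known in advance.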

Related

Assessing from the end of a split array in Hive

I need to split a tag that looks something like "B1/AHU/_1/RoomTemp" or "B1/AHU/_1/109/Temp", so with a variable number of fields. I am interested in getting the last field, or sometimes the last but one. I was disappointed to find that, unlike in Python, negative indexes in Hive do not count from the right, so they cannot be used to select the last element of an array.
select tag,split(tag,'[/]')[ -1] from sensor
I was more surprised when this did not work either:
select tag,split(tag,'[/]')[ size(split(tag,'[/]'))-1 ] from sensor
Both times giving me an error along the lines of this:
FAILED: SemanticException 1:27 Non-constant expressions for array indexes not supported.
Error encountered near token '1'
So any ideas? I am kind of new to Hive. Regex's maybe? Or is there some syntactic sugar I am not aware of?
This question is getting a lot of views (over a thousand now), so I think it needs a proper answer. In the event I solved it with this:
select tag,reverse(split(reverse(tag),'[/]')[0]) from sensor
which is not actually stated in the other suggested answers - I got the idea from a suggestion in the comments.
This:
reverses the string (so "abcd/efgh" is now "hgfe/dcba")
splits it on "/" into an array (so we have "hgfe" and "dcba")
extracts the first element (which is "hgfe")
then finally re-reverses (giving us the desired "efgh")
Also note that the second-to-last element can be retrieved by substituting 1 for the 0, and so on for the others.
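The reverse, split, index, reverse sequence can be traced step by step in Python. This is only a sketch of the idea, not Hive code, and the function name from_end is made up for the illustration.

```python
def from_end(tag, index=0, delim="/"):
    """Return the field `index` positions from the end (0 = last field)."""
    reversed_tag = tag[::-1]                  # "abcd/efgh" -> "hgfe/dcba"
    fields = reversed_tag.split(delim)        # ["hgfe", "dcba"]
    return fields[index][::-1]                # "hgfe" -> "efgh"


last = from_end("B1/AHU/_1/RoomTemp")
second_last = from_end("B1/AHU/_1/109/Temp", 1)
print(last, second_last)
```

Because the index is a constant after the reversal, the Hive restriction on non-constant array indexes quoted in the error above no longer applies.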
There is a great library of Hive UDFs here. One of them is LastIndexUDF(). It's pretty self-explanatory: it retrieves the last element of an array. There are instructions to build and use the jar on the main page. Hope this helps.
This seems to work for me; it returns the last element of the SPLIT array:
SELECT SPLIT(INPUT__FILE__NAME,'/')[SIZE(SPLIT(INPUT__FILE__NAME,'/')) -1 ] from test_table limit 10;
After reading the LanguageManual UDF page for a while, I found that the function substring_index meets your requirement exactly and doesn't need any additional calculations.
The manual says:
substring_index(string A, string delim, int count) returns the substring from string A before count occurrences of the delimiter delim (as of Hive 1.3.0). If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. Substring_index performs a case-sensitive match when searching for delim. Example: substring_index('www.apache.org', '.', 2) = 'www.apache'.
Use cases:
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
--www.mysql
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -1);
--com
See here for more information.
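The counting rules quoted from the manual can be modelled in a few lines of Python. This is a sketch of the documented behaviour, not Hive's implementation; it assumes a non-empty delimiter, and returning an empty string for a count of 0 is an assumption of this sketch.

```python
def substring_index(s, delim, count):
    """Model of substring_index(A, delim, count) as described in the manual."""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter from the left.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter from the right.
        return delim.join(parts[count:])
    return ""  # assumption: a count of 0 yields an empty string


left = substring_index("www.mysql.com", ".", 2)
right = substring_index("www.mysql.com", ".", -1)
last_field = substring_index("B1/AHU/_1/RoomTemp", "/", -1)
print(left, right, last_field)
```

With a count of -1 this returns the last field of the tag directly, which is what the original question asked for.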

substr/instr calculation in Discoverer

I am creating a calculation in Discoverer 10g and only need to grab information between two points (".") An example of the string looks like this:
30068496.CR Order.ORDER ENTRY(1.1).Y.3
I only need to grab the "Y" between the last two periods.
I have come close with the substr and instr functions, but have not yet been able to isolate only what I am trying to get.
The closest I've been is using this:
SUBSTR(MSCG_CS_Pegging_Details.End_Demand_Item_Order_Number,
INSTR(MSCG_CS_Pegging_Details.End_Demand_Item_Order_Number,'.',1,4)+1,
INSTR(MSCG_CS_Pegging_Details.End_Demand_Item_Order_Number,'.',1,1)-1-
INSTR(MSCG_CS_Pegging_Details.End_Demand_Item_Order_Number,'.',1,1))
Any advice?
I do not fully understand your requirements. You can search from the end of the string - find the first dot from the end (or the last dot) and subtract 1 to get the character before that last dot, which is 'Y'. This is the easiest and probably the safest way.
SELECT SUBSTR(str, INSTR(str, '.', -1, 1)-1, 1) search_str
FROM
(
SELECT '30068496.CR Order.ORDER ENTRY(1.1).Y.3' str FROM dual
);
You can also find positions of the two last dots, for example, and get the value between them.
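The second approach, locating the last two dots and taking what lies between them, can be sketched in Python as follows. It illustrates the position arithmetic only and is not Discoverer or Oracle syntax; the function name between_last_two is hypothetical.

```python
def between_last_two(s, delim="."):
    """Return the text between the last two occurrences of delim, or None."""
    last = s.rfind(delim)                 # position of the final delimiter
    if last == -1:
        return None
    prev = s.rfind(delim, 0, last)        # delimiter just before the final one
    if prev == -1:
        return None
    return s[prev + len(delim):last]


value = between_last_two("30068496.CR Order.ORDER ENTRY(1.1).Y.3")
print(value)
```

Unlike taking a single character before the last dot, this also works when the value between the dots is longer than one character.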

Combining XPath selector in selenium IDE

I'm looking for a way to combine 2 XPath selectors into 1 to use in Selenium IDE, so I can check if an element with a certain ID has a certain class.
These 2 selectors do work but aren't narrowing enough to do an assertElementPresent on.
xpath= .//*[contains (@class,'ui-tabs-hide')]
xpath= .//*[@id='${newTableID}']
I've unsuccessfully tried the following XPath:
xpath= .//*[contains (@class,'ui-tabs-hide')]/*[@id='${newTableID}']
Can anyone help me out on this one please?
Thanks,
J.
Okay... Xmas is still making my head a bit fuzzy...
xpath=.//*[@id='${newTableID}' and contains (@class,'ui-tabs-hide')]
was the way to go
Use:
xpath=.//*[@id='${newTableID}'
and contains(concat(' ', @class, ' '), ' ui-tabs-hide ')]
Note how contains() is written: padding both the class attribute and the wanted class name with spaces guarantees that elements whose class names merely have the wanted name as a prefix or suffix will not be selected.
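The difference between the plain contains() test and the space-padded one can be checked with ordinary string operations. The Python below only mimics the two XPath predicates; the function names and the class value ui-tabs-hide-all are invented for the illustration.

```python
def has_class_naive(class_attr, name):
    # Mirrors contains(@class, 'name'): plain substring test.
    return name in class_attr


def has_class_token(class_attr, name):
    # Mirrors contains(concat(' ', @class, ' '), ' name '): whole-token test.
    return f" {name} " in f" {class_attr} "


lookalike = "ui-tabs-panel ui-tabs-hide-all"   # hypothetical class list
exact = "ui-tabs-panel ui-tabs-hide"

print(has_class_naive(lookalike, "ui-tabs-hide"))   # substring matches
print(has_class_token(lookalike, "ui-tabs-hide"))   # token does not
print(has_class_token(exact, "ui-tabs-hide"))       # token matches
```

Like the XPath expression, this sketch assumes class names are separated by single spaces.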

Negative decimal value formatting in Altova Style Vision

I have a problem with value formatting in Altova StyleVision. The Altova forums seem to be dead. Maybe someone has encountered a similar problem.
I have created an Auto Calculation inside an XBRL table generated by StyleVision. It contains the XPath expression " sum( xbrli:xbrl/n1:Wages ) ". This expression gives me a negative value. I want to format it so that it's surrounded by parentheses instead of having a leading minus.
I have tried using prefixes and suffixes in "value formatting", like this (###,##0.##) or this [###,##0.##] . But I still get a minus instead of parentheses. Is there a way to get around this? None of those prefixes seem to work for me at all.
http://manual.altova.com/Stylevision/stylevisionbasic/index.html?svpres_inputformatting.htm
OK, it seems that the problem is solved.
I created a ch.xsl file with the following contents:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:decimal-format name='ch' grouping-separator=" " decimal-separator=","/>
</xsl:stylesheet>
In Altova StyleVision, under Design Overview -> Add new XSLT file, choose ch.xsl.
Afterwards, I used the following expression in the Auto Calculation XPath:
format-number(sum( xbrli:xbrl/n1:Wages ),'### ##0,##;(### ##0,##)','ch')
Maybe there is a better way to do this, but it worked for me. I hope it helps someone.

XPath to return string concatenation of qualifying child node values

Can anyone please suggest an XPath expression format that returns a string value containing the concatenated values of certain qualifying child nodes of an element, but ignoring others:
<div>
This text node should be returned.
<em>And the value of this element.</em>
And this.
<p>But this paragraph element should be ignored.</p>
</div>
The returned value should be a single string:
This text node should be returned. And the value of this element. And this.
Is this possible in a single XPath expression?
Thanks.
In XPath 2.0 :
string-join(/*/node()[not(self::p)], '')
In XPath 1.0:
You can use
/div//text()[not(parent::p)]
to capture the wanted text nodes. The concatenation itself cannot be done in XPath 1.0, I recommend doing it in the host application.
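As one possible shape for the host-application step, here is a sketch using Python's standard xml.etree.ElementTree. The helper text_without is hypothetical, and the sketch assumes the sample document from the question. Note that it skips the whole p subtree, which is slightly stricter than the not(parent::p) test above, and that the final whitespace normalisation is a choice made here to match the single-line result requested in the question.

```python
import xml.etree.ElementTree as ET

DOC = """<div>
This text node should be returned.
<em>And the value of this element.</em>
And this.
<p>But this paragraph element should be ignored.</p>
</div>"""


def text_without(elem, skip_tag):
    """Concatenate all text under elem, skipping subtrees named skip_tag."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != skip_tag:
            parts.append(text_without(child, skip_tag))
        # The tail is text that follows the child but belongs to the parent,
        # so it is kept even when the child itself is skipped.
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(DOC)
# Collapse runs of whitespace (including newlines) into single spaces.
result = " ".join(text_without(root, "p").split())
print(result)
```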
/div//text()
The double slash selects text nodes at any depth, regardless of intermediate elements.
This looks like it works:
Using as context /div/:
text() | em/text()
Or without the use of context:
/div/text() | /div/em/text()
If you want to concat the first two strings, use this:
concat(/div/text(), /div/em/text())
If you want all children except p, you can try the following...
string-join(//*[name() != 'p']/text(), "")
which returns...
This text node should be returned.
And the value of this element.
And this.
I know this comes a bit late, but I figure my answer could still be relevant. I recently ran into a similar problem, and because I use Scrapy with Python 3.6, which does not support XPath 2.0, I could not use the string-join function suggested in several online answers.
I ended up finding a simple workaround (as shown below) which I did not see in any of the stackoverflow answers, that's why I'm sharing it.
temp_selector_list = response.xpath('/div')
string_result = [''.join(x.xpath(".//text()").extract()) for x in temp_selector_list]
Hope this helps!
You could use a for-each loop as well and assemble the values in a variable like this
<xsl:variable name="newstring">
<xsl:for-each select="/div//text()">
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:variable>
