XPath 2.0: reference earlier context in another part of the XPath expression

In an XPath expression I would like to focus on certain elements and analyse them:
...
<field>aaa</field>
...
<field>bbb</field>
...
<field>aaa (1)</field>
...
<field>aaa (2)</field>
...
<field>ccc</field>
...
<field>ddd (7)</field>
I want to find the elements whose text content (apart from a possible enumeration) is unique. In the above example that would be bbb, ccc and ddd.
The following XPath gives me the distinct base values of the enumerated fields:
distinct-values(//field[matches(normalize-space(.), ' \([0-9]\)$')]/substring-before(., ' ('))
Now I would like to extend that and run another XPath on each of the distinct values: count how many fields start with it, and retrieve the values whose count is greater than 1.
A matching field either equals that value or starts with it followed by " (". The problem is that in the second part of the XPath I would have to refer to the context of that part itself and to the outer context at the same time.
In the following XPath, instead of using "." as the context, I use c_outer and c_inner:
distinct-values(//field[matches(normalize-space(.), ' \([0-9]\)$')]/substring-before(., ' ('))[count(//field[(c_inner = c_outer) or starts-with(c_inner, concat(c_outer, ' ('))]) > 1]
I can't use "." for both for obvious reasons. But how could I reference a particular, or the current distinct value from the outer expression within the inner expression?
Would that even be possible?

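XQuery can do it, e.g.:
for $s in distinct-values(
    //field[matches(normalize-space(.), ' \([0-9]\)$')]/substring-before(., ' ('))
where count(//field[(. = $s) or starts-with(., concat($s, ' ('))]) > 1
return $s
Plain XPath 2.0 also has a for expression that binds a variable the same way. It has no where clause, so the test moves into a predicate on $s: for $s in distinct-values(...) return $s[count(...) > 1].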
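To make the role of the bound variable concrete, here is a minimal Python sketch of the same two-level logic. The sample document and the names `SAMPLE`, `ENUMERATED` and `repeated_bases` are assumptions made up for illustration; the loop variable `base` plays the part of `$s`, and the inner generator plays the part of the inner `.` context.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document mirroring the <field> values in the question.
SAMPLE = """<root>
  <field>aaa</field>
  <field>bbb</field>
  <field>aaa (1)</field>
  <field>aaa (2)</field>
  <field>ccc</field>
  <field>ddd (7)</field>
</root>"""

# Same shape as the XPath regex: a base value, a space, one digit in parentheses.
ENUMERATED = re.compile(r"^(.*) \([0-9]\)$")


def repeated_bases(xml_text):
    """Return base values of enumerated fields that match more than one field."""
    # Whitespace-normalised text of every <field>, in document order.
    fields = [" ".join((f.text or "").split())
              for f in ET.fromstring(xml_text).iter("field")]

    # Outer sequence: distinct base values of the enumerated fields.
    bases = []
    for text in fields:
        m = ENUMERATED.match(text)
        if m and m.group(1) not in bases:
            bases.append(m.group(1))

    # Inner count: 'base' is the outer value, 't' is the inner context item.
    return [
        base for base in bases
        if sum(1 for t in fields if t == base or t.startswith(base + " (")) > 1
    ]


print(repeated_bases(SAMPLE))
```

With the sample above, only `aaa` qualifies: it matches three fields, while `ddd` matches one.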

Related

Extract last word using Xpath 1.0

I need to select only the last word using XPath 1.0. I have something like this:
<Example>
<Ctry> Portugal PT </Ctry>
</Example>
I want to select only the word PT, but its position is not fixed, e.g. <Ctry> Portugal - Lisbon - PT </Ctry>. The word I want to extract is always the last one.
I've already tried:
//*[name()='Example'][substring(., string-length(.) - string-length('PT')+1) = 'PT']/text()
but it always extracts the whole string.
Can anyone help me please?
You're selecting a node using the substring as a predicate to filter out other nodes. If you want the substring to be your output, it shouldn't go inside brackets.
substring(//*[name()='Example'], string-length(//*[name()='Example']) - string-length('PT')+1)
Note that /text() can be omitted when working with string functions.
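As a rough illustration of what that expression computes, here is a small Python sketch. The function names `last_chars` and `last_word` are made up for this example. It also shows one caveat, assuming the element text really has the leading and trailing spaces shown in the question: a fixed-length tail then picks up the trailing space, which is why wrapping the value in normalize-space() first may be needed in XPath.

```python
def last_chars(value, suffix_length):
    """Mirror XPath substring(s, string-length(s) - n + 1): the last n characters."""
    start = len(value) - suffix_length
    # XPath substring() starts at the beginning when the start position is below 1.
    return value[max(start, 0):]


def last_word(value):
    """Whitespace-tolerant alternative: split on whitespace and take the last token."""
    words = value.split()
    return words[-1] if words else ""


print(repr(last_chars("Portugal PT", 2)))            # no surrounding whitespace
print(repr(last_chars(" Portugal PT ", 2)))          # trailing space shifts the window
print(repr(last_word(" Portugal - Lisbon - PT ")))   # independent of word length
```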

SSRS [Sort Alphanumerically]: How to sort a specific column in a report to be [A-Z] & [ASC]

I have a field set that contains bill numbers and I want to sort them first alphabetically then numerically.
For instance I have a column "Bills" that has the following sequence of bills.
- HB200
- SB60
- HB67
Desired outcome is below
- HB67
- HB200
- SB60
How can I use sorting in SSRS Group Properties to have the field sort from [A-Z] & [1 - 1000....]?
This should be doable by adding just 2 separate Sort options in the group properties. To test this, I created a simple dataset using your examples.
CREATE TABLE #temp (Bills VARCHAR(20))
INSERT INTO #temp(Bills)
VALUES ('HB200'),('SB60'),('HB67')
SELECT * FROM #temp
Next, I added a matrix with a single row and a single column for my Bills field with a row group.
In the group properties, my sorting options are set up like this:
So to get this working, my theory was that you needed to isolate the numeric characters from the non-numeric characters and use each in their own sort option. To do this, I used the relatively unknown Regex Replace function in SSRS.
This expression gets only the non-numeric characters and is used in the top sorting option:
=System.Text.RegularExpressions.Regex.Replace(Fields!Bills.Value, "[0-9]", "")
While this expression isolates the numeric characters:
=System.Text.RegularExpressions.Regex.Replace(Fields!Bills.Value, "[^0-9]", "")
With these sorting options, my results match what you expect to happen.
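The two-key idea can be sketched outside SSRS as well. The Python below is only an illustration (the name `bill_sort_key` is hypothetical): it strips digits for the first key and non-digits for the second, as the two expressions above do. One assumption is added here: the digit string is converted to an integer, because compared as text "200" would sort before "67".

```python
import re


def bill_sort_key(bill):
    """Build a (letters, number) key from a bill code such as 'HB200'."""
    letters = re.sub(r"[0-9]", "", bill)    # same pattern as the first expression
    digits = re.sub(r"[^0-9]", "", bill)    # same pattern as the second expression
    # Assumption: compare the numeric part as a number, not as text.
    return (letters, int(digits) if digits else 0)


bills = ["HB200", "SB60", "HB67"]
print(sorted(bills, key=bill_sort_key))
```

For the three sample bills this gives HB67, HB200, SB60, which is the order asked for in the question.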
In the sort expression for your tablix/table which is displaying the dataset, set the sort to something like:
=IIF(Fields!Bills.Value = "HB67", 1, IIF(Fields!Bills.Value = "HB200", 2, IIF(Fields!Bills.Value = "SB60", 3, 4)))
Then when you sort A-Z, it'll sort by the number given to it in the sort expression.
This is only a practical solution if you have a small number of values; with hundreds of possible conditions the expression becomes tedious to write and maintain.

string.IndexOf exact match

I have the following:
string text = "Select [id] AS [FROMId] FROM [TASK] ORDER BY id";
and I want to use text.IndexOf("FROM") in order to find where the FROM starts.
I want to find the position of FROM and not the position of FROMId.
LastIndexOf or FirstIndexOf are not correct answers, because the text could be anything like
string text = @"Select [id] AS [FROMId],
newId as [newFROMId] FROM [TASK] ORDER BY [FROMId]";
I need the indexof to do exact matching.
Any ideas?
Since FROM is an SQL reserved word that will generally have spaces on either side, you could search for " FROM " instead. That gives you the position of the space before the F, so add one to get the position of the F itself. Note that IndexOf returns -1 when there is no match, so check for that before adding one:
int index = text.IndexOf(" FROM ") + 1;
This may not necessarily take care of all edge cases(a) but, to do that properly, you may have to implement an SQL parser to ensure you can correctly locate the real from keyword and distinguish it from other possibilities.
(a) Such as things like:
select [a]FROM[tble] ...
select 'got data from unit #' | unit from tbl ...
and so on.
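A different technique from the space-padded search above, shown here only as a sketch, is a regular expression with word boundaries. The Python below assumes a case-sensitive match on the keyword, and the name `index_of_keyword` is made up for illustration. It copes with `[a]FROM[tble]` because `]` and `[` are not word characters, but it still cannot tell a keyword from the same word inside a string literal or a bracketed identifier, so the caveat about needing a real parser stands.

```python
import re


def index_of_keyword(text, keyword="FROM"):
    """Return the index of the first whole-word occurrence of keyword, or -1."""
    # \b matches only where a word character meets a non-word character,
    # so the FROM inside FROMId or newFROMId is skipped.
    match = re.search(r"\b" + re.escape(keyword) + r"\b", text)
    return match.start() if match else -1


text = "Select [id] AS [FROMId] FROM [TASK] ORDER BY id"
print(index_of_keyword(text))
print(index_of_keyword("select [a]FROM[tble]"))
print(index_of_keyword("no keyword here"))
```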

XPath 1.0 using arithmetic operators

Let's say we have this:
<a href="www.something/page/1">something</a>
Now is there a way to return the @href as "www.something/page/2"? Basically, to return the @href value, but with the substring-after(.,"page/") incremented by 1. I've been trying something like
//a/@href[number(substring-after(.,"page/"))+1]
but it doesn't work, and I don't think I can use
//a/@href/number(substring-after(.,"page/"))+1
It's not really a paging thing where I could use pagination; I just picked that as an example. The point is just to find a way to increment a value in XPath 1.0. Any help?
What you can do is
concat(
translate(//a/@href, '0123456789', ''),
translate(//a/@href, translate(//a/@href, '0123456789', ''), '') + 1
)
That concatenates the 'href' attribute with all digits removed and the sum of 1 and the 'href' with everything but digits removed.
That might suffice if all digits in your URLs occur at the end of the URL. But generally XPath 1.0 is good at selecting nodes in your input and bad at constructing new values based on parts of node values.
There is a simpler way to achieve this: take the substring after 'page/', add 1, and then put it all back together.
This XPath assumes the current node is the @href attribute:
concat(substring-before(.,'page/'),
'page/',
substring-after(.,'page/')+1
)
Your order of operations is a little, well, out of order. Use something like this:
substring-after(//a/@href, 'page/') + 1
Note that it is not necessary to explicitly convert the string value to a number. From the spec:
The numeric operators convert their operands to numbers as if by calling the number function.
Putting it all together:
concat(
substring-before(//a/@href, 'page/'),
'page/',
substring-after(//a/@href, 'page/') + 1)
Result:
www.something/page/2
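The split, increment, rejoin steps can be sketched in Python as follows. This is only an illustration of the logic, not XPath; the name `increment_page` and the default marker argument are assumptions made for the example.

```python
def increment_page(href, marker="page/"):
    """Split href at the marker, add 1 to the trailing number, and rejoin."""
    # partition() returns (before, marker, after), like substring-before/-after.
    before, found, after = href.partition(marker)
    if not found:
        return href  # marker absent: leave the value unchanged
    # int() raises ValueError on non-numeric text, where XPath would yield NaN.
    return f"{before}{marker}{int(after) + 1}"


print(increment_page("www.something/page/1"))
```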

What's the XPath equivalent to an SQL IN query?

I would like to know what the XPath equivalent to an SQL IN query is. Basically, in SQL I can do this:
select * from tbl1 where Id in (1,2,3,4)
so I want something similar in XPath/XSL, i.e.
//*[@id= IN('51417','1121','111')]
Please advise.
In XPath 2.0, the = operator is a general comparison: it is true if any item on the left equals any item on the right, so against a sequence it works like IN.
I.e. you can use
//*[@id = ('51417','1121','111')]
A solution that also works in XPath 1.0 is to write out the options as separate conditions:
//*[(@id = '51417') or (@id = '1121') or (@id = '111')]
Another, slightly less verbose solution, though it looks a bit like a hack, is to use the contains function:
//*[contains('-51417-1121-111-', concat('-', @id, '-'))]
Literally, this checks whether the value of the id attribute (preceded and followed by a delimiter character) is a substring of -51417-1121-111-. Note that I am using a hyphen (-) as the delimiter between the allowed values; you can replace it with any character that will not appear in the id attribute.
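To see why the delimiters matter, here is a short Python sketch of both approaches. The sample XML and the names `SAMPLE`, `WANTED`, `select_by_id` and `delimited_contains` are made up for illustration. The first function is the set-membership test that the XPath 2.0 form expresses; the second reproduces the delimiter trick.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; only the id attribute matters here.
SAMPLE = '<root><a id="51417"/><b id="999"/><c id="111"/><d id="11"/></root>'
WANTED = {"51417", "1121", "111"}


def select_by_id(xml_text, wanted):
    """Keep every element whose id attribute is one of the wanted values."""
    return [el for el in ET.fromstring(xml_text).iter() if el.get("id") in wanted]


def delimited_contains(value, allowed, delimiter="-"):
    """Mirror contains('-51417-1121-111-', concat('-', @id, '-'))."""
    haystack = delimiter + delimiter.join(allowed) + delimiter
    return delimiter + value + delimiter in haystack


print([el.tag for el in select_by_id(SAMPLE, WANTED)])
print(delimited_contains("111", ["51417", "1121", "111"]))
# Without the delimiters, "11" would wrongly match as a substring of "111".
print(delimited_contains("11", ["51417", "1121", "111"]))
```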
