XPath conditional match possible?

Basically, I'm trying to use XPath to find a match "given a condition", with the following XML:
<stuff1>
<stuff2>abc</stuff2>
</stuff1>
<stuff3>
<stuff4>abc</stuff4>
<stuff5>true</stuff5>
</stuff3>
<stuff3>
<stuff4>abc</stuff4>
<stuff5>false</stuff5>
<stuff6>extra stuff</stuff6>
</stuff3>
What I'd like to do is select the stuff1 elements that match stuff3 elements based on stuff2 == stuff4, but only where the matching stuff3 has a stuff5 value of false, not true.
I know //stuff1[stuff2/text()=../stuff3/stuff4/text()] will select the stuff1 elements that match the stuff3 elements, but how do I specify that the stuff3 elements must have a stuff5 value of false? Sorry if this is elementary or answered elsewhere; random searching didn't reveal the answer easily.
Thank you.

You're almost there. To match only stuff3 elements with a stuff5 value of false, add the predicate [stuff5/text()="false"] to the stuff3 step.
The complete XPath should look something like the following:
//stuff1[stuff2/text()=../stuff3[stuff5/text()="false"]/stuff4/text()]
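As a rough cross-check of what that expression selects, here is a minimal sketch using Python's standard-library ElementTree. It assumes a wrapping <root> element and a few extra "xyz" rows, none of which appear in the question, and it splits the condition into two steps because ElementTree supports only a subset of XPath and cannot evaluate the full expression directly.

```python
import xml.etree.ElementTree as ET

# The <root> wrapper and the "xyz" rows are assumptions added for illustration;
# the question's XML has no single root element.
XML = """<root>
  <stuff1><stuff2>abc</stuff2></stuff1>
  <stuff1><stuff2>xyz</stuff2></stuff1>
  <stuff3><stuff4>abc</stuff4><stuff5>true</stuff5></stuff3>
  <stuff3><stuff4>abc</stuff4><stuff5>false</stuff5><stuff6>extra stuff</stuff6></stuff3>
  <stuff3><stuff4>xyz</stuff4><stuff5>true</stuff5></stuff3>
</root>"""

root = ET.fromstring(XML)

# Step 1: stuff4 values of the stuff3 elements whose stuff5 is "false".
allowed = {s4.text for s4 in root.findall("./stuff3[stuff5='false']/stuff4")}

# Step 2: keep the stuff1 elements that have a stuff2 equal to one of those values.
matches = [
    s1 for s1 in root.findall("./stuff1")
    if any(s2.text in allowed for s2 in s1.findall("./stuff2"))
]

matched_values = [s1.findtext("stuff2") for s1 in matches]
print(matched_values)
```

The "xyz" stuff1 is left out because its only stuff3 partner has stuff5 set to true, which is the behaviour the extra predicate is meant to give.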

Related

Find HTML Tags in Properties

My current issue is finding HTML tags inside property values. I thought it would be easy to search with a query like /jcr:root/content/xgermany//*[jcr:contains(., '<strong>')] order by @jcr:score
There seems to be a problem with the characters < and >, because this query finds everything that has strong in its property. It finds <strong>Some Text</strong> but also This is a strong man.
The Query Builder API didn't help me either.
Is it possible to solve this with an XPath or SQL query, or do I have to iterate through the whole content?
I don't fully understand why it finds This is a strong man as a result for '<strong>', but it sounds like the unexpected behavior comes from the "simple search-engine syntax" for the second argument to jcr:contains(). Apparently the < > are just being ignored as "meaningless" punctuation.
You could try quoting the search term:
/jcr:root/content/xgermany//*[jcr:contains(., '"<strong>"')]
though you may have to tweak that if your whole XPath expression is enclosed in double quotes.
Of course this will not be very robust even if it works, since you're trying to find HTML elements by searching for fixed strings, instead of actually parsing the HTML.
If you have a specific jcr:primaryType and know the targeted properties, you can do something like this:
select * from nt:unstructured where text like '%<strong>%'
I tested it, but you need to know the properties you are interested in.
This is JCR-SQL syntax.
Start using predicates like a champ and all of this will make sense to you!
HTML encoded: &lt;strong&gt;
HTML decimal: &#60;strong&#62;
Query Builder is your friend.
Predicates for the literal <strong>:
path=/content/geometrixx
type=nt:unstructured
property=text
property.operation=like
property.value=%<strong>%
Have a go here:
http://localhost:4502/libs/cq/search/content/querydebug.html?charset=UTF-8&query=path%3D%2Fcontent%2Fgeometrixx%0D%0Atype%3Dnt%3Aunstructured%0D%0Aproperty%3Dtext%0D%0Aproperty.operation%3Dlike%0D%0Aproperty.value%3D%25%3Cstrong%3E%25
Predicates for the HTML-encoded form:
path=/content/geometrixx
type=nt:unstructured
property=text
property.operation=like
property.value=%&lt;strong&gt;%
Have a go here:
http://localhost:4502/libs/cq/search/content/querydebug.html?charset=UTF-8&query=path%3D%2Fcontent%2Fgeometrixx%0D%0Atype%3Dnt%3Aunstructured%0D%0Aproperty%3Dtext%0D%0Aproperty.operation%3Dlike%0D%0Aproperty.value%3D%25%26lt%3Bstrong%26gt%3B%25
XPath:
/jcr:root/content/geometrixx//element(*, nt:unstructured)
[
jcr:like(@text, '%<strong>%')
]
SQL2 (already covered... NASTY YUK..)
SELECT * FROM [nt:unstructured] AS s WHERE ISDESCENDANTNODE([/content/geometrixx]) and text like '%<strong>%'
Although I'm sure it's entirely possible with a string of predicates, it's possibly heading down the wrong route. Ideally it would be better to parse the HTML when it is stored or published.
The required information would be stored in simple properties on the node in question. The query then becomes a plain property = value query instead of a lot of overly complex query syntax.
It will probably be faster too.
So you could read in your HTML with something like HTMLClient and parse it with an OSGi service that accurately saves these properties for you. Every time the HTML changes, the process would update the properties as necessary. Just some thoughts if your SQL is getting too much.
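To illustrate why parsing beats fixed-string matching here, the following is a minimal sketch in Python's standard-library html.parser. It is not AEM or JCR code; the class and function names are made up for the example, and it only shows how a parser tells a real <strong> element apart from the word "strong" in running text.

```python
from html.parser import HTMLParser


class TagFinder(HTMLParser):
    """Records whether a given start tag occurs in the parsed markup."""

    def __init__(self, wanted_tag):
        super().__init__()
        self.wanted_tag = wanted_tag
        self.seen = False

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        if tag == self.wanted_tag:
            self.seen = True


def contains_tag(value, tag="strong"):
    """Return True if the property value contains an actual <tag> element."""
    finder = TagFinder(tag)
    finder.feed(value)
    finder.close()
    return finder.seen


print(contains_tag("<strong>Some Text</strong>"))
print(contains_tag("This is a strong man"))
```

A result like this could be written to a simple boolean property when the content is saved, so the later query only has to compare that property.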

LDAP search on multiple fields like an if/else-statement

I have a question regarding LDAP search. I have three attributes that I want to involve in my filter.
I want the filter to always search for objectClass; if the attribute skaPersonType has a value, look for that, else look for employeeType.
I'm stuck and really don't know how to continue.
Best regards / C
(1) Always search for objectclass
Unnecessary, but (objectClass=*): all LDAP entries have an objectClass.
(2) IF skaPerson=EMP is met, look for that value
(skaPerson=EMP)
(3) ELSE look for employeetype=External
(employeetype=External)
Any ideas how I can manage that?
You're looking for (2) or (3). So:
(|(skaPerson=EMP)(employeetype=External))
If you must have the redundant objectClass test:
(&(objectClass=*)(|(skaPerson=EMP)(employeetype=External)))
Not sure what filter you actually want:
...always shall search for objectClass, if attribute skaPersonType has a
value, look for that, else look for employeeType...
Are you looking for something like this?
(&(objectClass=MyClass)(|(skaPersonType=A)(&(!(skaPersonType=*))(employeeType=B))))
The filter above matches entries where:
objectClass equals MyClass, AND
one of the following conditions is met:
skaPersonType equals A, OR
skaPersonType has no value and employeeType equals B
The code is not tested.
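The if/else reading of that last filter can be sketched in plain Python. This is only a model of the logic, not LDAP client code: entries are assumed to be dicts mapping attribute names to lists of values, the function name is hypothetical, and comparisons are exact, whereas a real directory server applies each attribute's own matching rules.

```python
def matches(entry):
    """Model of (&(objectClass=MyClass)(|(skaPersonType=A)(&(!(skaPersonType=*))(employeeType=B))))."""
    has_class = "MyClass" in entry.get("objectClass", [])
    person_types = entry.get("skaPersonType", [])
    employee_types = entry.get("employeeType", [])
    if person_types:
        # Attribute present: only (skaPersonType=A) can succeed.
        chosen = "A" in person_types
    else:
        # Attribute absent: (!(skaPersonType=*)) holds, so test (employeeType=B).
        chosen = "B" in employee_types
    return has_class and chosen


entries = [
    {"objectClass": ["MyClass"], "skaPersonType": ["A"]},
    {"objectClass": ["MyClass"], "skaPersonType": ["X"], "employeeType": ["B"]},
    {"objectClass": ["MyClass"], "employeeType": ["B"]},
    {"objectClass": ["Other"], "skaPersonType": ["A"]},
]
results = [matches(e) for e in entries]
print(results)
```

The second entry is the case that separates a true if/else from a plain OR: it has a skaPersonType value, so employeeType is never consulted.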

Xpath Multiple Predicates

I am trying to quickly find a specific node using XPath but it seems my multiple predicates are not working. The div I need has a specific class, but there are 3 others that have it. I want to select the fourth one so I did the following:
//div[@class='myCLass' and 4]
However the "4" is being ignored. Any help? I am new to XPath.
Thanks.
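If an XPath query returns a node-set, you can append a positional predicate [N] (1-based) to pick a particular element of it. In your expression the 4 sits inside an and, so it is converted to a boolean (any non-zero number is true) instead of being treated as a position, which is why it appears to be ignored.
Use the following query to access the fourth element that matches the @class='myCLass' predicate:
//div[@class='myCLass'][4]
@WilliamNarmontas's answer is an alternative to the syntax shown above.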
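Alternatively,
//div[@class='myCLass' and position()=4]
Note that position() here counts all div children of the parent, not only those with the class, so this selects a div that is the fourth div of its parent and also has the class.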
The accepted answer works correctly only if all of the div elements have the same parent. Otherwise use:
(//div[@class='myCLass'])[4]
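The difference between the per-parent form and the parenthesised form can be sketched with Python's standard-library ElementTree. The markup and id values below are invented for the example, and because ElementTree supports only a subset of XPath, the positional step is done with ordinary list indexing rather than inside the expression.

```python
import xml.etree.ElementTree as ET

# Made-up markup: the matching divs are spread over two parents.
HTML = """<body>
  <section>
    <div class="myCLass" id="a1"/>
    <div class="myCLass" id="a2"/>
  </section>
  <section>
    <div class="myCLass" id="b1"/>
    <div class="myCLass" id="b2"/>
    <div class="other" id="bx"/>
    <div class="myCLass" id="b3"/>
    <div class="myCLass" id="b4"/>
  </section>
</body>"""

root = ET.fromstring(HTML)

# Like (//div[@class='myCLass'])[4]: fourth match in document order.
all_matches = root.findall(".//div[@class='myCLass']")
fourth_overall = all_matches[3].get("id")

# Like //div[@class='myCLass'][4]: fourth match among each parent's own children.
fourth_per_parent = []
for parent in root.iter():
    siblings = parent.findall("./div[@class='myCLass']")
    if len(siblings) >= 4:
        fourth_per_parent.append(siblings[3].get("id"))

print(fourth_overall, fourth_per_parent)
```

With this input the two forms pick different elements, which is why the parenthesised version is the safer choice when the divs do not share a parent.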

Prefix the result of a XPATH query

I use libxmljs to parse some html.
I have a xpath query which has an "or" conjunction to retrieve basically the information of two queries
Example
doc.find("//div[contains(@class,'important') or contains(@class,'overdue')]")
this returns all the divs with either important or overdue...
Can I prefix the results, or otherwise see within my result set which result comes from which condition?
The result could be an array with an index for the match: 0 for the first condition and 1 for the second. Is this possible?
Or how else can I find out which result comes from which query condition?
Thanks for any help.
P.S.: This is a simplified example of a sequence of elements which each have either an important or an overdue item: both, one or none of them. So I cannot go by looking at every second entry, etc.
This is the result I want to get...
message:{},
message:{
.....
important: "some important text",
overdue: "overdue date",
.....
}
There is no way to know which clause of an or XPath query caused a particular result to be included. It's simply not information that's kept around.
You'll either need to do entirely separate queries for important and overdue, or do one large query to get the entire result set (as you are now) and then further test each result's class to find out which one it is.
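The second approach (one query, then a test on each result) might look like the sketch below. The question uses libxmljs; this uses Python's standard-library ElementTree purely for illustration, and the markup is invented. The substring checks mirror the contains(@class, ...) tests from the query.

```python
import xml.etree.ElementTree as ET

# Made-up markup covering the cases: important only, overdue only, both, neither.
HTML = """<body>
  <div class="item important">some important text</div>
  <div class="item overdue">overdue date</div>
  <div class="item important overdue">both</div>
  <div class="item">neither</div>
</body>"""

root = ET.fromstring(HTML)

results = []
for div in root.findall(".//div"):
    css = div.get("class", "")
    # Same substring tests as contains(@class, '...') in the XPath query.
    flags = {"important": "important" in css, "overdue": "overdue" in css}
    if any(flags.values()):
        results.append((div.text, flags))

for text, flags in results:
    print(text, flags)
```

Each result carries its own flags, so an element matching both conditions is reported once with both set.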

Use XPath to select the element with a certain token in the value

I have the following XML:
<ZMARA SEGMENT="1">
<MATERIAL>000000000030001004</MATERIAL>
<PRODUCT_GROUP>14000IAA</PRODUCT_GROUP>
<PRODUCT_GROUP_DESC>HER 30 AR NEW Size</PRODUCT_GROUP_DESC>
<CLASS_CODE>I046</CLASS_CODE>
<CLASS_CODE_DESC>Heritage 30</CLASS_CODE_DESC>
<CHARACTERISTICS_01>,001,PLANNING_ALERT_PERCENTAGE, 50.000,PLANNI</CHARACTERISTICS_01>
<CHARACTERISTICS_02>X,001,COLOR_ATTRIBUTE,Weathered Wood,WEWD,Col</CHARACTERISTICS_02>
<CHARACTERISTICS_03>,001,ARMA_UOM,SALES SQUARE,SSQ,ARMA UNIT OF M</CHARACTERISTICS_03>
<CHARACTERISTICS_04>,001,ARMA_A_CATEGORY,05-Below 260 Lam/Multi-l</CHARACTERISTICS_04>
</ZMARA>
Using XPath I need to select the CHARACTERISTICS_XX element whose value contains the COLOR_ATTRIBUTE token. It will not always be characteristics_02. Thanks for the help. I am a total noob at XPath.
This looks like it's taken from an SAP IDoc; you can count yourself lucky that the field names are not six-character abbreviations :)
The answer given by spinon is correct; however, if another element could contain the text 'COLOR_ATTRIBUTE', this would give a more specific match:
/ZMARA/*[starts-with(local-name(.), 'CHARACTERISTICS_')][contains(.,'COLOR_ATTRIBUTE')]
Another suggestion is to avoid the '//' expression if you know where the ZMARA element can occur. In the expression above, ZMARA is only matched as the root element, which should be faster.
This should work:
//ZMARA/*[contains(.,'COLOR_ATTRIBUTE')]
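For comparison, the same two conditions (element name starts with CHARACTERISTICS_, value contains COLOR_ATTRIBUTE) can be sketched in Python's standard-library ElementTree. The XML is a shortened copy of the question's sample, and the sketch assumes the document uses no namespaces, so the tag equals the local name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample from the question.
XML = """<ZMARA SEGMENT="1">
  <MATERIAL>000000000030001004</MATERIAL>
  <CLASS_CODE_DESC>Heritage 30</CLASS_CODE_DESC>
  <CHARACTERISTICS_01>,001,PLANNING_ALERT_PERCENTAGE, 50.000,PLANNI</CHARACTERISTICS_01>
  <CHARACTERISTICS_02>X,001,COLOR_ATTRIBUTE,Weathered Wood,WEWD,Col</CHARACTERISTICS_02>
  <CHARACTERISTICS_03>,001,ARMA_UOM,SALES SQUARE,SSQ,ARMA UNIT OF M</CHARACTERISTICS_03>
</ZMARA>"""

root = ET.fromstring(XML)  # root is the ZMARA element itself

hits = [
    child for child in root
    # starts-with(local-name(.), 'CHARACTERISTICS_')
    if child.tag.startswith("CHARACTERISTICS_")
    # contains(., 'COLOR_ATTRIBUTE'): test the element's full text content
    and "COLOR_ATTRIBUTE" in "".join(child.itertext())
]
names = [child.tag for child in hits]
print(names)
```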
