Xpath - search root with conditions and below root with conditions - xpath

I have a structure like this:
<sv:a>
<sv:b sv:name="one"/>
<sv:b sv:name="two"/>
<sv:c sv:name="exclude"/>
<sv:b sv:name="error"/>
<sv:a>
I am trying to get all a's and b's but exclude from my search the content of any c.
I have this structure so far for my xpath query
//*[not(name()='error') and jcr:contains(*, 'searchInput')]
I want to add something to this to essentially say, "do not give me any node named exclude" or maybe a better way to put it is "exclude any node named exclude from the search". I am not sure if I can do that using the path initially used of //* and just filtering a different way. I know I cannot just say not(name()='exclude') because it is only looking at one level below root and only excludes nodes at that level.
Is there a way to search 1 more level below and exclude certain nodes by their name or search everything in the entire document and exclude those nodes of a particular name?
Im not sure it matters, but I am working the CMS Magnolia and trying to make a site search. I hit a limitation using jcr sql2 and cannot do what I am trying to do here as far as I have found in researching this.
EDIT:
Based on answers and comments, here is what I am looking at now:
//*[not(#sv:name='exclude' or #sv:name='error') and jcr:contains(*, 'searchInput)]
I still seem to be getting the 'exclude' results so I must either not be registering 'sv:' correctly or missing something in the query needed to exclude some of the results from the search.

I want to add something to this to essentially say, "do not give me any node named exclude"
That's easy: Nodes (elements) named exclude can be selected via the self axis,
using *[self::exclude]. Corollary: An element not named exclude is *[not(self::exclude)].
But I think you don't refer to element names. You don't have any <exclude> elements in your input.
You actually seem to refer to attributes.
//*[not(#sv:name = 'error' or #sv:name = 'exclude') and jcr:contains(*, 'searchInput')]

I am trying to get all a's and b's but exclude from my search the content of any c.
You can't, at least not with pure XPath. XPath is a language for selecting nodes out of an XML tree, not for building new trees that are different. An XPath expression can either select the a or not select the a, but it can't give you a new a element that has only some of the children of the original a and not others.

Related

Element Paths for Search Parameters

I'm trying to find the most reliable way to determine which element in a resource a given search parameter refers to. So far I process the xpath expression and hope to find a match, but this seems hacky.
Is there a standard or more consistent way to determine what element(s) within a resource a search parameter should use?
Not right now - parsing the XPath is it. This is an active point of discussion in the standards group and will probably result in some modification to the SearchParameter resource (I hope!).

Shorten XPATH with wildcards

I'm currently trying to figure out how to shorten my extremely long xpath.
//div[#class='m_set_part'][1]/div/div[2]/div[#class='row']/div[#class='col details detail-head']/div[#class='detail-body']/div[2]/div/div[#class='size']/div/div[#class='m_product_finder_size']/ul/li[1]/span[#class='size-btn']/a
This is the one I have right now and it's way too long, the problem is I need the first node to differentiate between products. Is there a way to shorten it like
//div[#class='m_set_part']/*/span[#class='size-btn']/a
Or do I have to go through all childnodes to reach the last nodes?
Link
I want to find the for each product the sizebuttons. The only way to differentiate them, I guess, is via adding a [1] or [2] to the m_set_part node.
You are basically correct. As said in the comments, you can use // to select descendant or self nodes. Hence, this will give you all the size links:
//span[#class='size-btn']/a
As you suggest, you can select the specific product using a positional predicate. However, if you prefer you could also use another detail, e.g. the name. This would simply be
//div[#class="m_set_part"][.//label="Vælg"]
to given you the Vælg product.
Now combine them both and you can get the size link for this specifc product using
//div[#class="m_set_part"][.//label="Vælg"]//span[#class='size-btn']/a
or using the psoitional predicate it would be
//div[#class="m_set_part"][1]//span[#class='size-btn']/a
Also, please make sure you use a proper namespace as this is an actual XHTML document. One other thing is that you might prefer to use contains(#class, 'm_set_part') instead of #class="m_set_part" and the like, because the query will still work even if the add new CSS classes to this element.
To answer to your question: No you don't have to go through all nodes.
You may use the // descendant-or-self selector to 'skip' zero or more nodes in between the preceeding and the next part of the expression. So //div[#class='m_set_part']//span[#class='size-btn']/a might give you exactly what you want. * on the other hand matches any node, but exactly one node. Therfore
//div[#class='m_set_part'][1]/*/*[2]/*[#class='row']/*[#class='col details detail-head']/*[#class='detail-body']/*[2]/*/*[#class='size']/*/*[#class='m_product_finder_size']/*/*[1]/*[#class='size-btn']/a
is another way to shorten your original expression. Whether it's still returns only the interested node or more is solely depends on the document you apply the expression on.

How to uniquly identify an two objects in same page having same url

I Have two objects in same page but with different locations(tabs), I want to verify those objects each a part ...
i cant uniquely any of objects because the have same properties.
These objects clearly are unique to a point because they have completely different text, this means that you will be able to create an object to match only one of them. My suggestion would be to look for the object by using its text property, one of them will always have "Top Ranking" the other you wil need to turn into a regular expression for the text and will be something "Participants (\d+)".
I am assuming that this next answer is unlikely to be possible so saved it for after the answer you are likely to use but the best solution would of course be to get someone with access to give these elements ids for you to search for. This will in the long term be much easier for you to maintain and not using text will allow this test to run in any language.
Manaysah, do these objects have different indexes? Use the object spy and determine which index they have, the ordinal identifier index may be a solution to your problem. You could also try adding an innertext object property if possible, using a wildcard for the number inside the () as it appears dynamic.
try using xpath for the objects...xpath will definitely be different

XPath simple conditional statement? If node X exists, do Y?

I am using google docs for web scraping. More specifically, I am using the Google Sheets built in IMPORTXML function in which I use XPath to select nodes to scrape data from.
What I am trying to do is basically check if a particular node exists, if YES, select some other random node.
/*IF THIS NODE EXISTS*/
if(exists(//table/tr/td[2]/a/img[#class='special'])){
/*SELECT THIS NODE*/
//table/tr/td[2]/a
}
You don't have logic quite like that in XPath, but you might be able to do something like what you want.
If you want to select //table/tr/td[2]/a but only if it has a img[#class='special'] in it, then you can use //table/tr/td[2]/a[img[#class='special']].
If you want to select some other node in some other circumstance, you could union two paths (the | operator), and just make sure that each has a filter (within []) that is mutually exclusive, like having one be a path and the other be not() of that path. I'd give an example, but I'm not sure what "other random node" you'd want… Perhaps you could clarify?
The key thing is to think of XPath as a querying language, not a procedural one, so you need to be thinking of selectors and filters on them, which is a rather different way of thinking about problems than most programmers are used to. But the fact that the filters don't need to specifically be related to the selector (you can have a filter that starts looking at the root of the document, for instance) leads to some powerful (if hard-to-read) possibilities.
Use:
/self::node()[//table/tr/td[2]/a/img[#class='special']]
//table/tr/td[2]/a

XPath Query in JMeter

I'm currently working with JMeter in order to stress test one of our systems before release. Through this, I need to simulate users clicking links on the webpage presented to them. I've decided to extract theese links with an XPath Post-Processor.
Here's my problem:
I have an a XPath expression that looks something like this:
//div[#data-attrib="foo"]//a//#href
However I need to extract a specific child for each thread (user). I want to do something like this:
//div[#data-attrib="foo"]//a[position()=n]//#href
(n being the current index)
My question:
Is there a way to make this query work, so that I'm able to extract a new index of the expression for each thread?
Also, as I mentioned, I'm using JMeter. JMeter creates a variable for each of the resulting nodes, of an XPath query. However it names them as "VarName_n", and doesn't store them as a traditional array. Does anyone know how I can dynamicaly pick one of theese variables, if possible? This would also solve my problem.
Thanks in advance :)
EDIT:
Nested variables are apparently not supported, so in order to dynamically refer to variables that are named "VarName_1", VarName_2" and so forth, this can be used:
${__BeanShell(vars.get("VarName_${n}"))}
Where "n" is an integer. So if n == 1, this will get the value of the variable named "VarName_1".
If the "n" integer changes during a single thread, the ForEach controller is designed specifically for this purpose.
For the first question -- use:
(//div[#data-attrib="foo"]//a)[position()=$n]/#href
where $n must be substituted with a specific integer.
Here we also assume that //div[#data-attrib="foo"] selects a single div element.
Do note that the XPath pseudo-operator // typically result in very slow evaluation (a complete sub-tree is searched) and also in other confusing problems ( this is why the brackets are needed in the above expression).
It is recommended to avoid using // whenever the structure of the document is known and a complete, concrete path can be specified.
As for the second question, it is not clear. Please, provide an example.

Resources