How to fetch multiple cq pages using XPath based on a page's property

I have two cq pages and I want to retrieve both of them by their jcr:title property using XPath. The query below works fine for a single title.
/jcr:root/content/test//element(*, cq:Page)[((jcr:like(jcr:content/@jcr:title, dell)))]
But I want to execute it for multiple titles. I tried the following, but it does not work:
/jcr:root/content/test//element(*, cq:Page)[((jcr:like(jcr:content/@jcr:title, [dell,samusng])))]
Could anyone help me write the XPath query?

As rakhi4110 already mentioned in the comment, you can combine multiple conditions with an or, just like in SQL. I think, though, that you want either exact matches or jcr:contains instead of jcr:like.
Exact match:
/jcr:root/content/test//element(*, cq:Page) [jcr:content/@jcr:title='dell' or jcr:content/@jcr:title='samsung']
Contains:
/jcr:root/content/test//element(*, cq:Page) [jcr:contains(jcr:content/@jcr:title, 'dell') or jcr:contains(jcr:content/@jcr:title, 'samsung')]
Or, if you really want to use jcr:like, which uses % as a wildcard just as in SQL:
/jcr:root/content/test//element(*, cq:Page) [jcr:like(jcr:content/@jcr:title, '%dell%') or jcr:like(jcr:content/@jcr:title, '%samsung%')]
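When the list of titles is not fixed, the or-chain can be generated instead of typed by hand. The following is only a sketch: build_title_query is a hypothetical helper, not part of any CQ or JCR API, and it assumes that a single quote inside an XPath string literal is escaped by doubling it.

```python
def build_title_query(root_path, titles):
    """Build a JCR XPath query matching cq:Page nodes whose title contains any of `titles`."""
    if not titles:
        raise ValueError("at least one title is required")
    conditions = []
    for title in titles:
        # Assumption: a single quote in an XPath string literal is escaped by doubling it.
        escaped = title.replace("'", "''")
        conditions.append(
            "jcr:like(jcr:content/@jcr:title, '%{}%')".format(escaped)
        )
    # Join the individual conditions with "or", as in the hand-written query.
    return "/jcr:root{}//element(*, cq:Page)[{}]".format(
        root_path, " or ".join(conditions)
    )


query = build_title_query("/content/test", ["dell", "samsung"])
print(query)
```

The resulting string can then be passed to whatever query API you already use.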

Related

How to extract items inside a table using scrapy

I want to extract all the functions listed inside the table at the link below: python functions list
I used the Chrome developer console to find the exact XPath to use in spider.py:
$x('//*[@id="built-in-functions"]/table[1]/tbody//a/@href')
but this returns a list of all the hrefs (which, I think, is what the XPath expression asks for).
I believe I need to extract the text instead, but appending /text() to the XPath above returns nothing. Can someone please help me extract the function names from the table?
I think this should do the trick
response.css('.docutils .reference .pre::text').extract()
A non-exact XPath equivalent (which also works in this case) would be:
response.xpath('//table[contains(@class, "docutils")]//*[contains(@class, "reference")]//*[contains(@class, "pre")]/text()').extract()
Try this:
for td in response.css("#built-in-functions > table:nth-child(4) td"):
    td.css("span.pre::text").extract_first()

xpath query url with one folder depth only

I am using this XPath query successfully:
//div[(@class="result")]//a[contains(@href,"pinterest.com")]/@href
The URL I am running the XPath query against (with simple_html_dom.php) is this one here.
Now, I would like to find results for pinterest.com/one-folder-deep-only and exclude all URLs deeper than one directory, like pinterest.com/one-folder-deep-only/this or pinterest.com/one-folder-deep-only/this/this. I have no idea if there is a way to achieve that. Have googled a lot, but not found anything. Maybe my search terms weren't the best.
Do you have any ideas? Thanks for helping me here.
I am testing the query using the Chrome XPath Helper.
"//" evaluates all levels/depths. Use a single "/" before the "a" step to evaluate only immediate children:
//div[(@id="first-result")]/a[contains(@href,"url.com")]/@href
Note the use of / instead of // before the "a" tag.
Try the XPath below to select @href from the required anchors only:
//a[contains(@href, "url.com") and not(contains(substring-after(./@href, 'url.com/'), "/"))]/@href
Solution for XPath 2.0:
//a[contains(@href, "url.com") and count(tokenize(@href, "/"))=2]/@href
Note that if the href in the real HTML source starts with "http://url.com", you should specify =4 instead of =2.
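If the hrefs are post-processed after extraction anyway, the depth check can also be done outside XPath. The sketch below uses only the Python standard library; is_one_folder_deep is a hypothetical helper name, and treating a trailing slash as "still one folder deep" is an assumption that differs slightly from the substring-after expression above.

```python
from urllib.parse import urlparse


def is_one_folder_deep(href, host="pinterest.com"):
    """Return True when href points at `host` with exactly one path segment."""
    # Tolerate hrefs without a scheme, e.g. "pinterest.com/foo".
    parsed = urlparse(href if "://" in href else "//" + href)
    hostname = (parsed.hostname or "").lower()
    # Accept the bare host and its subdomains (www.pinterest.com and so on).
    if hostname != host and not hostname.endswith("." + host):
        return False
    # Empty segments come from leading, trailing or doubled slashes.
    segments = [s for s in parsed.path.split("/") if s]
    return len(segments) == 1


hrefs = [
    "https://www.pinterest.com/one-folder-deep-only/",
    "https://pinterest.com/one-folder-deep-only/this",
    "pinterest.com/one-folder-deep-only",
    "https://pinterest.com/",
]
kept = [h for h in hrefs if is_one_folder_deep(h)]
print(kept)
```

Parsing the URL instead of counting slashes avoids having to special-case the scheme, which is what the =2 versus =4 note is about.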

How to show more than 10 results in eXide (eXist-db)?

I wonder how to make eXide return more than 10 results. No matter how I query the database, I cannot get more. Is there some special rule in $EXIST_HOME or similar?
I use eXist-db 3.0.RC1.
One way is to wrap your query in an element.
<results>{... your query here ...}</results>
Wrapping results is the way to go, but if you wish, you can edit /db/apps/eXide/resources/scripts/eXide.min.js, changing "10" in "q=n+10-1" to some other number.

Find HTML Tags in Properties

My current issue is finding HTML tags inside property values. I thought it would be easy to search with a query like /jcr:root/content/xgermany//*[jcr:contains(., '<strong>')] order by @jcr:score
It looks like there is a problem with the characters < and >, because this query finds everything that has strong in its properties. It finds <strong>Some Text</strong> but also This is a strong man.
The Query Builder API didn't help me either.
Is it possible to solve this with an XPath or SQL query, or do I have to iterate through the whole content?
I don't fully understand why it finds This is a strong man as a result for '<strong>', but it sounds like the unexpected behavior comes from the "simple search-engine syntax" for the second argument to jcr:contains(). Apparently the < > are just being ignored as "meaningless" punctuation.
You could try quoting the search term:
/jcr:root/content/xgermany//*[jcr:contains(., '"<strong>"')]
though you may have to tweak that if your whole XPath expression is enclosed in double quotes.
Of course this will not be very robust even if it works, since you're trying to find HTML elements by searching for fixed strings, instead of actually parsing the HTML.
If you have a specific jcr:primaryType and know the targeted properties, you can do something like this:
select * from nt:unstructured where text like '%<strong>%'
I tested it, but you need to know the properties you are interested in.
This is JCR-SQL syntax.
Start using predicates like a champ; this way all of this will make sense to you!
HTML Encode &lt;strong&gt;
HTML Decimal &#60;strong&#62;
Query builder is your friend:
Predicates: (like a CHAMP!)
path=/content/geometrixx
type=nt:unstructured
property=text
property.operation=like
property.value=%<strong>%
Have a go here:
http://localhost:4502/libs/cq/search/content/querydebug.html?charset=UTF-8&query=path%3D%2Fcontent%2Fgeometrixx%0D%0Atype%3Dnt%3Aunstructured%0D%0Aproperty%3Dtext%0D%0Aproperty.operation%3Dlike%0D%0Aproperty.value%3D%25%3Cstrong%3E%25
Predicates: (like a CHAMP!)
path=/content/geometrixx
type=nt:unstructured
property=text
property.operation=like
property.value=%&lt;strong&gt;%
Have a go here:
http://localhost:4502/libs/cq/search/content/querydebug.html?charset=UTF-8&query=path%3D%2Fcontent%2Fgeometrixx%0D%0Atype%3Dnt%3Aunstructured%0D%0Aproperty%3Dtext%0D%0Aproperty.operation%3Dlike%0D%0Aproperty.value%3D%25%26lt%3Bstrong%26gt%3B%25
XPath:
/jcr:root/content/geometrixx//element(*, nt:unstructured)
[
jcr:like(@text, '%<strong>%')
]
SQL2 (already covered... NASTY YUK..)
SELECT * FROM [nt:unstructured] AS s WHERE ISDESCENDANTNODE([/content/geometrixx]) and text like '%<strong>%'
Although I'm sure it's entirely possible with a string of predicates, it's possibly heading down the wrong route. Ideally it would be better to parse the HTML when it is stored or published.
The required information would be stored on simple properties on the node in question. The query will then be a lot simpler with just a property = value query, than lots of overly complex query syntax.
It will probably be faster too.
So you could read in your HTML with something like HTMLClient and then parse it with an OSGi service that can accurately save these properties for you. Every time the HTML changes, the process would update these properties as necessary. Just some thoughts if your SQL is getting too much.
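To make the "parse it instead of string-matching it" idea concrete, here is a minimal sketch using only the Python standard library. It is not an OSGi service and not tied to any CQ API; TagFinder and contains_tag are hypothetical names. It shows how a real parser tells a <strong> element apart from the word "strong" in plain text.

```python
from html.parser import HTMLParser


class TagFinder(HTMLParser):
    """Records whether a given start tag occurs in the fed markup."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag.lower()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag names before calling this hook.
        if tag == self.tag:
            self.found = True


def contains_tag(value, tag="strong"):
    """Return True if `value` contains an actual <tag> element."""
    finder = TagFinder(tag)
    finder.feed(value)
    finder.close()
    return finder.found


print(contains_tag("<strong>Some Text</strong>"))
print(contains_tag("This is a strong man"))
```

A flag computed this way could be stored as a simple property on the node when the content is saved, so the later query becomes a plain property = value lookup.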

Handling Dynamic Xpath

I am automating things using Selenium and need your help handling a dynamic XPath like the one below:
Driver.findElement(By.xpath("//*[@id='INQ_2985']/div[2]/tr/td/div/div[3]/div")).click();
The INQ_2985 above changes to 2986, 2987, 2988, etc. on each run.
HTML code:
<div class="context-menu-item-inner" style="background-image:url(../images/productSmall.png);">Tender Assignment</div>
I tried different combinations, as below, but with no success:
// Driver.findElement(By.name("//input[@name='Tender Assignment']")).click();
// Driver.findElement(By.className("context-menu-item-inner")).click();
Can you help me with this?
You can try using contains() or starts-with() in the XPath.
The XPath above can be rewritten as follows:
Driver.findElement(By.xpath("//*[starts-with(@id,'INQ')]/div[2]/tr/td/div/div[3]/div")).click();
If you can post more of your HTML, we can help improve your XPath.
Moreover, using such long XPaths is not recommended; it may cause your test to fail more often.
For example, if a new table cell or div is added to the UI, the XPath above will no longer be valid.
You should try to use id, class or other attributes to get closer to the element you're trying to find.
I personally recommend using CSS selectors over XPath.
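To illustrate why matching on stable attributes survives the changing id, here is a small sketch that runs without a browser. It uses Python's standard-library ElementTree rather than Selenium, and the markup in SAMPLE is hypothetical (the question only shows the inner div). find_menu_item is an illustrative helper, not a Selenium API.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the id suffix changes on every run, the class does not.
SAMPLE = """
<body>
  <div id="INQ_2986">
    <div class="context-menu-item-inner">Tender Assignment</div>
    <div class="context-menu-item-inner">Other Action</div>
  </div>
</body>
"""


def find_menu_item(root, label):
    """Return the first menu-item div whose text equals `label`, or None."""
    for div in root.findall(".//div[@class='context-menu-item-inner']"):
        if (div.text or "").strip() == label:
            return div
    return None


root = ET.fromstring(SAMPLE)
item = find_menu_item(root, "Tender Assignment")

# Equivalent of starts-with(@id, 'INQ_'): match the stable prefix only.
dynamic_ids = [
    el.get("id") for el in root.iter() if el.get("id", "").startswith("INQ_")
]
print(item.text, dynamic_ids)
```

Neither lookup depends on the numeric part of the id or on the element's position in the tree, which is the property you want from a Selenium locator as well.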
You can use many methods.
Use an implicit wait, then:
driver.findElement(By.xpath("//*[contains(@id,'select2-result-label-535')]")).click();
driver.findElement(By.xpath("//*[contains(text(), 'select2-result-label-535')]")).click();
A partial match works well too (contains() is a substring match, not a true regular expression):
driver.findElement(By.xpath("//*[contains(@id,'INQ_')]"));
Note: If only a single id starts with INQ_, you can act on the element directly. If there are several, extract them as a List<WebElement> and match on the element's text (element.getText().trim().equals("Linked Text")), then act on the one that matches. You can follow other logic to traverse and match.
You can use CSS:
div.context-menu-item-inner
Use this selector:
driver.findElement(By.cssSelector("div.context-menu-item-inner")).click();
The best choice is to use the full XPath instead of the id; you can get it easily via Firebug.
e.g.
/html/body/div[3]/div[3]/div[2]/div/div[2]/div[1]/div/div[1]
If your XPath varies,
e.g. "//*[@id='msg500']", "//*[@id='msg501']", "//*[@id='msg502']" and so on,
then use this code in your script:
for (int i = 0; i <= 9; i++) {
    String mpath = "//*[@id='msg50" + i + "']";
    driver.findElement(By.xpath(mpath)).click();
}
