XPath: find text within a table

I am trying to find and select the text "Paris" within a dynamic table. When I run my Selenium tests, they can find the table and verify that the text "Paris" exists, but they cannot click "Paris".
I think it's something like this:
/html/body/div[class='yui-dt-liner']/td/tr[contains 'Paris']/div
Or:
//div[class='yui-dt-liner']/table/tr[contains text(), "Paris"]/div
But I can't get it to work. Any help will be appreciated.

Besides the "cannot click Paris" part (very much a Selenium-specific question), both expressions you provided are syntactically incorrect.
They should probably be:
/html/body/div[@class='yui-dt-liner']/table/tr/td[contains(.,'Paris')]/div
and
//div[@class='yui-dt-liner']/table/tr/td[contains(.,'Paris')]/div
Note: @ is the abbreviated form of the attribute:: axis, and contains() is a function that takes two comma-separated arguments in parentheses. It is also better to test the string value of the element (. is the abbreviated form of self::node()) than its first text node child.
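The difference between the string value of an element and its first text node child can be sketched in Python. This is only an illustration under assumptions: the markup is hypothetical, and because the standard library's ElementTree implements just a subset of XPath (no contains()), the substring test is emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the cell text sits inside a nested <div>, not directly in <td>.
HTML = """
<table>
  <tr><td><div>Paris</div></td></tr>
  <tr><td><div>London</div></td></tr>
</table>
"""

root = ET.fromstring(HTML)


def string_value(element):
    """Concatenate all descendant text, like the XPath string value of '.'."""
    return "".join(element.itertext())


cells = root.findall(".//td")

# First text node child of each <td>: there is none here, so nothing matches.
first_text_hits = [td for td in cells if "Paris" in (td.text or "")]

# String value of each <td>: includes the text of the nested <div>.
string_value_hits = [td for td in cells if "Paris" in string_value(td)]

print(len(first_text_hits), len(string_value_hits))
```

With this markup, a test on the first text node finds no cell, while a test on the string value finds the Paris cell.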

Related

How to properly scrape filtered content into Google Sheets using an XPath query?

I want to pull a value from a website into Google Sheets, but I'm having difficulty understanding the class of the element that holds it.
Target link: https://www.cnbc.com/quotes/?symbol=XAU=
This number is what I want to get. Picture 1: the part I want to scrape
And this is what the markup looks like in the inspector. Picture 2: the markup shown in the inspector
The target is inside a span element, but its class attribute looks complicated to me, so I tried to simplify things with this formula: =IMPORTXML("https://www.cnbc.com/quotes/?symbol=XAU=","//table[@class='quote-horizontal regular']//tr/td/span")
Picture 3: a list is returned by this formula
After some tries I was able to get the right target, but the result confuses me. I'm using this formula: =IMPORTXML("https://www.cnbc.com/quotes/?symbol=XAU=","//table[@class='quote-horizontal regular']//tr/td/span[@class='last original'][1]")
Picture 4: the right target is returned when the XPath query is more specific
As you can see in Picture 2, 'last original' is not the full class name. When I use 'last original ng-binding' instead, I get an error saying the imported content is empty.
So, is my formula wrong, or did it work by accident? Is there a more correct way?
How about this answer?
Modified formula 1:
When the class name can be either last original or last original ng-binding, how about the following XPath and formula?
=IMPORTXML(A1,"//span[contains(@class,'last original')][1]")
In this case, the URL https://www.cnbc.com/quotes/?symbol=XAU= is put in cell "A1".
Here, //span[contains(@class,'last original')][1] is used as the XPath. It retrieves the value of the span whose class name includes last original, so it matches both last original and last original ng-binding.
Modified formula 2:
As another XPath, how about the following formula?
=IMPORTXML(A1,"//meta[@itemprop='price']/@content")
It seems that the value is also included in the page metadata, so this sample retrieves the value from there.
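Google Sheets evaluates the XPath itself, so the following Python sketch only illustrates why a substring test on @class is more tolerant than an exact comparison. The markup and the value 123.45 are hypothetical, and since the standard library's ElementTree has no contains(), the substring test is written in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet; the real page markup may differ.
HTML = """
<div>
  <span class="last original ng-binding">123.45</span>
  <span class="change">+1.00</span>
</div>
"""

root = ET.fromstring(HTML)

# Exact comparison: @class must equal the whole attribute string.
exact = root.findall(".//span[@class='last original']")

# Substring comparison, the Python analogue of contains(@class, 'last original').
partial = [s for s in root.iter("span") if "last original" in s.get("class", "")]

print(len(exact), [s.text for s in partial])
```

The exact comparison finds nothing once an extra class token is present, while the substring comparison still finds the span.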
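Reading the value from an attribute rather than from element text can be sketched as follows. The document below is a hypothetical stand-in for the real page, and ElementTree is used only because it ships with Python; the attribute is read with get() instead of the /@content step.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment with the price exposed as metadata.
HTML = """
<html>
  <head>
    <meta itemprop="name" content="Example quote" />
    <meta itemprop="price" content="123.45" />
  </head>
  <body><span class="last original">123.45</span></body>
</html>
"""

root = ET.fromstring(HTML)

# Equivalent in spirit to //meta[@itemprop='price']/@content.
meta = root.find(".//meta[@itemprop='price']")
price = meta.get("content") if meta is not None else None

print(price)
```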
Reference:
IMPORTXML
To complete @Tanaike's answer, two alternatives:
=IMPORTXML(B2;"//span[@class='year high']")
"Year high" seems to always equal the current stock index value.
Or, with the value retrieved from the script element:
=IMPORTXML(B2;"substring-before(substring-after(//script[contains(.,'modApi')],'""last\"":\""'),'\')")
Note: since I'm based in Europe, my formulas use ; as the argument separator. Depending on your locale, you may need to replace ; with , in the formulas.

How to extract items inside a table using Scrapy

I want to extract all the functions listed in the table at the link below: Python functions list
I tried using the Chrome developer console to find the exact XPath to use in spider.py, as below:
$x('//*[@id="built-in-functions"]/table[1]/tbody//a/@href')
but this returns a list of all the hrefs (which I think is what the XPath expression refers to).
I believe I need to extract the text from here, but appending /text() to the above XPath returns nothing. Can someone please help me extract the function names from the table?
I think this should do the trick
response.css('.docutils .reference .pre::text').extract()
A non-exact XPath equivalent (which also works in this case) would be:
response.xpath('//table[contains(@class, "docutils")]//*[contains(@class, "reference")]//*[contains(@class, "pre")]/text()').extract()
Try this:
for td in response.css("#built-in-functions > table:nth-child(4) td"):
    td.css("span.pre::text").extract_first()
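Appending /text() to .../@href selects nothing because attribute nodes have no text node children; the function name lives in the element content of the link. A minimal sketch with the standard library, assuming a hypothetical fragment loosely modeled on a Sphinx-generated table (the real page may differ):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; class names and hrefs are assumptions for illustration.
HTML = """
<table class="docutils">
  <tr>
    <td><a class="reference" href="#abs"><code><span class="pre">abs()</span></code></a></td>
    <td><a class="reference" href="#all"><code><span class="pre">all()</span></code></a></td>
  </tr>
</table>
"""

root = ET.fromstring(HTML)
links = root.findall(".//td/a")

# Attribute values: what //a/@href returns.
hrefs = [a.get("href") for a in links]

# Element text: the function names, taken from the links' descendant text.
names = ["".join(a.itertext()) for a in links]

print(hrefs, names)
```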

XPath query won't work in Google Sheets

I've been trying to use the following XPath in Google Sheets with the =ImportXML function:
=importXml("http://www.managetickets.com/morecApp/ticketSearchAndStatusTicketList.jsp?msgCount=23&outputEmail=&db=nd", "/table[2]/tbody/tr@[td]")
But no matter what minor adjustments I try, I continually get "#N/A" with a hover-text box that says "imported content is empty".
I know it's a valid XPath; I've cross-verified it with the 'X-Path Helper Wizard' Chrome extension.
Any ideas what I'm doing wrong!?
No, actually it's not a valid XPath. Note that @ is used to select attributes, e.g. @class, @id, etc. It's also a bad idea to use the tbody tag in your expressions, as this tag is not always present in the initial source code.
So if you want to match the rows of the second table that contain cells, you can use
/table[2]//tr[td]
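The effect of dropping tbody and filtering rows with the [td] predicate can be sketched with Python's standard library. The markup is a hypothetical stand-in for the raw page source, in which no tbody element is present:

```python
import xml.etree.ElementTree as ET

# Hypothetical source markup without <tbody>, as a server might send it.
HTML = """
<body>
  <table><tr><td>first table</td></tr></table>
  <table>
    <tr><th>Ticket</th><th>Status</th></tr>
    <tr><td>A-1</td><td>open</td></tr>
    <tr><td>A-2</td><td>closed</td></tr>
  </table>
</body>
"""

root = ET.fromstring(HTML)
second_table = root.find("table[2]")

# A path that insists on tbody finds nothing in this source.
with_tbody = second_table.findall("tbody/tr[td]")

# tr[td] keeps only rows that have at least one <td> child (the header row is skipped).
data_rows = second_table.findall(".//tr[td]")
rows = [[td.text for td in tr.findall("td")] for tr in data_rows]

print(len(with_tbody), rows)
```

Browsers insert tbody when they build the DOM, which is one reason an expression checked in a browser extension can fail against the raw source.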

XPath: using contains() with text() to match part of an element's text

I've been hacking away at this one for hours and I just can't figure it out. Using XPath to find text values is tricky and this problem has too many moving parts.
I have a webpage with a large table, and a section of this table contains a list of users (assignees) that are assigned to a particular unit. There are nearly always multiple users assigned to a unit, and I need to make sure a particular user is assigned to any of the units in the table. I've used XPath for nearly all of my selectors and I'm halfway there on this one. I just can't figure out how to use contains with text() in this context.
Here's what I have so far:
//td[@id='unit']/span [text()='asdfasdfasdfasdfasdf (Primary); asdfasdfasdfasdfasdf, asdfasdfasdfasdf; 456, 3456'; testuser]
The XPath Query above captures all text in the particular section I am looking at, which is great. However, I only need to know if testuser is in that section.
text() gets you a set of text nodes. I tend to use it more in a context of //span//text() or something.
If you are trying to check whether the text inside an element contains something, you should use contains on the element itself rather than on the result of text(), like this:
span[contains(., 'testuser')]
XPath is pretty good with context. If you know exactly what text a node should have you can do:
span[.='full text in this span']
But if you want to do something like regular expressions (using exslt for example) you'll need to use the string() function:
span[regexp:test(string(.), 'testuser')]
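The difference between an exact match on the string value and a substring match can be sketched in Python. The user names and markup are hypothetical. ElementTree (Python 3.7 or later) understands the [.='text'] predicate but not contains(), so the substring test is written in plain Python here.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: one cell lists several assignees in a single span.
HTML = """
<table>
  <tr>
    <td id="unit"><span>alice (Primary); bob, testuser</span></td>
    <td id="other"><span>carol</span></td>
  </tr>
</table>
"""

root = ET.fromstring(HTML)

# Exact match on the whole string value: fails, the span holds more than 'testuser'.
exact = root.findall(".//td[@id='unit']/span[.='testuser']")

# Substring match, the Python analogue of span[contains(., 'testuser')].
spans = root.findall(".//td[@id='unit']/span")
containing = [s for s in spans if "testuser" in "".join(s.itertext())]

print(len(exact), len(containing))
```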

Query html tag with XPath

I am writing a Selenium test.
The page has a label, "Assign Designer", with a select box right after it.
Unfortunately, the select box has a dynamic id, so I cannot query it by id or by any of its other attributes.
Can I build an XPath query that returns the "first select tag after the text 'Assign Designer'"?
PS. Selenium supports only XPath 1.0
This would be something like:
//label[text() = 'Assign Designer']/following-sibling::select[1]
Note that:
The // shorthand is quite inefficient, because it causes a document-wide scan. If you can be more specific about the label's position, I recommend doing so. If the document is small, however, this won't be a problem.
Since I don't know much about Selenium, I used "label". If it is not a <label>, you should use the actual element name, of course. ;-)
Be sure to include a position predicate ([1], in this case) whenever you use an axis like "following-sibling". It's easily forgotten, and without it your expression may match more nodes than you expect.
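What following-sibling::select[1] selects can be emulated with Python's standard library, which has no sibling axes. This is a sketch under assumptions: the form markup, the ids, and the helper name first_following_sibling are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical form: two <select> elements follow the target label.
HTML = """
<form>
  <label>Project</label>
  <select id="gen-17"><option>Alpha</option></select>
  <label>Assign Designer</label>
  <input type="hidden" name="token" />
  <select id="gen-42"><option>Dana</option></select>
  <select id="gen-43"><option>Eli</option></select>
</form>
"""


def first_following_sibling(parent, label_text, tag):
    """Return the first `tag` child that comes after the matching <label>."""
    seen_label = False
    for child in parent:
        if seen_label and child.tag == tag:
            return child  # stop at the first hit, like the [1] predicate
        if child.tag == "label" and (child.text or "") == label_text:
            seen_label = True
    return None


root = ET.fromstring(HTML)
select = first_following_sibling(root, "Assign Designer", "select")

print(select.get("id"))
```

Without the early return (the counterpart of [1]), both selects after the label would qualify, which is the unexpected result the last note warns about.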
