XPath for regular text after bold text? - xpath

I am using XPath and I am trying to get the regular text from HTML that looks like this:
<p><strong>Gender: </strong>Female</p>
<p><strong>Years in Practice: </strong>30 Years</p>
<p><strong>Languages Spoken: </strong>English</p>
I tried getting the regular text with the XPath:
.//strong//text()
But that returns only the bold text, not the regular text. What I want is:
Female
30 Years
English
and not "Gender", "Years in Practice", "Languages Spoken".
How do I get the regular text?

To get only the "regular" text, try:
//p/text()
If you need to make sure that the returned text comes after the strong element, try:
//p/strong/following-sibling::text()

If you want to get just the regular text, use //p/text()

Related

How to find xpath of an element under a heading

In a web page:
<h3 class="xh-highlight">Units Currently On Bed List</h3>
"[total beds=0]
"
I want to find the XPath of total beds=0.
How can I do that?
Your question and your comment are a bit contradictory. Do you want to find the text after a heading or do you want to find the element containing the text [total beds=0]? Also, how exact do you want to navigate your document?
To find a text after any h3 element you can use this: //h3/following-sibling::text()[1] (see XPath - select text after certain node).
To find a text after an h3 element with the class "xh-highlight" you can use this: //h3[@class='xh-highlight']/following-sibling::text()[1]
To be even more precise you can also look for the heading text: //h3[@class='xh-highlight' and text()='Units Currently On Bed List']/following-sibling::text()[1]
This doesn't match the html in your first comment however, so you might want to adjust the header class and text values. Also, it will find any first text even if there are other elements between it and the h3 element.
Now, your second comment makes it seem you actually want to find the element containing the text. The reason //*[text()='[total beds=0]'] doesn't work is because of the newline in the text. If you can get rid of that in the source it should match, otherwise you can "ignore" it in the xpath by using //*[normalize-space(text())='[total beds=0]']. (This is assuming the quotes around the text in your question aren't actually in the document.)

Ruby/Regex: Dealing with strings containing forward slashes and parentheses using gsub and regex

Hi, I am using Watir to click through some links. I go to a page, click a link based on its text, and then do it again, clicking a new link. I am locating the links by their text (it is the only way I can, given their HTML) and need to match the text I pulled from the page to the link. The text that I get contains some extra text that is not part of the link, so I need to gsub it out. Here is my issue:
String: text = "Nuclear Launch Codes (Levels One/Two)"
Link: Nuclear Launch Codes (Levels One/Two) Blah Blah Blah
Because the links do not always have the exact text, I need to locate them like so: /#{text}/
The problem is that this returns "Nuclear Launch Codes (Levels One\/Two)"
I thought I would gsub out the first parenthesis and everything after it, but I need to keep it because I can also have Nuclear Launch Codes (Levels Four/Five)
Is there any way to modify the string so that it matches the link while ignoring the rest of the link text?
If I understand you correctly, try:
/#{Regexp.escape(text)}/
Or equivalently, if you prefer:
Regexp.new(Regexp.escape(text))
This will automatically escape parentheses, slashes and so on in the text so they are not treated as special regexp characters.
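As a minimal sketch of the difference, the snippet below compares the unescaped and escaped patterns against the link text from the question. The variable names (link_text, naive, safe) are illustrative and not from the original post, and the Watir call in the last comment is an assumption about how the pattern would be used.

```ruby
# Text pulled from the page, as in the question.
text = "Nuclear Launch Codes (Levels One/Two)"
# Full link text, which carries extra trailing words.
link_text = "Nuclear Launch Codes (Levels One/Two) Blah Blah Blah"

# Unescaped: the parentheses are read as a capture group, so the
# pattern expects "Levels One/Two" WITHOUT literal parentheses.
naive = /#{text}/

# Escaped: every regexp metacharacter in text is matched literally.
safe = /#{Regexp.escape(text)}/

naive_match = naive.match?(link_text)
safe_match  = safe.match?(link_text)

puts "naive: #{naive_match}, safe: #{safe_match}"

# Assumed usage with Watir (not run here): browser.link(text: safe).click
```

The backslash before the slash that shows up when the regexp is printed comes from how Ruby displays a regexp literal; it does not affect matching.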

How to remove link's text from html table

I want to extract the plain text in the HTML table (that is, I don't want to grab the information marked by the red arrow).
However, when I get the text with cell.text, it also includes the unwanted hyperlink text:
"\n central tendency1 \n "
I expected to get
"central tendency"
So I tried cell.text.strip.downcase.gsub!(/\d/, "").
However, that gsub! call also removes the information in the green rectangle.
Is there any way to grab the text in the HTML except for the hyperlink text?
Here's the HTML link I need to parse
You can remove all the links before converting to text with nokogiri:
table = doc.css(".page table")[0]
table.css("a").each(&:remove)
Edit: Alternatively, you can use a regexp that removes only a digit at the end of the string when it is preceded by a word character. This seems to work in this specific case but cannot be relied upon in similar cases:
cell.text.strip.downcase.gsub(/(?<=\w)\d$/, "")
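To illustrate why the regexp approach is fragile, here is a small sketch that runs the pattern above on the sample cell text and on a hypothetical cell containing only a number. The stricter variant (strict_pattern) is an assumption added for comparison, not part of the original answer.

```ruby
# Pattern from the answer: one trailing digit preceded by a word character.
answer_pattern = /(?<=\w)\d$/
# Hypothetical stricter variant: trailing digits preceded by a letter only.
strict_pattern = /(?<=[[:alpha:]])\d+\z/

# Cell text from the question: a label followed by the link's digit.
raw = "\n central tendency1 \n "
cleaned = raw.strip.downcase.gsub(answer_pattern, "")

# Hypothetical cell whose whole content is a number that should be kept.
year_with_answer = "2019".gsub(answer_pattern, "")
year_with_strict = "2019".gsub(strict_pattern, "")

puts cleaned.inspect
puts year_with_answer.inspect
puts year_with_strict.inspect
```

Note that the sketch uses gsub rather than gsub!: String#gsub! returns nil when nothing is replaced, which makes it awkward at the end of a method chain.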

Selected text in ruby Tk Text widget?

I can't seem to find how to get the currently selected text from a Text widget in Ruby. In Perl there was a ->getSelected function, which does not seem to exist in the Ruby implementation. Also, the selected text is supposed to be marked with the tag "sel", but whenever I try to use it with get("sel"), it says invalid text index. There must be a way to get the selected text though...
Also, another question: by default, the text widget in Perl has a popup menu with all sorts of functionality like search and copy/paste. Was this just a Perl-specific add-on?
Yes, the popup menu in Perl is a Perl-specific add-on.
As for getting the selected text, you are correct that the selected text has the "sel" tag, and you use that to get the selected text. To retrieve the selected text you should use the indices sel.first and sel.last, for example:
get("sel.first", "sel.last")
For a really good resource on Tk that covers usage in Tcl, Python, Ruby and Perl, see tkdocs.com. The text widget is documented on that site in the tutorial on text.
Of course I finally figured this out right after posting. The indices are "sel.first" and "sel.last", so I used get("sel.first", "sel.last")

How can I get plain text from Ajax Editor?

I need to get both the plain text and the HTML text from Ajax Editor. I'm able to get the HTML text but not the plain text. I'm not supposed to strip the HTML tags from the editor content to retrieve the plain text.
Is there any property that gives the plain text from Ajax Editor?
Sample code from my app:
I'm able to get the rich HTML text like this:
string desc = QuestionAndAnswerEditor.Content;
I want to get the plain text in the same way.
Please help me.
Use HTML.Encode to get the encoded text, and html.decode ..

Resources