In a table, there are rows like this:
<tr id="filtersJob_intrinsicTable_row6" class="evenRow" style="display: none;">some stuff here</tr>
<tr id="filtersJob_intrinsicTable_row7" class="evenRow">some stuff here</tr>
How do I use Watir to get the rows that are displayed, i.e. the rows which do NOT have style="display: none;"?
You have a number of ways of collecting elements without the style attribute:
Using a :css locator:
browser.trs(css: 'tr:not([style])')
Using a :xpath locator:
browser.trs(xpath: '//tr[not(@style)]')
You could also check the attribute value:
browser.trs.select { |tr| tr.attribute_value('style').nil? }
Note that you should be cautious about using the style attribute as an indicator of the row being displayed. Someone could add some other unrelated style property and then all of the tests will fail. Instead, I would suggest that you look for rows that are present:
browser.trs.select(&:present?)
I think that this also makes the purpose of the code more obvious and readable.
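To make the caution about the style attribute concrete, here is a minimal sketch that runs without a browser. It uses Ruby's bundled REXML library in place of Watir, and the markup (including row8 and its color style) is made up for illustration. A row that is visible but carries an unrelated style property is missed by the "no style attribute" check:

```ruby
require 'rexml/document'

# Hypothetical markup: row6 is hidden, row7 has no style attribute,
# row8 is visible but carries an unrelated style property.
SAMPLE_ROWS = <<~XML
  <table>
    <tr id="row6" style="display: none;"><td>hidden</td></tr>
    <tr id="row7"><td>shown</td></tr>
    <tr id="row8" style="color: red;"><td>shown, but styled</td></tr>
  </table>
XML

# All tr elements in document order.
def rows(xml)
  REXML::XPath.match(REXML::Document.new(xml), '//tr')
end

# Same idea as attribute_value('style').nil? in the Watir answer.
def ids_without_style(xml)
  rows(xml).select { |tr| tr.attributes['style'].nil? }
           .map { |tr| tr.attributes['id'] }
end

# Looser check: only drop rows whose style actually hides them.
def ids_not_hidden(xml)
  rows(xml).reject { |tr| tr.attributes['style'].to_s.include?('display: none') }
           .map { |tr| tr.attributes['id'] }
end

puts ids_without_style(SAMPLE_ROWS).inspect
puts ids_not_hidden(SAMPLE_ROWS).inspect
```

The first call leaves out row8 even though it would be displayed. In a real browser session, present? sidesteps the problem because it asks the browser rather than inspecting markup.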
Using xpath:
browser.element_by_xpath(".//tr[not(@style)]")
[not(@style)] means "having no style attribute".
I want to scrape data using Nokogiri from some HTML:
<td data-bar="hoge" data-date="2000-01-01" class="modals"></td>
<td data-bar="fuga" data-date="2000-01-02" class="modals"></td>
I wrote:
element = page.css("td[data-bar='hoge'][data-date='2000-01-01']")
but element.length returns 0.
How do I distinguish elements having two data- attributes?
Try using XPath selectors instead. This worked for me:
element = page.xpath "//td[@data-bar='hoge'][@data-date='2000-01-01']"
In this example, the // portion will match any td element (with those attributes) in the document, which may not be desirable. In that case, you would need to write a more explicit XPath to the node.
Here's the documentation for XPath: https://www.w3.org/TR/xpath/
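As a quick way to try the two-predicate XPath without installing anything, here is a small sketch that uses Ruby's bundled REXML instead of Nokogiri (a swapped-in library, chosen only so the example is self-contained). The surrounding table and tr wrapper and the helper name cells_matching are assumptions for illustration, and the values are assumed not to contain quote characters:

```ruby
require 'rexml/document'

# Hypothetical fragment modelled on the cells in the question.
CELLS = <<~XML
  <table><tr>
    <td data-bar="hoge" data-date="2000-01-01" class="modals"></td>
    <td data-bar="fuga" data-date="2000-01-02" class="modals"></td>
  </tr></table>
XML

# Returns the td elements whose data-bar AND data-date both match.
# Each [...] predicate narrows the set produced by the previous one.
def cells_matching(xml, bar, date)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//td[@data-bar='#{bar}'][@data-date='#{date}']")
end

cells_matching(CELLS, 'hoge', '2000-01-01').each do |td|
  puts "#{td.attributes['data-bar']} #{td.attributes['data-date']}"
end
```

Mixing values from different cells, for example 'hoge' with '2000-01-02', matches nothing, because both predicates have to hold on the same element.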
So I came across a couple of articles on CSS optimization:
http://csswizardry.com/2011/09/writing-efficient-css-selectors/
https://developer.mozilla.org/en-US/docs/Web/Guide/CSS/Writing_efficient_CSS
Apparently CSS is read from right to left. That means that div table a is read like: first all a elements on the page are retrieved, then all table elements that have an a in them (right?), then all div elements with both of those in them (right?).
My question, which I couldn't find an answer to anywhere, is: how is a CSS rule like div#div_id parsed? Do first all elements with the id "div_id" get parsed, and is a filter then applied to fetch from that bunch of #div_id elements all div elements? Or are first all div elements parsed, and is a filter then applied to fetch everything with the id "div_id"?
The first article I mentioned says that the recommended order of efficiency in CSS is: #id > .class > tag > rest. But what about tag#id?
To clarify: I like to type div#div_id just to make it clear to myself that #div_id applies to a div element, without having to look up the HTML to find out which element's styling I'm looking at. But I wouldn't want to use it that way if it costs me much of my website's performance. What would be the recommended way of writing the rule then? Should I drop the tags in my selectors? Is it really that expensive?
The answer
The answer would be, as jbutler483 says: leaving the tag name out is faster. If you want to have clarification on what element you're styling, don't use div#my_id but #div_my_id. If you don't care that much about performance, you could still go with the div#my_id, but it will be a bit slower (but you can ask yourself if it will really impact your application that much).
Ok, I think you've gotten a little confused.
In your example, you use:
div table a
So I'll use that.
Roughly, that could look like this in your HTML:
<div>
<table>
<a>
//styling applied here
</a>
</table>
</div>
or something else like
<div>
<div></div>
<table>
<tr>
<th>hi there</th>
<th>
<a>i'm an a tag!</a>
So, looking at that:
div table a
will be
div table a
^ ^ ^
| | |
| | a child
| |
| parent
|
grandparent
This means that you'll be styling any 'a' element that is a child/descendant of a table, which, in turn, is a descendant of a div element
so, in your other example:
div#div_id
you would be styling the element with the id div_id, provided that element is a div.
BTW looking at your example, I would like to point out that (in case you didn't know):
the id attribute should be unique
an <a> element shouldn't be placed directly within a <table> element (instead nest it within a th or td tag)
If you wish to style multiple elements (of varying types), it would be more efficient to create a class, and use that instead
Answer after Clarification:
Your
div#div_id
Since an id is meant to be unique in HTML, the browser will look up the element (or elements) with the specified id.
It will then check whether each one is a div element.
This is not a great example, as apparently some (older) browsers will only look for the first matching id and return it, instead of checking the whole page for duplicate ids.
With your ids being unique, you could then drop the tag name, as it is redundant.
Summary
So, an example of this extended conversation in the comments:
If I wanted to style a single div (and still know it was a div that I was adding styling to), I would use this naming convention:
<div id="my-div-to-style">
^
|
[the word 'div' here could be anything]
In my CSS I would write:
_ this word must match the
/ id i used above
|
#my-div-to-style{
//styling...
}
If I wanted to add the same styling to multiple div elements (with scope to add it to others later), I would instead use a class:
<div class="myDivStyle">
and then use:
.myDivStyle{
//styling...
}
In this last example, I would not be restricted to just styling divs, so I wouldn't include "div" in my naming:
<div class="myStyle">
<a class="myStyle">
<table class="myStyle">
.myStyle{
//styling for any element I want
}
As you say, rules are parsed right to left, and the same applies here.
Although duplicate id values are not valid, it is up to the browser to decide whether to accept and parse them. The example below (in Chrome) renders the first and last elements with red text.
div#test {
color:red;
}
<div id='test'>text</div>
<span id='test'>
text
</span>
<div id='test'>text</div>
In modern browsers you may want to be less mindful of selector resolution performance and instead look to obtain valid CSS adhering to best practices, keeping selectors as short and concise as possible.
What about tag#id? The second link you mention contains the answer.
Don’t qualify ID rules with tag names or classes
If a rule has an ID selector as its key selector, don’t add the tag
name to the rule. Since IDs are unique, adding a tag name would slow
down the matching process needlessly.
Don’t qualify class rules with tag names
The previous concept also applies here. Though classes can be used
many times on the same page, they are still more unique than a tag.
You may learn more about your question here: css-tricks => efficiently rendering html
There are four kinds of key selectors: ID, class, tag, and universal. It is that same order in how efficient they are.
#main-navigation { } /* ID (Fastest) */
body.home #page-wrap { } /* ID */
.main-navigation { } /* Class */
ul li a.current { } /* Class */
ul { } /* Tag */
ul li a { } /* Tag */
* { } /* Universal (Slowest) */
#content [title='home'] { } /* Universal */
When we combine this right-to-left idea, and the key selector idea, we can see that this selector isn't very efficient:
#main-nav > li { } /* Slower than it might seem */
Even though that feels weirdly counter-intuitive... Since ID's are so efficient we would think the browser could just find that ID quickly and then find the li children quickly. But in reality, the relatively slow li tag selector is run first.
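To illustrate the right-to-left idea, here is a toy sketch in Ruby. It is not how any real browser engine is implemented. The Node struct and the matches? helper are invented for this example, and it only handles plain tag names joined by descendant combinators. The point is that matching starts at the key selector (the rightmost part) and only then walks up through the ancestors:

```ruby
# Minimal element model: a tag name and a link to the parent element.
Node = Struct.new(:tag, :parent)

# True when `node` matches a descendant selector such as %w[div table a].
# The rightmost part (the key selector) is tested first; the remaining
# parts are then checked against the node's ancestors, right to left.
def matches?(node, parts)
  *ancestor_tags, key = parts
  return false unless node.tag == key # cheap rejection for most elements

  current = node.parent
  ancestor_tags.reverse_each do |tag|
    # Climb until an ancestor with the wanted tag is found.
    current = current.parent while current && current.tag != tag
    return false if current.nil?
    current = current.parent
  end
  true
end

div     = Node.new('div', nil)
table   = Node.new('table', div)
cell    = Node.new('td', table)
inside  = Node.new('a', cell) # div > table > td > a
outside = Node.new('a', div)  # div > a, no table in between

puts matches?(inside,  %w[div table a]) # => true
puts matches?(outside, %w[div table a]) # => false
```

In this model every element that is not an a is rejected by the first comparison, which is why the key selector has the biggest influence on how much work a rule causes.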
<div id="ctl00_ContentHolder_vs_ValidationSummary" class="errorblock">
<p><strong>The following errors were found:</strong></p>
<ul><input type="hidden" Name="SummaryErrorCmsIds" Value="E024|E012|E014" />
<li>Please select a title.</li>
<li>Please key in your first name.</li>
<li>Please key in your last name.</li>
</ul>
</div>
Here is my snippet, for example. I want to get the value of the id attribute, i.e. ctl00_ContentHolder_vs_ValidationSummary, using Selenium WebDriver.
You can try this :
String id = driver.findElement(By.xpath("//div[@class='errorblock']")).getAttribute("id");
But in this case the class of this division should be unique.
Use following code to extract id of first div:
WebElement div = driver.findElement(By.tagName("div"));
div.getAttribute("id");
This is the code for all div available on the page:
List<WebElement> div = driver.findElements(By.tagName("div"));
for ( WebElement e : div ) {
e.getAttribute("id");
}
I know this answer is really late but I wanted to put this here for those who come later. Searching by XPath should be avoided unless absolutely necessary because it is more complicated, more error prone, and slower. In this case you can easily do what the accepted answer did without having to use XPaths:
String id = driver.findElement(By.cssSelector("div.errorblock")).getAttribute("id");
Some explanation... this line finds the first element (.findElement vs .findElements) using a CSS Selector. The CSS Selector, div.errorblock, locates all div elements with the class (symbolized by the period .) errorblock. Once it is located, we get the ID using .getAttribute().
CSS Selectors are a great tool that all automators should have in their toolbox. There's a great CSS Selector reference here: http://www.w3.org/TR/selectors/#selectors.
I have this piece of html:
<tr>
<td class="has-checkbox">
<input id="abc" class=... value=...>
</td>
<td class="has-label">
<label for="abc">Name1234</label>
</td>
</tr>
I need to write an XPath that gets me the input element based on what's in the label, in this case Name1234.
In other words, I need an XPath to the input element, and the path must contain Name1234 as its variable.
Anyone who can help me out here?
//input[@id = //label[. = 'Name1234']/@for] selects input element(s) with an id attribute value equal to the for attribute value of label elements whose content is Name1234.
You can use /.. to move up to the parent node. In your case:
//label[.='Name1234']/../../td/input
You must move up twice, because the input tag is the child of a different td tag than the label.
There are other introductions and examples of this syntax that are worth reading.
Here is a solution using the Axes parent and preceding-sibling:
//label[.='Name1234']/parent::td/preceding-sibling::td/input
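It's not as complicated as you think:
xpath=//tr[.//label[.="Name1234"]]//input
In other words, you are looking for the tr that contains a label with the text "Name1234", and then taking the input element inside that tr. Note the leading dot in .//label: without it, the inner path would search the whole document instead of just the current tr.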
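The label-to-input relationship can also be followed in two explicit steps, which is sometimes easier to debug than a single expression. This is only a sketch: it uses Ruby's bundled REXML rather than a browser driver, the type attribute on the input is made up (the question elides the real attributes), the helper name input_for_label is hypothetical, and the label text is assumed not to contain quote characters:

```ruby
require 'rexml/document'

# Hypothetical row modelled on the markup in the question.
ROW = <<~XML
  <table><tr>
    <td class="has-checkbox"><input id="abc" type="checkbox"/></td>
    <td class="has-label"><label for="abc">Name1234</label></td>
  </tr></table>
XML

# Step 1: find the label by its text.
# Step 2: find the input whose id equals the label's "for" attribute.
# Returns nil when no label with that text exists.
def input_for_label(xml, label_text)
  doc = REXML::Document.new(xml)
  label = REXML::XPath.first(doc, "//label[text()='#{label_text}']")
  return nil unless label

  REXML::XPath.first(doc, "//input[@id='#{label.attributes['for']}']")
end

input = input_for_label(ROW, 'Name1234')
puts input.attributes['id']
```

Going through the for attribute keeps working if the label and the input are later moved into different cells or rows, which the parent and sibling based paths do not.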
I'm stuck not being able to parse irregularly embedded html tags. Is there a way to remove all html tags from a node and retain all text?
I'm using the code:
rows = doc.search('//table[@id="table_1"]/tbody/tr')
details = rows.collect do |row|
detail = {}
[
[:word, 'td[1]/text()'],
[:meaning, 'td[6]/font'],
].collect do |name, xpath|
detail[name] = row.at_xpath(xpath).to_s.strip
end
detail
end
Using Xpath:
[:meaning, 'td[6]/font']
generates
:meaning: ! '<font size="3">asking for information specifying <font
color="#CC0000" size="3">what is your name?</font> /what/ as in, <font color="#CC0000" size="3">I'm not sure what you mean</font>
/what/ as in <a style="text-decoration: none;" href="http://somesecretlink.com">what</a></font>
On the other hand, using Xpath:
'td/font/text()'
generates
:meaning: asking for information specifying
thus ignoring all children of the node. What I want to achieve is this
:meaning: asking for information specifying what is your name? /what/ as in, I'm not sure what you mean /what/ as in what? I can't hear you
This depends on what you need to extract. If you want all text in font elements, you can do it with the following xpath:
'td/font//text()'
It extracts all text nodes in font tags. If you want all text nodes in the cell, then:
'td//text()'
You can also call the text method on a Nokogiri node:
row.at_xpath(xpath).text
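The difference between taking only the direct text of a node and taking the text of all its descendants can be sketched without Nokogiri. The example below uses Ruby's bundled REXML instead, with a shortened, made-up version of the cell from the question. The all_text helper is hypothetical and plays the role that Nokogiri's text method plays in the answer above:

```ruby
require 'rexml/document'

# Shortened, hypothetical version of the cell from the question.
CELL = <<~'XML'
  <td><font size="3">asking for information <font color="#CC0000">what is your name?</font> as in <a href="#">what</a></font></td>
XML

# Concatenates the text of a node and of all its descendants,
# dropping the tags themselves.
def all_text(node)
  node.children.map do |child|
    case child
    when REXML::Text    then child.value
    when REXML::Element then all_text(child)
    else ''             # comments and other node types carry no text here
    end
  end.join
end

outer_font = REXML::Document.new(CELL).root.elements['font']

puts outer_font.text      # only the first direct text node
puts all_text(outer_font) # text from every nesting level
```

The first call stops at the first nested tag, which matches the behaviour seen with font/text() in the question. The recursive version returns the whole sentence.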
I added an answer for this same sort of question the other day. It's a very easy process.
Take a look at: Convert HTML to plain text and maintain structure/formatting, with ruby