Can I nest itemprop if it reads semantically? - microdata

I'm working on a product page for an e-commerce solution and I'm using Schema.org for the first time. I have a product name, and inside it are the brand and model. Is this acceptable?
<h2 itemprop="name">
<span itemprop="brand">Brand Name</span>
<span itemprop="model">######</span>
</h2>

I can't see anywhere in the Microdata spec that explicitly allows this but it looks like the Google parser accepts it.

The algorithm for finding the value of a property defines (in the last step, which applies in your example):
The value is the element's textContent.
(i.e., the text content of the element and its child elements)
So according to this, the value of name should be "Brand Name ######".
The algorithm for finding the properties of an item contains this step:
If current does not have an itemscope attribute, then: add all the child elements of current to pending.
So the child elements of an element containing itemprop are checked for itemprop attributes, too.
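The two steps quoted above can be sketched in Python. This is only a simplified illustration, not the full Microdata algorithm: it uses the standard xml.etree.ElementTree parser, it assumes a wrapping Product item that the question does not show, and it collapses whitespace in the text content purely for readability. The names text_content and item_properties are made up for this example.

```python
import xml.etree.ElementTree as ET

# Assumption: the <div itemscope> wrapper is hypothetical; the question only shows the <h2>.
SNIPPET = """
<div itemscope="" itemtype="http://schema.org/Product">
  <h2 itemprop="name">
    <span itemprop="brand">Brand Name</span>
    <span itemprop="model">######</span>
  </h2>
</div>
"""

def text_content(element):
    # textContent = text of the element and all of its descendants.
    # Whitespace is collapsed here only to make the result easier to read.
    return " ".join("".join(element.itertext()).split())

def item_properties(root):
    # Simplified crawl: child elements of anything without itemscope are
    # queued, so a nested itemprop is still checked.
    found = []
    pending = list(root)
    while pending:
        current = pending.pop(0)
        if current.get("itemscope") is None:
            pending.extend(list(current))
        if current.get("itemprop") is not None:
            found.append((current.get("itemprop"), text_content(current)))
    return found

properties = item_properties(ET.fromstring(SNIPPET))
print(properties)
```

In this sketch, name picks up the text of both spans, while brand and model are still found as separate properties of the same item.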


How to find an element whose XPath changes dynamically?

We have an element that has no id or name to locate it by during automated testing.
Below is the HTML code for it.
<div _ngcontent-c1="" class="col-md-4">
<div _ngcontent-c1="" class="form-group"> Account Name: </div>
</div>
<div _ngcontent-c1="" class="form-group">
<span _ngcontent-c1="" class="ng-tns-c1-0">1234567890</span>
</div>
We found the element by using XPath, and in one case we can click it. Here the XPath was:
/html/body/app-root/app-root/div/div[4]/div[2]/form/div/div[5]/div[2]/div/span
But in the second case we are unable to find the element, because a new element is added to the page. The XPath above then changes to:
/html/body/app-root/app-root/div/div[4]/div[2]/form/div/div[1]/div[5]/div[2]/div/span
So the question is: how can we find this element?
Venkatesh, from the comment section I understood that you want to find the Account Name value on the page.
You can write an XPath based on the visible text.
//*[contains(text()," Account Name: ")]/ancestor::div/following-sibling::div[contains(@class,"form-group")]/span
This finds the preceding div by its visible text, Account Name:, and then the following sibling div, which holds the Account Name value.
Similarly, you can find each of the other elements by changing only the visible text.
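To make the logic of that expression concrete, here is a rough Python sketch that walks the same path by hand: find the div whose text is the label, climb to its ancestor divs, and take the span from the first following sibling div with the form-group class. It uses the standard xml.etree.ElementTree module rather than a browser driver, the wrapping form element is an assumption added so the fragment parses, and value_for_label is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: the <form> wrapper is added here only to give the fragment a single root.
PAGE = """
<form>
  <div _ngcontent-c1="" class="col-md-4">
    <div _ngcontent-c1="" class="form-group"> Account Name: </div>
  </div>
  <div _ngcontent-c1="" class="form-group">
    <span _ngcontent-c1="" class="ng-tns-c1-0">1234567890</span>
  </div>
</form>
"""

def value_for_label(root, label):
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for node in root.iter("div"):
        if (node.text or "").strip() != label:
            continue
        # Climb through the ancestors of the label div (ancestor::div).
        ancestor = parent_of.get(node)
        while ancestor is not None:
            parent = parent_of.get(ancestor)
            if parent is not None and ancestor.tag == "div":
                siblings = list(parent)
                # following-sibling::div with the form-group class, then its span.
                for sibling in siblings[siblings.index(ancestor) + 1:]:
                    classes = sibling.get("class", "").split()
                    if sibling.tag == "div" and "form-group" in classes:
                        span = sibling.find("span")
                        if span is not None:
                            return (span.text or "").strip()
            ancestor = parent
    return None

account_name = value_for_label(ET.fromstring(PAGE), "Account Name:")
print(account_name)
```

Because the lookup starts from the visible label instead of from the document root, it does not depend on how many div elements come before the field.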

XPath expression - hierarchy

<div class="summary-item">
<label >Price</label>
<div class="value">
0.99 GBP
</div>
</div>
<div class="summary-item">
<label >Other info</label>
<div class="value">
All languages
</div>
</div>
I am trying to get the "0.99 GBP" using an XPath expression. So far I have reached the label using this (note that there is another element with the class summary-item, so I need to identify this one uniquely by the label text Price):
sel.xpath('//*/div[@class="summary-item"]/label[text()="Price"]').extract()
However, I am unable to get from the label to the div with the class value. I tried using following-sibling, but I did not succeed. Any help will be appreciated.
The existence of child nodes can be part of the predicate. Put the test for label into a predicate for the parent, either as a separate predicate (adding the target node as well):
//div[@class="summary-item"][label[text()="Price"]]/div[@class="value"]
or joined with and:
//div[@class="summary-item" and label[text()="Price"]]/div[@class="value"]
(Note you don’t need //*/div at the start.)
You could use following-sibling if you wanted; it would look like this:
//div[@class="summary-item"]/label[text()="Price"]/following-sibling::div[@class="value"]
(here the label isn’t part of a predicate).
One more thing to be aware of: using XPath to select HTML classes doesn’t work the same way as using CSS. An XPath test such as @class="value" matches only the exact attribute string, whereas CSS matches even if the element is in more than one class. In this case it works out okay, but you should watch out for it. Search Stack Overflow if it becomes an issue; there are a few answers describing it.
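As a quick, self-contained check of the child-predicate form, here is a sketch that uses Python's standard xml.etree.ElementTree in place of the Scrapy selector from the question (an assumption, made so the example runs without extra packages). ElementTree supports only a subset of XPath, so the label test is written as [label='Price'] instead of label[text()="Price"], and the body wrapper is added for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the <body> wrapper is added so the two items share one root element.
PAGE = """
<body>
  <div class="summary-item">
    <label>Price</label>
    <div class="value">
      0.99 GBP
    </div>
  </div>
  <div class="summary-item">
    <label>Other info</label>
    <div class="value">
      All languages
    </div>
  </div>
</body>
"""

root = ET.fromstring(PAGE)

# The test on the label child is a predicate of the parent div,
# so the path can continue from that parent down to the value div.
matches = root.findall(
    ".//div[@class='summary-item'][label='Price']/div[@class='value']"
)

# The text node keeps the surrounding line breaks, so strip them.
price = matches[0].text.strip()
print(price)
```

With a full XPath 1.0 engine, the expressions in the answer can be used unchanged.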

Can I use multiple itemtypes in one itemscope for Schema.org? [duplicate]

I am wondering if I can use multiple itemtypes inside one itemscope. For example, I have this at the moment:
<body id="home" itemscope itemtype="http://schema.org/WebPage">
<div class="wrapper" itemscope itemtype="http://schema.org/ProfessionalService">
<p itemprop from professional service></p>
<p itemprop from web page></p>
</div>
</body>
When I run a structured data test within Google's Web developer tools, it only picks up items within the professional service schema; every itemprop that is related to the webpage schema is ignored and not recognised as part of the professional service. I understand about nesting them and why it's happening.
Can I have multiple itemtypes within one itemscope? Such as:
<div class="wrapper" itemscope itemtype="http://schema.org/ProfessionalService http://schema.org/WebPage">
<p itemprop from professional service></p>
<p itemprop from web page></p>
</div>
Yes, you can use several item types in one itemtype attribute, as long as they are from the same vocabulary. See Microdata: itemtype:
The itemtype attribute, if specified, must have a value that is an unordered set of unique space-separated tokens that are case-sensitive, each of which is a valid URL that is an absolute URL, and all of which are defined to use the same vocabulary.
But note that then all properties (itemprop values) need to be defined for all the specified item types. So you cannot say that a particular property should belong only to a particular item type.
So you’d still have the same problem. In your case, you should either use correct nesting, or you might use the itemref attribute to add properties to the corresponding items that are scattered on the page.
FWIW, the schema.org vocabulary also defines the additionalType property. This can also be used to specify additional item types from other vocabularies. But this doesn’t allow you to use the properties from the additional item type.

Microdata markup with properties on multiple pages

I'm creating a web page and currently I'm adding Microdata markup to the code. I’m using schema.org’s MusicGroup.
I have an index.html page from where I'd like to take the name and the image properties for this band:
<div class="container" itemscope itemtype="http://schema.org/MusicGroup">
...
<img itemprop="image" src="img/logo.png" alt="logo" />
<p>We are <span itemprop="name">NAME OF THE BAND</span>.</p>
...
</div>
However on the about_us.html page there is a short description which I'd also like to use:
<div class="container" itemscope itemtype="http://schema.org/MusicGroup">
...
<p itemprop="description">A description of the band.</p>
...
</div>
When I use the code like this, search engines (understandably) treat them as two different MusicGroups:
MusicGroup 1:
Image: .../img/logo.png
Name: NAME OF THE BAND
MusicGroup 2:
Description: A description of the band.
How can I link these properties into one item?
Microdata’s name-value pairs are per webpage, not per website.
So on a website about a music group, it can be expected that each page contains its own MusicGroup item, even though each of them is actually about the same music group. But from the Microdata or schema.org perspective, these different items would not be semantically connected that way (consumers might guess this, however, e.g. by comparing property values).
Microdata’s itemid attribute could be used to uniquely identify each item. But it is required that the used vocabulary supports "global identifiers for items" (itemid is used for some types on schema.org (e.g., in the example for MedicalScholarlyArticle), but it’s not clear to me if it’s really supported as required by Microdata for other types, like MusicGroup).
So in your case, you could:
- leave it as it is
- duplicate the information, so that each item has all relevant content (possibly using meta/link elements)
- move all information on one page (possibly using itemref)
- (if it should be allowed for general use with schema.org) use itemid to state that several items are actually about the same thing

XPath: accessing information in nodes

I need to scrape information from a website that contains property details.
<div class="inner">
<div class="col">
<h2>House in Digana </h2>
<div class="meta">
<div class="date"></div>
<span class="category">Houses</span>,
<span class="location">Kandy</span>
</div>
</div>
<div class="attr polar">
<span class="data">Rs. 3,600,000</span>
</div>
What is the XPath expression for "Kandy" and "Rs. 3,600,000"?
It is not wise to address text nodes directly using text() because of nuances in an XML document.
Rather, addressing an element node directly returns the concatenation of all descendant text nodes as the element value, which is what people usually want (and think they are getting when they address text nodes).
The canonical example I use in the classroom is this example of OCR'ed content as XML:
<cost>39<!--that 9 may be an 8-->.22</cost>
The value of the element using the XPath address cost is "39.22", but in XSLT 1.0 the value of the XPath address cost/text() is "39", which is incomplete. In XSLT 2.0 (which is how the question is tagged), you get two text nodes, "39" and ".22", which look correct if you concatenate them. But if you pass them to a function requiring a singleton argument, you will get a run-time error. When you address an element, the text returned is concatenated into a single string, which is suitable for a singleton argument.
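The split into two text nodes can be seen outside XSLT as well. The following sketch uses Python's standard xml.dom.minidom, which keeps the comment as a node of its own; string_value is a hypothetical helper that imitates how XPath computes the value of an element, not part of any library.

```python
from xml.dom import minidom

doc = minidom.parseString("<cost>39<!--that 9 may be an 8-->.22</cost>")
cost = doc.documentElement

# The comment sits between the digits, so the element has two separate text nodes.
text_nodes = [
    node.data for node in cost.childNodes if node.nodeType == node.TEXT_NODE
]

def string_value(node):
    # Concatenate all descendant text nodes, skipping comments.
    if node.nodeType == node.TEXT_NODE:
        return node.data
    if node.nodeType == node.COMMENT_NODE:
        return ""
    return "".join(string_value(child) for child in node.childNodes)

print(text_nodes)          # the individual text nodes
print(string_value(cost))  # the value of the element as a single string
```

Taking only the first text node loses the ".22", which is the trap described above.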
I tell students that in all of my professional work there are only very (very!) few times that I ever have to use text() in my stylesheets.
So //span[@class='location' or @class='data'] would find the two fields if those were the only such elements in the entire document. You may need to use ".//span" from a location inside of the document tree.
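Here is a minimal sketch of addressing the two span elements and reading their values, using Python's standard xml.etree.ElementTree. Its XPath subset has no "or" operator, so this example (as an assumption of the sketch, not something the answer requires) runs one query per class; the closing tag of the outer div, which the question's excerpt omits, is added so the fragment parses.

```python
import xml.etree.ElementTree as ET

# Assumption: the final </div> is added; the excerpt in the question stops before it.
LISTING = """
<div class="inner">
  <div class="col">
    <h2>House in Digana </h2>
    <div class="meta">
      <div class="date"></div>
      <span class="category">Houses</span>,
      <span class="location">Kandy</span>
    </div>
  </div>
  <div class="attr polar">
    <span class="data">Rs. 3,600,000</span>
  </div>
</div>
"""

root = ET.fromstring(LISTING)

fields = {}
for class_name in ("location", "data"):
    # Address the element itself, then read the text of the element
    # and all of its descendants as one string.
    span = root.find(f".//span[@class='{class_name}']")
    fields[class_name] = "".join(span.itertext()).strip()

print(fields)
```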
