Microdata markup with properties on multiple pages - microdata

I'm creating a web page and currently I'm adding Microdata markup to the code. I’m using schema.org’s MusicGroup.
I have an index.html page from where I'd like to take the name and the image properties for this band:
<div class="container" itemscope itemtype="http://schema.org/MusicGroup">
...
<img itemprop="image" src="img/logo.png" alt="logo" />
<p>We are <span itemprop="name">NAME OF THE BAND</span>.</p>
...
</div>
However on the about_us.html page there is a short description which I'd also like to use:
<div class="container" itemscope itemtype="http://schema.org/MusicGroup">
...
<p itemprop="description">A description of the band.</p>
...
</div>
When I use the code like this, search enginges (understandably) treat them as two different MusicGroups:
MusicGroup 1:
Image: .../img/logo.png
Name: NAME OF THE BAND
MusicGroup 2:
Description: A description of the band.
How can I link these properties into one item?

Microdata’s name-value pairs are per webpage, not per website.
So on a website about a music group, it can be expected that each page contains an "own" MusicGroup item, which is, however, actually always about the same music group. But from the Microdata or schema.org perspective, these different items would not be semantically connected that way (consumers might guess this however, e.g. by comparing property values).
Microdata’s itemid attribute could be used to uniquely identify each item. But it is required that the used vocabulary supports "global identifiers for items" (itemid is used for some types on schema.org (e.g., in the example for MedicalScholarlyArticle), but it’s not clear to me if it’s really supported as required by Microdata for other types, like MusicGroup).
So in your case, you could:
leave it as it is
duplicate the information, so that each item has all relevant content (possibly using meta/link elements)
move all information on one page (possibly using itemref)
(if it should be allowed for general use with schema.org) use itemid to state that several items are actually about the same thing

Related

Can not input to EditableElement of CKEditor5

In CKEditor5, I tried implementing custom element to convert model to view for editing. Then, editable element(#ckeditor/ckeditor5-engine/src/view/editableelement) in container element(#ckeditor/ckeditor5-engine/src/view/containerelement) is focused on the parent container element and can not be edited.
For example, if it is implemented as follows:
buildModelConverter().for(editing.modelToView)
.fromElement('myElement')
.toElement(new ContainerElement('div', {}, [new EditableElement('h4')]));
The result of actual editing dom after inserting 'myElement' and keydown "abc". (I hope inputting text of "abc" to h4 tag but...)
<div>​​​​​​​
abc
<h4>
<br data-cke-filler="true">
</h4>
</div>
I also tried using widget for applying contenteditable attribute.
But, text couldn't be entered in h4.
<div class="ck-widget" contenteditable="false">​​​​​​​
<h4 class="ck-editable" contenteditable="true">
<br data-cke-filler="true">
</h4>
</div>
Is this bug, or I made mistake understanding of container element?
[Additional details]
I am assuming to make a widget plugin for ranking list.
First, the ranking list is structured by <ol> and <li> tags because of having multiple items.
I solved that by defining two schema such as "rankingList" and "rankingListItem",
so I realized dynamic elements using nested model elements.
const item1 = new ModelElement('rankingListItem');
const item2 = new ModelElement('rankingListItem');
const model = new ModelElement('rankingList', {}, [item1, item2]);
// and insert
Next, the item of ranking list has link, image, title and note.
Therefore, the ranking list item has the following DOM structure:
<ol><!-- apply toWidget -->
<li>
<a href="link[editable]">
<img src="image[editable]">
<h3>title[editable]</h3>
<p>notes[editable]</p>
</a>
</li>
...
</ol>
I expect the view element is the following:
const {ref, src, title, notes} = data; // how to get data?
const view = new ContainerElement('a', {ref}, [
new EmptyElement('img', {src}),
new EditableElement('h3', {}, new Text(title)),
new EditableElement('p', {}, new Text(title)),
]);
// maybe incorrect ...
In conclusion, I want to use editable view not to break defined DOM tree.
How can I realize it?
Thank you for describing your case. It means a lot for us at the moment to know how developers are using the editor and what are your expectations.
Unfortunately, this looks like a very complex feature. It also looks like it would need custom UI (to edit link url and image src -- unless they do not change after added to the editor). It seems that you struggle with two problems:
position mapping between the model and the view,
nested editables.
First, to answer your question about EditableElement - it seems to be correct to use them for h3 and p elements.
However, such complex feature needs custom converters. Converter builders (which you used) are dedicated to being used in simple cases, like element-to-element conversion, or attribute-to-attribute conversion.
Behind the nice API, build converter is a function factory, that creates one or multiple functions. Those are then added as callbacks to ModelConversionDispatcher (which editor.editing.modelToView is an instance of). ModelConversionDispatcher fires a series of events during the conversion, which is a process of translating a change in the model to the view.
As I've mentioned, you would have to write those converting functions by yourself.
Since this is too big of a subject for a detailed and thorough answer, I'll just briefly present you what you should be interested in. Unfortunately, there are no guides yet about creating custom converters from scratch. This is a very broad subject.
First, let me explain you from where most of your problems come from. As you already know, the editor has three layers: model (data), view (DOM-like structure) and DOM. Model is converted to view and view is rendered to DOM. Also, the other way, when you load data, DOM is converted to view and view is converted to model. This is why you need to provide model-to-view converter and view-to-model converter for your feature.
The important piece of this puzzle is engine.conversion.Mapper. Its role is to map elements and positions from model to view. As you already might have seen, the model might be quite different than the view. Correct position mapping between those is key. When you type a letter at caret position (in DOM), this position is mapped to model and the letter is inserted in the model and only then converted back to the view and DOM. If view-to-model position conversion is wrong, you will not be able to type, or really do anything, at that place.
Mapper is pretty simple on its own. All it needs is that you specify which view elements are bound to which model elements. For example, in the model you might have:
<listItem type="bulleted" indent="0">Foo</listItem>
<listItem type="bulleted" indent="1">Bar</listItem>
While in the view you have:
<ul>
<li>
Foo
<ul>
<li>Bar</li>
</ul>
</li>
</ul>
If the mapper knows that first listItem is bound with first <li> and the second listItem is bound with second <li>, then it is able to translate positions correctly.
Back to your case. Each converter has to provide data for Mapper. Since you used converter builder, the converters build by it already do this. But they are simple, so when you provide:
buildModelConverter().for(editing.modelToView)
.fromElement('myElement')
.toElement(new ContainerElement('div', {}, [new EditableElement('h4')]));
it is assumed, that myElement is bound with the <div>. So anything that is written inside that <div> will go straight to myElement and then will be rendered at the beginning of myElement:
<div>
<h4></h4>
</div>
Assuming that you just wrote x at <h4>, that position will be mapped to myElement offset 0 in the model and then rendered to view at the beginning of <div>.
Model:
<myElement>x</myElement>
View:
<div>x<h4></h4></div>
As you can see, in your case, it is <h4> which should be bound with myElement.
At the moment, we are during refactoring phase. One of the goals is providing more utility functions for converter builders. One of those utilities are converters for elements which have a wrapper element, like in that "div + h4" case above. This is also a case of image feature. The image is represented by <image> in model but it is <figure><img /></figure> in the view. You can look at ckeditor5-image to see how those converters look like now. We want to simplify them.
Unfortunately, your real case is even more complicated because you have multiple elements inside. CKE5 architecture should be able handle your case but you have to understand that this is almost impossible to write without proper guides.
If you want to tinker though, you should study ckeditor5-image repo. It won't be easy, but this is the best way to go. Image plugin together with ImageCaption are very similar to your case.
Model:
<image alt="x" src="y">
<caption>Foo</caption>
</image>
View:
<figure class="image">
<img alt="x" src="y" />
<figcaption>Foo</caption>
</figure>
While in your case, I'd see the model somewhere between those lines:
<rankItem imageSrc="x" linkUrl="y">
<rankTitle>Foo</rankTitle>
<rankNotes>Bar</rankNotes>
</rankItem>
And I'd make the view a bit heavier but it will be easier to write converters:
<li contenteditable="false">
<a href="y">
<img src="x" />
<span class="data">
<span class="title" contenteditable="true">Foo</span>
<span class="notes" contenteditable="true">Bar</span>
</span>
</a>
</li>
For rankTitle and rankNotes - base them on caption element (ckeditor5-image/src/imagecaption/imagecaptionengine.js).
For rankItem - base it on image element (ckeditor5-image/src/image/).
Once again - keep in mind that we are in the process of simplifying all of this. We want people to write their own features, even those complicated ones like yours. However, we are aware of how complex it is right now. That's why there are no docs at the moment - we are looking to change and simplify things.
And lastly - you could create that ranking list simpler, using Link plugin and elements build with converter builder:
rankList -> <ol class="rank">,
rankItem -> <li>,
rankImage -> <img />,
rankNotes -> <span class="notes">,
rankTitle -> <span class="title">.
However, it will be possible to mess it up because the whole structure will be editable.
Model:
<rankList>
<rankItem>
<rankImage linkHref="" />
<rankTitle>Foo</rankTitle>
<rankNotes>Bar</rankNotes>
</rankItem>
...
</rankList>
Where "Foo" and "Bar" also have linkHref attribute set.
View:
<ol class="rank">
<li>
<img src="" />
<span class="title">Title</span>
<span class="notes">Foo</span>
</li>
...
</ol>
Something like this, while far from perfect, should be much easier to write as long as we are before the refactor and before writing guides.
Of course you will also have to provide view-to-model converters. You might want to write them on your own (take a look at ckeditor5-list/src/converters.js viewModelConverter() -- although yours will be easier because it will be flat, not nested list). Or you can generate them through converter builder.
Maybe it will be possible to use the approach above (simpler) but use contentEditable attribute to control the structure. rankList would have to be converted to <ol> with contentEditable="false". Maybe you could somehow use toWidget for better selection handling, for example. rankNotes and rankTitle would have to be converted to element with contentEditable="true" (maybe use toWidgetEditable()).

Xpath extraction between two nodes

I have several web pages that have the following structure
<div id="w_33086" class = "eight columns">
<h2 id="about">About<span itemprop="name">Name of Place</span></h2>
<p style="margin-top: 0px;">
.
.
.
<p class="contactAdvisor"><a href="http://www.example.com/contact">
The text and markup between the two paragraphs is quite variable as it is created through individual users. In some cases it includes markup and in some cases it does not. When it does include markup, the mark up can be quite variable.
I'm trying to select all of the text and markup between these two <p> but have not been successful.
The best result I've achieved comes from //div[id='w_33086']/node()
However, that is dropping the <p> tags when those are present. It also picks up the <h2> tag and the <p class="contactAdvisor"> that I would rather exclude.
I'm using Google Sheets (and or Screaming Frog) to apply the xpath
If the two <p> elements are siblings of each other, then you could try
//div[id='w_33086']/p[#style="margin-top: 0px;"]/
following-sibling::node()[following-sibling::p[#class="contactAdvisor"]]
This is assuming that there is only one <p> under <div id="w_33086"> that has the style or class attribute value (respectively) that we're using to identify them.
Please note that XPath does not select any "markup". It selects nodes such as text nodes, elements (which have descendants attached), and attributes. Those nodes can be serialized as markup, but that's not XPath's business.

Is it possible to add custom attributes in Microdata?

Is there an option to add custom attributes to a scheme? (same as we can expand DTD?)
itemprop="description" isn't enough for me. I got more attributes that I wish to add, that do not exist in the original scheme:
Objective
Duration
Availability
I need this attributes cause they project the full characteristic of my product.
In Microdata, you can use a "proprietary item property name":
one used by the author for private purposes, not defined in a public specification
It has to be an absolute URL, e.g.:
<div itemscope itemtype="http://schema.org/Thing">
<p itemprop="description">…</p>
<p itemprop="http://example.com/voc/objective">…</p>
</div>
(Of course you can’t expect other consumers to make use of it.)
If you are using the Schema.org vocabulary, you could also:
propose new Schema.org properties/types
extend an existing Schema.org property (but it’s considered outdated)

Can I use multiple itemtypes in one itemscope for Schema.org? [duplicate]

This question already has answers here:
Correct way to use multiple itemtypes in Microdata
(2 answers)
Closed 4 years ago.
I am wondering if I can use multiple itemtypes inside one item scope. For example I have this at the moment:
<body id="home" itemscope itemtype="http://schema.org/WebPage">
<div class="wrapper" itemscope itemtype="http://schema.org/ProfessionalService">
<p itemprop from professional service></p>
<p itemprop from web page></p>
</div>
</body>
When I do a structured data test within Google's Web developer tools it only picks up items within the professional service schema and every itemprop that is related to the webpage schema is ignored and not recognised as part of the professional service. I understand about nesting them and why it's happening.
Can I have a multiple itemtype within an item scope? Such as:
<div class="wrapper" itemscope itemtype="http://schema.org/ProfessionalService http://schema.org/WebPage">
<p itemprop from professional service></p>
<p itemprop from web page></p>
</div>
Yes, you can use several item types in one itemtype attribute, as long as they are from the same vocabulary. See Microdata: itemtype:
The itemtype attribute, if specified, must have a value that is an unordered set of unique space-separated tokens that are case-sensitive, each of which is a valid URL that is an absolute URL, and all of which are defined to use the same vocabulary.
But note that then all properties (itemprop values) need to be defined for all the specified item types. So you cannot say that a particular property should belong only to a particular item type.
So you’d still have the same problem. In your case, you should either use correct nesting, or you might use the itemref attribute to add properties to the corresponding items that are scattered on the page.
FWIW, the schema.org vocabulary also defines the additionalType property. This can also be used to specify additional item types from other vocabularies. But this doesn’t allow you to use the properties from the additional item type.

Google Spreadsheet importxml timestamp

I been trying for over 2 hours to import timestamp from zap2it.com link to my google spreasheet.
Here is link I am trying to importxml from.
http://affiliate.zap2it.com/tvlistings/ZCGrid.do?zipcode=78238&lineupId=DISH641:-
Here is what I am tryign to import
Here is what I tried so far
=importxml("http://affiliate.zap2it.com/tvlistings/ZCGrid.do?aid=dish&pkg=8388608&fromProvider=true&zipcode=78238&x=52&y=18"&B1,"//body//div[3]/div/div/div[3]/div/div")
EDIT
I was able to improve and get better results
//body//div[3]/div/div/div[1]//*
but it shows timestamp from all over the page. not exactly what I need.
[The first complication is that the data stream returned from dereferencing that URI is not actually XML; it has several thousand well-formedness errors (unescaped ampersands in URIs, unescaped ampersands and less-than signs in scripts, some embedded HTML, some miscellaneous errors). Since you're not reporting problems from that, however, I'll assume that somewhere between the server and your XPath expression someone is doing some tidying.]
I think you'll get better results if you use the id and class attributes that are extensively used in the document. The material you want looks like this in the source (you can use any browser-based debugging tool to find it; I used the 'Web Inspector' in Safari); I have indented to make the structure more visible, and fixed some well-formedness errors in one of the a elements (missing whitespace between attribute-value pairs).
<div class="zc-tn" id="zc-tn-top">
<div class="zc-tn-i">
<a href="ZCGrid.do?fromTimeInMillis=1355781600000"
class="zc-tn-l"
title="Move the grid three hours earlier"></a>
<div class="zc-tn-c">
<span class="zc-tn-z"
title="Central Standard Time">CST</span>
<div class="zc-tn-t">7:00 PM</div>
<div class="zc-tn-t">7:30 PM</div>
<div class="zc-tn-t">8:00 PM</div>
<div class="zc-tn-t">8:30 PM</div>
<div class="zc-tn-t">9:00 PM</div>
<div class="zc-tn-t">9:30 PM</div>
</div>
<a href="ZCGrid.do?fromTimeInMillis=1355803200000"
class="zc-tn-r"
title="Advance the grid three hours"></a>
</div>
</div>
A simple search verifies that the value zc-tn-top is indeed unique as an ID value in the document. Given that, a simple XPath expression to retrieve all the elements whose display is circled in your image is (assuming xhtml is bound to the XHTML namespace):
//xhtml:div[#id='zc-tn-top']//xhtml:div[#class='zc-tn-t']
It looks from your question as if your XPath evaluator is namespace-challenged or namespace-oblivious, so you may need to write this as
//div[#id='zc-tn-top']//div[#class='zc-tn-t']

Resources