What is the reason browsers do not correctly recognize:
<script src="foobar.js" /> <!-- self-closing script element -->
Only this is recognized:
<script src="foobar.js"></script>
Does this break the concept of XHTML support?
Note: This statement is correct at least for all IE (6-8 beta 2).
The non-normative appendix ‘HTML Compatibility Guidelines’ of the XHTML 1 specification says:
C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY (for example, an empty title or paragraph) do not use the minimized form (e.g. use <p> </p> and not <p />).
XHTML DTD specifies script elements as:
<!-- script statements, which may include CDATA sections -->
<!ELEMENT script (#PCDATA)>
To add to what Brad and squadette have said, the self-closing XML syntax <script /> actually is correct XML, but for it to work in practice, your web server also needs to send your documents as properly formed XML with an XML mimetype like application/xhtml+xml in the HTTP Content-Type header (and not as text/html).
However, sending an XML mimetype will cause your pages not to be parsed by IE7, which only likes text/html.
From w3:
In summary, 'application/xhtml+xml'
SHOULD be used for XHTML Family
documents, and the use of 'text/html'
SHOULD be limited to HTML-compatible
XHTML 1.0 documents. 'application/xml'
and 'text/xml' MAY also be used, but
whenever appropriate,
'application/xhtml+xml' SHOULD be used
rather than those generic XML media
types.
I puzzled over this a few months ago, and the only workable (compatible with FF3+ and IE7) solution was to use the old <script></script> syntax with text/html (HTML syntax + HTML mimetype).
If your server sends the text/html type in its HTTP headers, even with otherwise properly formed XHTML documents, FF3+ will use its HTML rendering mode which means that <script /> will not work (this is a change, Firefox was previously less strict).
This will happen regardless of any fiddling with http-equiv meta elements, the XML prolog or doctype inside your document -- Firefox branches once it gets the text/html header, that determines whether the HTML or XML parser looks inside the document, and the HTML parser does not understand <script />.
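The MIME-type split described above is often handled on the server by content negotiation. A minimal sketch in JavaScript, assuming a hypothetical helper named pickContentType that receives the raw value of the request's Accept header. It ignores q-values, so treat it as an illustration rather than a complete negotiation routine:

```javascript
// Hypothetical helper (not from any library): decide which Content-Type to
// send for an XHTML document, based on the request's Accept header.
function pickContentType(acceptHeader) {
  // Split "a/b;q=0.9, c/d" into bare media types, dropping parameters.
  const accepted = String(acceptHeader || "")
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase());

  // Only clients that explicitly list the XML type get the XML media type.
  return accepted.includes("application/xhtml+xml")
    ? "application/xhtml+xml"
    : "text/html";
}

// A browser that advertises XHTML support:
console.log(pickContentType("text/html,application/xhtml+xml,*/*;q=0.8")); // application/xhtml+xml
// A client that does not mention it:
console.log(pickContentType("image/jpeg, */*")); // text/html
```

Even with negotiation like this, the markup still has to use <script></script>, because the text/html branch goes through the HTML parser.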
Others have answered "how" and quoted spec. Here is the real story of "why no <script/>", after many hours digging into bug reports and mailing lists.
HTML 4
HTML 4 is based on SGML.
SGML has some shorttags, such as <BR//, <B>text</>, <B/text/, or <OL<LI>item</LI</OL>.
XML takes the first form, redefines the ending as ">" (SGML is flexible), so that it becomes <BR/>.
However, HTML did not redefine it, so <SCRIPT/> should mean <SCRIPT>>.
(Yes, the '>' should be part of content, and the tag is still not closed.)
Obviously, this is incompatible with XHTML and will break many sites (by the time browsers were mature enough to care about this), so nobody implemented shorttags and the specification advises against them.
In effect, every self-ended tag that 'works' is a tag whose end tag is prohibited, read by a technically non-conformant parser, and it is in fact invalid.
It was W3C which came up with this hack to help transitioning to XHTML by making it HTML-compatible.
And <script>'s end tag is not prohibited.
"Self-ending" tag is a hack in HTML 4 and is meaningless.
HTML 5
HTML5 has five types of tags and only 'void' and 'foreign' tags are allowed to be self-closing.
Because <script> is not void (it may have content) and is not foreign (like MathML or SVG), <script> cannot be self-closed, regardless of how you use it.
But why? Can't they regard it as foreign, make special case, or something?
HTML 5 aims to be backward-compatible with implementations of HTML 4 and XHTML 1.
It is not based on SGML or XML; its syntax is mainly concerned with documenting and uniting the implementations.
(This is why <br/> <hr/> etc. are valid HTML 5 despite being invalid HTML4.)
Self-closing <script> is one of the tags where implementations used to differ.
It used to work in Chrome, Safari, and Opera; to my knowledge it never worked in Internet Explorer or Firefox.
This was discussed when HTML 5 was being drafted and got rejected because it breaks browser compatibility.
Webpages that self-close the script tag may not render correctly (if at all) in old browsers.
There were other proposals, but they can't solve the compatibility problem either.
After the draft was released, WebKit updated the parser to be in conformance.
Self-closing <script> does not happen in HTML 5 because of backward compatibility to HTML 4 and XHTML 1.
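As a practical consequence of the void-element rule above, markup meant for text/html needs an explicit end tag on every element that is not void. A rough sketch in JavaScript, assuming a hypothetical helper named expandSelfClosing. It uses a regular expression rather than a real parser, so it will misfire on attribute values that contain < or > and it also expands self-closed foreign (SVG, MathML) elements:

```javascript
// The void elements of HTML5; only these may be written without an end tag.
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
]);

// Hypothetical helper: rewrite <name ... /> as <name ...></name> for every
// element that is not void. Void elements are returned unchanged.
function expandSelfClosing(markup) {
  const selfClosed = /<([A-Za-z][A-Za-z0-9]*)([^<>]*?)\s*\/>/g;
  return markup.replace(selfClosed, (match, name, attributes) => {
    if (VOID_ELEMENTS.has(name.toLowerCase())) {
      return match;
    }
    return `<${name}${attributes}></${name}>`;
  });
}

console.log(expandSelfClosing('<script src="foobar.js" />')); // <script src="foobar.js"></script>
console.log(expandSelfClosing("<br />")); // <br />
```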
XHTML 1 / XHTML 5
When really served as XHTML, <script/> is really closed, as other answers have stated.
Except that the spec says it should have worked when served as HTML:
XHTML Documents ... may be labeled with the Internet Media Type "text/html" [RFC2854], as they are compatible with most HTML browsers.
So, what happened?
People asked Mozilla to let Firefox parse conforming documents as XHTML regardless of the specified content header (known as content sniffing).
This would have allowed self-closing scripts, and content sniffing was necessary anyway because web hosters were not mature enough to serve the correct header; IE was good at it.
If the first browser war had not ended with IE 6, XHTML might have been on the list, too. But it did end, and IE 6 has a problem with XHTML.
In fact IE did not support the correct MIME type at all, forcing everyone to use text/html for XHTML because IE held major market share for a whole decade.
And also content sniffing can be really bad and people are saying it should be stopped.
Finally, it turns out that the W3C didn't mean XHTML to be sniffable: the document is both, HTML and XHTML, and Content-Type rules.
One can say they were standing firm on "just follow our spec" and ignoring what was practical. A mistake that continued into later XHTML versions.
Anyway, this decision settled the matter for Firefox.
That was 7 years before Chrome was born, and there were no other significant browsers. Thus it was decided.
Specifying the doctype alone does not trigger XML parsing because of following specifications.
In case anyone's curious, the ultimate reason is that HTML was originally a dialect of SGML, which is XML's weird older brother. In SGML-land, elements can be specified in the DTD as either self-closing (e.g. BR, HR, INPUT), implicitly closeable (e.g. P, LI, TD), or explicitly closeable (e.g. TABLE, DIV, SCRIPT). XML, of course, has no concept of this.
The tag-soup parsers used by modern browsers evolved out of this legacy, although their parsing model isn't pure SGML anymore. And of course, your carefully-crafted XHTML is being treated as badly-written SGML-inspired tag-soup unless you send it with an XML mime type. This is also why...
<p><div>hello</div></p>
...gets interpreted by the browser as:
<p></p><div>hello</div><p></p>
...which is the recipe for a lovely obscure bug that can throw you into fits as you try to code against the DOM.
Internet Explorer 8 and earlier do not support XHTML parsing. Even if you use an XML declaration and/or an XHTML doctype, old IE still parses the document as plain HTML. And in plain HTML, the self-closing syntax is not supported. The trailing slash is just ignored, so you have to use an explicit closing tag.
Even browsers with support for XHTML parsing, such as IE 9 and later, will still parse the document as HTML unless you serve the document with an XML content type. But in that case old IE will not display the document at all!
The people above have already pretty much explained the issue, but one thing that might make things clear is that, though people use <br/> and such all the time in HTML documents, any / in such a position is basically ignored, and only used when trying to make something both parseable as XML and HTML. Try <p/>foo</p>, for example, and you get a regular paragraph.
The self closing script tag won't work, because the script tag can contain inline code, and HTML is not smart enough to turn on or off that feature based on the presence of an attribute.
On the other hand, HTML does have an excellent tag for including
references to outside resources: the <link> tag, and it can be
self-closing. It's already used to include stylesheets, RSS and Atom
feeds, canonical URIs, and all sorts of other goodies. Why not
JavaScript?
If you want the script tag to be self-closed, you can't do that, as I said. There is an alternative of sorts, though not a smart one: use the self-closing link tag and point it at your JavaScript by giving it a type of text/javascript and a rel of script, something like the line below. Be aware that script is not a defined rel value, so browsers will not fetch and execute the file on their own.
<link type="text/javascript" rel="script" href="/path/to/javascript" />
Unlike XML and XHTML, HTML has no knowledge of the self-closing syntax. Browsers that interpret XHTML as HTML don't know that the / character indicates that the tag should be self-closing; instead they interpret it like an empty attribute and the parser still thinks the tag is 'open'.
Just as <script defer> is treated as <script defer="defer">, <script /> is treated as <script /="/">.
Internet Explorer 8 and older don't support the proper MIME type for XHTML, application/xhtml+xml. If you're serving XHTML as text/html, which you have to for these older versions of Internet Explorer to do anything, it will be interpreted as HTML 4.01. You can only use the short syntax with any element that permits the closing tag to be omitted. See the HTML 4.01 Specification.
The XML 'short form' is interpreted as an attribute named /, which (because there is no equals sign) is interpreted as having an implicit value of "/". This is strictly wrong in HTML 4.01 - undeclared attributes are not permitted - but browsers will ignore it.
IE9 and later support XHTML 5 served with application/xhtml+xml.
That's because SCRIPT TAG is not a VOID ELEMENT.
In an HTML Document - VOID ELEMENTS do not need a "closing tag" at all!
In xhtml, everything is Generic, therefore they all need termination e.g. a "closing tag"; Including br, a simple line-break, as <br></br> or its shorthand <br />.
However, a Script Element is never a void or a parametric Element, because script tag before anything else, is a Browser Instruction, not a Data Description declaration.
Principally, a Semantic Termination Instruction, e.g. a "closing tag", is only needed for processing instructions whose semantics cannot be terminated by a succeeding tag. For instance:
<H1> semantics cannot be terminated by a following <P> because it doesn't carry enough of its own semantics to override and therefore terminate the previous H1 instruction set. Although it will be able to break the stream into a new paragraph line, it is not "strong enough" to override the present font size & style line-height pouring down the stream, i.e leaking from H1 (because P doesn't have it).
This is how and why the "/" (termination) signalling has been invented.
A generic no-description termination Tag like < /> would have sufficed for any single fall off the encountered cascade, e.g. <H1>Title< />, but that's not always the case, because we also want to be capable of "nesting", multiple intermediary tagging of the Stream: split into torrents before wrapping / falling onto another cascade. As a consequence, a generic terminator such as < /> would not be able to determine the target of a property to terminate. For example, <b>bold <i>bold-italic < /> italic </>normal would undoubtedly fail to get our intention right and would most probably be interpreted as bold bold-italic bold normal.
This is how the notion of a wrapper ie., container was born. (These notions are so similar that it is impossible to discern and sometimes the same element may have both. <H1> is both wrapper and container at the same time. Whereas <B> only a semantic wrapper). We'll need a plain, no semantics container. And of course the invention of a DIV Element came by.
The DIV element is actually a 2BR-Container. Of course the coming of CSS made the whole situation weirder than it would otherwise have been and caused a great confusion with many great consequences - indirectly!
Because with CSS you could easily override the native pre&after BR behavior of a newly invented DIV, it is often referred to, as a "do nothing container". Which is, naturally wrong! DIVs are block elements and will natively break the line of the stream both before and after the end signalling. Soon the WEB started suffering from page DIV-itis. Most of them still are.
The coming of CSS with its capability to fully override and completely redefine the native behavior of any HTML Tag, somehow managed to confuse and blur the whole meaning of HTML existence...
Suddenly all HTML tags appeared as if obsolete, they were defaced, stripped of all their original meaning, identity and purpose. Somehow you'd gain the impression that they're no longer needed. Saying: A single container-wrapper tag would suffice for all the data presentation. Just add the required attributes. Why not have meaningful tags instead; Invent tag names as you go and let the CSS bother with the rest.
This is how xhtml was born, and with it the great blunder, paid for so dearly by newcomers: a distorted vision of what is what, and of what the damn purpose of it all is. W3C went from World Wide Web to What Went Wrong, Comrades?!!
The purpose of HTML is to stream meaningful data to the human recipient.
To deliver Information.
The formal part is there to only assist the clarity of information delivery.
xhtml doesn't give the slightest consideration to the information. - To it, the information is absolutely irrelevant.
The most important thing in the matter is to know and be able to understand that xhtml is not just a version of some extended HTML, xhtml is a completely different beast; grounds up; and therefore it is wise to keep them separate.
The simple modern answer is that the tag is specified that way: both of its tags are mandatory.
Tag omission None, both the starting and ending tag are mandatory.
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script
The difference between 'true XHTML', 'faux XHTML' and 'ordinary HTML', as well as the importance of the server-sent MIME type, has already been described well here.
If you want to try it out right now, here is simple editable snippet with live preview including self-closed script tag (see <script src="data:text/javascript,/*functionality*/" />) and XML entity (unrelated, see &x;).
As you can see, depending on the MIME type of embedding document the data-URI JavaScript functionality is either executed and consecutive text displayed (in application/xhtml+xml mode) or not executed and consecutive text 'devoured' by the script (in text/html mode).
div { display: flex; }
div + div {flex-direction: column; }
<div>Mime type: <label><input type="radio" onchange="t.onkeyup()" id="x" checked name="mime"> application/xhtml+xml</label>
<label><input type="radio" onchange="t.onkeyup()" name="mime"> text/html</label></div>
<div><textarea id="t" rows="4"
onkeyup="i.src='data:'+(x.checked?'application/xhtml+xml':'text/html')+','+encodeURIComponent(t.value)"
><?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
[<!ENTITY x "true XHTML">]>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>
<span id="greet" swapto="Hello">Hell, NO :(</span> &x;.
<script src="data:text/javascript,(g=document.getElementById('greet')).innerText=g.getAttribute('swapto')" />
Nice to meet you!
<!--
Previous text node and all further content falls into SCRIPT element content in text/html mode, so is not rendered. Because no end script tag is found, no script runs in text/html
-->
</p>
</body>
</html></textarea>
<iframe id="i" height="80"></iframe>
<script>t.onkeyup()</script>
</div>
You should see Hello, true XHTML. Nice to meet you! below the textarea.
For incapable browsers you can copy content of the textarea and save it as a file with .xhtml (or .xht) extension (thanks Alek for this hint).
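The onkeyup handler in the snippet builds a data: URI whose media type decides which parser the iframe uses. The same idea as a standalone function, as a sketch only; toDataUri is a hypothetical name and not part of the snippet above:

```javascript
// Hypothetical helper mirroring the snippet's onkeyup expression:
// 'data:' + mimeType + ',' + encodeURIComponent(markup)
function toDataUri(markup, asXhtml) {
  const mimeType = asXhtml ? "application/xhtml+xml" : "text/html";
  return "data:" + mimeType + "," + encodeURIComponent(markup);
}

const sample = '<p>before <script src="x.js" /> after</p>';
// Same markup, two media types: only the prefix of the URI differs.
console.log(toDataUri(sample, true));
console.log(toDataUri(sample, false));
```

Assigning either string to the src of an iframe, as the snippet does, is what switches between the XML and the HTML parser for otherwise identical markup.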
I get an error on my product view page in the W3C validator.
My product is a downloadable product, and I have a custom option for that product.
When I validate the test-product page in the W3C validator, it shows an error like this:
there is no attribute "price"
…" id="options_21_2" value="27" price="0" />
Error line:
<ul id="options-21-list" class="options-list"><li><input type="checkbox" class="checkbox product-custom-option" onclick="opConfig.reloadPrice()" name="options[21][]" id="options_21_2" value="27" price="0" /><span class="label"><label for="options_21_2">Test Product</label></span></li></ul>
Please help me fix this issue.
Background
The problem here is that price is a custom attribute. Even though almost all browsers let you use it via JavaScript the way you posted it, it's not a valid (X)HTML attribute the way id or name are, for example.
The W3C validator validates your source code against the DTD (Document Type Definition) found in the <!DOCTYPE .. > declaration of your document.
Magento CE/EE 1.x versions use a XHTML 1.0 Strict DTD by default.
A DTD declares which rules a document must follow to be valid for the given document type. It defines which element types are allowed, which attributes a specific element can have, which entities can be used, etc.
If you check the linked DTD above, you'll see that there's no price attribute defined anywhere in the file.
That's why the W3C validator rightfully complains .. there is no attribute "price".
What can you do?
Mainly, the following three ways of handling such a situation come to mind:
Ignore after double checking
You could simply ignore this (and only this) specific kind of W3C validation errors.
I guess that's what most devs do with ".. there is no attribute attr_name" validation errors when they already double checked, that it's a custom attribute really in use and only failing W3C validation (using a pre-HTML5 DTD), but working completely fine otherwise.
Extend the DTD
You could extend the XHTML 1.0 Strict DTD, adding custom attributes to specific elements.
Example on how to add a custom price attribute for input elements, using an internal subset:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
[
<!ATTLIST input price CDATA #IMPLIED>
]
>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-type" content="text/html;charset=UTF-8" />
<title>Test</title>
</head>
<body>
<p>
<input type="checkbox" class="checkbox product-custom-option" onclick="opConfig.reloadPrice()"
name="options[21][]" id="options_21_2" value="27" price="0" />
</p>
</body>
</html>
Containing this internal subset, the W3C validator will validate without errors.
But, most major browsers will render an ugly ]> as a result, when internal subsets come into play.
Maybe, because they don't support nested tags (at all, or correctly), or maybe they switch to their hardwired DTDs as soon as they found an official one in the <!DOCTYPE .. >, I can't tell exactly.
To avoid the ]>, you could build a custom DTD, using the original DTD as a base, extend it with custom attributes and change the <!DOCTYPE .. > to use that custom DTD.
The crux with custom DTDs is, even though it's technically correct and the browsers won't render that ugly ]> anymore, you also can't use the W3C validator anymore. It doesn't support custom DTDs.
So, if W3C compliance is a must, your only choice is to stay with internal subsets. But then you still need to get rid of the ugly ]> somehow. To achieve this, you could use some CSS, e.g. similar to this:
html {
color: transparent;
}
Be aware though, that extending DTDs can result in lots of work. You'll need to extend all element types where your custom attribute could appear. And you'd need to do this for each custom attribute, of course.
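Because each element/attribute pair needs its own declaration, the internal subset can be generated instead of typed by hand. A sketch in JavaScript, assuming a hypothetical helper named buildDoctype and the same XHTML 1.0 Strict identifiers used above. Declaring every attribute as CDATA #IMPLIED is an assumption that will not fit every case:

```javascript
const PUBLIC_ID = "-//W3C//DTD XHTML 1.0 Strict//EN";
const SYSTEM_ID = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd";

// Hypothetical helper: customAttributes maps an element name to the list of
// extra attribute names to allow on it, e.g. { input: ["price"] }.
function buildDoctype(customAttributes) {
  const declarations = [];
  for (const [element, attributes] of Object.entries(customAttributes)) {
    for (const attribute of attributes) {
      // One ATTLIST declaration per element/attribute pair.
      declarations.push(`<!ATTLIST ${element} ${attribute} CDATA #IMPLIED>`);
    }
  }
  return [
    `<!DOCTYPE html PUBLIC "${PUBLIC_ID}" "${SYSTEM_ID}"`,
    "[",
    ...declarations,
    "]",
    ">",
  ].join("\n");
}

console.log(buildDoctype({ input: ["price"], select: ["price"] }));
```

The output has the same shape as the hand-written DOCTYPE above, so the ]> rendering issue applies to it just the same.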
Use HTML5 data-* attributes
You could rewrite your Magento templates to use HTML5 and its data-* attributes, a way where you only have to prefix custom attribute names with data- to make them perfectly valid.
But since fully transferring Magento 1.x from XHTML 1.0 Strict to HTML5 would result in tons of complex work, I don't really consider this an option.
Afaik, even Magento 2.x will not switch to HTML5, but continue to use XHTML 1.0 Strict.
Maybe for the very same reason^^
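If the templates were moved to HTML5 after all, the rename itself is mechanical. A sketch in JavaScript, assuming a hypothetical helper named toDataAttribute that rewrites one known custom attribute name inside a markup string. It is a plain text substitution, so it would also touch matching text outside of tags, and any script that reads the old attribute would have to be updated separately:

```javascript
// Hypothetical helper: turn name="..." into data-name="..." in a markup string.
// The name is assumed to be a plain attribute name without regex metacharacters.
function toDataAttribute(markup, name) {
  // Match the name only when preceded by whitespace and followed by "=",
  // so that id="price" or an existing data-price="0" is left alone.
  const pattern = new RegExp(`(\\s)${name}(\\s*=)`, "g");
  return markup.replace(pattern, `$1data-${name}$2`);
}

const before = '<input type="checkbox" id="options_21_2" value="27" price="0" />';
console.log(toDataAttribute(before, "price"));
// <input type="checkbox" id="options_21_2" value="27" data-price="0" />
```

In the browser, the value would then be readable through getAttribute('data-price') or element.dataset.price.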
I want to transform an XML document to HTML using XSL, tinker with it a little, then render it out. This is essentially what I'm doing:
source = Nokogiri::XML(File.read 'source.xml')
xsl = Nokogiri::XSLT(File.read 'transform.xsl')
transformed = xsl.transform(source)
html = Nokogiri::HTML(transformed.to_html)
html.title = 'Something computed'
Stylesheet::transform always returns XML::Document, but I need a HTML::Document instance to use methods like title=.
The code above works, but exporting and re-parsing as HTML is just awful. Since the target is a subclass of the source, there must be a more effective way to perform the conversion.
How can I clean up this mess?
As a side question, Nokogiri has generally underwhelmed me with its handling of doctypes, unawareness of <meta charset= etc... does anyone know of a less auto-magic library with similar capabilities?
Many thanks ;)
HTML::Document extends XML::Document, but the individual nodes in a HTML document are just plain XML::Nodes, i.e. there aren’t any HTML::Nodes. This suggests a way of converting an XML document to HTML by creating a new empty HTML::Document and setting its root to that of the XML document:
html = Nokogiri::HTML::Document.new
html.root = transformed.root
The new document has the HTML methods like title= and meta_encoding= available, and when serializing it creates an HTML document rather than XML: it adds an HTML doctype, correctly uses empty tags like <br>, displays minimized attributes where appropriate (e.g. <input type="checkbox" checked>) and doesn’t escape things like > in <script> blocks.
Sometimes you want two body backgrounds. One for the header and one for the footer.
I accidentally discovered that it is possible to style the actual <html> tag.
HTML:
<html xmlns="http://www.w3.org/1999/xhtml">
CSS:
html {background:#000}
Is it OK to style this, or will it cause any problem?
Although it's debatable as to whether it's valid [see here]:
For HTML documents, however, we recommend that authors specify the
background for the BODY element rather than the HTML element. ...
Many large sites still use it with seemingly consistent results.
Yes, it's OK to style the html tag.
It is very common practice. Used all the time on many websites including this one.
I recently have found a strange occurrence in IE8 & FF.
The designers were using JS to dynamically create some span tags for layout (they were placing rounded corner graphics on some tabs). Now the XHTML, in JS, looked like this: <span class="leftcorner" /><span class="rightcorner" /> and worked perfectly!
As we all know, dynamically rendering elements in JS can be quite processor intensive, so I moved the elements from JS into the page source, exactly as above.
... and it didn't work... not only didn't it work, it crashed IE8. The fix was simple: put the closing span in, i.e. <span class="leftcorner"></span>
I am a bit confused by this.
Firstly, as far as I am aware, <span class="leftcorner" /> is perfectly valid XHTML!
Secondly it works dynamically, but not in XHTML?!?!?
Can anyone shed any light on this or is it simply another odd occurrence of browsers?
The major browsers only support a small subset of self-closing tags. (See this answer for a complete list.)
Depending on how you were creating the elements in JS, the JavaScript engine probably created a valid element to place in the DOM.
I had similar problem with a tags in IE.
The problem was that my links looked like this (it was an icon set with CSS, so I didn't need any text in it):
<a href="link" class="icon edit" />
Unfortunately, in IE these links were not displayed at all. They have to be written with an explicit closing tag (leaving the element empty did not work either, so I put a non-breaking space in it). So I added a few extra JS lines to fix it, as I didn't want to change all my HTML just for this browser (PS: I'm using jQuery for my JS).
if ($.browser.msie) {
$('a.icon').html('&nbsp;');
}
IE in particular does not support XHTML. That is, it will never apply proper XML parsing rules to a document - it will treat it as HTML even with proper DOCTYPE and all. XHTML is not always valid SGML, however. In some cases (such as <br/>) IE can figure it out because it's prepared to parse tagsoup, and not just valid SGML. However, in other cases, the same "tagsoup" behavior means that it won't treat /> as self-closing tag terminator.
In general, my advice is to just use HTML 4.01 Strict. That way you know exactly what to expect. And there's little point in feeding XHTML to browsers when they're treating it as HTML anyway...
I think one of the answers to "Is writing self closing tags for elements not traditionally empty bad practice?" will answer your question.
XHTML is only XHTML if it is served as application/xhtml+xml — otherwise, at least as far as browsers are concerned, it is HTML and treated as tag soup.
As a result, <span /> means "A span start tag" and not "A complete span element". (Technically it should mean "A span start tag and a greater than sign", but that is another story).
The XHTML spec tells you what you need to do to get your XHTML to parse as HTML.
One of the rules is "For non-empty elements, end tags are required". The list of elements includes a quick reference to which are empty and which are not.
I have an XML document that I'm displaying in a web browser, with a stylesheet attached:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet type="text/css" href="abc.css"?>
<myxml xmlns:xhtml="http://www.w3.org/1999/xhtml">
<para>I wish i was editable</para>
<xhtml:script type="text/javascript" src="abc.js"/>
</myxml>
With the xhtml namespace declaration, and xhtml:script tag, I can execute javascript.
What I'd like to do is to make arbitrary non-XHTML elements in this document content editable. (Actually, they'll be in another namespace)
Even if I explicitly add #contentEditable="true" (ie without resorting to Javascript), the content is not actually editable (in Firefox 3.0.4).
Is it possible to edit it in any of the current browsers? (I had no problems with <div contentEditable="true">Edit me</div> in an XHTML 1.0 Transitional doc)
I can't even edit an xhtml:div in this document (in Firefox); if I could do that, that may offer a way forward.
In Firefox 3, #content-editable="true" only makes the relevant element editable if the
content type is text/html (which also happens if a local filename ends with .html)
It doesn't work for content types application/xhtml+xml or text/xml (local filenames ending with .xhtml or .xml)
I've logged an enhancement for this: https://bugzilla.mozilla.org/show_bug.cgi?id=486931
contentEditable works (tested in Firefox and Chrome) on elements which are foreign to html/xhtml if I use this doctype:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
and a .html file extension (instead of .xml).
I don't have to include any html elements at all (eg head, body, div, p).
css isn't applied though (if my xml is in a namespace, which i guess makes sense, given the doctype!).
Not an elegant solution.
Firefox is one of the few browsers that strictly enforces the XHTML spec. So, to make an element editable, you must specify the contenteditable attribute as true. Note that the whole attribute name is lower case. In your example the first "E" in editable was capitalized.
Another quirk that should be mentioned is that IE (6, 7, 8) acts in exactly the opposite way. To make an element editable in IE, you MUST add contentEditable="true" exactly. For whatever reason, contenteditable="true" (as well as any other variation in capitalization) does not work.