Importing HTML table into OO Calc as UTF8 without converting to entities - utf-8

I have a problem opening an HTML table in OpenOffice or LibreOffice when it contains UTF-8 extended characters like ÅÄÖåäö.
When I open the table in Microsoft Excel it works as intended, but I can't make OO do the same thing.
Converting every extended character to its HTML entity equivalent (&Aring; etc.) works, but it would be nice to get the correct characters directly.
Does anyone know what I should do?
I have the following content in a file called excelsample.xls, and when I open it with OO Calc the characters do not display correctly.
<!DOCTYPE html>
<html>
<head>
<title></title>
<meta http-equiv="content-type" content="application/vnd.ms-excel" charset="UTF-8">
<meta charset="UTF-8">
</head>
<body>
<table>
<tr>
<td>Prawn sandwich</td><td>Räksmörgås</td>
</tr>
</table>
</body>
</html>

Your meta tag is malformed, and OO probably doesn't recognize the HTML5 charset tag.
Fix it by putting the charset inside the content attribute:
<meta http-equiv="content-type" content="application/vnd.ms-excel; charset=UTF-8">
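If the file is generated by a script, the corrected declaration can be emitted together with explicitly UTF-8 encoded bytes. The sketch below is only an illustration in JavaScript: buildSpreadsheetHtml is a hypothetical helper, not part of any library, and it assumes every cell is a plain string.

```javascript
// Hypothetical helper: builds an HTML table document that declares UTF-8
// inside the content attribute, as suggested in the answer above.
function buildSpreadsheetHtml(rows) {
  // Escape only the characters that would break the markup;
  // non-ASCII characters are left as they are.
  const escapeHtml = (text) =>
    String(text)
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;");

  const tableRows = rows
    .map(
      (cells) =>
        "<tr>" +
        cells.map((cell) => "<td>" + escapeHtml(cell) + "</td>").join("") +
        "</tr>"
    )
    .join("\n");

  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    "<title></title>",
    '<meta http-equiv="content-type" content="application/vnd.ms-excel; charset=UTF-8">',
    "</head>",
    "<body>",
    "<table>",
    tableRows,
    "</table>",
    "</body>",
    "</html>",
  ].join("\n");
}

const html = buildSpreadsheetHtml([["Prawn sandwich", "Räksmörgås"]]);

// Encode explicitly as UTF-8 so the bytes match the declared charset.
const bytes = new TextEncoder().encode(html);
```

Writing bytes to excelsample.xls (for example with Node's fs.writeFileSync) keeps the declared charset and the actual file encoding in agreement.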

Related

canvas not outputting expected £ text symbol to canvas, contains an additional character

When I try to write a £ sign on to canvas
context.fillText("£ ",600,165);
The output writes Â£ to the canvas object. Does anyone have ideas on what to do? I tried
context.fillText("&pound; ",600,165);
but that only writes &pound; to the output object.
It's likely not to work if the encoding of the page isn't defined. Try this at the very top of the HTML page:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"> <!-- THIS ONE !! -->
blabla...
The example below shows that it works when the page is an HTML5 page with UTF-8:
document.getElementById("myCanvas").getContext("2d").fillText("£ ",10,10);
<canvas id="myCanvas" width="300px" height="50px">no html5 support</canvas>
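An extra character in front of the pound sign is consistent with UTF-8 bytes being read as a single-byte encoding. The following sketch illustrates that effect and shows an escape sequence as an alternative; the variable names are made up for the example, and it does not touch the canvas API.

```javascript
// UTF-8 encodes the pound sign (U+00A3) as two bytes: 0xC2 0xA3.
const utf8Bytes = new TextEncoder().encode("\u00A3");

// Treating every byte as its own character, as ISO-8859-1 does,
// produces two characters instead of one.
const misread = String.fromCharCode(...utf8Bytes);

// Decoding the same bytes as UTF-8 gives back the single character.
const correct = new TextDecoder("utf-8").decode(utf8Bytes);

// An escape sequence keeps the script source pure ASCII, so the string
// does not depend on how the encoding of the file is detected.
const label = "\u00A3 ";
```

Passing label to context.fillText should draw the same text as the literal pound sign on a page that is correctly served as UTF-8.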

How to Properly Define UTF-8 Charset in <head> Tag Section of Web Document

If my doctype is <!DOCTYPE html>, is it better or more correct to use
<meta charset="utf-8" />
or
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
to define utf-8?
Thanks.
The first one is only valid in HTML5.
The second one is also valid for older (X)HTML versions.
With this doctype (indicating HTML5) both are valid; I prefer the first as it is shorter. :)

Inserting images on a webpage in notepad using html5

I'm building a webpage in Notepad and using HTML5 for the first time. I believe I coded the image tags correctly, but the images don't show up on the page. Here is the code. I could use some help, please. Thank you.
<html>
<head>
<title>My practice website</title>
<meta charset="utf-8">
<html lang="en">
<meta name="keywords" content="html, css, javascript, history, poems, poetry"/>
<meta name="description" content="This site is about my personal life, poems, poetry, images of family, myself"/>
<meta name="author" content="schweidel tyson">
<meta http-equiv="refresh" content="30" />
<link rel="stylesheet" type="text/css" href="mainstyesheet.css"/>
<body style="background-color: #ccffff;">
</head>
<body/>
<h1>Welcome to my website</h1>
<img src="http://www.html.net/logo.png"/>
<p>This is basically a personal website build to showcase my fledging talent in webdesign to put up pictures of my family and friends. I also like poetry, so there will be some poems.</p><b/>
This is a link to a good html tutorial<br/><br/>
This is another great tutorial link<br/><br/>
A tutorial for styles link
<img src=My practice website/My Website/images/high yellow.jpg" width="192 height="256"/><alt="African Amereican light-skined woman"><br/><br/>
<img src="http://www.zimbio.com/My website/images/trendy.jpg" width="352" width="400"/><alt="African American Woman">
<img src="My practice website/My website/my new pic.jpg" width="104" height="104"/> <alt="me at the domiciliary">
</body>
</html>
Your HTML is a bit off:
<img src="..." width="104" height="104"/> <alt="me at the domiciliary">
alt is just the alternate text for the image. It's an attribute just like width, src and height:
<img src="..." width="104" height="104" alt="me at the domiciliary" />
Also, make sure your URLs are correct.
Also, without a DOCTYPE, your markup is invalid. Include a DOCTYPE (here's an HTML5 one):
<!DOCTYPE html>
<html>
...

Character encoding in ruby

I am parsing data from a Dutch site using Nokogiri and saving it to CSV, but the data is not displayed correctly. For example, on the form the Einddatum1 field is an empty space, but when I print it to the console before saving it is shown as "\u00A0". Other strings are also displayed incorrectly, for example "Univ\u00E9 Zorg Geregeld Polis".
{:Bsn=>"112511111",
:Verzekerde=>"VerzekerdeAHM Andes-Faasse",
:Pakketnaam1=>"Univ\u00E9 Zorg Geregeld Polis",
:Verzekerdennummer1=>"1234987654",
:Begindatum1=>"01 jan 2012",
:Einddatum1=>"\u00A0",
}
Maybe the header of this HTML page is relevant:
<!doctype html>
<!-- paulirish.com/2008/conditional-stylesheets-vs-css-hacks-answer-neither/ -->
<!--[if lt IE 7 ]> <html class="no-js ie6" lang="en"> <![endif]-->
<!--[if IE 7 ]> <html class="no-js ie7" lang="en"> <![endif]-->
<!--[if IE 8 ]> <html class="no-js ie8" lang="en"> <![endif]-->
<!--[if (gte IE 9)|!(IE)]><!--> <html class="no-js" lang="en"> <!--<![endif]-->
<head id="Head1"><meta charset="utf-8" />
<!-- Always force latest IE rendering engine (even in intranet)
Remove this if you use the .htaccess -->
<meta http-equiv="X-UA-Compatible" content="IE=edge" /><title>
Verzekeringsrecht controleren
</title><meta http-equiv="cache-control" content="no-cache" /><meta http-equiv="content-language" content="nl-NL" />
It seems to be UTF-8, but there is a problem with these characters. How do I encode them properly?
Then the line would read :Pakketnaam1=>"Univé Zorg Geregeld Polis",
Is that what is supposed to be there, and your console encoding is just not defined, so Ruby does not know how to display the Unicode characters when printing them? Or should there be some more text?

MVC Razor quirks mode - umbraco

I seem to have an obscure issue with a Razor template forcing browsers into quirks mode. It is a simple Razor template in Umbraco 5. The following code makes Chrome, Firefox, and IE all go into quirks mode:
@inherits RenderViewPage
@using System.Web.Mvc.Html;
@using Umbraco.Cms.Web;
@{
Layout = "";
}
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta charset="utf-8" />
<title>Page title</title>
</head>
<body>
</body>
</html>
If I remove the Razor syntax completely or move it down so it is not before the doctype, the page goes into standards-compliance mode. I've tried adding various X-UA-Compatible meta tags, to no effect.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta charset="utf-8" />
<title>Page title</title>
</head>
<body>
@inherits RenderViewPage
@using System.Web.Mvc.Html;
@using Umbraco.Cms.Web;
@{
Layout = "";
}
</body>
</html>
Does anyone have any idea what could be the cause? It's as though the browsers think something is rendered before the doctype, but there is nothing I can see.
Thanks
You don't need a semicolon on your @using statements; perhaps this is what the browser is seeing?
So e.g.
@using Umbraco.Cms.Web;
can just be
@using Umbraco.Cms.Web
Same here
It looks like extra characters (whitespace) are placed right before the opening < of the doctype. I think it is an editor bug.
Try removing the opening "<", inserting it back, and saving the file. Also, the doctype should be the first line of the file.
The @ statements are translated to whitespace. The doctype is expected to be on the first line of the document. In this case the first line is blank, so the browser does not find the doctype where it expects it, which triggers quirks mode.
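To check which explanation applies, it may help to look at the mode the browser actually reports and at what the server really sends before the doctype. The sketch below assumes two hypothetical helpers, renderingMode and textBeforeDoctype, written for this illustration; document.compatMode is the standard DOM property they rely on.

```javascript
// Hypothetical helper: maps document.compatMode to a readable label.
// Browsers report "BackCompat" in quirks mode and "CSS1Compat" otherwise.
function renderingMode(doc) {
  return doc.compatMode === "BackCompat" ? "quirks" : "standards";
}

// Hypothetical helper: returns whatever precedes the doctype in the markup,
// or null when no doctype is present.
function textBeforeDoctype(markup) {
  const index = markup.search(/<!doctype/i);
  return index === -1 ? null : markup.slice(0, index);
}

// Example with made-up markup that starts with two blank lines.
const sample = "\r\n\r\n<!DOCTYPE html>\n<html></html>";

// JSON.stringify makes line breaks visible as escape sequences.
const leading = JSON.stringify(textBeforeDoctype(sample));
```

In a browser console, renderingMode(document) reports the mode of the current page, and applying textBeforeDoctype to the raw response text makes leading line breaks visible.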
