I tried to post characters such as "āēīū" from a form in a PDF viewer to a web server, but on the server side those characters look like this: "????". What can I do to pass these characters as UTF-8?
Submit the form as XML or XFDF.
The encoding is not preserved when the form is submitted as HTML or FDF.
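On the server, an XFDF post arrives as an XML document that declares its own encoding, which is presumably why the characters survive. A minimal Python sketch of reading such a body, assuming the usual XFDF layout (fields/field/value in the http://ns.adobe.com/xfdf/ namespace). The function name read_xfdf_fields and the field name "title" are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Namespace used by XFDF documents (assumed layout: xfdf/fields/field/value).
XFDF_NS = {"x": "http://ns.adobe.com/xfdf/"}

def read_xfdf_fields(body: bytes) -> dict:
    """Return {field name: value} for the top-level fields of an XFDF body."""
    # Parsing from bytes lets the parser honour the encoding in the XML declaration.
    root = ET.fromstring(body)
    fields = {}
    for field in root.findall("x:fields/x:field", XFDF_NS):
        value = field.find("x:value", XFDF_NS)
        fields[field.get("name")] = value.text if value is not None else None
    return fields

# Hypothetical request body, as a PDF viewer might post it.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">'
    '<fields><field name="title"><value>āēīū</value></field></fields>'
    '</xfdf>'
).encode("utf-8")

print(read_xfdf_fields(sample))
```

The important detail is that the raw request bytes go to the XML parser untouched, rather than being decoded to a string with some default charset first.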
I'm sending Base64-encoded images in a Markdown newsletter to different email services from a rich text editor. Every service renders the images properly except Gmail, which displays the Base64 string instead:
<img src="...
The main SO thread regarding this problem does not provide a solution, as can be seen in the comments of the accepted answer.
How does one display images from a data string in Gmail? Is it possible to insert a transformation layer to make it work? (I can't believe Gmail doesn't support this after 6 years.)
Gmail does not support embedded Base64 images (see Can I email). I believe this is for general security reasons. You need to either generate your image server side or send it as an attachment (as in the other post you mentioned).
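As an illustration of the attachment route, here is a minimal Python sketch using the standard library's email package. The addresses, subject, file name and the function name build_message_with_image are placeholders, and actually sending the message (for example with smtplib) is left out:

```python
from email.message import EmailMessage

def build_message_with_image(image_bytes: bytes, filename: str = "image.png") -> EmailMessage:
    """Build a message that carries the image as an attachment, not as a data URI."""
    msg = EmailMessage()
    msg["Subject"] = "Newsletter"            # placeholder
    msg["From"] = "sender@example.com"       # placeholder
    msg["To"] = "recipient@example.com"      # placeholder
    msg.set_content("The image is attached to this message.")
    # The library handles the Base64 transfer encoding of the binary part.
    msg.add_attachment(image_bytes, maintype="image", subtype="png", filename=filename)
    return msg

message = build_message_with_image(b"\x89PNG placeholder bytes")
print(message["Subject"], len(list(message.iter_attachments())))
```

The image still travels Base64-encoded inside the MIME message, but as a separate part rather than inside an img src attribute.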
Can anybody tell me why Facebook doesn't scrape my page, and why the debug/linter tools can't scrape it either? I've searched and searched and can't find a way to fix it.
As far as I can tell all the og:tags and scripts are implemented correctly.
The page is at http://www.coincident.dk
The debug url is this: http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fwww.coincident.dk
It looks to me like Facebook is scraping your page successfully, just not fully. Weird.
I would try moving the OG meta tags lower in the header, at the very least after your content-type meta tag.
These problems might be due to character encoding issues. If the Facebook scraper is relying on the content-type tag to know the encoding is UTF-8, then it might not be reading your OG tags correctly.
I hope that helps!
I have a page in JSP, which has a tag like:
<img src="images/1.bmp"></img>
The 1.bmp is like:
But the image displayed on my page, when visited with Firefox, looks like this:
What should I do to fix this problem?
I've converted the images which you uploaded into your question back to BMP and investigated their source. Everywhere where a non-ISO-8859-1 character appears in the original source, a ? appears in the malformed source.
This means that you have a servlet on /images/* which uses response.getWriter() to write the image using the platform default charset. You shouldn't do that. BMP files are not text files; they are binary files. You should be using response.getOutputStream() to write binary data. You can find a basic and proper example of an image servlet in this article.
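The effect can be reproduced outside a servlet. The Python sketch below pushes a few binary bytes through a decode/encode round trip, which is roughly what happens when binary data is copied through character streams. The two charsets (cp1252 for reading, ISO-8859-1 for writing) are assumptions picked for illustration, not something known about the asker's server:

```python
def through_text_writer(data: bytes, read_charset: str, write_charset: str) -> bytes:
    """Simulate copying binary data through character streams instead of byte streams."""
    # Bytes are first interpreted as characters...
    text = data.decode(read_charset, errors="replace")
    # ...and characters the target charset cannot represent become '?'.
    return text.encode(write_charset, errors="replace")

# "BM" (the BMP magic number) followed by a few arbitrary binary bytes.
original = bytes([0x42, 0x4D, 0x80, 0x99, 0xFF, 0x00])
damaged = through_text_writer(original, "cp1252", "iso-8859-1")

print(original)  # b'BM\x80\x99\xff\x00'
print(damaged)   # b'BM??\xff\x00'
```

Bytes 0x80 and 0x99 decode to characters that do not exist in ISO-8859-1, so they come out as question marks. Writing the bytes directly, as response.getOutputStream() does, skips the character step entirely.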
I have a multi language website that is hosted on a server that appears to have character encoding set to default to iso-8859-1.
I thought it would be best to have the pages in UTF-8, so I included a meta tag to declare this. Unfortunately this meta tag seems to get overridden and the page defaults to ISO-8859-1.
Many special characters in the German and Dutch pages are not showing correctly.
Do I need to try and change the server default to UTF-8 or something? Maybe I could remove the server default completely? Hmm... really not sure what's best to do here.
Any advice would be great!
The HTML meta tags for the content type are not used when the HTML page is served over HTTP. Instead, the content type header in the HTTP response will be used. You can inspect the content type header with, for example, Firebug, in the Net panel.
How to change this depends on the programming language and/or the webserver which you are using, which is unclear from your current question. As per your question history, you seem to be using PHP. In that case, you need to add the following line to the PHP file, before you emit any character to the response.
header('Content-Type: text/html; charset=UTF-8');
See also:
PHP UTF-8 cheatsheet
If you're unable to change the HTTP response header, you have to give more detail about the programming language and webserver which you're using. This way we can give you better suited answers.
If you want to stick to ISO-8859-1, then you need to ensure that your pages are saved as ISO-8859-1 as well instead of as UTF-8. Otherwise some characters may indeed go mojibake when you display a UTF-8 saved resource as ISO-8859-1.
There are several possible solutions, but the cleanest solution would be to properly declare your character encoding.
When serving web pages from an HTTP server, the encoding is normally not given by the meta-tags of the HTML file, but by the Content-type HTTP header.
The webserver is probably sending something like Content-type: text/html; charset=ISO-8859-1, and you need to change that.
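To check what charset a given header value actually declares, a small Python sketch along these lines can help. The function name charset_from_content_type is made up; the parsing itself is done by the standard library's email package:

```python
from email.message import Message

def charset_from_content_type(header_value: str):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()

print(charset_from_content_type("text/html; charset=ISO-8859-1"))  # iso-8859-1
print(charset_from_content_type("text/html; charset=UTF-8"))       # utf-8
print(charset_from_content_type("text/html"))                      # None
```

For a live page, the response object returned by urllib.request.urlopen exposes the same method via response.headers.get_content_charset(), which shows what the server really sends.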
How to do this depends on the webserver.
As an addition: Yes, iso-8859-1 is fine for German; it will work for all western European languages. It is missing a few characters, however, notably the Euro sign (that is in iso-8859-15). But using UTF-8 is better, as it covers just about every language.
You can see the supported characters and the languages they should cover in this Wikipedia article. According to that, German is fully supported and Dutch almost.
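The point about the Euro sign is easy to verify. A minimal Python sketch (the helper name can_encode is made up for illustration):

```python
def can_encode(text: str, charset: str) -> bool:
    """Report whether every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

# Columns: charset, Euro sign representable?, u-umlaut representable?
for charset in ("iso-8859-1", "iso-8859-15", "utf-8"):
    print(charset, can_encode("€", charset), can_encode("ü", charset))
```

Only the first row reports False for the Euro sign; the umlaut is representable in all three.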
It's not just a matter of selecting the correct character encoding; you also have to save the pages using that encoding. If you save a page as ISO-8859-1 and use a content type that says that it's UTF-8, then it will be decoded incorrectly by the browser. Both ISO-8859-1 and Unicode support the characters you need, but you have to make sure that the content type corresponds to how the pages are actually saved.
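A minimal Python sketch of that mismatch, with a made-up helper render_as that encodes text the way the file is saved and decodes it the way the content type claims:

```python
def render_as(text: str, saved_as: str, declared_as: str) -> str:
    """Encode text as the file is saved, then decode it as the server declares it."""
    return text.encode(saved_as).decode(declared_as, errors="replace")

# Saved as UTF-8 but declared as ISO-8859-1: classic mojibake.
print(render_as("für", "utf-8", "iso-8859-1"))   # fÃ¼r
# Saved and declared consistently: the text survives.
print(render_as("für", "utf-8", "utf-8"))        # für
print(render_as("für", "iso-8859-1", "iso-8859-1"))  # für
```

Either encoding works for this word; what breaks it is the disagreement between the saved bytes and the declared charset.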
Some websites have images with strange URLs.
For example, the PNG image below (strange URL):

What type of URL is this, and how does it work on that website?
Thanks
data URI scheme
That's an embedded image whose binary contents were encoded with Base64 so that it fits in a "normal" String inside an HTML page. In other words, that's an embedded image. Also see this: http://www.w3.org/TR/xhtml-print/#s.4.1.2
However, not all browsers support it, and it's also not really efficient. The image is now tied to the parent (X)HTML page and you cannot control its request or (caching) headers separately. It's only useful when transferring images or any other binary data through real XML files.
As everyone mentioned, it is the data URL scheme. It was specified in RFC 2397 back in 1998 and has the following syntax:
data:[<mediatype>][;base64],<data>
IE 5 through 7 do not support it; other standards-compliant browsers such as Firefox, Safari, Opera and Chrome do support data URIs. Workarounds are available for older browsers such as IE.
Just a side note, you can generate it with one line of PHP:
<?php echo 'data:image/gif;base64,' . base64_encode(file_get_contents("yourimage.gif")); ?>
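The same idea as a minimal Python sketch, covering both directions. The function names to_data_uri and from_data_uri are made up, and the parser only handles the Base64 form of the RFC 2397 syntax:

```python
import base64

def to_data_uri(data: bytes, media_type: str) -> str:
    """Build a data URI of the form data:<mediatype>;base64,<data>."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{payload}"

def from_data_uri(uri: str):
    """Split a Base64 data URI back into (media type, raw bytes)."""
    header, payload = uri.split(",", 1)
    if not (header.startswith("data:") and header.endswith(";base64")):
        raise ValueError("not a base64 data URI")
    media_type = header[len("data:"):-len(";base64")]
    return media_type, base64.b64decode(payload)

uri = to_data_uri(b"hello", "text/plain")
print(uri)                 # data:text/plain;base64,aGVsbG8=
print(from_data_uri(uri))  # ('text/plain', b'hello')
```

For an image, the bytes would come from reading the file in binary mode, for example open("yourimage.gif", "rb").read(), with image/gif as the media type.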
It appears you are looking at a URI specified using the data URI scheme.
In this case I believe the PNG data is encoded directly into the URI.