So I have this doubly encoded UTF-8 file. eg.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>test</title>
</head>
<body>
<p>this is a “testâ€Â</p>
</body>
</html>
URL: http://www.frostjedi.com/terra/scripts/demo/utf8-1.html
If, in Firefox, I view the source and then copy / paste it into a new file I've effectively undone the double encoding. eg.
http://www.frostjedi.com/terra/scripts/demo/utf8-2.html
My question is... how can I do this via the CLI?
I tried this:
iconv -f UTF-8 -t ISO-8859-1 utf8-1.html > utf8-3.html
But got this:
iconv: illegal input sequence at position 294
Any ideas?
Try Windows-1252 instead of ISO-8859-1 as the target encoding (-t WINDOWS-1252).
This is the difference between Latin-1 (ISO-8859-1) and Windows Latin-1 (Windows-1252). All browsers, including those on Mac and Linux, treat content labelled ISO-8859-1 as Windows-1252, because Windows-1252 assigns printable characters to the 0x80-0x9F range.
There is no guarantee that this resolves everything, though.
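The same round trip can be sketched in Python, which makes the Windows-1252 step explicit. This is only an illustration, not part of the original answer: the function name undo_double_encoding is made up, and the Latin-1 fallback for the bytes that Windows-1252 leaves undefined (such as 0x9D) is an assumption about how the file was mangled.

```python
def undo_double_encoding(text: str) -> str:
    """Reverse one round of 'UTF-8 bytes misread as Windows-1252'."""
    raw = bytearray()
    for ch in text:
        try:
            # Map each character back to the single byte it was read from.
            raw += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Assumption: bytes undefined in cp1252 (0x81, 0x8D, 0x8F, 0x90,
            # 0x9D) survived as C1 control characters, so Latin-1 recovers them.
            raw += ch.encode("latin-1")
    # The recovered bytes are the original UTF-8 sequence.
    return raw.decode("utf-8")


if __name__ == "__main__":
    mangled = "\u00e2\u20ac\u0153test\u00e2\u20ac\u009d"
    print(undo_double_encoding(mangled))
```

Text that was never double encoded, or that contains characters outside both code pages, raises an exception here rather than being silently altered.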
I need to implement something similar to this answer
https://stackoverflow.com/a/41749105/1004374
but I have several issues.
I changed it slightly to be able to pass arguments in the URL:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>openie</title>
</head>
<body>
<h1>Hello world!</h1>
Google1
Google2
</body>
</html>
and changed reg script:
Windows Registry Editor Version 5.00
[HKEY_CURRENT_USER\Software\Classes\openie]
"URL Protocol"="\"\""
@="\"URL:OPENIE Protocol\""
[HKEY_CURRENT_USER\Software\Classes\openie\DefaultIcon]
@="\"explorer.exe,1\""
[HKEY_CURRENT_USER\Software\Classes\openie\shell]
[HKEY_CURRENT_USER\Software\Classes\openie\shell\open]
[HKEY_CURRENT_USER\Software\Classes\openie\shell\open\command]
@="cmd /k set myvar= & call set myvar=\"%1\" & call set myvar=%%myvar:openie:=%% & call \"C:\\Program Files (x86)\\Internet Explorer\\iexplore.exe\" %%myvar%% & exit /B"
The only change is that the %1 argument is now quoted:
myvar=\"%1\"
This is needed to pass arguments containing &. Otherwise the URL is copied only up to the first ampersand:
openie:https://www.google.com/?word=abc&word2=abc2
Everything works the first time you click the link. But when IE is already open, the URL is passed incorrectly, with encoded quotes inside it and http automatically added at the beginning:
http://%22https//www.google.com/?word=abc&word2=abc2"
I realize the issue is in the cmd script, but I cannot work out what to change so that arguments are passed correctly and links can be clicked repeatedly.
I have not found a good way to modify the script so that it accepts '&'. As a workaround, I suggest encoding the URL and changing each '&' to '%26', as in the link below:
Google2
Then, in the destination page, decode the URL, change '%26' back to '&', and split the string to get the parameters.
For more details, refer to HTML URL Encoding.
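The encode/decode round trip described above can be sketched in Python as follows. The helper names make_link and read_link are hypothetical, and the sketch only covers how the string is transformed, not how the destination page receives it.

```python
from urllib.parse import parse_qs, quote, unquote, urlsplit

SCHEME = "openie:"  # custom protocol prefix from the registry script


def make_link(url: str) -> str:
    """Percent-encode '&' (and other unsafe characters) so cmd never sees it."""
    return SCHEME + quote(url, safe=":/?=")


def read_link(link: str) -> dict:
    """Strip the prefix, decode '%26' back to '&', and split the parameters."""
    url = unquote(link[len(SCHEME):])
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    link = make_link("https://www.google.com/?word=abc&word2=abc2")
    print(link)
    print(read_link(link))
```

Decoding must happen on the receiving side, since the browser treats the encoded '%26' as part of a single parameter value until it is converted back.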
I want to set up a shortcut key in Vim (like the one in PhpStorm) that re-indents all nested parent/child blocks (tags) in HTML or other languages.
For example:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title></title>
</head>
<body>
</body>
</html>
When I press Ctrl+Alt+Z, my *.html file should be re-indented like this:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title></title>
</head>
<body>
</body>
</html>
Is it possible? If so, what should I add to my .vimrc?
I remember that PhpStorm had a shortcut key for this.
Open the file in Vim and try the following. gg jumps to the first line, and =G re-indents from there to the last line.
" if your file is already named foo.html, this step can be skipped
:set ft=html (press Enter)
gg=G
The --margin-top option is for the contents margin, but I would like to set the margin from the top of the page to the header. The project I'm working on allows users to create header and footer themselves, so the height of the header or footer is dynamic.
I don't know how to do it so can anyone help?
The built-in options that affect the top margin are:
--margin-top (as you mentioned above), and
--header-spacing, the spacing between header and content in mm (see http://wkhtmltopdf.org/usage/wkhtmltopdf.txt).
Neither will probably help you, as there is no option (at least to my knowledge) that explicitly sets a margin from the top of the page to the header. However, in your case, you could explore --header-html <url> and add an HTML header. This takes an HTML file in which you can define the custom header and add space or margin as needed; that HTML is then rendered as the page header.
Use -T -B -L and -R for margins.
wkhtmltopdf -B 13 -L 13 -R 13 -T 53 /tmp/e0cb9c4597860b5abfbf2bafc1000d5a.html /tmp/e0cb9c4597860b5abfbf2bafc1000d5a.pdf
-T 10 works when an empty HTML file is supplied as the header:
wkhtmltopdf.exe -T 10 --header-html header.html content.html generatedpdf.pdf
The empty HTML file:
<!DOCTYPE html>
<html lang="en">
<head>
<META http-equiv="Content-Type" content="text/html; charset=utf-16">
</head>
<body >
</body>
</html>
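Since the header height is dynamic in this project, the command line probably has to be assembled per document. Below is a minimal sketch in Python that only builds the argument list from the options mentioned in these answers (-T, --header-html, --header-spacing). The function name and its parameters are hypothetical.

```python
def build_wkhtmltopdf_args(content, output, header_html=None,
                           top_mm=10, header_spacing_mm=None):
    """Return an argument list for wkhtmltopdf; nothing is executed here."""
    args = ["wkhtmltopdf", "-T", str(top_mm)]
    if header_html is not None:
        # HTML file rendered as the page header
        args += ["--header-html", header_html]
    if header_spacing_mm is not None:
        # spacing between header and content, in mm
        args += ["--header-spacing", str(header_spacing_mm)]
    args += [content, output]
    return args


if __name__ == "__main__":
    print(build_wkhtmltopdf_args("content.html", "generatedpdf.pdf",
                                 header_html="header.html"))
```

The resulting list can be passed to subprocess.run(args, check=True) once wkhtmltopdf is on the PATH.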
I am using Pandoc to write the contents of a site. How do I include meta tags (specifically, description and keywords tags) on a document, without changing the command line arguments passed to Pandoc?
I mean, can I include meta tags somehow in the document text? I don't want to pass command line options, because there are several different pages, with different keywords, that I'd like to send to Pandoc from within Emacs, and customizing each of them would be a problem.
I've found that adding the -s (--standalone) option to the pandoc command, or --self-contained, which implies it, allows header contents to be defined per file in YAML at the top.
For example:
$ cat foo.md
---
title: Foo
header-includes:
    <meta name="keywords" content="Foo,Bar" />
    <meta name="description" content="My description" />
---
# Bar #
Baz
$ pandoc -s -o foo.html foo.md
$ cat foo.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title>Foo</title>
<style type="text/css">code{white-space: pre;}</style>
<meta name="keywords" content="Foo,Bar" /> <meta name="description" content="My description" />
</head>
<body>
<div id="header">
<h1 class="title">Foo</h1>
</div>
<h1 id="bar">Bar</h1>
<p>Baz</p>
</body>
</html>
Ok -- so option 1 suggested by David Cain seems like a reasonably easy solution. My implementation of it is a bit ugly, but works:
First, use YAML headers with a field name ending in underscore to add a header line. The Pandoc manual says that these identifiers will be ignored.
---
head_: <meta name="description" content="x is super cool">
head_: <meta name="keywords" content="cool,cold,temperature,super things">
---
Make Emacs search for it in the current buffer and save the line to a file.
(defvar my-markdown-header-file "head.html")
(defun my-markdown-add-headers ()
  (if (file-exists-p my-markdown-header-file)
      (delete-file my-markdown-header-file))
  (append-to-file "" nil my-markdown-header-file)
  (save-excursion
    (goto-char 1)
    (while (re-search-forward "head_:" nil t)
      ;; get the first and last positions:
      (let ((start (point))
            (end (progn (end-of-line) (point))))
        ;; include this line, and a newline after it:
        (append-to-file start end my-markdown-header-file)
        (append-to-file "\n" nil my-markdown-header-file)))))
(add-hook 'markdown-before-export-hook 'my-markdown-add-headers)
(My elisp ability is not that great, so there probably are better ways of writing this)
Finally -- use pandoc -s -H head.html as markdown command in Emacs markdown-mode.
Thanks to David Cain for suggesting the -H option!
edit: as a bonus, we get to include anything in the headers, including favicons!
head_: <link rel="icon" type="image/x-icon" href="favicon.ico" />
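For readers who drive Pandoc from a script rather than from Emacs, the same extraction step can be sketched in Python. This is only an illustration of the approach above, not a replacement for the elisp. The function names are hypothetical, and unlike the elisp version this sketch strips the whitespace after head_:.

```python
def extract_head_lines(markdown_text, marker="head_:"):
    """Collect whatever follows the marker on each line that contains it."""
    found = []
    for line in markdown_text.splitlines():
        pos = line.find(marker)
        if pos != -1:
            found.append(line[pos + len(marker):].strip())
    return found


def write_header_file(markdown_text, path="head.html"):
    """Overwrite the header file with one extracted line per row."""
    with open(path, "w", encoding="utf-8") as fh:
        for line in extract_head_lines(markdown_text):
            fh.write(line + "\n")


if __name__ == "__main__":
    sample = ('---\n'
              'head_: <meta name="description" content="x is super cool">\n'
              '---\n'
              '# Title\n')
    print(extract_head_lines(sample))
```

The file written by write_header_file is what would then be passed to pandoc -s -H head.html.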
The standard way to insert meta-tags with Pandoc would be to use the -H option.
Using -H/--include-in-header:
input.md:
### Header
Body text
header.html:
<meta name="description" content="My dummy HTML page">
Create file:
pandoc -s input.md -o out.html -H header.html
Templates
However, if you're averse to using command-line arguments, you can make use of Templates. Pandoc builds your document by inserting parsed data into a pre-defined template:
$ pandoc -D html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"$if(lang)$ lang="$lang$" xml:lang="$lang$"$endif$>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
$for(author-meta)$
<meta name="author" content="$author-meta$" />
... (and so on)
There's no reason why you couldn't modify the template to suit your needs. Pandoc already extracts some useful meta-data (such as the author) from the body of your document. You could modify it to do the same with some custom meta-tags. This would obviously involve modifying the source of Pandoc.
An alternate solution (that wouldn't involve any Haskell coding) would be to make Emacs parse out meta-data from each file, before passing the remainder to Pandoc for rendering. This leaves two reasonable approaches:
1. Allow a section for an HTML header in each document, extract this section and automatically insert it with -H.
2. Define your own format for meta-data: write a template to place meta-data accordingly, extract this meta-data, then pass variable values with -V.
I'd go with option 1, as it's far simpler.
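For comparison, option 2 might look roughly like the Python sketch below, which turns already-extracted meta-data into -V arguments. It assumes a custom template that contains matching placeholders such as $description$ and $keywords$. Both the template and the function name are hypothetical.

```python
def pandoc_variable_args(metadata):
    """Turn a dict of meta-data into repeated -V KEY=VALUE arguments."""
    args = []
    for key, value in metadata.items():
        args += ["-V", f"{key}={value}"]
    return args


if __name__ == "__main__":
    meta = {"description": "My dummy HTML page", "keywords": "Foo,Bar"}
    # Prepend the usual options; nothing is executed in this sketch.
    print(["pandoc", "-s", "input.md", "-o", "out.html"]
          + pandoc_variable_args(meta))
```

This still leaves the extraction and the template to write, which is why option 1 is the simpler route.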
Conclusion
To my knowledge, there's no easy way to do this without some degree of modification (whether it be a custom Emacs routine, modifying the Pandoc source, or other scripting). The easiest way I can think of would be to automatically extract some section of raw HTML and insert it into the final document with -H.
I am having problems with my magento site. I am not the only developer that has worked on it and am really confused by the problem as I do not use Magento much.
The problem is that the title tag for all of my pages and products is exactly the same: it is whatever I have entered in config > design > html head > default title, e.g. 'welcome to blabla'. If I remove the default title, no title tag is displayed at all.
I have given all of my products meta titles that I want used... but it is not picking them up.
My head file has not been altered and shows:
<title><?php echo $this->getLayout()->getBlock('breadcrumbs')->toHtml(); ?></title>
<meta http-equiv="Content-Type" content="<?php $this->getLayout()->getBlock('breadcrumbs')->toHtml(); ?>" />
<meta name="description" content="<?php echo htmlspecialchars($this->getDescription()) ?>" />
<meta name="keywords" content="<?php echo htmlspecialchars($this->getKeywords()) ?>" />
<meta name="robots" content="<?php echo htmlspecialchars($this->getRobots()) ?>" />
If I change title to something like test, it does pick up the change, so my head file is working.
I really need some help with this. I am on version 1.4.2 and can't really upgrade.
Thanks!
I think you might have the wrong echo statement in that snippet. Right now, it is trying to render the block with a name of 'breadcrumbs'. If that block only contains one title, then it would show up with one title on all pages.
Maybe try replacing this:
<title><?php echo $this->getLayout()->getBlock('breadcrumbs')->toHtml(); ?></title>
With this:
<title><?php echo $this->getTitle() ?></title>