I do not know what I changed, but today I can no longer build my site's front page with jekyll successfully. It is now complaining about:
[2012-10-30 14:22:10] regeneration: 1 files changed
Liquid Exception: incompatible character encodings: UTF-8 and ASCII-8BIT in index.html
And I'm at a loss to resolve the issue. I believe it's being introduced via a loop of posts I create on the front page, index.html, where I include an excerpt from the last 'n' posts. I used file(1) against my _posts/ directory, and do have some mixture in there:
_posts/2012-08-10-canned-responses-your-silent-partner.md: UTF-8 Unicode English text, with very long lines
_posts/2012-08-21-alternate-ssh-for-osx.md: UTF-8 Unicode English text, with very long lines
_posts/2012-08-21-appus-interruptus.md: ASCII English text
_posts/2012-10-25-emoryfocuslight.md: ASCII English text
_posts/2012-10-28-distributed-social-networking-with-tent.md: ASCII English text, with very long lines
I'm not sure if this is my problem, though. I use vim and bbedit to edit these files, and they're stored in Dropbox (I build/stage in my Dropbox folder but publish elsewhere). Most of my writing/editing is done on OS X.
When I search for this error message I get a lot of hits about Rails applications or forcing Ruby gems to use a specific encoding. I don't know if that is relevant or would even help me. I would love to be pointed in the right direction or be told how to resolve this situation. It's a sad state of affairs!
A fix is to set the encoding in the Jekyll configuration file (_config.yml). Example:
encoding: utf-8
No mention of UTF fix
Additionally, you might need to change the code page of the console window to UTF-8 in case you get a “Liquid Exception: Incompatible character encoding” error during the site generation process. It can be done with the following command:
chcp 65001
(From the jekyll "Installation for Windows page": http://jekyllrb.com/docs/windows/)
I have been struggling with this same issue lately and finally found out the root cause.
I went through all the post files and noticed that the front matter in some old posts contained the following:
title: !binary | {mime encoded string}
Probably the WordPress migration script I had used encoded latin1 strings as !binary in YAML, and this caused the "incompatible character encodings" error in my case.
I replaced those with correct UTF-8 strings and all went smoothly after that.
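To find the posts affected by the problem described above, a small script can help. This is a minimal sketch in Python, not part of Jekyll: the function names check_post and check_posts are hypothetical, and it assumes posts live in _posts/ with an .md extension, as in the question. It flags files that are not valid UTF-8 and any line containing a YAML !binary tag (it searches the whole file, not only the front matter). Note that plain ASCII files are also valid UTF-8, so a mix of "ASCII" and "UTF-8" results from file(1) is not a problem in itself.

```python
from pathlib import Path


def check_post(path):
    """Return a list of encoding-related problems found in one post file."""
    problems = []
    raw = Path(path).read_bytes()
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"not valid UTF-8 (byte offset {exc.start})")
        # Keep going with a lossy decode so the !binary scan still runs.
        text = raw.decode("utf-8", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        if "!binary" in line:
            problems.append(f"line {number}: YAML !binary value")
    return problems


def check_posts(directory="_posts"):
    """Map each problematic file name to its list of problems."""
    report = {}
    for path in sorted(Path(directory).glob("*.md")):
        problems = check_post(path)
        if problems:
            report[path.name] = problems
    return report


if __name__ == "__main__":
    for name, problems in check_posts().items():
        print(name, "->", "; ".join(problems))
```

Run it from the site root. Any file it lists is a candidate for re-saving as UTF-8 or for replacing the !binary value with a plain string.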
This may sound daft, but reinstall Jekyll, then try to recompile your site.
I had a quick Google search. Bear in mind these results are not for Jekyll but for Ruby in general, so they are only similar.
ruby 1.9 + sinatra incompatible character encodings: ASCII-8BIT and UTF-8
http://www.ruby-forum.com/topic/206925
What have you set the markdown to in your _config?
We are using Qt 5.5 successfully throughout our VC++ projects in VS2015.
Now I am adding i18n to them, using Qt's Linguist tools to create the strings to be translated and the resulting .qm files. I load the files through a QTranslator object. The translation itself seems to work, but the strings are displayed wrongly.
As German is my mother tongue, I have to type several umlauts, besides any other special Unicode characters I definitely want to support.
As an example, I use Linguist to translate "over" to "über", and the resulting text in my application is displayed garbled instead of reading über, which I can surely recognize as an encoding mismatch.
I already had a look at the i18n example, which displays correctly for all of the provided languages, so right now I do not know what's wrong, having checked all the file encodings.
Does anyone have any ideas? Or have the same problem? Or had it and solved it? Any suggestions would be greatly appreciated!
This seems to be a Windows-specific problem.
Instead of using QString::toStdString() (which breaks the string), use QString::toLatin1(), at least for the languages supported so far.
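The kind of mismatch discussed here can be reproduced outside Qt. The following is a minimal sketch in Python (not Qt code), and the function names are hypothetical. It assumes the usual mechanism: text is encoded as UTF-8 by one component and then decoded as Latin-1 by another. Whether that is exactly what happens in the application above is an assumption.

```python
def misread_utf8(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair_mojibake(garbled, wrong_encoding="latin-1"):
    """Reverse misread_utf8: recover the original text from the garbled form."""
    return garbled.encode(wrong_encoding).decode("utf-8")


if __name__ == "__main__":
    garbled = misread_utf8("über")
    print(garbled)                   # the two UTF-8 bytes of "ü" shown as two Latin-1 characters
    print(repair_mojibake(garbled))  # back to the original string
```

If the garbled output in the application has two characters where the umlaut should be, that points to UTF-8 bytes being interpreted in a single-byte encoding somewhere between the .qm file and the display.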
For ecommerce web applications, I need to verify whether the correct currency symbol is displayed or not, depending on the country.
On the site below,
http://www.moltonbrown.co.uk/store/index.jsp
I am checking for the currency symbol in the basket summary.
In Selenium IDE, when I do verifyText for xpath //div/span[contains(text(),'£18.00')], it runs well. But, when I use the same xpath in selenium webdriver automation code and try to verify the element, it displays:
//div/span[contains(text(),'�18.00')]
Element not found
false
***************
//div/span[contains(text(),'�18.00')]
*************
***************not present***************
I saw the question "get currency symbol using php", but couldn't understand what should be done to overcome this.
Thanks in advance,
Suchitra
The fact that your currency symbol is turned into a � indicates there is a character set issue that is causing your problem. From the web-driver documentation, there is "limited" support of unicode by using UTF-8. Limited probably means "buggy".
In any case, you should verify that the file that runs this test is saved using UTF-8, and that the bytes for your currency symbol are correct for UTF-8. If this file is in some other character set, save it as UTF-8 and try again.
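To make the byte-level check concrete, here is a minimal sketch in Python (the question's test code is not shown, so this is an illustration and not a fix for it). The helper name build_price_xpath is hypothetical. It shows the UTF-8 bytes of the pound sign, what happens when a single-byte pound sign is read as UTF-8, and one way to sidestep the source-file encoding entirely by writing the symbol as a Unicode escape.

```python
POUND = "\u00a3"  # pound sign written as an escape, so this source file stays pure ASCII


def utf8_bytes(text):
    """Return the UTF-8 encoding of text."""
    return text.encode("utf-8")


def read_as_utf8(raw):
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


def build_price_xpath(amount, symbol=POUND):
    """Build the XPath from the question with an escaped currency symbol."""
    return f"//div/span[contains(text(),'{symbol}{amount}')]"


if __name__ == "__main__":
    print(utf8_bytes(POUND))               # two bytes: 0xC2 0xA3
    print(read_as_utf8(b"\xa318.00"))      # Latin-1 pound byte misread as UTF-8
    print(build_price_xpath("18.00"))
```

A lone 0xA3 byte is how Latin-1 and Windows-1252 store the pound sign. Read as UTF-8 it is invalid and becomes the replacement character, which matches the symptom in the output above.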
Here is somebody who had a similar problem with the ä character and the Python driver: "Sending unicode with Selenium Webdriver on python". They were able to solve the problem by changing the declared character set of their Python file.
When I try to compile the example from the front page of the go language website with the 6g compiler, I get this error:
hello.go:5: syntax error near "<string>"
A search on Google reveals that a few people have experienced this, but I have found no solution. The answer always seems to be: "It works for me, you must be doing something wrong".
I've found a description of the problem that dates back 5 months, so I suspect it's not a problem with the particular build of go that I'm using. Besides, I've tried pulling a newer version, and the problem persists.
The source code in question:
package main

import "fmt"

func main() {
	fmt.Println("Hello, 世界")
}
Btw, I'm saving the source code as UTF-8 with LFs for newlines, so it shouldn't be a text encoding issue. I've also tried different strings that don't contain "exotic" characters.
Try "which 6g".
You might be picking up an old build.
At least that was my case. I had an old 2009 build in my path.
After fixing the environment it worked.
Your special characters in there might cause conflicts with the compiler. Try to save this code in multiple ways using notepad (ANSI, UTF-8), and see whether the compiler will take any of them.
Problems like this are typical when there's an encoding issue.
If you're on Windows, an editor like Notepad++ can convert between many encoding formats, so I'd suggest converting your source to UTF-8 without BOM and then recompile.
If you're on Linux, there's a guide available showing you how to determine and change a document's encoding.
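To check for the byte order mark mentioned above without relying on an editor, a short script is enough. This is a minimal sketch in Python with hypothetical function names. It only detects and removes a UTF-8 BOM at the start of a file. Whether a BOM is actually the cause of the compiler error in the question is not established.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    return Path(path).read_bytes().startswith(codecs.BOM_UTF8)


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM in place; return True if one was removed."""
    raw = Path(path).read_bytes()
    if raw.startswith(codecs.BOM_UTF8):
        Path(path).write_bytes(raw[len(codecs.BOM_UTF8):])
        return True
    return False


if __name__ == "__main__":
    source = "hello.go"  # file name taken from the error message in the question
    if Path(source).exists():
        print("BOM present:", has_utf8_bom(source))
```

The rest of the file is left byte-for-byte unchanged, so this is safe to run on a source file that is already valid UTF-8.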
I'm using Hpricot to read HTML and I got a segmentation fault. I googled it, and some say to upgrade to the latest version of Ruby. I am using Rails 2.3.2 and Ruby 1.8.7. How can I resolve this error?
I was trying to parse html pages with many unicode characters in them and Hpricot kept crashing. Finally, I used the monkey patch from sanitize and put it in the environment.rb for my rails application. There hasn't been a single crash since I added this patch:
http://github.com/rgrove/sanitize/blob/1e1dc9681de99e32dc166f591343dfa60fc1f648/lib/sanitize/monkeypatch/hpricot.rb
If you're free to choose your HTML parsing library, switch it.
Why, the creator of Hpricot, recently posted that you should use Nokogiri instead of Hpricot nowadays.
You may also have a look at HTTParty.
On ruby 1.8.5 try using hpricot -v 0.6.161
That worked for me.
From memory, since I last used it about a year ago:
Hpricot stores attributes in a fixed-size buffer, and some frameworks generate outrageously long hashes in document attributes. There's some static field you can set before parsing that lets you set the size of this buffer.
I remember it being fairly prominent in the docs on the webpage, though of course it's gone now.
Well, based on your own question, I'd say "Upgrade to the latest version of Ruby". However, I've also had problems with hpricot segfaulting, which seemed to be related to my usage of threading.
This appears to be an outstanding issue on the bug list. I have experienced it too. My theory is that it has to do with the HTML structure or a bad/corrupt character in the file, but I have not found where exactly.
Here are the links to the issues:
http://github.com/why/hpricot/issues/#issue/10
http://github.com/why/hpricot/issues/#issue/4
I'm having the same segfault issue but sadly can't consult the issues Dave cited above, even via Google cache. From what I've been googling, the parse.rb segfaults have to do with encoded entities or alternative character sets (accented characters, perhaps).
The sanitize lib encountered the same issue and posted a monkeypatch here:
http://github.com/rgrove/sanitize/blob/1e1dc9681de99e32dc166f591343dfa60fc1f648/lib/sanitize/monkeypatch/hpricot.rb
Does anyone know of a library that I can use on OS X/Linux to parse Word files and output the content as HTML?
I've had a look at win32ole but as far as I can see it's for Windows only, although I could be wrong.
Any suggestions?
The Word document format (ignoring docx for the moment) is terrible and was constantly changing. IMHO that is why there are so few (read: zero) Ruby libraries out there to parse them.
What I recommend doing is using JRuby and some of the established Java libraries for reading the doc format. Google should help you out there: http://schmidt.devlib.org/java/libraries-word.html.
There is a Java project for reading Microsoft file formats, POI (http://poi.apache.org/), and they do have Ruby bindings (http://poi.apache.org/poi-ruby.html), but I'm not sure how up-to-date those are. On their site it says the Ruby bindings are for 1.8.2...