Character cannot be represented in character set in CLISP (2.49) - windows

I'm trying to use CLISP on Windows. When I start it from the command line, I see the following:
*** - SYSTEM::DRIVER: Character #\u0414 cannot be represented in the character set CHARSET:cp437
Break 1 [3]>
How can I fix this?

This is an FAQ:
What do these error messages mean: “invalid byte #x94 in CHARSET:ASCII conversion” and “character #\u00B3 cannot be represented in the character set CHARSET:ASCII”?
This means that you are trying to read (“invalid byte”) or write (“character cannot be represented”) a non-ASCII character from (or to) a character stream which has ASCII :EXTERNAL-FORMAT. The default is described in -Edomain encoding.
This may also be caused by filesystem access. If you have files with names incompatible with your CUSTOM:*PATHNAME-ENCODING*, filesystem access (e.g., DIRECTORY) will SIGNAL this ERROR. You will need to set CUSTOM:*PATHNAME-ENCODING* or pass -Edomain encoding to CLISP. Using a “1:1” encoding, such as CHARSET:ISO-8859-1, should help you avoid this error.
Note that this error may be signaled by the “Print” part of the read-eval-print loop and not by the function you call. E.g., if file "foo" contains non-ASCII characters, you will see such an error when you type
(WITH-OPEN-FILE (s "foo"
                   :direction :input
                   :EXTERNAL-FORMAT CHARSET:ISO-8859-1)
  (READ-LINE s))
If instead you type
(WITH-OPEN-FILE (s "foo"
                   :direction :input
                   :EXTERNAL-FORMAT CHARSET:ISO-8859-1)
  (SETQ l (READ-LINE s))
  NIL)
CLISP will just print NIL and signal the error when you type l.
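The difference between ASCII and a "1:1" encoding such as ISO-8859-1 is not specific to CLISP. As a minimal sketch in Python (an illustration outside CLISP; the byte value is the one from the FAQ's first error message, and can_decode is a hypothetical helper name), byte #x94 is rejected by ASCII but accepted by ISO-8859-1, which maps every byte to a character and back:

```python
# Byte #x94, taken from the FAQ's "invalid byte" message.
data = bytes([0x94])

def can_decode(raw, encoding):
    """Return True if the bytes are valid in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

print(can_decode(data, "ascii"))       # ASCII only defines bytes 0x00-0x7F
print(can_decode(data, "iso-8859-1"))  # ISO-8859-1 defines all 256 byte values

# Every possible byte survives a decode/encode round trip in ISO-8859-1.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("iso-8859-1").encode("iso-8859-1")
print(round_trip == all_bytes)
```

This is why a 1:1 encoding avoids the error: no byte sequence is invalid in it, although the characters you get back may not be the ones the file's author intended.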

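cp437 indicates a code page. Code page 437 is the original IBM PC (OEM-US) code page: an 8-bit extension of ASCII that contains no Cyrillic letters, so #\u0414 (Д) cannot be written in it. It seems that you need to configure your command line to use Unicode, or at least a code page that contains that character.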
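To check which encodings can represent the character from the error message, here is a small Python sketch (an illustration outside CLISP; cp866 and UTF-8 are only example alternatives chosen for this sketch, and can_encode is a hypothetical helper name):

```python
# CYRILLIC CAPITAL LETTER DE, the character CLISP complained about.
ch = "\u0414"

def can_encode(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Report, for each candidate encoding, whether it contains the character.
for enc in ("ascii", "cp437", "cp866", "utf-8"):
    print(enc, can_encode(ch, enc))
```

Any encoding for which this reports a failure will produce the same kind of "cannot be represented" error when a program tries to print the character to a stream that uses it.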

Related

Un-Escaping double quotes in a Common Lisp string

To build a command string for uiop:run-program, I need a string which contains other strings (as required by the command in question). However, the command (FontForge's legacy scripting language) requires that some strings are enclosed in double quotes, e.g.:
"fontforge -lang=ff -c 'Print("A Doublequote String")'"
How can I get such a string with the "A Doublequote String" being literally part of the command-string, i.e. no doublequote being escaped?
Update
To be more concrete, the command I want to send to uiop:run-program is fontforge -lang=ff -c 'Open($1);SelectAll();foreach Print(GlyphInfo("Name")); endloop' haydn-11.svg, where the "Name" argument to GlyphInfo should be written with double quotes. Using backslash escape characters to preserve the quotes
(uiop:run-program
 (format nil
         "fontforge -lang=ff -c 'Open($1);SelectAll();foreach Print(GlyphInfo(\"Name\")); endloop' haydn-11.svg")
 :output t)
the subprocess command exits with error code 1:
Subprocess with command "fontforge -lang=ff -c 'Open($1);SelectAll();foreach Print(GlyphInfo(\"Name\")); endloop' haydn-11.svg"
exited with error code 1
[Condition of type UIOP/RUN-PROGRAM:SUBPROCESS-ERROR]
I suppose the command is seeing the escape characters too, and that is why it fails to perform, since otherwise the command is syntactically correct.
Use Single Escape Character:
Backslash is a single escape character in standard syntax.
I.e., what you are looking for is
"fontforge -lang=ff -c 'Print(\"A Doublequote String\")'"
Note that by default *print-escape* is t, i.e., the above string will be printed with backslashes even though the string itself does not contain them:
(defparameter s (string #\"))
s
==> "\""
(length s)
==> 1
(char s 0)
==> #\"
I don't have fontforge, but I can do the equivalent. First of all the way you are doing it is not one, but two levels of language-in-a-string: the shell language is in a string in CL, and whatever fontforge language is in a string in the shell language. So let's minimise that by avoiding the whole shell language altogether. We still have one level of language-in-a-string, but that's a whole lot better than two.
I also don't understand why you're using (format nil <fixed string>) to make ... a fixed string. I'm guessing that eventually you're intending to replace the filename using format, but this is not needed since we're no longer going to use the shell at all.
So instead, do this
(defun runit (file)
  (uiop:run-program
   (list "echo" ; because I don't have fontforge
         "fontforge" "-lang=ff" "-c"
         "Open($1); SelectAll(); foreach Print(GlyphInfo(\"Name\")); endloop"
         file)
   :output t
   :force-shell nil))
Note that the command here is echo: because I don't have fontforge I'll just get echo to print the command line. I've also added some spaces into the big chunk of whatever-language-fontforge-uses to make it clearer.
So now
> (runit "foo.svg")
fontforge -lang=ff -c Open($1); SelectAll(); foreach Print(GlyphInfo("Name")); endloop foo.svg
nil
nil
0
And this is fine, although it's not clear what the individual arguments are because echo doesn't do that. Well, I have a little utility called argv whose whole job is to print its argv clearly, so rewriting runit as
(defun runit (file)
  (uiop:run-program
   (list "argv" ; because I don't have fontforge
         "fontforge" "-lang=ff" "-c"
         "Open($1); SelectAll(); foreach Print(GlyphInfo(\"Name\")); endloop"
         file)
   :output t
   :force-shell nil))
We get
> (runit "foo.svg")
"fontforge"
"-lang=ff"
"-c"
"Open($1); SelectAll(); foreach Print(GlyphInfo("Name")); endloop"
"foo.svg"
nil
nil
0
Note that argv isn't smart enough to escape the quotes inside the long argument: it's just a very tiny Perl script I use for debugging things.
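If you don't have such an argv utility, the same check can be sketched in Python. This is an illustrative stand-in, not the Perl script mentioned above: CHILD and received_argv are hypothetical names, and the child process is just the Python interpreter printing its own arguments. The point is the same as in the Lisp version: with an argument list and no shell, the double quotes reach the program untouched.

```python
import json
import subprocess
import sys

# Child program: dumps the arguments it received as a JSON list.
CHILD = "import json, sys; print(json.dumps(sys.argv[1:]))"

def received_argv(args):
    """Run a child process without a shell and return the argv it saw."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)

# The FontForge script is one argument; its quotes need no shell quoting.
script = 'Open($1); SelectAll(); foreach Print(GlyphInfo("Name")); endloop'
argv = received_argv(["fontforge", "-lang=ff", "-c", script, "foo.svg"])
for arg in argv:
    print(arg)
```

Each printed line is one argument exactly as the child received it, so the script text arrives as a single argument with its inner double quotes intact.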

(Scheme) Unbound Variable when copy-pasting code

I'm copying the following Scheme code from a PDF into a file buffer in Emacs:
(define (plural wd)
  (if (equal? (last wd) ’y)
      (word (bl wd) ’ies)
      (word wd ’s)))
The initial formatting is as a long string, and I manually edit it to the format seen above. The file loads, but when I use the function I get the error:
*** Error:
unbound variable: |’y|
Current eval stack:
__________________
0 (equal? (last wd) |’y|)
1 (if (equal? (last wd) |’y|) (word (bl wd) |’ies|) (word wd |’s|))
When I manually type this code and load the file, however, the function runs no problem.
In what way is the pasting/editing of the code messing with the formatting of the code?
Is there a proper way to copy-paste code into a file? I tried formatting the code in a text editor before pasting into the buffer, but that didn't work either.
Thank you for your time and help.
It was already answered in the comments by Barmar, but this should enable you to complete your question, and help anybody else with the same problem in the future.
When you copy/pasted the code from the PDF, you did not copy a simple ASCII quote character '. Instead, you copied a "right single quotation mark" (unicode U+2019) ’. As this is not a reserved character in Scheme, it can be used as an identifier, and so what you expected to be the quoted symbol 'y was in fact the identifier ’y. The error was caused by there being no binding for the variable ’y.
A simple way of fixing this that does not require manually retyping the code or fixing every quotation mark by hand is to find-and-replace ’ with ' (as long as you don't expect any ’ characters in your strings).
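If you would rather script the replacement than do it in the editor, a minimal Python sketch could look like this (fix_quotes is a hypothetical helper name, and the same caveat applies: it also rewrites any ’ inside string literals):

```python
RIGHT_SINGLE_QUOTE = "\u2019"  # the character copied from the PDF

def fix_quotes(source):
    """Replace typographic right single quotes with ASCII apostrophes."""
    return source.replace(RIGHT_SINGLE_QUOTE, "'")

# The pasted definition from the question, with U+2019 in place of '.
pasted = (
    "(define (plural wd)\n"
    "  (if (equal? (last wd) \u2019y)\n"
    "      (word (bl wd) \u2019ies)\n"
    "      (word wd \u2019s)))"
)

fixed = fix_quotes(pasted)
print(fixed)
```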

Regex not matching string in scheme but works on other platform

I am running string-match using the pattern [ \[\]a-zA-Z0-9_:.,/-]+ to match a sample text Text [a,b]. Although the pattern works on regex101, when I run it on scheme it returns #f. Here is the regex101 link.
This is the function I am running
(string-match "[ \\[\\]a-zA-Z0-9_:.,/-]+" "Text [a,b]")
Why isn't it working in Scheme when it works elsewhere? Am I missing something?
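After discussing the issue on the GNU Guile mailing list, I found out that Guile's (ice-9 regex) library uses POSIX extended regular expressions. This flavor of regular expression doesn't support backslash escapes inside bracket expressions [..]: the backslash is an ordinary character there, so the ] after it closes the class early and the rest of the pattern is read as literal text. That is why it wasn't matching the string.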
However, I used the following function as a workaround and it works:
(string-match "[][a-zA-Z]+" "Text[ab]")
I don't see anything wrong with your regular expression syntax as it is quoted correctly so I assume there must be a bug in Guile, or the regexp library it uses, where \] just isn't interpreted the correct way inside brackets. I found a workaround by using the octal code point values instead:
(string-match "[A-Za-z\\[\\0135]+" "Text [a,b]")
; ==> #("Text [a,b]" (0 . 4))
Your regular expression isn't very good. It matches any combination of those characters, so "]/Te,3.xt[2" also matches. If you are expecting a string like "Something [something,something]", I would rather have used /[A-Z][a-z0-9]+ \[[a-z0-9]+,[a-z0-9]+\]/ instead, e.g.:
(define pattern "[A-Z][a-z0-9]+ \\[[a-z0-9]+,[a-z0-9]+\\]")
(string-match pattern "Test [q,w]") ; ==> #("Test [q,w]" (0 . 10))
(string-match pattern "Be100 [sub,45]") ; ==> #("Be100 [sub,45]" (0 . 14))

Emacs lisp; how to make a string from a variable of any type?

Error messages for wrongly called functions show a printed form of the offending value, e.g.:
(message (file-attributes "."))
Produces the message:
"eval: Wrong type argument: stringp, ("/home14/tjones" 1 0 0 (20415 35598) (20211 19255) (20211 19255) 14 "lrwxrwxrwx" t ...)"
How do you do this type of translation intentionally, e.g.:
(message (thing-to-string (file-attributes ".")))
To message something like:
("/home14/tjones" 1 0 0 (20415 35598) (20211 19255) (20211 19255) 14 "lrwxrwxrwx" t ...)
This is for debugging/info only. I'm assuming there's a way as message is doing it, but is this exposed to us users?
Look into prin1-to-string and related functions (prin1, princ, etc). And do try the manual! http://www.gnu.org/software/emacs/manual/html_node/elisp/Output-Functions.html
In your example, message did not do anything (it just refused to run), so the translation to string was done by the read-eval-print loop which caught the error and turned it into a text message.
But yes, message can also do that, and it does that by calling format, which internally uses things like prin1-to-string.
So (format "%S" <foo>) would do your thing-to-string.
The first argument to message is supposed to be a format string (the same as the one you pass to the format function). If you give it the format "%s" (or "%S", as in Stefan's answer), it will stringify anything you give it as the next argument.
The capital S version will escape characters in the string so that it can be read again as an s-expression. In this case, I think that is what you want. So, you don't need to change your code very much to get what you are looking for:
(message "%S" (file-attributes "."))

What would the following display in scheme?

I have tried this expression on a few (online) scheme interpreters/parsers and sometimes get different answers. For the following expression:
(display "okkk") \; "ok;" ;"ok"
What would it display and/or return? Why? For example:
Are \ acceptable outside of an s-expression, and do they escape the next character?
How is a string interpreted outside an expression, or is that invalid?
That probably depends on what Scheme syntax your implementation may support.
One might for example expect:
(display "okkk") -> displays okkk
\; -> error: unbound variable
"ok;" -> displays nothing, but returns the string
;"ok" -> end of line comment
