Unpacking a binary file via octets->string->unpack fails: signed int `#(243 0)` is illegal UTF-8

I am parsing a binary file (NIfTI) with a mix of chars, floats, ints, and shorts, using the PDL::IO::Nifti CPAN module as a reference.
I have had some luck converting sequences of octets to a string so they can be passed to cl-pack:unpack. This is convoluted, but convenient for porting with the Perl module as a reference.
This strategy fails when reading #(243 0) as binary:
(setf my-problem (make-array 2
                             :element-type '(unsigned-byte 8)
                             :initial-contents #(243 0)))
(babel:octets-to-string my-problem)
Illegal :UTF-8 character starting at position 0
and, when trying to read the file as characters, I get:
the octet sequence #(243 0 1 0) cannot be decoded.
I'm hoping there is a simple encoding issue I haven't figured out. Trying to go in the reverse direction (packing 243 and getting octets back) gives a vector of length 3 where I expect length 2:
(babel:string-to-octets (cl-pack:pack "s" 243))
; yields #(195 179 0), expected #(243 0)
Full context:
;; can read up to position 40. at which we expect 8 signed ints.
;; 4th int is value "243" but octet cannot be parsed
(setq fid-bin (open "test.nii" :direction :input
                               :element-type '(unsigned-byte 8)))
(file-position fid-bin 40)
(setf seq (make-array (* 2 8) :element-type '(unsigned-byte 8)))
(read-sequence seq fid-bin)
; seq: #(3 0 0 1 44 1 243 0 1 0 1 0 1 0 1 0)
(babel:octets-to-string seq) ; Illegal :UTF-8 character starting at position 6.
(sb-ext:octets-to-string seq) ; Illegal ....
;; first 3 are as expected
(cl-pack:unpack "s3" (babel:octets-to-string (subseq seq 0 6)))
; 3 256 300
(setf my-problem (subseq seq 6 8)) ; #(243 0)
(babel:octets-to-string my-problem) ; Illegal :UTF-8 character starting at position 0.
;; checking the reverse direction
;; 243 gets represented as 3 bytes!?
(babel:string-to-octets (cl-pack:pack "s3" 3 256 300)) ; #(3 0 0 1 44 1)
(babel:string-to-octets (cl-pack:pack "s4" 3 256 300 243)) ; #(3 0 0 1 44 1 195 179 0)
(setq fid-str (open "test.nii" :direction :input))
(setf char-seq (make-string (* 2 8)))  ; an array of element-type character
(file-position fid-str 40)
(read-sequence char-seq fid-str)
;; :UTF-8 stream decoding error on #<SB-SYS:FD-STREAM ....
;; the octet sequence #(243 0 1 0) cannot be decoded.
The Perl equivalent:
open my $f, '<', 'test.nii' or die $!;
binmode $f;
seek $f, 46, 0;
read $f, my $b, 2;
print unpack "s", $b;  # 243

The problem is that you are using functions which treat a sequence of octets as the representation of some encoding of a sequence of characters (or of some other Unicode things: I think Unicode has things in it other than characters). In particular, the functions you are using treat a sequence of octets as the UTF-8 encoding of some string. Well, not all sequences of octets are legal UTF-8, so the functions are, correctly, puking on an illegal sequence of octets. This also explains the three-octet result you saw in the reverse direction: the character with code 243 (ó) is encoded by UTF-8 as the two octets 195 and 179, which together with the trailing zero byte gives #(195 179 0).
But that's because you're not doing the right thing: what you want is to take a sequence of octets and make a string whose char-codes are those octets. You don't want any silly encoding-big-characters-in-small-integers rubbish, because you will never see any big characters. You want something like these functions (both somewhat misnamed, since they aren't fussed about the whole octet thing unless you are).
(defun stringify-octets (octets &key
                                (element-type 'character)
                                (into (make-string (length octets)
                                                   :element-type element-type)))
  ;; Smash a sequence of octets into a string.
  (map-into into #'code-char octets))

(defun octetify-string (string &key
                               (element-type `(integer 0 (,char-code-limit)))
                               (into (make-array (length string)
                                                 :element-type element-type)))
  ;; Smash a string into an array of 'octets' (not actually octets).
  (map-into into #'char-code string))
And now you can check everything works:
> (octetify-string (pack "s" 243))
#(243 0)
> (unpack "s" (stringify-octets (octetify-string (pack "s" 243))))
243
and so on. Given your example sequence:
> (unpack "s8" (stringify-octets #(3 0 0 1 44 1 243 0 1 0 1 0 1 0 1 0)))
3
256
300
243
1
1
1
1
A much better approach would be to have the packing & unpacking functions simply handle sequences of octets, but I suspect that's a lost cause. An interim approach, which is horrible but less horrible than converting sequences of octets to characters, is to read the file as text but with an external-format that does no translation at all. How to do that is implementation-dependent (but something based on latin-1 will be a good start).
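For example, in SBCL (the external-format spelling is implementation-dependent; a sketch only) the failing read from the question could be redone like this, since latin-1 maps every octet value to the character with that code and so can never fail to decode:
;; Sketch, assuming SBCL's :latin-1 external format: char-codes in the
;; resulting string are exactly the file's octets.
(with-open-file (f "test.nii" :direction :input :external-format :latin-1)
  (file-position f 40)
  (let ((s (make-string 16)))
    (read-sequence s f)
    (cl-pack:unpack "s8" s)))
; => 3 256 300 243 1 1 1 1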

It seems that the problem is indeed encoding-related:
CL-USER> (cl-pack:pack "s" 243)
"ó\0"
which is the same as the result of:
(babel:octets-to-string my-problem :encoding :iso-8859-1)
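A minimal round trip with that encoding (a sketch, reusing the my-problem array from the question) works in both directions:
(babel:octets-to-string my-problem :encoding :iso-8859-1)
; => "ó\0", decodes without error
(cl-pack:unpack "s" (babel:octets-to-string my-problem :encoding :iso-8859-1))
; => 243
(babel:string-to-octets (cl-pack:pack "s" 243) :encoding :iso-8859-1)
; => #(243 0), the two octets originally expected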

Related

Built-in Binary Conversion in Scheme/Lisp

Is there a built-in binary-to-decimal conversion function in Scheme?
I've found the built-in number->string conversion, which can convert binary to decimal form.
However, the opposite, string->number, doesn't convert decimals to binary strings as I'd thought.
Is there a built-in function, or would we have to define one?
The string->number function accepts an optional radix parameter:
(string->number "1001" 2)
==> 9
Binary and decimal are representations of numbers; numbers themselves are not binary or decimal.
number->string converts from a number (such as twelve) to a string (such as "12"), outputting the number's base 10 representation by default.
(It does not convert from binary to decimal - its name describes what it does.)
string->number converts from a string (such as "12") to a number (such as twelve), interpreting the string as the base 10 representation of a number by default.
(This function's name also describes what it does.)
You can pass a second argument to both functions to use a different base representation (2, 8, 10, or 16).
To get a string with the binary representation of the number n, use (number->string n 2).
To get a number from a string s with its binary representation, use (string->number s 2).
Examples:
> (number->string 120)
"120"
> (string->number "120")
120
> (number->string 120 2)
"1111000"
> (string->number "1111000" 2)
120
> (number->string 120 16)
"78"
> (string->number "78" 16)
120
Common Lisp
As in Scheme, numbers in Common Lisp do not have bases either; only their representations do.
Visualizing a number in a base using write-to-string:
(write-to-string 10 :base 2)
; ==> "1010"
Reading a number represented in a certain base using parse-integer:
(parse-integer "1010" :radix 2)
; ==> 10
; ==> 4 (index where the parser terminated)
(parse-integer "1010.1" :radix 2)
; parse-integer: substring "1010.1" does not have integer syntax at position 4
(parse-integer "1010.1" :radix 2 :junk-allowed t)
; ==> 10
; ==> 4 (index where the parser terminated)
Alternatively you can use the reader/printer; however, *read-base* is honored only when the token cannot be interpreted as a float:
(let ((*print-base* 2))
  (prin1-to-string 10))
; ==> "1010"
(let ((*read-base* 2))
  (read-from-string "1010"))
; ==> 10
; ==> 4 (index where the reader stopped)
;; *read-base* is ignored when the token is interpreted as a float
(let ((*read-base* 2))
  (read-from-string "1010.1"))
; ==> 1010.1
; ==> 6
This assumes the global *print-base* and *read-base* are both ten, the default.
read-from-string doesn't care if there is junk after the number, so it behaves like (parse-integer "1010" :radix 2 :junk-allowed t).
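For instance, keeping the same binding (a quick check):
(let ((*read-base* 2))
  (read-from-string "1010 junk"))
; ==> 10
; ==> 5 (the junk is simply left unread)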
As additional information on top of the *read-base* documentation: there are reader literals for bases 2, 8, and 16, and for arbitrary bases, which override the dynamic setting:
#b1010 ; ==> 10 (base 2)
#o1010 ; ==> 520 (base 8)
#x1010 ; ==> 4112 (base 16)
#3r1010 ; ==> 30 (base 3)
#36rToBeOrNotToBe ; ==> 140613689159812836698 (base 36)
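For completeness, format's ~R directive takes the radix as a prefix parameter, which avoids rebinding *print-base*; a small sketch:
(format nil "~2R" 10)   ; ==> "1010"
(format nil "~16R" 255) ; ==> "FF"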

How to export a flattened image with GIMP Script-Fu

I've got a script that's supposed to flatten, resize and export an initial image. I'm stuck on the export. I'm using file-png-save. It seems fine with my parameters except for the third parameter, which is supposed to be a drawable.
For the drawable, I'm using the flattened layer I got from gimp-image-flatten. I'm getting a response that my third argument is invalid. Do layers not always work as drawables? Do I need to convert it?
(define (script-fu-panel-export inImg drawable inWidth)
  (define filename (car (gimp-image-get-filename inImg)))
  (define comHeight (/ inWidth .75))
  (define piece (car (cdr (cdr (cdr (cdr (cdr (strbreakup filename "/"))))))))
  (define base (car (strbreakup piece ".")))
  (define destination (string-append "/home/samjones/Dev/mobinge/lib/images/"
                                     base "-" (number->string inWidth) ".png"))
  (let* ((duplicateImg (car (gimp-image-duplicate inImg))))
    (gimp-image-scale duplicateImg inWidth comHeight)
    (let* ((flatLayer (gimp-image-flatten duplicateImg)))
      (file-png-save 1 duplicateImg flatLayer destination destination 1 0 0 0 0 0 0))
    (gimp-display-new duplicateImg)))
(script-fu-register
 "script-fu-panel-export"
 "Export Panel..."
 "Creates a flattened image export to a selected size."
 "Sam Jones"
 "copyright 2017, Sam Jones"
 "December 19, 2017"
 ""
 SF-IMAGE      "Image"        0
 SF-DRAWABLE   "Maybe unused" 0
 SF-ADJUSTMENT "Width"        '(320 20 1200 10 50 0 SF-SLIDER))
(script-fu-menu-register "script-fu-panel-export" "<Image>/Filters")
Looking around for people who save from scripts, I found a slightly different route that worked. I replaced these two lines:
(let* ((flatLayer (gimp-image-flatten duplicateImg)))
  (file-png-save 1 duplicateImg flatLayer destination destination 1 0 0 0 0 0 0))
With this:
(gimp-image-flatten duplicateImg)
(file-png-save 1 duplicateImg (car (gimp-image-get-active-drawable duplicateImg))
               destination destination 1 0 0 0 0 0 0)
So I think gimp-image-flatten probably returns a list with the layer as its first element instead of returning the layer directly. I now know that gimp-image-get-active-drawable likewise returns a list containing the drawable.
It's weird, but it works.
You are not accessing the desired element of the result of gimp-image-flatten. Use car to access it:
(let* ((flatLayer (car (gimp-image-flatten duplicateImg))))
  (file-png-save 1 duplicateImg flatLayer destination destination 1 0 0 0 0 0 0))

Parsing strings representing lists of integers and integer spans

I am looking for a function that parses integer lists in Emacs Lisp, along the lines of Perl's Set::IntSpan. I.e., I would like to be able to do something like this:
(parse-integer-list "1-3, 4, 8, 18-21")
⇒ (1 2 3 4 8 18 19 20 21)
Is there an elisp library somewhere for this?
The following does what you want:
(require 'seq)

(defun parse-integer-list (str)
  "Parse STR, a string of integers and integer ranges, into a list of integers."
  (let (start ranges)
    (while (string-match "\\([0-9]+\\)\\(?:-\\([0-9]+\\)\\)?" str start)
      (push (apply 'number-sequence
                   (seq-map 'string-to-number
                            (seq-filter 'identity
                                        (list (match-string 1 str)
                                              (match-string 2 str)))))
            ranges)
      (setq start (match-end 0)))
    (nreverse (seq-mapcat 'nreverse ranges))))
The code loops over the incoming string searching for plain numbers or ranges of numbers. On each match it calls number-sequence with either just a number for a plain match or two numbers for a range match and pushes each resulting number sequence into a list. To account for push building the result backwards, at the end it reverses all ranges in the list, concatenates them, then reverses the result and returns it.
Calling parse-integer-list with your example input:
(parse-integer-list "1-3, 4, 8, 18-21")
produces:
(1 2 3 4 8 18 19 20 21)

ALU-n Procedure in Scheme

I'm a beginner with the Scheme language, so I'm having trouble writing a procedure to take in an n-bit number and feed it into an ALU. The ALU is supposed to be constructed from 1-bit ALUs.
Here is the 1-bit ALU:
(define ALU1
  (lambda (sel a b carry-in)
    (multiplexor4 sel
                  (cons (andgate a b) 0)
                  (cons (orgate a b) 0)
                  (cons (xorgate a b) 0)
                  (multiplexor2 sub
                                (full-adder a b carry-in)
                                (full-adder a (notgate b) carry-in)))))
which, along with the multiplexors and full-adder, works.
Here is my attempt at using a couple of procedures to simulate the n-bit ALU:
(define ALU-helper
  (lambda (selection x1 x2 carry-in n)
    (if (= n 0)
        '()
        (ALU1 (selection x1 x2 carry-in)))))

(define ALUn
  (lambda (selection x1 x2 n)
    (ALU-helper (selection x1 x2 c n))))
When it's done, it's supposed to take two n-bit numbers and add them, subtract them, etc., according to the "selection". This would be the input:
(define x1 '(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0) )
(define x2 '(1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1) )
(ALUn 'add x1 x2 32)
And I get errors when running it that seem to be caused by the "selection" parameter. I'm sure I'm just getting confused by all the parameters, but I'm not sure how to fix the problem and get the ALU to work. I'm running this in DrRacket, language R5RS.
By putting parentheses around your arguments to ALU1 inside ALU-helper, you are asking for selection to be treated as a function, and you pass only one argument (its result) to ALU1. Try:
(ALU1 selection x1 x2 carry-in))))
Same thing for the call to ALU-helper in ALUn.
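Putting both fixes together, the definitions would look something like this (a sketch only: c is still the free variable from the question's code, and the per-bit recursion over n remains to be written):
(define ALU-helper
  (lambda (selection x1 x2 carry-in n)
    (if (= n 0)
        '()
        (ALU1 selection x1 x2 carry-in))))  ; pass arguments; don't call selection

(define ALUn
  (lambda (selection x1 x2 n)
    (ALU-helper selection x1 x2 c n)))      ; same fix: no extra parentheses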

(Number and Number) in VBScript

I have some VBScript in classic ASP that looks like this:
if (x and y) > 0 then
    'do something
end if
It seems to work like this:
(46 and 1) = 0
and
(47 and 1) = 1
I don't understand how that works. Can someone explain that?
It's a bitwise AND.
    47 is 101111
AND  1 is 000001
        = 000001
while
    46 is 101110
AND  1 is 000001
        = 000000
It's doing a bitwise comparison:

Bitwise operations evaluate two integral values in binary (base 2) form. They compare the bits at corresponding positions and then assign values based on the comparison.

and a further example:

x = 3 And 5

The preceding example sets the value of x to 1. This happens for the following reasons:

The values are treated as binary:
3 in binary form = 011
5 in binary form = 101

The And operator compares the binary representations, one binary position (bit) at a time. If both bits at a given position are 1, then a 1 is placed in that position in the result. If either bit is 0, then a 0 is placed in that position in the result. In the preceding example this works out as follows:

011 (3 in binary form)
101 (5 in binary form)
001 (the result, in binary form)

The result is treated as decimal. The value 001 is the binary representation of 1, so x = 1.
From - http://msdn.microsoft.com/en-us/library/wz3k228a(v=vs.80).aspx
Try:
x = 47
y = -1
if (x AND y) > 0 then
    'erroneously passes the condition instead of failing
end if
The code should be:
if (x > 0) AND (y > 0) then
    'do something
end if
and then it will work as expected.
