Haskell breaks my type when I call show/read - debugging

I have defined a new type in my Haskell code, a record with a list of lists of strings as one of its fields. An example might be
Board{size=(4,7),pieces=[["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}
I've set it up as a derived instance of Show and Read. If I just input the code above into ghci, then I get out exactly what I put in, which is fine. However, if I call it with show, I get
"Board {size = (4,7), pieces = [[\"OA\",\"AA\",\"AA\"],[\"BBB\",\"BOO\"],[\"OCCC\",\"CCCO\",\"OOCO\"]]}"
The speech marks are fine, but I've no idea why the backslashes are there. Are you not allowed to nest speech marks or something? In any case, this now totally breaks if I try to call read to get it back. I get a long error trying to tell me that none of the strings are data constructors - I don't know why it thinks they are.
Is there any way round this?

Given
> data Board = Board { size :: (Int, Int), pieces :: [[String]] } deriving (Show, Read)
> let b = Board{size=(4,7),pieces=[["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}
The result of show b is a String
> show b
"Board {size = (4,7), pieces = [[\"OA\",\"AA\",\"AA\"],[\"BBB\",\"BOO\"],[\"OCCC\",\"CCCO\",\"OOCO\"]]}"
The quotes in any String are escaped when show-ing a string. If you output the string instead you'll see that it doesn't contain the \ characters.
> putStrLn $ show b
Board {size = (4,7), pieces = [["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}
The string produced by show can be read back in again as a board by read
> (read . show $ b) :: Board
Board {size = (4,7), pieces = [["OA","AA","AA"],["BBB","BOO"],["OCCC","CCCO","OOCO"]]}

GHCi already calls show on the value of whatever expression you enter. When you type show something in GHCi, it's as if you called show $ show something, which is where the extra escaping comes from.


How to convert global enum values to string in Godot?

The "GlobalScope" class defines many fundamental enums like the Error enum.
I'm trying to produce meaningful logs when an error occurs. However, printing a value of type Error only prints the integer, which is not very helpful.
The Godot documentation on enums indicates that looking up the value should work in a dictionary-like fashion. However, trying to access Error[error_value] errors with:
The identifier "Error" isn't declared in the current scope.
How can I convert such enum values to string?
In the documentation you referenced, it explains that enums basically just create a bunch of constants:
enum {TILE_BRICK, TILE_FLOOR, TILE_SPIKE, TILE_TELEPORT}
# Is the same as:
const TILE_BRICK = 0
const TILE_FLOOR = 1
const TILE_SPIKE = 2
const TILE_TELEPORT = 3
However, the names of the identifiers of these constants only exist to make it easier for humans to read the code. They are replaced at runtime with something the machine can use, and are inaccessible later. If I want to print an identifier's name, I have to do so manually:
# Manually print TILE_FLOOR's name as a string, then its value.
print("The value of TILE_FLOOR is ", TILE_FLOOR)
So if your goal is to have descriptive error output, you should do so in a similar way, perhaps like so:
if unexpected_bug_found:
    # Manually print the error description, then actually return the value.
    print("ERR_BUG: There was an unexpected bug!")
    return ERR_BUG
Now the relationship with dictionaries is that dictionaries can be made to act like enumerations, not the other way around. Enumerations are limited to be a list of identifiers with integer assignments, which dictionaries can do too. But they can also do other cool things, like have identifiers that are strings, which I believe you may have been thinking of:
const MyDict = {
    NORMAL_KEY = 0,
    'STRING_KEY' : 1, # uses a colon instead of equals sign
}
func _ready():
    print("MyDict.NORMAL_KEY is ", MyDict.NORMAL_KEY) # valid
    print("MyDict.STRING_KEY is ", MyDict.STRING_KEY) # valid
    print("MyDict[NORMAL_KEY] is ", MyDict[NORMAL_KEY]) # INVALID
    print("MyDict['STRING_KEY'] is ", MyDict['STRING_KEY']) # valid
    # Dictionary['KEY'] only works if the key is a string.
This is useful in its own way, but even in this scenario we are assumed to already have the string matching the identifier name explicitly in hand, meaning we may as well print that string manually as in the first example.
Here is the naive approach I use, in a singleton (in fact a file that contains a lot of static funcs, referenced by a class_name):
static func get_error(global_error_constant:int) -> String:
    var info := Engine.get_version_info()
    var version := "%s.%s" % [info.major, info.minor]
    var default := ["OK","FAILED","ERR_UNAVAILABLE","ERR_UNCONFIGURED","ERR_UNAUTHORIZED","ERR_PARAMETER_RANGE_ERROR","ERR_OUT_OF_MEMORY","ERR_FILE_NOT_FOUND","ERR_FILE_BAD_DRIVE","ERR_FILE_BAD_PATH","ERR_FILE_NO_PERMISSION","ERR_FILE_ALREADY_IN_USE","ERR_FILE_CANT_OPEN","ERR_FILE_CANT_WRITE","ERR_FILE_CANT_READ","ERR_FILE_UNRECOGNIZED","ERR_FILE_CORRUPT","ERR_FILE_MISSING_DEPENDENCIES","ERR_FILE_EOF","ERR_CANT_OPEN","ERR_CANT_CREATE","ERR_QUERY_FAILED","ERR_ALREADY_IN_USE","ERR_LOCKED","ERR_TIMEOUT","ERR_CANT_CONNECT","ERR_CANT_RESOLVE","ERR_CONNECTION_ERROR","ERR_CANT_ACQUIRE_RESOURCE","ERR_CANT_FORK","ERR_INVALID_DATA","ERR_INVALID_PARAMETER","ERR_ALREADY_EXISTS","ERR_DOES_NOT_EXIST","ERR_DATABASE_CANT_READ","ERR_DATABASE_CANT_WRITE","ERR_COMPILATION_FAILED","ERR_METHOD_NOT_FOUND","ERR_LINK_FAILED","ERR_SCRIPT_FAILED","ERR_CYCLIC_LINK","ERR_INVALID_DECLARATION","ERR_DUPLICATE_SYMBOL","ERR_PARSE_ERROR","ERR_BUSY","ERR_SKIP","ERR_HELP","ERR_BUG","ERR_PRINTER_ON_FIRE"]
    match version:
        "3.4":
            return default[global_error_constant]
    # Regexp to use on #GlobalScope documentation
    # \s+=\s+.+ replace by nothing
    # (\w+)\s+ replace by "$1", (with quotes and comma)
    printerr("you must check and add %s version in get_error()" % version)
    return default[global_error_constant]
So print(MyClass.get_error(err)) or assert(!err, MyClass.get_error(err)) is handy.
For non-globals I made this; it was not your question, but it is closely related.
It would be useful to be able to access #GlobalScope and #GDScript the same way; maybe that is not possible because of the memory cost?
static func get_enum_flags(_class:String, _enum:String, flags:int) -> PoolStringArray:
    var ret := PoolStringArray()
    var enum_flags := ClassDB.class_get_enum_constants(_class, _enum)
    for i in enum_flags.size():
        if (1 << i) & flags:
            ret.append(enum_flags[i])
    return ret
static func get_constant_or_enum(_class:String, number:int, _enum:="") -> String:
    if _enum:
        return ClassDB.class_get_enum_constants(_class, _enum)[number]
    return ClassDB.class_get_integer_constant_list(_class)[number]

F# is unable to infer type arguments after annotation

So I have some json response content represented as string and I want to get its property names.
What I am doing
let properties = Newtonsoft.Json.Linq.JObject.Parse(responseContent).Properties()
let propertyNames, (jprop: JProperty) = properties.Select(jprop => jprop.Name);
According to this answer I needed to annotate the call to the extension method, however, I still get the error.
A unique overload for method 'Select' could not be determined based on type information prior to this program point. A type annotation may be needed. Candidates: (extension) Collections.Generic.IEnumerable.Select<'TSource,'TResult>(selector: Func<'TSource,'TResult>) : Collections.Generic.IEnumerable<'TResult>, (extension) Collections.Generic.IEnumerable.Select<'TSource,'TResult>(selector: Func<'TSource,int,'TResult>) : Collections.Generic.IEnumerable<'TResult>
Am I doing something wrong?
First, the syntax x => y you're trying to use is C# syntax for lambda expressions, not F# syntax. In F#, the correct syntax for lambda-expressions is fun x -> y.
Second, the syntax let a, b = c means "destructure the pair". For example:
let pair = (42, "foo")
let a, b = pair // Here, a = 42 and b = "foo"
You can provide a type annotation for one of the pair elements:
let a, (b: string) = pair
But this won't have any effect on pair the way you apparently expect it to work.
To provide a type annotation for the argument of a lambda expression, just annotate the argument. What could be simpler?
fun (x: string) -> y
So, putting all of the above together, this is how your line should look:
let propertyNames = properties.Select(fun (jprop: JProperty) -> jprop.Name)
(also, note the absence of semicolon at the end. F# doesn't require semicolons)
If you have this level of difficulty with basic syntax, I suggest you read up on F# and work your way through a few examples before trying to implement something complex.

How to iterate through a UTF-8 string correctly in OCaml?

Say I have some input word like "føøbær" and I want a hash table of letter frequencies s.t. f→1, ø→2 – how do I do this in OCaml?
The http://pleac.sourceforge.net/pleac_ocaml/strings.html examples only work on ASCII and https://ocaml-batteries-team.github.io/batteries-included/hdoc2/BatUTF8.html doesn't say how to actually create a BatUTF8.t from a string.
The BatUTF8 module you refer to defines its type t as string, thus there is no conversion needed: a BatUTF8.t is a string. Apparently, the module encourages you to validate your string before using other functions. I guess that a proper way of operating would be something like:
let s = "føøbær"
let () = BatUTF8.validate s
let () = BatUTF8.iter add_to_table s
Looking at the code of Batteries, I found this of_string_unsafe, so perhaps this is the way:
open Batteries
BatUTF8.iter (fun c -> …Hashtbl.add table c …) (BatUTF8.of_string_unsafe "føøbær")
although, since it's termed "unsafe" (the docs don't say why), maybe this is equivalent:
BatUTF8.iter (fun c -> …Hashtbl.add table c …) "føøbær"
At least it works for the example word here.
Camomile also seems to iterate through it correctly:
module C = CamomileLibraryDefault.Camomile
C.iter (fun c -> …Hashtbl.add table c …) "føøbær"
I don't know of the tradeoffs between Camomile and BatUTF8 here, though they end up storing different types (BatUChar vs C.Pervasives.UChar).

What is the pythonic way to print values right aligned?

I've a list of strings which I want to group by their suffix and then print the values right-aligned, padding the left side with spaces.
What is the pythonic way to do that?
My current code is:
def find_pos(needle, haystack):
    for i, v in enumerate(haystack):
        if str(needle).endswith(v):
            return i
    return -1
# Show only Error and Warning things
search_terms = "Error", "Warning"
errors_list = filter(lambda item: str(item).endswith(search_terms), dir(__builtins__))
# alphabetical sort
errors_list.sort()
# Sort the list so Errors come before Warnings
errors_list.sort(lambda x, y: find_pos(x, search_terms) - find_pos(y, search_terms))
# Format for right-aligning the string
size = str(len(max(errors_list, key=len)))
fmt = "{:>" + size + "s}"
for item in errors_list:
    print fmt.format(item)
An alternative I had in mind was:
size = len(max(errors_list, key=len))
for item in errors_list:
    print str.rjust(item, size)
I'm still learning Python, so other suggestions about improving the code are welcome too.
Very close.
fmt = "{:>{size}s}"
for item in errors_list:
    print fmt.format(item, size=size)
The two sorting steps can be combined into one:
errors_list.sort(key=lambda x: (find_pos(x, search_terms), x))
Generally, using the key parameter is preferred over using cmp. Documentation on sorting
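As a rough illustration of the single-pass sort, here is a minimal Python 3 sketch. It assumes the suffix position should be the primary key (so Errors come before Warnings) and the name itself the tie-breaker; the sample list is made up for the example and is not the real dir(__builtins__) output.

```python
def find_pos(needle, haystack):
    """Return the index of the first suffix in haystack that needle ends with, or -1."""
    for i, suffix in enumerate(haystack):
        if str(needle).endswith(suffix):
            return i
    return -1

search_terms = ("Error", "Warning")

# Hypothetical sample input, chosen only to show the ordering.
names = ["UserWarning", "KeyError", "BytesWarning", "IOError"]

# Tuples compare element by element: suffix position first, then the name.
names.sort(key=lambda name: (find_pos(name, search_terms), name))
print(names)
```

Because the key is computed once per element, this also avoids the repeated pairwise comparisons that a cmp function would perform.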
If you are interested in the length anyway, using the key parameter to max() is a bit pointless. I'd go for
width = max(map(len, errors_list))
Since the length does not change inside the loop, I'd prepare the format string only once:
right_align = ">{}".format(width)
Inside the loop, you can now do with the free format() function (i.e. not the str method, but the built-in function):
for item in errors_list:
    print format(item, right_align)
str.rjust(item, size) is usually and preferably written as item.rjust(size).
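To compare the options side by side, here is a small Python 3 sketch (the question's code is Python 2, so print is a function here). The helper name right_align_variants and the sample items are invented for the illustration.

```python
def right_align_variants(item, width):
    """Return the same right-aligned string built three different ways."""
    via_str_format = "{:>{width}}".format(item, width=width)  # nested format field
    via_builtin = format(item, ">{}".format(width))           # free format() function
    via_rjust = item.rjust(width)                              # plain str method
    return via_str_format, via_builtin, via_rjust

# Hypothetical sample input.
items = ["KeyError", "IOError", "UserWarning"]
width = max(map(len, items))

for item in items:
    a, b, c = right_align_variants(item, width)
    # All three spellings are expected to agree for plain strings.
    assert a == b == c
    print(a)
```

For simple padding like this, item.rjust(width) is probably the most readable; the format-based forms pay off when the alignment is part of a larger template.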
You might want to look here, which describes how to right-justify using str.rjust and using print formatting.

Debugging HXT performance problems

I'm trying to use HXT to read in some big XML data files (hundreds of MB.)
My code has a space-leak somewhere, but I can't seem to find it. I do have a little bit of a clue as to what is happening thanks to my very limited knowledge of the ghc profiling tool chain.
Basically, the document is parsed, but not evaluated.
Here's some code:
{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
import Text.XML.HXT.Core
import System.Environment (getArgs)
import Control.Monad (liftM)
main = do file <- (liftM head getArgs) >>= parseTuba
          case file of (Left m)  -> print "Failed."
                       (Right _) -> print "Success."
data Sentence t = Sentence [Node t] deriving Show
data Node t = Word { wSurface :: !t } deriving Show
parseTuba :: FilePath -> IO (Either String ([Sentence String]))
parseTuba f = do r <- runX (readDocument [] f >>> process)
                 case r of
                   []   -> return $ Left "No parse result."
                   [pr] -> return $ Right pr
                   _    -> return $ Left "Ambiguous parse result!"
process :: (ArrowXml a) => a XmlTree ([Sentence String])
process = getChildren >>> listA (tag "sentence" >>> listA word >>> arr (\ns -> Sentence ns))
word :: (ArrowXml a) => a XmlTree (Node String)
word = tag "word" >>> getAttrValue "form" >>> arr (\s -> Word s)
-- | Gets the tag with the given name below the node.
tag :: (ArrowXml a) => String -> a XmlTree XmlTree
tag s = getChildren >>> isElem >>> hasName s
I'm trying to read a corpus file, and the structure is obviously something like <corpus><sentence><word form="Hello"/><word form="world"/></sentence></corpus>.
Even on the very small development corpus, the program takes ~15 secs to read it in, of which around 20% are GC time (that's way too much.)
In particular, a lot of data is spending way too much time in DRAG state. This is the profile:
[Figure: monitoring DRAG culprits]
You can see that decodeDocument gets called a lot, and its data is then stalled until the very end of the execution.
Now, I think this should be easily fixed by folding all this decodeDocument stuff into my data structures (Sentence and Word) and then the RT can forget about these thunks. The way it's currently happening though, is that the folding happens at the very end when I force evaluation by deconstruction of Either in the IO monad, where it could easily happen online. I see no reason for this, and my attempts to strictify the program have so far been in vain. I hope somebody can help me :-)
I just can't even figure out too many places to put seqs and $!s in…
One possible thing to try: the default hxt parser is strict, but there does exist a lazy parser based on tagsoup: http://hackage.haskell.org/package/hxt-tagsoup
I understand that expat can do lazy processing as well: http://hackage.haskell.org/package/hxt-expat
You may want to see if switching parsing backends, by itself, solves your issue.
