I am in the process of creating a simple parsing library in Haskell, which compiles down the parser specification to optimized code using Template Haskell.
However, I am trying to figure out what kind of code is the most efficient to optimize to, so let us consider a simple parserly-written parser snippet.
A parser can be seen as a function String -> (a, String) where a is whatever we want to recognize during parsing, and the resulting string is whatever part of the input we have not matched. When parsers are run in sequence, the next parser should continue working on this string. At the end we can say a full string was successfully parsed if the end result is of the shape ("", _).
In this example, we'll use a parser that parses any number of 'a'-characters.
(Error handling, that is reporting an error if any character other than an 'a' was seen, has been left out of the code snippets to keep everything concise for the sake of question clarity).
Straightforward (naive?) implementation
If we write the code for a recursive parser parserly, a straightforward hand-rolled implementation might be:
parser_raw :: String -> (String, String)
parser_raw str =
case str of
[] -> ([], [])
('a' : rest) ->
let
(res', rest') = parser_raw rest
in
(('a' : res'), rest')
This matches any number of 'a''s. The parser recurses until the string is empty.
Looking at the created GHC Core, we see the following output:
Rec {
-- RHS size: {terms: 36, types: 45, coercions: 0, joins: 0/1}
$wparser_raw
= \ w_s7BO ->
case w_s7BO of {
[] -> (# [], [] #);
: ds_d736 rest_a5S6 ->
case ds_d736 of { C# ds1_d737 ->
case ds1_d737 of {
__DEFAULT -> case lvl6_r7Fw of wild2_00 { };
'a'# ->
let {
ds3_s7vv
= case $wparser_raw rest_a5S6 of { (# ww1_s7C3, ww2_s7C4 #) ->
(ww1_s7C3, ww2_s7C4)
} } in
(# : lvl2_r7Fs
(case ds3_s7vv of { (res'_a65m, rest'_a65n) -> res'_a65m }),
case ds3_s7vv of { (res'_a65m, rest'_a65n) -> rest'_a65n } #)
}
}
}
end Rec }
To my untrained eye, it seems like this is not tail recursive, as we first do the recursive call and then still need to combine the output into a tuple afterwards. I also find it odd/interesting that the let in the source code still is there in the compiled code, which makes it seem like the computation really needs to be done in multiple (i.e. non-tail-recursive) steps.
Two other approaches to write this code come to mind. (Maybe there are more?)
continuation-passing style
parser_cps :: String -> (String, String)
parser_cps str = parser_cps' str id
where
parser_cps' :: String -> ((String, String) -> (String, String)) -> (String, String)
parser_cps' str fun =
case str of
[] -> fun ([], [])
('a' : rest) ->
parser_cps' rest ((\(tl, final) -> (('a' : tl), final) ) . fun)
This results in:
Rec {
-- RHS size: {terms: 28, types: 29, coercions: 0, joins: 0/0}
parser_cps_parser_cps'
= \ str_a5S8 fun_a5S9 ->
case str_a5S8 of {
[] -> fun_a5S9 lvl9_r7FM;
: ds_d72V rest_a5Sa ->
case ds_d72V of { C# ds1_d72W ->
case ds1_d72W of {
__DEFAULT -> lvl8_r7FL;
'a'# ->
parser_cps_parser_cps'
rest_a5Sa
(\ x_i72P ->
case fun_a5S9 x_i72P of { (tl_a5Sb, final_a5Sc) ->
(: lvl2_r7FF tl_a5Sb, final_a5Sc)
})
}
}
}
end Rec }
-- RHS size: {terms: 4, types: 4, coercions: 0, joins: 0/0}
parser_cps = \ str_a5S6 -> parser_cps_parser_cps' str_a5S6 id
This looks tail-recursive, in the sense that we perform a tail-call. It does seem like the result is built up as a large thunk however. Are we using a lot of extra heap space until the base of the recursion is reached?
the State monad
-- Using `mtl`'s
-- import Control.Monad.State.Strict
parser_state :: String -> (String, String)
parser_state = runState parser_state'
where
parser_state' :: State String String
parser_state' = do
str <- get
case str of
[] -> return []
('a' : rest) -> do
put rest
res <- parser_state'
return ('a' : res)
This results in:
Rec {
-- RHS size: {terms: 26, types: 36, coercions: 0, joins: 0/0}
$wparser_state'
= \ w_s7Ca ->
case w_s7Ca of {
[] -> (# [], [] #);
: ds_d72p rest_a5Vb ->
case ds_d72p of { C# ds1_d72q ->
case ds1_d72q of {
__DEFAULT -> case lvl11_r7FO of wild2_00 { };
'a'# ->
case $wparser_state' rest_a5Vb of { (# ww1_s7Cl, ww2_s7Cm #) ->
(# : lvl2_r7FF ww1_s7Cl, ww2_s7Cm #)
}
}
}
}
end Rec }
-- RHS size: {terms: 8, types: 14, coercions: 6, joins: 0/0}
parser_state1
= \ w_s7Ca ->
case $wparser_state' w_s7Ca of { (# ww1_s7Cl, ww2_s7Cm #) ->
(ww1_s7Cl, ww2_s7Cm) `cast` <Co:6>
}
-- RHS size: {terms: 1, types: 0, coercions: 7, joins: 0/0}
parser_state = parser_state1 `cast` <Co:7>
Here, I'm not sure how to interpret the core: The let of the parser_raw-code is gone. Instead, the recursive call ends up immediately in the case. Afterwards we still have to put the result in a tuple, but is this "sans cons" enough to please the recursion gods?
So, to summarize: these are three techniques to write a simple parsing function. I would like to know which one of these is the most memory-efficient and how to get some intuition in recognizing tail-recursion from looking at GHC Core output.
Related
I'm really interested in how this algorithm can be implemented. If possible, it would be great to see an implementation with and without recursion. I am new to the language so I would be very grateful for help. All I could come up with was this code and it goes no further:
print(counterOccur("aabcdddeabb"))
def counterOccur(string: String) =
string.toCharArray.toList.map(char => {
if (!char.charValue().equals(char.charValue() + 1)) (char, counter)
else (char, counter + 1)
})
I realize that it's not even close to the truth, I just don't even have a clue what else could be used.
First solution with using recursion. I take Char by Char from string and check if last element in the Vector is the same as current. If elements the same I update last element by increasing count(It is first case). If last element does not the same I just add new element to the Vector(second case). When I took all Chars from the string I just return result.
def counterOccur(string: String): Vector[(Char, Int)] = {
#tailrec
def loop(str: List[Char], result: Vector[(Char, Int)]): Vector[(Char, Int)] = {
str match {
case x :: xs if result.lastOption.exists(_._1.equals(x)) =>
val count = result(result.size - 1)._2
loop(xs, result.updated(result.size - 1, (x, count + 1)))
case x :: xs =>
loop(xs, result :+ (x, 1))
case Nil => result
}
}
loop(string.toList, Vector.empty[(Char, Int)])
}
println(counterOccur("aabcdddeabb"))
Second solution that does not use recursion. It works the same, but instead of the recursion it is using foldLeft.
def counterOccur2(string: String): Vector[(Char, Int)] = {
string.foldLeft(Vector.empty[(Char, Int)])((r, v) => {
val lastElementIndex = r.size - 1
if (r.lastOption.exists(lv => lv._1.equals(v))) {
r.updated(lastElementIndex, (v, r(lastElementIndex)._2 + 1))
} else {
r :+ (v, 1)
}
})
}
println(counterOccur2("aabcdddeabb"))
You can use a very simple foldLeft to accumulate. You also don't need toCharArray and toList because strings are implicitly convertible to Seq[Char]:
"aabcdddeabb".foldLeft(collection.mutable.ListBuffer[(Char,Int)]()){ (acc, elm) =>
acc.lastOption match {
case Some((c, i)) if c == elm =>
acc.dropRightInPlace(1).addOne((elm, i+1))
case _ =>
acc.addOne((elm, 1))
}
}
Here is a solution using foldLeft and a custom State case class:
def countConsecutives[A](data: List[A]): List[(A, Int)] = {
final case class State(currentElem: A, currentCount: Int, acc: List[(A, Int)]) {
def result: List[(A, Int)] =
((currentElem -> currentCount) :: acc).reverse
def nextState(newElem: A): State =
if (newElem == currentElem)
this.copy(currentCount = this.currentCount + 1)
else
State(
currentElem = newElem,
currentCount = 1,
acc = (this.currentElem -> this.currentCount) :: this.acc
)
}
object State {
def initial(a: A): State =
State(
currentElem = a,
currentCount = 1,
acc = List.empty
)
}
data match {
case a :: tail =>
tail.foldLeft(State.initial(a)) {
case (state, newElem) =>
state.nextState(newElem)
}.result
case Nil =>
List.empty
}
}
You can see the code running here.
One possibility is to use the unfold method. This method is defined for several collection types, here I'm using it to produce an Iterator (documented here for version 2.13.8):
def spans[A](as: Seq[A]): Iterator[Seq[A]] =
Iterator.unfold(as) {
case head +: tail =>
val (span, rest) = tail.span(_ == head)
Some((head +: span, rest))
case _ =>
None
}
unfold starts from a state and applies a function that returns, either:
None if we want to signal that the collection ended
Some of a pair that contains the next item of the collection we want to produce and the "remaining" state that will be fed to the next iteration.
In this example in particular, we start from a sequence of A called as (which can be a sequence of characters) and at each iteration:
if there's at least one item
we split head and tail
we further split the tail into the longest prefix that contains items equal to the head and the rest
we return the head and the prefix we got above as the next item
we return the rest of the collection as the state for the following iteration
otherwise, we return None as there's nothing more to be done
The result is a fairly flexible function that can be used to group together spans of equal items. You can then define the function you wanted initially in terms of this:
def spanLengths[A](as: Seq[A]): Iterator[(A, Int)] =
spans(as).map(a => a.head -> a.length)
This can be probably made more generic and its performance improved, but I hope this can be an helpful example about another possible approach. While folding a collection is a recursive approach, unfolding is referred to as a corecursive one (Wikipedia article).
You can play around with this code here on Scastie.
For
str = "aabcdddeabb"
you could extract matches of the regular expression
rgx = /(.)\1*/
to obtain the array
["aa", "b", "c", "ddd", "e", "a", "bb"]
and then map each element of the array to the desired string.1
def counterOccur(str: String): List[(Char, Int)] = {
"""(.)\1*""".r
.findAllIn(str)
.map(m => (m.charAt(0), m.length)).toList
}
counterOccur("aabcdddeabb")
#=> res0: List[(Char, Int)] = List((a,2), (b,1), (c,1), (d,3), (e,1), (a,1), (b,2))
The regular expression reads, "match any character and save it to capture group 1 ((.)), then match the content of capture group 1 zero or more times (\1*).
1. Scala code kindly provided by #Thefourthbird.
Should one expect performance differences between these two emptys, or is it merely a matter of stylistic preference?
foo list = case list of
[] -> True
(_ : _) -> False
bar list = case list of
(_ : _) -> False
_ -> True
In general you should not expect performance to change predictably between trivial fiddling around with patterns like what you're asking about, and can often expect the generated code to be identical.
But the way to actually check is to look at core and or benchmark with criterion. In this case the generated code is the same, and indeed GHC seems to actually combine them:
I compiled the snippet above with
ghc -Wall -O2 -ddump-to-file -ddump-simpl -dsuppress-module-prefixes -dsuppress-uniques -fforce-recomp YourCode.hs
And we see this core:
foo :: forall t. [t] -> Bool
[GblId,
Arity=1,
Caf=NoCafRefs,
Str=DmdType <S,1*U>,
Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
WorkFree=True, Expandable=True,
Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
Tmpl= \ (# t) (list [Occ=Once!] :: [t]) ->
case list of _ [Occ=Dead] {
[] -> True;
: _ [Occ=Dead] _ [Occ=Dead] -> False
}}]
foo =
\ (# t) (list :: [t]) ->
case list of _ [Occ=Dead] {
[] -> True;
: ds ds1 -> False
}
-- RHS size: {terms: 1, types: 0, coercions: 0}
bar :: forall t. [t] -> Bool
[GblId,
Arity=1,
Caf=NoCafRefs,
Str=DmdType <S,1*U>,
Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
WorkFree=True, Expandable=True,
Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
Tmpl= \ (# t) (list [Occ=Once!] :: [t]) ->
case list of _ [Occ=Dead] {
[] -> True;
: _ [Occ=Dead] _ [Occ=Dead] -> False
}}]
bar = foo
I think the Tmpl stuff is the original implementation exposed for inlining in other modules, but I'm not certain.
On the odd chance, that someone has a brilliant idea...
I am not sure if there is a good way to generalize that.
EDIT: I think it might be nice to explain exactly what the inputs and outputs are. The code below is only how I approached the solution.
Inputs: data, recipe
data: set of string, string list here also called "set of named lists"
recipe: list of commands
Command Print (literal|list reference)
Adds the literal to the output or if it is a list reference, it adds the head of the referenced list to the output.
Command While (list reference)
when referenced list not empty --> next command
when referenced list empty --> skip entries in recipe list past the matching Wend.
Command Wend (list reference)
replace referenced list with tail (reference list)
when referenced list is empty, next command
when referenced list is not empty, next command is the matching while above.
Outputs: string list
The best answer is the implementation of that which is shortest and which allows to re-use that algorithm in new contexts.
This is not just a programming problem for the fun of it, btw. It is basically what happens if you try to implement data driven text templating.
The code below is my attempt to solve this problem.
The first code snippet is a non-generalized solution.
The second code snippet is an attempt to isolate the algorithm.
If you play with the code, simply paste the second snippet below the first snippet and both versions are working.
The whole topic is about understanding better how to separate the iteration algorithm from the rest of the code and then to simply apply it, in contrast of having all the other code within.
Would it not be great, if there was a way to abstract the way the statements are being processed and the looping of the while/wend, such,
that it can be reused in my main code, just as I keep re-using other "iteration schemes", such as List.map?
The commonalities between my main code and this study are:
An evolving "environment" which is threaded through all steps of the computation.
Collections, which need to be iterated in a well-formed nested manner. (Malformed would be: while x while y wend x wend y)
A series of "execution steps" form the body of each of those "while wend" loops.
Done in a "pure" manner. As you will note, nothing is mutable in the study. Want to keep it like that.
Each "While" introduces a new scope (as for binding values), which is discarded again, once the while loop is done.
So, I am looking for something like:
run: CommandClassifier -> commandExecutor -> Command list -> EnvType -> EnvType
where
CommandClassifier could be a function of the form Command -> NORMAL|LOOP_START|LOOP_END
and commandexecutor: Command -> EnvType -> EnvType
Of course, nesting of those while-blocks would not be limited to 2 (just tried to keep the testProgram() small).
SideNote: the "commands list" is an AST from a preceding parser run, but that should not really matter.
type MiniLanguage =
| Print of string
| While of string
| Wend of string
let testProgram =
[ Print("Hello, I am your Mini Language program")
While("names")
Print("<names>")
While("pets")
Print("<pets>")
Wend("pets")
Print("Done with pets.")
Wend("names")
Print("Done with names.")
]
type MiniEnvironment = { Bindings : Map<string,string>; Collections : Map<string, string list> }
let init collections =
{ Bindings = Map.empty; Collections = Map.ofList collections}
let bind n v env =
let newBindings =
env.Bindings
|> Map.remove n
|> Map.add n v
{ env with Bindings = newBindings; }
let unbind n env =
{ env with Bindings = Map.remove n env.Bindings; }
let bindingValue n env =
if env.Bindings.ContainsKey n then
Some(env.Bindings.Item n)
else
None
let consumeFirstFromCollection n env =
if env.Collections.ContainsKey n then
let coll = env.Collections.Item n
match coll with
| [] -> env |> unbind n
| _ ->
let first = coll.Head
let newCollections =
env.Collections
|> Map.remove n
|> Map.add n coll.Tail
{ env with Collections = newCollections }
|> bind n first
else failwith ("Unknown collection: " + n)
// All do functions take env - the execution environment - as last argument.
// All do functions return (a new) env as single return parameter.
let rec doPrint (s : string) env =
if s.StartsWith("<") && s.EndsWith(">") then
match bindingValue (s.Substring (1, s.Length - 2 )) env with
| Some(v) -> v
| _ -> s
else s
|> printfn "%s"
env
let rec skipPastWend name code =
match code with
| (Wend(cl) :: rest) when cl = name -> rest
| [] -> failwith "No Wend found!"
| (_ :: rest) -> skipPastWend name rest
let rec doWhileX name code env =
match code with
| (Print(s) :: rest) -> env |> (doPrint s) |> doWhileX name rest
| (While(cn) :: rest) -> env |> doWhile cn rest |> ignore; env |> doWhileX name (skipPastWend cn rest)
| (Wend(cn) :: rest) when cn = name -> env
| [] -> failwith ("While without Wend for: " + name)
| _ -> failwith ("nested while refering to same collection!")
and doWhile name code env =
let e0 = env |> consumeFirstFromCollection name
match bindingValue name e0 with
| Some(s) ->
e0 |> doWhileX name code |> doWhile name code
| None -> env
let rec run (program : MiniLanguage list) env =
match program with
| (Print(s) :: rest) -> env |> (doPrint s) |> run rest
| (While(cn) :: rest) ->
env
|> doWhile cn rest |> ignore
env |> run (skipPastWend cn program)
| (Wend(cn) :: rest) -> failwith "wend found in run()"
| [] -> env
let test() =
init [ "names", ["name1"; "name2"; "name3"; ]; "pets", ["pet1"; "pet2"] ]
|> run testProgram
|> printfn "%A"
(*
Running test() yields:
Hello, I am your Mini Language program
name1
pet1
pet2
Done with pets.
name2
pet1
pet2
Done with pets.
name3
pet1
pet2
Done with pets.
Done with names.
{Bindings = map [];
Collections =
map [("names", ["name1"; "name2"; "name3"]); ("pets", ["pet1"; "pet2"])];}
*)
Here my first version of isolating the algorithm. The number of callbacks is not entirely pretty. Can anyone come up with something simpler?
// The only function I had to "modify" to work with new "generalized" algorithm.
let consumeFirstFromCollection1 n env =
if env.Collections.ContainsKey n then
let coll = env.Collections.Item n
match coll with
| [] -> (env |> unbind n , false)
| _ ->
let first = coll.Head
let newCollections =
env.Collections
|> Map.remove n
|> Map.add n coll.Tail
({ env with Collections = newCollections }
|> bind n first , true)
else failwith ("Unknown collection: " + n)
type NamedList<'n,'t when 'n : comparison> = 'n * List<'t>
type Action<'a,'c> = 'c -> 'a -> 'a
type LoopPreparer<'a,'c> = 'c -> 'a -> 'a * bool
type CommandType = | RUN | BEGIN | END
type CommandClassifier<'c> = 'c -> CommandType
type Skipper<'c> = 'c -> List<'c> -> List<'c>
type InterpreterContext<'a,'c> =
{ classifier : CommandClassifier<'c>
executor : Action<'a,'c>
skipper : Skipper<'c>
prepareLoop : LoopPreparer<'a,'c>
isMatchingEnd : 'c -> 'c -> bool
}
let interpret (context : InterpreterContext<'a,'c>) (program : 'c list) (env : 'a) : 'a =
let rec loop front (code : 'c list) e =
let e0,hasData = e |> context.prepareLoop front
if hasData
then
e0
|> loop1 front (code)
|> loop front (code)
else e
and loop1 front code e =
match code with
| x :: more when (context.classifier x) = RUN ->
//printfn "RUN %A" x
e |> context.executor x |> loop1 front more
| x :: more when (context.classifier x) = BEGIN ->
//printfn "BEGIN %A" x
e |> loop x more |> ignore
e |> loop1 front (context.skipper x more)
| x :: more when (((context.classifier x) = END) && (context.isMatchingEnd front x)) -> /// && (context.isMatchingEnd front x)
//printfn "END %A" x
e
| [] -> failwith "No END."
| _ -> failwith "TODO: Not sure which case this is. But it is not a legal one!"
let rec interpr code e =
match code with
| [] -> e
| (first :: rest) ->
match context.classifier first with
| RUN -> env |> context.executor first |> interpr rest
| BEGIN ->
e |> loop first rest |> ignore
e |> interpr (context.skipper first rest)
| END -> failwith "END without BEGIN."
interpr program env
let test1() =
let context : InterpreterContext<MiniEnvironment,MiniLanguage> =
{ classifier = fun c-> match c with | MiniLanguage.Print(_) -> RUN | MiniLanguage.While(_) -> BEGIN | MiniLanguage.Wend(_) -> END;
executor = fun c env -> match c with | Print(s) -> doPrint s env | _ -> failwith "Not a known command.";
skipper = fun c cl -> match c with | While(n) -> skipPastWend n cl | _ -> failwith "first arg of skipper SHALL be While!"
prepareLoop = fun c env -> match c with | While(n) -> (consumeFirstFromCollection1 n env) | _ -> failwith "first arg of skipper SHALL be While!"
isMatchingEnd = fun cwhile cx -> match cwhile,cx with | (While(n),Wend(p)) when n = p -> true | _ -> false
}
init [ "names", ["name1"; "name2"; "name3"; ]; "pets", ["pet1"; "pet2"] ]
|> interpret context testProgram
|> printfn "%A"
I have come across a most troubling bug in my code below. The buttons Map mutates even though I pass it as an immutable Map. The keys remain the same, and the map points to two immutable Ints, yet below you can see the map clearly has different values during the run. I am absolutely stumped and have no idea what is happening.
def makeTrace(trace : List[(String)], buttons : Map[String, (Int,Int)],
outputScreen : ScreenRegion, hashMap : Map[Array[Byte], String])
: (List[(String,String)], Map[Array[Byte], String]) = {
println(buttons.toString)
//clearing the device
val clear = buttons.getOrElse("clear", throw new Exception("Clear Not Found"))
//clear.circle(3000)
val thisButton = new ScreenLocation(clear._1, clear._2)
click(thisButton)
//updates the map and returns a list of (transition, state)
trace.foldLeft((Nil : List[(String,String)], hashMap))( (list, trace) => {
println(buttons.toString)
val transition : String = trace
val location = buttons.getOrElse(transition, throw new Exception("whatever"))
val button = new ScreenLocation(location._1, location._2)
button.circle(500)
button.label(transition, 500)
click(button)
//reading and hashing
pause(500)
val capturedImage : BufferedImage = outputScreen.capture()
val outputStream : ByteArrayOutputStream = new ByteArrayOutputStream();
ImageIO.write(capturedImage, "png", outputStream)
val byte : Array[Byte] = outputStream.toByteArray();
//end hash
//if our state exists for the hash
if (hashMap.contains(byte)){ list match {
case (accumulator, map) => ((transition , hashMap.getOrElse(byte, throw new Exception("Our map broke if"))):: accumulator, map)
}
//if we need to update the map
}else list match {
case (accumulator, map) => {
//adding a new state based on the maps size
val newMap : Map[Array[Byte], String] = map + ((byte , "State" + map.size.toString))
val imageFile : File = new File("State" + map.size.toString + ".png");
ImageIO.write(capturedImage, "png", imageFile);
((transition, newMap.getOrElse(byte, throw new Exception("Our map broke else"))) :: accumulator, newMap)
}
}
})
}
Right before I call this function I initialize the map to an immutable map that points to immutable objects.
val buttons = makeImmutable(MutButtons)
val traceAndMap = TraceFinder.makeTrace(("clear" ::"five"::"five"::"minus"::"five"::"equals":: Nil), buttons, outputScreen, Map.empty)
Where makeImmutable is
def makeImmutable(buttons : Map[String, (Int,Int)]) : Map[String, (Int,Int)] = {
buttons.mapValues(button => button match {
case (x, y) =>
val newX = x
val newY = y
(newX,newY)
})
}
Here is the output, you can see the state change for clear, minus, and five
Map(equals -> (959,425), clear -> (842,313), minus -> (920,409), five -> (842,377))
Map(equals -> (959,425), clear -> (842,313), minus -> (920,409), five -> (842,377))
Map(equals -> (959,425), clear -> (959,345), minus -> (920,409), five -> (842,377))
Map(equals -> (959,425), clear -> (842,313), minus -> (920,409), five -> (842,377))
Map(equals -> (959,425), clear -> (842,313), minus -> (920,409), five -> (881,377))
Map(equals -> (959,425), clear -> (842,313), minus -> (920,409), five -> (842,377))
Map(equals -> (959,425), clear -> (842,313), minus -> (920,441), five -> (842,377))
Map(equals -> (959,425), clear -> (842,313), minus -> (920,409), five -> (881,377))
Map(equals -> (959,425), clear -> (842,313), minus -> (920,409), five -> (842,377))
Map(equals -> (959,425), clear -> (842,313), minus -> (920,409), five -> (842,377))
First, try sprinkling println(map.getClass) around your code. Make sure it really is the map you think it is. It should have immutable in its package name, e.g.:
scala> println(Map(1->1, 2->2, 3->3, 4->4).getClass)
class scala.collection.immutable.Map$Map4
Second, make sure you really are printing out the exact same map you think you are; using the reference identity hash code is a good way to do that: println(System.identityHashCode(map)).
Chances are extraordinarily good that one of these things will not give you the expected result. Then you just have to figure out where the problem leaked in. Without a full runnable example it's hard to offer better advice without inordinate amounts of code-staring.
I suspect you have an import for scala.collection.Map in scope of the makeImmutable function, which allows you to pass MutButtons (presumably a mutable Map) to the makeImmutable function. The mapValues method on Map returns a view of the underlying map, not an immutable copy. For example:
import scala.collection.Map
import scala.collection.mutable.{Map => MutableMap}
def doubleValues(m: Map[Int, Int]) = m mapValues { _ * 2 }
val m = MutableMap(1 -> 1, 2 -> 2)
val n = doubleValues(m)
println("m = " + m)
println("n = " + n)
m += (3 -> 3)
println("n = " + n)
Running this program produces this output:
m = Map(2 -> 2, 1 -> 1)
n = Map(2 -> 4, 1 -> 2)
n = Map(2 -> 4, 1 -> 2, 3 -> 6)
To return a truly immutable map from makeImmutable, call .toMap after mapping over the values.
I want to use a Structure like HashTable. Is there similar structure in Wolfram Mathematica?
Update: Mathematica version 10 introduced the Association data structure (tutorial).
There are a number of possibilities. The easiest possibility, which works well if you don't need to add or delete keys from your table, or change their associated values, is to construct a list of rules with the key on the left-hand side and the value on the right-hand side, and use Dispatch on it.
If you do need to change the entries in your table, you can use the DownValues of a symbol as a hash table. This will support all the operations one commonly uses with hash tables. Here's the most straightforward way of doing that:
(* Set some values in your table.*)
In[1]:= table[a] = foo; table[b] = bar; table[c] = baz;
(* Test whether some keys are present. *)
In[2]:= {ValueQ[table[a]], ValueQ[table[d]]}
Out[2]:= {True, False}
(* Get a list of all keys and values, as delayed rules. *)
In[3]:= DownValues[table]
Out[3]:= {HoldPattern[table[a]] :> foo, HoldPattern[table[b]] :> bar,
HoldPattern[table[c]] :> baz}
(* Remove a key from your table. *)
In[4]:= Unset[table[b]]; ValueQ[table[b]]
Out[4]:= False
I'd say the most similar structure you can get out of the box are sparse arrays.
I agree with Pillsy, but see also this answer:
Mathematica Downvalue Lhs
It includes a handy function for getting the keys of a hash table.
I've made Dictionary.m module, which contained:
DictHasKey = Function[
{
dict,
key
},
ValueQ[dict[key]]
]
DictAddKey = Function[
{
dict,
key,
value
},
If[
DictHasKey[dict,key],
Print["Warning, Dictionary already has key " <> ToString[key]]
];
dict[key] = value;
]
DictKeys = Function[
{
dict
},
res = {};
ForEach[DownValues[dict], Function[{dictKeyDescr},
res = Append[res, ((dictKeyDescr[[1]]) /. dict -> neverUsedSymbolWhatever)[[1, 1]]];
]];
res
]
DictValues = Function[
{
dict
},
res = {};
ForEach[DownValues[dict], Function[{dictKeyDescr},
res = Append[res, dictKeyDescr[[2]]];
]];
res
]
DictKeyValuePairs = Function[
{
dict
},
res = {};
ForEach[DownValues[dict], Function[{dictKeyDescr},
res = Append[res, {((dictKeyDescr[[1]]) /. dict -> neverUsedSymbolWhatever)[[1, 1]], dictKeyDescr[[2]]}];
]];
res
]
ForEach = Function[
{
list,
func
},
len = Length[list];
For[i = 1, i <= len, i++,
func[
list[[i]]
];
];
]
Mathematica 10 introduces Association, <| k -> v |>,
<|a -> x, b -> y, c -> z|>
%[b]
y
Which is basically a wrapper for a list of rules:
Convert a list of rules to an association:
Association[{a -> x, b -> y, c -> z}]
<|a -> x, b -> y, c -> z|>
Convert an association to a list of rules:
Normal[<|a -> x, b -> y, c -> z|>]
{a -> x, b -> y, c -> z}