Huffman decoding (in Scala) - algorithm

I'm trying to write an algorithm to perform Huffman decoding. I am doing it in Scala - it's an assignment for a Coursera course and I don't want to violate the honor code, so the below is pseudocode rather than Scala.
The algorithm I have written takes in a tree tree and a list of bits bits, and is supposed to return the message. However, when I try it on the provided tree, I get a NoSuchElementException (head of empty list). I can't see why.
I know that my code could be tidied up a bit - I'm still very new to functional programming so I've written it in a way that makes sense to me, rather than, probably, in the most compact way.
def decode(tree, bits) [returns a list of chars]: {
def dc(aTree, someBits, charList) [returns a list of chars]: {
if aTree is a leaf:
if someBits is empty: return char(leaf) + charList
else: dc(aTree, someBits, char(leaf) + charList)
else aTree is a fork:
if someBits.head is 0: dc(leftFork, someBits.tail, charList)
else someBits is 1: dc(rightFork, someBits.tail, charList)
}
dc(tree, bits, [empty list])
}
Thanks in advance for your help. It's my first time on StackOverflow, so I probably have some learning to do as to how best to use the site.

If I understand it correctly, you want to go through forks (with directions from bits) until you will find a leaf. Then you are adding leaf value to your char list and from this point you want to repeat steps.
If I am right, then you should pass original tree to your helper method, not a leftFork or rightFork, which are leafs now.
So it would be something like:
if aTree is a leaf:
if someBits is empty: return char(leaf) + charList
else: dc(tree, someBits, char(leaf) + charList)

Related

Concise (one line?) binary search in Raku

Many common operations aren't built in to Raku because they can be concisely expressed with a combination of (meta) operators and/or functions. It feels like binary search of a sorted array ought to be expressable in that way (maybe with .rotor? or …?) but I haven't found a particularly good way to do so.
For example, the best I've come up with for searching a sorted array of Pairs is:
sub binary-search(#a, $target) {
when +#a ≤ 1 { #a[0].key == $target ?? #a[0] !! Empty }
&?BLOCK(#a[0..^*/2, */2..*][#a[*/2].key ≤ $target], $target)
}
That's not awful, but I can't shake the feeling that it could be an awfully lot better (both in terms of concision and readability). Can anyone see what elegant combo of operations I might be missing?
Here's one approach that technically meets my requirements (in that the function body it fits on a single normal-length line). [But see the edit below for an improved version.]
sub binary-search(#a, \i is copy = my $=0, :target($t)) {
for +#a/2, */2 … *≤1 {#a[i] cmp $t ?? |() !! return #a[i] with i -= $_ × (#a[i] cmp $t)}
}
# example usage (now slightly different, because it returns the index)
my #a = ((^20 .pick(*)) Z=> 'a'..*).sort;
say #a[binary-search(#a».key, :target(17))];
say #a[binary-search(#a».key, :target(1))];
I'm still not super happy with this code, because it loses a bit of readability – I still feel like there could/should be a concise way to do a binary sort that also clearly expresses the underlying logic. Using a 3-way comparison feels like it's on that track, but still isn't quite there.
[edit: After a bit more thought, I came up with an more readable version of the above using reduce.
sub binary-search(#a, :target(:$t)) {
(#a/2, */2 … *≤.5).reduce({ $^i - $^pt×(#a[$^i] cmp $t || return #a[$^i]) }) && Nil
}
In English, that reads as: for a sequence starting at the midpoint of the array and dropping by 1/2, move your index $^i by the value of the next item in the sequence – with the direction of the move determined by whether the item at that index is greater or lesser than the target. Continue until you find the target (in which case, return it) or you finish the sequence (which means the target wasn't present; return Nil)]

How do I make this code for detecting palindrome in a Linked List cover all cases?

So, I was solving this problem of detecting a palindrome in a linked list. I came up with the following solution for it:
class Solution {
public boolean isPalindrome(ListNode head) {
ListNode temp=head;
boolean [] arr = new boolean[10];
int count=0;
if(head==null) return false;
while(temp!=null)
{
if(arr[temp.val]==false)
arr[temp.val]=true;
else
arr[temp.val]=false;
temp=temp.next;
}
for(int i=0;i<10;i++)
{
if(arr[i]==true)
count++;
}
if(count<2)return true;
return false;
}
Now, the logic behind this solution is correct as far as I can see but it fails for cases like this: [1,0,0], [4,0,0,0,0] etc. How do I overcome this? (Pls dont reply with a shorter method I want to know the reason behind why this fails for certain cases.)
First of all Welcome to StackOverflow!
Because of how simple this problem's solution can be I feel obligated to tell you that a solution with an auxiliary stack is not only easy to implement but also easy to understand. But since you asked why your code fails for certain cases I'll answer that question first. Your code in particular is counting the number of digits that have an odd count.
Although this seems to be what you are supposed to do to detect a palindrome notice that a linked list that looks like 1 -> 1 -> 0 -> 0 is also considered a palindrome under your code because the count is always going to be less than 0.
Your solution works for telling us if it is possible to create a palindrome given a set of digits. Suppose that the question was like "given a linked list tell me if you can rearrange it to create a palindrome" but it does not work for "is this linked list a palindrome".

What is the time complexity performance of Scala's Vector data structure?

I know that most of the Vector methods are effectively O(1) (constant time) because of the tree they use, but I cannot find any information on the contains method. My first thought is that it would have to be O(n) to check all the elements but I am not sure.
Answering the question in the title, performance characteristics (2.13 docs version) of basic operations head, tail, apply, update, prepend, append, insert are all listed as eC for Vector:
eC The operation takes effectively constant time, but this might depend on some assumptions such as maximum length of a vector or distribution of hash keys.
You are correct contains is O(N), as there is no hashing or nothing else that would avoid the need to compare with all items. Still, if you want to be sure, it is best to check the implementation.
As finding the correct implementation in the library sources can be difficult because of many traits and overrides used to implement the containers, the best way to check this is the debugger. Use a code like:
val v = Vector(0, 1, 2)
v.contains(1)
Use the debugger to step into v.contains and the source you will see is:
def contains[A1 >: A](elem: A1): Boolean = exists (_ == elem)
If you are still not convinced at this point, some more "step into" will lead you to:
def exists(p: A => Boolean): Boolean = {
var res = false
while (!res && hasNext) res = p(next())
res
}

Depth-Limited Search Recursive Pseudocode

I'm looking at a pseudocode implementation of Depth-Limited Search, and I'm having trouble understanding it.
The pseudocode is:
Function Recursive-DLS(node, problem, limit)
cutoff-occurred? = false
if Goal-Test(problem, State[node]) then return node
else if Depth[node] = limit then return cutoff
else for each successor in Expand(node, problem) do
result = Recursive-DLS(successor, problem, limit-1)
if result = cutoff then cutoff-occurred? = true
else if result != failure then return result
if cutoff-occurred? then return cutoff else return failure
Im mainly having trouble understanding the reason for recurring the algo with limit-1 for every successor. Can someone run through this with me? Some graphical explanation would be nice haha.
I'm going to look at other sources in the meantime. Thanks for reading!
The pseudo-code appears to be wrong. (It's actually possible for the base case to never be encountered if the node depth/limit values skip eachother - being simultaneously increased and decreased in each recursive call.)
The recursive case was written with the limit - 1 so a base case can be reached (instead of "limit", think "remaining").
However, the base case for the limit is if Depth[node] = limit then return cutoff. Following David Chan's suggestion it should be if 0 = limit then return cutoff, which would agree with the limit - 1 in the recursive case.
Alternatively, just pass limit in the recursive case and leave the current base case alone.

An efficient technique to replace an occurence in a sequence with mutable or immutable state

I am searching for an efficient a technique to find a sequence of Op occurences in a Seq[Op]. Once an occurence is found, I want to replace the occurence with a defined replacement and run the same search again until the list stops changing.
Scenario:
I have three types of Op case classes. Pop() extends Op, Push() extends Op and Nop() extends Op. I want to replace the occurence of Push(), Pop() with Nop(). Basically the code could look like seq.replace(Push() ~ Pop() ~> Nop()).
Problem:
Now that I call seq.replace(...) I will have to search in the sequence for an occurence of Push(), Pop(). So far so good. I find the occurence. But now I will have to splice the occurence form the list and insert the replacement.
Now there are two options. My list could be mutable or immutable. If I use an immutable list I am scared regarding performance because those sequences are usually 500+ elements in size. If I replace a lot of occurences like A ~ B ~ C ~> D ~ E I will create a lot of new objects If I am not mistaken. However I could also use a mutable sequence like ListBuffer[Op].
Basically from a linked-list background I would just do some pointer-bending and after a total of four operations I am done with the replacement without creating new objects. That is why I am now concerned about performance. Especially since this is a performance-critical operation for me.
Question:
How would you implement the replace() method in a Scala fashion and what kind of data structure would you use keeping in mind that this is a performance-critical operation?
I am happy with answers that point me in the right direction or pseudo code. No need to write a full replace method.
Thank you.
Ok, some considerations to be made. First, recall that, on lists, tail does not create objects, and prepending (::) only creates one object for each prepended element. That's pretty much as good as you can get, generally speaking.
One way of doing this would be this:
def myReplace(input: List[Op], pattern: List[Op], replacement: List[Op]) = {
// This function should be part of an KMP algorithm instead, for performance
def compare(pattern: List[Op], list: List[Op]): Boolean = (pattern, list) match {
case (x :: xs, y :: ys) if x == y => compare(xs, ys)
case (Nil, Nil) => true
case _ => false
}
var processed: List[Op] = Nil
var unprocessed: List[Op] = input
val patternLength = pattern.length
val reversedReplacement = replacement.reverse
// Do this until we finish processing the whole sequence
while (unprocessed.nonEmpty) {
// This inside algorithm would be better if replaced by KMP
// Quickly process non-matching sequences
while (unprocessed.nonEmpty && unprocessed.head != pattern.head) {
processed ::= unprocessed.head
unprocessed = unprocessed.tail
}
if (unprocessed.nonEmpty) {
if (compare(pattern, unprocessed)) {
processed :::= reversedReplacement
unprocessed = unprocessed drop patternLength
} else {
processed ::= unprocessed.head
unprocessed = unprocessed.tail
}
}
}
processed.reverse
}
You may gain speed by using KMP, particularly if the pattern searched for is long.
Now, what is the problem with this algorithm? The problem is that it won't test if the replaced pattern causes a match before that position. For instance, if I replace ACB with C, and I have an input AACBB, then the result of this algorithm will be ACB instead of C.
To avoid this problem, you should create a backtrack. First, you check at which position in your pattern the replacement may happen:
val positionOfReplacement = pattern.indexOfSlice(replacement)
Then, you modify the replacement part of the algorithm this:
if (compare(pattern, unprocessed)) {
if (positionOfReplacement > 0) {
unprocessed :::= replacement
unprocessed :::= processed take positionOfReplacement
processed = processed drop positionOfReplacement
} else {
processed :::= reversedReplacement
unprocessed = unprocessed drop patternLength
}
} else {
This will backtrack enough to solve the problem.
This algorithm won't deal efficiently, however, with multiply patterns at the same time, which I guess is where you are going. For that, you'll probably need some adaptation of KMP, to do it efficiently, or, otherwise, use a DFA to control possible matchings. It gets even worse if you want to match both AB and ABC.
In practice, the full blow problem is equivalent to regex match & replace, where the replace is a function of the match. Which means, of course, you may want to start looking into regex algorithms.
EDIT
I was forgetting to complete my reasoning. If that technique doesn't work for some reason, then my advice is going with an immutable tree-based vector. Tree-based vectors enable replacement of partial sequences with low amount of copying.
And if that doesn't do, then the solution is doubly linked lists. And pick one from a library with slice replacement -- otherwise you may end up spending way too much time debugging a known but tricky algorithm.

Resources