Comparing letter occurrence with comparator - sorting

I am having trouble sorting a String array by how many times the letter 'p' occurs in each string, i.e., s1 (apple) should come before s2 (ape)...
I am learning how to implement a Comparator for this, using s1.compareTo(s2) and a lambda. The big question is: can't I somehow use a stream to do this?
This is how I sorted my String array COUNTRIES in reverse alphabetical order:
Comparator<String> reverseAlphabetic = (s1,s2) -> -s1.compareToIgnoreCase(s2);
Arrays.sort(COUNTRIES,reverseAlphabetic);
System.out.println("\nCountries in reverse alphabetic order");
for (int i=0; i<10;i++)
    System.out.println("\t"+COUNTRIES[i]);

(Struck out by the author.) You can do something like this:
Comparator<String> comparator = (str1, str2) ->
((str1.length() - str1.replaceAll("p", "").length()) -
(str2.length() - str2.replaceAll("p", "").length()));
List<String> list = Arrays.asList("ape", "apple", "appple");
list.sort(comparator);
Actually, my solution had fundamental mistakes. @Holger commented a solution on my deleted answer:
list.sort(Comparator
    .comparingLong((String s) -> s.chars().filter(c -> c == 'p').count()).reversed());
Comment by @Holger:
Your first variant is broken too, as a - b - c - d is not the same as a - b - (c - d). It just produces the desired result by accident; a different original order will lead to different results. Yet another reason not to use minus when you mean, e.g. Integer.compare(…, …). A correct and efficient comparator can be as simple as Comparator.comparingLong(s -> s.chars().filter(c -> c=='p').count()) (or, to have most occurrences first, Comparator.comparingLong((String s) -> s.chars().filter(c -> c=='p').count()).reversed()), though an old-fashioned counting loop would be even better.
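The comment above mentions that a plain counting loop could replace the stream-based count. A minimal sketch of that idea follows; the class name PCountSort and the helpers countChar and sortedByP are made up for this illustration and are not part of the original answer. It also shows one way to do the sort through a stream, which is what the question asked about.

```java
import java.util.Arrays;
import java.util.Comparator;

class PCountSort {
    // Counts how often the character c occurs in s using a plain loop.
    static int countChar(String s, char c) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                n++;
            }
        }
        return n;
    }

    // Orders strings so that the ones with the most 'p' characters come first.
    // comparingInt uses Integer.compare internally, so no subtraction is involved.
    static final Comparator<String> MOST_P_FIRST =
            Comparator.comparingInt((String s) -> countChar(s, 'p')).reversed();

    // Stream variant: returns a new sorted array and leaves the input untouched.
    static String[] sortedByP(String[] words) {
        return Arrays.stream(words).sorted(MOST_P_FIRST).toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] words = {"ape", "appple", "banana", "apple"};
        // prints [appple, apple, ape, banana]
        System.out.println(Arrays.toString(sortedByP(words)));
    }
}
```

Arrays.sort(words, MOST_P_FIRST) would sort the array in place with the same comparator; the stream version is only preferable when a copy is wanted.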

Algorithm for enumerating all permutations of algebraic expressions

If I have a list of variables, such as {A, B, C} and a list of operators, such as {AND, OR}, how can I efficiently enumerate all permutations of valid expressions?
Given the above, I would want to see as output (assuming evaluation from left-to-right with no operator precedence):
A AND B AND C
A OR B OR C
A AND B OR C
A AND C OR B
B AND C OR A
A OR B AND C
A OR C AND B
B OR C AND A
I believe that is an exhaustive enumeration of all combinations of inputs. I don't want to be redundant, so for example, I wouldn't add "C OR B AND A" because that is the same as "B OR C AND A".
Any ideas of how I can come up with an algorithm to do this? I really have no idea where to even start.
Recursion is a simple option to go:
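void AllPossibilities(variables, operators, index, currentExpression){
    // note: variables may repeat and equivalent expressions are not filtered out
    if(index == variables.size) {
        print(currentExpression);
        return;
    }
    foreach(v in variables){
        if(index == variables.size - 1) {
            // last position: append the variable without a trailing operator
            AllPossibilities(variables, operators, index + 1, currentExpression + v);
        } else {
            foreach(op in operators){
                // extend the expression built so far instead of discarding it
                AllPossibilities(variables, operators, index + 1, currentExpression + v + " " + op + " ");
            }
        }
    }
}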
This is not an easy problem. First, you need a notion of grouping, because
(A AND B) OR C != A AND (B OR C)
Second, you need to generate all expressions. This will mean iterating through every permutation of terms, and grouping of terms in the permutation.
Third, you have to actually parse every expression, bringing the parsed expressions into a canonical form (say, CNF; for building the expression tree see https://en.wikipedia.org/wiki/Binary_expression_tree#Construction_of_an_expression_tree).
Finally, you have to actually check equivalence of the expressions seen so far. This is checking equivalence of the AST formed by parsing.
It will look loosely like this.
INPUT: terms, operations
unique_expressions = empty_set
for p_t in permutations of terms:
    for p_o in sequences of len(terms) - 1 operations (repetition allowed):
        e = merge_into_expression(p_t, p_o)
        parsed_e = parse(e)
        already_seen = False
        for unique_e in unique_expressions:
            if equivalent(parsed_e, unique_e):
                already_seen = True
                break
        if not already_seen:
            unique_expressions.add(parsed_e)
For more info, check out this post: How to check if two boolean expressions are equivalent
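The outline above can be sketched in Java under some simplifying assumptions that are mine, not the answer's: only AND and OR are supported, expressions are evaluated strictly left to right with no grouping (as the question states), and equivalence is checked by comparing truth tables instead of parsing into a canonical form. The class name ExpressionEnumerator and its methods are hypothetical, and the truth table is packed into a long, so the sketch is limited to six variables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ExpressionEnumerator {
    // Evaluates v[perm[0]] op v[perm[1]] op ... left to right for every assignment
    // of the variables and packs the results into a bit mask (the truth table).
    static long truthTable(int[] perm, boolean[] isAnd) {
        int n = perm.length;
        long table = 0L;
        for (int assignment = 0; assignment < (1 << n); assignment++) {
            boolean acc = ((assignment >> perm[0]) & 1) == 1;
            for (int i = 1; i < n; i++) {
                boolean v = ((assignment >> perm[i]) & 1) == 1;
                acc = isAnd[i - 1] ? (acc && v) : (acc || v);
            }
            if (acc) {
                table |= 1L << assignment;
            }
        }
        return table;
    }

    // Collects every permutation of a[k..] (swap-based recursion).
    static void permute(int[] a, int k, List<int[]> out) {
        if (k == a.length) {
            out.add(a.clone());
            return;
        }
        for (int i = k; i < a.length; i++) {
            int t = a[k]; a[k] = a[i]; a[i] = t;
            permute(a, k + 1, out);
            t = a[k]; a[k] = a[i]; a[i] = t;
        }
    }

    // Returns one representative expression per distinct truth table.
    static List<String> uniqueExpressions(String[] vars) {
        int n = vars.length;
        if (n < 1 || n > 6) {
            throw new IllegalArgumentException("this sketch supports 1 to 6 variables");
        }
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) idx[i] = i;
        List<int[]> perms = new ArrayList<>();
        permute(idx, 0, perms);

        Map<Long, String> seen = new LinkedHashMap<>();
        for (int[] perm : perms) {
            // each bit of ops picks AND (1) or OR (0) for one operator slot
            for (int ops = 0; ops < (1 << (n - 1)); ops++) {
                boolean[] isAnd = new boolean[n - 1];
                StringBuilder text = new StringBuilder(vars[perm[0]]);
                for (int i = 1; i < n; i++) {
                    isAnd[i - 1] = ((ops >> (i - 1)) & 1) == 1;
                    text.append(isAnd[i - 1] ? " AND " : " OR ").append(vars[perm[i]]);
                }
                seen.putIfAbsent(truthTable(perm, isAnd), text.toString());
            }
        }
        return new ArrayList<>(seen.values());
    }

    public static void main(String[] args) {
        for (String e : uniqueExpressions(new String[]{"A", "B", "C"})) {
            System.out.println(e);
        }
    }
}
```

For {A, B, C} this keeps eight expressions, the same count as the list in the question. Comparing truth tables costs 2^n evaluations per expression, so it only suits small inputs; supporting grouping would need the parsing step described above.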

Is there a performance difference between head$filter and head$dropWhile with Haskell Strings?

I'm working on lists of "People" objects in Haskell, and I was wondering if there was any difference in performance between head$dropWhile and head$filter to find the first person with a given name. The two options and a snip of the datatype would be:
data Person = Person { name :: String
                     , otherStuff :: StuffTypesAboutPerson }
findPerson :: String -> [Person] -> Person
findPerson n = head . dropWhile (\p -> name p /= n)
findPerson n = head . filter (\p -> name p == n)
My thought was, filter would have to compare the full length of n to the full length of every name until it finds the first one. I would think dropWhile would only need to compare the strings until the first non-matching Char. However, I know there is a ton of magic in Haskell, especially GHC. I would prefer to use the filter version, because I think it's more straight-forward to read. However, I was wondering if there actually is any performance difference? Even if it's negligible, I'm also interested from a curiosity standpoint at this point.
Edit: I know I also need to protect from errors with Maybe, etc, but I left that out to simplify the code example.
There are several approaches to the problem:
findPerson n = head . dropWhile (\p -> name p /= n)
findPerson n = head . filter (\p -> name p == n)
findPerson n = fromJust . find (\p -> name p == n)
The question also points out two facts:
when x,y are equal strings, == needs to compare all the characters
when x,y are different strings, /= only needs to compare until the first different character
This is correct, but does not consider the other cases
when x,y are equal strings, /= needs to compare all the characters
when x,y are different strings, == only needs to compare until the first different character
So, between == and /= there is no performance winner. We can expect that, at most, one of them will perform one additional not compared with the other.
Also, all the three implementations of findPerson mentioned above, essentially perform the same steps. Given xs :: [Person], they will all scan xs until a matching name is found, and no more. On all the persons before the match, the name will be compared against n, and this comparison will stop at the first different character (no matter what comparison we use above). The matching person will have their name compared completely with n (again, in all cases).
Hence, the approaches are expected to run in the same time. There might be a very small difference between them, but it could be so small that it would be hard to detect. You can try to experiment with criterion and see what happens, if you wish.

Adding 2 Int Lists Together F#

I am working on homework and the problem is where we get 2 int lists of the same size, and then add the numbers together. Example as follows.
vecadd [1;2;3] [4;5;6];; would return [5;7;9]
I am new to this and I need to keep my code pretty simple so I can learn from it. I have this so far. (Not working)
let rec vecadd L K =
if L <> [] then vecadd ((L.Head+K.Head)::L) K else [];;
I essentially want to just replace the first list (L) with the added numbers. Also I have tried to code it a different way using the match cases.
let rec vecadd L K =
match L with
|[]->[]
|h::[]-> L
|h::t -> vecadd ((h+K.Head)::[]) K
Neither of them are working and I would appreciate any help I can get.
First, your idea about modifying the first list instead of returning a new one is misguided. Mutation (i.e. modifying data in place) is the number one reason for bugs today (used to be goto, but that's been banned for a long time now). Making every operation produce a new datum rather than modify existing ones is much, much safer. And in some cases it may be even more performant, quite counterintuitively (see below).
Second, the way you're trying to do it, you're not doing what you think you're doing. The double-colon doesn't mean "modify the first item". It means "attach an item in front". For example:
let a = [1; 2; 3]
let b = 4 :: a // b = [4; 1; 2; 3]
let c = 5 :: b // c = [5; 4; 1; 2; 3]
That's how lists are actually built: you start with an empty list and prepend items to it. The [1; 2; 3] syntax you're using is just syntactic sugar for that. That is, [1; 2; 3] === 1::2::3::[].
So how do I modify a list, you ask? The answer is, you don't! F# lists are immutable data structures. Once you've created a list, you can't modify it.
This immutability allows for an interesting optimization. Take another look at the example I posted above, the one with three lists a, b, and c. How many cells of memory do you think these three lists occupy? The first list has 3 items, second - 4, and third - 5, so the total amount of memory taken must be 12, right? Wrong! The total amount of memory taken up by these three lists is actually just 5 cells. This is because list b is not a block of memory of length 4, but rather just the number 4 paired with a pointer to the list a. The number 4 is called "head" of the list, and the pointer is called its "tail". Similarly, the list c consists of one number 5 (its "head") and a pointer to list b, which is its "tail".
If lists were not immutable, one couldn't organize them like this: what if somebody modifies my tail? Lists would have to be copied every time (google "defensive copy").
So the only way to work with lists is to return a new one. What you're trying to do can be described like this: if the input lists are empty, the result is an empty list; otherwise, the result is the sum of tails prepended with the sum of heads. You can write this down in F# almost verbatim:
let rec add a b =
    match a, b with
    | [], [] -> [] // sum of two empty lists is an empty list
    | a::atail, b::btail -> (a + b) :: (add atail btail) // sum of non-empty lists is sum of their tails prepended with sum of their heads
Note that this program is incomplete: it doesn't specify what the result should be when one input is empty and the other is not. The compiler will generate a warning about this. I'll leave the solution as an exercise for the reader.
You can map over both lists together with List.map2 (see the docs)
It goes over both lists pairwise and you can give it a function (the first parameter of List.map2) to apply to every pair of elements from the lists. And that generates the new list.
let a = [1;2;3]
let b = [4;5;6]
let vecadd = List.map2 (+)
let result = vecadd a b
printfn "%A" result
And if you want to do more of the work 'yourself', something like this?
let a = [1;2;3]
let b = [4;5;6]
let vecadd l1 l2 =
    let rec step l1 l2 acc =
        match l1, l2 with
        | [], [] -> acc
        | [], _ | _, [] -> failwithf "one list is bigger than the other"
        | h1 :: t1, h2 :: t2 -> step t1 t2 (List.append acc [(h1 + h2)])
    step l1 l2 []
let result = vecadd a b
printfn "%A" result
The step function is a recursive function that takes two lists and an accumulator to carry the result.
In the last match case it does three things:
Sum the heads of both lists
Add the result to the accumulator
Recursively call itself with the new accumulator and the tails of the lists
The first match case returns the accumulator as the result when the remaining lists are empty.
The second match case raises an error when one of the lists is longer than the other.
The call step l1 l2 [] kicks it off with the two supplied lists and an empty accumulator.
I have done this for crossing two lists (multiply items with same index together):
let items = [1I..50_000I]
let another = [1I..50_000I]
let rec cross a b =
    let rec cross_internal = function
        | r, [], [] -> r
        | r, [], t -> r @ t
        | r, t, [] -> r @ t
        | r, head::t1, head2::t2 -> cross_internal(r @ [head*head2], t1, t2)
    cross_internal([], a, b)
let result = cross items another
result |> printf "%A,"
Note: not really performant. There are list object creations at each step which is horrible. Ideally the inner function cross_internal must create a mutable list and keep updating it.
Note2: my ranges were larger initially and used bigint (hence the I suffix on 50_000I), but I then reduced the sample code above to just 50,000 elements.

Collection of variable-length Tuples AND map with multiple values TO weighted combinations

This problem is from "Functional Programming Principles in Scala" @ Coursera, so I need to avoid having complete code in this question; it's past the deadline already, but there are always years to come. I was looking for general suggestions on methods to implement this transformation.
I have a set of variable-length Tuples, full subset of some String in lowercase
val indexes = Set('a', 'b', 'c')
and a set of tuples with the max allowed occurrence for each char
val occurences = Set(('a', 1), ('b', 5), ('c', 2))
I want to get combinations of weighted tuples:
val result = Seq(Set(), Set((a, 1)), Set((b, 1)), Set((b, 2)) ... Set((a, 1)(b, 2)(c, 2)) ...)
My assignment suggests an easy way of building the result via recursive iteration.
I'd like to do it in a more "structural?" way. My idea was to get all possible char subsets and multiplex those with added weights (~pseudocode in the last line of the post).
I got the subsets via the handy subsets operator
val subsets = Seq(Set(), Set(a), Set(b), Set(c), Set(a, b), Set(a, c), Set(b, c), Set(a, b, c))
and also a map of specific Int values for each char, either of
val weightsMax = Map(a -> 1, b -> 5, c -> 2)
val weightsAll = Map(a -> List(1), b -> List(5,4,3,2,1), c -> List(2,1))
I don't really know which language feature I should use for this operation.
I know about for and collection operations but don't have experience to operate with those on this level as I'm new to functional paradigm (and collection operations too).
I would have no problems with making some corporate-styled java / XML to solve this (yeah ...).
I'd like to have something similar defined:
FOREACH subset (MAP chars TO (COMBINATIONS OF weights FOR chars))
You can express this problem recursively and implement it this exact way. We want to build a function called expend: Set[Char] => List[Set[(Char, Int)]] that returns all the possible combinations of weights for a set of chars (you wrote it as chars TO (COMBINATIONS OF weights FOR chars)). The intuitive "by the definition" way is to assign each possible weight to the first char, and for each of these assign each possible weight to the second char, and so on...
def expend(set: Set[Char]): List[Set[(Char, Int)]] =
  // base case: exactly one (empty) combination; Nil here would make every result empty
  if (set.isEmpty) List(Set.empty) else
    allPairsFromChar(set.head) flatMap (x => expend(set.tail) map (_ + x))
Where allPairsFromChar is trivial from your weightsAll and your FOREACH subset (...) is another flatMap ;)

Ocaml homework need some advices

We have N sets of integers A1, A2, A3 ... An. Find an algorithm that returns a list containing one element from each of the sets, with the property that the difference between the largest and the smallest element in the list is minimal.
Example:
IN: A1 = [0,4,9], A2 = [2,6,11], A3 = [3,8,13], A4 = [7,12]
OUT: [9,6,8,7]
I have an idea for this exercise: first we sort all the elements into one list (each element tagged with its set), so with that input we get this:
[[0,1],[2,2],[3,3],[4,1],[6,2],[7,4],[8,3],[9,1],[11,2],[12,4],[13,3]]
Then we create all possible lists, find the one with the smallest difference between its smallest and largest element, and return the correct output, like this: [9,6,8,7]
I am a newbie in OCaml, so I have some questions about coding this:
Can I create a function with N (an arbitrary number of) arguments?
Should I create a new type, like a list of pairs, to represent these assumptions?
Sorry for my bad English; I hope you will understand what I wanted to express.
This answer is about the algorithmic part, not the OCaml code.
You might want to implement your proposed solution first, to have a working one and to compare its results with an improved solution, which I now write about.
Here is a hint about how to improve the algorithmic part. Consider sorting all sets, not only the first one. Now, the list of all minimum elements from all sets is a candidate for the output.
To consider other candidate outputs, how can you move on from there?
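One possible reading of this hint, sketched in Java rather than OCaml: keep one position per sorted set, treat the current picks as a candidate, and then advance the position in whichever set currently holds the smallest pick. The answer does not spell this step out, so the advancing rule is my assumption, and the names MinRangePick and pick are made up for the illustration.

```java
import java.util.Arrays;

class MinRangePick {
    // Returns one element per set such that (largest - smallest) is minimal.
    // The result is ordered like the input sets.
    static int[] pick(int[][] sets) {
        int n = sets.length;
        if (n == 0) {
            return new int[0];
        }
        int[][] sorted = new int[n][];
        for (int i = 0; i < n; i++) {
            if (sets[i].length == 0) {
                throw new IllegalArgumentException("every set needs at least one element");
            }
            sorted[i] = sets[i].clone();
            Arrays.sort(sorted[i]);
        }
        int[] pos = new int[n];          // current position in each sorted set
        int[] best = null;
        long bestRange = Long.MAX_VALUE;
        while (true) {
            // find which set holds the smallest current pick, and the largest pick
            int minSet = 0;
            int max = sorted[0][pos[0]];
            for (int i = 1; i < n; i++) {
                int v = sorted[i][pos[i]];
                if (v < sorted[minSet][pos[minSet]]) minSet = i;
                if (v > max) max = v;
            }
            long range = (long) max - sorted[minSet][pos[minSet]];
            if (range < bestRange) {
                bestRange = range;
                best = new int[n];
                for (int i = 0; i < n; i++) best[i] = sorted[i][pos[i]];
            }
            // only raising the minimum can shrink the range; stop when that set runs out
            if (++pos[minSet] == sorted[minSet].length) {
                return best;
            }
        }
    }

    public static void main(String[] args) {
        int[][] sets = {{0, 4, 9}, {2, 6, 11}, {3, 8, 13}, {7, 12}};
        // prints [9, 6, 8, 7]
        System.out.println(Arrays.toString(pick(sets)));
    }
}
```

On the example from the question the candidates go from range 7 down to range 3 at [9, 6, 8, 7]. Each step scans all n current picks, so the sketch is simple rather than fast; a priority queue over the current picks would speed up finding the minimum.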
I'm just going to answer your questions, rather than comment on your proposed solution. (But I think you'll have to work on it a little more before you're done.)
You can write a function that takes a list of lists. This is pretty much the same
as allowing an arbitrary number of arguments. But really it just has one argument
(like all functions in OCaml).
You can just use built-in types like lists and tuples, you don't need to create or
declare them explicitly.
Here's an example function that takes a list of lists and combines them into one big long list:
let rec concat lists =
  match lists with
  | [] -> []
  | head :: tail -> head @ concat tail
Here is the routine you described in the question to get you started. Note that
I did not pay any attention to efficiency. Also added the reverse apply (pipe)
operator for clarity.
let test_set = [[0;4;9];[2;6;11];[3;8;13]; [7;12]]
let (|>) g f = f g
let linearize sets =
  let open List in sets
  |> mapi (fun i e -> e |> map (fun x -> (x, i+1) ))
  |> flatten |> sort (fun (e1,_) (e2, _) -> compare e1 e2)
let sorted = linearize test_set
Your approach does not sound very efficient: with n sets, the i-th having x_i elements, your sorted list will have x_1 + ... + x_n elements, and the number of candidate lists (one element from each set) you can generate out of that would be x_1 * x_2 * ... * x_n, which grows exponentially with n.
I'd like to propose a different approach, but you'll have to work out the details:
1. Tag (index) each element with its set identifier (like you have done).
2. Sort each set individually.
3. Build the exact opposite to that of your desired result!
4. Optimize!
I hope you can figure out steps 3, 4 on your own... :)
