How to assign rooms most efficiently? - algorithm

Story:
Our company will go on an outing soon. During our stay at the resort, colleagues will share rooms in pairs. Our admin assistant has collected our preferences about whom we would like to share a room with, and now she has to arrange the rooms so that the number of rooms required is minimized. Everyone who shares a room must share it with somebody he or she would like to share with. For example, suppose there are only three colleagues: Allen would like to share a room with Bob or Chris, Bob would like to share with Chris, and Chris would like to share with Allen. Then the only result is that Allen and Chris share a room and Bob uses a room alone, so 2 rooms are needed in total.
Question:
To simplify the story into an algorithm question (which may not be the best simplification): we have a few nodes in a graph, connected to each other by directed edges. We only care about pairs of nodes that are connected in both directions, so we end up with an undirected graph. How can we divide the nodes of the undirected graph into groups so that 1) every group contains at most 2 nodes, 2) if a group contains 2 nodes, those nodes are connected, and 3) the number of groups is minimized?
Algorithm:
What comes to mind is to solve the question greedily. In every step of the arrangement, remove either one isolated node or two connected nodes, chosen so that the number of edges remaining in the graph is maximized. Repeating this until no nodes are left eventually yields a solution.
Please either solve the question in an optimal way (and I am not looking for a way to try all combinations) or prove the greedy algorithm described above is optimal.

The problem you are solving is finding the maximum matching in a graph. This means finding the maximum number of edges that do not share vertices. In your case, those edges would correspond to shared rooms, and the remaining vertices would be single rooms.
The maximum matching can be found using the Blossom algorithm in polynomial time.
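For a handful of colleagues, the link between matchings and rooms can be checked without implementing Blossom. The sketch below is not the Blossom algorithm: it is an exponential-time exhaustive search, only meant to illustrate that the number of rooms equals the number of people minus the size of a maximum matching. The names mutualEdges, maxMatching and rooms are made up for this illustration.

```haskell
import Data.List (nub)

type Person = String

-- Keep only mutual preferences; each becomes one undirected edge (a < b avoids duplicates).
mutualEdges :: [(Person, [Person])] -> [(Person, Person)]
mutualEdges prefs =
  [ (a, b)
  | (a, likesA) <- prefs
  , (b, likesB) <- prefs
  , a < b
  , b `elem` likesA
  , a `elem` likesB
  ]

-- Size of a maximum matching by brute force: for the first edge, either skip it
-- or use it and discard every remaining edge that touches one of its endpoints.
maxMatching :: [(Person, Person)] -> Int
maxMatching [] = 0
maxMatching ((a, b) : es) = max skipEdge useEdge
  where
    skipEdge = maxMatching es
    useEdge  = 1 + maxMatching [e | e@(c, d) <- es, c `notElem` [a, b], d `notElem` [a, b]]

-- Every matched pair saves one room compared to giving everyone a single room.
rooms :: [(Person, [Person])] -> Int
rooms prefs = length people - maxMatching (mutualEdges prefs)
  where
    people = nub (map fst prefs ++ concatMap snd prefs)
```

On the three-colleague example from the question (Allen, Bob, Chris) this gives 2 rooms. For anything beyond toy sizes, a polynomial-time matching algorithm such as Blossom is the better choice.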

Here's something crude in Haskell. The function, "pairs," lists all pairs with a mutual preference, and people without a mutual partner (paired with ""). The function, "choose," returns pairs from the pair list. If both people in a pair are also paired with another (same) third person, "choose" removes those two people from the rest of the pair list, as well as pairs emptied as a consequence. The number of rooms needed is equal to the length of the final list.
Output (it would be nice to have more varied examples to test):
*Main> choose graph
[["Chris","Allen"],["Bob","Isaak"]]
*Main> choose graph1
[["Allen","Chris"],["Bob",""],["Dave",""],["Chris","Max"]] --four rooms would be needed, although Chris appears in two pairs (..figured they can decide later who stays where.)
*Main> choose graph2 --example given by Dante is not a Geek
[["Allen","Chris"],["Bob",""]]
Code:
import Data.List (group, sort, delete)
graph = [("Chris",["Isaak","Bob","Allen"]) --(person,preferences)
        ,("Allen",["Chris","Bob"])
        ,("Bob",["Allen","Chris","Isaak"])
        ,("Isaak",["Bob","Chris"])]
graph1 = [("Allen",["Bob","Chris"]), ("Bob",["Chris"]), ("Dave",[])
         ,("Chris",["Allen", "Max"]), ("Max", ["Chris"])]
graph2 = [("Allen",["Bob","Chris"]), ("Bob",["Chris"]), ("Chris",["Allen"])]
pairs graph = pairs' graph [] where
  pairs' [] result = concat result
  pairs' (x@(person1,_):xs) result
    | null test = if elem [[person1, ""]] result
                    then pairs' xs result
                    else pairs' xs ([[person1,""]]:result)
    | otherwise =
        pairs' xs ((filter (\[x,y] -> notElem [y,x] (concat result)) test):result)
    where isMutual a b = elem (fst a) (snd b) && elem (fst b) (snd a)
          test = foldr comb [] graph
          comb a@(person2,_) b =
            if isMutual a x then [person1,person2]:b else b
choose graph = comb paired [] where
  paired = pairs graph
  comb [] result = filter (/=["",""]) result
  comb (x@[p1,p2]:xs) result
    | x == ["",""] = comb xs result
    | test =
        comb (map delete' xs) (x:map delete' result)
    | otherwise = comb xs (x:result)
    where delete' [x,y] = if elem x [p1,p2] then ["",y]
                          else if elem y [p1,p2] then [x,""]
                          else [x,y]
          test = if not . null . filter ((>=2) . length) . group
                      . sort . map (delete p2 . delete p1)
                      . filter (\y -> y /= x && (elem p1 y || elem p2 y)) $ paired
                   then True
                   else False


sort a list of numbers by their 'visual similarity'

Consider a function which rates the level of 'visual similarity' between two numbers: 666666 and 666166 would be very similar, unlike 666666 and 111111.
type N = Int
type Rate = Int
similar :: N -> N -> Rate
similar a b = length . filter id . zipWith (==) a' $ b'
  where a' = show a
        b' = show b
similar 666666 666166
--> 5
-- high rate : very similar
similar 666666 111111
--> 0
-- low rate : not similar
There will be more sophisticated implementations for this, however this serves the purpose.
The intention is to find a function that sorts a given list of N's so that each item is the one most similar to its preceding item. Since the first item does not have a predecessor, a first N must be given.
similarSort :: N -> [N] -> [N]
Let's look at some sample data. The numbers don't need to have the same number of digits, but it makes them easier to reason about.
sample :: [N]
sample = [2234, 8881, 1222, 8888, 8822, 2221, 5428]
One could be tempted to implement the function like so (sortWith comes from GHC.Exts):
similarSortWrong x xs = reverse . sortWith (similar x) $ xs
but this would lead to a wrong result:
similarSortWrong 2222 sample
--> [2221,1222,8822,2234,5428,8888,8881]
In the beginning it looks correct, but 8822 should rather be followed by 8881, since that is more similar to it than 2234 is.
So here's the implementation I came up with:
similarSort x [] = [x] -- keep the last item instead of dropping it
similarSort x xs = x : similarSort a as
  where (a:as) = reverse . sortWith (similar x) $ xs
similarSort 2222 sample
--> [2222,2221,2234,1222,8822,8888,8881,5428]
It seems to work, but it also seems to do a lot more work than necessary. At every step the whole rest is sorted again, just to pick the first element. Usually laziness should make this cheap, but reverse might break that again. I'd be keen to hear if someone knows of a common abstraction for this problem.
It's relatively straightforward to implement the greedy algorithm you ask for. Let's start with some boilerplate; we'll use the these package for a zip-like that hands us the "unused" tail ends of zipped-together lists:
import Data.Align
import Data.These
sampleStart = "2222"
sampleNeighbors = ["2234", "8881", "1222", "8888", "8822", "2221", "5428"]
Instead of using numbers, I'll use lists of digits -- just so we don't have to litter the code with conversions between the form that's convenient for the user and the form that's convenient for the algorithm. You've been a bit fuzzy about how to rate the similarity of two digit strings, so let's make it as concrete as possible: any digits that differ cost 1, and if the digit strings vary in length we have to pay 1 for each extension to the right. Thus:
distance :: Eq a => [a] -> [a] -> Int
distance l r = sum $ alignWith elemDistance l r where
  elemDistance (These l r) | l == r = 0
  elemDistance _ = 1
A handy helper function will pick the smallest element of some list (by a user-specified measure) and return the rest of the list in some implementation-defined order.
minRestOn :: Ord b => (a -> b) -> [a] -> Maybe (a, [a])
minRestOn f [] = Nothing
minRestOn f (x:xs) = Just (go x [] xs) where
  go min rest [] = (min, rest)
  go min rest (x:xs) = if f x < f min
    then go x (min:rest) xs
    else go min (x:rest) xs
Now the greedy algorithm almost writes itself:
greedy :: Eq a => [a] -> [[a]] -> [[a]]
greedy here neighbors = here : case minRestOn (distance here) neighbors of
  Nothing -> []
  Just (min, rest) -> greedy min rest
We can try it out on your sample:
> greedy sampleStart sampleNeighbors
["2222","1222","2221","2234","5428","8888","8881","8822"]
Just eyeballing it, that seems to do okay. However, as with many greedy algorithms, this one only minimizes the local cost of each edge in the path. If you want to minimize the total cost of the path found, you need to use another algorithm. For example, we can pull in the astar package. For simplicity, I'm going to do everything in a very inefficient way, but it's not too hard to do it "right". We'll need a fair chunk more imports:
import Data.Graph.AStar
import Data.Hashable
import Data.List
import Data.Maybe
import qualified Data.HashSet as HS
Unlike before, where we only wanted the nearest neighbor, we'll now want all the neighbors. (Actually, we could probably implement the previous use of minRestOn using the following function and minimumOn or something. Give it a try if you're interested!)
neighbors :: (a, [a]) -> [(a, [a])]
neighbors (_, xs) = go [] xs where
  go ls [] = []
  go ls (r:rs) = (r, ls ++ rs) : go (r:ls) rs
We can now call the aStar search method with appropriate parameters. We'll use ([a], [[a]]) -- representing the current list of digits and the remaining lists that we can choose from -- as our node type. The arguments to aStar are then, in order: the function for finding neighboring nodes, the function for computing distance between neighboring nodes, the heuristic for how far we have left to go (we'll just say 1 for each unique element in the list), whether we've reached a goal node, and the initial node to start the search from. We'll call fromJust, but it should be okay: all nodes have at least one path to a goal node, just by choosing the remaining lists of digits in order.
optimal :: (Eq a, Ord a, Hashable a) => [a] -> [[a]] -> [[a]]
optimal here elsewhere = (here:) . map fst . fromJust $ aStar
  (HS.fromList . neighbors)
  (\(x, _) (y, _) -> distance x y)
  (\(x, xs) -> HS.size (HS.fromList (x:xs)) - 1)
  (\(_, xs) -> null xs)
  (here, elsewhere)
Let's see it run in ghci:
> optimal sampleStart sampleNeighbors
["2222","1222","8822","8881","8888","5428","2221","2234"]
We can see that it's done better this time by adding a pathLength function that computes all the distances between neighbors in a result.
pathLength :: Eq a => [[a]] -> Int
pathLength xs = sum [distance x y | x:y:_ <- tails xs]
In ghci:
> pathLength (greedy sampleStart sampleNeighbors)
15
> pathLength (optimal sampleStart sampleNeighbors)
14
In this particular example, I think the greedy algorithm could have found the optimal path if it had made the "right" choices whenever there were ties for minimal next step; but I expect it is not too hard to cook up an example where the greedy algorithm is forced into bad early choices.
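For short lists, the path lengths can be cross-checked without any extra packages by trying every ordering. The following is a minimal sketch under that assumption: bruteForce is a made-up name, distance is re-implemented on plain lists with the same cost rule as above (1 per differing position, 1 per unmatched trailing element), and the search is factorial in the number of neighbors, so it is only a testing aid.

```haskell
import Data.List (minimumBy, permutations)
import Data.Ord (comparing)

-- Same cost rule as in the answer, written without the these package:
-- 1 for each position that differs, 1 for each element left over on either side.
distance :: Eq a => [a] -> [a] -> Int
distance (x:xs) (y:ys) = (if x == y then 0 else 1) + distance xs ys
distance xs     ys     = length xs + length ys

-- Sum of the distances between consecutive entries of a path.
pathLength :: Eq a => [[a]] -> Int
pathLength xs = sum (zipWith distance xs (drop 1 xs))

-- Exhaustive search: try every ordering of the remaining lists and keep a cheapest one.
bruteForce :: Eq a => [a] -> [[a]] -> [[a]]
bruteForce start rest =
  minimumBy (comparing pathLength) [start : p | p <- permutations rest]
```

On the sample from the question, pathLength (bruteForce "2222" ["2234","8881","1222","8888","8822","2221","5428"]) evaluates to 14, which agrees with the length of the A* path shown above.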

Lazy Folding of Infinite Depth & Infinite Breadth Rose Tree to its Edge Paths

The question "haskell fold rose tree paths" delved into the code for folding a rose tree to its paths. I was experimenting with infinite rose trees, and I found that the provided solution was not lazy enough to work on rose trees that are infinite in both depth and breadth.
Consider a rose tree like:
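{-# LANGUAGE DeriveFunctor #-} -- needed for "deriving Functor" unless the extension is already enabled
data Rose a = Rose a [Rose a] deriving (Show, Functor)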
Here's a finite rose tree:
finiteTree = Rose "root" [
    Rose "a" [
      Rose "d" [],
      Rose "e" []
    ],
    Rose "b" [
      Rose "f" []
    ],
    Rose "c" []
  ]
The output of the edge path list should be:
[["root","a","d"],["root","a","e"],["root","b","f"],["root","c"]]
Here is an infinite Rose tree in both dimensions:
infiniteRoseTree :: [[a]] -> Rose a
infiniteRoseTree ((root:_):breadthGens) = Rose root (infiniteRoseForest breadthGens)
infiniteRoseForest :: [[a]] -> [Rose a]
infiniteRoseForest (breadthGen:breadthGens) = [ Rose x (infiniteRoseForest breadthGens) | x <- breadthGen ]
infiniteTree = infiniteRoseTree depthIndexedBreadths where
  depthIndexedBreadths = iterate (map (+1)) [0..]
The tree looks like this (it's just an excerpt, there's infinite depth and infinite breadth):
       0
       |
       |
    [1,2..]
    /     \
   /       \
  /         \
[2,3..]   [2,3..]
The paths would look like:
[[0,1,2..]..[0,2,2..]..]
Here was my latest attempt (doing it on GHCi causes an infinite loop, no streaming output):
rosePathsLazy (Rose x []) = [[x]]
rosePathsLazy (Rose x children) =
  concat [ map (x:) (rosePathsLazy child) | child <- children ]
rosePathsLazy infiniteTree
The provided solution in the other answer also did not produce any output:
foldRose f z (Rose x []) = [f x z]
foldRose f z (Rose x ns) = [f x y | n <- ns, y <- foldRose f z n]
foldRose (:) [] infiniteTree
Both of the above work for the finite rose tree.
I tried a number of variations, but I can't figure out how to make the edge folding operation lazy for an infinite 2-dimensional rose tree. I feel like it has something to do with infinite amounts of concat.
Since the output is a 2-dimensional list, I could run a 2-dimensional take and project with a depth limit, a breadth limit, or both at the same time!
Any help is appreciated!
After reviewing the answers here and thinking about it a bit more, I came to the realisation that this cannot be done, because the resulting list would be uncountably infinite. This is because a rose tree of infinite depth and breadth is not a 2-dimensional data structure, but an infinite-dimensional one: each depth level adds an extra dimension. In other words, it is somewhat equivalent to an infinite-dimensional matrix; imagine a matrix where each field is another matrix, ad infinitum. The cardinality of the infinite matrix is infinity ^ infinity, which has been proven (I think) to be uncountably infinite. This means any infinite-dimensional data structure is not really computable in a useful sense.
To apply this to the rose tree: if we have infinite depth, then the paths never enumerate past the far left of the rose tree. That is, this tree:
       0
       |
       |
    [1,2..]
    /     \
   /       \
  /         \
[2,3..]   [2,3..]
would produce paths like [[0,1,2..], [0,1,2..], [0,1,2..]..], and we'd never get past [0,1,2..].
Or, put another way: if we have a list containing lists ad infinitum, we can never count (enumerate) it either, as there would be an infinite number of dimensions for the code to descend into.
This also has some relationship to the real numbers being uncountably infinite. A lazy list of all real numbers would just produce 0.000.. forever and never enumerate past that.
I'm not sure how to formalise the above explanation, but that's my intuition. (For reference see: https://en.wikipedia.org/wiki/Uncountable_set) It'd be cool to see someone expand on applying https://en.wikipedia.org/wiki/Cantor's_diagonal_argument to this problem.
This book seems to expand on it: https://books.google.com.au/books?id=OPFoJZeI8MEC&pg=PA140&lpg=PA140&dq=haskell+uncountably+infinite&source=bl&ots=Z5hM-mFT6A&sig=ovzWV3AEO16M4scVPCDD-gyFgII&hl=en&sa=X&redir_esc=y#v=onepage&q=haskell%20uncountably%20infinite&f=false
For some reason, dfeuer has deleted his answer, which included a very nice insight and only a minor, easily-fixed problem. Below I discuss his nice insight, and fix the easily-fixed problem.
His insight is that the reason the original code hangs is because it is not obvious to concat that any of the elements of its argument list are non-empty. Since we can prove this (outside of Haskell, with paper and pencil), we can cheat just a little bit to convince the compiler that it's so.
Unfortunately, concat isn't quite good enough: if you give concat a list like [[1..], foo], it will never draw elements from foo. The universe collection of packages can help here with its diagonal function, which does draw elements from all sublists.
Together, these two insights lead to the following code:
import Data.Tree
import Data.Universe.Helpers
paths (Node x []) = [[x]]
paths (Node x children) = map (x:) (p:ps) where
  p:ps = diagonal (map paths children)
If we define a particular infinite tree:
infTree x = Node x [infTree (x+i) | i <- [1..]]
We can look at how it behaves in ghci:
> let v = paths (infTree 0)
> take 5 (head v)
[0,1,2,3,4]
> take 5 (map head v)
[0,0,0,0,0]
Looks pretty good! Of course, as observed by ErikR, we cannot have all paths in here. However, given any finite prefix p of an infinite path through t, there is a finite index in paths t whose element starts with prefix p.
Not a complete answer, but you might be interested in this detailed answer on how Haskell's permutations function is written so that it works on infinite lists:
What does this list permutations implementation in Haskell exactly do?
Update
Here's a simpler way to create an infinite Rose tree:
iRose x = Rose x [ iRose (x+i) | i <- [1..] ]
rindex (Rose a rs) [] = a
rindex (Rose _ rs) (x:xs) = rindex (rs !! x) xs
Examples:
rindex (iRose 0) [0,1,2,3,4,5,6] -- returns: 28
rindex infiniteTree [0,1,2,3,4,5,6] -- returns: 13
Infinite Depth
If a Rose tree has infinite depth and non-trivial width (> 1), a counting argument shows that no algorithm can list all of the paths: the total number of paths is uncountable.
Finite Depth & Infinite Breadth
If the Rose tree has finite depth the number of paths is countable even if the trees have infinite breadth, and there is an algorithm which can produce all possible paths. Watch this space for updates.
ErikR has explained why you can't produce a list that necessarily contains all the paths, but it is possible to list paths lazily from the left. The simplest trick, albeit a dirty one, is to recognize that the result is never empty and force that fact on Haskell.
paths (Rose x []) = [[x]]
paths (Rose x children) = map (x :) (a : as)
  where
    a : as = concatMap paths children
    -- Note that we know here that children is non-empty, and therefore
    -- the result will not be empty.
For making very infinite rose trees, consider
infTree labels = Rose labels (infForest labels)
infForest labels = [Rose labels' (infForest labels')
                   | labels' <- map (: labels) [0..]]
As chi points out, while this definition of paths is productive, it will in some cases repeat the leftmost path forever, and never reach any more. Oops! So some attempt at fairness or diagonal traversal is necessary to give interesting/useful results.
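If only a finite window of the tree is of interest, one way to sidestep the fairness problem entirely is to cut the tree down before enumerating paths, as the question hints at with its depth and breadth limits. The sketch below assumes that approach; prune and finitePaths are names invented here, and the Rose type and iRose are repeated so that the block stands on its own.

```haskell
data Rose a = Rose a [Rose a]

-- Keep at most `breadth` children per node and at most `depth` levels below the root.
-- Nodes at the cut-off depth become leaves.
prune :: Int -> Int -> Rose a -> Rose a
prune depth breadth (Rose x children)
  | depth <= 0 = Rose x []
  | otherwise  = Rose x (map (prune (depth - 1) breadth) (take breadth children))

-- Root-to-leaf paths of a finite tree (the plain, non-diagonal enumeration).
finitePaths :: Rose a -> [[a]]
finitePaths (Rose x [])       = [[x]]
finitePaths (Rose x children) = [x : p | child <- children, p <- finitePaths child]

-- The infinite example tree from the earlier answer.
iRose :: Int -> Rose Int
iRose x = Rose x [iRose (x + i) | i <- [1 ..]]
```

For instance, finitePaths (prune 2 2 (iRose 0)) yields [[0,1,2],[0,1,3],[0,2,3],[0,2,4]]. Because prune only uses take and map, it never forces more of the infinite tree than the requested window.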

Reversible permutations algorithm

Which algorithm for permuting a list is predictable?
For example, I can get the i-th permutation:
(Haskell code)
--List of all possible permutations
permut [] = [[]]
permut xs = [x:ys|x<-xs,ys<-permut (delete x xs)]
--In command line call:
> permut "abc" !! 2
"bac"
but I don't know how to reverse it.
I want to do something like this:
> getNumOfPermut "abc" "bac"
2
Any reversible algorithm goes!
Thank you in advance!
Okay, I wanted to wait until you answered my question about what you had tried, but I had so much fun working out the answer that I just had to write it up and share it. Nerd sniping, I guess! I'm sure I'm not the first to have invented the algorithm below, but I hope you enjoy the presentation.
Our first step is to give an actual runnable implementation of permut (which you have not done). Our implementation strategy will be a simple one: choose some element of the list, choose some permutation of the remaining elements, and concatenate the two.
chooseFrom [] = []
chooseFrom (x:xs) = (x,xs) : [(y, x:ys) | (y, ys) <- chooseFrom xs]
permut [] = [[]]
permut xs = do
  (element, remaining) <- chooseFrom xs
  permutation <- permut remaining
  return (element:permutation)
If we run this on a sample list, it's pretty clear how it behaves:
> permut [0..3]
[[0,1,2,3],[0,1,3,2],[0,2,1,3],[0,2,3,1],[0,3,1,2],[0,3,2,1],[1,0,2,3],[1,0,3,2],[1,2,0,3],[1,2,3,0],[1,3,0,2],[1,3,2,0],[2,0,1,3],[2,0,3,1],[2,1,0,3],[2,1,3,0],[2,3,0,1],[2,3,1,0],[3,0,1,2],[3,0,2,1],[3,1,0,2],[3,1,2,0],[3,2,0,1],[3,2,1,0]]
The result has a lot of structure; for example, if we group by the first element of the contained lists, there are four groups, each containing 6 (which is 3!) elements:
> mapM_ print $ groupBy ((==) `on` head) it
[[0,1,2,3],[0,1,3,2],[0,2,1,3],[0,2,3,1],[0,3,1,2],[0,3,2,1]]
[[1,0,2,3],[1,0,3,2],[1,2,0,3],[1,2,3,0],[1,3,0,2],[1,3,2,0]]
[[2,0,1,3],[2,0,3,1],[2,1,0,3],[2,1,3,0],[2,3,0,1],[2,3,1,0]]
[[3,0,1,2],[3,0,2,1],[3,1,0,2],[3,1,2,0],[3,2,0,1],[3,2,1,0]]
So! The first digit of the list tells us "how many 6s to add". Additionally, each list in the above grouping exhibits similar structure: the lists in the first group have three groups of 2! elements each containing 1, 2, and 3 as their second element; the lists in each of those groups have 2 groups of 1! elements each starting with each of the remaining digits; and each of those groups have 1 group(s) of 0! elements each starting with the only remaining digit. So the second digit tells us "how many 2s to add", the third digit tells us "how many 1s to add", and the last digit tells us "how many 1s to add" (but always tells us to add 0 1s).
If you have implemented a change-of-base function on numbers before (e.g. decimal to hexadecimal or similar) you may recognize this pattern. Indeed, we can treat this as a change-of-base operation with a sliding base: instead of 1s, 10s, 100s, 1000s, and so on columns, we have 0!s, 1!s, 2!s, 3!s, 4!s, and so on columns. Let's write it! For efficiency, we'll compute all the sliding bases up front with a factorials function.
import Data.List
factorials n = scanr (*) 1 [n,n-1..1]
deleteAt i xs = case splitAt i xs of (b, e) -> b ++ drop 1 e
permutIndices permutation original
  = go (factorials (length permutation - 1))
       permutation
       original
  where
  go _ [] [] = [0]
  go _ [] _ = []
  go _ _ [] = []
  go (base:bases) (x:xs) ys = do
    i <- elemIndices x ys
    remainder <- go bases xs (deleteAt i ys)
    return (i*base + remainder)
  go [] _ _ = error "the impossible happened!"
Here's a sample sanity-check:
> map (`permutIndices` [1..4]) (permut [1..4])
[[0],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23]]
And, for fun, here you can see it handling ambiguity correctly:
> permutIndices "acbba" "aabbc"
[21,23,45,47]
> map (permut "aabbc"!!) it
["acbba","acbba","acbba","acbba"]
...and showing that it's significantly more efficient than elemIndices:
> :set +s
> elemIndices "zyxwvutsr" (permut "rstuvwxyz")
[362879]
(2.65 secs, 1288004848 bytes)
> permutIndices "zyxwvutsr" "rstuvwxyz"
[362879]
(0.00 secs, 1030304 bytes)
Less than one thousandth the allocation/time. Seems like a win!
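Since the question asks for a reversible scheme, the same sliding-base idea can also be run in the other direction, from an index back to a permutation. The following is a rough sketch rather than part of the answer above: pick, nthPermut and permutIndex are names made up here, indices are 0-based, and permutIndex assumes the list has no repeated elements (for repeats, permutIndices above reports every matching index).

```haskell
import Data.List (elemIndex)

-- Remove the element at position i and return it together with the rest.
pick :: Int -> [a] -> (a, [a])
pick i xs = case splitAt i xs of
  (before, y : after) -> (y, before ++ after)
  _                   -> error "pick: index out of range"

-- The k-th permutation (0-based) in the order produced by permut.
-- Each leading element accounts for a block of (n-1)! permutations,
-- so k `divMod` (n-1)! gives the element to pick and the index within its block.
-- Assumes 0 <= k < n!; larger k ends in the error raised by pick.
nthPermut :: Int -> [a] -> [a]
nthPermut _ [] = []
nthPermut k xs = y : nthPermut r rest
  where
    block     = product [1 .. length xs - 1]
    (q, r)    = k `divMod` block
    (y, rest) = pick q xs

-- Inverse direction for lists without repeated elements:
-- Nothing if the first list is not a permutation of the second.
permutIndex :: Eq a => [a] -> [a] -> Maybe Int
permutIndex [] [] = Just 0
permutIndex (p : ps) xs = do
  i <- elemIndex p xs
  r <- permutIndex ps (snd (pick i xs))
  return (i * product [1 .. length ps] + r)
permutIndex _ _ = Nothing
```

With these definitions, nthPermut 2 "abc" is "bac" and permutIndex "bac" "abc" is Just 2, matching the example in the question.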
So, to be clear, you are looking for a way to find the position of a given permutation:
"bac"
in a list of given permutations:
["abc", "acb", "bac", ....]
This problem actually has nothing inherently to do with permutations themselves. You want to find the location of an element in a list.
As @raymonad mentioned in his comment, stackoverflow.com/questions/20641772/ deals with this question, and the answer there was to use elemIndex.
elemIndex thePermutationToFind $ permut theString
Keep in mind that if letters repeat, a value might appear more than once in the output if your "permut" function doesn't remove these duplicates (note that permut "aa" = ["aa", "aa"]). In this case the elemIndices function will come in useful.
If elemIndex returns Nothing, it means the string you supplied wasn't a permutation.
(This isn't the most efficient algorithm for large strings, since the number of permutations grows like the factorial of the size of the string, which is worse than exponential.)

Recursion confusion in Haskell again - subsets with an inclusion test

I'm testing a simple program to generate subsets with an inclusion test. For example, given
*Main Data.List> factorsets 7
[([2],2),([2,3],1),([3],1),([5],1),([7],1)]
calling chooseP 3 (factorsets 7), I would like to get (read from right to left, a la cons)
[[([5],1),([3],1),([2],2)]
,[([7],1),([3],1),([2],2)]
,[([7],1),([5],1),([2],2)]
,[([7],1),([5],1),([2,3],1)]
,[([7],1),([5],1),([3],1)]]
But my program is returning an extra [([7],1),([5],1),([3],1)] (and missing a [([7],1),([5],1),([2],2)]):
[[([5],1),([3],1),([2],2)]
,[([7],1),([3],1),([2],2)]
,[([7],1),([5],1),([3],1)]
,[([7],1),([5],1),([2,3],1)]
,[([7],1),([5],1),([3],1)]]
The inclusion test is: members' first part of the tuple must have a null intersection.
Once tested as working, the plan is to sum the internal products of each subset's snds, rather than accumulate them.
Since I've asked a similar question before, I imagine that an extra branch is generated since when the recursion splits at [2,3], the second branch runs over the same possibilities once it passes the skipped section. Any pointers on how to resolve that would be appreciated; and if you'd like to share ideas about how to enumerate and sum such product combinations more efficiently, that would be great, too.
Haskell code:
chooseP k xs = chooseP' xs [] 0 where
  chooseP' [] product count = if count == k then [product] else []
  chooseP' yys product count
    | count == k = [product]
    | null yys = []
    | otherwise = f ++ g
    where (y:ys) = yys
          (factorsY,numY) = y
          f = let zzs = dropWhile (\(fs,ns) -> not . and . map (null . intersect fs . fst) $ product) yys
              in if null zzs
                   then chooseP' [] product count
                   else let (z:zs) = zzs in chooseP' zs (z:product) (count + 1)
          g = if and . map (null . intersect factorsY . fst) $ product
                then chooseP' ys product count
                else chooseP' ys [] 0
Your code is complicated enough that I might recommend starting over. Here's how I would proceed.
Write a specification. Let it be as stupidly inefficient as necessary -- for example, the spec I choose below will build all combinations of k elements from the list, then filter out the bad ones. Even the filter will be stupidly slow.
import Control.Monad (liftM2, replicateM)
import Data.List (intersect, nub, sort, tails)
sorted xs = sort xs == xs
unique xs = nub xs == xs
disjoint xs = and $ liftM2 go xs xs where
  go x1 x2 = x1 == x2 || null (intersect x1 x2)
-- check that x is valid according to all the validation functions in fs
-- (there are other fun ways to spell this, but this is particularly
-- readable and clearly correct -- just what we want from a spec)
allFuns fs x = all ($x) fs
choosePSpec k = filter good . replicateM k where
  good pairs = allFuns [unique, disjoint, sorted] (map fst pairs)
Just to make sure it's right, we can test it at the prompt:
*Main> mapM_ print $ choosePSpec 3 [([2],2),([2,3],1),([3],1),([5],1),([7],1)]
[([2],2),([3],1),([5],1)]
[([2],2),([3],1),([7],1)]
[([2],2),([5],1),([7],1)]
[([2,3],1),([5],1),([7],1)]
[([3],1),([5],1),([7],1)]
Looks good.
Now that we have a spec, we can try to improve the speed one refactoring at a time, always checking that it matches the spec. The first thing I'd want to do is notice that we can ensure uniqueness and sortedness just by sorting the input and picking things "in an increasing way". To do this, we can define a function which chooses subsequences of a given length. It piggy-backs on the tails function, which you can think of as nondeterministically choosing a place to split its input list.
subseq 0 xs = [[]]
subseq n xs = do
  x':xt <- tails xs
  xs' <- subseq (n-1) xt
  return (x':xs')
Here's an example of this function in action:
*Main> subseq 3 [1..4]
[[1,2,3],[1,2,4],[1,3,4],[2,3,4]]
Now we can write a slightly faster chooseP by replacing replicateM with subseq. Recall that we're assuming the inputs are already sorted and unique, though.
choosePSlow k = filter good . subseq k where
  good pairs = disjoint $ map fst pairs
We can sanity-check that it's working by running it on the particular input we have from above:
*Main> let i = [([2],2),([2,3],1),([3],1),([5],1),([7],1)]
*Main> choosePSlow 3 i == choosePSpec 3 i
True
Or, better yet, we can stress-test it with QuickCheck. We'll need a tiny bit more code. The condition k < 5 is just because the spec is so hopelessly slow that bigger values of k take forever.
propSlowMatchesSpec :: NonNegative Int -> OrderedList ([Int], Int) -> Property
propSlowMatchesSpec (NonNegative k) (Ordered xs)
  = k < 5 && unique (map fst xs)
  ==> choosePSlow k xs == choosePSpec k xs
*Main> quickCheck propSlowMatchesSpec
+++ OK, passed 100 tests.
There are several more opportunities to make things faster. For instance, the disjoint test could be sped up using choose 2 instead of liftM2; or we might be able to ensure disjointness during element selection and prune the search even earlier; etc. How you want to improve it from here I leave to you -- but the basic technique (start with stupid and slow, then make it smarter, testing as you go) should be helpful to you.
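As one possible next refactoring, the idea of ensuring disjointness during element selection could look roughly like the sketch below. It is not the answer's final code: choosePPruned is a hypothetical name, and like choosePSlow it assumes the input is already sorted with unique first components. It carries the factors used so far and skips any candidate that overlaps them, so dead branches are cut before they are extended.

```haskell
import Data.List (intersect, tails)

-- Choose k entries, left to right, whose factor lists are pairwise disjoint.
-- `used` accumulates the factors of everything picked so far on this branch.
choosePPruned :: Eq a => Int -> [([a], b)] -> [[([a], b)]]
choosePPruned k = go k []
  where
    go 0 _ _ = [[]]
    go n used xs =
      [ x : rest
      | x@(factors, _) : later <- tails xs   -- pick x, continue only with later entries
      , null (factors `intersect` used)      -- prune: x must not overlap earlier picks
      , rest <- go (n - 1) (factors ++ used) later
      ]
```

On the input from the question, choosePPruned 3 [([2],2),([2,3],1),([3],1),([5],1),([7],1)] produces the same five subsets as choosePSpec, in the same order. A QuickCheck property in the style of propSlowMatchesSpec would be the natural way to compare it against the spec more broadly.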

OCaml insert an element in list

What is the standard way of inserting an element at a specific position in a list in OCaml? Only recursion is allowed. No assignment operation is permitted.
My goal is to compress a graph in OCaml by removing vertices with in_degree=out_degree=1. For this reason I need to remove the adjacent edges and replace them with a single edge. Now the edges are in a list [(6,7);(1,2);(2,3);(5,4)]. So I need to remove those edges from the list and add a single edge.
So the above list will now look like [(6,7);(1,3);(5,4)]. Here we see that (1,2);(2,3) is removed and (1,3) is inserted in the second position. I have devised an algorithm for this. But to do this I need to know how I can remove the edges (1,2);(2,3) from positions 2 and 3 and insert (1,3) in position 2 without any explicit variable and in a recursive manner.
OCaml lists are immutable, so there's no such thing as removing and inserting elements in list operations.
What you can do is create a new list by reusing a certain part of the old list. For example, to create a list (1, 3)::xs' from (1, 2)::(2, 3)::xs' you actually reuse xs' and make the new list using the cons constructor.
And pattern matching is very handy to use:
let rec transform xs =
  match xs with
  | [] | [_] -> xs
  | (x, y1)::(y2, z)::xs' when y1 = y2 -> (x, z)::transform xs'
  | (x, y1)::(y2, z)::xs' -> (x, y1)::transform ((y2, z)::xs')
You can do something like this:
let rec compress l = match l with
  | [] -> []
  | x :: [] -> [x]
  | x1 :: x2 :: xs ->
    if snd x1 = fst x2 then
      (fst x1, snd x2) :: compress xs
    else x1 :: compress (x2 :: xs)
You are using the wrong data structure to store your edges, and your question doesn't indicate that you can't choose a different one. As other posters already said: lists are immutable, so repeated deletion of elements deep within them is a relatively costly (O(n)) operation.
I also don't understand why you have to reinsert the new edge at position 2. A graph is defined by G=(V,E), where V and E are sets of vertices and edges. Their order therefore doesn't matter. This definition of graphs also already tells you a better data structure for your edges: sets.
In OCaml, sets are represented by balanced binary trees, so the complexity of insertion and deletion of members is O(log n). For deletion this is definitely better than for lists (O(n)); on the other hand, it is more costly to add members to a set than it is to prepend elements to a list using the cons operation.
An alternative data structure would be a hashtable, where insertion and deletion can be done in O(1) time on average. Let the keys in the hashtable be your edges, and since you don't use the values, just use a constant like unit or 0.
