Is there an F# equivalent of Enumerable.DefaultIfEmpty? - linq

After searching quite a bit, I couldn't find an F# equivalent of Enumerable.DefaultIfEmpty.
Does something similar exist in F# (perhaps in a different, more idiomatic form)?

To preserve the laziness of the sequence, we could work with the enumerator's state.
let DefaultIfEmpty (l:'t seq) (d:'t) =
    seq{
        use en = l.GetEnumerator()
        if en.MoveNext() then
            yield en.Current
            while en.MoveNext() do
                yield en.Current
        else
            yield d }
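As a quick check of the laziness claim, here is a small sketch that feeds the function an infinite sequence and an empty one. The names naturals, firstThree and fallback are made up for illustration; the function body is the one from this answer, repeated so the snippet runs on its own.

```fsharp
// Enumerator-based definition from the answer, repeated so the snippet is self-contained.
let DefaultIfEmpty (l: 't seq) (d: 't) =
    seq {
        use en = l.GetEnumerator()
        if en.MoveNext() then
            yield en.Current
            while en.MoveNext() do
                yield en.Current
        else
            yield d
    }

// An infinite source: only the elements actually requested are produced.
let naturals = Seq.initInfinite id
let firstThree = DefaultIfEmpty naturals (-1) |> Seq.take 3 |> List.ofSeq

// An empty source yields just the default value.
let fallback = DefaultIfEmpty Seq.empty 42 |> List.ofSeq

printfn "%A %A" firstThree fallback
```

Because the source is only touched inside the seq expression, nothing is enumerated until the result itself is consumed.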

The Seq module functions take and return IEnumerable<_> values, and so does Enumerable.DefaultIfEmpty. How about just wrapping it in a function that composes with the pipeline operator?
let inline DefaultIfEmpty d l = System.Linq.Enumerable.DefaultIfEmpty(l, d)
This also preserves laziness.
example:
Seq.empty |> DefaultIfEmpty 0
Update
I've made an open source library inlining many extension and static methods, including Enumerable.defaultIfEmpty -- ComposableExtesions

There are a few options:
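Use Enumerable.DefaultIfEmpty directly, which might be non-idiomatic but will work.
Write your own, like so:
let DefaultIfEmpty (l:'t seq) (d:'t) =
    match Seq.length l with |0 -> seq [d] |_ -> l
If you are worried about infinite sequences (Seq.length enumerates the whole sequence and never returns on an infinite one), test with Seq.isEmpty instead:
let DefaultIfEmpty (l:'t seq) (d:'t) =
    match Seq.isEmpty l with |true -> seq [d] |false -> l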
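One thing worth knowing about the Seq.isEmpty variant is that the emptiness check starts its own enumeration of the source, so a sequence with side effects or an expensive producer is started twice. The sketch below illustrates this with a hypothetical counter (starts) and a hypothetical helper name (defaultIfEmptyViaIsEmpty); neither comes from the answer.

```fsharp
// Same idea as the Seq.isEmpty version in the answer, under an illustrative name.
let defaultIfEmptyViaIsEmpty (l: 't seq) (d: 't) =
    if Seq.isEmpty l then Seq.singleton d else l

// Hypothetical source that counts how many times an enumeration is started.
let starts = ref 0
let source =
    seq {
        starts.Value <- starts.Value + 1
        yield 1
        yield 2
    }

// Seq.isEmpty starts one enumeration, List.ofSeq starts a second one.
let result = defaultIfEmptyViaIsEmpty source 0 |> List.ofSeq

printfn "%A, enumerations started: %d" result starts.Value
```

If double enumeration is a concern, the enumerator-based version or Enumerable.DefaultIfEmpty avoids it.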

Related

Equivalent of Ruby's Enumerable#each_slice method in FSharp

I'm currently writing a bit of F#. I've created a method that is the equivalent of Ruby's Enumerable#each_slice method and was wondering if somebody has a better (i.e. more elegant, more concise, more readable) solution.
Here it is:
let rec slicesBySize size list =
    match list with
    | [] -> [] // case needed for type inference
    | list when list.Length < size -> [list]
    | _ ->
        let first = list |> Seq.take size |> List.ofSeq
        let rest = list |> Seq.skip size |> List.ofSeq
        [first] @ slicesBySize size rest
Thanks for any and all feedback/help.
You're looking for List.chunkBySize, which was added in F# 4.0. There are also Seq and Array variants.
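A brief sketch of the three variants mentioned, with made-up input values; the binding names are only for illustration.

```fsharp
// List variant: the last chunk holds whatever is left over.
let slices = List.chunkBySize 3 [1 .. 8]

// Seq variant: lazily produces arrays of up to the given size.
let seqSlices = Seq.chunkBySize 2 (seq { 1 .. 5 }) |> List.ofSeq

// Array variant: returns an array of arrays.
let arraySlices = Array.chunkBySize 2 [| 1 .. 5 |]

printfn "%A" slices
printfn "%A" seqSlices
printfn "%A" arraySlices
```

Like the hand-written slicesBySize, List.chunkBySize returns an empty list for empty input.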

Point-free: confused about where to put parenthesis

let list_to_string = (String.concat "") (List.map (String.make 1));;
This is wrong, but how do I make it understand that the argument is still to be supplied? The argument is expected to be of type char list, i.e. the first function that needs to be applied to it is (List.map (String.make 1)), and the result is then passed to String.concat "". I think I've tried all combinations of parentheses I could think of... no joy so far.
Help?
I also figured I could do it like this:
let ($) f g x = f (g x);;
let list_to_string = (String.concat "") $ (List.map (String.make 1));;
But just wanted to make sure there isn't a better way.
The real (and always perplexing) problem is that OCaml doesn't have a built-in function composition operator. So it's not so good out of the box for pointfree coding. If you really want to get fancy with it, you also need flip, which inverts the order of the arguments of a two-argument function.
let flip f a b = f b a
At any rate, I don't see any problem with your solution once you've defined function composition as $. You can leave out some of the parentheses:
# let lts = String.concat "" $ List.map (String.make 1);;
val lts : char list -> string = <fun>
As to efficiency, I assume this is more of a puzzle than a practical bit of code. Otherwise you should use the functions that Edwin suggests.
I don't think that partial application helps here, just write out the function parameter:
let list_to_string x = String.concat "" (List.map (String.make 1) x)
This isn't very efficient though, it would be better to create just one string and fill it with characters.
If you use Batteries, see BatString.of_list, and if you use Core, see String.of_char_list.
You can definitely just define a function composition operator in OCaml. It just can't be dot because that's already an operator in OCaml.
(* I just made up this symbol *)
let (^.^) f g x = f (g x)
let list_to_string = String.concat "" ^.^ List.map (String.make 1);;

Why is there a let in OCaml's List.map?

In OCaml 3.12.1, List.map is written as follows:
let rec map f = function
    [] -> []
  | a::l -> let r = f a in r :: map f l
I'd expect that last line to be written as | a::l -> f a :: map f l, but instead, there is a seemingly useless let binding. Why?
I believe it is there to guarantee an order of function application for the map. The order of evaluation of simple expressions in OCaml is unspecified, so without the let the order of applications of f to the elements of the list would be unspecified. Since OCaml is not a pure language, you really would like the order to be specified (f is called on the head of the list first, and so on recursively).

F# equivalent of LINQ Single

OK, so for most LINQ operations there is an F# equivalent.
(Generally in the Seq module, since seq = IEnumerable.)
I can't find the equivalent of IEnumerable.Single. I prefer Single over First (which is Seq.find) because it is more defensive: it asserts for me that the state is what I expect.
So I see a couple of solutions (other than using Seq.find).
(These could be written as extension methods)
The type signature for this function, which I'm calling only, is
('a->bool) -> seq<'a> -> 'a
let only = fun predicate src -> System.Linq.Enumerable.Single<'a>(src, predicate)
let only2 = Seq.filter >> Seq.exactlyOne
only2 is preferred, however it won't compile (any clues on that?).
In F# 2.0, this solution works without enumerating the whole sequence (close to your second approach):
module Seq =
    let exactlyOne seq =
        match seq |> Seq.truncate 2 with
        | s when Seq.length s = 1 -> s |> Seq.head |> Some
        | _ -> None
    let single predicate =
        Seq.filter predicate >> exactlyOne
I chose to return an option type, since raising exceptions is quite unusual in F# higher-order functions.
EDIT:
In F# 3.0, as @Oxinabox mentioned in his comment, Seq.exactlyOne exists in the Seq module.
What about
let Single source f =
    let worked = ref false
    let newf = fun a ->
        match f a with
        |true ->
            if !worked = true then failwith "not single"
            worked := true
            Some(a)
        |false -> None
    let r = source |> Seq.choose newf
    Seq.nth 0 r
Very unidiomatic but probably close to optimal
EDIT:
Solution with exactlyOne
let only2 f s= (Seq.filter f s) |> exactlyOne
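On why the point-free only2 from the question does not compile: Seq.filter takes two arguments, so composing it directly with >> hands Seq.exactlyOne a function (the result of applying Seq.filter to a predicate) rather than a sequence. Partially applying the predicate first fixes that. The sketch below assumes a recent FSharp.Core that provides Seq.tryExactlyOne for the non-throwing variant; the names only and tryOnly are illustrative.

```fsharp
// Apply the predicate first, then compose the remaining seq -> seq function.
let only predicate = Seq.filter predicate >> Seq.exactlyOne

// Non-throwing variant (assumes Seq.tryExactlyOne is available in your FSharp.Core).
let tryOnly predicate = Seq.filter predicate >> Seq.tryExactlyOne

// Exactly one element is greater than 3.
let found = [1; 2; 3; 4] |> only (fun x -> x > 3)

// Several elements are greater than 1, so there is no single match.
let ambiguous = [1; 2; 3; 4] |> tryOnly (fun x -> x > 1)

printfn "%d %A" found ambiguous
```

Seq.exactlyOne raises an exception when the filtered sequence is empty or has more than one element, which matches the defensive behaviour of Enumerable.Single.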

Why is F#'s Seq.sortBy much slower than LINQ's IEnumerable<T>.OrderBy extension method?

I've recently written a piece of code to read some data from a file, store it in a tuple, and sort all the collected data by the first element of the tuple. After some tests I've noticed that using Seq.sortBy (and Array.sortBy) is considerably slower than using IEnumerable.OrderBy.
Below are two snippets of code which should show the behaviour I'm talking about:
(filename
 |> File.ReadAllLines
 |> Array.Parallel.map(fun ln -> let arr = ln.Split([|' '|], StringSplitOptions.RemoveEmptyEntries)
                                           |> Array.map(double)
                                           |> Array.sort in arr.[0], arr.[1])
).OrderBy(new Func<_,_>(fun (a,b) -> a))
and
filename
|> File.ReadAllLines
|> Array.Parallel.map(fun ln -> let arr = ln.Split([|' '|], StringSplitOptions.RemoveEmptyEntries) |> Array.map(double) |> Array.sort in arr.[0], arr.[1])
|> Seq.sortBy(fun (a,_) -> a)
On a file containing 100000 lines made of two doubles, on my computer the latter version takes over twice as long as the first one (no improvements are obtained if using Array.sortBy).
Ideas?
The F# implementation uses a structural comparison of the resulting key.
let sortBy keyf seq =
    let comparer = ComparisonIdentity.Structural
    mkDelayedSeq (fun () ->
        (seq
         |> to_list
         |> List.sortWith (fun x y -> comparer.Compare(keyf x,keyf y))
         |> to_array) :> seq<_>)
(also sort)
let sort seq =
    mkDelayedSeq (fun () ->
        (seq
         |> to_list
         |> List.sortWith Operators.compare
         |> to_array) :> seq<_>)
both Operators.compare and the ComparisonIdentity.Structural.Compare become (eventually)
let inline GenericComparisonFast<'T> (x:'T) (y:'T) : int =
    GenericComparisonIntrinsic x y
    // lots of other types elided
    when 'T : float = if (# "clt" x y : bool #)
                      then (-1)
                      else (# "cgt" x y : int #)
but the route to this for the Operator is entirely inline, thus the JIT compiler will end up inserting a direct double comparison instruction with no additional method invocation overhead except for the (required in both cases anyway) delegate invocation.
The sortBy uses a comparer so will go through an additional virtual method call but is basically about the same.
By comparison, the OrderBy function must also go through virtual method calls for the comparison (using Comparer<T>.Default), but the significant difference is that it sorts in place and uses the buffer created for this as the result. If you take a look at sortBy, you will see that it sorts the list (not in place; it uses the StableSortImplementation, which appears to be a merge sort) and then creates a copy of it as a new array. This additional copy (given the size of your input data) is likely the principal cause of the slowdown, though the differing sort implementations may also have an effect.
That said this is all guessing. If this area is a concern for you in performance terms then you should simply profile to find out what is taking the time.
If you wish to see what effect the sorting/copying change would have try this alternate:
// these are taken from the F# source so as to be consistent
// beware doing this, the compiler may know about such methods
open System.Collections.Generic
let mkSeq f =
    { new IEnumerable<'b> with
        member x.GetEnumerator() = f()
      interface System.Collections.IEnumerable with
        member x.GetEnumerator() = (f() :> System.Collections.IEnumerator) }
let mkDelayedSeq (f: unit -> IEnumerable<'T>) =
    mkSeq (fun () -> f().GetEnumerator())
// the function
let sortByFaster keyf seq =
    let comparer = ComparisonIdentity.Structural
    mkDelayedSeq (fun () ->
        // copy the input once, then sort that buffer in place
        let buffer = Seq.toArray seq
        // sortInPlaceWith takes a comparison function (sortInPlaceBy expects a key projection)
        Array.sortInPlaceWith (fun x y -> comparer.Compare(keyf x,keyf y)) buffer
        buffer :> seq<_>)
I get some reasonable percentage speedups within the repl with very large (> million) input sequences but nothing like an order of magnitude. Your mileage, as always, may vary.
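For anyone who wants to measure this themselves, here is a minimal timing sketch. It is only a rough illustration, not a rigorous benchmark: the helper time, the random test data and the fixed seed are assumptions made for the example, and a single run says little without warm-up and repetition.

```fsharp
open System
open System.Diagnostics

// Hypothetical helper: runs f once and returns its result with the elapsed milliseconds.
let time (f: unit -> 'a) =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    result, sw.ElapsedMilliseconds

// Made-up input: 100000 pairs of doubles, similar in shape to the question's data.
let rng = Random(42)
let data = Array.init 100000 (fun _ -> rng.NextDouble(), rng.NextDouble())

// Seq.sortBy is lazy, so force it with Seq.toArray inside the timed function.
let sortedSeq, msSeq = time (fun () -> data |> Seq.sortBy fst |> Seq.toArray)

// Sort a copy in place so the input stays untouched for other measurements.
let sortedInPlace, msInPlace =
    time (fun () ->
        let buffer = Array.copy data
        Array.sortInPlaceBy fst buffer
        buffer)

printfn "Seq.sortBy: %d ms, Array.sortInPlaceBy: %d ms" msSeq msInPlace
```

The absolute numbers depend heavily on the runtime, the FSharp.Core version and the machine, so treat them as indicative only.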
A factor-of-two difference is not much when sorts are O(n log n).
Small differences in data structures (e.g. optimising for the input being an ICollection<T>) could make this scale of difference.
And F# is currently in beta (with less focus on optimisation than on getting the language and libraries right), plus the generality of F# functions (supporting partial application etc.) could lead to a slight slowdown in calling speed: more than enough to account for the difference.
