Using the F# pipe symbol with an object constructor - syntax

I'm trying to figure out the correct syntax to use the pipe operator |> into the creation of an object. Currently I'm using a static member to create the object and just piping to that. Here is the simplified version.
type Shape =
    val points : Vector[]
    new (points) =
        { points = points; }
    static member create(points) =
        Shape(points)
    static member concat(shapes : Shape list) =
        shapes
        |> List.map (fun shape -> shape.points)
        |> Array.concat
        |> Shape.create
What I want to do ...
    static member concat(shapes : Shape list) =
        shapes
        |> List.map (fun shape -> shape.points)
        |> Array.concat
        |> (new Shape)
Is something like this possible? I don't want to duplicate code by repeating my constructor with the static member create.
Update
Constructors are first-class functions as of F# 4.0, so in F# 4.0 and later the correct syntax is:
    static member concat(shapes : Shape list) =
        shapes
        |> List.map (fun shape -> shape.points)
        |> Array.concat
        |> Shape

There's always
(fun args -> new Shape(args))

Apparently, object constructors aren't composable. Discriminated union constructors don't seem to have this problem:
> 1 + 1 |> Some;;
val it : int option = Some 2
If you want to use the pipeline, Brian's answer is probably best. In this case, I'd consider just wrapping the entire expression with Shape( ).
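The options discussed above can be put side by side in a small sketch. It assumes F# 4.0 or later and substitutes int[] for the question's Vector[] so that it runs on its own. The member names Concat and ConcatWrapped are illustrative only, not part of the original code.

```fsharp
// Simplified stand-in for the question's Shape type (int[] instead of Vector[]).
type Shape(points: int[]) =
    member this.Points = points

    // F# 4.0+: the constructor is used as a first-class function at the end of the pipeline.
    static member Concat(shapes: Shape list) =
        shapes
        |> List.map (fun shape -> shape.Points)
        |> Array.concat
        |> Shape

    // Alternative that also works before F# 4.0: wrap the whole expression in the constructor call.
    static member ConcatWrapped(shapes: Shape list) =
        Shape(shapes |> List.map (fun shape -> shape.Points) |> Array.concat)

let combined = Shape.Concat [ Shape([| 1; 2 |]); Shape([| 3 |]) ]
let wrapped = Shape.ConcatWrapped [ Shape([| 1; 2 |]); Shape([| 3 |]) ]
printfn "%A" combined.Points   // [|1; 2; 3|]
```

Both members build the same object, so the choice is mostly a matter of taste.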

Related

F# int list versus unit list

open System
let rec quick (cast: int list) mmm =
    match mmm with
    | [] -> []
    | first::rest ->
        let small = (rest |> List.filter (fun x -> x < first))
        let large = (rest |> List.filter (fun x -> x >= first))
        quick small |> ignore
        quick large |> ignore
        //[small @ [first] @ large]
        List.concat [small; [first]; large]
[<EntryPoint>]
let main argv =
    printfn "%A" (quick [3;5;6;7;8;7;5;4;3;4;5;6]);;
    0
Trying to implement a simple quicksort function in F#.
I'm relatively new to the language, but by all accounts, from what I've read and my understanding of the syntax, this should produce an integer list. Instead it produces the puzzling "unit list".
Why does this give a unit list and not an int list?
It errors out at "%A" saying the types do not match.
As given in the OP, quick is a function that takes two parameters: cast and mmm. The type of the function is int list -> int list -> int list.
The function call quick [3;5;6;7;8;7;5;4;3;4;5;6], however, only supplies one argument. Since F# functions are curried, the return value is a new function:
> quick [3;5;6;7;8;7;5;4;3;4;5;6];;
val it : (int list -> int list) = <fun:it@3-4>
This function (in my F# Interactive window called it@3-4) has the type int list -> int list; that is, it's a function that 'still waits' for an int list argument before it runs.
When you print it with the %A format specifier, it prints <fun:it@4-5> to the console. The return value of printfn is () (unit):
> printfn "%A" (quick [3;5;6;7;8;7;5;4;3;4;5;6]);;
<fun:it@4-5>
val it : unit = ()
You probably only want the function to take a single list parameter. Additionally, the steps you ignore are having no effect, so you might consider another way to recursively call quick.
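As a minimal sketch of that suggestion, here is one way to write the function with a single list parameter, using the results of the recursive calls instead of discarding them. The parameter name items is chosen here for illustration.

```fsharp
// Quicksort over a single int list; the first element is the pivot.
let rec quick (items: int list) : int list =
    match items with
    | [] -> []
    | first :: rest ->
        let small = rest |> List.filter (fun x -> x < first)
        let large = rest |> List.filter (fun x -> x >= first)
        // Sort each partition recursively and join the pieces around the pivot.
        quick small @ [first] @ quick large

let sorted = quick [3; 5; 6; 7; 8; 7; 5; 4; 3; 4; 5; 6]
printfn "%A" sorted   // [3; 3; 4; 4; 5; 5; 5; 6; 6; 7; 7; 8]
```

Because quick now receives all of its arguments, the expression passed to printfn is an int list rather than a function.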

Seq.take won't return elements

When I run the following code:
getTheData() |> Seq.take 3
it does not return the elements, instead it outputs this:
val it : seq<Collections.Generic.KeyValuePair<ID,Data>>
I am using Visual Studio 2017 and F# Interactive
What is wrong, should it not output the first 3 items?
The getTheData function:
let getTheData() =
    (@"C:\Users\data.xlsx")
    |> (ParseExcel >> datap)
    |> Seq.distinct
    |> Seq.map(fun b -> b.ID, b)
    |> Map.ofSeq
Seq.take is not considered a terminal operation on a sequence in F#. As mentioned in the comments, sequences are lazily evaluated, and only operations that are considered "terminal" will cause a sequence to be iterated. Terminal operations include Seq.iter (if you want to perform an action on each element) and Seq.toList (if you want a materialized list of each element), as well as others like Seq.exactlyOne.
In F# interactive, you can probably just evaluate it to see the first few values. In the following example mirroring yours, evaluating it at the end will display the 3 values taken:
open System
let getTheData() =
    seq {
        for n in 0 .. 1000 -> Guid.NewGuid(), n
    } |> Map.ofSeq
getTheData()
|> Seq.take 3;;
it;;
val it : seq<Collections.Generic.KeyValuePair<Guid,int>> =
seq
[[001830fe-9ce3-4649-8609-571e4aedb4c7, 791]
{Key = 001830fe-9ce3-4649-8609-571e4aedb4c7;
Value = 791;};
[001bf0a9-5981-4bc0-bcaf-046af7f4866a, 383]
{Key = 001bf0a9-5981-4bc0-bcaf-046af7f4866a;
Value = 383;};
[004b44a7-85d2-4ce5-91bf-49bcc44f03ba, 91]
{Key = 004b44a7-85d2-4ce5-91bf-49bcc44f03ba;
Value = 91;}]
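If you would rather not rely on how F# Interactive prints a sequence, a terminal operation such as Seq.toList materializes the items explicitly. The sketch below uses a small hand-written map as a stand-in for the result of getTheData(); the keys and values are made up for illustration.

```fsharp
// Stand-in data: a Map enumerates as KeyValuePair items in key order.
let data = Map.ofList [ (1, "a"); (2, "b"); (3, "c"); (4, "d") ]

let firstThree =
    data
    |> Seq.take 3                              // still lazy: nothing is enumerated yet
    |> Seq.map (fun kv -> kv.Key, kv.Value)    // turn KeyValuePair into a plain tuple
    |> Seq.toList                              // terminal operation: forces evaluation

printfn "%A" firstThree   // [(1, "a"); (2, "b"); (3, "c")]
```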

Linq like function GroupMultipleValBy

In the process of learning the awesomeness that is F# + LINQ, I came across a problem I cannot solve nicely using functional, OOP, or LINQ-like syntax. Would anyone be willing to help?
Lets say my input is the following sequence:
let db = seq [ ("bob", 90, ['x';'y'])
               ("bob", 70, ['z'])
               ("frank", 20, ['b'])
               ("charlie", 10, ['c']) ]
Rows could read for example "Student bob has enrolled in x,y in semester 90"
What I need is this instead:
[ ("bob", [90; 70], ['x'; 'y'; 'z'])
  ("frank", [20], ['b'])
  ("charlie", [10], ['c']) ]
This would read instead "Bob has finished semesters 90,70 and taken x,y,z".
A LINQ/relational approach usually gives the most readable solutions to such problems. But the best I can come up with is:
type Student = string
type Semester = int
type Class = char
let restructure (inp:seq<Student * Semester * Class list>) = query {
    for (student, semester, classes) in inp do
    groupValBy (semester,classes) student into data
    yield (data.Key, Seq.map fst data, Seq.collect snd data)
}
Which is neither readable, nor fast, nor pretty, nor idiomatic, and due to intricacies of F# requires that I write the input type signature...
Is there a better way, such as some GroupMultipleValBy function?
Thank you very much!
If you can stick to "classic" F# code, you can rewrite it in a more readable way (especially by using locals to make the code even more readable).
Well, it seems Tomas beat me to this. We've gone roughly the same way, with some "quirks" in the middle:
let restructure inp =
    // could have been defined at a more global scope as helpers
    let fst3 (x, _, _) = x
    let flip f y x = f x y
    let folder cont (student, semester, classes) (_, semesters, allClasses) =
        cont (student, semester :: semesters, classes @ allClasses)
    let initialState = "", [], []
    inp
    |> Seq.groupBy fst3
    |> Seq.map (snd >> flip (Seq.fold folder id) initialState)
The hard part to understand is the folder: to keep the semesters in the wanted order, we either have to add an extra step that reverses that part or, as done here, use a continuation.
That (and the use of flip to keep it point-free), together with the use of @, makes me think Tomas's code is "better" (although his answer makes semesters and classes seq instead of list).
Addendum
Here's Tomas's code written in a way I find more readable (but that's a matter of taste) and maybe more agnostic about what's being manipulated, although it's longer
(that takes nothing away from his answer, which is great):
let restructure inp =
    // could have been defined at a more global scope as helpers
    let fst3 (x, _, _) = x
    let snd3 (_, y, _) = y
    let trd3 (_, _, z) = z
    let mapping (key, values) =
        key,
        // replace with commented part to have lists instead of seqs
        values |> Seq.map snd3, //[ for value in values -> snd3 value ],
        values |> Seq.collect trd3 //[ for value in values do yield! trd3 value ]
    inp
    |> Seq.groupBy fst3
    |> Seq.map mapping
If you do not insist on using the query syntax (which is needed if you are working with databases, but is just one of the options when working with in-memory data), then I would probably use simple Seq.groupBy function:
db
|> Seq.groupBy (fun (name, _, _) -> name)
|> Seq.map (fun (name, group) ->
    name,
    group |> Seq.map (fun (_, sem, _) -> sem),
    group |> Seq.collect (fun (_, _, courses) -> courses) )
Here, we are saying that we want to group records by student name and return a triple with:
The student name, which was used as the grouping key
The semesters of all the records in the group
All the courses the student attended, collected from those records
This is not shorter than your version, but I think a combination of groupBy and map is a fairly common pattern that is quite easy to understand. That said, I'm quite curious to see other answers! I can imagine there would be a nicer way of doing this...
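For completeness, here is a sketch of the same groupBy-and-map pattern that produces lists rather than seqs, so the result matches the shape requested in the question exactly. The helper name toTriple and the explicit type annotations are choices made for this example.

```fsharp
let db =
    seq [ ("bob", 90, ['x'; 'y'])
          ("bob", 70, ['z'])
          ("frank", 20, ['b'])
          ("charlie", 10, ['c']) ]

let restructure (rows: seq<string * int * char list>) =
    // Turn one (key, group) pair into a (name, semesters, courses) triple of lists.
    let toTriple (name: string, group: seq<string * int * char list>) =
        let semesters = [ for (_, sem, _) in group -> sem ]
        let courses = [ for (_, _, cs) in group do yield! cs ]
        name, semesters, courses
    rows
    |> Seq.groupBy (fun (name, _, _) -> name)   // keys appear in order of first occurrence
    |> Seq.map toTriple
    |> Seq.toList

let result = restructure db
printfn "%A" result
```

Seq.groupBy keeps the elements of each group in their original order, which is why bob's semesters come out as [90; 70] without a reversing step.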

Terrific performance difference between almost equal methods

While working on a project I accidentally noticed that the same method with only one additional (unused) argument runs as much as ten times faster than the other one, with optimizations enabled.
type Stream () =
    static member private write (x, o, a : byte[]) = (for i = 0 to 3 do a.[o + i] <- byte((x >>> 24 - i * 8) % 256)); 4
    static member private format f x l = Array.zeroCreate l |> fun a -> (f(x, 0, a) |> ignore; a)
    static member private format1 f x l o = Array.zeroCreate l |> fun a -> (f(x, 0, a) |> ignore; a)
    static member Format (value : int) = Stream.format (fun (x: int, i, a) -> Stream.write(x, i, a)) value 4
    static member Format1 (value : int) = Stream.format1 (fun (x: int, i, a) -> Stream.write(x, i, a)) value 4
When tested, Stream.Format1 runs much faster than Stream.Format, although the only difference between the private members Stream.format and Stream.format1 is just the o argument, which moreover is unused by the method itself.
How does the compiler treat in so different ways two almost identical methods?
EDIT: thanks for the explanation and sorry for the ignorance.
The problem is that when you call Format1 with just a single argument, it only returns a function. It doesn't do the actual formatting yet. This means that if you compare the performance of:
Stream.Format 42
Stream.Format1 42
... then you're actually comparing the performance of actual formatting (that creates the array and writes something in it) in the first case and the performance of code that simply returns a function value without doing anything.
If you're not using the o parameter of format1 for anything, then you can just pass in some dummy value, to actually evaluate the function and get the result. Then you should get similar performance:
Stream.Format 42
Stream.Format1 42 ()
Format actually invokes Array.zeroCreate l |> fun a -> (f(x, 0, a) |> ignore; a).
Format1 returns a function that when passed an object invokes Array.zeroCreate l |> fun a -> (f(x, 0, a) |> ignore; a).
I.e., one does actual work, the other is merely a partial function application; the latter is obviously quicker.
If you're not familiar with partial function application, there is a section in the F# docs titled 'Partial Application of Arguments' that's worth reading over: Functions (F#)
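The effect is easy to see in a stripped-down sketch. The two functions below are hypothetical simplifications of format and format1 from the question (they only allocate the array), with a printfn added so you can tell when the body actually runs.

```fsharp
// Takes two arguments; supplying both runs the body immediately.
let format (x: int) (l: int) =
    printfn "format ran"
    Array.zeroCreate<byte> l

// Same body, but with one extra (unused) argument.
let format1 (x: int) (l: int) (o: unit) =
    printfn "format1 ran"
    Array.zeroCreate<byte> l

let a = format 42 4      // prints "format ran"; a : byte[]
let f = format1 42 4     // prints nothing; f : unit -> byte[]
let b = f ()             // prints "format1 ran"; b : byte[]
```

A benchmark that stops at format1 42 4 therefore measures only the cost of creating a closure, not the cost of allocating and filling the array.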

Why is F#'s Seq.sortBy much slower than LINQ's IEnumerable<T>.OrderBy extension method?

I've recently written a piece of code to read some data from a file, store it in a tuple and sort all the collected data by the first element of the tuple. After some tests I've noticed that using Seq.sortBy (and Array.sortBy) is much slower than using IEnumerable.OrderBy.
Below are two snippets of code which should show the behaviour I'm talking about:
(filename
 |> File.ReadAllLines
 |> Array.Parallel.map(fun ln -> let arr = ln.Split([|' '|], StringSplitOptions.RemoveEmptyEntries)
                                           |> Array.map(double)
                                           |> Array.sort in arr.[0], arr.[1])
).OrderBy(new Func<_,_>(fun (a,b) -> a))
and
filename
|> File.ReadAllLines
|> Array.Parallel.map(fun ln -> let arr = ln.Split([|' '|], StringSplitOptions.RemoveEmptyEntries) |> Array.map(double) |> Array.sort in arr.[0], arr.[1])
|> Seq.sortBy(fun (a,_) -> a)
On a file containing 100000 lines made of two doubles, on my computer the latter version takes over twice as long as the first one (no improvements are obtained if using Array.sortBy).
Ideas?
The F# implementation uses a structural comparison of the resulting key:
let sortBy keyf seq =
    let comparer = ComparisonIdentity.Structural
    mkDelayedSeq (fun () ->
        (seq
         |> to_list
         |> List.sortWith (fun x y -> comparer.Compare(keyf x,keyf y))
         |> to_array) :> seq<_>)
(also sort)
let sort seq =
    mkDelayedSeq (fun () ->
        (seq
         |> to_list
         |> List.sortWith Operators.compare
         |> to_array) :> seq<_>)
Both Operators.compare and ComparisonIdentity.Structural.Compare eventually become:
let inline GenericComparisonFast<'T> (x:'T) (y:'T) : int =
    GenericComparisonIntrinsic x y
    // lots of other types elided
    when 'T : float = if (# "clt" x y : bool #)
                      then (-1)
                      else (# "cgt" x y : int #)
However, the route to this for Operators.compare is entirely inline, so the JIT compiler will end up inserting a direct double comparison instruction with no additional method invocation overhead except for the delegate invocation (which is required in both cases anyway).
sortBy uses a comparer, so it goes through an additional virtual method call, but is basically about the same.
By comparison, the OrderBy function must also go through virtual method calls for the comparison (using Comparer<T>.Default), but the significant difference is that it sorts in place and uses the buffer created for this as the result. If you take a look at sortBy, you will see that it sorts the list (not in place; it uses the StableSortImplementation, which appears to be a merge sort) and then creates a copy of it as a new array. This additional copy (given the size of your input data) is likely the principal cause of the slowdown, though the differing sort implementations may also have an effect.
That said this is all guessing. If this area is a concern for you in performance terms then you should simply profile to find out what is taking the time.
If you wish to see what effect the sorting/copying change would have try this alternate:
// these are taken from the F# source so as to be consistent
// beware doing this, the compiler may know about such methods
open System.Collections.Generic
let mkSeq f =
    { new IEnumerable<'b> with
          member x.GetEnumerator() = f()
      interface System.Collections.IEnumerable with
          member x.GetEnumerator() = (f() :> System.Collections.IEnumerator) }
let mkDelayedSeq (f: unit -> IEnumerable<'T>) =
    mkSeq (fun () -> f().GetEnumerator())
// the function
let sortByFaster keyf seq =
    let comparer = ComparisonIdentity.Structural
    mkDelayedSeq (fun () ->
        // copy the input into an array once, then sort that buffer in place
        let buffer = Seq.toArray seq
        // sortInPlaceWith takes a comparison function (sortInPlaceBy expects a key projection)
        Array.sortInPlaceWith (fun x y -> comparer.Compare(keyf x,keyf y)) buffer
        buffer :> seq<_>)
I get some reasonable percentage speedups within the repl with very large (> million) input sequences but nothing like an order of magnitude. Your mileage, as always, may vary.
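If you want to measure this yourself, a rough timing harness along the following lines may help. It is only a sketch: the helper time, the seeded random data and the choice of 100000 rows are assumptions made here to mimic the question, and a single Stopwatch run in the REPL is a coarse measurement, not a rigorous benchmark.

```fsharp
open System
open System.Diagnostics

// Synthetic stand-in for the parsed file: 100000 pairs of doubles.
let rng = Random(42)
let data = Array.init 100000 (fun _ -> rng.NextDouble(), rng.NextDouble())

// Runs f once, prints the elapsed time, and returns f's result.
let time label f =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    printfn "%s: %d ms" label sw.ElapsedMilliseconds
    result

let viaSeq =
    time "Seq.sortBy" (fun () -> data |> Seq.sortBy fst |> Seq.toArray)

let viaInPlace =
    time "Array.sortInPlaceBy" (fun () ->
        let buffer = Array.copy data   // copy so the input is left untouched
        Array.sortInPlaceBy fst buffer
        buffer)
```

Run each variant several times and discard the first run, since JIT compilation can dominate a single measurement.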
A factor-of-two difference is not much when sorting is O(n log n).
Small differences in data structures (e.g. optimising for the input being an ICollection<T>) could account for a difference of this scale.
Also, F# is currently in beta (with less focus on optimisation than on getting the language and libraries right), and the generality of F# functions (supporting partial application etc.) could lead to a slight slowdown in calling speed: more than enough to account for the difference.
