F# 2010 Seq.generate_using - visual-studio-2010

Is there an alternative/workaround for Seq.generate_using in Visual Studio 2010? The FSharp.PowerPack.dll is not available for 2010 AFAIK

(Sorry about the PowerPack still not being available for 2010 yet.)
I don't recall if this already is true of the CTP update, but in internal bits I get the warning:
This construct is deprecated. This
function will be removed in a future
release. If necessary, take a copy of
its implementation from the F#
PowerPack and copy it into your
application
so here is the code from the PowerPack:
#nowarn "9"
namespace Microsoft.FSharp.Compatibility
open System.Collections.Generic
module Seq =
let combine ie1 ie2 = Seq.zip ie1 ie2
let nonempty (ie : seq<'T>) = use e = ie.GetEnumerator() in e.MoveNext() //'
let generate openf compute closef =
seq { let r = openf()
try
let x = ref None
while (x := compute r; (!x).IsSome) do
yield (!x).Value
finally
closef r }
let generate_using (openf : unit -> ('b :> System.IDisposable)) compute = //'
generate openf compute (fun (s:'b) -> s.Dispose()) //'
let cons (x:'T) (s: seq<'T>) =
seq { yield x
yield! s }

FYI, the PowerPack binaries for .Net 4.0 Beta1 came online today:
http://www.microsoft.com/downloads/details.aspx?FamilyID=e475a670-9596-4958-bfa2-dc0ac29b4631&displaylang=en

Related

F# Array.Parallel.map does not provide parallel processing

I have to simulate a discrete environment in F#, to be called by Python, for a reinforcement learning problem. I had a function with primitive types (mainly float) to make the exchange of data smoother. Now I am in the position to run this function many times with different data, so to run it in parallel seems a good idea.
I have the following code:
type AscentStrategy = |Strategy of seq<float>
let simulateAscent env ascentLimiter initState (sequenceOfDepths:seq<float>) =
//let infinitSeqOfConstantValues = (fun _ -> constantDepth) |> Seq.initInfinite
sequenceOfDepths
|> Seq.scan ( fun ( nextState, rew, isTerminal, _ ) depth -> getNextEnvResponseAndBoundForNextAction(env, nextState , depth , ascentLimiter) ) ( initState, 0.0 , false, 0.0)
|> SeqExtension.takeWhileWithLast (fun (_ , _, isTerminalState, _) -> not isTerminalState)
|> Seq.toArray
and then
let simulateStrategy ({MaxPDCS = maxPDCS ; MaxSimTime = maximumSimulationTime ; PenaltyForExceedingRisk = penaltyForExceedingRisk ;
RewardForDelivering = rewardForDelivering ; PenaltyForExceedingTime = penaltyForExceedingTime ; IntegrationTime = integrationTime
ControlToIntegrationTimeRatio = controlToIntegrationTimeRatio; DescentRate = descentRate; MaximumDepth = maximumDepth ;
BottomTime = bottomTime ; LegDiscreteTime = legDiscreteTime } : SimulationParameters) (Strategy ascentStrategy : AscentStrategy) =
let env, initState , ascentLimiter , _ = getEnvInitStateAndAscentLimiter ( maxPDCS , maximumSimulationTime ,
penaltyForExceedingRisk , rewardForDelivering , penaltyForExceedingTime ,
integrationTime ,
controlToIntegrationTimeRatio,
descentRate ,
maximumDepth ,
bottomTime ,
legDiscreteTime )
ascentStrategy
|> simulateAscent env ascentLimiter initState
finally I call the function for testing:
let commonSimulationParameters = {MaxPDCS = 0.32 ; MaxSimTime = 2000.0 ; PenaltyForExceedingRisk = 1.0 ; RewardForDelivering = 10.0; PenaltyForExceedingTime = 0.5 ;
IntegrationTime = 0.1; ControlToIntegrationTimeRatio = 10; DescentRate = 60.0; MaximumDepth = 20.0 ; BottomTime = 10.0; LegDiscreteTime = 0.1}
printfn"insert number of elements"
let maxInputsString = Console.ReadLine()
let maxInputs = maxInputsString |> Double.Parse
let inputsStrategies = [|0.0 .. maxInputs|] |> Array.map (fun x -> Seq.initInfinite (fun _ -> x ) )
let testParallel = inputsStrategies
|> Array.Parallel.map (fun x -> (simulateStrategy commonSimulationParameters ( Strategy x )) )
I have compared this with Array.map and, while it is faster and uses 70% of the CPU on my laptop, still does not seem to use the whole processing power. I have run it on a machine with many more cores ( ~50) and it barely increases the CPU usage (it gets up to 3/4% of total usage with 50ish independent inputs). I think there must be a deadlock generated somewhere, but how can I detect and get rid of it?
Also, why does this happen? One of the advantages of functional programming, as I see it, is also to be able to parallelize easily.
PS: SeqExtension.takeWhileWithLast is a function I have found on SO, kindly provided by Tomas Petricek in one of his brilliant answers, if needed I can post it.
PPS: env is the environment, whose type is defined as:
type Environment<'S, 'A ,'I> = |Environment of (State<'S> -> Action<'A> -> EnvironmentOutput<'S ,'I>)
I have tried the same with Async.Parallel and ParallelSeq, reporting the same problem.
Would a message-based solution solve the problem>? I am looking into it, although I am not familiar at all, but would it be a good way of getting the code parallel, using MailboxProcessor?
Following my question,
I have tried also this great library for parallel code, based on streams of data. https://nessos.github.io/Streams/.
I have added the following code:
let nessosResult = inputsStrategies
|> ParStream.ofArray
|> ParStream.map simulateStrategy
|> ParStream.toArray
I have defined an ad hoc type for inputStrategy (basic the old tuple I had) so that simulateStrategy accepts only one input. Unfortunately the problem seems very well hidden somewhere. I attach a graph with CPU usage. Time spent on my machine for the different cases is: ~8.8 sec (sequential); ~6.2 sec (Array.Parallel.map); ~ 6.1 sec (Nessos.Streams)
I have found that server garbage collection is necessary to get the best parallel performance on .NET. Something like this in your app.config:
<configuration>
<runtime>
<gcServer enabled="true" />
</runtime>
</configuration>

Visual studio 2017 problem with namespaces in F#

I wrote this in HelperFunctions.fs:
namespace Tutorial1.HelperFunctions
module Factorials =
let rec fact n =
match n with
| 0 -> 1
| 1 -> 1
| _ -> n * fact (n - 1)
And then this in Tutorial.fsx:
#load "HelperFunctions.fs"
open Tutorial1.HelperFunctions
module start =
let x = Factorials.fact 5
printfn "%d" x
Code compiles and returns 120 as expected BUT: VS throws FS0039 error: Factorials and Tutorial1 namespace, type or module not defined... Tried many other combinations of open, module etc but then codes does not even compile. What is the problem I am not seeing here?
Okay, apparently the order of files in the vstudio matters, even if you include the file with #load. I had to shift the files upwards and it worked

F# Type Provider slows down intellisense in Visual Studio 2017

I have a very simple type provider; all types are erased, the provided type has 2000 int readonly properties Tag1..Tag2000
let ns = "MyNamespace"
let asm = Assembly.GetExecutingAssembly()
let private newProperty t name getter isStatic = ProvidedProperty(name, t, getter, isStatic = isStatic)
let private newStaticProperty t name getter = newProperty t name (fun _ -> getter) true
let private newInstanceProperty t name getter = newProperty t name (fun _ -> getter) false
let private addStaticProperty t name getter (``type``:ProvidedTypeDefinition) = ``type``.AddMember (newStaticProperty t name getter); ``type``
let private addInstanceProperty t name getter (``type``:ProvidedTypeDefinition) = ``type``.AddMember (newInstanceProperty t name getter); ``type``
[<TypeProvider>]
type TypeProvider(config : TypeProviderConfig) as this =
inherit TypeProviderForNamespaces(config)
let provider = ProvidedTypeDefinition(asm, ns, "Provider", Some typeof<obj>, hideObjectMethods = true)
let tags = ProvidedTypeDefinition(asm, ns, "Tags", Some typeof<obj>, hideObjectMethods = true)
do [1..2000] |> Seq.iter (fun i -> addInstanceProperty typeof<int> (sprintf "Tag%d" i) <## i ##> tags |> ignore)
do provider.DefineStaticParameters([ProvidedStaticParameter("Host", typeof<string>)], fun name args ->
let provided = ProvidedTypeDefinition(asm, ns, name, Some typeof<obj>, hideObjectMethods = true)
addStaticProperty tags "Tags" <## obj() ##> provided |> ignore
provided
)
do this.AddNamespace(ns, [provider; tags])
Then a test project with two modules in separate files:
module Common
open MyNamespace
type Provided = Provider<"">
let providedTags = Provided.Tags
type LocalTags() =
member this.Tag1 with get() : int = 1
member this.Tag2 with get() : int = 2
.
.
member this.Tag1999 with get() : int = 1999
member this.Tag2000 with get() : int = 2000
let localTags = LocalTags()
module Tests
open Common
open Xunit
[<Fact>]
let ProvidedTagsTest () =
Assert.Equal<int>(providedTags.Tag1001, 1001)
[<Fact>]
let LocalTagsTest () =
Assert.Equal<int>(localTags.Tag100, 100)
Everything works as expected (tests execution included). The problem I have is with the design time behavior inside Visual Studio, while I write code. I expect to have some overhead due to the type provider, but the slowness seems frankly excessive. The times reported below are in seconds and refer to the time measured from pushing the dot (.) key until the intellisense property list appears on the screen
providedTags. -> 15
localTags. -> 5
If I comment out or remove the first test code lines (so to eliminate any references to the provided stuff), then I get
localTags. -> immediate
If the number of properties is greater, the time seems to increase exponentially, not linearly, so that at 10000 it becomes minutes.
Questions are:
Is this normal or am I doing something wrong?
Are there guidelines to achieve a faster response?
If someone is curious about why I need so many properties, I am trying to supply an instrument to data analysts so that they can write F# scripts and get data out of an historian database with more than 10000 tags in its schema.
Issue has been fixed by Don Syme, see
https://github.com/fsprojects/FSharp.TypeProviders.SDK/issues/220
and
https://github.com/fsprojects/FSharp.TypeProviders.SDK/pull/229

How to do test-driven development?

I think everything is in the title but I am looking specifically for:
What is the "standard" unit test framework in OCaml?
How do I integrate executing tests in the build?
How to automatically execute tests upon every file change?
As a bonus, I would be interested in test coverage tooling...
It seems that the package ounit enjoys quite a large popularity, there are several other packages like kaputt or broken – I am the author of the latter.
I guess you are interested as the specific part of TDD where tests can be automated, here is how I do it on my own projects. You can find a few examples on GitHub such as Lemonade or Rashell that both have a test suite found in their respective testsuite folders.
Usually I work according to the according workflow:
I start to work simultaneously on tests and interface (.mli) files, this way I write a minimal program and do not only write a test case for the functions I want to implement but also have the opportunity to experiment with interfaces to be sure that I have an easy-to-use interface.
For instance, for the interface to the find(1) command found in Rashell_Posix I started by writing test cases:
open Broken
open Rashell_Broken
open Rashell_Posix
open Lwt.Infix
let spec base = [
(true, 0o700, [ base; "a"]);
(true, 0o750, [ base; "a"; "b"]);
(false, 0o600, [ base; "a"; "b"; "x"]);
(false, 0o640, [ base; "a"; "y" ]);
(true, 0o700, [ base; "c"]);
(false, 0o200, [ base; "c"; "z"]);
]
let find_fixture =
let filename = ref "" in
let cwd = Unix.getcwd () in
let changeto base =
filename := base;
Unix.chdir base;
Lwt.return base
in
let populate base =
Toolbox.populate (spec base)
in
make_fixture
(fun () ->
Lwt_main.run
(Rashell_Mktemp.mktemp ~directory:true ()
>>= changeto
>>= populate))
(fun () ->
Lwt_main.run
(Unix.chdir cwd;
rm ~force:true ~recursive:true [ !filename ]
|> Lwt_stream.junk_while (fun _ -> true)))
let assert_find id ?expected_failure ?workdir predicate lst =
assert_equal id ?expected_failure
~printer:(fun fft lst -> List.iter (fun x -> Format.fprintf fft " %S" x) lst)
(fun () -> Lwt_main.run(
find predicate [ "." ]
|> Lwt_stream.to_list
|> Lwt.map (List.filter ((<>) "."))
|> Lwt.map (List.sort Pervasives.compare)))
()
lst
The spec and find_fixture functions are used to create a file hierarchy with the given names and permissions, to exercise the find function. Then the assert_find function prepares a test-case comparing the results of a call to find(1) with the expected results:
let find_suite =
make_suite ~fixture:find_fixture "find" "Test suite for find(1)"
|& assert_find "regular" (Has_kind(S_REG)) [
"./a/b/x";
"./a/y";
"./c/z";
]
|& assert_find "directory" (Has_kind(S_DIR)) [
"./a";
"./a/b";
"./c"
]
|& assert_find "group_can_read" (Has_at_least_permission(0o040)) [
"./a/b";
"./a/y"
]
|& assert_find "exact_permission" (Has_exact_permission(0o640)) [
"./a/y";
]
Simultaneously I was writing on the interface file:
(** The type of file types. *)
type file_kind = Unix.file_kind =
| S_REG
| S_DIR
| S_CHR
| S_BLK
| S_LNK
| S_FIFO
| S_SOCK
(** File permissions. *)
type file_perm = Unix.file_perm
(** File status *)
type stats = Unix.stats = {
st_dev: int;
st_ino: int;
st_kind: file_kind;
st_perm: file_perm;
st_nlink: int;
st_uid: int;
st_gid: int;
st_rdev: int;
st_size: int;
st_atime: float;
st_mtime: float;
st_ctime: float;
}
type predicate =
| Prune
| Has_kind of file_kind
| Has_suffix of string
| Is_owned_by_user of int
| Is_owned_by_group of int
| Is_newer_than of string
| Has_exact_permission of int
| Has_at_least_permission of int
| Name of string (* Globbing pattern on basename *)
| And of predicate list
| Or of predicate list
| Not of predicate
val find :
?workdir:string ->
?env:string array ->
?follow:bool ->
?depthfirst:bool ->
?onefilesystem:bool ->
predicate -> string list -> string Lwt_stream.t
(** [find predicate pathlst] wrapper of the
{{:http://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html} find(1)}
command. *)
Once I was pleased with my test-cases and interfaces, I could try to compile them, even without an implementation. This is possible with bsdowl by just giving an interface file instead of an implementation file in the Makefile.
Here compilation probably uncovered a few type errors in my tests that I could fix.
When the test compiled against the interface, I could implement the function, starting with an alibi function:
let find _ =
failwith "Rashell_Posix.find: Not implemented"
With this implementation I was able to compile my library and my test-suite. Of-course at this point, the test just fails.
At that point, I just needed to implement the Rashell_Posix.find function and iterate the tests until they passed.
This is how I do test-driven development in OCaml when I use automated tests. Some persons see interacting with the REPL as a form of test-driven development, this is a technique that I also like to use, it is rather straightforward to setup and use. The only setup step to use this latter form of test-driven-development in Rashell was to write an .ocamlinit file for the toplevel loading all the required libraries. This file looks like:
#use "topfind";;
#require "broken";;
#require "lemonade";;
#require "lwt.unix";;
#require "atdgen";;
#directory "/Users/michael/Workshop/rashell/src";;
#directory "/Users/michael/obj/Workshop/rashell/src";;
The two #directory directives correspond to the directories for sources and objects.
(Disclaimer: if you look carefully at the history, you will find that I took some liberties with the chronology, but there are other projects where I proceed exactly this way – I just cannot remember precisely which ones.)

How to achieve Asynchrony instead of Parallelism in F#

(Sticking to a common example with async fetch of many web pages)
How would I spin off multiple (hundreds) of web page requests asynchronously, and then wait for all requests to complete before going to the next step? Async.AsParallel processes a few requests at a time, controlled by number of cores on the CPU. Grabbing a web page is not a CPU-bound operation. Not satisfied with the speedup of Async.AsParallel, I am looking for alternatives.
I tried to connect the dots between Async.StartAsTask and Task[].WaitAll. Instinctively, I wrote the following code, but it does not compile.
let processItemsConcurrently (items : int seq) =
let tasks = items |> Seq.map (fun item -> Async.StartAsTask(fetchAsync item))
Tasks.Task.WaitAll(tasks)
How would you approach this?
Async.Parallel is almost definitely right here. Not sure what you're not happy with; the strength of F# asyncs lies more in async computing than in task-parallel CPU-bound stuff (which is more tailored to Tasks and the .NET 4.0 TPL). Here's a full example:
open System.Diagnostics
open System.IO
open System.Net
open Microsoft.FSharp.Control.WebExtensions
let sites = [|
"http://bing.com"
"http://google.com"
"http://cnn.com"
"http://stackoverflow.com"
"http://yahoo.com"
"http://msdn.com"
"http://microsoft.com"
"http://apple.com"
"http://nfl.com"
"http://amazon.com"
"http://ebay.com"
"http://expedia.com"
"http://twitter.com"
"http://reddit.com"
"http://hulu.com"
"http://youtube.com"
"http://wikipedia.org"
"http://live.com"
"http://msn.com"
"http://wordpress.com"
|]
let print s =
// careful, don't create a synchronization bottleneck by printing
//printf "%s" s
()
let printSummary info fullTimeMs =
Array.sortInPlaceBy (fun (i,_,_) -> i) info
// for i, size, time in info do
// printfn "%2d %7d %5d" i size time
let longest = info |> Array.map (fun (_,_,time) -> time) |> Array.max
printfn "longest request took %dms" longest
let bytes = info |> Array.sumBy (fun (_,size,_) -> float size)
let seconds = float fullTimeMs / 1000.
printfn "sucked down %7.2f KB/s" (bytes / 1024.0 / seconds)
let FetchAllSync() =
let allsw = Stopwatch.StartNew()
let info = sites |> Array.mapi (fun i url ->
let sw = Stopwatch.StartNew()
print "S"
let req = WebRequest.Create(url)
use resp = req.GetResponse()
use stream = resp.GetResponseStream()
use reader = new StreamReader(stream,
System.Text.Encoding.UTF8, true, 4096)
print "-"
let contents = reader.ReadToEnd()
print "r"
i, contents.Length, sw.ElapsedMilliseconds)
let time = allsw.ElapsedMilliseconds
printSummary info time
time, info |> Array.sumBy (fun (_,size,_) -> size)
let FetchAllAsync() =
let allsw = Stopwatch.StartNew()
let info = sites |> Array.mapi (fun i url -> async {
let sw = Stopwatch.StartNew()
print "S"
let req = WebRequest.Create(url)
use! resp = req.AsyncGetResponse()
use stream = resp.GetResponseStream()
use reader = new AsyncStreamReader(stream, // F# PowerPack
System.Text.Encoding.UTF8, true, 4096)
print "-"
let! contents = reader.ReadToEnd() // in F# PowerPack
print "r"
return i, contents.Length, sw.ElapsedMilliseconds })
|> Async.Parallel
|> Async.RunSynchronously
let time = allsw.ElapsedMilliseconds
printSummary info time
time, info |> Array.sumBy (fun (_,size,_) -> size)
// By default, I think .NET limits you to 2 open connections at once
ServicePointManager.DefaultConnectionLimit <- sites.Length
for i in 1..3 do // to warmup and show variance
let time1,r1 = FetchAllSync()
printfn "Sync took %dms, result was %d" time1 r1
let time2,r2 = FetchAllAsync()
printfn "Async took %dms, result was %d (speedup=%2.2f)"
time2 r2 (float time1/ float time2)
printfn ""
On my 4-core box, this consistently gives a nearly 4x speedup.
EDIT
In reply to your comment, I've updated the code. You're right in that I've added more sites and am not seeing the expected speedup (still holding steady around 4x). I've started adding a little debugging output above, will continue investigating to see if something else is throttling the connections...
EDIT
Editted the code again. Well, I found what might be the bottleneck. Here's the implementation of AsyncReadToEnd in the PowerPack:
type System.IO.StreamReader with
member s.AsyncReadToEnd () =
FileExtensions.UnblockViaNewThread (fun () -> s.ReadToEnd())
In other words, it just blocks a threadpool thread and reads synchronously. Argh!!! Let me see if I can work around that.
EDIT
Ok, the AsyncStreamReader in the PowerPack does the right thing, and I'm using that now.
However, the key issue seems to be variance.
When you hit, say, cnn.com, a lot of the time the result will come back in like 500ms. But every once in a while you get that one request that takes 4s, and this of course potentially kills the apparent async perf, since the overall time is the time of the unluckiest request.
Running the program above, I see speedups from about 2.5x to 9x on my 2-core box at home. It is very highly variable, though. It's still possible there's some bottleneck in the program that I've missed, but I think the variance-of-the-web may account for all of what I'm seeing at this point.
Using the Reactive Extensions for .NET combined with F#, you can write a very elegant solution - check out the sample at http://blog.paulbetts.org/index.php/2010/11/16/making-async-io-work-for-you-reactive-style/ (this uses C#, but using F# is easy too; the key is using the Begin/End methods instead of the sync method, which even if you can make it compile, it will block up n ThreadPool threads unnecessarily, instead of the Threadpool just picking up completion routines as they come in)
My bet is that the speedup you're experiencing is not significant enough for your taste because you're either using a subtype of WebRequest or a class relying on it (such as WebClient).
If that's the case, you need to set the MaxConnection on the ConnectionManagementElement (and I suggest you only set it if needed otherwise it's gonna become a pretty time-consuming operation) to a high value, depending on the number of simultaneous connections you wanna initiate from your application.
I'm not an F# guy, but from a pure .NET perspective what you're looking for is TaskFactory::FromAsync where the asynchronous call you'd be wrapping in a Task would be something like HttpRequest::BeginGetResponse. You could also wrap up the EAP model that WebClient exposes using a TaskCompletionSource. More on both of these topics here on MSDN.
Hopefully with this knowledge you can find the nearest native F# approach to accomplish what you're trying to do.
Here's some code that avoids the unknowns, such as web access latency. I am getting under 5% CPU utilization, and about 60-80% efficiency for both sync and async code paths.
open System.Diagnostics
let numWorkers = 200
let asyncDelay = 50
let main =
let codeBlocks = [for i in 1..numWorkers ->
async { do! Async.Sleep asyncDelay } ]
while true do
printfn "Concurrent started..."
let sw = new Stopwatch()
sw.Start()
codeBlocks |> Async.Parallel |> Async.RunSynchronously |> ignore
sw.Stop()
printfn "Concurrent in %d millisec" sw.ElapsedMilliseconds
printfn "efficiency: %d%%" (int64 (asyncDelay * 100) / sw.ElapsedMilliseconds)
printfn "Synchronous started..."
let sw = new Stopwatch()
sw.Start()
for codeBlock in codeBlocks do codeBlock |> Async.RunSynchronously |> ignore
sw.Stop()
printfn "Synchronous in %d millisec" sw.ElapsedMilliseconds
printfn "efficiency: %d%%" (int64 (asyncDelay * numWorkers * 100) / sw.ElapsedMilliseconds)
main

Resources