Alea doesn't release memory correctly - aleagpu

It seems that Alea is not disposing the DeviceMemory2D class correctly.
Here's my code for checking free memory:
let getFreeMemory () =
    let free = Marshal.AllocHGlobal sizeof<uint64>
    let total = Marshal.AllocHGlobal sizeof<uint64>
    let freePtr = NativeInterop.NativePtr.ofNativeInt<nativeint> free
    let totalPtr = NativeInterop.NativePtr.ofNativeInt<nativeint> total
    CUDAInterop.cuMemGetInfo(freePtr, totalPtr)
    |> cuSafeCall
    let result = NativeInterop.NativePtr.get freePtr 0
    Marshal.FreeHGlobal free
    Marshal.FreeHGlobal total
    uint64 result
And here's the code that checks whether there's any leak:
Seq.init 100 (fun _ ->
    use arr = Gpu.Default.AllocateDevice<float>(1000, 1000)
    getFreeMemory()
    |> printfn "%d"
)
|> Seq.iter id
From the output (shown as an image in the original post), the memory doesn't appear to be freed.

Can you try the latest beta version: https://www.nuget.org/packages/Alea/3.0.4-beta3

How to use SetDisplayConfig (windows-rs) to force screen re-detection?

I am using windows-rs (the latest version from GitHub, because it contains some fixes the stable version on crates.io doesn't have).
My goal is to develop a small program that automatically forces the screen to be re-detected and set to the highest resolution. (It's for a school with an unusual setup: teachers have to turn the projectors on before the PC for the resolutions to be detected, but they often forget, leaving the PCs at a very low resolution with the higher resolutions undetected.)
For re-initializing the screen, I have the following function:
// Some imports may be unused here, I haven't checked them yet; the full file has more functions
use windows::Win32::Graphics::Gdi::{ChangeDisplaySettingsA, EnumDisplaySettingsA, DEVMODEA, SDC_FORCE_MODE_ENUMERATION, SDC_APPLY, SDC_SAVE_TO_DATABASE, SDC_USE_SUPPLIED_DISPLAY_CONFIG, QDC_ALL_PATHS};
use windows::Win32::Media::Audio::Endpoints::IAudioEndpointVolume;
use windows::Win32::Media::Audio::{IMMDeviceEnumerator, MMDeviceEnumerator};
use windows::Win32::Devices::Display::{GetDisplayConfigBufferSizes, QueryDisplayConfig, SetDisplayConfig, DISPLAYCONFIG_TOPOLOGY_ID};
use windows::core::GUID;
use windows::Win32::System::Com::{CoInitialize, CoCreateInstance, CLSCTX_ALL};

// Forces Windows to reinit display settings
pub fn force_reinit_screen() -> i32 {
    let mut path_count = 0;
    let mut mode_count = 0;
    let result = unsafe { GetDisplayConfigBufferSizes(QDC_ALL_PATHS, &mut path_count, &mut mode_count) };
    println!("GetDisplayConfigBufferSizes returned {}", result);
    let mut path_array = Vec::with_capacity(path_count as usize);
    let mut mode_array = Vec::with_capacity(mode_count as usize);
    let result = unsafe {
        QueryDisplayConfig(
            QDC_ALL_PATHS,
            &mut path_count,
            path_array.as_mut_ptr(),
            &mut mode_count,
            mode_array.as_mut_ptr(),
            ::core::mem::transmute(::core::ptr::null::<DISPLAYCONFIG_TOPOLOGY_ID>()),
        )
    };
    println!("QueryDisplayConfig returned {}", result);
    let flags = SDC_FORCE_MODE_ENUMERATION | SDC_APPLY | SDC_USE_SUPPLIED_DISPLAY_CONFIG | SDC_SAVE_TO_DATABASE;
    let result = unsafe { SetDisplayConfig(Some(&path_array), Some(&mode_array), flags) };
    result
}
However, it does not work on any computer I've tried it on (it returns code 87, which seems to mean invalid parameters). What am I doing wrong?
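A likely contributor to error 87 (an assumption on my part, not confirmed in the thread): `Vec::with_capacity` reserves storage but leaves the vector's length at 0, and `QueryDisplayConfig` writing through `as_mut_ptr()` does not change that length, so `Some(&path_array)` hands `SetDisplayConfig` an empty slice. A minimal pure-Rust sketch of the mechanism, with the Windows structs replaced by `u32` placeholders and an arbitrary buffer size of 16:

```rust
fn main() {
    // with_capacity reserves memory, but the vector's length stays 0.
    let path_array: Vec<u32> = Vec::with_capacity(16);
    assert_eq!(path_array.len(), 0);

    // Taking a slice of it therefore yields an empty slice, which is
    // what a call like SetDisplayConfig(Some(&path_array), ...) would see.
    let as_slice: &[u32] = &path_array;
    assert_eq!(as_slice.len(), 0);

    // Initializing the elements up front gives the buffer a real length
    // (the counts returned by GetDisplayConfigBufferSizes would be used here).
    let fixed: Vec<u32> = vec![0; 16];
    assert_eq!(fixed.len(), 16);

    println!("with_capacity len = {}, vec! len = {}", path_array.len(), fixed.len());
}
```

If that is indeed the cause, allocating with `vec![Default::default(); path_count as usize]` (and truncating to the counts the API writes back) would make the slices non-empty before the `SetDisplayConfig` call.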

Visual Studio - Breakpoints not being hit with Array.map

In the code below (which is part of a function defined with = task { ):
Breakpoint 3 is being hit, but breakpoints 1 and 2 are not. The code is working (i.e. the code inside the Array.map runs and produces 29 rows), but I can no longer debug that part. The breakpoints were being hit, and then they suddenly stopped. I've tried resetting settings, rebooting the machine, etc. Any ideas? All the builds are Debug / Any CPU.
let rows =
    data.Rows
    |> Seq.toArray
    |> Array.map (fun row ->
        let transactionDateString = row[0] + " " + row[1] // Breakpoint 1
        let ttt = TimeZoneInfo.ConvertTime(DateTime.Parse(transactionDateString, CultureInfo.CreateSpecificCulture("en-AU")), TimeZoneInfo.FindSystemTimeZoneById("E. Australia Standard Time"), TimeZoneInfo.Local)
        let transactionDateTime = DateTime.ParseExact(transactionDateString, "dd/MM/yyyy HH:mm:ss", CultureInfo.InvariantCulture).AddHours(2)
        let ttt2 = ttt.ToLocalTime()
        let transactionNumber = row.Columns[2].ToString() // ["TRXN Reference-Eway"]
        let t2 = int transactionNumber // Breakpoint 2
        let transactionType = row[3].ToString()
        let transactionCurrency = currency
        let csvData: EwayCSVData = {
            transactionDateTime = transactionDateTime
            transactionNumber = transactionNumber
            transactionType = transactionType
            currency = transactionCurrency
        }
        csvData)
do! EwayTransactions.saveTransactions(rows, sTransCollection, httpClient) // Breakpoint 3

F# | How to manage WebSocketClient ReceiveAsync in a multithreaded scenario?

Looking for WebSocketClient examples, I only found simple ones with a single request/response scenario, something like:
type WSClientSimple (url) =
    let ws = new ClientWebSocket()
    let lockConnection = Object()
    let connect() =
        lock lockConnection (fun () ->
            if not (ws.State = WebSocketState.Open) then
                ws.ConnectAsync(Uri(url), CancellationToken.None)
                |> Async.AwaitTask |> Async.RunSynchronously // await
            else ())
    let receive () =
        lock lockConnection (fun () ->
            let rec readStream finalText endOfMessage =
                let buffer = ArraySegment(Array.zeroCreate<byte> 1024)
                let result = ws.ReceiveAsync(buffer, CancellationToken.None) |> Async.AwaitTask |> Async.RunSynchronously
                let text = finalText + Encoding.UTF8.GetString (buffer.Array |> Array.take result.Count)
                if result.EndOfMessage then text
                else readStream text true
            readStream "" false)
    let sendRequest jsonMessage =
        let bytes = Encoding.UTF8.GetBytes(jsonMessage: string)
        let bytesMessage = ArraySegment(bytes, 0, bytes.Length)
        if not (ws.State = WebSocketState.Open) then
            connect()
        // send request...
        ws.SendAsync(bytesMessage, WebSocketMessageType.Text, true, CancellationToken.None) |> Async.AwaitTask |> Async.RunSynchronously
        // ... read response
        receive()
    member this.SendRequest request = sendRequest request
Obviously it works with:
[<Test>]
member this.``Receive sequentially`` () =
    let client = WSClientSimple("url")
    for i in 1..100 do
        client.SendRequest "aaa" |> ignore
and also (thanks to the horrible lock) with multiple threads using the same client:
[<Test>]
member this.``Receive parallel on same client`` () =
    let client = WSClientSimple("url")
    for _ in 1..100 do
        async {
            client.SendRequest "aaa" |> ignore
        } |> Async.Start
Now, if I really want to get the best out of the WebSocket's duplex communication, I would continuously read from the socket, send requests without any blocking, and dispatch the received messages to the right callers.
So, this is an ongoing receive function that collects all the inbound messages:
type WSClientTest2 (url: string) =
    let onMessageReceived = new Event<string>()
    let responseMessage = new Event<ResponseMessage>()
    let receivedMessages = System.Collections.Concurrent.ConcurrentQueue<ResponseMessage>()
    let responseCallbacks = Map.empty<int, (string -> unit)>
    let manageMessage (message: string) =
        match message.Split(':') with
        | [| id; message |] ->
            responseMessage.Trigger { Id = int id; Message = message }
            receivedMessages.Enqueue { Id = int id; Message = message }
        | _ -> ()
    let startReceiving() =
        let mutable counter = 1
        async {
            // simulate receiving from a WebSocket
            while true do
                System.Threading.Tasks.Task.Delay 100 |> Async.AwaitTask |> Async.RunSynchronously
                onMessageReceived.Trigger (sprintf "message %d" counter)
                manageMessage (sprintf "%d:message" counter)
                counter <- counter + 1
        } |> Async.Start
    do
        startReceiving()
How can I send a request and wait for the correlated response message?
This is my try:
let mutable requestId = 0
let sendRequest message: string =
    let requestId = requestId + 1
    let received = new Event<string>()
    let receivedCall = fun (msg: string) ->
        received.Trigger msg
    responseCallbacks.Add(requestId, receivedCall) |> ignore
    let cancel = fun () -> failwith "Timeout"
    async {
        System.Threading.Thread.Sleep 500 // wait x seconds
        cancel()
    } |> Async.Start
    // simulate send/receive message after some time
    let generateRequest () =
        System.Threading.Thread.Sleep 100 // wait x time for the response
        responseMessage.Trigger { Id = requestId; Message = message }
    generateRequest()
    Async.AwaitEvent(received.Publish, cancel)
    |> Async.RunSynchronously
Async.AwaitWaitHandle seems like the right thing to use, but I don't know how to create a WaitHandle.
I'm using Async.AwaitEvent, but it doesn't seem to work: cancel() is always called, but it does not raise any exception!
What could be a proper way to wait for an event while executing a function, and then check and return its content?
I also tried to use a Map<id, response> populated with every inbound message, but I still don't know how to "wait" for the proper message, and it probably also requires a check for orphan response messages (added complexity).
More generally, if the resulting code ends up this crappy, I would prefer to use a simple API for the request/response scenario and use the WebSocket only for realtime updates.
I'm looking for a nice solution; otherwise I don't think it's really worth it for the sake of performance, at least not for my needs.

Why don't I get output on OSX with this F# code but I do on Windows

I'm trying to execute the following F# script code on my MacBook Pro, using FSI in Visual Studio Code with the Ionide plugin.
#r "packages/Newtonsoft.Json.9.0.1/lib/net40/Newtonsoft.Json.dll"
#r "System.Net.Http"

open System
open System.Net.Http
open Newtonsoft.Json

let client = new HttpClient()

type AlbumInfo = { userId: int; id: int; title: string }

let url = "https://jsonplaceholder.typicode.com/albums/1"

async {
    let! res = Async.AwaitTask <| client.GetAsync(url)
    let! content = Async.AwaitTask <| res.Content.ReadAsStringAsync()
    let x = JsonConvert.DeserializeObject<AlbumInfo>(content)
    printfn "%s" x.title
} |> Async.Start

printfn "Please wait..."
But I don't get any output apart from Please wait.... However, when I put https://jsonplaceholder.typicode.com/albums/1 into the browser, I get the expected JSON response, so I know there's no problem reaching the API.
Also, when I run the same code in Visual Studio 2013 on my Windows 10 PC, it produces the expected result, i.e. Please wait... and the title of the album.
Any ideas why it doesn't work correctly on my MacBook?
In Visual Studio there is a process hosting FSI and keeping the thread (pool) for the async computation alive. In FSI on the command line or VS Code, FSI will just terminate as soon as the main thread has finished writing Please wait... (which typically is before the computation was even started on the thread pool).
If you want to observe the side effects of an async computation you have to await its result (in this example unit):
let computation = async {
    printfn "Starting async"
    let! res = Async.AwaitTask <| client.GetAsync(url)
    let! content = Async.AwaitTask <| res.Content.ReadAsStringAsync()
    let x = JsonConvert.DeserializeObject<AlbumInfo>(content)
    printfn "Downloaded %s" x.title
}

async {
    let! started = computation |> Async.StartChild
    let! _ = Async.Sleep 1 // only here to get interleaved output
    printfn "Please wait..."
    let! res = started
    printfn "Got result %A" res
} |> Async.RunSynchronously
will likely print:
Starting async
Please wait...
Downloaded quidem molestiae enim
Got result <null>

F# Akka.NET agents performance optimization when synchronizing

I am trying to solve the following problem. I have some agents running in real time, with a large heartbeat of a couple of milliseconds, and for this reason the order of operations they process is mostly deterministic (as message processing is not the bottleneck).
Now, I am running a large number of simulations of the system where I no longer have a heartbeat (otherwise it would take a couple of centuries), but I need to make sure the order of operations is preserved. For this, I adopted the following solution: the simulator makes sure each agent has processed its message queue, by posting a dummy synchronization message and blocking while waiting for the answer. This does work for my application, but the time it takes is not intuitive: a single-threaded implementation would be an order of magnitude faster (I guess x100-ish, although I have not tested it).
I have isolated a small test that shows the issue, even trying another library, Akka.NET:
type Greet =
    | Greet of string
    | Hello of AsyncReplyChannel<bool>
    | Hello2

[<EntryPoint>]
let main argv =
    let system = System.create "MySystem" <| Configuration.load()
    let greeter = spawn system "greeter" <| fun mailbox ->
        let rec loop() = actor {
            let! msg = mailbox.Receive()
            let sender = mailbox.Sender()
            match msg with
            | Greet who -> () // printf "Hello, %s!\n" who
            | Hello2 -> sender.Tell(true)
            | _ -> ()
            return! loop()
        }
        loop()
    let greeterF =
        MailboxProcessor.Start(fun inbox ->
            async {
                while true do
                    let! msg = inbox.Receive()
                    match msg with
                    | Greet who -> () // printf "Hello, %s!\n" who
                    | Hello reply -> reply.Reply true
                    | _ -> ()
            })
    let n = 1000000
    let t1 = System.Diagnostics.Stopwatch()
    t1.Start()
    for i = 1 to n do
        greeterF.PostAndReply(fun reply -> Hello reply) |> ignore
    printfn "elapsed Mailbox:%A" t1.ElapsedMilliseconds
    t1.Restart()
    for i = 1 to n do
        let res = greeter.Ask Hello2
        res.Result |> ignore
    printfn "elapsed Akka:%A" t1.ElapsedMilliseconds
    System.Console.ReadLine() |> ignore
    0
Basically, both take about 10 seconds for a mere 1 million synchronizations, with no computation whatsoever involved, and this is... unfortunate.
I am wondering if anybody has come across the same problem, and if there is any way to switch off the overhead and force everything to run in single-threaded mode - something better than deactivating all the CPUs but one in the BIOS, or writing a clone of the whole system without the agents.
Any help is well appreciated.
The reason the Akka.NET version is slow here is how you communicate with the actor:
main process      Task      FutureActorRef      !!ThreadPool!!      greeter
Ask ------------->
                            Tell --------------------------------->
                                                MailboxRun ------->
                                                (greeter mailbox is empty)
                            <-------------------------------- Tell
     <-- Complete task
<-- .Result
For each iteration, a TPL Task is created.
Then a single message is sent to the greeter.
The main process then blocks while waiting for the response to come back.
The greeter replies, which in turn completes the task inside the FutureActorRef.
Rinse and repeat.
This design causes Akka.NET to start and stop the greeter's "mailbox run" for each message, as the mailbox queue becomes empty on every iteration.
This results in thread-pool scheduling for every single message that is passed.
It's a bit like getting into your car, putting the pedal to the metal, then abruptly stopping and stepping out of the car, and repeating the procedure again.
That is just not a very effective way to travel fast.
@Aaronontheweb's suggestion will only have an effect if you work out the above kinks in your code.
The mailbox needs to be able to constantly pick messages off the internal queue to work with messages in batches and achieve full throughput.
Instead, separate the producer from the consumer:
Create an actor that listens for the responses from your greeter.
Once that actor has processed your 1,000,000 messages, let it send a WorkCompleted message back to the consumer.
[Edit]
I gave it a shot myself. I don't know F#, so it might not be completely idiomatic :)
open Akka
open Akka.Actor
open Akka.FSharp

type Greet =
    | Greet of string
    | Hello of AsyncReplyChannel<bool>
    | Hello2

type Consume =
    | Response
    | SetSender

[<EntryPoint>]
let main argv =
    let system = System.create "MySystem" <| Configuration.load()
    let greeter = spawn system "greeter" <| fun mailbox ->
        let rec loop() = actor {
            let! msg = mailbox.Receive()
            let sender = mailbox.Sender()
            match msg with
            | Greet who -> () // printf "Hello, %s!\n" who
            | Hello2 -> sender.Tell(Response)
            | _ -> ()
            return! loop()
        }
        loop()
    let consumer = spawn system "consumer" <| fun mailbox ->
        let rec loop(count, sender: IActorRef) = actor {
            if count = 1000000 then sender.Tell(true)
            let! msg = mailbox.Receive()
            match msg with
            | Response -> return! loop(count + 1, sender)
            | SetSender -> return! loop(count, mailbox.Sender())
        }
        loop(0, null)
    let n = 1000000
    let t1 = System.Diagnostics.Stopwatch()
    t1.Start()
    for i = 1 to n do
        greeter.Tell(Hello2, consumer)
    let workdone = consumer.Ask SetSender
    workdone.Wait()
    printfn "elapsed Akka:%A" t1.ElapsedMilliseconds
    System.Console.ReadLine() |> ignore
    0
I updated your code to use a separate consumer for the actor responses, which then replies back once all responses have been processed.
By doing so, your processing time is now down to 650 ms on my machine.
If you want better throughput, you need to involve more actors to parallelize more.
I'm not sure if this helps in your specific scenario.
Here is a slightly modified MailboxProcessor version:
module MBPAsync =
    open System.Diagnostics

    type Greet =
        | Greet of string
        | Hello of AsyncReplyChannel<bool>

    let run n =
        let timer = Stopwatch.StartNew ()
        use greeter =
            MailboxProcessor.Start <| fun inbox -> async {
                while true do
                    let! msg = inbox.Receive()
                    match msg with
                    | Greet who -> () // printf "Hello, %s!\n" who
                    | Hello reply -> reply.Reply true
            }
        Async.RunSynchronously <| async {
            for i = 1 to n do
                do! Async.Ignore (greeter.PostAndAsyncReply Hello)
        }
        let elapsed = timer.Elapsed
        printfn "%A" elapsed
The difference here is that this version uses PostAndAsyncReply and keeps the computation in an async workflow. In my quick test this seemed to be much faster than using PostAndReply, but YMMV.
The timings I get from the above MBP version look roughly like this:
> MBPAsync.run 1000000 ;;
00:00:02.6883486
val it : unit = ()
A comment earlier mentioned my Hopac library. Here is an optimized version using Hopac:
module Hop =
    type Greet =
        | Greet of string
        | Hello of IVar<bool>

    let run n =
        let timer = Stopwatch.StartNew ()
        let greeterCh = ch ()
        do greeterCh >>= function
               | Greet who -> Job.unit ()
               | Hello reply -> reply <-= true
           |> Job.forever
           |> server
        Job.forUpToIgnore 1 n <| fun _ ->
            let reply = ivar ()
            greeterCh <-- Hello reply >>.
            reply
        |> run
        let elapsed = timer.Elapsed
        printfn "%A" elapsed
The timings I get from the above Hopac version look roughly like this:
> Hop.run 1000000 ;;
00:00:00.1088768
val it : unit = ()
I'm not an F# developer, but I'm a core dev on Akka.NET. A couple of ideas for your scenario:
If you're only using a single actor for this work, you can try a PinnedDispatcher - that way the actor runs on its own dedicated thread all the time, which saves you unnecessary context-switching overhead.
You can also set the throughput of the mailbox much higher for this PinnedDispatcher than the normal setting, i.e. a throughput value of 10,000 (or so) instead of the normal 25. Assuming the contents of your mailbox grow in large bursts, this should save you on mailbox synchronization overhead.
Here's what your dispatcher configuration might look like:
my-pinned-dispatcher {
    type = PinnedDispatcher
    throughput = 1000 # your mileage may vary
}
And then configure an actor to use it:
C# fluent interface:
var myActor = myActorSystem.ActorOf(Props.Create<FooActor>()
    .WithDispatcher("my-pinned-dispatcher"));
Config:
akka.actor.deployment {
    /greeter {
        dispatcher = my-pinned-dispatcher
    }
}
These are both options you can configure via HOCON in App.config or Web.config, or via the fluent interface on the Props class. Also worth noting: there's a bug with pinned dispatchers at the moment, but it should be fixed in our next maintenance release (v1.0.1), which should be out next week.
Your mileage may vary, but this is what I would try - basically it's designed to reduce contention and overhead around a single actor.
