Is it safe to run two instances of `stack` at the same time? - haskell-stack

See title.
It would be nice if I could run multiple instances of stack at the same time. It would allow for some nice parallelisation.
I do not know beforehand which commands I will want to run, so I cannot just merge the commands and have stack figure out how to run them in parallel.
If this is not possible, is it in the scope of Stack?

Running multiple instances of stack concurrently is supposed to be possible and safe.
I know that parts of ~/.stack are protected by file locks – the project-local state in my-proj/.stack-work should get the same protection in the future.
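If you drive stack from a script, one way to experiment with this is simply to launch the invocations as separate OS processes and let stack's own locking arbitrate access to ~/.stack. A minimal Python sketch, assuming two separate project directories (the paths and commands are placeholders, not taken from the question):

import subprocess

# Hypothetical example: two independent stack builds in two separate project
# directories, launched side by side. The project paths are placeholders.
builds = {
    "/path/to/proj-a": ["stack", "build"],
    "/path/to/proj-b": ["stack", "build"],
}

procs = [subprocess.Popen(cmd, cwd=path) for path, cmd in builds.items()]
exit_codes = [p.wait() for p in procs]
print(exit_codes)

Keeping the concurrent invocations in different project directories avoids contending on the same .stack-work, which is the part that does not yet have lock protection.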

Related

Spark Performance Tuning Question - Resetting all caches for performance testing

I'm currently working on performance and memory tuning for a Spark process. As part of this I'm performing multiple runs of different versions of the code and trying to compare their results side by side.
I've got a few questions to ask, so I'll post each separately so they can be addressed separately.
Currently, it looks like getOrCreate() is re-using the Spark Context each run. This is causing me two problems:
Caching from one run may be affecting the results of future runs.
All of the tasks are bundled into a single 'job', and I have to guess at which tasks correspond to which test run.
I'd like to ensure that I'm properly resetting all caches in each run to ensure that my results are comparable. I'd also ideally like some way of having each run show up as a separate job in the local job history server so that it's easier for me to compare.
I'm currently relying on spark.catalog.clearCache(), but I'm not sure whether it covers everything I need. I'd also like a way to ensure that the tasks for each run are clearly grouped for comparison, so I can see where I'm losing time and, ideally, the total memory used by each run as well (that is one of the things I'm currently trying to improve).
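One way to approach both problems, sketched in PySpark (the run names and workloads below are placeholders, not from the question): stop the session at the end of each run so no cached data survives into the next one, clear the catalog cache defensively at the start, and tag each run with setJobGroup so its jobs and tasks are grouped together in the Spark UI. Because stopping and recreating the context gives each run its own application id, each run should also show up separately in the history server.

from pyspark.sql import SparkSession

def run_test(run_name, workload):
    # Fresh session per run: getOrCreate() only reuses a context that is still
    # alive, so stopping it at the end means the next run starts clean.
    spark = SparkSession.builder.appName(run_name).getOrCreate()
    spark.catalog.clearCache()                      # defensively drop cached tables/DataFrames
    spark.sparkContext.setJobGroup(run_name, "perf comparison run: " + run_name)
    try:
        workload(spark)
    finally:
        spark.stop()                                # tear the context down between runs

# Hypothetical workloads standing in for the real code versions being compared.
run_test("baseline", lambda s: s.range(10**6).selectExpr("sum(id)").collect())
run_test("candidate", lambda s: s.range(10**6).selectExpr("sum(id)").collect())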

OpenMDAO External Code Component with MPI

I am trying to optimize an airfoil using OpenMDAO and SU2. I have multiple design points that I want to run in parallel. I managed to do that with a "Parallel Group" and XFoil, but I now want to use SU2 instead of XFoil.
The big problem is that SU2 itself is started by MPI (mpirun -np 4 SU2_CFD config.cfg). I want OpenMDAO to divide all the available processes evenly across the design points, and then run one SU2 instance per design point. Every SU2 instance should then use all the processes that OpenMDAO allocated to that design point.
How could I do that?
Probably wrong approach:
I played around with the external code component. But if this component gets 2 processes, it is run twice. I don't want to run SU2 twice; I want to run it once, using both available processes.
Best Regards
David
I don't think your approach to wrapping SU2 is going to work, if you want to run it in parallel as part of a larger model. ExternalCodeComp is designed for file-wrapping and spawns sub-processes, which doesn't give you any way to share MPI communicators with the parent process (that I know of anyway).
I'm not an expert in SU2, so I can't speak to their Python interface. But I'm quite confident that ExternalCodeComp isn't going to give you what you want here. I suggest you talk to the SU2 developers about their in-memory interface.
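For reference, the kind of communicator splitting an in-memory wrapper would rely on looks roughly like this in mpi4py. This is only a sketch: the number of design points and ranks per point are made up for illustration, and none of it is SU2's or OpenMDAO's actual API.

from mpi4py import MPI

world = MPI.COMM_WORLD
n_design_points = 2                                   # hypothetical: two design points
procs_per_point = max(1, world.size // n_design_points)

# Assign each rank to one design point and build one sub-communicator per point;
# an in-memory solver would then be handed point_comm instead of COMM_WORLD.
color = min(world.rank // procs_per_point, n_design_points - 1)
point_comm = world.Split(color=color, key=world.rank)

print("world rank %d -> design point %d (local rank %d of %d)"
      % (world.rank, color, point_comm.rank, point_comm.size))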
I couldn't figure out a simple way, but I discovered ADflow: https://github.com/mdolab/adflow.
It is a CFD-Solver that comes shipped with an OpenMDAO-Wrapper. So I am going to use that.

Driving motors with image processing on Raspberry Pi

I have a question about processing images while driving a motor. I did some research, and it seems I probably need to use multiprocessing. However, I couldn't find out how to run the two tasks together.
Let's say I have two functions, imageProcessing() and DrivingMotor(). Using the information coming from imageProcessing(), I need to update my DrivingMotor() function simultaneously. How can I handle this?
With multiprocessing you must create two processes (a process is a program in execution) and implement inter-process communication so they can talk to each other, which is tedious and, for a tightly coupled task like this, less efficient than multithreading. I therefore think you should use multithreading: communication between threads is very easy, since they can share data directly.
You would create two threads, one handling imageProcessing() and the other DrivingMotor(); the operating system schedules the threads and runs them concurrently.
There is a basic tutorial on Python multithreading at the link below:
https://www.tutorialspoint.com/python/python_multithreading.htm
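A minimal sketch of that two-thread setup. The function bodies are placeholders, not real camera or motor code, and a Queue is used instead of bare globals so no explicit locking is needed.

import queue
import threading
import time

commands = queue.Queue()           # thread-safe channel from the vision thread to the motor thread

def image_processing():
    # Placeholder for the real camera/vision loop; here we just emit fake steering values.
    for value in (0.1, -0.2, 0.0):
        commands.put(value)
        time.sleep(0.5)
    commands.put(None)             # sentinel: tell the motor thread to stop

def driving_motor():
    while True:
        value = commands.get()     # blocks until the vision thread produces something
        if value is None:
            break
        print("apply steering", value)   # placeholder for real motor commands

vision = threading.Thread(target=image_processing)
motor = threading.Thread(target=driving_motor)
vision.start(); motor.start()
vision.join(); motor.join()

If the image processing turns out to be CPU-bound enough that the threads contend on Python's GIL, the same shape also works with multiprocessing.Process and multiprocessing.Queue.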

Is it suitable to use MPI_Comm_split when assigning different jobs to different groups?

I'm writing an MPI program where all processes are divided into two groups. Each group does different jobs. For example, processes of group A do some computation and communicate with each other, while processes of group B do nothing. Should I use MPI_Comm_split there?
I'd prefer to add a comment but I'm new to stack overflow so don't have sufficient reputation ...
As already mentioned, sub-communicators are essential if you want to call collectives. Even without that, they'd be recommended as they'll make development easier. For example, if you try and send a message outside of group A then this will fail with a sub-communicator, but could cause your code to hang/misbehave if everyone stays in COMM_WORLD.
However, I would be very careful of going down the MPMD route as it may not be portable between systems and makes launching the program more complicated. Having a single MPI executable is the standard and simplest model.
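To make the collective point concrete, here is a hedged sketch of the split, written with mpi4py for brevity; MPI_Comm_split in C has the same color/key semantics, and the even/odd grouping is just an example, not anything from the question.

from mpi4py import MPI

world = MPI.COMM_WORLD

# Example grouping: even world ranks form group A, odd ranks form group B.
color = 0 if world.rank % 2 == 0 else 1
group_comm = world.Split(color=color, key=world.rank)

if color == 0:
    # This collective runs over group A only; group B ranks never enter it,
    # so they cannot hang waiting on a COMM_WORLD-wide operation.
    total = group_comm.allreduce(world.rank, op=MPI.SUM)
    if group_comm.rank == 0:
        print("group A: sum of world ranks =", total)
else:
    pass  # group B does its own work (or nothing) on its sub-communicator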

What's the best Erlang approach to identifying a process's identity from its process id?

When I'm debugging, I'm usually looking at about 5000 processes, each of which could be one of about 100 gen_servers, fsms, etc. If I want to know WHAT an erlang process is, I can do:
process_info(pid(0,1,0), initial_call).
And get a result like:
{initial_call,{proc_lib,init_p,5}}
...which is all but useless.
More recently, I hit upon the idea (brace yourselves) of registering each process with a name that told me WHO that process represented. For example, player_1150 is the player process that represents player 1150. Yes, I end up making a couple million atoms over the course of a week-long run. (And I would love to hear comments on the drawbacks of boosting the limit to 10,000,000 atoms when my system runs with about 8GB of real memory unused, if there are any.) Doing this meant that I could, at the console of a live system, query all processes for how long their message queue was, find the top offenders, then check to see if those processes were registered and print out the atom they were registered with.
I've hit a snag with this: I'm moving processes from one node to another. Now a player process can have 3 different names: player_1158, player_1158_deprecating, player_1158_replacement. And I have to register and unregister these names with precise timing to guarantee that a process is always named, that the appropriate names always exist, AND that I don't try to register a name that some dying process still holds. There is some slop room, since this is only used for console debugging of a live system. Nonetheless, the moment I started feeling like this mechanism was affecting how I develop the system (the one that moves processes around), I felt it was time to do something else.
There are two ideas on the table for me right now. The first is an ets table that associates process ids with their description:
ets:insert(Tab, {self(), {player, 1158}}).
I don't really like that one because I have to manually keep the table clean. When a player exits (or crashes), someone is responsible for making sure that its entry is removed from the ets table.
The second alternative was to use the process dictionary, storing similar information there. When my exploration of a live system leads me to wonder who a process is, I can just look at its process dictionary using process_info.
I realize that neither of these solutions is functionally clean, but given that the system itself is never, EVER the consumer of these data, I'm not too worried about it. I need certain debugging tools to work quickly and easily, so the behavior described is not open for debate. Are there any convincing arguments to go one way or the other (other than the academic "don't use the process dictionary, it's evil" canned garbage)? I'd be happy to hear other suggestions and their justifications.
You should try out gproc, it's a very convenient application for keeping process metadata.
A process can be registered with several names and you can associate arbitrary properties to a process (where the key and value can be any erlang term). Also gproc monitors the registered processes and unregisters them automatically if they crash.
If you're debugging gen_servers and gen_fsms while they're still running, I would implement the handle_info functions for these behaviors. When you send each process a {get_info, ReplyPid} tuple, the process in question can send back a term describing its own state, what it is, etc. That way you don't have to keep track of this information outside of the process itself.
Isac mentions there is already a built-in way to do this.
