In my application, I'd like to use asynchronous pipe open but use synchronous reading and writing.
Is this mixed usage supported and if yes, are there any repercussions to it? I don't find this documented and I wouldn't know what to test for. I can see that this might have performance or correctness issues that a casual test might not detect.
Related
I want to implement (in C++) a feature, using MPI, in an existing (non-MPI) application. I am thinking of using mpich-3.4.1 for this.
I am planning to create a .so file for that feature, which the original application can link to. I initially thought to have a function in the .so file that starts with an MPI_Init() and ends with MPI_Finalize() and, in between, calls all required MPI apis to do the parallel job. As part of the MPI job, the new feature makes the current application an MPI server by calling APIs like 'MPI_Open_port' and 'MPI_Comm_accept'. Other worker processes (possibly running on different machines) connect to this server, send/receive messages, and complete a heavy computation in parallel. The application then resumes its other non-mpi work.
It seems to me that Singleton MPI_INIT mechanism will be useful for this. I found the following page on Singleton Init:
https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node254.htm
This page says, "A high-quality implementation will allow any process (including those not started with a ``parallel application'' mechanism) to become an MPI process by calling MPI_INIT. Such a process can then connect to other MPI processes...".
However, the comments in mpich-3.4.1/src/mpi/init/init.c say, "The MPI standard does not say what a program can do before an 'MPI_INIT' or after an 'MPI_FINALIZE'. In the MPICH implementation, you should do as little as possible. In particular, avoid anything that changes the external state of the program, such as opening files, reading standard input or writing to standard output."
Based on the above comments, it seems we should not have MPI_Init(NULL, NULL) and MPI_Finalize() as part of any implementation in a library. In that case, I am thinking of having the init and finalize APIs in the original application's main function, and having the rest of the API calls made from the .so file. My original application is a large, working piece of software, and in some situations it may not need to execute my MPI feature at all.
My questions are:
(1) Does it make sense to have MPI_Init(NULL, NULL) and MPI_Finalize() called in the main function of this application, and the rest of the MPI functionality in a .so file?
(2) Once MPI_Init(NULL, NULL) is called in the main, would it interfere with the normal execution of the software in any way? Would there be any performance impact on the existing application?
(3) Is there an MPI implementation that handles this better?
(4) Is MPI a good approach to handle this requirement, or are other mechanisms like ZeroMQ better? In the comments made by Wesley Bland in the following link, he says that "MPI may not be right for you if you're looking for a client/server model. Yes, it's possible, but it's not really optimized for that use case and you might have better luck using a different communication mechanism". Is that true in 2022?
client relationship within MPI server
Here's my problem: I need to call multiple 3rd party methods inside an ApiController. The signature for those methods is Task DoSomethingAsync(SomeClass someData, SomeOtherClass moreData). I want those calls to continue running in the background, after the ApiController has sent the data back to the client. When DoSomethingAsync completes I want to do some logging and maybe save some data to the file system. How can I do that? I'd prefer to use the async/await syntax.
Great news, there is a new solution in .NET 4.5.2 called the QueueBackgroundWorkItem API. It's really simple to use:
HostingEnvironment.QueueBackgroundWorkItem(ct => DoSomething(a, b, c));
Here's an article that describes it in detail.
https://blogs.msdn.microsoft.com/webdev/2014/06/04/queuebackgroundworkitem-to-reliably-schedule-and-run-background-processes-in-asp-net/
And here's another article that mentions a few other approaches not mentioned in this thread.
http://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx
You almost never want to do this. It is almost always a big mistake.
ASP.NET (and most other servers) work on the assumption that it's safe to tear down your service once all requests have completed. So you have no guarantee that your logging will be done, or that your data will be written to disk. Particularly with the disk writes, it's entirely possible that your writes will be corrupted.
That said, if you are absolutely sure that you want to implement this extremely dangerous design, you can use the BackgroundTaskManager from my blog.
Update: I've written a blog series that goes into detail on a proper solution for request-extrinsic code. In summary, what you really want to do is move the request-extrinsic code out of ASP.NET. Introduce a durable queue and an independent processor; the ASP.NET controller action will place a request onto the queue, and the independent processor will read requests and execute them. This "processor" can be an Azure Function/WebJob, Win32 Service, etc.
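The queue-plus-independent-processor split described above can be sketched roughly as follows. This is only an illustration in Python rather than C#, and it uses a SQLite file as a stand-in for a real durable queue. The table name `jobs` and the helpers `open_queue`, `enqueue` and `process_pending` are hypothetical names chosen for the example, not part of ASP.NET or any queueing product.

```python
import sqlite3


def open_queue(path):
    """Open (or create) the SQLite file that stands in for a durable queue."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def enqueue(conn, payload):
    """What the web request does: persist the job, then return immediately."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def process_pending(conn, handler):
    """What the independent processor does: run every job not yet marked done."""
    rows = conn.execute(
        "SELECT id, payload FROM jobs WHERE done = 0 ORDER BY id"
    ).fetchall()
    for job_id, payload in rows:
        handler(payload)
        # Mark the job done only after the handler finished without raising.
        with conn:
            conn.execute("UPDATE jobs SET done = 1 WHERE id = ?", (job_id,))
    return len(rows)
```

Because the job is committed to disk before the request returns, tearing down the web process does not lose it. The processor can pick it up later from its own process, with its own lifetime.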
Stephen described why starting essentially long running fire-and-forget tasks inside an ApiController is a bad idea.
Perhaps you should create a separate service to execute those fire-and-forget tasks. That service could be a different ApiController, a worker behind a queue, anything that can be hosted on its own and have an independent lifetime.
This would make management of the different task lifetimes much easier and separate the concerns of the long-running tasks from the ApiController's core responsibilities.
As pointed out by others, it is not recommended. However, whenever there is a need there is a way, so take a look at IRegisteredObject
See also
http://haacked.com/archive/2011/10/16/the-dangers-of-implementing-recurring-background-tasks-in-asp-net.aspx/
Though the question is several years old, the best possible solution now is to use SignalR in this case.
https://github.com/Myrmex/signalr-notify-progress
I am going to describe the problem that I have to solve, and I need some suggestions on whether I am on the right path.
The problem is:
I need to create a Windows Service application that receives requests (over socket communication) and performs some action. The action is to execute a script (maybe in Lua or Perl). This script models the business rules of the client: it queries databases, makes requests to websites, and then sends a response to the client.
There are 3 mandatory requirements:
The service will receive a lot of requests at the same time, so I am thinking of using a worker-thread model.
The service must have high throughput. I will have many requests in the same second.
Low latency: I must respond to these requests very quickly.
Every request will generate log entries. I can't write these log entries to the physical disk while the scripts execute because of the long I/O time. I will probably keep a queue in memory and have other threads consume this queue and write to disk.
In the future, it is possible that two worker threads will have to exchange messages.
I have to design a protocol for this service. I was thinking of using Thrift, but I don't know the overhead involved. Maybe I will make my own protocol.
To write the Windows service, I was thinking of Erlang. Is it a good idea?
Does anyone have suggestions/hints to solve this problem? Which language is best for writing this service?
Yes, Erlang is a good choice if you know it or are ready to learn it. With Erlang you don't need any worker threads: just implement your server in Erlang style and you'll get a multithreaded solution automatically.
Not sure how to convert an Erlang program to a Windows service, but it's probably doable.
Writing to the same log file from many threads is suboptimal because it requires locking. It's better to have a log-entries queue (lock-free?) and a separate thread (Erlang process?) that writes them to the file. BTW, are you sure that executing an external script in another language is much faster than writing a log record to the file?
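As a rough illustration of the log-entries queue with a single writer thread, here is a minimal Python sketch. It uses the standard library's `queue.Queue`, which is thread-safe but uses locks internally, so it is not lock-free. The function names, the `_SENTINEL` marker and the log message format are all made up for the example.

```python
import queue
import threading

_SENTINEL = None  # hypothetical marker that tells the writer thread to stop


def log_writer(log_queue, path):
    """Only this thread touches the file, so the file needs no extra locking."""
    with open(path, "a", encoding="utf-8") as log_file:
        while True:
            entry = log_queue.get()  # blocks until a worker enqueues something
            if entry is _SENTINEL:
                break
            log_file.write(entry + "\n")


def run_demo(path, workers=4, entries_per_worker=3):
    """Start one writer and several workers that only enqueue their log entries."""
    log_queue = queue.Queue()
    writer = threading.Thread(target=log_writer, args=(log_queue, path))
    writer.start()

    def handle_requests(worker_id):
        for n in range(entries_per_worker):
            # Enqueueing is cheap; the disk write happens on the writer thread.
            log_queue.put(f"worker {worker_id} handled request {n}")

    threads = [
        threading.Thread(target=handle_requests, args=(i,)) for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    log_queue.put(_SENTINEL)  # all workers are done, let the writer finish
    writer.join()
```

The workers only pay for a `put` call. Entries still sitting in memory are lost if the process dies, which is the usual trade-off of this design.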
It's doubtful that you'll get much better performance from your own serialization library than Thrift provides for free. Another option is Google Protocol Buffers; somebody claimed that it's faster.
Theoretically (!) it's possible that an Erlang solution won't give you the performance you require. In that case consider a compiled language, e.g. C++, and asynchronous networking, e.g. Boost.Asio. But be prepared for it to be much more complicated than the Erlang way.
Question:
Can I use the multiprocessing module together with gevent on Windows in an efficient way?
Scenario:
I have a gevent based Python application doing asynchronous I/O on Windows. The application is mostly I/O bound, but there are spikes of higher CPU load as well. This application would need to control a console application via its stdin and stdout. I cannot modify this console application and the user will be able to use his own custom one, only the text (line) based communication protocol is fixed.
I have a working implementation using subprocess and threads, but I would rather move the whole subprocess based communication code together with those threads into a separate process to turn the main application back to single-threaded. I plan to use the multiprocessing module for this.
Prior reading:
I have been searching the Web a lot and read some source code, so I know that the multiprocessing module is using a Pipe implementation based on named pipes on Windows. A pair of multiprocessing.Queue objects would be used to communicate with the second Python process. These queues are based on that Pipe implementation, i.e. the IPC would be done via named pipes.
The key question is, whether calling the incoming Queue's get method would block gevent's main loop or not. There's a timeout for that method, so I could make it into a loop with a small timeout, but that's not a good solution, since it would still block gevent for small time periods hurting its low I/O latency.
I'm also open to suggestions on how to circumvent the whole problem of using pipes on Windows, which is known to be hard and sometimes fragile. I'm not sure whether shared memory based IPC is possible on Windows or not. Maybe I could wrap the console application in a way which would allow communicating with the child process using network sockets, which is known to work well with gevent.
Please don't question my primary use case, if possible. Thanks.
The Queue's get method really is blocking. Using it with a timeout could potentially solve your problem, but it definitely won't be the cleanest solution and, most importantly, it will introduce extra latency for no good reason. Even if it weren't blocking, it would not be a good solution either, because non-blocking by itself is not enough: a good asynchronous call/API should integrate smoothly into the I/O framework in use, be that gevent for Python, libevent for C or Boost.Asio for C++.
The easiest solution would be to use simple I/O by spawning your console application and attaching to its stdin and stdout descriptors. There are two major advantages to consider:
It will be extremely easy for your clients to write client applications. They will not have to work with any kind of IPC, socket or other code, which could be a very hard thing for many. With this approach, the application will just read from stdin and write to stdout.
It will be extremely easy to test console applications using this approach as you can manually start them, enter text into console and see results.
Gevent is a perfect fit for async read/write here.
However, the downside is that you will have to start this application, there will be no support for concurrent communication with it, and there will be no support for communication over network. There is even a good example for starters.
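The stdin/stdout approach described above can be sketched with the standard library alone. This is a blocking illustration, not gevent code: `CHILD_SOURCE` is a hypothetical stand-in for the user's console application (it simply upper-cases each line), and the helper names are made up. In a gevent application you would presumably swap in the cooperative `gevent.subprocess` module, so that the reads yield to the event loop instead of blocking it.

```python
import subprocess
import sys

# Hypothetical stand-in for the user's console application:
# it answers every input line with the same line in upper case.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def start_console_app():
    """Spawn the child with its stdin and stdout attached to pipes."""
    return subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,   # exchange str lines instead of bytes
        bufsize=1,   # line-buffered pipes on the parent side
    )


def request(proc, line):
    """Send one line of the text protocol and wait for the one-line reply."""
    proc.stdin.write(line + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


def stop_console_app(proc):
    """Closing stdin ends the child's read loop; return its exit code."""
    proc.stdin.close()
    exit_code = proc.wait(timeout=5)
    proc.stdout.close()
    return exit_code
```

For example, `request(proc, "hello")` returns `"HELLO"` with this stand-in child. The sketch assumes a strict one-line-in, one-line-out protocol. A real console application that emits unsolicited or multi-line output would need a reader loop instead.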
To keep it simple but more flexible, you can use TCP/IP sockets. If both client and server are running on the same machine, a good operating system will use IPC as the underlying implementation, so it will be fast. And if you are worried about performance in this case, you probably should not use Python at all and should look at other technologies.
An even fancier solution: use ZeroC ICE. It is a very modern technology allowing almost seamless inter-process communication. It is a CORBA killer, very easy to use. It is heavily used by many, proven to be fastest in its class and rock stable. The beauty of this solution is that you can seamlessly integrate programs in many different languages, like Python, Java, C++ etc. But this will require some of your time to get familiar with the concept. If you decide to go this way, just spend a day reading through the documentation.
Hope it helps. Good luck!
Your question is already quite old. Nevertheless, I would like to recommend http://gehrcke.de/gipc which -- I believe -- would tackle the outlined challenge in a very straight-forward fashion. Basically, it allows you to integrate multiprocessing-based child processes anywhere in your application (also on Windows). Interaction with Process objects (such as calling join()) is gevent-cooperative. Via its pipe management, it allows for cooperatively blocking inter-process communication. However, on Windows, IPC currently is much less efficient than on POSIX-compliant systems (since non-blocking I/O is imitated through a thread pool). Depending on the IPC messaging volume of your application, this might or might not be of significance.
This seems, to me, a slightly more specific question than those already asked, so: How reliable is the Windows Event Log service if I'm looking for a 'fire and forget' logging service, so that even an error in calling the service does not impact the caller, and is noted somewhere, somehow, by the OS?
On the Windows side, the event log is fine. Being used for so long by so many applications, it is definitely stable. I'm sure you can find creative ways to crash the API by feeding it bad enough input, but that's probably true of every API. When used properly it will work.
That said, you usually don't use the event log in a "fire and forget" context. Keep in mind it is a system-global log, which is supposed to be read by an administrator. If it is fed too many events, it becomes quite useless from the administrator's point of view. If you do use it sparingly and only for significant events, you can take your time and make sure your input is valid and no exceptions propagate back to your main logic.
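The point about not letting exceptions propagate back to the main logic can be illustrated with a small wrapper. This is a Python sketch, not Windows Event Log code: `write_event` is a hypothetical callable standing in for whatever actually writes the event, and `log_event_safely` is a name made up for the example.

```python
import sys


def log_event_safely(write_event, message, fallback=sys.stderr):
    """Call write_event(message); never let a logging failure reach the caller.

    Returns True if the event was written, False if logging failed.
    """
    try:
        write_event(message)
        return True
    except Exception as exc:
        # Note the failure somewhere else, but keep the caller running.
        try:
            fallback.write(
                f"event logging failed: {exc!r}; message was: {message}\n"
            )
        except Exception:
            pass  # even the fallback failed; there is nothing more to do
        return False
```

The caller can ignore the return value entirely, which is what makes this "fire and forget". The fallback stream is just one possible place to record that the primary log was unavailable.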
Lastly, if you're looking for a real "fire and forget" logging infrastructure, take a look at ETW, which is a high performance event tracing infrastructure that is built into Windows.
In my experience, I have never had a programmatic issue with the Event Service. I did have an issue once, but it was a 4201(?) 'Access Denied' error caused by the Platforms "gurus" at my shop. But never anything regarding any usage or API calls.
The Windows event log has worked well for us in practice. The only problems that we had regarding stability were in the days of NT4 and are long gone in practice. Just make sure that you don't flood it with the same event repeatedly or it becomes a pain to actually look at ;)