What is the best practice to run long-running calculations with OpenCPU? - opencpu

Hi, I want to use the power and flexibility of OpenCPU to start long-running calculations (several minutes or so). I am facing issues where OpenCPU terminates processing of the given scripts. I have modified the /etc/opencpu/server.conf option timelimit.post to 600 (seconds), but it does not seem to take effect. Is there any other config file that must be modified in order to increase the timeouts?
Or, more generally: what is the best practice for running long-running calculations with OpenCPU?
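For reference, a sketch of the relevant part of /etc/opencpu/server.conf. On the OpenCPU cloud server this file is typically parsed as JSON; the key names below are taken from the OpenCPU server manual and the values are examples, not recommendations:

```json
{
  "timelimit.get": 60,
  "timelimit.post": 600
}
```

The opencpu service usually has to be restarted for changes to take effect, and note that any reverse proxy in front of the server (Apache or nginx) may enforce its own, shorter request timeout that would produce the same symptom.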

Related

Why is real time much higher than "user" and "system" CPU time combined?

We have a batch process that executes every day. This week, a job that usually does not take more than 18 minutes of execution time (real time, as you can see) is taking more than 45 minutes to finish.
The FULLSTIMER option is already active, but we don't know why only the real time increased.
In old documentation there are FULLSTIMER stats that could help identify the problem, but they do not appear in the batch log. (These stats are the ones shown below: Page Faults, Context Switches, Block Operations and so on, as you can see.)
It might be an I/O issue. Does anyone know how we can identify whether it really is an I/O problem, or whether it could be some other issue (the network, for example)?
To be more specific, this is one of the queries whose run time has increased dramatically. As you can see, it reads from a database (SQL Server, VAULT schema) and from WORK, and writes to the WORK directory.
The number of observations is almost the same:
We asked the customer about any change in network traffic, and they said it is still the same.
Thanks in advance.
For a process to complete, much more needs to be done than the actual calculations on the CPU.
Your data has to be read and your results have to be written.
You might have to wait for other processes to finish first, and if your process includes multiple steps, writing to and reading from disk each time, you will have to wait for the CPU each time too.
In our environment, when real time is much larger than CPU time, we usually see heavy traffic to our Network File System (NFS).
As a programmer, you might notice that storing intermediate results in WORK is more efficient than storing them in remote libraries.
You can also save a lot of time by creating intermediate results as views instead of tables, IF you only use them once. That is possible not only in SQL, but also in data steps like this:
data MY_RESULT / view=MY_RESULT;
set MY_DATA;
where transaction_date between '1jan2022'd and '30jun2022'd;
run;

How to compare file upload and download speed at varying times in my application?

I would like to compare data read/write speed (i.e. the file upload and download speed) of my application between various servers (say machineA, machineB and machineC) at varying times.
I have tried to automate the download with the help of curl, as suggested here.
The network speed varies from time to time, and I cannot run the tests on the machines in parallel. In such a case, what would be the best way to make a valid data read/write speed comparison with respect to network speed?
Are there any open-source tools to do these speed tests?
Any suggestions would be greatly appreciated!
Keep in mind that the times will never be identical. The best you can do is set the parameters equally for each test and run some number X of tests; you can then compare the average time over each series of tests. That is a good way to do it.
Another factor, I guess, is your software itself. You don't say what the application is, but you can write code to measure the times yourself: start a timer right before the download and stop it right after, without any post-processing in between. Store the data (machine ID, download time, upload time, etc.) and then compare.
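The measure-and-average idea can be sketched like this (Python used for illustration; the URL and run count are placeholders you would point at a fixed test file on each server):

```python
import statistics
import time
import urllib.request

def timed_download(url, runs=5):
    """Download `url` several times and report transfer stats.

    Times only the transfer itself (no post-processing), as suggested
    above. Returns (average_seconds, average_megabytes_per_second).
    """
    durations = []
    size = 0
    for _ in range(runs):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            size = len(resp.read())  # read the whole body inside the timer
        durations.append(time.perf_counter() - start)
    avg = statistics.mean(durations)
    return avg, (size / avg) / 1e6
```

Running the same function against the same file on machineA, machineB and machineC, and storing each (machine, average) pair, gives you comparable numbers despite run-to-run network jitter.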
Hope it helps!

How to speed up Nagios to monitor hosts over the cloud

While using Nagios with multiple hosts spread over the network, host status shows a noticeable lag and takes a long time to be reflected in the Nagios server CGI. What is the optimal NRPE/Nagios configuration to speed up the status process for a distributed host environment?
In my case I use Nagios Core 4.1,
NRPE 1.5,
server/clients: Amazon EC2.
The GUI is usually only updated once a minute (automatically), though clicking refresh can give you 'nearly' the latest information. I say nearly because there is a distinct processing loop inside the Nagios core that means it is never real time. NRPE runs at the speed of your network connection - it does little else besides sending and receiving tiny amounts of data. About the only delay here is the time it takes to actually perform the check and send back the response - which, of course, has far too many factors to mention. Try looking at the output of
[nagioshome]/bin/nagiostats
There are several entries that tell you:
'Latency' - the time between when the check was scheduled to start, and the actual start time.
'Execution Time' - the amount of time checks are actually taking to run.
These entries each show three numbers: Min / Max / Avg.
High latency numbers (in my book, an Avg greater than 1 second) usually mean your Nagios server is overworked. There are a few things you can do to improve latency times, and these are outlined in the 'nagios.cfg' file. This latency has nothing to do with network speed or the speed of NRPE - it is primarily hardware speed. If you're already using the optimal values specified in nagios.cfg, then it's time to find faster hardware.
High execution times (for me, an Avg greater than 5 seconds) can be blamed on just about everything except your Nagios system: faulty networks (improper packet routing), overloaded networks, faulty and/or poorly designed checks, slow target systems... the list is endless. Nothing you do with the Nagios and/or NRPE configs will lower these values. Well, you could disable NRPE's encryption to improve wire time; but if you have encryption enabled in the first place, it's not likely you'd want it disabled.
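If you want to track those Min / Max / Avg triples over time rather than eyeball them, a small parser is enough. A sketch in Python - the line layout assumed here ("label:  min / max / avg") matches typical nagiostats output, but exact labels vary between Nagios versions:

```python
import re

def parse_min_max_avg(nagiostats_output):
    """Extract Min/Max/Avg triples from `nagiostats` text output.

    Assumes lines shaped like
        Active Service Latency:    0.001 / 2.066 / 0.132 sec
    Returns a dict mapping each label to a (min, max, avg) float tuple.
    """
    pattern = re.compile(r"^\s*(.+?):\s+([\d.]+)\s*/\s*([\d.]+)\s*/\s*([\d.]+)")
    stats = {}
    for line in nagiostats_output.splitlines():
        m = pattern.match(line)
        if m:
            label, lo, hi, avg = m.groups()
            stats[label] = (float(lo), float(hi), float(avg))
    return stats
```

Feeding this the captured output of [nagioshome]/bin/nagiostats on a cron schedule lets you alert on the 1-second latency and 5-second execution-time thresholds mentioned above.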

Break down and speed up Rake task

Our application has a feature that requires a rake task to run every day at a specific time over all the users. This involves computing some of their attributes, running through DB queries and sending a push notification to every user. As such, the task has been designed to run in O(n), but that still means the total time grows with the user base. We want the task to finish in no more than a minute - it already takes 8 minutes at 14,000 users, and it also spikes the CPU utilization (throughout the rest of the day the average CPU utilization sits around 10%, but it goes up to 50% when the task runs). I want to solve two problems here: make the task run in less time, and bring down the CPU utilization spike while the task runs.
Tech specs: a Sinatra application serving an API for the app, running on Phusion Passenger (nginx module), using MongoDB, deployed on a c3.large EC2 instance.
P.S. - I don't have much knowledge about how parallel processing and threading are done in Ruby, or whether they can solve this issue, but could bucketing the total users and computing those buckets in parallel be an answer? If so, how do I go about doing something like that? I want to avoid buying a bigger server just for this purpose, since the rest of the time it handles the requests quite easily, as I pointed out above.
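The bucketing idea is sound. The app is Ruby (where the Parallel gem or a background job framework plays this role), but the shape of the approach can be sketched language-agnostically - here in Python, with a hypothetical per-user function standing in for the attribute computation, DB queries and push notification:

```python
from concurrent.futures import ProcessPoolExecutor

def process_user(user_id):
    """Placeholder for the per-user work: recompute attributes,
    run the DB queries, send the push notification."""
    return user_id * 2  # stand-in result for demonstration

def process_in_buckets(user_ids, workers=4):
    """Split the users across `workers` worker processes.

    The total work is still O(n), but wall-clock time drops roughly by
    the worker count, provided the database and the push gateway can
    absorb the concurrent load.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_user, user_ids, chunksize=64))
```

Note that parallelizing on the same c3.large will make the CPU spike worse, not better; if the second goal matters more, the usual alternative is to push the per-user work onto a queue processed at a throttled rate, or onto a separate worker instance.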

How to determine a good cache time to live for live or semi-live data

I write a lot of web applications that poll data from a server. Often these are updated live, or at least semi-live, but generating the data often takes some time, and it should be cached to reduce server strain. However, I have trouble finding good guides on how best to set an appropriate time to live, etc. Does anyone have good suggestions or rules of thumb?
Use the longest duration you could afford your data to be stale as your TTL. If you can afford ten seconds, use a ten-second TTL. If you can afford one second, use a one-second TTL.
You can also look at the problem from the other side: have a single asynchronous server process continuously run the data generation query as often as possible and update the cache as fast as possible. This approach solves the cache stampede problem elegantly, and you get an effective, optimal TTL of "however long it takes to generate the data".
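That second approach can be sketched as a small wrapper: one daemon thread regenerates the value in a loop while readers only ever see the latest finished result. A minimal sketch, with hypothetical names, not production code:

```python
import threading

class BackgroundRefreshCache:
    """Keep one value perpetually fresh instead of expiring it.

    A single worker thread regenerates the value in a loop, so readers
    never trigger a recompute (no cache stampede) and staleness is
    bounded by one generation cycle plus the optional pause.
    """
    def __init__(self, generate, pause=1.0):
        self._generate = generate    # the expensive data-producing call
        self._pause = pause          # extra delay between refreshes
        self._value = generate()     # prime the cache synchronously
        self._lock = threading.Lock()
        self._stop = threading.Event()
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while not self._stop.wait(self._pause):
            fresh = self._generate()
            with self._lock:
                self._value = fresh

    def get(self):
        with self._lock:
            return self._value

    def stop(self):
        self._stop.set()
```

Readers call get() and always receive instantly; the effective staleness is the generation time plus `pause`, which maps directly onto the "longest duration you can afford your data to be stale" rule above.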
