I am running an API on a Linux server. For most of the day the API runs completely fine, but every day at 1:00 am and 1:00 pm I receive a large spike of failures that then ends after about 5 minutes. Looking into the failures, there is no single consistent pattern in the requests, and most requests still process fine (just a higher proportion fail). It is also unlikely to be traffic related, as 1:00 am is a very slow traffic window.
I am not the only person using this server, though, so I suspect that someone else is running a process every 12 hours at this time that is eating up a lot of resources.
My question is: is there a bash command I can run on my server to see whether any processes are consistently running during this window?
Use the top command in batch mode to capture system statistics and save the output to a text file:
top -b -n 1 > top.txt
To grab more than one iteration of the top command, change -n 1 to your desired number, for example -n 100:
top -b -n 100 > top.txt
Put the command in a script and execute the script via crontab or as an at job at 1:00 am, then check your text file.
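For example, a minimal wrapper script plus a crontab entry might look like this (the script path, the output location, and the 3-second/100-iteration window are assumptions to adapt):

#!/bin/bash
# capture-top.sh: record roughly 5 minutes of snapshots (100 iterations, 3 seconds apart)
# into a timestamped file so each window can be reviewed later.
top -b -d 3 -n 100 > /tmp/top-$(date +%Y%m%d-%H%M).txt

# crontab entry: start a minute before the spike, at 00:59 and 12:59
59 0,12 * * * /path/to/capture-top.sh

Comparing the captured snapshots against a quiet period should make any 12-hourly process stand out.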
I have a bash script that makes a few hundred cURL requests for data. While it is important for each cURL to be successful, it is arguably more important for the script (which runs multiple times an hour) not to be unexpectedly delayed because of an external server. So, for each of these lines, is there something I can use to set a maximum processing time, to ensure one line doesn't delay the entire script?
Some of the cURL results are used in calculations, otherwise I would just put & at the end. I want a way to require that a given line completes in less than X seconds, or else it kills the cURL and moves on to the next line.
(And I can put in IF statements if values are empty.)
You can use the --max-time <seconds> argument to make sure that the curl command doesn't take more time than desired. From the curl man page:
Maximum time in seconds that you allow the whole operation to take.
This is useful for preventing your batch jobs from hanging for hours
due to slow networks or links going down.
Adding this argument to each curl command with a reasonable timeout for your problem should ensure that the whole script doesn't take too long.
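A hedged sketch of one such line (the URL and the 10-second limit are placeholders to adapt):

# fail the request if the whole transfer takes longer than 10 seconds
result=$(curl --silent --max-time 10 "https://example.com/api/data")
if [ -z "$result" ]; then
    # curl timed out or returned nothing; skip this value rather than stall the script
    echo "request timed out or returned nothing, skipping" >&2
fi

On timeout curl exits with a non-zero status (28), so you can also branch on $? instead of testing for an empty result.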
I have a bash script that runs every five minutes to get an updated file from an FTP server. I know when they generate the file: every 5 minutes, starting at minute 0 of each hour. It usually takes them 30 seconds to generate the file, so I have mine offset by 1 minute (running every 5 minutes, starting at minute 1 of each hour). However, there are times when their server bogs down and takes longer (sometimes minutes) to generate their file, while it only takes about 13 seconds for me to download and process it. When this happens I end up getting basically the first portion of the file and not the rest.
Is there a way to verify that what I just downloaded matches what is on the FTP server? I was thinking there might be a way to check that the file size of what I downloaded matches what is on the FTP server. Is that possible? My other thought was that, if that is possible, depending on how quickly it can compare those two sizes, a delay may need to be built in to ensure the file on the FTP server has time to have more data written to it if it is still in progress. Thoughts/suggestions? Thanks in advance.
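One way to make that size comparison (a sketch only; the server, path, and file name are placeholders, and it assumes curl and GNU stat are available) is to ask the FTP server for the remote size and compare it with the local copy:

# ask the FTP server for the remote size via a HEAD-style request
remote_size=$(curl -sI "ftp://ftp.example.com/path/data.csv" | awk '/Content-Length/ {print $2}' | tr -d '\r')
local_size=$(stat -c %s data.csv)

if [ "$remote_size" != "$local_size" ]; then
    # sizes differ: the remote file may still be being written; wait and re-download
    sleep 30
fi

Re-checking the remote size twice, a few seconds apart, is another way to detect a file that is still growing.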
My problem:
Each night, my crontab launches several nightly tests on a supercomputer running PBS under CentOS 6.5. When launched, the jobs wait in the queue. When the scheduler allows them to run, my jobs start. It is quite common for the scheduler to launch all the jobs at exactly the same time (even though my crontab launched them at separate moments).
I can't modify the main part of the job (but I can add things before it). Each job starts with an update of a common SVN repository. When the jobs start simultaneously, I get an error due to concurrent updates of the same repository. I want to avoid that.
What I expect:
When launched by the scheduler, each job could wait some seconds before starting. One option would be to wait a random time before starting, but the risk of two jobs picking the same random delay grows quickly with the number of tests I run in parallel. If I reduce this risk by choosing from a large random range, I have to wait too long (locking unused resources on the supercomputer).
I suppose it's possible to store the information "I will launch now, the others have to wait one minute" for each job in a thread-safe manner, but I don't know how. What I imagine is a kind of mutex, but one that only introduces a delay rather than blocking until the other job finishes.
A solution without MPI is preferred.
Of course, I'm open to other solutions. Any help is welcome.
Call your script from a wrapper that first attempts to obtain an exclusive lock on a lock file. For example:
{
    flock -x 200    # -x takes an exclusive lock (-s would only take a shared one)
    # your script/code here
} 200> /var/lock/myscript
The name of the lock file doesn't really matter, as long as you have write permission to open it. When this wrapper runs, it will first attempt to get an exclusive lock on /var/lock/myscript. If another script already has the lock, it will block until the lock becomes available.
Note that there are no arbitrary wait times; each script will run as soon as possible, in the order in which they first attempt to obtain the lock. This means you can also start the jobs simultaneously; the operating system will manage the access to the lock and the ordering.
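Applied to the SVN case, a minimal sketch (the lock path and repository path are placeholders) would serialize only the update and let the rest of each job run in parallel:

(
    flock -x 200                       # block until we hold the exclusive lock
    svn update /path/to/common/repo    # the critical section shared by all jobs
) 200> /var/lock/svn-update.lock

# ... the rest of the job runs here, concurrently with the other jobs ...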
Here is a solution using GNU parallel.
It might seem a bit counter-intuitive at first to use this tool, but if you set the maximum number of jobs to run at a time to 1, it can simulate a job queue that runs multiple jobs in sequence without any overlaps.
You can observe the desired effect of this command with the following example:
seq 1 5 | parallel -j1 -k 'echo {}; sleep 1'
-j1 sets max jobs running at a time to 1 while -k preserves the order.
To apply this to your original problem, create a file that lists the script files, one per line. Then pipe the contents of that file to parallel to run the scripts in sequence and in order:
cat file | parallel -j1 -k bash {}
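For instance, with a hypothetical list file (the script names are placeholders):

printf '%s\n' job_a.sh job_b.sh job_c.sh > jobs.txt   # one script per line
cat jobs.txt | parallel -j1 -k bash {}                # run them one at a time, in order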
We have about 10 different Python scripts that download data from the web, read data from a database and write data back to that database. They do so repeatedly every 10 seconds (or 10 seconds after the last task has completed).
The question is, what is the best approach at running these tasks? I can think of a few ways:
a while True loop that runs the task and then sleeps for the interval. It could be guarded by a watchdog like supervisord, making sure it is always up.
having the script execute the task just once, and invoking the script externally every 10 seconds from another process.
having the script execute the task for, say, 1 hour (every 10 seconds for an hour), and having a watchdog make sure the task runs again once the hour is over.
I would like to avoid long-running processes that actually do something, because I don't want to deal with memory problems etc. over long periods of time.
Additional Information
The scripts are different because they each retrieve data from a different source, and query, calculate and insert different data into the database.
The tasks are performed every 10 seconds since the data being retrieved is in real time, and we need to not only keep updating it very frequently, but also keep all the historical data in the database.
There are a lot of resources being used by the scripts - MySQL connections, HTTP connections, Redis connections, etc. We have encountered issues with using the long-running approach before, specifically with MySQL connections (things like MySQL server has gone away, even though all connections had been closed). Hence the inclination toward having the scripts run in shorter periods of time.
What are some common approaches at this?
Unless your scripts somehow leak memory (quite unlikely), they should all be the same. So, for sheer simplicity (your time programming/debugging is much more expensive than a few milliseconds of the machine's time, even every 10 seconds!), I'd go for a single script that checks every 10 seconds.
OTOH, checking every 10 seconds sounds like busywork. Can't you set things up so that whatever you are monitoring tells you when there are changes? Or batch the records up so you can retrieve, say, a day's worth at a time?
If you are running on Linux, cron has a granularity of one minute. We have processes that run constantly. Rather than watch them, the script opens a lock (acting as a semaphore) that gets released when the program finishes, normally or not. That way, if it runs long and cron calls it again, the new copy exits when it can't get the lock. This lets you call it as often as you need without it stepping on a possibly still-running copy.
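A minimal shell sketch of that non-blocking lock pattern (the lock path is an assumption; the same idea is available from Python via fcntl.flock):

exec 200> /var/lock/fetch-task.lock   # open the lock file on descriptor 200
if ! flock -n 200; then
    exit 0    # a previous copy is still running; exit instead of stepping on it
fi
# ... run the task; the lock is released automatically when the process exits ...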
I have to set up a cron job on my hosting provider.
This cron job needs to run every second. It's not intensive, just doing a check.
The hosting provider, however, only allows cron jobs to run every two minutes. (I can't change hosting, btw.)
So I'm clueless about how to go about this.
My thoughts so far:
If it can only run every two minutes, I need to make it run every second for two minutes. 1) How do I make my script run for two minutes executing a function every second?
But it's important that there are no interruptions. 2) I have to ensure that it runs smoothly and that it remains constantly active.
Maybe I can also try making it run forever, and run the cron job every two minutes checking whether it is running? 3) Is this possible?
My friend mentioned using multithreading to ensure it's running every second. 4) any comments on this?
Thanks for any advice. I'm using ZF.
Approach #3 is the standard solution. For instance, you can have the cron job touch a file every time it runs. Then, on startup, check whether that file has been touched recently; if it has, exit immediately, otherwise start running. (Other approaches include using file locking, or writing the pid to a file and, on startup, checking whether that pid exists and belongs to the expected program.)
As for the one-second interval, I would suggest calling usleep at the end of your check, supplying the number of microseconds from now until you next want to run. If you do a plain sleep, you'll actually run slightly less often than once a second, because sleeps sometimes last longer than expected and your check itself takes time. As long as your check takes under a second to run, this should work fine.
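One way to realize this heartbeat idea, sketched in shell (the heartbeat path and the 120-second threshold are assumptions; here the running loop refreshes the heartbeat so later cron-started copies can detect it):

#!/bin/bash
# Started by cron every two minutes; exits at once if another copy is alive.
HEARTBEAT=/tmp/every-second.heartbeat

if [ -f "$HEARTBEAT" ] && [ $(( $(date +%s) - $(stat -c %Y "$HEARTBEAT") )) -lt 120 ]; then
    exit 0    # another copy touched the heartbeat recently; let it keep running
fi

while true; do
    touch "$HEARTBEAT"
    # ... run the one-second check here ...
    sleep 1
done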
I don't think cron allows second level resolution. http://unixhelp.ed.ac.uk/CGI/man-cgi?crontab+5
field          allowed values
-----          --------------
minute         0-59
hour           0-23
day of month   1-31
month          1-12 (or names, see below)
day of week    0-7 (0 or 7 is Sun, or use names)
So, even if your hosting provider allowed it, cron itself can't run a process that repeats every second. However, you can use a command such as watch for repeated execution of your script.
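For example (check.sh is a hypothetical script name; note that watch needs a terminal, so this suits interactive use rather than cron):

watch -n 1 ./check.sh    # re-run the script every second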