I've set up my home server for wake-on-LAN, and my router sends the wake-on-LAN packet when I try to access the server from anywhere.
Now the server should also suspend again roughly 5 minutes after the last of:
last disk activity
last network activity
last considerable CPU load
Is there an existing script / solution for this? Can systemd solve this?
You can try sspender: https://github.com/mountassir/sspender
It lets you suspend your machine based on pre-defined CPU/disk usage, and makes sure the machine wakes up at certain times when you need it to be on.
This is configurable in systemd with the logind.conf file.
cat /etc/systemd/logind.conf
...
IdleAction=suspend
IdleActionSec=5min
Check man logind.conf for more configuration options.
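After editing the file, logind has to re-read its configuration before the change takes effect. A minimal sketch, assuming a systemd-based distro:

sudo systemctl restart systemd-logind

(A reboot works too; note that restarting logind can interrupt active graphical sessions on some setups.)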
Is it possible to create a script that is always running on my VPS server? And what do I need to do to keep it running the whole time? (I don't have a VPS server yet, but if this is possible I want to buy one!)
Yes you can; there are several ways to get your expected result.
Supervisord
Supervisord is a process control system that keeps any process running. It automatically starts or restarts your process whenever necessary.
When to use it: Use it when you need a process that runs continuously, e.g. (see the sketch after these examples):
A queue worker that reads a database continuously waiting for a job to run.
A Node application that acts as a daemon
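For illustration, a minimal sketch of a Supervisord program definition; the program name, command, and log paths are placeholders:

[program:queue-worker]
command=/usr/bin/python3 /opt/app/worker.py
autostart=true
autorestart=true
stderr_logfile=/var/log/queue-worker.err.log
stdout_logfile=/var/log/queue-worker.out.log

Drop a file like this into Supervisord's include directory (e.g. /etc/supervisor/conf.d/, the location varies by distro), then run supervisorctl reread and supervisorctl update to pick it up.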
Cron
Cron allows you to run processes regularly, at fixed time intervals. You can, for example, run a process every minute, every 30 minutes, or at any interval you need.
When to use it: Use it when your process is not long-running: it does a task and exits, and you do not need it restarted automatically as with Supervisord, e.g. (see the crontab sketch after these examples):
A task that collects logs every day and emails them as a gzip archive
A backup routine.
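For illustration, a minimal crontab sketch (edit with crontab -e; the script paths are placeholders):

# m h dom mon dow  command
30 2 * * * /opt/scripts/backup.sh
0 * * * * /opt/scripts/collect-and-mail-logs.sh

The first entry runs the backup every day at 02:30; the second runs the log task at the top of every hour.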
Whichever you choose, there are many tutorials on the internet on how to configure both, so I won't go into those details.
Server Scenario:
Ubuntu 12.04 LTS
Torque w/ Maui Scheduler
Hadoop
I am building a small cluster (10 nodes). The users will have the ability to ssh into any child node (LDAP auth), but this is really unnecessary since all computation jobs they want to run can be submitted on the head node using Torque, Hadoop, or other resource managers tied to a scheduler to ensure priority and proper resource allocation throughout the nodes. Some users will have priority over others.
Problem:
You can't force a user to use a batch system like Torque. If they want to hog all the resources on one node or the head node, they can just run their script / code directly from their terminal / ssh session.
Solution:
My main users or "superusers" want me to set up a remote login timeout, which is what their current cluster uses to eliminate this problem. (I do not have access to that cluster, so I cannot grab the configuration.) I want to set up a 30-minute timeout on all remote sessions that are inactive (no keystrokes); if they are running processes, I also want the session to be killed along with all of its processes. This will stop people from bypassing the available batch system / scheduler.
Question:
How can I implement something like this?
Thanks for all the help!
I've mostly seen sysadmins solve this by not allowing ssh access to the nodes (often done using the PAM module in TORQUE), but there are other techniques. One is to use pbstools. The reaver script can be set up to kill user processes that aren't part of jobs (or shouldn't be on those nodes). I believe it can also be configured to simply notify you. Some admins forcibly kill things, others educate users; that part is up to you.
Once you get people using jobs instead of ssh'ing directly, you may want to look into the cpuset feature in TORQUE as well. It can help you as you try to get users to use the amount of resources they request. Best of luck.
EDIT: noted that the PAM module is one of the most common ways to restrict ssh access to the compute nodes.
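For the idle-timeout half of the question, one common approach (my suggestion here, not something pbstools provides) is the shell's TMOUT variable, set system-wide so interactive bash/ksh sessions are logged out after 30 minutes without input:

# /etc/profile.d/idle-timeout.sh -- log out idle interactive shells
TMOUT=1800
readonly TMOUT
export TMOUT

Note this only ends the idle shell itself; anything the user backgrounded or ran under nohup/screen survives, so you still need something like reaver to clean those up.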
I am wondering if there is a way to monitor these automatically. Right now, in our production/QA/dev environments, we have a bunch of services running that are critical to the application. We also have automatic ETLs running on Windows Task Scheduler at a set time of day. Currently, I have to log into each server and see whether all the services are running fine, check event logs for any errors, check Task Scheduler to see if the ETLs ran well, and so on, all manually. I am wondering if there is a tool out there that will do the monitoring for me and send emails only when something needs attention (ETLs fail to run, a service stops for whatever reason, errors in the event log, etc.). Thanks for the help.
Paessler PRTG Network Monitor can do all that. We have very good experience with it.
http://www.paessler.com/prtg/features
Nagios is one of the best-known tools for monitoring. It checks the server status as well as the services defined on it, and if any service or the system itself goes down, it sends mail to the specified address.
See http://nagios.org/
Thanks for the above information. I looked at the above options, but they have a price, so here is the inexpensive way I addressed my concerns.
For my Windows Task Scheduler jobs that run every night, I installed this tool/service from CodePlex, which is working great.
http://motash.codeplex.com/documentation#CommentsAnchor
For Windows services, I am just setting the "Recovery" tab in each service's properties with actions to take when it fails (like restart, reboot, or run a program, which could send a notification email).
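If you'd rather script that than click through each service's GUI, the built-in sc.exe can set the same recovery actions; a sketch, with MyEtlService as a placeholder service name:

rem Restart the service 60 s after each of the first three failures;
rem reset the failure counter after a clean day (86400 s)
sc failure MyEtlService reset= 86400 actions= restart/60000/restart/60000/restart/60000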
I built a simple tool (https://cronitor.io) for monitoring periodic/scheduled tasks. The name is a play on "cron" from the unix world, but it is system/task agnostic. All you have to do is make an HTTP request to a unique tracking URL whenever your job runs. If your job doesn't check in according to the rules you define, then it will send you an email/SMS message.
It also allows you to track the duration of your jobs by making calls at the beginning and end of your task. This can be really useful for long-running jobs, since you can be alerted if they start taking too long to run. For example, I once had a backup task that was scheduled every hour. About six months after I set it up, it started taking longer than an hour to run!
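For example, a cron entry could ping the tracking URL around the job itself; the URLs below are illustrative placeholders, use the ones generated for your monitor:

# Check in before and after the hourly backup (placeholder URLs)
0 * * * * curl -s https://cronitor.io/p/EXAMPLE/start && /opt/scripts/backup.sh && curl -s https://cronitor.io/p/EXAMPLE/complete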
There is https://eyewitness.io, which is for monitoring server cron tasks, queues, and websites. It makes sure each of your cron jobs runs when it is supposed to, and alerts you if it fails to run.
I have a program that runs on Windows and scans shares on a Netapp filer over CIFS. When the scanner runs on W2K3 and exits, running "cifs sessions -c " on the Netapp filer shows that the session is gone. When the same scanner runs on W2K8 and exits, the same Netapp command shows the session to be still alive. The scanner application has already exited, so there should be no traffic on the session anymore.
What I have done so far:
1) Waited (more than) cifs.idle_timeout seconds to see if the session goes away. It does not.
2) Analyzed packet traces of the communication between W2K3->Netapp and W2K8->Netapp. The W2K3 trace shows TREE_CONNECT and TREE_DISCONNECT, but the W2K8 trace shows only the TREE_CONNECT request. I still need to verify this, but I think the TREE_DISCONNECT on W2K3 is followed by a LOGOFF request (which, I would assume, causes the session to end?).
3) To see how things look when the Netapp is not in the picture: from the W2K8 client machine, I accessed \\C$ via Windows Explorer. This causes a TREE_CONNECT request on the wire. When I closed the Explorer window, the W2K3 server machine closed the session even though the W2K8 client generated no TREE_DISCONNECT on the wire.
So it seems the Netapp is expecting something that W2K8 is not sending.
Question: Is there a specific API I can use in my scanner application to
i) send an explicit LOGOFF SMB request when I am done scanning?
ii) or, re-use an existing session between the W2K8 client and the Netapp filer for a new scan?
iii) Has anyone faced this or a similar scenario?
Thanks in advance.
For the API - look at NetUseEnum and NetUseDel.
The disconnect you see is due to an idle session (a session with no open handles). I'm not sure if this is done by the server (which would indicate a difference between Netapp and Windows) or by the client (your scanner holds an open handle, maybe the current directory, while Explorer closes its handle when you close the window).
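As a minimal C sketch of the first option, you can force-close the connection with NetUseDel once scanning is done; \\filer\share is a placeholder for your Netapp share, and this is untested against a filer:

#include <stdio.h>
#include <windows.h>
#include <lm.h>    /* NetUseEnum / NetUseDel */
#pragma comment(lib, "Netapi32.lib")

int wmain(void)
{
    /* USE_LOTS_OF_FORCE tears the connection down even with open handles;
       USE_NOFORCE is the safer choice if the scan has closed everything. */
    NET_API_STATUS rc = NetUseDel(NULL, L"\\\\filer\\share", USE_LOTS_OF_FORCE);
    if (rc != NERR_Success)
        fwprintf(stderr, L"NetUseDel failed: %lu\n", rc);
    return rc == NERR_Success ? 0 : 1;
}

If you don't know the share name at teardown time, NetUseEnum can list the current connections first.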
I'm not sure what you are trying to accomplish, but it might be easier using NetApp tools/APIs. For a lot of administration tasks the Data ONTAP PowerShell Toolkit can really simplify things:
PS C:\> Connect-NaController fas2040rre1
PS C:\> Get-NaCifsShare
MountPoint ShareName Description
---------- --------- -----------
/etc ETC$ Remote Administration
/vol/vol0/home HOME Default Share
... snip ...
/vol/cifs/backups backups
And if PowerShell isn't your thing, you can take a look at the Manageability SDK for a Web Services API for C/C++, Java, Perl and .Net. In that case you'd want to look at the cifs-share-list APIs (slightly different between 7.x/8.x).
At my workplace, we have lab machines that we use to do our testing.
The standard procedure to reserve a machine for testing was to walk around the office to make sure that no one was using the machine.
This is highly inefficient and time consuming.
At first, I set up a web page where people could reserve the lab machines, but nobody was keeping the page updated, so that turned out to be useless.
I finally found a solution using Microsoft Log Parser and wanted to share it with the Stack Overflow community.
It is a batch file that runs on the machine, so a user can identify the last users of the machine and easily IM them to ask whether the machine is free.
Is there a better solution to do this?
Use the built-in command qwinsta (Query Win Station) to figure out which sessions (including console) are active or inactive (disconnected), and then act on the given information (creds to krusty.ar btw for linking this already).
If you feel people are abusing the machine in question, refer to rwinsta to nuke their sessions into oblivion...
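A quick sketch from a command prompt; the machine name and session ID are placeholders:

rem List sessions (console and RDP) on the lab machine
qwinsta /server:LABPC01

rem Reset (kick) the session with ID 3 on that machine
rwinsta 3 /server:LABPC01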
You will need to install the Microsoft Log Parser
Then create the following 2 files
TSLoginsDetails.sql
SELECT
timegenerated,
EXTRACT_TOKEN(Strings,1,'|') AS Domain,
EXTRACT_TOKEN(Strings,0,'|') AS User,
EXTRACT_TOKEN(Strings,3,'|') AS SessionName,
EXTRACT_TOKEN(Strings,4,'|') AS ClientName,
EXTRACT_TOKEN(Strings,5,'|') AS ClientAddress,
EventID
FROM Security
WHERE EventID=682
ORDER BY timegenerated DESC
TSLogins.bat
@echo off
cls
c:
cd "c:\Program Files\Log Parser 2.2\"
logparser.exe file:TSLoginsDetails.sql -o:DATAGRID
Now, by placing this batch file on the desktop, a user can see who the last people to log in were and contact them by IM to verify whether they are done.
How about also posting the information from the log file to a web page that shows who is currently using the machine? You could:
Check and notify when they log in.
Update the "who is using the machine" page you made earlier.
Run an AT job that checks every couple of hours who is on it.
Totally out of the box:
You can install the Software Testing Automation Framework (STAF) on your servers and desktops to manage your tests. It's written in Java, so you can use it on Windows and Unix/Linux desktops and servers.
Using STAF, you can create a resource pool of the test servers on which you conduct tests, then write STAX jobs (STAX is a STAF execution framework) to conduct the tests. The job can grab the first available server from the resource pool, run the test, monitor the test status, log results, notify the submitter, then release the server back into the pool when done. If you have multiple people submitting jobs for tests, STAF will manage the queue of requests and satisfy them as they come in. Users can either monitor the job from their desktop, or you can set up email alerts to notify them when the test is complete.
I'm not sure if I understand you, but there are a set of command line tools to deal with terminal server sessions, and there's also a Windows API to do the same if you need to do this from a program.
Since it sounds like you're a Microsoft shop, you can set up the machines as resources in Outlook/Exchange and reserve them that way.