Bash Cluster Monitor - Help for assignment

I am a student at university and I am stuck with some code for my assignment, and I would appreciate help from the community with this. I am very new to bash, and I would say I do not really know how to write bash at all; I struggle with it. I much prefer Python or C# lol. Anyway, I am required to create a cluster monitor. I have to create a simple menu with 2 options that lets the user choose between "Cluster Status" and "Process Analytics". I have created the menu, it's very basic, and the code for that is below:
#!/bin/bash
clear
echo "Choose one of the following options: "
echo "1) Cluster Status"
echo "2) Process Analytics"
echo "3) Exit"
read ans
if [ "$ans" == "1" ]
then
    echo "Loading..."
    bash Cluster_Status.sh
elif [ "$ans" == "2" ]
then
    echo "Loading..."
    bash Process_Analytics.sh
elif [ "$ans" == "3" ]
then
    echo "Exiting..."
    exit 0
fi
In this menu, I also need to make it so that when a chosen option has finished executing, the script pauses and then returns to the main menu.
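The closest I have got is wrapping the whole menu in a while loop and using a second read as the pause, roughly like this (I am not sure this is the right way to do it):
while true
do
    clear
    echo "Choose one of the following options: "
    echo "1) Cluster Status"
    echo "2) Process Analytics"
    echo "3) Exit"
    read ans
    if [ "$ans" == "1" ]
    then
        bash Cluster_Status.sh
    elif [ "$ans" == "2" ]
    then
        bash Process_Analytics.sh
    elif [ "$ans" == "3" ]
    then
        exit 0
    fi
    # pause until Enter is pressed, then the loop redraws the menu
    echo "Press Enter to return to the menu..."
    read pause
done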
I have got 5 nodes for the cluster, with sample data in them, which can be provided if anyone wants to help with this.
The part I am most stuck on is creating the cluster status and process analytics scripts.
The cluster status script needs to print to the screen, and also to a file, the current stats of the entire cluster, with all of its parameters set in a nodeconfig.read.me file, which I have got. Each stat needs to be presented either as a sum or as an average. For example, the total number of CPUs would be a sum, and the overall CPU load would be an average. I need to present this in a table, which I cannot figure out how to do, as I am limited in the commands I can use.
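To give an idea of what I have tried so far: if each node file had a line like cpus:4 and another like cpuload:65 (I am making this format and the file names up, my real files are different), this is roughly how I imagined summing and averaging with just grep, cut and shell arithmetic, and printing a table row with echo and tee:
total_cpus=0
total_load=0
nodes=0
for file in node1.txt node2.txt node3.txt node4.txt node5.txt   # made-up file names
do
    # take the number after the colon on the matching line, e.g. "cpus:4" -> 4
    cpus=$(grep "cpus:" "$file" | cut -d ":" -f 2)
    load=$(grep "cpuload:" "$file" | cut -d ":" -f 2)
    total_cpus=$(( total_cpus + cpus ))
    total_load=$(( total_load + load ))
    nodes=$(( nodes + 1 ))
done
avg_load=$(( total_load / nodes ))   # whole-number average only
# tab-separated "table", shown on screen and saved to a file with tee
echo -e "Stat\tValue"               | tee cluster_status.txt
echo -e "Total CPUs\t$total_cpus"   | tee -a cluster_status.txt
echo -e "Avg CPU load\t$avg_load%"  | tee -a cluster_status.txt
I do not know if the for loop and the $(( )) arithmetic count against the command limit, since they are part of the shell itself rather than separate programs.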
The process analytics section asks me to look at the processes currently running on each node, if there are any, and I need to present the following stats on screen and also have them saved to a file (I have put a rough attempt at one of them below the list):
• Most popular process (in terms of instances running)
• Most CPU demanding process (in terms of CPU usage)
• Most MEM demanding process (in terms of Memory usage)
• Most Disk demanding process (in terms of Disk usage)
• Most Net demanding process (in terms of Network usage)
• 5 top users of CPU/MEM/DISK & Net (separate tables)
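For the most CPU demanding process, this is the sort of thing I have been trying, assuming each node has a process file with space-separated lines like "name cpu mem disk net" (again, I am making that format and the file names up):
top_name=""
top_cpu=0
for file in node1_processes.txt node2_processes.txt   # made-up file names
do
    # field 1 = process name, field 2 = CPU usage (whole numbers in my sample data)
    while read line
    do
        name=$(echo "$line" | cut -d " " -f 1)
        cpu=$(echo "$line" | cut -d " " -f 2)
        if [ "$cpu" -gt "$top_cpu" ]
        then
            top_cpu=$cpu
            top_name=$name
        fi
    done < "$file"
done
echo -e "Most CPU demanding process:\t$top_name ($top_cpu%)" | tee process_analytics.txt
I have no idea how to do the "most popular process" count or the top-5 user tables without sort and uniq, which are not on my list of allowed commands.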
I have got to do this over PuTTY, so I am limited to the following commands: cat, grep, ls, pipes, echo, tr, tee, cut, touch, head and tail. These are the only things that I can use when writing this, and because of that limitation, I am stuck trying to find out how to write it. I have researched how to do this for the last week, and I still do not know how to complete it. I am not asking for anyone to complete the assignment for me; I just want help on how to use these commands to create this script. Thank you in advance for your help, and I am sorry if this is posted in the wrong area. I am still fairly new to using Stack Overflow.

Related

Simple bash tool to execute command on vocal input

I am trying to record mouse positions and events for later playback (linux - Ubuntu 20.04).
Right now this happens at regular intervals - every 2 seconds I record where the mouse is with xdotools and take a screenshot. In a separate question elsewhere I am looking for better tools to do this.
However in a separate train of thought I am also looking for something that would execute a command when I - for example - loudly say "NOW".
I have looked for tools and while there's definitely interesting stuff out there, this is most likely a one-off, and the installation and training process offsets the benefits.
So, anything out there that will do something for example when a sound above a threshold is captured by the microphone?

How to monitor and control background processes in shell script

I need to write a shell (bash) script that will be executing several Hive queries.
Each of the queries will produce a directory with a lot of files.
After all queries are finished I need to process all these files in a specific order.
I want to run Hive queries in parallel as background processes as each one might take couple of hours.
I would also like to parallelize the resulting file processing, but there are some complications that I don't know how to handle. I.e. I can start processing the results of the first and second queries as soon as they are finished, but for the third, I need to hold off until the first two processors are done. Similarly for the fourth and fifth.
I won't have any problems writing such a program in Java, but how to do it in shell - beats me.
If someone can give me a hint on how can I monitor execution of these components in the shell script, I would appreciate it greatly.
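Something like the sketch below is the shape of what I am after, if bash can even express it; I am guessing at the structure, and the .hql files and process_dir*.sh scripts are placeholders for my real queries and processing steps:
# each ( ... ) & is one background chain: run the query, then process its
# output directory as soon as that particular query finishes
( hive -f query1.hql; ./process_dir1.sh ) & chain1=$!
( hive -f query2.hql; ./process_dir2.sh ) & chain2=$!

# the third query can also run in parallel, but its processing has to wait
hive -f query3.hql & query3=$!

# hold here until processors 1 and 2 and query 3 have all finished
wait "$chain1" "$chain2" "$query3"
./process_dir3.sh

wait   # finally, wait for anything still running in the background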

How to debug potential CPU/RAM errors in Bash script on Linux

I have a relatively simple bash script that reads from a set of static input files, stores the input in bash variables and then does a bunch of processing over said input by calling out to external scripts (e.g. written in Python, Go, other bash scripts etc.) and using the intermediate results.
Lately I have been experiencing an intermittent problem where a single character seems to be getting altered somewhere during the processing which then causes subsequent errors. Specifically, a lot of the processing I'm doing involves slicing up a list of comma-separated records, and one of the values on each line is a unix timestamp, e.g. 1354245000.
What seems to be happening is that occasionally one of these values will get altered slightly, so I end up with a timestamp like 13542458=2 or 13542458>2 or 13542458;2 coming out of one of the intermediate scripts. This then subsequently gets fed into another script, which throws an exception when it tries to parse the value to an integer.
In the title of this question, I've suggested that this might be a potential CPU/RAM error. I know the general folly in thinking errors are caused by low level things like hardware/compilers etcetera, but the nature of this particular error makes me think it may be possible, for the following reasons:
The input files are the same on each invocation of the script, and the script only fails on some invocations.
I cannot think of any sources of randomness in the source code prior to where the script is breaking. It's basically just slicing and dicing csv input.
I cannot think of any sources of concurrency in the source code -- even the Go scripts aren't actually written to run anything concurrently.
This problem has only arisen in the last week or so. Prior to this time, this error would never occur.
While I haven't documented every erroneous character, they seem to often be quite close in the ASCII table to numeric values (=, >, ; etc). That said, I guess the Hamming distance between two characters quite far apart can be small also with changes to a high order bit.
The script often breaks at a different stage on different runs. i.e. I have a number of separate Python scripts, and sometimes it'll make it past one script and then the error will be induced in another. Other times it'll be induced on an earlier script.
What I'd like to know is, is there any methodical way to either confirm or rule out a hardware error for this problem? Or if it is a hardware problem, is it possibly undetectable by the operating system?
A bit of further info on the machine:
Linux 64-bit, Ubuntu 12.04
Intel i7 processor
16GB DDR3 RAM
I'm hoping someone can either point me to a reliable way to verify whether the hardware is to blame or otherwise a sound reason as to what else might be the cause.
Try booting into Memtest to check your memory.
While it is highly unlikely that it will be hardware, if you have exhausted your standard software debugging as suggested by #OliCharlesworth, here is an outline of a hardware error investigation:
(1) check your log area for any `MCE` logs (machine check exceptions). If you find any in your log area (syslog), or sometimes in the present working dir or /dir, you have a hardware failure.
(2) check your log area for disk errors. e.g:
smartd[3963]: Device: /dev/sda [SAT], 34 Currently unreadable (pending) sectors
(3) check your drive integrity, e.g. (as root): `smartctl -a /dev/sda`; if there is any abnormality, run:
smartctl -t short /dev/sda (change drive as required)
(4) download/install/boot to [memtest86](http://www.memtest86.com/download.htm)
(run the complete test)
If your CPU/motherboard has thrown no MCEs, you have no disk errors, your drive tests OK with smartctl and you have no memory errors with memtest86, then recheck the software debugging. While additional hardware errors can still be present (bad capacitors, etc.), the likelihood at this point is software. Good luck.
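For step (1), a quick way to scan the logs is something along these lines (log file names vary by distro, so adjust the paths; the mcelog file only exists if the mcelog daemon is installed):
# search the system and kernel logs for machine check exceptions
grep -i "machine check" /var/log/syslog /var/log/kern.log
# if the mcelog daemon is installed, check its dedicated log as well
cat /var/log/mcelog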

How to check Matplotlib's speed in Xcode and increase performance?

I'm running into some considerable speed bottlenecks with a Python-Matplotlib-Xcode combination. I know some immediate responses will probably ask "Why are you doing python stuff in Xcode, just man up and use vim" --> I like the organizing ability and the built-in version control; it makes elements of my work easier to deal with.
Getting Python to run in Xcode in the first place was a bit trickier than I had hoped, but it's possible. Now I have the following scenario:
A master file, 'main.py' does all the import stuff for me and sets up some universal formatting to make all the figures (for eventual inclusion in my PhD thesis) nice and uniform. Afterwards it runs a series of execfile commands to generate whichever graphics I need. Two things I can think of right off the bat:
1) At the very beginning of main.py, after I import all the normal Python stuff you tend to need, I call a system script which checks whether a certain filesystem is mounted. I keep all my climate model data on there since my local hard drive is too small to deal with all of it at once. Python pauses itself and waits for the system to do its thing, but once the filesystem has been found, it keeps going. Usually this only needs to happen once in the morning when I get to work, or if the VPN server kicked me off for whatever reason. (Side question: it'd be cool to know if there's a trick to automate a VPN login to reconnect as soon as it notices it's not connected.)
2) I'm not sure how much Xcode is using on its own. Running the same program from the terminal is (somewhat) faster. I've tried to be memory conscious and turn off stuff I don't need while running the Python/Xcode combination.
Also, Python launches a little window whenever I call plt.show(), and this in itself takes time. I've considered just saving the figures as quick PNG files and opening them with some other viewer, although I guess that would also take time. Given how often these graphics change as I add model runs or think of nicer ways of displaying the data, it'd be nice to not waste something on the order of 15 to 30 minutes (possibly more) out of the entire day twiddling my thumbs and waiting for a window to pop up.
Benchmark it!
import datetime
start = datetime.datetime.now()
# your plotting code
td = datetime.datetime.now() - start
print td.total_seconds() # requires python version >= 2.7
Run it in Xcode and from the command line and see what the difference is.
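From the command line you can also cross-check the end-to-end time with the shell's time builtin (main.py here stands for whatever script you actually run):
# total wall-clock, user and system time for a full run
time python main.py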

Network Usage of Process Using Powershell

I want to calculate the bytes sent and received by a particular process, and I want to use PowerShell for that.
It is something I can do using Resource Monitor -> Network Activity.
How can I do that using Get-Counter?
There is a really good Scripting Guy article on using the Get-Counter cmdlet here:
Scripting Guy - Get-Counter
The trick will be finding the counter that you need, and I don't think you can get the granularity you're after as these are the same counters that PerfMon uses. It's more focused around the whole Network Interface than it is around the individual processes using the interface. With that said, if it's the only thing using the given interface it should do the trick nicely.
Have a look at the Network Interface options available for a start:
(get-counter -list "Network Interface").paths
You can't, it seems. I'm absolutely unable to find the counters the performance monitor is reading from, though other people may chime in. There may be some other way than get-counter too, but that is what you specifically requested.
Looking through the counters, the closest thing you will find is the "IO Read Bytes/sec" and "IO Write Bytes/sec" counters on the process object.
The problem with those is that they count more than just network activity. The description in perfmon says:
"This counter counts all I/O activity generated by the process to
include file, network and device I/Os."
That being said, if you know that the process you want to monitor only or mainly writes to the network connection, this may be better than not measuring anything at all.
You'd go about it like this (I'll use Chrome as an example since it is conveniently running and using data right now):
get-counter "\Process(chrome*)\IO Read Bytes/sec"
This will just give you a one-time reading. If you want to keep reading you can add the -Continuous switch.
The PerformanceCounterSampleSet object that is returned is not exactly pretty to work with, but you can find the actual reading in $obj.countersamples.cookedvalue.
The list will be fairly long (if you browse like me). Chrome is running in many separate processes, so we'll do a bit of math to get them all added up, and presented in KB.
Final result:
get-counter "\Process(chrome*)\IO Read Bytes/sec" -Continuous | foreach {
    [math]::round((($_.countersamples.cookedvalue | measure -sum).sum / 1KB), 2)
}
Running this will just continuously output a reading of how many KB/s Chrome is using.
