How to Monitor Resource Utilization? - windows

Is there a tool that logs system resource utilization (CPU, memory, I/O and network) over a period of time and generates a graph?
I need to monitor a system and identify the periods in which resources are heavily utilized.
If any of you has experience with this kind of tool, kindly suggest one.
Thanks in advance.

Besides third-party tools, there is Windows Performance Monitor, which can help. It shows real-time graphs and can save the performance information to files that you can open and analyze later.
It provides multiple metrics for CPU, memory, I/O and network utilization, and shows an instance for each processor on the machine. It can also be used to monitor remote machines.
You can also create data collector sets to group all monitored counters in a single component.
Performance Monitoring Getting Started Guide
Create a Data Collector Set to Monitor Performance Counters
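Once a counter log has been saved, finding the busy periods can be scripted. The following is a minimal sketch, assuming the log was saved in comma-separated format with the timestamp in the first column and one column per counter; the host name, counter values and the helper name busiest_samples are made up for illustration.

```python
import csv
import io


def busiest_samples(csv_text, counter_substring, top_n=3):
    """Return the top_n (timestamp, value) pairs, highest value first, for the
    first counter column whose header contains counter_substring."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Column 0 holds the timestamp; search the remaining columns for the counter.
    col = next(i for i, name in enumerate(header)
               if i > 0 and counter_substring.lower() in name.lower())
    samples = []
    for row in reader:
        if len(row) <= col:
            continue
        try:
            samples.append((row[0], float(row[col])))
        except ValueError:
            # A blank cell means the sample could not be collected; skip it.
            continue
    return sorted(samples, key=lambda s: s[1], reverse=True)[:top_n]


# Made-up log excerpt in the assumed layout (raw string keeps the backslashes).
SAMPLE_LOG = r'''"(PDH-CSV 4.0) (UTC)(0)","\\HOST1\Processor(_Total)\% Processor Time","\\HOST1\Memory\Available MBytes"
"01/15/2024 10:00:00.000","12.5","2048"
"01/15/2024 10:00:15.000","97.1","1900"
"01/15/2024 10:00:30.000"," ","1850"
"01/15/2024 10:00:45.000","64.0","1875"
'''

if __name__ == "__main__":
    for timestamp, value in busiest_samples(SAMPLE_LOG, "% Processor Time", top_n=2):
        print(timestamp, value)
```

For a real log, read the file contents and pass them in place of SAMPLE_LOG.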

I think this tool will help you
System-Resources-Monitoring
System Monitoring

Related

Spark program to monitor the executors performance

I am working on a Spark program that monitors each executor's performance, for example recording when an executor starts working and when it finishes its job. I am thinking of two ways to do that:
First, have each executor write the current time to a file when it starts work, and write the time again to the same file when it finishes. In the end, the "log" files will be spread across the whole cluster, on every machine except the driver.
Second, since executors report to the driver periodically, let the driver record everything whenever a message from an executor contains "start" or "finish" information.
Is that possible?
There are many ways to monitor executor performance as well as application performance.
The best way is to use the Spark Web UI together with open-source monitoring tools such as Ganglia.
You need to monitor whether your cluster is underutilized and how many resources your application uses.
Monitoring can be done with various tools, e.g. Ganglia. From Ganglia you can find CPU, memory and network usage. Based on the observed CPU and memory usage, you can get a better idea of what kind of tuning your application needs.
Hope this helps!
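The second idea from the question (let the driver collect start and finish times) can be approximated without touching Spark internals. This is only a sketch under assumptions: it times each partition (task) rather than a whole executor, the helper names timed_partition and double_all are hypothetical, and the PySpark call appears only in a comment because it needs a running cluster.

```python
import socket
import time


def timed_partition(work):
    """Wrap a per-partition function so that it yields one timing record
    (host, start, end, row count) instead of the processed rows."""
    def wrapper(rows):
        start = time.time()
        count = sum(1 for _ in work(rows))  # consume the iterator to force the work
        end = time.time()
        yield {"host": socket.gethostname(), "start": start,
               "end": end, "rows": count}
    return wrapper


def double_all(rows):
    # Hypothetical stand-in for the real per-partition job.
    for r in rows:
        yield r * 2


if __name__ == "__main__":
    # Without Spark: call the wrapper directly on one "partition".
    print(list(timed_partition(double_all)(iter(range(5)))))
    # With PySpark (assumed available), the driver would gather all records:
    #   timings = rdd.mapPartitions(timed_partition(double_all)).collect()
```

Because the records come back to the driver through collect(), no log files are left scattered across the worker machines. The wrapper discards the job's results, so it suits a measurement run rather than production use.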

How to profile a set of processes in FreeBSD?

I am trying to debug the performance of a service. The service internally spawns instances of the same binary. To improve throughput, I am planning to increase the number of instances of the binary. Beyond a certain number of processes, throughput stops increasing. Now I am trying to work out why this is happening.
I need some help on where to start and which tools are available for process-level profiling. I am using the FreeBSD platform.
If using more processes doesn't improve output, then your service isn't CPU bound. It might be constrained by e.g. disk or network throughput instead.
Start with systat. Especially systat -vmstat. See man systat.
This will show you several aspects (like memory usage, interrupts, processor usage and disk activity) of how busy your system is.
If your program does a lot of network activity, using systat -tcp might give insight as well.
If your service is an HTTP server, you might want to look at Varnish.
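Before reaching for system tools, it can help to pin down where throughput stops scaling, so that systat can be watched at exactly that instance count. Below is a rough sketch, assuming you have measured throughput at several instance counts yourself; the numbers, the 10% threshold and the function name scaling_knee are all made up for illustration.

```python
def scaling_knee(measurements, min_gain=0.10):
    """measurements: list of (instances, throughput), sorted by instances.
    Return the instance count after which the next step up improves
    throughput by less than min_gain (relative), or None if it never does."""
    for (n_prev, t_prev), (_n_next, t_next) in zip(measurements, measurements[1:]):
        if t_prev > 0 and (t_next - t_prev) / t_prev < min_gain:
            return n_prev
    return None


# Hypothetical measurements: (number of instances, requests per second)
MEASURED = [(1, 100), (2, 195), (4, 370), (8, 400), (16, 405)]

if __name__ == "__main__":
    print("Throughput stops scaling after", scaling_knee(MEASURED), "instances")
```

Running the service just below and just above that count while watching systat -vmstat should show which resource (disk, network, interrupts) changes character at the plateau.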

Monitoring application (CPU and cache usage) on a single Linux box with 80 cores

I am looking for a performance monitoring tool for my application that will collect and visualize, in real time, the CPU and cache usage on a single Linux box like an IBM System or HP ProLiant with a typical configuration of 8 processors / 80 cores.
The application is home-grown multithreaded C++ code that uses OpenMP.
This monitoring tool should not run 24 hours per day; it should not do e-mail notification.
I will run this tool just before sending commands to my apps; the apps will then execute the command (it may take a few minutes at most). During this time interval I need to analyze:
- usage of cores
- data movement between processors
- usage of L1, L2, L3 caches
- some other metrics (help me here) that can help find bottlenecks in application performance and resource utilization
I guess that tools like Nagios / Zabbix are too heavy for this task.
On the other hand, using command-line tools like "top" and "sar" for 80 cores is not very convenient, and plotting (not necessarily real-time) would be nice to have...
While getting per-core usage is rather easy, the other values might prove impractical, at least without running the application within a profiler of some sort.
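As an illustration of the per-core part, here is a minimal sketch that derives utilization from two snapshots of /proc/stat. It assumes a reasonably recent Linux kernel that reports at least the first eight fields (user, nice, system, idle, iowait, irq, softirq, steal); the snapshot contents and function names are made up.

```python
def parse_proc_stat(text):
    """Return {core_name: (busy_ticks, total_ticks)} for each per-core line."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 ..."; skip the aggregate "cpu" line
        # and unrelated lines such as "intr" or "ctxt".
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user nice system idle iowait irq softirq steal
        ticks = [int(v) for v in fields[1:9]]
        idle = ticks[3] + ticks[4]  # count iowait as idle time
        total = sum(ticks)
        cores[fields[0]] = (total - idle, total)
    return cores


def core_utilization(before, after):
    """Percent busy per core between two parse_proc_stat() results."""
    usage = {}
    for name, (busy1, total1) in after.items():
        busy0, total0 = before[name]
        elapsed = total1 - total0
        usage[name] = 100.0 * (busy1 - busy0) / elapsed if elapsed else 0.0
    return usage


# Made-up snapshots standing in for two reads of /proc/stat.
BEFORE = "cpu  300 0 100 1600 0 0 0 0 0 0\ncpu0 100 0 50 850 0 0 0 0 0 0\ncpu1 200 0 50 750 0 0 0 0 0 0\n"
AFTER = "cpu  440 0 110 1650 0 0 0 0 0 0\ncpu0 150 0 50 900 0 0 0 0 0 0\ncpu1 290 0 60 750 0 0 0 0 0 0\n"

if __name__ == "__main__":
    usage = core_utilization(parse_proc_stat(BEFORE), parse_proc_stat(AFTER))
    for core, pct in sorted(usage.items()):
        print(f"{core}: {pct:.1f}%")
    # On a live system, read open("/proc/stat").read() twice, a second or so
    # apart, and pass the two strings through the same functions.
```

Writing the per-core percentages to a file at each interval gives data that any plotting tool can chart after the run, which avoids scanning 80 rows of "top" output by eye.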
Measuring QPI utilization is highly non-trivial, if possible at all. Intel's VTune might be able to acquire such data, but only when running an instrumented version of your binaries.
Also, on x86 there is no way to figure out L1, L2 or L3 usage of any kind. You can grab the low-level CPU counters to measure cache misses, though (but you would probably need to use instrumented/profiled binaries, and always with something like VTune or PAPI).
You could "easily" set up something to pull all the lower-level performance counters into SNMP and grab the SNMP values via standard SNMP-capable monitoring tools, but be aware that SNMP polling is something you don't want to occur more than 1-2/s max. Or pull that info into something like collectd.
I also have the impression that you don't understand the problem domain of monitoring tools. They are not meant to be used as low-level analysis probes for finding application-level or system bottlenecks; at best you could get some hints about which resource (from a 10,000-foot view) is running at full utilization. Monitoring and alerting tools are something operations staff use to understand which parts of their IT system are currently used and how, to gather historical data and predict future resource utilization, and to be alerted when something breaks.
SiteScope, Hyperic or any combination of shell scripts, native OS utilities and a DB to store the results may do the job.

Diagnosing pathological behavior of a piece of cluster software

I'm using a kind of load balancer over a small cluster that is able to achieve >2000rps on zero-duration requests (i.e. ones that are immediately satisfied by the worker nodes).
But as soon as the requests stop being zero-duration and start taking even 1ms, performance immediately drops >10x. The data being transferred in both directions is identical and is about 2kb in size.
This is for sure not related to saturation of the cluster or network throughput, because 200rps of 1ms requests is a very tiny load and the network is 10Gbit. Besides, the CPU load is just some 2-5% both on the load balancer and on the worker nodes.
I wonder whether that might be related to some pathological behavior of the OS scheduler or the OS network stack (i.e. there is some special-case behavior for very short interactions).
How might I diagnose the reason? Which perfcounters to watch? What tools or methodologies to use?
(Just in case someone simply knows the answer to my particular problem, I'm talking about the MS HPC Server 2008 R2's "WCF Broker", running on Windows Server 2008 R2 over Hyper-V)
One thing you can do is use ETW tracing to try and understand what the nodes are doing while your WCF job is running. On HPC server, I sometimes clusrun xperf to collect traces on all or specific nodes. There are a number of tools that you can use for analyzing ETW traces, including xperf itself. I haven't done any serious work using HPC SOA (WCF), but I did write a simple WCF raytracer app and then used xperf to profile it on several of the nodes.
Turned out it was a completely network-unrelated issue having to do with peculiarities of the scheduling mechanism of HPC Server. I resolved the issue by tweaking a configuration option "serviceRequestPrefetchCount" to 0 in the loadBalancing section of the WCF service config file.
I'm assuming that there are some shared resources with some kind of locking system in place? Is locking a bottleneck? It's hard to guess without seeing the system.
Do you have a way to profile the workers? What are they spending most of their time on, especially in the fast vs slow scenarios?

CPU Utilization per process in Win32 API

I am doing a project on a centralized LAN management system. I need to know how many CPU cycles each process on a remote PC is consuming (as in Task Manager), so that the network admin can close a few processes in case the CPU utilization of a system on the network goes beyond acceptable levels.
I would like to know if there is a Win32 API for this requirement, and if so, I would appreciate information about it.
Thank you in advance.
Win32 API has lots of functions to find all kinds of information about currently running processes and threads, here's a link to the full list of them: http://msdn.microsoft.com/en-us/library/ms683223(VS.85).aspx
Explore the list and you should be able to find the function(s) there that meet your requirements, for example GetProcessTimes() returns structures that contain the amounts of time the process has executed in kernel mode, in user mode, etc.
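GetProcessTimes() reports cumulative kernel and user times, so a Task Manager style percentage has to be derived from two samples taken some interval apart. The arithmetic can be sketched as follows, assuming the FILETIME values have already been read and converted to integers; the function names are hypothetical and dividing by the number of logical processors is an assumption about how you want the figure normalized.

```python
FILETIME_TICKS_PER_SECOND = 10_000_000  # a FILETIME counts 100-nanosecond intervals


def filetime_to_int(high, low):
    """Combine the dwHighDateTime and dwLowDateTime halves of a FILETIME."""
    return (high << 32) | low


def cpu_percent(kernel0, user0, kernel1, user1, wall_seconds, logical_cpus=1):
    """CPU utilization of one process between two GetProcessTimes samples.

    kernel*/user* are the kernel-mode and user-mode times as integers;
    wall_seconds is the real time elapsed between the two samples."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    cpu_seconds = ((kernel1 - kernel0) + (user1 - user0)) / FILETIME_TICKS_PER_SECOND
    return 100.0 * cpu_seconds / (wall_seconds * logical_cpus)


if __name__ == "__main__":
    # Made-up samples: 0.5 s of kernel time and 1.5 s of user time were
    # consumed during a 2-second window on a machine with 4 logical processors.
    print(cpu_percent(0, 0, 5_000_000, 15_000_000, wall_seconds=2.0, logical_cpus=4))
```

Without the division by logical_cpus, a process that keeps several cores busy would report more than 100%.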
You need to look at the performance monitor system. You can get the stats from there (in the Process counter).
Here's a (Delphi) explanation of it that's pretty good and simple to understand.
When you understand how it all works, you then need the Performance Counters API to read the data counters.

Resources