How to collect stats with StatsD? - metrics

So, the one thing I do not understand from the Etsy StatsD documentation is how to get useful metrics sent from StatsD.
I understand that you can send metrics by:
echo "service.test.random:1|c" | nc -u -w0 127.0.0.1 8125
But, do I wrap a bunch of metrics I want to use in a bash cron job? Do I use the files located in /install/location/statsd/lib/ as basic metrics to get the process going? If so, do they just run? Do I add a flag in the config to pick up these files?
I am a bit confused on getting StatsD working for me.

You usually send stats using a statsd client from a running application. For example, increment a counter every time a user buys a product, set timers for the different views. This way metrics are sent in "real time" from the application, and you can later inspect them (and their rollups) in Graphite.
Check the list of available clients.

Related

How to download 300k log lines from my application?

I am running a job on my Heroku app that generates about 300k lines of log within 5 minutes. I need to extract all of them into a file. How can I do this?
The Heroku UI only shows logs in real time, since the moment it was opened, and only keeps 10k lines.
I attached a LogDNA Add-on as a drain, but their export also only allows 10k lines export. To even have the option of export, I need to apply a search filter (I typed 2020 because all the lines start with a date, but still...). I can scroll through all the logs to see them, but as I scroll up the bottom gets truncated, so I can't even copy-paste them myself.
I then attached Sumo Logic as a drain, which is better, because the export limit is 100k. However I still need to filter the logs in 30s to 60s intervals and download separately. Also it exports to CSV file and in reverse order (newest first, not what I want) so I have to still work on the file after its downloaded.
Is there no option to get actual raw log files in full?
Is there no option to get actual raw log files in full?
There are no actual raw log files.
Heroku's architecture requires that logging be distributed. By default, its Logplex service aggregates log output from all services into a single stream and makes it available via heroku logs. However,
Logplex is designed for collating and routing log messages, not for storage. It retains the most recent 1,500 lines of your consolidated logs, which expire after 1 week.
For longer persistence you need something else. In addition to commercial logging services like those you mentioned, you have several options:
Log to a database instead of files. Something like Apache Cassandra might be a good fit.
Send your logs to a logging server via Syslog (my preference):
Syslog drains allow you to forward your Heroku logs to an external Syslog server for long-term archiving.
Send your logs to a custom logging process via HTTPS.
Log drains also support messaging via HTTPS. This makes it easy to write your own log-processing logic and run it on a web service (such as another Heroku app).
Speaking solely from the Sumo Logic point of view, since that’s the only one I’m familiar with here, you could do this with its Search Job API: https://help.sumologic.com/APIs/Search-Job-API/About-the-Search-Job-API
The Search Job API lets you kick off a search, poll it for status, and then when complete, page through the results (up to 1M records, I believe) and do whatever you want with them, such as dumping them into a CSV file.
But this is only available to trial and Enterprise accounts.
I just looked at Heroku’s docs and it does not look like they have a native way to retrieve more than 1500 and you do have to forward those logs via syslog to a separate server / service.
I think your best solution is going to depend, however, on your use-case, such as why specifically you need these logs in a CSV.

Live User monitoring by jmeter

I am looking to monitor live user responses through jmeter. Can the backend listener in jmeter used to record live users(end users)? I am not talking about virtual users that we set up in jmeter. But the real end users.How can this be achieved?
Editing to add more details:
Our requirement is to monitor the real users, in 2-3 geographical locations, all through out the business hours..say from 8 to 5.
For this purpose, do you think, I need to have a dedicated machine with jmeter, grafana and influxdb for monitoring alone? I have other testing going on using jmeter and I don't want to use the same machine to do both monitoring and testing. DO you think this is achievable by jmeter? ANy suggestions?
you can use the following tools in combination to achieve live monitoring:
JMeter backend Listener - to send results to influxDB
influxdb - store the results sent by backend listener
grafana - run continuous queries for metrics and plot graphs like average response times etc.
Follow the steps mentioned here:
http://jmeter.apache.org/usermanual/realtime-results.html - First Option
https://www.linkedin.com/pulse/jmeter-live-performance-monitoring-dashboard-grafana-influxdb-sarker
http://www.testautomationguru.com/jmeter-real-time-results-influxdb-grafana/
http://techblogsearch.com/a/live-performance-result-monitoring-with-jmeter-grafana-influxdb.html
We use to perform general production app (here- Scada-LTS) monitoring by javamelody. But this will give You general statistics. For per user monitoring it seems You should use log4j + ELK or other simpler syslog analyzing tool.
Jmeter should be used rather for test environment for stress tests.

Using Grafana with Jmeter

I am trying to make Grafana display all my metrics (CPU, Memory, etc).
I have already configured Grafana on my server and have configured influxdb and of course I have configured Jmeter listener (Backend Listener) but still I cannot display all grpahas, any idea what should I do in order to make it work ?
It seems like that system metrics (CPU/Memory, etc.) are not in the scope of the JMeter Backend Listener implementation. Actually capturing those KPIs is a part of PerfMon plugin, which currently doesn't seem to support dumping the metrics to InfluxDB/Graphite (at least it doesn't seem to work for me). It might be a good idea to raise such a request at https://groups.google.com/forum/#!forum/jmeter-plugins. Until this gets done, I guess you also have the option of using some alternative metric-collection tools to feed data in InfluxDB/Graphite. Those would depend on the server OS you want to monitor (e.g. Graphite-PowerShell-Functions for Windows or collectd for everything else)
Are you sure that JMeter posts the data to InfluxDB? Did you see the default measurements created in influxDB?
I am able to send the data using backend listener to influxdb. I have given the steps in this site.
http://www.testautomationguru.com/jmeter-real-time-results-influxdb-grafana/

IIS 7 request duration monitoring

i am curious if there is a way of monitoring the request duration time on an iis server. Personally I have came up with a solution but it's really resource intensive and that is why i'm asking the question, just to gather more opinions.
My plan is to extract the duration time of each request and send it to graphite so as to have a real time overview of the performance of the webserver. The idea i've came up with is to use poweshell with its webadministration module. And if you run get-item IIS:\AppPools\DefaultAppPool | Get-WebRequest for example you get all the requests on that app pool with a lot of info including the time info.
The thing is that i should have a script which runs every 100 ms to get all requests and that is kinda wasteful. Is there a way to tell iis to put the request duration time(in miliseconds) in the logs? Because then it would be much easier to get the information I need.
I don't know if there is such a feature on IIS, but I've done the same (sending iis page times to graphite) by using a reverse proxy between internet and the iis server, like nginx.
The proxy module from nginx allow you to log on each request the time the backend took to produce the page.
Also, having a proxy like nginx in fron of an IIS could be very helpful if you have to deal with visits with slow connections, nginx will store the reply from backend, drop backend connection and wait until visitor gets all the content. Highly recommended.
In case you go this route, you should use logster (also from etsy guys) or logstash to parse nginx logs each period of time you want (likely every minute).
Seems that there is a feature that logs requests based on a regex, and it's called Advanced Logging Module. You can specify from a number of fields what you want to get loged and it's W3C compliant. In my case i had time take as a filed which can be specified and that was what i was looking for. After that i written a script in powershell which parses the logs and gets the information i need, constructs a metric and sends it to statsd which in term sends it to powershell.
The method i chose for the log parsing was the following: in the script i used get-content comandlet from powershell to gather all the logs in one file(yes iis breaks the logs in multiple files, and i'm guessing the number of logs is dependent on the number of your working processes but i'm not sure). This was the first iteration in a second iteration i gather all the logs in another file and make a diff between the first file and the latter and only the difference gets processed.
I chose this method because it's i thought it wold be better to have the minimum regex processing. The next step is erasing the first file of accumulated logs and moving the second one in pace of the first that was erased and running the script again, so to have always a method of comparison. Also the log rollover is at one hour, after which the logs are erased.

What's the best way to monitor a large number of Ruby processes?

I have a farm of several physical servers each running a large number of Ruby "workers" (daemon-like processes) and I'd like to be able to monitor the health and progress of these processes from a central location, perhaps with historical graphing like Cacti provides. What's the simplest preferably-open-standard protocol for doing something like that? Please note I'm already using monit to keep the processes up and running and under control; what I'm asking for here is a single point of entry (i.e. dashboard) for checking in on them. Thanks.
If you are already using Monit then M/Monit sounds like a perfect match.
"M/Monit expand upon Monit's capabilities to provide monitoring and management of all Monit enabled hosts from one simple to use web-interface. " - http://mmonit.com/
G'day,
What about having a monitoring process on each server that checks the status of each process and then writes that out to a flat text file, say once every five minutes.
Then another process located on a central server can retrieve at those flat files and trawl through the results and flag any issues.
If you save the individual files and timestamp them, you would also be able to see any trends forming.
Just a quick ideea.
BTW The above system is used to monitor the servers in one of the largest websites in the world. Our scripts are written in Perl with a little bit of shell script but I don't see why you couldn't write your monitoring scripts in Ruby as well.
HTH
cheers,
I'd suggest to take a look at Zabbix.
It's not as simple as monit, of course, but it allows you to run data collecting agent on each of your servers, with all agents feeding the central reporting and storage server with their data. Those agents can use any custom scripts to get the metrics - you can write simple scripts to extract the data you need from your workers, send it back to the central reporting server and display it there on the dashboard.

Resources