Monitor specific key patterns in memcache server? - debugging

We have a namespacing convention to distinguish our memcache entries. I would like to monitor the gets and sets that happen in a certain namespace to track down a bug.
I could monitor all memcache operations, but I fear that would produce a huge amount of data: the cache holds a significant subset of the DB data, so the logs would run into GBs. I need to filter for just the namespace I am interested in.
I have a client-side solution, which is to decorate (or override) memcache.get and memcache.set to print the arguments whenever the key matches our desired pattern.
However, I feel it is better to do this on the server side; there are also too many clients for me to collect this information from every node. Is there something we could do on the server side to get the same effect? Anything in a memcached debug module that would help?

Unfortunately there's no way to do this at the moment. I see on GitHub that they're working on such a feature, but as far as I know it does not exist yet.
What I use in this case is:
tcpdump -i lo -s 65535 -A -ttt port 11211 | cut -c 9- | grep -i '^\(get\|set\)'
As an alternative, you can use a proxy tool such as mcrouter to get that kind of output: https://github.com/facebook/mcrouter/wiki/Mcpiper
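If you only care about one namespace, you can push that filter into the same pipeline. A rough sketch, assuming keys are prefixed with a hypothetical 'myns:' namespace (adjust the interface and the prefix to your setup):
tcpdump -i lo -s 65535 -A port 11211 | grep -i 'get myns:\|set myns:'
This keeps the capture running but only logs the gets and sets that touch your namespace, so the output stays small.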

Related

"tail -F" equivalent in lftp

I'm currently looking for a way to simulate tail -F in lftp.
The goal is to monitor a log file the same way I could over a proper ssh connection.
The closest command I have found so far is repeat cat logfile.
It works, but it is not ideal once the file gets big, because it re-displays the entire file every time.
The lftp program specifically will not support this, but if the server supports the extension, it is possible to pull only the last $x bytes of a file with, e.g., curl --range (see this serverfault answer). Combined with some logic to grab only as many bytes as have been added since the last poll, this could let you do it relatively efficiently. I doubt there are any off-the-shelf FTP clients with this functionality, but someone else may know better.
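A rough sketch of that poll loop with curl, assuming the server answers SIZE (which curl uses for -I) and REST (which it uses for --range); the host, credentials and path below are placeholders:
URL=ftp://example.com/logs/app.log
OFFSET=0
while sleep 5; do
  SIZE=$(curl -sI --user user:pass "$URL" | awk '/Content-Length/ {print $2}' | tr -d '\r')
  if [ -n "$SIZE" ] && [ "$SIZE" -gt "$OFFSET" ]; then
    curl -s --user user:pass --range "$OFFSET-" "$URL"   # fetch only the new bytes
    OFFSET=$SIZE
  fi
done
Each pass asks the server for the file size and, if it has grown, fetches only the bytes added since the last pass, much like tail -F.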

How to view an extremely large file on a server: Freebase

I'm trying to look at the Freebase data dump, which is stored on a server that I access through ssh. The trouble is I don't know how to view it in a way that doesn't take forever, freeze things, or crash. I had been trying to view it with nano, and it evokes precisely the behaviour just described.
The operating system is Darwin.
How can I examine this data?
Basically, you can use the more or less commands to scroll through the file. If you know which lines you are interested in, say lines 3000 to 3999, you can print them with sed -n '3000,3999p' your_file_name.
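For example, assuming a standard BSD userland on the Darwin box (the file name is a placeholder):
less huge_dump.tsv                        # pages through the file without loading it all
head -n 100 huge_dump.tsv                 # first 100 lines
tail -n 100 huge_dump.tsv                 # last 100 lines
sed -n '3000,3999p;4000q' huge_dump.tsv   # lines 3000-3999, then quit early
The 4000q tells sed to stop reading once it has printed the range, which matters on a file this size.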

How to use RRDTool/Cacti to count "user activities" in apache access logs?

Goal
I wish to use RRDTool to count logical "user activity" from our web application's apache/tomcat access logs.
Specifically we want to count, for a period, occurrences of several url patterns.
Example
We have two applications (call them 'foo' and 'bar')
These urls interest us; they indicate when users 'did interesting stuff'.
/foo/hop
/foo/skip
/foo/jump
/bar/crawl
/bar/walk
/bar/run
Basically we want to know, for a given interval (10 minutes, hour, day, etc.), how many users hopped, skipped, jumped, crawled, walked, etc.
Reference/Starting point
This article on importing access logs into RRDTool seemed like a helpful starting point.
http://neidetcher.com/programming/2014/05/13/just-enough-rrdtool.html
To clarify, however: this example uses the access log directly, whereas we want to sort a handful of urls into 'buckets' and count the 'number in each bucket'.
Some Scripting Required..
I could do this with bash, grep & wc, iterating through the patterns and sending output to an 'intermediate results' text file.
That said, I believe RRDTool could do this with minimal 'outside coding', but I am unclear on the details.
Some points
I mention 'two applications' because we actually serve them from separate servers with different log file formats. I'd like to get them into the same RRA file.
Eventually I'd like to report this in cacti; initially however, I wanted to understand RRDTool details
Open to doing any coding, but I would like to keep it as efficient as possible, both administratively and in computer resources. (By administratively, I mean: easy to monitor new instances.)
I am very new to RRDTool and am RTM'ing (and walking through the tutorial). I'm used to relational databases, spreadsheets, etc., and don't yet have my head around all the nuances of the RRA format.
Thanks in advance!
You could set up a separate RRD file with an ABSOLUTE-type datasource for each address you want to track.
Then you tail the log file, and whenever you see one of the interesting urls rush by, you call:
rrdtool update url-xyz.rrd N:1
The ABSOLUTE data source type is like a counter, but it gets reset every time it is read. Your counter will just count to one, but that should not be a problem.
In the example above I am using N: rather than the timestamp from the access log. You could use the log's timestamp instead if you are not doing this in real time, but beware that you cannot update the same rrd file twice with the same timestamp. N: uses millisecond-resolution timestamps internally and thus will probably avoid this problem.
On the other hand, it may make more sense to accumulate matching log entries with the same timestamp and only update rrdtool with that count once the timestamp in the logfile changes.
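A minimal sketch of that setup for one of the urls, with hypothetical file names and untuned datasource/RRA parameters (60-second step, one day of 1-minute averages):
# one RRD per url bucket; the ABSOLUTE datasource resets on every read
rrdtool create foo-hop.rrd --step 60 \
  DS:hits:ABSOLUTE:120:0:U \
  RRA:AVERAGE:0.5:1:1440

# follow the access log and bump the counter on every matching request
tail -F /var/log/httpd/access_log | grep --line-buffered '/foo/hop' | \
while read -r line; do
  rrdtool update foo-hop.rrd N:1
done
Note that graphing the AVERAGE of an ABSOLUTE datasource gives you a rate in hits per second; scale by the step in your graph definition if you want hits per interval.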

Network Usage of Process Using Powershell

I want to calculate the bytes sent and received by a particular process, and I want to use PowerShell for that.
This is something I can do using Resource Monitor -> Network Activity.
How can I do that using Get-Counter?
There is a really good Scripting Guy article on using the Get-Counter cmdlet here:
Scripting Guy - Get-Counter
The trick will be finding the counter that you need, and I don't think you can get the granularity you're after, as these are the same counters that PerfMon uses. They are more focused on the whole Network Interface than on the individual processes using the interface. That said, if your process is the only thing using the given interface, it should do the trick nicely.
Have a look at the Network Interface options available for a start:
(get-counter -list "Network Interface").paths
You can't, it seems. I'm absolutely unable to find the counters the Performance Monitor is reading from, though others may chime in. There may also be some way other than Get-Counter, but that is what you specifically asked about.
Looking through the counters, the closest thing you will find is the "IO Read Bytes/sec" and "IO Write Bytes/sec" counters on the process object.
The problem with those is that they count more than just network activity. The description in perfmon says:
"This counter counts all I/O activity generated by the process to
include file, network and device I/Os."
That being said, if you know that the process you want to monitor only or mainly writes to the network connection, this may be better than not measuring anything at all.
You'd go about it like this (I'll use Chrome as an example since it is conveniently running and using data right now):
get-counter "\Process(chrome*)\IO Read Bytes/sec"
This will just give you a one-time reading. If you want to keep reading, you can add the -Continuous switch.
The PerformanceCounterSampleSet object that is returned is not exactly pretty to work with, but you can find the actual reading in $obj.countersamples.cookedvalue.
The list will be fairly long (if you browse like me). Chrome runs as many separate processes, so we'll do a bit of math to add them all up and present the result in KB.
Final result:
get-counter "\Process(chrome*)\IO Read Bytes/sec" -Continuous | foreach {
[math]::round((($_.countersamples.cookedvalue | measure -sum).sum / 1KB), 2)
}
Running this will just continuously output a reading of how many KB/s Chrome is using.

Process Management w/ bash/terminal

Quick bash/terminal question -
I work a lot on the command line, but have never really had a good way to manage running processes from it. I am aware of ps, but it always gives me an exceedingly long and esoteric list of junk, including some 30 Google Chrome workers, and I always end up going back to Activity Monitor to get a clean look at what's actually going on.
Can anyone offer a bit of advice on how to manage running processes from the command line? Is there a way to get a clean list of what you've got running? I often use killall on process names that I know as a quick way to get rid of something that's freezing up; can I get those names to display in the terminal rather than the strange long names and numbers that ps shows by default? And can I search for a specific process, or a quick regex of one, like '*ome'?
If anyone has the answers to these three questions, that would be amazingly helpful to many people, I'm sure : )
Thanks!!
Yes, grep is good.
I don't know exactly what you want to achieve, but do you know the top command? It gives you a dynamic view of what's going on.
On Linux you have plenty of commands that should help you get what you want in a script, and piping commands together is one of the basics taught when studying IT.
You can also have a look at the man page for jobs, and I would advise you to read some articles on process-management basics. :)
Good luck.
ps -o command
will give you a list of just the process names (more exactly, the commands that invoked the processes). Use grep to search, like this:
ps -o command | grep ".*ome"
There may be scripts out there, but, for example, if you're seeing a lot of chrome processes you're not interested in, something as simple as the following would help:
ps aux | grep -v chrome
Other variations can show each image only once, so you get one chrome, one vim, etc. (google 'show unique rows' with perl, python or sed, for example).
You can also tell ps to show a single username, so you filter out system processes, or handle the case where more than one user is logged in to the machine.
ps is quite versatile with its command-line arguments; a little digging turns up a lot of nice tweaks and flags, in combination with other tools such as perl and sed.
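To pull those threads together, a few illustrative one-liners (the process names are just examples):
ps -axo comm | sort -u           # every distinct command name, once each
ps aux | grep -v chrome          # everything except chrome workers
pgrep -l ome                     # PIDs and names of processes matching a pattern
ps -U "$USER" -o pid,comm        # only your own processes, by PID and name
pgrep in particular answers the 'search for a specific process' question directly, and pkill is its killall-flavoured sibling.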
