Does collectd support arbitrary nesting of metrics?

As per the collectd naming schema, metrics dumped out of collectd plugins need to follow this structure:
host / plugin - plugin_instance / type - type_instance
That works fine for system metrics like CPU, memory, etc., but an application that exposes its health status via a health URL can have arbitrarily nested parameters of the form
{"datacenter": {"region": {"server": {"service": {"parameter": value } } } } }
which, when dispatched by collectd, should be translated into a Graphite schema of
$datacenter.$region.$server.$service.$parameter = $value
But the current collectd naming schema does not allow that. How can one achieve that in collectd?

Others have faced this issue, as mentioned here.
I found a roundabout way of doing this:
Instead of the collectd-python plugin, use the collectd exec plugin.
Change the write_graphite plugin to have EscapeCharacter set to ".".
Restart collectd.
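For reference, a minimal collectd.conf sketch of the first two steps might look like this (the script path, user, and Node name are placeholders, not from the original post):
LoadPlugin exec
LoadPlugin write_graphite

# Run the health-check script as an unprivileged user (path is hypothetical)
<Plugin exec>
  Exec "nobody" "/usr/local/bin/health_metrics.sh"
</Plugin>

# EscapeCharacter "." keeps the dots in plugin_instance intact on the Graphite side
<Plugin write_graphite>
  <Node "graphite">
    Host "localhost"
    Port "2003"
    Protocol "tcp"
    EscapeCharacter "."
  </Node>
</Plugin>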
Now if I write an exec plugin:
#!/bin/bash
HOSTNAME="${COLLECTD_HOSTNAME:-localhost}"
INTERVAL="${COLLECTD_INTERVAL:-10}"

# Stand-in for a real health check: returns a pseudo-random value between 1 and 10
function gen_random() {
    echo "$RANDOM % 10 + 1" | bc
}

while sleep "$INTERVAL"; do
    VALUE=$(gen_random)
    echo "PUTVAL \"$HOSTNAME/region.datacenter.rack.cluster.server.service/gauge-service_parameter\" interval=$INTERVAL N:$VALUE" | tee -a /var/tmp/test.log
done
That will create the following hierarchy in graphite:
region/
region/datacenter
region/datacenter/rack
region/datacenter/rack/cluster
region/datacenter/rack/cluster/server
region/datacenter/rack/cluster/server/service
region/datacenter/rack/cluster/server/service/gauge-service_parameter.wsp
Notice that the prefix "gauge" is important because collectd needs to know the type of value being pushed.
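The same applies to the other entries in collectd's types.db. For instance, a monotonically increasing counter could be dispatched with the derive type (the identifier and $COUNT variable here are hypothetical):
echo "PUTVAL \"$HOSTNAME/region.datacenter.rack.cluster.server.service/derive-requests_total\" interval=$INTERVAL N:$COUNT"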

Related

Taskwarrior: How do I find the tasks that depend on a specific task?

How do I find out which task(s) depend on a specific task without reading the information of all tasks?
Reproduction
System
Version
$ task --version
2.5.1
.taskrc
# Taskwarrior program configuration file.
# Files
data.location=~/.task
alias.cal=calendar
rc.date.iso=Y-M-D
default.command=ready
journal.info=no
rc.regex=on
Here are the tasks that I created for testing purposes:
$ task list
ID Age  Description                       Urg
 1 2min Something to do                     0
 2 1min first do this                       0
 3 1min do this whenever you feel like it   0

3 tasks
Create the dependency from task#1 to task#2:
$ task 1 modify depends:2
Modifying task 1 'something to do'.
Modified 1 task.
$ task list
ID Age  D Description                       Urg
 2 4min   first do this                       8
 3 4min   do this whenever you feel like it   0
 1 4min D Something to do                    -5

3 tasks
Goal
Now I want to find the tasks that are dependent on task#2, which should be task#1.
Trials
Unfortunately, this does not result in any matches:
$ task list depends:2
No matches.
$ # I can filter by blocked tasks
$ task blocked
ID Deps Age   Description
 1 2    18min Something to do

1 task
$ # But when I want only the tasks that are blocked by task#2, task#3 is also returned
$ task blocked:2
[task ready ( blocked:2 )]
ID Age   Description                       Urg
 2 20min first do this                       8
 3 19min do this whenever you feel like it   0

2 tasks
Suggestions?
How would you approach this?
Parsing the taskwarrior output with a script seems like overkill.
You have the right command but have actually encountered a bug: the depends attribute does not work with short IDs, because it is stored as a comma-delimited string of UUIDs.
It will work if you use the UUID instead. Use task _get <id>.uuid to resolve an ID to a UUID.
$ task --version
2.5.1
# Create tasks
$ task rc.data.location: add -- Something to do
$ task rc.data.location: add -- first do this
$ task rc.data.location: add -- do this whenever you feel like it
$ task rc.data.location: list
ID Age Description                       Urg
 1 -   Something to do                   1.8
 2 -   first do this                     1.8
 3 -   do this whenever you feel like it 1.8

3 tasks
# Set up dependency
$ task rc.data.location: 1 modify depends:2
Modifying task 1 'Something to do'.
Modified 1 task.
# Query using depends:UUID
$ task rc.data.location: list "depends.has:$(task rc.data.location: _get 2.uuid)"
ID Age D Description     Urg
 1 -   D Something to do -3.2

1 task
# Query using depends:SHORT ID
# This does not work, despite documentation. Likely a bug
$ task rc.data.location: list "depends.has:$(task rc.data.location: _get 2.id)"
No matches.
A small correction to your trial to find blocked tasks:
There is no blocked attribute and you're using the ready report.
$ task blocked:2
[task ready ( blocked:2 )]
The ready report filters out exactly what we're looking for; the blocked report is what we need. To unmagickify this: these are simply useful default reports that apply preset filters on top of task all.
$ task show filter | grep -e 'blocked' -e 'ready'
report.blocked.filter status:pending +BLOCKED
report.ready.filter +READY
report.unblocked.filter status:pending -BLOCKED
Blocked tasks will have the virtual tag +BLOCKED, which is mutually exclusive with +READY.
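You can also use these virtual tags directly in ad-hoc filters, for example:
$ task +BLOCKED list   # pending tasks that wait on a dependency
$ task +READY list     # tasks that are actionable right now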
The blocked attribute doesn't exist; use task _columns to show the available attributes (e.g. depends). Unfortunately, the CLI parser is probably attempting to apply the filter blocked:2 and ends up ignoring it. For your workflow, the useful command is task blocked "depends.has:$(task _get 2.uuid)". It's advisable to wrap it in a shell function to make it easier to use:
#!/bin/bash
# Untested but gets the point across
function task_blocked {
    local blocker=$1
    shift
    # Pass any remaining arguments through as extra filters
    task blocked "depends.has:$(task _get "${blocker}".uuid)" "$@"
}
# Find tasks of project "foo" that are blocked on task 2
task_blocked 2 project:foo
# What about another project that is also impacted?
task_blocked 2 project:bar
You could use this taskwarrior hook script that adds a "blocks" attribute to the tasks: https://gist.github.com/wbsch/a2f7264c6302918dfb30

Is it possible to see all metrics (all paths) in whisper (graphite)?

I have a lot of metrics in Graphite and I have to search through them.
I tried to use whisper-fetch.py, but it returns the metric values (numbers), whereas I want the metric names, something like this:
prefix1.prefix2.metricName1
prefix1.prefix2.metricName2
...
Thank you.
Graphite has a dedicated endpoint for retrieving all metrics as part of its HTTP API: /metrics/index.json
For example, running this command against my local Graphite
curl localhost:8080/metrics/index.json | jq "."
produces the following output:
[
"carbon.agents.graphite-0-a.activeConnections",
"carbon.agents.graphite-0-a.avgUpdateTime",
"carbon.agents.graphite-0-a.blacklistMatches",
"carbon.agents.graphite-0-a.cache.bulk_queries",
"carbon.agents.graphite-0-a.cache.overflow",
...
"stats_counts.response.200",
"stats_counts.response.400",
"stats_counts.response.404",
"stats_counts.statsd.bad_lines_seen",
"stats_counts.statsd.metrics_received",
"stats_counts.statsd.packets_received",
"statsd.numStats"
]
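To search the index for a pattern from the shell, you can filter the same output with jq (the pattern here is just an illustration):
curl -s localhost:8080/metrics/index.json | jq -r '.[] | select(test("statsd"))'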
You can also just use the Unix find command, e.g. find /data/graphite -name 'some_pattern', or use the web API, e.g. curl http://my-graphite/metrics/find?query=somequery; see the Graphite metrics API.
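If you have filesystem access, here is a small sketch that turns the .wsp paths back into metric names (assuming the default whisper storage directory; adjust the path to your installation):
# Strip the storage prefix and .wsp suffix, then turn slashes into dots
find /opt/graphite/storage/whisper -name '*.wsp' \
    | sed -e 's|^/opt/graphite/storage/whisper/||' -e 's|\.wsp$||' -e 's|/|.|g'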

Bash case not properly evaluating value

The Problem
I have a script with a case statement that I expect to execute based on the value of a variable. The case statement appears to either ignore the value or not properly evaluate it, dropping instead to the default.
The Scenario
I pull a specific character out of our server hostnames which indicates where in our environment the server resides. We have six different locations:
Management (m): servers that are part of the infrastructure, such as monitoring, email, ticketing, etc.
Development (d): servers that are for developing code and application functionality
Test (t): servers that are used for initial testing of the code and application functionality
Implementation (i): servers that the code is pushed to for pre-production evaluation
Production (p): self-explanatory
Services (s): servers that the customer needs to integrate that provide functionality across their project. These are separate from the Management servers in that these are customer servers, while Management servers are owned and operated by us.
After pulling the character from the hostname I pass it to a case block. I expect the case block to evaluate the character and add a couple lines of text to our rsyslog.conf file. What is happening instead is that the case block returns the default which does nothing but tell the person building the server to manually configure the entry due to an unrecognized character.
I've tested this manually against a server I recently built and verified that the character I am pulling from the hostname (an 's') is expected and accounted for in the case block.
The Code
# Determine which environment our server resides in
host=$(hostname -s)
env=${host:(-8):1}
OLDFILE=/etc/rsyslog.conf
NEWFILE=/etc/rsyslog.conf.new
# This is the configuration we need on every server regardless of environment
read -d '' common <<- EOF
...
TEXT WHICH IS ADDED TO ALL CONFIG FILES REGARDLESS OF FURTHER CODE EXECUTION
SNIPPED
....
EOF
# If a server is in the Management, Dev or Test environments send logs to lg01
read -d '' lg01conf <<- EOF
# Relay messages to lg01
*.notice @@xxx.xxx.xxx.100
#### END FORWARDING RULE ####
EOF
# If a server is in the Imp, Prod or is a non-affiliated Services zone server send logs to lg02
read -d '' lg02conf <<- EOF
# Relay messages to lg02
*.notice @@xxx.xxx.xxx.101
#### END FORWARDING RULE ####
EOF
# The general rsyslog configuration remains the same; pull it out and write it to a new file
head -n 63 $OLDFILE > $NEWFILE
# Add the common language to our config file
echo "$common" >> $NEWFILE
# Depending on which environment ($env) our server is in, add the appropriate
# remote log server to the configuration with the $common settings.
case $env in
m) echo "$lg01conf" >> $NEWFILE;;
d) echo "$lg01conf" >> $NEWFILE;;
t) echo "$lg01conf" >> $NEWFILE;;
i) echo "$lg02conf" >> $NEWFILE;;
p) echo "$lg02conf" >> $NEWFILE;;
s) echo "$lg02conf" >> $NEWFILE;;
*) echo "Unknown environment; Manually configure"
esac
# Keep a dated backup of the original rsyslog.conf file
cp $OLDFILE $OLDFILE.$(date +%Y%m%d)
# Replace the original rsyslog.conf file with the new version
mv $NEWFILE $OLDFILE
An Aside
I've already determined that I can combine the different groups of code from the case block onto single lines (a total of two) using the | operator. I've listed it in the manner above since this is how it was coded when I was having issues with it.
I can't see what's wrong with your code. Maybe add another ;; to the default clause. To find the problem, add set -vx as the first line; it will show you lots of debug information.
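A minimal sketch of that debugging approach, with an extra echo to make stray characters in $env visible (hostname handling as in the original script):
#!/bin/bash
set -vx                  # -v echoes each line as read, -x traces expanded commands
host=$(hostname -s)
env=${host:(-8):1}
echo "env='${env}'"      # quotes expose unexpected whitespace or an empty value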

Can we set the multiple generic arguments with -D option in GenericOptionsParser?

I want to pass multiple configuration parameters to my Hadoop job through GenericOptionsParser.
With "-D abc=xyz" I can pass one argument and retrieve it from the configuration object, but I was not able to pass multiple arguments.
Is it possible to pass multiple arguments? If yes, how?
I passed the parameters as -D color=yellow -D number=10
and had the following code in the run() method:
String color = getConf().get("color");
System.out.println("color = " + color);
String number = getConf().get("number");
System.out.println("number = " + number);
The following was the output in the console:
color = yellow
number = 10
I recently ran into this issue after upgrading from Hadoop 1.2.1 to Hadoop 2.4.1. The problem is that Hadoop's dependency on commons-cli 1.2 was being omitted due to a conflict with commons-cli 1.1, which was pulled in from Cassandra 2.0.5.
After a quick look through the source it looks like commons-cli options that have an uninitialized number of values (what Hadoop's GenericOptionsParser does) default to a limit of 1 in version 1.1 and no limit in 1.2.
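If you want to verify which commons-cli version your build actually resolves (assuming a Maven build), something like this helps:
# Show where commons-cli comes from in the dependency graph
mvn dependency:tree -Dincludes=commons-cli:commons-cli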
I hope that helps!
I tested passing multiple parameters and I used the -D flag multiple times.
$HADOOP_HOME/bin/hadoop jar /path/to/my.jar -D mapred.heartbeats.in.second=80 -D mapred.map.max.attempts=2 ...
Doing this changed the values to what I specified in the Job's configuration.

What gems do you recommend to use for this kind of automation?

I have to create a script to manage a maintenance-page server for my hosting company.
I will need to build a CLI that acts like this (example scenario), where mcli is the name of the script and 1.1.1.1 is the original server address that hosts the website www.exemple.com.
Here I just create the loopback interface on the maintenance server with the original IP address and create the Nginx site-specific config file in sites-enabled:
$ mcli register www.exemple.com 1.1.1.1
[DEBUG] Adding IP 1.1.1.1 to new loopback interface lo:001001001001
[WARNING] No root directory specified, setting default maintenance page.
[DEBUG] Registering www.exemple.com maintenance page and reloading Nginx: OK
Then when I want to enable the maintenance page and completely shutdown the website:
$ mcli maintenance www.exemple.com
[DEBUG] Connecting to router with SSH: OK
[DEBUG] Setting new route to 1.1.1.1 to maintenance server: OK
[DEBUG] Writing configuration: Ok
Then removing the maintenance page:
$ mcli nomaintenance www.exemple.com
[DEBUG] Connecting to router with SSH: OK
[DEBUG] Removing route to 1.1.1.1: Ok
[DEBUG] Writing configuration: Ok
And I would need a function to see the current state of the websites:
$ mcli list
+------------------+-----------------+------------------+
| Site Name | Server I.P | Maintenance mode |
+------------------+-----------------+------------------+
| www.example.com | 1.1.1.1 | Enabled |
| www.example.org | 1.1.1.2 | Disabled |
+------------------+-----------------+------------------+
$ mcli show www.example.org
Site Name: www.example.org
Server I.P: 1.1.1.1
Maintenance Mode: Enabled
Root Directory : /var/www/maintenance/default/
But I have never done this kind of scripting with Ruby. What gems do you recommend for this kind of thing? For command-line parsing? Column/colorized output? SSH connections (needed to connect to Cisco routers)?
Do you recommend using a local database (SQLite) to store metadata (state changes, current states), or computing it on the fly by analyzing the nginx/interface configuration files and using syslog to monitor changes made with this script?
This script will be used first for a massive physical datacenter migration, and afterwards for standard scheduled-downtime use.
Thank you.
First of all, I'd recommend you get a copy of Build Awesome Command-Line Applications in Ruby.
That said, you might want to check
GLI command line parsing like git
OptionParser command line parsing
Personally, I'd go for the SQLite approach for storing data, but I'm biased (having a strong SQL background).
Thor is a good gem for handling CLI options. It allows this type of organization in your script:
class Maintenance < Thor
  desc "maintenance", "put up maintenance page"
  method_option :switch, :aliases => '-s', :type => :string

  # The method name is the name of the task that would be run => mcli maintenance
  def maintenance
    # do stuff
  end

  no_tasks do
    # methods that you don't want cli tasks for go here
  end
end

Maintenance.start
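Invocation would then look something like this (the script file name is hypothetical):
$ ruby mcli.rb maintenance --switch=on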
I don't really have any good suggestions for column/colorized output, though.
I definitely recommend using some kind of database to store state. Maybe not SQLite; I would probably opt for a Redis database that stores key/value pairs with the information you are looking for.
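As a sketch of that idea, one Redis hash per site would cover what the list and show commands need (the key layout is just an illustration):
# Store and read back the per-site state
redis-cli hset site:www.example.com ip 1.1.1.1
redis-cli hset site:www.example.com maintenance enabled
redis-cli hgetall site:www.example.com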
We have a similar task. I use the following architecture:
A small application (in C) that generates the config file.
A new update_clusters switch added to the nginx init.d script; it reloads nginx only if the config file has changed:
update_clusters() {
    # Generate a candidate config; CONF_GEN points at the generator binary
    ${CONF_GEN} --outfile=/tmp/nginx_clusters.conf
    RETVAL=$?
    if [[ "$RETVAL" != "0" ]]; then
        return 5
    fi
    # Reload only when the generated file differs from the active config
    if ! diff ${CLUSTER_CONF_FILE} /tmp/nginx_clusters.conf > /dev/null; then
        echo "Cluster configuration changed. Reload service"
        mv -f /tmp/nginx_clusters.conf ${CLUSTER_CONF_FILE}
        reload
    fi
}
A set of bash scripts to add records to the database.
A web console to add/modify/delete records in the database (ExtJS + an nginx module).
