Several questions regarding Collectd system stats collector - metrics

I've recently started looking at the collectd system stats collector after using Intel Snap for a while. So far it seems to be less dynamic than the Snap collector, but maybe I'm simply not fully aware of its capabilities. I've got a few questions regarding its use:
(1) Is it possible to collect only some of the metrics from each plugin, in case I'm not interested in all of them?
(2) Is it possible to dynamically change the metrics that are being collected, similar to Snap tasks? For example, collecting only some of the metrics of the first 3 plugins? Or will I need to change the configuration file every time?
(3) I couldn't find the list of stats/metrics that are collected by each plugin. Can I find it somewhere?
(4) For people who have used both Intel Snap and collectd: does collectd have any advantages/disadvantages over Snap?
Thanks!

(1)
Most collectd plugins support some sort of metric selection, plus an inverse-selection mechanism (typically an IgnoreSelected option), which you can find in the plugin documentation. There is also collectd's filter-chain mechanism, which can be used to rename or drop metrics.
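For instance, a minimal sketch of both mechanisms (plugin options vary, so check collectd.conf(5) for the exact selection options your plugin supports; the plugin choices and chain name here are illustrative):
    # Per-plugin selection: collect all interfaces except "lo".
    LoadPlugin interface
    <Plugin interface>
      Interface "lo"
      IgnoreSelected true
    </Plugin>

    # Filter chain: drop swap I/O metrics after the cache, write the rest.
    LoadPlugin match_regex
    PostCacheChain "PostCache"
    <Chain "PostCache">
      <Rule "drop_swap_io">
        <Match "regex">
          Plugin "^swap$"
          TypeInstance "^io$"
        </Match>
        Target "stop"
      </Rule>
      Target "write"
    </Chain>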
(2)
No. collectd does not natively support dynamic reconfiguration; you need to edit the configuration file and restart (or reload) the daemon. See: https://github.com/collectd/collectd/issues/1005
(3)
There are a couple of ways. Look at the collectd man pages if you aren't finding what you need on the collectd wiki. If you want to inspect the running collectd process, a good way to see the metrics is to enable the unixsock plugin and use "collectdctl listval" for metric introspection.
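For example, a minimal unixsock setup (the socket path is just a common default):
    LoadPlugin unixsock
    <Plugin unixsock>
      SocketFile "/var/run/collectd-unixsock"
    </Plugin>
and then, from a shell:
    collectdctl -s /var/run/collectd-unixsock listval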
(4)
There are pros and cons to both.
- collectd is more mature.
- collectd supports more legacy and embedded systems.
- collectd is lightweight, with many compile-time options and embedded interpreters.
- collectd plugins are more frequently available via OS distribution packages.
- Snap supports dynamic configuration (see the answer to #2).
- Snap plugin documentation is clearer; e.g. see the metric lists, roadmap, and install directions.
- Snap task scheduling is more advanced (I know this from the pain of trying to match input plugin intervals to output plugin intervals in collectd).
- Snap processors provide more functionality than collectd's; e.g. I know of no collectd equivalent to Snap's tag or anomalydetection processors.
- collectd and Snap do not support the same set of plugins, so the presence or absence of a particular plugin will be a clear advantage/disadvantage depending on your use case. The method of building a plugin also differs greatly, so that could be a factor as well.

Related

How to set up monitoring for Redmine tasks/issues in Grafana?

I plan to set up monitoring for Redmine so that I can see man-hours spent on tickets, the time taken to complete a ticket, etc., to monitor the productivity of my team. I want to see all of this in Grafana. As of now I'm thinking of using Prometheus and exposing the metrics, but I'm not sure how (I might have to create an exporter, I think, but I'm not sure if that would work). So basically, how can this be done?
A Prometheus exporter is simply an HTTP server that sits next to your target (Redmine in your case, although I have no experience with it). Whenever it gets a /metrics request, it does one or more API calls to the target (assuming Redmine provides an API to query the numbers you need) and returns said numbers as Prometheus metrics, with names, labels, etc.
Here are the Prometheus clients (that help expose metrics in the format accepted by Prometheus) for Go and Java (look for simpleclient_http or simpleclient_servlet). There is support for many other languages.
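To make that concrete, here is a minimal sketch of such an exporter in Python using the official prometheus_client library; the Redmine URL, API key, metric name, and port are all placeholders you would adapt:
    import time

    import requests
    from prometheus_client import Gauge, start_http_server

    OPEN_ISSUES = Gauge('redmine_open_issues', 'Open issues per project', ['project'])

    def scrape():
        # Redmine's REST API exposes issues at /issues.json.
        resp = requests.get(
            'https://redmine.example.com/issues.json',
            params={'status_id': 'open', 'limit': 100},
            headers={'X-Redmine-API-Key': 'YOUR_KEY'},
            timeout=10,
        )
        counts = {}
        for issue in resp.json().get('issues', []):
            project = issue['project']['name']
            counts[project] = counts.get(project, 0) + 1
        for project, count in counts.items():
            OPEN_ISSUES.labels(project=project).set(count)

    if __name__ == '__main__':
        start_http_server(9101)  # Prometheus scrapes http://host:9101/metrics
        while True:
            scrape()
            time.sleep(60)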
Adding on to @Alin's answer: to expose Redmine metrics to Prometheus, you would need to install an exporter. Here is a Redmine plugin available for Prometheus:
https://github.com/mbeloshitsky/redmine_prometheus.git
You can get the hours and all the data you need through the Redmine REST APIs. Write a little program to fetch the data and push it to Graphite or Prometheus. You can perform this task with Sensu by creating a metric script in Python, Ruby, or Perl. Then all you have to do is plot the graphs. Well, that's another race :P
Redmine guide: http://www.redmine.org/projects/redmine/wiki/Rest_api_with_python
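As a sketch of that "little program" idea (the Redmine endpoint, API key, Graphite host, and metric path are placeholders; Graphite's plaintext protocol on port 2003 just takes "path value timestamp" lines):
    import socket
    import time

    import requests

    # Fetch spent hours from the Redmine REST API.
    entries = requests.get(
        'https://redmine.example.com/time_entries.json',
        headers={'X-Redmine-API-Key': 'YOUR_KEY'},
        timeout=10,
    ).json().get('time_entries', [])
    total_hours = sum(e['hours'] for e in entries)

    # Push one datapoint to Graphite's plaintext port.
    sock = socket.create_connection(('graphite.example.com', 2003))
    sock.sendall(f'redmine.spent_hours {total_hours} {int(time.time())}\n'.encode())
    sock.close()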

How to monitor Elastic Stack without X-Pack?

Can we monitor the Elastic Stack 6.0 and above (Elasticsearch, etc.) without using X-Pack? As we know, many of the features like security, machine learning, and the graph APIs aren't supported under the Basic (free) licence.
So I want to know: are there any APIs, without licence limitations, that can be used to implement the functionality mentioned above?
All the information should be in the cluster APIs, you'll just lack the visualizations.
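For example, two of those licence-free cluster APIs queried from Python (localhost:9200 and the requests library assumed):
    import requests

    BASE = 'http://localhost:9200'

    # Cluster-wide health: status, node counts, unassigned shards, ...
    health = requests.get(f'{BASE}/_cluster/health', timeout=10).json()
    print(health['status'], health['number_of_nodes'])

    # Per-node OS and JVM stats (CPU, load, heap usage, ...).
    stats = requests.get(f'{BASE}/_nodes/stats/os,jvm', timeout=10).json()
    for node in stats['nodes'].values():
        print(node['name'], node['jvm']['mem']['heap_used_percent'])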
Monitoring (of the local cluster) is actually included in X-Pack Basic unlike the other features. Any reason you don't want to use it?
Alternatives include Kopf, Cerebro, ... though you'll need to run them as a separate process and watch out for version compatibility.
We've had success with ElasticHQ for monitoring (requires Python):
https://github.com/ElasticHQ/elasticsearch-HQ
And Sentinl for setting up alerts/watchers (it is a Kibana plugin):
https://github.com/sirensolutions/sentinl/wiki
We have set up a reverse proxy to enable SSL/TLS and use Ubuntu user management to create logins; however, we do not limit access within Kibana itself.
We have little need for graph/machine learning so I am unaware of free alternatives.
The company I work for is heavily Open Source, so these projects suit us.

Using OpenDaylight to document networks

I am facing a task to analyze, document, and visualize a rather large global network (more than 200 sites, various technologies) and wonder if OpenDaylight might help me with this. The documentation should mainly focus on Layer 3 and Layer 4 but may partially also include L2 topology. There is no need to actually use a controller to configure devices through southbound APIs, but I would like to use the benefits of a structured, consistent description (e.g. YANG) and the visualization (DLUX, NeXt) which ODL provides.
So here's my question: Is there any way to manually add a topology (nodes, links) through a graphical editor to ODL?
From the general project description and from the last 20 seconds of this video, NeXt in general seems to be able to add/modify topology information. How far does the NeXt integration with ODL go? Would I need to write my own app (which could use NeXt) which would need to use the RESTCONF APIs to add information to the topology? Or should I maybe create a virtual topology of the real network using mininet? Any other ideas?
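If you do go the RESTCONF route, a minimal sketch of pushing a hand-made topology into ODL's network-topology model might look like this (default RESTCONF endpoint and admin:admin credentials assumed; the topology, node, and link IDs are made up for illustration):
    import requests

    URL = ('http://localhost:8181/restconf/config/'
           'network-topology:network-topology/topology/manual-topo')
    BODY = {
        "topology": [{
            "topology-id": "manual-topo",
            "node": [{"node-id": "site-1"}, {"node-id": "site-2"}],
            "link": [{
                "link-id": "site-1-to-site-2",
                "source": {"source-node": "site-1"},
                "destination": {"dest-node": "site-2"},
            }],
        }]
    }
    # PUT creates/replaces the topology in the config datastore.
    resp = requests.put(URL, json=BODY, auth=('admin', 'admin'))
    resp.raise_for_status()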
I understand that an SDN controller is probably not the right kind of tool for this task, but the alternatives (e.g. Net2Plan, Visio, yEd) are basically local solutions, a bit too complicated, or provide no standardized DSL for topologies. Besides that, the documentation covering ODL and NeXt integration is very limited; I couldn't get a grasp of how that integration should work (I'll try harder).

Difference between Apache NiFi and StreamSets

I am planning to do a class project and was going through a few technologies with which I can automate or set up the flow of data between systems, and found that there are a couple of them, i.e. Apache NiFi and StreamSets (to my knowledge). What I couldn't understand is the difference between them and the use cases where each can be used. I am new to this, and if anyone can explain a bit, it would be highly appreciated. Thanks
Suraj,
Great question.
My response is as a member of the open source Apache NiFi project management committee and as someone who is passionate about the dataflow management domain.
I've been involved in the NiFi project since it was started in 2006. My knowledge of StreamSets is relatively limited, so I'll let them speak for it as they have.
The key thing to understand is that NiFi was built to do one really important thing really well, and that is 'Dataflow Management'. Its design is based on a concept called Flow Based Programming, which you may want to read about and reference for your project: https://en.wikipedia.org/wiki/Flow-based_programming
There are already many systems which produce data such as sensors and others. There are many systems which focus on data processing like Apache Storm, Spark, Flink, and others. And finally there are many systems which store data like HDFS, relational databases, and so on. NiFi purely focuses on the task of connecting those systems and providing the user experience and core functions necessary to do that well.
What are some of those key functions and design choices made to make that effective:
1) Interactive command and control
The job of someone trying to connect systems is to be able to rapidly and efficiently interact with the constant streams of data they see. NiFi's UI allows you to do just that: as the data is flowing, you can add features to operate on it, fork off copies of data to try new approaches, adjust current settings, see recent and historical stats, view helpful in-line documentation, and more. Almost all other systems, by comparison, have a model that is design-and-deploy oriented, meaning you make a series of changes and then deploy them. That model is fine and can be intuitive, but for the dataflow management job it means you don't get the interactive change-by-change feedback that is so vital to quickly building new flows or to safely and efficiently correcting or improving the handling of existing data streams.
2) Data Provenance
A very unique capability of NiFi is its ability to generate fine-grained and powerful traceability details for where your data comes from, what is done to it, where it's sent, and when each of these happens in the flow. This is essential to effective dataflow management for a number of reasons, but for someone in the early exploration phases of a project the most important thing this gives you is awesome debugging flexibility. You can set up your flows, let things run, and then use provenance to actually prove that the flow did exactly what you wanted. If something didn't happen as you expected, you can fix the flow, replay the object, and repeat. Really helpful.
3) Purpose-built data repositories
NiFi's out-of-the-box experience offers very powerful performance, even on really modest hardware or virtual environments. This is because of the flowfile and content repository design, which gives us the high-performance but transactional semantics we want as data works its way through the flow. The flowfile repository is a simple write-ahead-log implementation, and the content repository provides an immutable versioned content store. That in turn means we can 'copy' data by only ever adding a new pointer (not actually copying bytes), or we can transform data by simply reading from the original and writing out a new version. Again, very efficient. Couple that with the provenance stuff I mentioned a moment ago and it provides a really powerful platform.
Another really key thing to understand here is that in the business of connecting systems you don't always get to dictate things like the size of the data involved. The NiFi API was built to honor that fact, so our API lets processors do things like receive, transform, and send data without ever having to load the full objects into memory. These repositories also mean that in most flows the majority of processors do not even touch the content at all. However, you can easily see from the NiFi UI precisely how many bytes are actually being read or written, so again you get really helpful information for establishing and observing your flows. This design also means NiFi can support back-pressure and pressure-release naturally, and these are really critical features for a dataflow management system.
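The copy-by-pointer idea is easy to miss, so here is a toy Python sketch (not NiFi code, just an illustration of the design described above) of an immutable content store where "copying" adds a pointer and "transforming" writes a new version:
    class ContentRepo:
        def __init__(self):
            self._claims = []  # append-only list of immutable byte blobs

        def write(self, data: bytes) -> int:
            self._claims.append(data)      # old versions are never mutated
            return len(self._claims) - 1   # the "claim" a flowfile points to

        def read(self, claim: int) -> bytes:
            return self._claims[claim]

    repo = ContentRepo()
    original = repo.write(b"payload")
    copied = original                      # copy: a new pointer, zero bytes copied
    transformed = repo.write(repo.read(original).upper())  # new immutable version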
It was mentioned previously by the folks from the StreamSets company that NiFi is file oriented. I'm not really sure what the difference is between a file or a record or a tuple or an object or a message in generic terms, but the reality is that when data is in the flow, it is 'a thing that needs to be managed and delivered'. That is what NiFi does. Whether you have lots of really high-speed tiny things or you have large things, and whether they came from a live audio stream off the Internet or from a file sitting on your hard drive, it doesn't matter. Once it is in the flow, it is time to manage and deliver it. That is what NiFi does.
It was also mentioned by the StreamSets company that NiFi is schemaless. It is accurate that NiFi does not force conversion of data from whatever it is originally to some special NiFi format, nor do we have to reconvert it back to some format for follow-on delivery. It would be pretty unfortunate if we did that, because it would mean that even the most trivial of cases would have problematic performance implications, and luckily NiFi does not have that problem. Further, had we gone that route, handling diverse datasets like media (images, video, audio, and more) would be difficult; luckily NiFi does not have that problem either, and it is used for things like that all the time.
Finally, as you continue with your project and if you find there are things you'd like to see improved or that you'd like to contribute code we'd love to have your help. From https://nifi.apache.org you can quickly find information on how to file tickets, submit patches, email the mailing list, and more.
Here are a couple of fun recent NiFi projects to check out:
https://www.linkedin.com/pulse/nifi-ocr-using-apache-read-childrens-books-jeremy-dyer
https://twitter.com/KayLerch/status/721455415456882689
Good luck on the class project! If you have any questions, the users@nifi.apache.org mailing list would love to help.
Thanks
Joe
Both Apache NiFi and StreamSets Data Collector are Apache-licensed open source tools.
Hortonworks does have a commercially supported variant called Hortonworks DataFlow (HDF).
While both have a lot of similarities, such as a web-based UI and both being used for ingesting data, there are a few key differences. Both also consist of processors linked together to perform transformations, serialization, etc.
NiFi processors are file-oriented and schemaless. This means that a piece of data is represented by a FlowFile (this could be an actual file on disk, or some blob of data acquired elsewhere). Each processor is responsible for understanding the content of the data in order to operate on it. Thus if one processor understands format A and another only understands format B, you may need to perform a data format conversion in between those two processors.
NiFi can be run standalone, or as a cluster using its own built-in clustering system.
StreamSets Data Collector (SDC), however, takes a record-based approach. What this means is that as data enters your pipeline (whether it's JSON, CSV, etc.), it is parsed into a common format, so that the responsibility of understanding the data format is no longer placed on each individual processor, and any processor can be connected to any other processor.
SDC also runs standalone, or in a clustered mode, but its cluster mode runs atop Spark on YARN/Mesos instead, leveraging existing cluster resources you may have.
NiFi has been around for about the last 10 years (but less than 2 years in the open source community).
StreamSets was released to the open source community a little bit later, in 2015. It is vendor agnostic, and as far as Hadoop goes, Hortonworks, Cloudera, and MapR are all supported.
Full Disclosure: I am an engineer who works on StreamSets.
They are very similar for data-ingest scenarios.
Apache NiFi (HDF) is more mature and StreamSets is more lightweight.
Both are easy to use and both have strong capabilities. And StreamSets could easily
Both have companies behind them: Hortonworks and Cloudera.
Obviously there are more contributors working on NiFi than on StreamSets, and of course NiFi has more enterprise deployments in production.
Two of the key differentiators between the two, IMHO, are:
Apache NiFi is a top-level Apache project, meaning it has gone through the incubation process described here: http://incubator.apache.org/policy/process.html, and can accept contributions from developers around the world who follow the standard Apache process, which ensures software quality. StreamSets is Apache LICENSED, meaning anyone can reuse the code, etc., but the project is not managed as an Apache project. In fact, in order to even contribute to StreamSets, you are REQUIRED to sign a contract: https://streamsets.com/contributing/. Contrast this with the Apache NiFi contributor guide, which wasn't written by a lawyer: https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide#ContributorGuide-HowtocontributetoApacheNiFi
StreamSets "runs atop Spark on YARN/Mesos instead, leveraging existing cluster resources you may have." which imposes a bit of restriction if you want to deploy your dataflows further toward the Edge where the Devices that are generating the data live. Apache MiniFi, a sub-project of NiFi can run on a single Raspberry Pi, while I am fairly confident that StreamSets cannot, as YARN or Mesos require more resources than a Raspberry Pi provides.
Disclosure: I am a Hortonworks employee

Remote Execution in Ruby (Capistrano or MCollective) to collect cloud server performance metrics

I am looking for a way to collect data remotely from various cloud instances (EC2, Rackspace). The Rackspace API provides no way of collecting server performance metrics (i.e. load average, CPU usage, memory), otherwise this would never have been asked.
I started looking at solutions like Capistrano or MCollective (I have also considered collectd), but I am unsure which one would best suit my application. I am trying to avoid using SSH keys for trending purposes (I don't want to have to keep logging in to collect these metrics). The script I am writing is a Ruby script which reboots a cloud server if its load average is over a certain number. Because these providers don't expose these metrics via their APIs, I am looking at a way to gather them myself. I am new to the Ruby community, and after reading the documentation for all of these tools, I still haven't been able to get a sense of which framework would work best, or whether there are other alternatives.
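For what it's worth, the core check being described is small; sketched here in Python for illustration (the threshold is a placeholder, and the actual reboot would go through the provider API):
    import os

    LOAD_THRESHOLD = 4.0  # placeholder threshold

    def needs_reboot() -> bool:
        # 1-, 5-, and 15-minute load averages, read locally on the instance.
        one, five, fifteen = os.getloadavg()
        return five > LOAD_THRESHOLD

    if needs_reboot():
        # Here you would call the provider API (EC2, Rackspace) to reboot.
        print("load too high, requesting reboot")
The hard part, as the question notes, is getting that number off each instance without a manual SSH login.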
It sounds like Capistrano is more suited to being a deployment tool; although it can perform remote tasks, after I read the documentation it was pretty much out for the purposes of my script.
MCollective looks really attractive for what I am trying to do but it seems I would have to write my own RPC style plugin for this purpose.
I've also considered plugging into some greater monitoring system such as Nagios, Munin, Zenoss, Hyperic, etc., but I'd rather not install some large, bulky monitoring system when all I want to collect is but a few simple metrics.
If your intention is to trigger certain actions based on the system performance (like restarting when cpu usage is too high), you should check out god.
I'm not sure if this is also useful when you want to generate some performance statistics over a longer time period. Personally, I'm using Munin for this, but if you don't like it maybe you can find something on Ruby Toolbox | Server Monitoring.