How to configure collectd-snmp to poll a router?

I am trying to use a Raspberry Pi to poll the interfaces MIB (IF-MIB) of a TP-LINK router and then send the metrics to Librato.
Setting up collectd and integrating it with Librato is no problem at all - I am successfully tracking other metrics (cpu, memory, etc.). The challenge I have is with the collectd-snmp plugin configuration.
I installed net-snmp and can "see" the router:
pi@raspberrypi ~ $ snmpwalk -v 1 -c public 192.168.0.1 IF-MIB::ifInOctets
IF-MIB::ifInOctets.2 = Counter32: 1206812646
IF-MIB::ifInOctets.3 = Counter32: 1548296842
IF-MIB::ifInOctets.5 = Counter32: 19701783
IF-MIB::ifInOctets.10 = Counter32: 0
IF-MIB::ifInOctets.11 = Counter32: 0
IF-MIB::ifInOctets.15 = Counter32: 0
IF-MIB::ifInOctets.16 = Counter32: 0
IF-MIB::ifInOctets.22 = Counter32: 0
IF-MIB::ifInOctets.23 = Counter32: 0
The Pi is on 192.168.0.20, the router on 192.168.0.1.
My collectd.conf is as follows:
<Plugin snmp>
  <Data "ifmib_if_octets32">
    Type "if_octets"
    Table true
    Instance "IF-MIB::ifDescr"
    Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"
  </Data>
  <Host "localhost">
    Address "192.168.0.1"
    Version 1
    Community "public"
    Collect "ifmib_if_octets32"
    Interval 60
  </Host>
</Plugin>
When I restart collectd I get the following error:
pi@raspberrypi ~ $ sudo service collectd restart
[....] Restarting statistics collection and monitoring daemon: collectdNo log handling enabled - turning on stderr logging
MIB search path: $HOME/.snmp/mibs:/usr/share/mibs/site:/usr/share/snmp/mibs:/usr/share/mibs/iana:/usr/share/mibs/ietf:/usr/share/mibs/netsnmp
Cannot find module (IF-MIB): At line 0 in (none)
[2015-01-24 23:01:31] snmp plugin: read_objid (IF-MIB::ifDescr) failed.
[2015-01-24 23:01:31] snmp plugin: No such data configured: `ifmib_if_octets32'
No log handling enabled - turning on stderr logging
MIB search path: $HOME/.snmp/mibs:/usr/share/mibs/site:/usr/share/snmp/mibs:/usr/share/mibs/iana:/usr/share/mibs/ietf:/usr/share/mibs/netsnmp
Cannot find module (IF-MIB): At line 0 in (none)
[2015-01-24 23:01:33] snmp plugin: read_objid (IF-MIB::ifDescr) failed.
[2015-01-24 23:01:33] snmp plugin: No such data configured: `ifmib_if_octets32'
. ok
It obviously can't find the MIB; it doesn't even seem to be looking at the router's IP. Any suggestions on how to configure this correctly?

I figured it out:
<Plugin snmp>
  <Data "if_Octets">
    Type "if_octets"
    Table true
    Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"
  </Data>
  <Host "tp-link">
    Address "192.168.0.1"
    Version 1
    Community "public"
    Collect "if_Octets"
    Interval 60
  </Host>
</Plugin>
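The working config above simply drops the failing Instance "IF-MIB::ifDescr" lookup. The underlying "Cannot find module (IF-MIB)" error usually just means the MIB definition files aren't installed, so the net-snmp library collectd links against can't translate the symbolic names. If you want to keep the per-interface Instance, a sketch of the usual fix on a Debian/Raspbian system (assuming the stock /etc/snmp/snmp.conf):
# Install the standard MIB files (a non-free package on Debian/Raspbian)
sudo apt-get install snmp-mibs-downloader
# The default snmp.conf disables MIB loading with a bare "mibs :" line; comment it out
sudo sed -i 's/^mibs :/#mibs :/' /etc/snmp/snmp.conf
# Check that the name now resolves to a numeric OID, then restart collectd
snmptranslate -On IF-MIB::ifInOctets
sudo service collectd restart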

Related

Getting error for basic Eland example (loading index from a locally installed ELK docker container)

We installed ELK in Docker based on this example, like so:
docker run -d --name elasticsearchdb --net es-stack-network -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.8.13
docker run -d --name kibana-es-ui --net es-stack-network -e "ELASTICSEARCH_URL=http://elasticsearchdb:9200" -p 5601:5601 kibana:6.8.13
We then set up Elastic with the basic built-in data sets, including the flights dataset offered by default.
Then we tried using Eland to pull that data into a dataframe, and I think we're following the documentation correctly.
But with the code:
import eland as ed
index_name = 'flights'
ed_df = ed.DataFrame('localhost:9200', index_name)
we get this error:
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\elastic_transport\client_utils.py:198, in url_to_node_config(url)
192 raise ValueError(f"Could not parse URL {url!r}") from None
194 if any(
195 component in (None, "")
196 for component in (parsed_url.scheme, parsed_url.host, parsed_url.port)
197 ):
--> 198 raise ValueError(
199 "URL must include a 'scheme', 'host', and 'port' component (ie 'https://localhost:9200')"
200 )
202 headers = {}
203 if parsed_url.auth:
ValueError: URL must include a 'scheme', 'host', and 'port' component (ie 'https://localhost:9200')
So when we add http://, like so:
import eland as ed
index_name = 'flights'
ed_df = ed.DataFrame('http://localhost:9200', index_name)
We get this error:
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\elastic_transport\_node\_http_urllib3.py:199, in Urllib3HttpNode.perform_request(self, method, target, body, headers, request_timeout)
191 err = ConnectionError(str(e), errors=(e,))
192 self._log_request(
193 method=method,
194 target=target,
(...)
197 exception=err,
198 )
--> 199 raise err from None
201 meta = ApiResponseMeta(
202 node=self.config,
203 duration=duration,
(...)
206 headers=response_headers,
207 )
208 self._log_request(
209 method=method,
210 target=target,
(...)
214 response=data,
215 )
ConnectionError: Connection error caused by: ConnectionError(Connection error caused by: ProtocolError(('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))))
So I thought, well, maybe it's serving on HTTPS by default for some reason. Maybe it's not related, but in the logs I saw:
05T17:17:04.734Z", "log.level": "WARN", "message":"received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/172.18.0.3:9200, remoteAddress=/172.18.0.1:59788}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[21683dc12cff][transport_worker][T#14]","log.logger":"org.elasticsearch.xpack.security.transport.netty4.SecurityNetty4HttpServerTransport","elasticsearch.cluster.uuid":"XuzqXMk_QgShA3L5HnfXgw","elasticsearch.node.id":"H1CsKboeTyaFFjk2-1nw2w","elasticsearch.node.name":"21683dc12cff","elasticsearch.cluster.name":"docker-cluster"}
So I try replacing http with https and get this error:
TlsError: TLS error caused by: TlsError(TLS error caused by: SSLError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)))
So I look up this error and find this thread, which says to do something like:
import ssl
from elasticsearch.connection import create_ssl_context
ssl_context = create_ssl_context(<use `cafile`, or `cadata` or `capath` to set your CA or CAs>)
ssl_context.check_hostname = False
ssl_context.verify_mode = ssl.CERT_NONE
es = Elasticsearch('localhost', ssl_context=ssl_context, timeout=60)
But this isn't helpful, because Eland handles the Elasticsearch client internally; I'm not controlling that.
This is a very basic scenario, so I'm sure the solution must be much simpler than all this. What can I do to make this work?
For whoever is still struggling, the following worked for me with a local Elastic cluster using Docker/docker-compose:
Following this guide, you'd have the http_ca.crt file locally after copying it out of the container with:
docker cp es01:/usr/share/elasticsearch/config/certs/http_ca.crt .
You can use the http_ca.crt file when creating your es_client:
from elasticsearch import Elasticsearch
es_client = Elasticsearch("https://localhost:9200",
                          ca_certs="/path/to/http_ca.crt",
                          basic_auth=("[elastic username]",
                                      "[elastic password]"))
And then use the es_client to connect to eland:
import eland as ed
df = ed.DataFrame(es_client=es_client, es_index_pattern="[Your index]")
df.head()
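If it's purely a local experiment and you don't want to copy the CA file around, the elasticsearch-py client also accepts verify_certs=False (plus ssl_show_warn=False to silence the warning). A minimal sketch, assuming the built-in elastic user and your own password:
import eland as ed
from elasticsearch import Elasticsearch

# Skip certificate verification - acceptable only for a local, self-signed dev cluster
es_client = Elasticsearch("https://localhost:9200",
                          basic_auth=("elastic", "[elastic password]"),
                          verify_certs=False,
                          ssl_show_warn=False)

df = ed.DataFrame(es_client=es_client, es_index_pattern="[Your index]")
print(df.head())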

Possible reasons for groovy program running as kubernetes job dumping threads during execution

I have a simple groovy script that leverages the GPars library's withPool functionality to launch HTTP GET requests to two internal API endpoints in parallel.
The script runs fine locally, both directly as well as a docker container.
When I deploy it as a Kubernetes Job (in our internal EKS cluster: 1.20), it runs there as well, but the moment it hits the first withPool call I see a giant thread dump; execution continues, though, and completes successfully.
NOTE: Containers in our cluster run with the following pod security context:
securityContext:
  fsGroup: 2000
  runAsNonRoot: true
  runAsUser: 1000
Environment
# From the k8s job container
groovy@app-271df1d7-15848624-mzhhj:/app$ groovy --version
WARNING: Using incubator modules: jdk.incubator.foreign, jdk.incubator.vector
Groovy Version: 4.0.0 JVM: 17.0.2 Vendor: Eclipse Adoptium OS: Linux
groovy@app-271df1d7-15848624-mzhhj:/app$ ps -ef
UID PID PPID C STIME TTY TIME CMD
groovy 1 0 0 21:04 ? 00:00:00 /bin/bash bin/run-script.sh
groovy 12 1 42 21:04 ? 00:00:17 /opt/java/openjdk/bin/java -Xms3g -Xmx3g --add-modules=ALL-SYSTEM -classpath /opt/groovy/lib/groovy-4.0.0.jar -Dscript.name=/usr/bin/groovy -Dprogram.name=groovy -Dgroovy.starter.conf=/opt/groovy/conf/groovy-starter.conf -Dgroovy.home=/opt/groovy -Dtools.jar=/opt/java/openjdk/lib/tools.jar org.codehaus.groovy.tools.GroovyStarter --main groovy.ui.GroovyMain --conf /opt/groovy/conf/groovy-starter.conf --classpath . /tmp/script.groovy
groovy 116 0 0 21:05 pts/0 00:00:00 bash
groovy 160 116 0 21:05 pts/0 00:00:00 ps -ef
Script (relevant parts)
@Grab('org.codehaus.gpars:gpars:1.2.1')
import static groovyx.gpars.GParsPool.withPool
import groovy.json.JsonSlurper
final def jsl = new JsonSlurper()
//...
while (!(nextBatch = getBatch(batchSize)).isEmpty()) {
    def devThread = Thread.start {
        withPool(poolSize) {
            nextBatch.eachParallel { kw ->
                String url = dev + "&" + "query=$kw"
                try {
                    def response = jsl.parseText(url.toURL().getText(connectTimeout: 10.seconds, readTimeout: 10.seconds,
                            useCaches: true, allowUserInteraction: false))
                    devResponses[kw] = response
                } catch (e) {
                    println("\tFailed to fetch: $url | error: $e")
                }
            }
        }
    }
    def stgThread = Thread.start {
        withPool(poolSize) {
            nextBatch.eachParallel { kw ->
                String url = stg + "&" + "query=$kw"
                try {
                    def response = jsl.parseText(url.toURL().getText(connectTimeout: 10.seconds, readTimeout: 10.seconds,
                            useCaches: true, allowUserInteraction: false))
                    stgResponses[kw] = response
                } catch (e) {
                    println("\tFailed to fetch: $url | error: $e")
                }
            }
        }
    }
    devThread.join()
    stgThread.join()
}
Dockerfile
FROM groovy:4.0.0-jdk17 as builder
USER root
RUN apt-get update && apt-get install -yq bash curl wget jq
WORKDIR /app
COPY bin /app/bin
RUN chmod +x /app/bin/*
USER groovy
ENTRYPOINT ["/bin/bash"]
CMD ["bin/run-script.sh"]
The bin/run-script.sh simply downloads the above groovy script at runtime and executes it.
wget "$GROOVY_SCRIPT" -O "$LOCAL_FILE"
...
groovy "$LOCAL_FILE"
As soon as the execution hits the first call to withPool(poolSize), there's a giant thread dump, but execution continues.
I'm trying to figure out what could be causing this behavior. Any ideas 🤷🏽‍♂️?
Thread dump
For posterity, answering my own question here.
The issue turned out to be this log4j2 JVM hot-patch that we're currently leveraging to fix the recent log4j2 vulnerability. This agent (running as a DaemonSet) patches all running JVMs in all our k8s clusters.
This, somehow, causes my OpenJDK 17 based app to thread dump. I found the same issue with an ElasticSearch 8.1.0 deployment as well (also uses a pre-packaged OpenJDK 17). This one is a service, so I could see a thread dump happening pretty much every half hour! Interestingly, there are other JVM services (and some SOLR 8 deployments) that don't have this issue 🤷🏽‍♂️.
Anyway, I worked with our devops team to temporarily exclude the node that deployment was running on, and lo and behold, the thread dumps disappeared!
Balance in the universe has been restored 🧘🏻‍♂️.
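For anyone who can't pause the patch DaemonSet itself, one way to keep a Job off a specific node is a node-affinity rule in the Job's pod template. This is only a sketch; the Job name, image, and hostname value are hypothetical, and kubernetes.io/hostname is the standard node label:
apiVersion: batch/v1
kind: Job
metadata:
  name: groovy-script
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  # keep the pod off the node running the hot-patch agent
                  # (hostname value is hypothetical - substitute your node's name)
                  - key: kubernetes.io/hostname
                    operator: NotIn
                    values:
                      - ip-10-0-1-23.ec2.internal
      containers:
        - name: app
          image: your-registry/groovy-app:latest   # hypothetical image name
      restartPolicy: Never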

Error while adding flow using OpenDaylight REST API

I have an OpenDaylight controller set up as the mechanism driver in an OpenStack cloud. The installation was done with DevStack in an all-in-one virtual machine.
Everything is working fine. I have two VMs in my OpenStack cloud and they can ping each other. But when I try to add a new flow to my Open vSwitch, I get the following error:
curl -u admin:admin -H 'Content-Type: application/yang.data+xml' -X PUT -d @flow_data.xml http://192.168.100.100:8181/restconf/config/opendaylight-inventory:nodes/node/openflow:1/table/234/flow/100770 | python -m json.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1347 0 638 100 709 42078 46761 --:--:-- --:--:-- --:--:-- 50642
{
    "errors": {
        "error": [
            {
                "error-info": "Data from case (urn:opendaylight:flow:inventory?revision=2013-08-19)output-action-case are specified but other data from case (urn:opendaylight:flow:inventory?revision=2013-08-19)drop-action-case were specified earlier. Data aren't from the same case.",
                "error-message": "Error parsing input: Data from case (urn:opendaylight:flow:inventory?revision=2013-08-19)output-action-case are specified but other data from case (urn:opendaylight:flow:inventory?revision=2013-08-19)drop-action-case were specified earlier. Data aren't from the same case.",
                "error-tag": "malformed-message",
                "error-type": "protocol"
            }
        ]
    }
}
flow_data.xml :
<?xml version="1.0"?>
<flow xmlns="urn:opendaylight:flow:inventory">
    <priority>14865</priority>
    <flow-name>jpsampleFlow</flow-name>
    <idle-timeout>12000</idle-timeout>
    <match>
        <ethernet-match>
            <ethernet-type>
                <type>2048</type>
            </ethernet-type>
        </ethernet-match>
        <ipv4-source>10.0.0.1/32</ipv4-source>
        <ipv4-destination>10.0.0.2/32</ipv4-destination>
        <ip-match>
            <ip-dscp>28</ip-dscp>
        </ip-match>
    </match>
    <id>9</id>
    <table_id>0</table_id>
    <instructions>
        <instruction>
            <order>6555</order>
        </instruction>
        <instruction>
            <order>0</order>
            <apply-actions>
                <action>
                    <order>0</order>
                    <drop-action/>
                    <output-action>
                        <output-node-connector>1</output-node-connector>
                    </output-action>
                </action>
            </apply-actions>
        </instruction>
    </instructions>
</flow>
Any idea what I am doing wrong? Thank you.
You have one action to drop and another action to forward. Aren't those mutually exclusive? Try without one or the other.
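In other words, keep only one action case per <action> element. A sketch of the corrected apply-actions block with just the output action (keep <drop-action/> and remove the output action instead if dropping is what you actually want):
<apply-actions>
    <action>
        <order>0</order>
        <output-action>
            <output-node-connector>1</output-node-connector>
        </output-action>
    </action>
</apply-actions>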

MapR installation failing for single node cluster

I was following the quick installation guide for a single-node cluster. For this I used a 20 GB storage file for MapR-FS, but during installation it gives 'Unable to find disks: /maprfs/storagefile'.
Here is my configuration file.
# Each Node section can specify nodes in the following format
# Hostname: disk1, disk2, disk3
# Specifying disks is optional. If not provided, the installer will use the values of 'disks' from the Defaults section
[Control_Nodes]
maprlocal.td.td.com: /maprfs/storagefile
#control-node2.mydomain: /dev/disk3, /dev/disk9
#control-node3.mydomain: /dev/sdb, /dev/sdc, /dev/sdd
[Data_Nodes]
#data-node1.mydomain
#data-node2.mydomain: /dev/sdb, /dev/sdc, /dev/sdd
#data-node3.mydomain: /dev/sdd
#data-node4.mydomain: /dev/sdb, /dev/sdd
[Client_Nodes]
#client1.mydomain
#client2.mydomain
#client3.mydomain
[Options]
MapReduce1 = true
YARN = true
HBase = true
MapR-DB = true
ControlNodesAsDataNodes = true
WirelevelSecurity = false
LocalRepo = false
[Defaults]
ClusterName = my.cluster.com
User = mapr
Group = mapr
Password = mapr
UID = 2000
GID = 2000
Disks = /maprfs/storagefile
StripeWidth = 3
ForceFormat = false
CoreRepoURL = http://package.mapr.com/releases
EcoRepoURL = http://package.mapr.com/releases/ecosystem-4.x
Version = 4.0.2
MetricsDBHost =
MetricsDBUser =
MetricsDBPassword =
MetricsDBSchema =
Below is the error that I am getting.
2015-04-16 08:18:03,659 callbacks 42 [INFO]: Running task: [Verify Pre-Requisites]
2015-04-16 08:18:03,661 callbacks 87 [ERROR]: maprlocal.td.td.com: Unable to find disks: /maprfs/storagefile from /maprfs/storagefile remove disks: /dev/sda,/dev/sda1,/dev/sda2,/dev/sda3 and retry
2015-04-16 08:18:03,662 callbacks 91 [ERROR]: failed: [maprlocal.td.td.com] => {"failed": true}
2015-04-16 08:18:03,667 installrunner 199 [ERROR]: Host: maprlocal.td.td.com has 1 failures
2015-04-16 08:18:03,668 common 203 [ERROR]: Control Nodes have failures. Please fix the failures and re-run the installation. For more information refer to the installer log at /opt/mapr-installer/var/mapr-installer.log
Please help me here.
Thanks
Shashi
The error is resolved by adding the --skip-checks option to the install command:
/opt/mapr-installer/bin/install --skip-checks new
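If the flat storage file itself doesn't exist yet (or is too small), create it before re-running the installer. A sketch matching the 20 GB file mentioned above:
# Create a 20 GB flat file for MapR-FS to use as storage
mkdir -p /maprfs
fallocate -l 20G /maprfs/storagefile    # or: dd if=/dev/zero of=/maprfs/storagefile bs=1G count=20
ls -lh /maprfs/storagefile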

gogo: CommandNotFoundException: Command not found: services

I know some of the commands changed names when Apache Felix started using Gogo.
For example: ps --> lb (list bundles).
What is the equivalent of services <BUNDLENO>?
I am trying to get the following output from my console:
services 5
Distributed OSGi Zookeeper-Based Discovery Single-Bundle Distribution (6) provides:
-----------------------------------------------------------------------------------
... other services ...
----
objectClass = org.osgi.service.cm.ManagedService
felix.fileinstall.filename = org.apache.cxf.dosgi.discovery.zookeeper.cfg
service.id = 38
service.pid = org.apache.cxf.dosgi.discovery.zookeeper
zookeeper.host = localhost
zookeeper.port = 2181
zookeeper.timeout = 3000
inspect capability service 5
check more details here
help inspect
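The general shape of the Gogo command is inspect (capability|requirement) <namespace> <bundle-id>; the exact namespaces vary a little by Felix version, but for example (prompt shown as the default g!):
g! inspect capability service 5      # services registered by bundle 5 - the old "services 5" view
g! inspect requirement service 5     # services the bundle consumes
g! help inspect                      # full list of directions and namespaces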