Jmxtrans Config for multiple queries to the same Writer - elasticsearch

While setting up metric reporting from Apache Kafka to Elasticsearch with jmxtrans, we have written a configuration file that queries about 50 metrics.
The queries are as follows:
{
  "obj" : "kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec",
  "outputWriters" : [ {
    "#class" : "com.googlecode.jmxtrans.model.output.elastic.ElasticWriter",
    "connectionUrl" : "http://elasticHost:9200"
  } ]
}
Since there are so many of them, all writing to the same destination, is there a way to shorten this in the config file?
Any help is highly appreciated.

You can try to be more precise in your MBean path:
kafka.server:name=TotalFetchRequestsPerSec,topic=MyCoolTopic,type=BrokerTopicMetrics
Take a look at this one as a great example; jmxtrans supports resultAlias as well.
Here you can find a list of Kafka MBeans which could come in handy.
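For illustration, a single query entry that combines a more specific MBean path with a resultAlias might look like the sketch below; the alias value is just a placeholder, and the writer settings are copied from the question above.
{
  "obj" : "kafka.server:name=TotalFetchRequestsPerSec,topic=MyCoolTopic,type=BrokerTopicMetrics",
  "resultAlias" : "kafka.MyCoolTopic.totalFetchRequests",
  "outputWriters" : [ {
    "#class" : "com.googlecode.jmxtrans.model.output.elastic.ElasticWriter",
    "connectionUrl" : "http://elasticHost:9200"
  } ]
}
The alias can give the metric a shorter, predictable name in the writer output instead of the full MBean name.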

Related

Filebeat - what is the configuration json.message_key for?

I am working on a Filebeat project for indexing logs in JSON format.
I see in the configuration that there is the option json.message_key: message.
I don't really understand what it is for; if I remove it, I see no change.
Can someone explain?
The logs are in this format:
{"appName" : "blala", "version" : "1.0.0", "level":"INFO", "message": "log message"}
message is the default key for the raw content line.
So if you remove it from the config, Filebeat will still use message and apply grok on it.
If you change it to "not-a-message", you should see a difference. But you should not do that, as every automation depends on it.
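For reference, the relevant part of a filebeat.yml might look like the sketch below; the log path is a placeholder, and whether the section is named filebeat.prospectors or filebeat.inputs depends on your Filebeat version.
filebeat.prospectors:
  - type: log
    paths:
      - /var/log/myapp/*.log      # placeholder path to the JSON logs
    json.keys_under_root: true    # lift the decoded JSON fields to the top level
    json.add_error_key: true      # add an error field if JSON decoding fails
    json.message_key: message     # the key that line filtering/multiline settings apply to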

Use sql_last_value for more than one file: Logstash

I have a jdbc config file for Logstash
statement => "SELECT * from TEST where id > :sql_last_value"
which includes the above query.
Suppose I have 2 or more conf files; how do I differentiate their sql_last_value from each other?
Can I give an alias to differentiate them? How?
The idea is to configure a different last_run_metadata_path value in each configuration file. For instance:
Configuration file 1:
input {
  jdbc {
    ...
    last_run_metadata_path => "/Users/me/.logstash_jdbc_last_run1"
    ...
  }
}
Configuration file 2:
input {
  jdbc {
    ...
    last_run_metadata_path => "/Users/me/.logstash_jdbc_last_run2"
    ...
  }
}
Val's answer is the correct way to implement separate .logstash_jdbc_last_run files.
On top of this I want to give you some hints on implementing multiple jdbc input plugins in the same pipeline:
You need to keep in mind that when one input plugin throws errors (for example, the query isn't correct, or the db user has no grants), the whole pipeline will stop, not only the failing input plugin. That means one broken input can block your other inputs, which may otherwise work fine.
So a common way to avoid this is to define multiple pipelines with only one jdbc input plugin each. You can then decide whether to copy the rest of the plugins (filters and output) into each pipeline or to send the incoming events to a central processing pipeline with the pipeline output plugin.
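A minimal pipelines.yml along those lines might look like the sketch below; the pipeline ids and config paths are placeholders, and running separate pipelines this way assumes Logstash 6.0 or later. Each pipeline then keeps its own jdbc input and its own last_run_metadata_path, and a failure in one no longer blocks the other.
# pipelines.yml (placeholder ids and paths)
- pipeline.id: jdbc_test_1
  path.config: "/etc/logstash/conf.d/jdbc_test_1.conf"
- pipeline.id: jdbc_test_2
  path.config: "/etc/logstash/conf.d/jdbc_test_2.conf"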

Apache NiFi: PutElasticSearchHttp is not working, with blank error

I currently have Elasticsearch version 6.2.2 and Apache NiFi version 1.5.0 running on the same machine. I'm trying to follow the NiFi example located at https://community.hortonworks.com/articles/52856/stream-data-into-hive-like-a-king-using-nifi.html, except that instead of storing to Hive, I want to store to Elasticsearch.
Initially I tried using the PutElasticsearch5 processor, but I was getting the following error from Elasticsearch:
Received message from unsupported version: [5.0.0] minimal compatible version is: [5.6.0]
When I Googled this error message, the consensus seemed to be to use the PutElasticsearchHttp processor instead. My NiFi flow looks like this:
And the configuration for the PutElasticsearchHttp processor:
When the flowfile gets to the PutElasticsearchHttp processor, the following error shows up:
PutElasticSearchHttp failed to insert StandardFlowFileRecord into Elasticsearch due to , transferring to failure.
It seems like the reason is blank/null. There also wasn't anything in the Elasticsearch log.
After the ConvertAvroToJSON, the data is a JSON array with all of the entries on a single line. Here's a sample value:
{"City": "Athens",
"Edition": 1896,
"Sport": "Aquatics",
"sub_sport": "Swimming",
"Athlete": "HAJOS, Alfred",
"country": "HUN",
"Gender": "Men",
"Event": "100m freestyle",
"Event_gender": "M",
"Medal": "Gold"}
Any ideas on how to debug/solve this problem? Do I need to create anything in Elasticsearch first? Is my configuration correct?
I was able to figure it out. After the ConvertAvroToJSON, the flow file was a single line containing a JSON array of records. Since I wanted to index the records individually, I needed a SplitJson processor. Now my NiFi flow looks like this:
The configuration of the SplitJson looks like this:
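For a flow file whose content is a single top-level JSON array, SplitJson is typically configured with a JsonPath expression that selects the array elements; the value below is an assumption about a typical setup, not the exact configuration from the screenshot.
JsonPath Expression: $.*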
The index name cannot contain the / character. Try with a valid index name: e.g. sports.
I had a similar flow, where changing the type to _doc did the trick after including SplitJson.

Is there a way to see 'raw' blockchain data using Hyperledger Composer?

Composer seems to add quite a bit of abstraction on top of Fabric - is there any way to see the underlying cryptography?
For example
- Is there a way to see transaction hashes?
- Is there a way to examine past blocks?
Thanks!
From my experience, Composer does not give you a "block" view of your transactions. To see transaction hashes and related information, you can use a query. Make a query.qry file in the root of your project directory, then add this:
query getAllHistorianRecords {
  description: "getTradeRelatedHistorianRecords"
  statement:
    SELECT org.hyperledger.composer.system.HistorianRecord
      WHERE (transactionTimestamp > '0000-01-01T00:00:00.000Z')
}
This will let you see data such as:
{
  "$class": "org.hyperledger.composer.system.HistorianRecord",
  "transactionId": "b7b202906deba4d4bca1581ae6033562699361d67d31c2af45cd60b0225d5624",
  "transactionType": "org.hyperledger.composer.system.AddParticipant",
  "transactionInvoked": "resource:org.hyperledger.composer.system.AddParticipant#b7b202906deba4d4bca1581ae6033562699361d67d31c2af45cd60b0225d5624",
  "eventsEmitted": [],
  "transactionTimestamp": "2017-10-03T16:24:14.864Z"
}
...

How to use Hadoop API copyMerge function? What is the addString parameter?

Does anyone know of, or has anyone used, the copyMerge function in the Hadoop API (FileUtil)?
copyMerge(FileSystem srcFS, Path srcDir, FileSystem dstFS, Path dstFile, boolean deleteSource, Configuration conf, String addString);
In the function, what is the addString parameter for? How do I control how the files are merged? For example, I have part files numbered 1, 2, 3, 4, 5, ...; I want to combine them into one file in ascending order. How can I do it?
Details about the API: http://archive.cloudera.com/cdh/3/hadoop-0.20.2+320/api/org/apache/hadoop/fs/FileUtil.html
Thanks!
Looks like the addString is just written to the OutputStream in the FileUtil class, appended after each source file's contents are copied:
// inside FileUtil.copyMerge: addString is appended after each merged part file
if (addString != null)
  out.write(addString.getBytes("UTF-8"));
When there is no documentation, the source code is the true and best source of details. I have written a few articles on how to set up Git here and here. Git helps with faster and easier access to the code.
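For completeness, a call to copyMerge might look like the sketch below; the paths are placeholders, the newline passed as addString is just one choice of separator, and this assumes a Hadoop release that still ships the method.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class MergeParts {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Merge every part file under /data/output (placeholder path) into one file.
    // addString ("\n" here) is appended after each part so records do not run together.
    boolean merged = FileUtil.copyMerge(
        fs, new Path("/data/output"),         // source filesystem and directory
        fs, new Path("/data/output-merged"),  // destination filesystem and file
        false,                                // keep the source part files
        conf,
        "\n");
    System.out.println("merged: " + merged);
  }
}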
