Sort Loki logs in Grafana

Sort Loki logs
<timestamp> <level> <msg in json>
<timestamp> <level> <msg in json>
<timestamp> <level> <msg in json>
<timestamp> <level> <msg in json>
<timestamp> <level> <msg in json>
This is how my logs look in Loki.
In the msg JSON there is a field duration:20.32ms,
so I am looking to sort the logs by that duration while displaying them in Grafana.
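One hedged approach: Grafana always orders a Loki log query by timestamp, so log lines cannot be sorted by a parsed field directly, but you can parse the line with LogQL and either filter on the duration or turn the query into a metric query and rank with topk. A minimal sketch, assuming Loki 2.x and that | json (or a pattern/regexp stage, if the line is only partly JSON) yields a duration label; the selector {app="myapp"} and the "instance" label are placeholders for your own labels:

# Show only the slow lines, parsed from the JSON msg:
{app="myapp"} | json | duration > 20ms

# Or rank instead of filter: top 10 instances by maximum duration
# over the dashboard's time range.
topk(10,
  max_over_time(
    {app="myapp"} | json | unwrap duration(duration) [$__range]
  ) by (instance)
)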

Related

How to find out the maximum offset value for a particular Kafka topic in a Kerberised CDH cluster

I have a CDH cluster of 41 nodes using Kerberos, 28 of which have Kafka installed.
I want to find out the maximum offset value for a particular Kafka Topic.
I am using the command below, but it is not working.
(Note: the option of using kafka-run-class.sh does not work on CDH.)
./kafka-consumer-groups.sh \
--command-config /home/username/client.properties \
--group examplehost1:9092,examplehost2:9092,<many more>, examplehost41:9092 \
--topic roc-parse-7485 \
--zookeeper examplezookeperhost1:2181,examplezookeperhost2:2181,examplezookeperhost3:2181
Your problem is probably that you configured your brokers in the --group parameter; --group should be the name of the consumer group you wish to track.
Nevertheless, you can use GetOffsetShell - it gives you the latest offset for each topic partition.
You can find more info here: GetOffsetShell
You should use it like this:
bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list BROKER_LISTS \
  --partitions PARTITIONS_LIST \
  --topic TOPIC_NAME
In your case:
bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list rahdpapp00.tt-tim.tr:9092,rahdpapp01.tt-tim.tr:9092,rahdpapp02.tt-tim.tr:9092,rahdpapp03.tt-tim.tr:9092,rahdpapp04.tt-tim.tr:9092,rahdpapp05.tt-tim.tr:9092,rahdpapp06.tt-tim.tr:9092,rahdpapp07.tt-tim.tr:9092,rahdpdtp00.tt-tim.tr:9092,rahdpdtp01.tt-tim.tr:9092,rahdpdtp02.tt-tim.tr:9092,rahdpdtp03.tt-tim.tr:9092,rahdpdtp04.tt-tim.tr:9092,rahdpdtp05.tt-tim.tr:9092,rahdpdtp06.tt-tim.tr:9092,rahdpdtp07.tt-tim.tr:9092,rahdpdtp08.tt-tim.tr:9092,rahdpdtp09.tt-tim.tr:9092,rahdpdtp10.tt-tim.tr:9092,rahdpdtp11.tt-tim.tr:9092,rahdpdtp12.tt-tim.tr:9092,rahdpdtp13.tt-tim.tr:9092,rahdpmp00.tt-tim.tr:9092,rahdpmp01.tt-tim.tr:9092,rahdpmp02.tt-tim.tr:9092,rahdppmp00.tt-tim.tr:9092,rahdppmp01.tt-tim.tr:9092,rahdppmp02.tt-tim.tr:9092 \
  --partitions 2,1,0 --topic roc-parse-7485
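For reference, GetOffsetShell prints one topic:partition:offset line per partition, and it also accepts a --time option (-1 for the latest offsets, -2 for the earliest). A short sketch against the example hosts from the question; the output offsets below are made up:

bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list examplehost1:9092 \
  --topic roc-parse-7485 --time -1

roc-parse-7485:0:1042
roc-parse-7485:1:998
roc-parse-7485:2:1017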

Filebeat harvest rotated files

Problem description:
I have a relatively big /var/log/messages file which is rotated.
The file list looks like this:
ls -l /var/log/messages*
-rw-------. 1 root 928873050 Mar 5 10:37 /var/log/messages
-rw-------. 1 root 889843643 Mar 5 07:49 /var/log/messages.1
-rw-------. 1 root 890148183 Mar 5 07:50 /var/log/messages.2
-rw-------. 1 root 587333632 Mar 5 07:51 /var/log/messages.3
My filebeat configuration snippet:
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/messages
    - /var/lib/ntp/drift
    - /var/log/syslog
    - /var/log/secure
  tail_files: true
With multiple /var/log/messages* files as shown above, each time Filebeat is restarted it starts to harvest and ingest the old log files.
When I have just one /var/log/messages file, this issue is not observed.
On Linux systems, Filebeat keeps track of files not by filename but by inode number, which doesn't change when a file is renamed. The following is from the Filebeat documentation:
The harvester is responsible for opening and closing the file, which means that the file descriptor remains open while the harvester is running. If a file is removed or renamed while it's being harvested, Filebeat continues to read the file. This has the side effect that the space on your disk is reserved until the harvester closes. By default, Filebeat keeps the file open until close_inactive is reached.
This means the following happens in your case:

1. Filebeat reads the current messages file (inode#1) and keeps track of its inode number in the registry.
2. Filebeat stops, but messages is rotated to messages.1 (inode#1) and a new messages file (inode#2) is created.
3. When Filebeat restarts, it starts reading both:
   - messages.1 (inode#1), from where it left off
   - messages (inode#2), since it matches the path you configured (/var/log/messages)
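As an illustration, this inode tracking is visible in Filebeat's registry file (the path and exact format vary by version; in the 5.x/6.x series it is a JSON array, typically at /var/lib/filebeat/registry). The entry below is made up:

[{"source": "/var/log/messages", "offset": 928873050,
  "FileStateOS": {"inode": 271354, "device": 64768},
  "timestamp": "2019-03-05T10:37:00Z", "ttl": -1}]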
If your plan is to harvest all the messages files, even the rotated ones, then it would be better to configure the path as
/var/log/messages*
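A minimal sketch of the adjusted prospector, in the same Filebeat 5.x/6.x syntax as the snippet above (note that tail_files only applies to files Filebeat has not seen before; known files resume from their registry offset):

filebeat.prospectors:
- input_type: log
  paths:
    # The glob matches messages, messages.1, messages.2, ...
    - /var/log/messages*
  tail_files: true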
It seems like the syslog and security plugins were ON in the configuration. That triggered the loading of the rotated syslog files.

Clearing/truncating a Kafka topic works on Linux but not on Mac

A utility bash function to clear a Kafka topic by manipulating the retention interval is as follows:
clearKafka() {
  tname=$1
  # Shrink retention so the broker deletes the existing segments.
  kafka-topics.sh --zookeeper localhost:2181 --alter --topic $tname --config retention.ms=1000
  sleep 25s
  # Count what is left; the consumer prints "Processed a total of N messages".
  kafka-console-consumer.sh --from-beginning --bootstrap-server localhost:9092 \
    --property print.key=true --property print.value=false --property print.partition \
    --topic $tname --timeout-ms 300 | tail -n 10 | grep "Processed a total of"
  # Restore a longer retention interval.
  kafka-topics.sh --zookeeper localhost:2181 --alter --topic $tname --config retention.ms=600000
  sleep 25s
  kafka-console-consumer.sh --from-beginning --bootstrap-server localhost:9092 \
    --property print.key=true --property print.value=false --property print.partition \
    --topic $tname --timeout-ms 300 | tail -n 10 | grep "Processed a total of"
}
This is a little messy (note the WARNINGs), but it does work on Linux; notice that the last line reports 0 messages found.
$ clearKafka airsmall
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
Going forward, please use kafka-configs.sh for this functionality
Updated config for topic airsmall.
[2019-05-13 03:11:01,552] ERROR Error processing message, terminating consumer process: (kafka.tools.ConsoleConsumer$)
org.apache.kafka.common.errors.TimeoutException
Processed a total of 2000 messages
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
Going forward, please use kafka-configs.sh for this functionality
Updated config for topic airsmall.
[2019-05-13 03:11:01,552] ERROR Error processing message, terminating consumer process: (kafka.tools.ConsoleConsumer$)
org.apache.kafka.common.errors.TimeoutException
Processed a total of 0 messages
The same utility on Mac ends up showing 2000 messages even at the very end:
Processed a total of 2000 messages
In fact it still shows 2000 messages with that command even several minutes later. So what is the deal on Mac? How do I clear a topic on Mac?
It turns out the command does work on Mac, it just takes much longer: after another few minutes, maybe 5 or 10 minutes total, the topic shows up as cleared.
Processed a total of 0 messages
I'm not sure why Linux gets it done in a few tens of seconds while Mac requires minutes. One likely factor: retention is only enforced when the broker's periodic log-retention check runs (log.retention.check.interval.ms, 5 minutes by default), and only on closed segments, so the delay depends on each installation's broker configuration.
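As an aside, Kafka 0.11+ also ships kafka-delete-records.sh, which truncates partitions on demand without touching retention. A minimal sketch; the file name delete.json and the partition count are assumptions, and offset -1 means "delete up to the end":

cat > delete.json <<'EOF'
{"version": 1,
 "partitions": [
   {"topic": "airsmall", "partition": 0, "offset": -1},
   {"topic": "airsmall", "partition": 1, "offset": -1}
 ]}
EOF
kafka-delete-records.sh --bootstrap-server localhost:9092 --offset-json-file delete.json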

Logstash iptables log parsing

ELK runs in containers.
I set up iptables to send all INPUT/FORWARD/OUTPUT logs to Logstash.
Here is an example log as seen on the Kibana Discover pane:
#version:1 host:3.3.3.3 #timestamp:March 3rd 2018, 12:14:45.220 message:<4>Mar 3 20:14:47 myhost kernel: [2242132.946331] LOG_ALL_TRAF public INPUT IN=public OUT= MAC=00:1e:67:f2:db:28:00:1e:67:f2:d9:7c:08:00 SRC=1.1.1.1 DST=2.2.2.2 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=17722 DF PROTO=TCP SPT=3504 DPT=8080 WINDOW=512 RES=0x00 ACK URGP=0 type:rsyslog tags:_jsonparsefailure _id:AWHtgJ_qYRe3mIjckQsb _type:rsyslog _index:logstash-2018.03.03 _score: -
The entire log ends up in the 'message' field.
I want to use SRC, DST, SPT, DPT, etc. as individual fields and then also use them in visualizations.
Any guidance is much appreciated.
You will need to learn about the Grok filter plugin, which will enable you to split the message into named fields.
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
The list of common patterns is available here.
And you can test your patterns here.
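For illustration, a minimal Grok filter sketch that pulls a few of those key=value pairs out of the message; the pattern and target field names are assumptions based on the sample line above, not a complete iptables grammar:

filter {
  grok {
    # IP, WORD and INT are stock patterns from logstash-patterns-core.
    match => {
      "message" => "SRC=%{IP:src_ip} DST=%{IP:dst_ip} .* PROTO=%{WORD:protocol} SPT=%{INT:src_port} DPT=%{INT:dst_port}"
    }
  }
}

Since iptables output is almost entirely key=value pairs, the kv filter plugin (restricted with include_keys) is a common alternative to a long Grok pattern.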

How to MapReduce in this situation in Hadoop?

I want to analyze a text file.
The text file's format is like this:
<msg time='2015-07-30T16:37:48.408+09:00' org_id='oracle' comp_id='rdbms'
msg_id='opiexe:3056:2780954927' client_id='' type='NOTIFICATION'
group='admin_ddl' level='16' host_id='TEST_DB1'
host_addr='127.0.0.1' module='sqlplus#TEST_DB1 (TNS V1-V3)' pid='24436'>
<txt>ORA-1543 signalled during: create tablespace TS_MODULE_I datafile &apos;/data001/orasvc01/NEWDB/ts_module_i_01.dbf&apos; size 20m...
</txt>
</msg>
<msg time='2015-07-30T16:39:13.173+09:00' org_id='oracle' comp_id='rdbms'
client_id='' type='UNKNOWN' level='16'
host_id='TEST_DB1' host_addr='127.0.0.1' module=''
pid='23242'>
<txt>Errors in
file /logs001/orasvc01/diag/rdbms/newdb/NEWDB/trace/NEWDB_smon_23242.trc:
ORA-01116: error in opening database file 6
ORA-01110: data file 6:
&apos;/data001/orasvc01/NEWDB/ts_module_d_01.dbf&apos;
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
</txt>
</msg>
....
Sometimes a record spans 7 lines, but others span 10 lines.
In this situation, I want an output like:
(column[0]) (column[1]) sum of errors
2015-07-31 ora-1051 7
What should I do?
Your input file is XML. If you had the entire XML record as a string on each line, you could have used straightforward MapReduce. However, your input is in a different form: getting a record mostly depends on a start tag and an end tag.
So you should use a custom record reader and create your own input format for MapReduce: XmlInputFormat. The good news is that it has already been created, and you only have to customize it. You can search for "xmlinputformat mahout" for the actual class. An even easier way is to look at an example which uses this format; you can find it here. Once your mapper recognizes a record and you get hold of the contents inside, the rest is straightforward, and it is up to you which details are sent to the output. Happy coding.
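For illustration, a minimal sketch assuming Mahout's XmlInputFormat is on the classpath; the xmlinput.start/xmlinput.end keys follow that implementation, and the regexes for the date and the ORA codes are assumptions based on the sample records above:

import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

public class OraErrorCount {

    // Mapper: emit ("<date> <ora-code>", 1) for every ORA code in a <msg> record;
    // the stock IntSumReducer then produces the per-day sums.
    public static class MsgMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private static final Pattern DATE = Pattern.compile("time='(\\d{4}-\\d{2}-\\d{2})");
        private static final Pattern ERR = Pattern.compile("ORA-\\d+");

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String record = value.toString();   // one whole <msg>...</msg> block
            Matcher d = DATE.matcher(record);
            if (!d.find()) return;
            Matcher e = ERR.matcher(record);
            while (e.find()) {
                ctx.write(new Text(d.group(1) + " " + e.group().toLowerCase()), ONE);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Treat everything between "<msg" and "</msg>" as one record.
        conf.set("xmlinput.start", "<msg");
        conf.set("xmlinput.end", "</msg>");
        Job job = Job.getInstance(conf, "ora-error-count");
        job.setJarByClass(OraErrorCount.class);
        job.setInputFormatClass(XmlInputFormat.class); // Mahout's XmlInputFormat
        job.setMapperClass(MsgMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}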
