Hi everyone!
I have a Logstash config which forwards logs from RabbitMQ to Elasticsearch. Something like this:
input {
  rabbitmq {
    ...
  }
}
filter {
  if [type] == "rabbitmq" {
    json {
      source => "message"
      target => "message"
    }
  }
}
output {
  elasticsearch {
    hosts => ["${ES_HOST}"]
    user => "${ES_USERNAME}"
    password => "${ES_PASSWORD}"
    sniffing => false
    index => "kit_events-%{[message][elasticsearch][index]}"
  }
}
We were then forced to compress the logs on the fly, because they were consuming too much traffic. The logs were moved into an array and gzipped.
What is the correct way to configure un-gzipping and splitting the array back into objects?
I did some research and found that there is a gzip_lines plugin and something in Ruby(?) to parse the array, but I failed to implement it. Has anyone done something like this before?
UPD:
I added this filter:
filter {
  if [type] == "kitlog-rabbitmq" {
    ruby {
      init => "
        require 'zlib'
        require 'stringio'
      "
      code => "
        body = event.get('[http][response][body]').to_s
        sio = StringIO.new(body)
        gz = Zlib::GzipReader.new(sio)
        result = gz.read.to_s
        event.set('[http][response][body]', result)
      "
    }
  }
}
And now I'm catching this error:
[ERROR][logstash.filters.ruby ] Ruby exception occurred: not in gzip format
[DEBUG][logstash.pipeline ] output received {"event"=>{"@timestamp"=>2018-11-30T09:16:19.127Z, "tags"=>["_rubyexception"], "@version"=>"1", "message"=>"x^\\x8B\\xAEV*\\xCE\\xCE\\xCC\\xC9)V\\xB2R\\x88V\\xD26T07\\xB7\\xB0\\xB4\\xB44000W\\x8A\\xD5QPJ\\xCE\\xCF+IL.\\u0001\\xCA*)\\u0001\\xB9\\xA9\\xB9\\x89\\x999 N\\x96C\\x96^r~.X,\\xA5\\u0014(R\\xADT\\x9A\\u000E6#\\xA0\\xB2$#?\\u000F\\xAC\\xB9\\u0000\\\"\\xE2\\u001C\\xAC\\u0014[\\v\\xE4\\xE6%概\\xF4z\\u0001\\xE9b%\\xA0\\xC8\\xC0\\xD9\\u001D\\v\\u0000\\u0003\\x9ADk", "type"=>"kitlog-rabbitmq"}}
I was trying different gzipping methods, but the result is still the same. I also tried changing the input codecs (plain - utf-8, plain - binary).
So the content in rabbitmq is gzipped?
In the best of all possible worlds, logstash would see the content-encoding header and unzip it for you, but the plugin doesn't seem to do anything with that knowledge. You might request the feature.
The plugin does let you access the header, so you could do the gzip yourself. Something like this:
filter {
  if [@metadata][rabbitmq_properties][content-encoding] == "gzip" {
    ruby {
      ...
    }
  }
}
Examples of unzipping a string with Ruby exist elsewhere. Hopefully Ruby's 'zlib' library is available in Logstash.
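For completeness, here is a rough sketch of what that ruby block could look like, combined with splitting the decompressed JSON array back into separate events. The field names follow the debug output above, and the Zlib::Inflate fallback is an assumption for the case where the payload is raw zlib/deflate rather than a full gzip stream (which would explain the "not in gzip format" error):
filter {
  if [@metadata][rabbitmq_properties][content-encoding] == "gzip" {
    ruby {
      init => "
        require 'zlib'
        require 'stringio'
      "
      code => "
        body = event.get('message').to_s
        begin
          # try a full gzip stream first
          result = Zlib::GzipReader.new(StringIO.new(body)).read
        rescue Zlib::GzipFile::Error
          # assumption: the payload may be raw zlib/deflate without a gzip header
          result = Zlib::Inflate.inflate(body)
        end
        event.set('message', result)
      "
    }
    # the decompressed payload is expected to be a JSON array,
    # so parse it and split it back into one event per element
    json {
      source => "message"
      target => "message"
    }
    split {
      field => "message"
    }
  }
}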
I am trying to pull data that requires auth and args from a WebSocket server into Logstash using the WebSocket input plugin.
My Logstash conf file:
input {
  websocket {
    mode => client
    url => "wss://0.0.0.0/api"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["0.0.0.0:9200"]
    index => "ws_cdrs"
  }
}
But to get the data, I first have to send:
{
  "command":"auth",
  "command_ref":"command reference",
  "args":{
    "token":"9118ce123456db123456b268e0135e3"
  }
}
{
  "command":"get_calls",
  "command_ref":"command reference",
  "args":{
    "start_time":"2018-07-06T12:13:17Z",
    "end_time":"2022-07-21T12:13:17Z"
  }
}
and the second one is to actually get the data.
I am currently pulling JSON log files from an S3 bucket which contain different types of logs stored in a field called RawLog, along with another value, MessageSourceType (there are more metadata fields which I don't care about). Each line in the file is a separate log, in case that makes a difference.
I currently have these all going into 1 index as seen in my config below, however, I ideally want to split these out into separate indexes. For example, if the MessageSourceType = Syslog - Linux Host then I need logstash to extract the RawLog as syslog and place it into an index called logs-syslog, whereas if the MessageSourceType = MS Windows Event Logging XML I want it to extract the RawLog as XML and place it in an index called logs-MS_Event_logs.
filter {
  mutate {
    replace => [ "message", "%{message}" ]
  }
  json {
    source => "message"
    remove_field => "message"
  }
}
output {
  elasticsearch {
    hosts => ["http://xx.xx.xx.xx:xxxx","http://xx.xx.xx.xx:xxxx"]
    index => "logs-received"
  }
}
Also for a bit of context here is an example of one of the logs:
{"MsgClassTypeId":"3000","Direction":"0","ImpactedZoneEnum":"0","message":"<30>Feb 13 23:45:24 xx.xx.xx.xx Account=\"\" Action=\"\" Aggregate=\"False\" Amount=\"\" Archive=\"True\" BytesIn=\"\" BytesOut=\"\" CollectionSequence=\"825328\" Command=\"\" CommonEventId=\"3\" CommonEventName=\"General Operations\" CVE=\"\" DateInserted=\"2/13/2021 11:45:24 PM\" DInterface=\"\" DIP=\"\" Direction=\"0\" DirectionName=\"Unknown\" DMAC=\"\" DName=\"\" DNameParsed=\"\" DNameResolved=\"\" DNATIP=\"\" DNATPort=\"-1\" Domain=\"\" DomainOrigin=\"\" DPort=\"-1\" DropLog=\"False\" DropRaw=\"False\" Duration=\"\" EntityId=\"" EventClassification=\"-1\" EventCommonEventID=\"-1\" FalseAlarmRating=\"0\" Forward=\"False\" ForwardToLogMart=\"False\" GLPRAssignedRBP=\"-1\" Group=\"\" HasBeenInserted_EMDB=\"False\" HasBeenQueued_Archiving=\"True\" HasBeenQueued_EventProcessor=\"False\" HasBeenQueued_LogProcessor=\"True\" Hash=\"\" HostID=\"44\" IgnoreGlobalRBPCriteria=\"False\" ImpactedEntityId=\"0\" ImpactedEntityName=\"\" ImpactedHostId=\"-1\" ImpactedHostName=\"\" ImpactedLocationKey=\"\" ImpactedLocationName=\"\" ImpactedNetworkId=\"-1\" ImpactedNetworkName=\"\" ImpactedZoneEnum=\"0\" ImpactedZoneName=\"\" IsDNameParsedValue=\"True\" IsRemote=\"True\" IsSNameParsedValue=\"True\" ItemsIn=\"\" ItemsOut=\"\" LDSVERSION=\"1.1\" Login=\"\" LogMartMode=\"13627389\" LogSourceId=\"158\" LogSourceName=\"ip-xx-xx-xx-xx.eu-west-2.computer.internal Linux Syslog\" MediatorMsgID=\"0\" MediatorSessionID=\"1640\" MsgClassId=\"3999\" MsgClassName=\"Other Operations\" MsgClassTypeId=\"3000\" MsgClassTypeName=\"Operations\" MsgCount=\"1\" MsgDate=\"2021-02-13T23:45:24.0000000+00:00\" MsgDateOrigin=\"0\" MsgSourceHostID=\"44\" MsgSourceTypeId=\"88\" MsgSourceTypeName=\"Syslog - Linux Host\" NormalMsgDate=\"2021-02-13T23:45:24.0540000Z\" Object=\"\" ObjectName=\"\" ObjectType=\"\" OriginEntityId=\"0\" OriginEntityName=\"\" OriginHostId=\"-1\" OriginHostName=\"\" OriginLocationKey=\"\" OriginLocationName=\"\" OriginNetworkId=\"-1\" OriginNetworkName=\"\" OriginZoneEnum=\"0\" OriginZoneName=\"\" ParentProcessId=\"\" ParentProcessName=\"\" ParentProcessPath=\"\" PID=\"-1\" Policy=\"\" Priority=\"4\" Process=\"\" ProtocolId=\"-1\" ProtocolName=\"\" Quantity=\"\" Rate=\"\" Reason=\"\" Recipient=\"\" RecipientIdentity=\"\" RecipientIdentityCompany=\"\" RecipientIdentityDepartment=\"\" RecipientIdentityDomain=\"\" RecipientIdentityID=\"-1\" RecipientIdentityTitle=\"\" ResolvedImpactedName=\"\" ResolvedOriginName=\"\" ResponseCode=\"\" Result=\"\" RiskRating=\"0\" RootEntityId=\"9\" Sender=\"\" SenderIdentity=\"\" SenderIdentityCompany=\"\" SenderIdentityDepartment=\"\" SenderIdentityDomain=\"\" SenderIdentityID=\"-1\" SenderIdentityTitle=\"\" SerialNumber=\"\" ServiceId=\"-1\" ServiceName=\"\" Session=\"\" SessionType=\"\" Severity=\"\" SInterface=\"\" SIP=\"\" Size=\"\" SMAC=\"\" SName=\"\" SNameParsed=\"\" SNameResolved=\"\" SNATIP=\"\" SNATPort=\"-1\" SPort=\"-1\" Status=\"\" Subject=\"\" SystemMonitorID=\"9\" ThreatId=\"\" ThreatName=\"\" UniqueID=\"7d4c4ed3-a2fc-44bc-a7ec-0b8b68e7f456\" URL=\"\" UserAgent=\"\" UserImpactedIdentity=\"\" UserImpactedIdentityCompany=\"\" UserImpactedIdentityDomain=\"\" UserImpactedIdentityID=\"-1\" UserImpactedIdentityTitle=\"\" UserOriginIdentity=\"\" UserOriginIdentityCompany=\"\" UserOriginIdentityDepartment=\"\" UserOriginIdentityDomain=\"\" UserOriginIdentityID=\"-1\" UserOriginIdentityTitle=\"\" VendorInfo=\"\" VendorMsgID=\"\" Version=\"\" RawLog=\"02 13 2021 23:45:24 xx.xx.xx.xx <SYSD:INFO> Feb 
13 23:45:24 euw2-ec2--001 metricbeat[3031]: 2021-02-13T23:45:24.264Z#011ERROR#011[logstash.node_stats]#011node_stats/node_stats.go:73#011error making http request: Get \\\"https://xx.xx.xx.xx:9600/\\\": dial tcp xx.xx.xx.xx:9600: connect: connection refused\"","CollectionSequence":"825328","NormalMsgDate":"2021-02-13T23:45:24.0540000Z"}
I am a little unsure of the best way to achieve this and thought you guys might have some suggestions. I have looked into grok and think it may achieve my objective; however, I'm unsure where to start.
You can do this with conditionals in your filter section and define the target index according to the type of logs you're parsing.
filter {
  ... other filters ...
  if [MsgSourceTypeName] == "Syslog - Linux Host" {
    mutate {
      add_field => {
        "[@metadata][target_index]" => "logs-syslog"
      }
    }
  }
  else if [MsgSourceTypeName] == "MS Windows Event Logging XML" {
    mutate {
      add_field => {
        "[@metadata][target_index]" => "logs-ms_event_log"
      }
    }
  }
}
output {
  elasticsearch {
    hosts => ["http://xx.xx.xx.xx:xxxx","http://xx.xx.xx.xx:xxxx"]
    index => "%{[@metadata][target_index]}"
  }
}
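If you also need to parse the RawLog field differently per source type, as described in the question, you can extend the same conditionals. A rough sketch; the grok pattern and target field names here are placeholders to adapt to your data:
filter {
  if [MsgSourceTypeName] == "Syslog - Linux Host" {
    grok {
      # placeholder pattern: adjust it to the actual syslog layout inside RawLog
      match => { "RawLog" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_host} %{GREEDYDATA:syslog_message}" }
    }
  }
  else if [MsgSourceTypeName] == "MS Windows Event Logging XML" {
    xml {
      # parse the embedded Windows event XML into a structured field
      source => "RawLog"
      target => "event_xml"
    }
  }
}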
I'm very new to Logstash and Elasticsearch. I am trying to ship my first log to Logstash in a way that I can (correct me if that is not the purpose) search it using Elasticsearch....
I have a log that looks like this basically:
2016-12-18 10:16:55,404 - INFO - flowManager.py - loading metadata xml
So, I have created a config file test.conf that looks like this:
input {
  file {
    path => "/home/usr/tmp/logs/mylog.log"
    type => "test-type"
    id => "NEWTRY"
  }
}
filter {
  grok {
    match => { "message" => "%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{HOUR:hour}:%{MINUTE:minute}:%{SECOND:second} - %{LOGLEVEL:level} - %{WORD:scriptName}.%{WORD:scriptEND} - " }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "ecommerce"
    codec => line { format => "%{year}-%{month}-%{day} %{hour}:%{minute}:%{second} - %{level} - %{scriptName}.%{scriptEND} - \"%{message}\"" }
  }
}
And then: ./bin/logstash -f test.conf
I do not see the log in Elasticsearch when I go to http://localhost:9200/ecommerce or to http://localhost:9200/ecommerce/test-type/NEWTRY.
Please tell me what I am doing wrong... :\
Thanks,
Heather
I eventually found a solution:
I added both sincedb_path => "/dev/null" (which, from what I understood, is for testing environments only) and start_position => "beginning" to the file input plugin, and the log appeared both in Elasticsearch and in Kibana.
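For reference, a minimal sketch of the resulting input block, based on the original config above with those two options added:
input {
  file {
    path => "/home/usr/tmp/logs/mylog.log"
    type => "test-type"
    id => "NEWTRY"
    start_position => "beginning"
    # for testing only: don't persist the read position, so the file is re-read on every run
    sincedb_path => "/dev/null"
  }
}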
Thanks anyway for responding and trying to help!
I want to collect and process logs from dnsmasq and I've decided to use ELK. Dnsmasq is used as a DHCP server and as a DNS resolver, and hence it creates log entries for both services.
My goal is to send to Elasticsearch all DNS queries with the requester IP, requester hostname (if available) and requester MAC address. That will allow me to group the requests per MAC address regardless of whether the device IP changed, and to display the hostname.
What I would like to do is the following:
1) Read the entries like:
Mar 30 21:55:34 dnsmasq-dhcp[346]: 3806132383 DHCPACK(eth0) 192.168.0.80 04:0c:ce:d1:af:18 air
2) Store temporarily the relationship:
192.168.0.80 => 04:0c:ce:d1:af:18
192.168.0.80 => air
3) Enrich entries like the one below by adding the MAC address and hostname. If the hostname is empty, I would add the MAC address instead.
Mar 30 22:13:05 dnsmasq[346]: query[A] imap.gmail.com from 192.168.0.80
I found a plugin called "memorize" that would allow me to store them, but unfortunately it does not work with the latest version of Logstash.
The versions I'm using:
Elasticsearch 2.3.0
Kibana 4.4.2
Logstash 2.2.2
And the Logstash configuration (this is my first attempt with Logstash, so I'm sure the configuration file can be improved):
input {
  file {
    path => "/var/log/dnsmasq.log"
    start_position => "beginning"
    type => "dnsmasq"
  }
}
filter {
  if [type] == "dnsmasq" {
    grok {
      match => [ "message", "%{SYSLOGTIMESTAMP:reqtimestamp} %{USER:program}\[%{NONNEGINT:pid}\]\: ?(%{NONNEGINT:num} )?%{NOTSPACE:action} %{IP:clientip} %{MAC:clientmac} ?(%{HOSTNAME:clientname})?"]
      match => [ "message", "%{SYSLOGTIMESTAMP:reqtimestamp} %{USER:program}\[%{NONNEGINT:pid}\]\: ?(%{NONNEGINT:num} )?%{USER:action}?(\[%{USER:subaction}\])? %{NOTSPACE:domain} %{NOTSPACE:function} %{IP:clientip}"]
      match => [ "message", "%{SYSLOGTIMESTAMP:reqtimestamp} %{USER:program}\[%{NONNEGINT:pid}\]\: %{NOTSPACE:action} %{DATA:data}"]
    }
    if [action] =~ "DHCPACK" {
    } else if [action] == "query" {
    } else {
      drop {}
    }
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
Questions:
1) Is there an alternative to the "memorize" plugin that works with the latest Logstash version? Either another plugin or a different procedure.
2) Should I downgrade Logstash to a version before 2 (I think the previous one is 1.5.4)? If so, is there any known severe issue or incompatibility with Elasticsearch 2.2.1?
3) Or should I modify the "memorize" plugin to allow Logstash 2.x (if so, I'll appreciate any pointers on how to start)?
There's no need to repack the memorize plugin for this in my opinion. You can use the aggregate filter to achieve what you want.
...
# record host/mac in temporary map
if [action] =~ "DHCPACK" {
  aggregate {
    task_id => "%{clientip}"
    code => "map['clientmac'] = event['clientmac']; map['clientname'] = event['clientname'];"
    map_action => "create_or_update"
    # timeout set to 48h
    timeout => 172800
  }
}
# add host/mac where/when needed
else if [action] == "query" {
  aggregate {
    task_id => "%{clientip}"
    code => "event['clientmac'] = map['clientmac']; event['clientname'] = map['clientname']"
    map_action => "update"
  }
}
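To cover the case mentioned in the question where the hostname is empty, you could add a small conditional after the second aggregate. A sketch, assuming clientname is simply absent when dnsmasq did not report a name:
# fall back to the mac address when no hostname was memorized for this IP
if [action] == "query" and ![clientname] {
  mutate {
    add_field => { "clientname" => "%{clientmac}" }
  }
}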
So, to use memorize with Logstash > 2.0:
Clone the repository.
Open the file logstash-filter-memorize.gemspec.
Change s.add_runtime_dependency "logstash-core", '>= 1.4.0', '< 2.0.0' to s.add_runtime_dependency "logstash-core", '>= 1.4.0', '< 3.0.0'.
Build the plugin via: gem build logstash-filter-memorize.gemspec
Install it via: $ bin/logstash-plugin install /path/to/memorize/logstash-filter-memorize-0.9.1.gem
I tried it and it seems to work.
The code below is my Logstash conf file. I provide my nginx access log file as input and output to Elasticsearch. I also write the output to a text file, which works fine, but the output is never written to Elasticsearch.
input {
  file {
    path => "filepath"
    start_position => "beginning"
  }
}
output {
  file {
    path => "filepath"
  }
  elasticsearch {
    host => localhost
    port => "9200"
  }
}
I also tried executing the Logstash binary from the command line using the -e option:
input { stdin { } } output { elasticsearch { host => localhost } }
which works fine; I get the output written to Elasticsearch. But in the former case I don't. Help me solve this.
I tried a few things; I have no idea why your case with just host works. If I try it, I get timeouts. This is the configuration that works for me:
elasticsearch {
  protocol => "http"
  host => "localhost"
  port => "9200"
}
I tried this with Logstash 1.4.2 and Elasticsearch 1.4.4.