Correlate messages in ELK by field - elasticsearch
Related to: Combine logs and query in ELK
We are setting up ELK and want to create a visualization in Kibana 4.
The issue is that we need to relate two different types of messages.
To simplify:
Message type 1 fields: message_type, common_id_number, byte_count, ...
Message type 2 fields: message_type, common_id_number, hostname, ...
Both messages share the same index in elasticsearch.
As you can see, we have been trying to graph without taking that common_id_number into account, but it seems that we must use it. We just don't know how yet.
Any help?
EDIT
These are the relevant field definitions in the ES template:
"URIHost" : {
"type" : "string",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed",
"ignore_above" : 256
}
}
},
"Type" : {
"type" : "string",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed",
"ignore_above" : 256
}
}
},
"SessionID" : {
"type" : "long"
},
"Bytes" : {
"type" : "long"
},
"BytesReceived" : {
"type" : "long"
},
"BytesSent" : {
"type" : "long"
},
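As a side note, the raw subfields above are the not_analyzed copies that term-level aggregations normally target (for example URIHost.raw, so hostnames are not split on the dots by the analyzer). If you want to confirm that this template was actually applied to the daily index, you can inspect the live mapping; a minimal sketch, assuming Elasticsearch is reachable on localhost:9200:

curl -XGET 'localhost:9200/logstash-2015.11.05/_mapping?pretty'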
This is a TRAFFIC-type document, edited for brevity:
{
  "_index": "logstash-2015.11.05",
  "_type": "paloalto",
  "_id": "AVDZqdBjpQiRid-uxPjE",
  "_score": null,
  "_source": {
    "@version": "1",
    "@timestamp": "2015-11-05T21:59:55.543Z",
    "syslog_severity_code": 5,
    "syslog_facility_code": 1,
    "syslog_timestamp": "Nov 5 22:59:58",
    "Type": "TRAFFIC",
    "SessionID": 21713,
    "Bytes": 939,
    "BytesSent": 480,
    "BytesReceived": 459
  },
  "fields": {
    "@timestamp": [
      1446760795543
    ]
  },
  "sort": [
    1446760795543
  ]
}
And this is a THREAT-type document:
{
  "_index": "logstash-2015.11.05",
  "_type": "paloalto",
  "_id": "AVDZqVNIpQiRid-uxPjC",
  "_score": null,
  "_source": {
    "@version": "1",
    "@timestamp": "2015-11-05T21:59:23.440Z",
    "syslog_severity_code": 5,
    "syslog_facility_code": 1,
    "syslog_timestamp": "Nov 5 22:59:26",
    "Type": "THREAT",
    "SessionID": 21713,
    "URIHost": "whatever.nevermind.com",
    "URIPath": "/connectiontest.html"
  },
  "fields": {
    "@timestamp": [
      1446760763440
    ]
  },
  "sort": [
    1446760763440
  ]
}
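Both sample documents carry the same SessionID (21713), which is the value that ties the THREAT line to its TRAFFIC line. As a quick sanity check, a query along these lines should return both halves of the session (a sketch; the host and port are assumptions):

curl -XGET 'localhost:9200/logstash-2015.11.05/paloalto/_search?pretty' -d '{
  "query": {
    "term": { "SessionID": 21713 }
  }
}'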
This is the logstash "filter" configuration:
filter {
  if [type] == "paloalto" {
    syslog_pri {
      remove_field => [ "syslog_facility", "syslog_severity" ]
    }
    grok {
      match => {
        "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{HOSTNAME:hostname} %{INT},%{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME},%{INT},%{WORD:Type},%{GREEDYDATA:log}"
      }
      remove_field => [ "message" ]
    }
    if [Type] == "THREAT" {
      csv {
        source => "log"
        columns => [ "Threat_OR_ContentType", "ConfigVersion", "GenerateTime", "SourceAddress", "DestinationAddress", "NATSourceIP", "NATDestinationIP", "Rule", "SourceUser", "DestinationUser", "Application", "VirtualSystem", "SourceZone", "DestinationZone", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "SessionID", "RepeatCount", "SourcePort", "DestinationPort", "NATSourcePort", "NATDestinationPort", "Flags", "IPProtocol", "Action", "URL", "Threat_OR_ContentName", "reportid", "Category", "Severity", "Direction", "seqno", "actionflags", "SourceCountry", "DestinationCountry", "cpadding", "contenttype", "pcap_id", "filedigest", "cloud", "url_idx", "user_agent", "filetype", "xff", "referer", "sender", "subject", "recipient" ]
        remove_field => [ "log" ]
      }
      mutate {
        convert => {
          "SessionID" => "integer"
          "SourcePort" => "integer"
          "DestinationPort" => "integer"
          "NATSourcePort" => "integer"
          "NATDestinationPort" => "integer"
        }
        remove_field => [ "ConfigVersion", "GenerateTime", "VirtualSystem", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "RepeatCount", "Flags", "Action", "reportid", "Severity", "seqno", "actionflags", "cpadding", "pcap_id", "filedigest", "recipient" ]
      }
      grok {
        match => {
          "URL" => "%{URIHOST:URIHost}%{URIPATH:URIPath}(%{URIPARAM:URIParam})?"
        }
        remove_field => [ "URL" ]
      }
    }
    else if [Type] == "TRAFFIC" {
      csv {
        source => "log"
        columns => [ "Threat_OR_ContentType", "ConfigVersion", "GenerateTime", "SourceAddress", "DestinationAddress", "NATSourceIP", "NATDestinationIP", "Rule", "SourceUser", "DestinationUser", "Application", "VirtualSystem", "SourceZone", "DestinationZone", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "SessionID", "RepeatCount", "SourcePort", "DestinationPort", "NATSourcePort", "NATDestinationPort", "Flags", "IPProtocol", "Action", "Bytes", "BytesSent", "BytesReceived", "Packets", "StartTime", "ElapsedTimeInSecs", "Category", "Padding", "seqno", "actionflags", "SourceCountry", "DestinationCountry", "cpadding", "pkts_sent", "pkts_received", "session_end_reason" ]
        remove_field => [ "log" ]
      }
      mutate {
        convert => {
          "SessionID" => "integer"
          "SourcePort" => "integer"
          "DestinationPort" => "integer"
          "NATSourcePort" => "integer"
          "NATDestinationPort" => "integer"
          "Bytes" => "integer"
          "BytesSent" => "integer"
          "BytesReceived" => "integer"
          "ElapsedTimeInSecs" => "integer"
        }
        remove_field => [ "ConfigVersion", "GenerateTime", "VirtualSystem", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "RepeatCount", "Flags", "Action", "Packets", "StartTime", "seqno", "actionflags", "cpadding", "pcap_id", "filedigest", "recipient" ]
      }
    }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      timezone => "CET"
      remove_field => [ "syslog_timestamp" ]
    }
  }
}
What we are trying to do is visualize URIHost terms on the X axis and the sums of Bytes, BytesSent, and BytesReceived on the Y axis.
I think you can use the aggregate filter to carry out your task. The aggregate filter supports aggregating several log lines into a single event based on a common field value. In your case, the common field we're going to use is the SessionID field.
We also need another field to tell the first event apart from the second/last event to be aggregated; in your case, this is the Type field.
You need to change your current configuration like this:
filter {
  ... all other filters
  if [Type] == "THREAT" {
    ... all other filters
    aggregate {
      task_id => "%{SessionID}"
      code => "map['URIHost'] = event['URIHost']; map['URIPath'] = event['URIPath']"
    }
  }
  else if [Type] == "TRAFFIC" {
    ... all other filters
    aggregate {
      task_id => "%{SessionID}"
      code => "event['URIHost'] = map['URIHost']; event['URIPath'] = map['URIPath']"
      end_of_task => true
      timeout => 120
    }
  }
}
The general idea is that when Logstash encounters a THREAT log, it temporarily stores the URIHost and URIPath in the in-memory aggregate map; when the matching TRAFFIC log comes in, those URIHost and URIPath fields are added to that event. You can copy other fields, too, if needed. You can also adapt the timeout (in seconds) depending on how long after the last THREAT event you expect the corresponding TRAFFIC event to arrive.
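One caveat worth adding: the aggregate filter keeps its map in memory and relies on events passing through in order, so the plugin documentation recommends running Logstash with a single filter worker; otherwise the THREAT and TRAFFIC events of one session may be handled by different workers and the merge can silently fail. For example (the config file name is just a placeholder):

bin/logstash -f paloalto.conf -w 1    # single filter worker so the aggregate map sees events in order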
In the end, you'll get documents with data merged from both the THREAT and TRAFFIC log lines, and you can easily create the visualization showing byte counts per URIHost as shown on your screenshot.
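Once the merged TRAFFIC documents contain URIHost, the Kibana chart essentially corresponds to a terms aggregation on URIHost.raw with sum sub-aggregations on the byte fields. A rough sketch of the underlying Elasticsearch request (field names taken from your template; the host and the logstash-* index pattern are assumptions):

curl -XGET 'localhost:9200/logstash-*/_search?pretty' -d '{
  "size": 0,
  "query": { "match": { "Type": "TRAFFIC" } },
  "aggs": {
    "by_host": {
      "terms": { "field": "URIHost.raw", "size": 10 },
      "aggs": {
        "total_bytes": { "sum": { "field": "Bytes" } },
        "bytes_sent": { "sum": { "field": "BytesSent" } },
        "bytes_received": { "sum": { "field": "BytesReceived" } }
      }
    }
  }
}'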
Related
Getting JSON format error in Logstash. Need to decode JSON from the message part
How to fetch field from array of objects in Elasticsearch Index as CSV file to Google Cloud Storage Using Logstash
Using multiple config files for logstash
logstash splits event field values and assign to @metadata field
logstash - geoip in Kibana can not show any information using the IP addresses