I have implemented a Kafka input with Logstash and an Elasticsearch output, and it is working fine in Kibana. Now I want to filter the data based on the status code.

This is the document JSON from the Kibana dashboard. I need to filter based on the response statusCode, which sits inside the message JSON string field:
{
  "_index": "rand-topic",
  "_type": "_doc",
  "_id": "ulF8uH0BK9MbBSR7DPEw",
  "_version": 1,
  "_score": null,
  "fields": {
    "@timestamp": [
      "2021-12-14T10:27:56.956Z"
    ],
    "@version": [
      "1"
    ],
    "@version.keyword": [
      "1"
    ],
    "message": [
      "{\"requestMethod\":\"GET\",\"headers\":{\"content-type\":\"application/json\",\"user-agent\":\"PostmanRuntime/7.28.4\",\"accept\":\"*/*\",\"postman-token\":\"977fc94b-38c8-4df4-ad73-814871a32eca\",\"host\":\"localhost:5600\",\"accept-encoding\":\"gzip, deflate, br\",\"connection\":\"keep-alive\",\"content-length\":\"44\"},\"body\":{\"category\":\"CAT\",\"noise\":\"purr\"},\"query\":{},\"requestUrl\":\"http://localhost:5600/kafka\",\"protocol\":\"HTTP/1.1\",\"remoteIp\":\"1\",\"requestSize\":302,\"userAgent\":\"PostmanRuntime/7.28.4\",\"statusCode\":200,\"response\":{\"success\":true,\"message\":\"Kafka Details are added\",\"data\":{\"kafkaData\":{\"_id\":\"61b871ac69be37078a9c1a79\",\"category\":\"DOG\",\"noise\":\"bark\",\"__v\":0},\"postData\":{\"category\":\"DOG\",\"noise\":\"bark\"}}},\"latency\":{\"seconds\":0,\"nanos\":61000000},\"responseSize\":193}"
    ]
  },
  "sort": [
    1639477676956
  ]
}
Expected output is like below, with a statusCode field added from the message field:
{
  "_index": "rand-topic",
  "_type": "_doc",
  "_id": "ulF8uH0BK9MbBSR7DPEw",
  "_version": 1,
  "_score": null,
  "fields": {
    "@timestamp": [
      "2021-12-14T10:27:56.956Z"
    ],
    "@version": [
      "1"
    ],
    "@version.keyword": [
      "1"
    ],
    "statusCode": [
      200
    ],
    "message": [
      "{\"requestMethod\":\"GET\",\"headers\":{\"content-type\":\"application/json\",\"user-agent\":\"PostmanRuntime/7.28.4\",\"accept\":\"*/*\",\"postman-token\":\"977fc94b-38c8-4df4-ad73-814871a32eca\",\"host\":\"localhost:5600\",\"accept-encoding\":\"gzip, deflate, br\",\"connection\":\"keep-alive\",\"content-length\":\"44\"},\"body\":{\"category\":\"CAT\",\"noise\":\"purr\"},\"query\":{},\"requestUrl\":\"http://localhost:5600/kafka\",\"protocol\":\"HTTP/1.1\",\"remoteIp\":\"1\",\"requestSize\":302,\"userAgent\":\"PostmanRuntime/7.28.4\",\"statusCode\":200,\"response\":{\"success\":true,\"message\":\"Kafka Details are added\",\"data\":{\"kafkaData\":{\"_id\":\"61b871ac69be37078a9c1a79\",\"category\":\"DOG\",\"noise\":\"bark\",\"__v\":0},\"postData\":{\"category\":\"DOG\",\"noise\":\"bark\"}}},\"latency\":{\"seconds\":0,\"nanos\":61000000},\"responseSize\":193}"
    ]
  },
  "sort": [
    1639477676956
  ]
}
Please help me configure the Logstash filter for statusCode. This is what I have so far:
input {
  kafka {
    topics => ["randtopic"]
    bootstrap_servers => "192.168.29.138:9092"
  }
}

filter {
  mutate {
    add_field => {
      "statusCode" => "%{[status]}"
    }
  }
}

output {
  elasticsearch {
    hosts => ["192.168.29.138:9200"]
    index => "rand-topic"
    workers => 1
  }
}

output {
  if [message][0][statusCode] == "200" {
    # Do something ...
    stdout { codec => "" }
  }
}
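One approach that should work here is to parse the message string with Logstash's json filter and then promote statusCode to a top-level field. A minimal sketch, assuming message always contains valid JSON (the parsed target field name is illustrative):

filter {
  # Parse the JSON string in "message" into a "parsed" object,
  # leaving the original "message" field untouched
  json {
    source => "message"
    target => "parsed"
  }
  # Promote the status code to a top-level field
  mutate {
    add_field => { "statusCode" => "%{[parsed][statusCode]}" }
  }
  # add_field always produces a string; convert it for numeric comparisons
  mutate {
    convert => { "statusCode" => "integer" }
  }
  # Drop the parsed copy if only statusCode is needed
  mutate {
    remove_field => ["parsed"]
  }
}

output {
  if [statusCode] == 200 {
    stdout { codec => rubydebug }
  }
  elasticsearch {
    hosts => ["192.168.29.138:9200"]
    index => "rand-topic"
  }
}

With the integer conversion in place, the conditional compares against the number 200 rather than the string "200", matching the numeric statusCode in the expected output.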

Related

How to convert Elasticsearch index field of string type to json?

I have an OpenSearch index with a string field message, defined as below:
{"name":"message.message","type":"string","esTypes":["text"],"count":0,"scripted":false,"searchable":true,"aggregatable":false,"readFromDocValues":false}
Sample data:
"_source" : {
"message" : {
"message" : "user: AB, from: home, to: /app1"
}
}
I would like to convert the message field into JSON so that I can access the values message.user, message.from and message.to individually.
How do I go about it?
You can use the JSON processor in an ingest pipeline:
POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "convert json to object",
    "processors": [
      {
        "json": {
          "field": "foo",
          "target_field": "json_target"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "{\"name\":\"message.message\",\"type\":\"string\",\"esTypes\":[\"text\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false}\r\n"
      }
    }
  ]
}
Response:
{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_id": "id",
        "_version": "-3",
        "_source": {
          "foo": """{"name":"message.message","type":"string","esTypes":["text"],"count":0,"scripted":false,"searchable":true,"aggregatable":false,"readFromDocValues":false}
""",
          "json_target": {
            "esTypes": [
              "text"
            ],
            "readFromDocValues": false,
            "name": "message.message",
            "count": 0,
            "aggregatable": false,
            "type": "string",
            "scripted": false,
            "searchable": true
          }
        },
        "_ingest": {
          "timestamp": "2022-11-09T19:38:01.16232Z"
        }
      }
    }
  ]
}
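Outside of _simulate, the same processor would live in a stored pipeline referenced at index time. A sketch with illustrative pipeline and index names (note: this only works if the field really holds a JSON string; the sample value "user: AB, from: home, to: /app1" is not JSON and would need a dissect or grok approach instead):

PUT _ingest/pipeline/parse-message
{
  "processors": [
    {
      "json": {
        "field": "message.message",
        "target_field": "message_parsed"
      }
    }
  ]
}

PUT my-index/_doc/1?pipeline=parse-message
{
  "message": { "message": "{\"user\":\"AB\",\"from\":\"home\",\"to\":\"/app1\"}" }
}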

elasticsearch query for finding id in fields in json file

I have a JSON file that I indexed in Elasticsearch, and I need a query to retrieve "_id_osm". Can you help me, please?
This is one line of my JSON file:
{
  "index": {
    "_index": "pariss",
    "_type": "sig",
    "_id": 1
  }
}
{
  "fields": {
    "_id_osm": 416747747,
    "_categorie": "",
    "_name": [
      ""
    ],
    "_location": [
      36.1941834,
      5.3595221
    ]
  }
}
(Answer updated based on the comments.)
If _id_osm is mapped with "store": true, you can use the query below to fetch the field value:
{
  "stored_fields": ["_id_osm"],
  "query": {
    "match": {
      "_id": 1
    }
  }
}
The above call returns the response below; notice the fields section, which contains the field name and value:
"hits": [
{
"_index": "intqu",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"fields": {
"_id_osm": [
416747747
]
}
}
]
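For reference, storing a field has to be enabled explicitly in the mapping, since it is off by default. A minimal sketch (the long type for _id_osm is an assumption based on the sample value):

PUT intqu
{
  "mappings": {
    "properties": {
      "_id_osm": {
        "type": "long",
        "store": true
      }
    }
  }
}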
If you don't have "store": true, which is the default, then use _source filtering to get the data:
{
  "_source": [ "_id_osm" ],
  "query": {
    "match": {
      "_id": 1
    }
  }
}
This returns the response below; you can see that _source contains the data:
"hits": [
{
"_index": "intqu",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"_id_osm": 416747747
}
}
]

logstash extract and move nested fields into new parent field

If my log prints the latitude and longitude of a given point, how can I capture this information so that it is processed as geospatial data in Elasticsearch?
Below I show an example of a document in Elasticsearch corresponding to a log line:
{
  "_index": "memo-logstash-2018.05",
  "_type": "doc",
  "_id": "DDCARGMBfvaBflicTW4-",
  "_version": 1,
  "_score": null,
  "_source": {
    "type": "elktest",
    "message": "LON: 12.5, LAT: 42",
    "@timestamp": "2018-05-09T10:44:09.046Z",
    "host": "f6f9fd66cd6c",
    "path": "/usr/share/logstash/logs/docker-elk-master.log",
    "@version": "1"
  },
  "fields": {
    "@timestamp": [
      "2018-05-09T10:44:09.046Z"
    ]
  },
  "highlight": {
    "type": [
      "@kibana-highlighted-field@elktest@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1525862649046
  ]
}
You can first separate LON and LAT into their own fields as follows:
grok {
  match => { "message" => "LON: %{NUMBER:LON}, LAT: %{NUMBER:LAT}" }
}
Once they are separated, you can use the mutate filter to nest them under a parent field, like this:
filter {
  mutate {
    rename => { "LON" => "[location][LON]" }
    rename => { "LAT" => "[location][LAT]" }
  }
}
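One more note on the geospatial part: Elasticsearch's geo_point type expects the object keys to be lat and lon (lowercase) with numeric values, and the field must be declared as geo_point in the index mapping. A sketch under those assumptions, capturing straight into the nested fields with type coercion:

filter {
  grok {
    # :float coerces the captured strings to numbers; the nested
    # semantic names write directly into the location object
    match => { "message" => "LON: %{NUMBER:[location][lon]:float}, LAT: %{NUMBER:[location][lat]:float}" }
  }
}

and the index (or template) mapping would declare, assuming a 6.x-style single-type layout matching the "doc" type above:

PUT memo-logstash-2018.05
{
  "mappings": {
    "doc": {
      "properties": {
        "location": { "type": "geo_point" }
      }
    }
  }
}

With that mapping in place, location can be used in geo queries and on Kibana maps.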
Let me know if this helps.

How do I replicate the _id and _type of elasticsearch index when dumping data through Logstash

I have an "Index":samcorp with "type":"sam".
One of them looks like the below :
{
  "_index": "samcorp",
  "_type": "sam",
  "_id": "1236",
  "_version": 1,
  "_score": 1,
  "_source": {
    "name": "Sam Smith",
    "age": 22,
    "confirmed": true,
    "join_date": "2014-06-01"
  }
}
I want to replicate the same data into a different index named "jamcorp", with the same type and the same id, and I am using Logstash to do it.
With the configuration below I end up with wrong ids and types:
input {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "samcorp"
  }
}

filter {
  mutate {
    remove_field => [ "@version", "@timestamp" ]
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    manage_template => false
    index => "jamcorp"
    document_type => "%{_type}"
    document_id => "%{_id}"
  }
}
I've tried all possible combinations; I get the following output:
Output:
{
"_index": "jamcorp",
"_type": "%{_type}",
"_id": "%{_id}",
"_version": 4,
"_score": 1,
"_source": {
"name": "Sam Smith",
"age": 22,
"confirmed": true,
"join_date": "2014-06-01"
}
}
The output I require is:
{
  "_index": "jamcorp",
  "_type": "sam",
  "_id": "1236",
  "_version": 4,
  "_score": 1,
  "_source": {
    "name": "Sam Smith",
    "age": 22,
    "confirmed": true,
    "join_date": "2014-06-01"
  }
}
Any help would be appreciated. :) Thanks
In your elasticsearch input, you need to set the docinfo parameter to true:
input {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "samcorp"
    docinfo => true     # <-- add this
  }
}
As a result, the @metadata hash will be populated with the _index, _type and _id of the document, and you can reuse them in your filters and outputs:
output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    manage_template => false
    index => "jamcorp"
    document_type => "%{[@metadata][_type]}"   # <-- use @metadata
    document_id => "%{[@metadata][_id]}"       # <-- use @metadata
  }
}
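To double-check what docinfo puts into @metadata (it is never sent to outputs, so it won't appear in the indexed documents), a temporary stdout output can print it; a small sketch:

output {
  # rubydebug hides @metadata unless explicitly asked to show it
  stdout { codec => rubydebug { metadata => true } }
}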

How to extract feature from the Elasticsearch _source to index

I have used Logstash, Elasticsearch and Kibana to collect logs.
The log file is JSON, like this:
{"_id":{"$oid":"5540afc2cec7c68fc1248d78"},"agentId":"0000000BAB39A520","handler":"SUSIControl","sensorId":"/GPIO/GPIO00/Level","ts":{"$date":"2015-04-29T09:00:00.846Z"},"vHour":1}
{"_id":{"$oid":"5540afc2cec7c68fc1248d79"},"agentId":"0000000BAB39A520","handler":"SUSIControl","sensorId":"/GPIO/GPIO00/Dir","ts":{"$date":"2015-04-29T09:00:00.846Z"},"vHour":0}
And this is the configuration I have used in Logstash:
input {
  file {
    type => "log"
    path => ["/home/data/1/1.json"]
    start_position => "beginning"
  }
}

filter {
  json {
    source => "message"
  }
}

output {
  elasticsearch { embedded => true }
  stdout { codec => rubydebug }
}
Then the output in Elasticsearch is:
{
  "_index": "logstash-2015.06.29",
  "_type": "log",
  "_id": "AU5AG7KahwyA2bfnpJO0",
  "_version": 1,
  "_score": 1,
  "_source": {
    "message": "{\"_id\":{\"$oid\":\"5540afc2cec7c68fc1248d7c\"},\"agentId\":\"0000000BAB39A520\",\"handler\":\"SUSIControl\",\"sensorId\":\"/GPIO/GPIO05/Dir\",\"ts\":{\"$date\":\"2015-04-29T09:00:00.846Z\"},\"vHour\":1}",
    "@version": "1",
    "@timestamp": "2015-06-29T16:17:03.040Z",
    "type": "log",
    "host": "song-Lenovo-IdeaPad",
    "path": "/home/song/soft/data/1/Average.json",
    "_id": {
      "$oid": "5540afc2cec7c68fc1248d7c"
    },
    "agentId": "0000000BAB39A520",
    "handler": "SUSIControl",
    "sensorId": "/GPIO/GPIO05/Dir",
    "ts": {
      "$date": "2015-04-29T09:00:00.846Z"
    },
    "vHour": 1
  }
}
But some of the information from the JSON file is only usable from _source, so I can't use Kibana to analyze it. Kibana shows "Analysis is not available for object fields", and fields like _id and ts are object fields.
How do I solve this problem?
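A common way to make those values analyzable is to flatten the object-valued fields and parse the timestamp in Logstash before indexing; a sketch, with the flattened field names chosen here for illustration:

filter {
  json {
    source => "message"
  }
  # Flatten the object fields Kibana cannot aggregate on
  mutate {
    rename => { "[_id][$oid]" => "oid" }
    rename => { "[ts][$date]" => "ts_date" }
  }
  # Use the log's own timestamp as the event @timestamp
  date {
    match => [ "ts_date", "ISO8601" ]
  }
}

After this, oid and ts_date are simple scalar fields, and the date filter sets @timestamp from the log entry rather than the ingestion time.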
