How to use a nested JSON field as the Elasticsearch document in Logstash

Say the event looks like this:
{
"name": "xxx",
"data": {
"a": xxx
}
}
With Logstash, how can I send only the inner data field to Elasticsearch as the document source, like this:
{
"a": xxx
}
Any response would be appreciated!
I tried the json filter:
filter {
json {
source => "data"
}
}
But it seems the data field is already parsed as a hash rather than a JSON string, so the terminal just prints this error message:
Error parsing json {:source=>"data", :raw=>{"a"=>xxx}, :exception=>java.lang.ClassCastException: org.jruby.RubyHash cannot be cast to org.jruby.RubyIO}

FYI, I found an answer that works:
https://discuss.elastic.co/t/move-subarrays-to-document-root/143876
Use Ruby code to move the nested fields to the document root, then remove all other fields:
ruby {
code => 'event.get("data").each { | k, v| event.set(k, v) }'
}
mutate {
remove_field => [ "name", "data" ]
}
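To make the behaviour of that ruby + mutate pair easier to reason about, here is a minimal plain-Ruby sketch that models the event as an ordinary Hash. The helper name promote_nested and its drop parameter are hypothetical and not part of Logstash. Unlike the one-liner above, which calls each on nil when data is absent, this sketch returns the event unchanged in that case.

```ruby
# Plain-Ruby model of the filters above; "event" is a Hash, not a Logstash event.
# promote_nested is a hypothetical helper used only for illustration.
def promote_nested(event, key, drop = [])
  nested = event[key]
  # Guard: leave the event as-is when the field is missing or not a hash.
  return event.dup unless nested.is_a?(Hash)

  # Equivalent of mutate remove_field: drop the wrapper and any listed fields.
  kept = event.reject { |field, _| field == key || drop.include?(field) }
  # Equivalent of event.set(k, v) for every pair inside the nested hash.
  kept.merge(nested)
end

event = { "name" => "xxx", "data" => { "a" => 1 } }
p promote_nested(event, "data", ["name"])
```

In a real pipeline the same guard could be expressed by wrapping the ruby filter in a conditional on [data].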

Related

Splitting a json array format with same fields name

I have a JSON array whose elements all share the same field names. I want to split it into independent fields, with each field named after the element's "name" value.
events.parameters (the field that holds the JSON array):
{
"name": "USER_EMAIL",
"value": "dummy@yahoo.com"
},
{
"name": "DEVICE_ID",
"value": "Wdk39Iw-akOsiwkaALw"
},
{
"name": "SERIAL_NUMBER",
"value": "9KJUIHG"
}
Expected output:
events.parameters.USER_EMAIL: dummy@yahoo.com
events.parameters.DEVICE_ID: Wdk39Iw-akOsiwkaALw
events.parameters.SERIAL_NUMBER: 9KJUIHG
Thanks.
TL;DR
There is no filter that does exactly what you are looking for.
You will have to use the ruby filter.
I just fixed the problem. For everyone wondering, here's my ruby filter:
if [events][parameters] {
ruby {
code => '
event.get("[events][parameters]").each { |a|
name = a["name"]
value = a["value"]
event.set("[events][parameters_split][#{name}]", value)
}
'
}
}
The output was just what I wanted.
Cheers!
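As an illustration of what that loop does to the data, here is a small plain-Ruby sketch that turns an array of name/value pairs into a single hash. The function name split_parameters is hypothetical, and skipping entries without a "name" is an added assumption, not something the original script does.

```ruby
# Plain-Ruby model of the ruby filter above, operating on an Array of Hashes.
# split_parameters is a hypothetical helper used only for illustration.
def split_parameters(parameters)
  parameters.each_with_object({}) do |param, out|
    # Assumption: entries that are not hashes or lack "name" are skipped.
    next unless param.is_a?(Hash) && param["name"]

    out[param["name"]] = param["value"]
  end
end

parameters = [
  { "name" => "DEVICE_ID", "value" => "Wdk39Iw-akOsiwkaALw" },
  { "name" => "SERIAL_NUMBER", "value" => "9KJUIHG" }
]
p split_parameters(parameters)
```

If two entries share the same name, the later one wins in this sketch, which matches what repeated event.set calls on the same field would do.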

ElasticSearch: populating ip_range type field via logstash

I'm experimenting with the ip_range field type in Elasticsearch 6.8 (https://www.elastic.co/guide/en/elasticsearch/reference/6.8/range.html) and am struggling to find a way to load IP data into the field properly via Logstash.
I was able to load some sample data via Kibana Dev Tools, but cannot figure out how to do the same via Logstash.
Index definition
PUT test_ip_range
{
"mappings": {
"_doc": {
"properties": {
"ip_from_to_range": {
"type": "ip_range"
},
"ip_from": {
"type": "ip"
},
"ip_to": {
"type": "ip"
}
}
}
}
}
Add sample doc:
PUT test_ip_range/_doc/3
{
"ip_from_to_range" :
{
"gte" : "<dotted_ip_from>",
"lte": "<dotted_ip_to>"
}
}
Logstash config (reading from DB)
input {
jdbc {
...
statement => "SELECT ip_from, ip_to, <???> AS ip_from_to_range FROM sample_ip_data"
}
}
output {
stdout { codec => json_lines }
elasticsearch {
"hosts" => "<host>"
"index" => "test_ip_range"
"document_type" => "_doc"
}
}
Question:
How do I get the ip_from and ip_to DB fields into the respective gte and lte parts of ip_from_to_range via the Logstash config?
I know I can also insert the IP range in CIDR notation, but I would like to have both options: loading in CIDR notation and loading as a range.
After some trial and error, I finally figured out the Logstash config.
I had posted about a similar issue here, which got me on the right track with the syntax for this use case as well.
input { ... }
filter {
mutate {
add_field => {
"[ip_from_to_range]" =>
'{
"gte": "%{ip_from}",
"lte": "%{ip_to}"
}'
}
}
json {
source => "ip_from_to_range"
target => "ip_from_to_range"
}
}
output { ... }
Filter parts explained:
mutate add_field: creates a new field [ip_from_to_range] whose value is a JSON string ( '{...}' ). It is important to write the field as [field_name]; otherwise the next step, which parses the string into a JSON object, doesn't work.
json: parses the string representation into a JSON object and stores it back in the same field (source and target are both ip_from_to_range).
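The two steps can be pictured with a short plain-Ruby sketch: first a JSON string is built by interpolating the two column values, then that string is parsed into a nested object. The helper name build_ip_range and the sample addresses are made up for illustration; a row is modelled as a plain Hash.

```ruby
require "json"

# Plain-Ruby model of the mutate + json filters above.
# build_ip_range is a hypothetical helper; "row" stands in for the JDBC event.
def build_ip_range(row)
  # Step 1 (mutate add_field): interpolate the values into a JSON string.
  json_string = format('{"gte": "%s", "lte": "%s"}', row["ip_from"], row["ip_to"])
  # Step 2 (json filter): parse the string and store the object in the field.
  row.merge("ip_from_to_range" => JSON.parse(json_string))
end

row = { "ip_from" => "10.0.0.1", "ip_to" => "10.0.0.255" }
p build_ip_range(row)
```

Plain string interpolation like this is only safe because IP addresses contain no characters that need JSON escaping.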

How can I let elasticsearch map base64 field?

I have a json file as input to my Elasticsearch 7.10.1 cluster. The format of the json is something like:
{
"data" : "eyJtZXRyaWNfc3RyZWFtX25hbWUiOiJtGltZW5zaW9ucy...
}
The data value in the JSON is a base64-encoded JSON document. How can I create a mapping in Elasticsearch that decodes the base64 value and indexes each field inside the decoded JSON?
Ingest pipeline to the rescue! You can create an ingest pipeline that decodes the base64-encoded field, parses the resulting JSON, and adds all of its fields to the document. It basically goes like this:
PUT _ingest/pipeline/b64-decode
{
"processors": [
{
"script": {
"source": "ctx.decoded = ctx.b64.decodeBase64();"
}
},
{
"json": {
"field": "decoded",
"add_to_root": true
}
},
{
"remove": {
"field": "decoded"
}
}
]
}
Then you can refer to that ingest pipeline when indexing new documents, as shown below:
PUT index/_doc/1?pipeline=b64-decode
{
"b64": "eyJmaWVsZCI6ICJoZWxsbyB3b3JsZCJ9"
}
The b64 field contains the following base64-encoded JSON:
{ "field" : "hello world" }
Finally, the document that will be indexed will look like this:
{
"b64" : "eyJmaWVsZCI6ICJoZWxsbyB3b3JsZCJ9",
"field" : "hello world"
}
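For readers who want to check the transformation outside Elasticsearch, the three processors can be mimicked in a few lines of plain Ruby. This is only a sketch of the data flow, not how the ingest node is implemented; the helper name decode_b64_document is hypothetical.

```ruby
require "json"

# Plain-Ruby model of the b64-decode pipeline above.
# decode_b64_document is a hypothetical helper used only for illustration.
def decode_b64_document(doc, field = "b64")
  # script processor: Base64-decode the field ("m" is Ruby's Base64 directive).
  decoded = doc.fetch(field).unpack1("m")
  # json processor with add_to_root: merge the parsed keys into the document.
  # The temporary "decoded" value is never stored, so no remove step is needed.
  doc.merge(JSON.parse(decoded))
end

doc = { "b64" => "eyJmaWVsZCI6ICJoZWxsbyB3b3JsZCJ9" }
p decode_b64_document(doc)
```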

Lowercase field name in Logstash for Elasticsearch index

I pipe a file to a Logstash command that writes to Elasticsearch. I want to use one field (appName) to select the index to write to. However, the data in this field is not all lowercase, so I need to lowercase it when selecting the index, without modifying the data in the document itself.
My attempt is below: I first copy the original field (appName) to a new one (appNameIndex), lowercase the new field, remove it from the upload, and then use it to pick the index.
input {
stdin { type => stdin }
}
filter {
csv {
separator => " "
columns => ["appName", "field1", "field2", ...]
convert => {
...
}
}
}
filter {
mutate {
copy => ["appName", "appNameIndex"]
}
}
filter {
mutate {
lowercase => ["appNameIndex"]
}
}
filter {
mutate {
remove_field => [
"appNameIndex", # if I remove this it works
...
]
}
}
output {
amazon_es {
hosts =>
["my-es-cluster.us-east-1.es.amazonaws.com"]
index => "%{appNameIndex}"
region => "us-east-1"
}
}
However, I am getting errors that say:
Invalid index name [%{appIndexName}]
Clearly it's not picking up my mutation. Is it because the remove section takes the field out entirely? I was hoping it would only remove it from the uploaded document. Am I going about this incorrectly?
UPDATE: I tried taking out the part that removes the index name field and it does in fact work, so that helps identify the source of the error. Now the question becomes how to get around it. With that part of the config removed, I essentially have two fields with the same data, one lowercased and one not.
You can use a @metadata field, a special field that is never included in the output: https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html#metadata.
input {
stdin { type => stdin }
}
filter {
csv {
separator => " "
columns => ["appName", "field1", "field2", ...]
convert => {
...
}
}
}
filter {
mutate {
copy => ["appName", "[@metadata][appNameIndex]"]
}
}
filter {
mutate {
lowercase => ["[@metadata][appNameIndex]"]
}
}
output {
amazon_es {
hosts => ["my-es-cluster.us-east-1.es.amazonaws.com"]
index => "%{[@metadata][appNameIndex]}"
region => "us-east-1"
}
}
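The idea behind the metadata approach can be sketched in plain Ruby: the lowercased copy lives beside the event rather than inside it, so it can steer routing without ever being part of the stored document. The function name route and the returned structure are hypothetical and only model the concept.

```ruby
# Plain-Ruby model of routing by a metadata value that is not indexed.
# route is a hypothetical helper; Logstash handles this internally.
def route(event)
  # Equivalent of copy + lowercase into [@metadata][appNameIndex].
  metadata = { "appNameIndex" => event["appName"].to_s.downcase }
  # The index name comes from metadata; the document itself is unchanged.
  { index: metadata["appNameIndex"], document: event }
end

p route({ "appName" => "MyApp", "field1" => "x" })
```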

Logstash create nested field

I have some fields parsed from a string.
Example string:
"testmessage 10.5 100"
match => { "message" => "%{GREEDYDATA:text} %{NUMBER:duraton} %{NUMBER:code}" }
The output is:
{
"text": "testmessage",
"duraton": "10.5",
"code": "100"
}
But I want to get this:
{
"text": "testmessage",
"values": {
"duraton": "10.5",
"code": "100"
}
}
How do I create a field "values" that contains nested fields?
Use the nested field reference syntax:
%{NUMBER:[values][duraton]}
Note that you can also cast the values in Logstash:
%{NUMBER:[values][duraton]:float}
("int" also works.)
The complete pattern:
%{GREEDYDATA:text} %{NUMBER:[values][duraton]:float} %{NUMBER:[values][code]:int}
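To show the shape of the event such a pattern should produce, here is a rough plain-Ruby equivalent using a regular expression with named groups. The regex is a simplified stand-in for grok's GREEDYDATA and NUMBER patterns, not their actual definitions, and parse_line is a hypothetical helper.

```ruby
# Simplified stand-in for the grok pattern above (assumption: plain decimal
# numbers only; grok's NUMBER pattern accepts more forms).
PATTERN = /\A(?<text>.*) (?<duraton>-?\d+(?:\.\d+)?) (?<code>-?\d+)\z/

# parse_line is a hypothetical helper used only for illustration.
def parse_line(line)
  m = PATTERN.match(line)
  return nil unless m # comparable to a grok parse failure

  {
    "text" => m[:text],
    # Nested target plus casts, like [values][duraton]:float and [values][code]:int.
    "values" => { "duraton" => m[:duraton].to_f, "code" => m[:code].to_i }
  }
end

p parse_line("testmessage 10.5 100")
```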
